Lexicographic Refinements in Possibilistic Decision Trees
Nahla Ben Amor 1, Zeineb El Khalfi 2, Hélène Fargier 3 and Régis Sabbadin 4
Abstract. Possibilistic decision theory was proposed twenty years ago and has had several extensions since then. Because of the lack of decision power of possibilistic decision theory, several refinements have since been proposed. Unfortunately, these refinements do not circumvent the difficulty when the decision problem is sequential. In this article, we propose to extend lexicographic refinements to possibilistic decision trees. We show, in particular, that they still benefit from an Expected Utility (EU) grounding. We also provide qualitative dynamic programming algorithms to compute lexicographically optimal strategies. The paper is completed with an experimental study that shows the feasibility and the interest of the approach.

Since its inception in 2009, Bitcoin has been mired in controversies for providing a haven for illegal activities. Several types of illicit users hide behind the blanket of anonymity. Uncovering these entities is key for forensic investigations. Current methods utilize machine learning for identifying these illicit entities. However, the existing approaches only focus on a limited category of illicit users. The current paper proposes to address the issue by implementing an ensemble of decision trees for supervised learning. More parameters allow the ensemble model to learn discriminating features that can categorize multiple groups of illicit users from licit users. To evaluate the model, a dataset of 2059 real-life entities on Bitcoin was extracted from the Blockchain. Nine features were engineered to train the model for segregating 28 different licit-illicit categories of users. The proposed model provided a reliable tool for forensic study. Empirical evaluation of the proposed model vis-à-vis three existing benchmark models was performed to highlight its efficacy. Experiments showed that the specificity and sensitivity of the proposed model were comparable to other models. CCS Concepts: • Computing methodologies → Artificial intelligence; Boosting.
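As a rough sketch of this kind of setup (generic gradient boosting over decision trees, not the authors' actual pipeline), the shape of the learning problem can be illustrated as follows; the synthetic data, feature names, and hyperparameters are placeholders:

```python
# Hedged sketch: a boosted decision-tree ensemble for multi-class
# classification of Bitcoin entities. All data here is synthetic.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((2059, 9))        # 9 engineered features per entity
y = rng.integers(0, 28, 2059)    # 28 licit/illicit categories

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = GradientBoostingClassifier(n_estimators=100, max_depth=3)
model.fit(X_tr, y_tr)
print(classification_report(y_te, model.predict(X_te), zero_division=0))
```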

4.1 The Overfitting Phenomenon
The first set of experiments aims to show the existence of the overfitting behaviour for the SAT method of learning optimal-size decision trees that classify every example in the training set. For this, we learn decision trees with our MaxSAT encoding by increasing the size of the trees, starting from 3, until we find the size that correctly classifies every example in the training set. The final decision tree obtained with the proposed MaxSAT model corresponds to the SAT model of [Narodytska et al., 2018]. We use the hold-out method to split the data into training and testing sets for all datasets used. Following [Narodytska et al., 2018], we choose (only in this experiment) three (small) splitting ratios r = {0.05, 0.1, 0.2} to generate training sets; the remaining examples are used for testing. This process is repeated 10 times using different randomisation seeds. We use the RC2 MaxSAT solver [Ignatiev et al., 2019]. In these experiments, the solver is run without a time limit until it finds the optimal solution (in terms of training accuracy).
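A hedged sketch of this hold-out protocol follows. The MaxSAT/RC2 learner is replaced here by a stand-in that grows a CART tree until it fits the training set perfectly; the dataset is a stock scikit-learn one, purely for illustration:

```python
# Hold-out protocol: small training ratios x 10 seeds; the "smallest
# perfect tree" search stands in for the paper's MaxSAT encoding.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

def smallest_perfect_tree(X_tr, y_tr):
    # Stand-in for the MaxSAT encoding: increase the size budget
    # until every training example is classified correctly.
    for n_leaves in range(2, len(y_tr) + 1):
        t = DecisionTreeClassifier(max_leaf_nodes=n_leaves, random_state=0)
        t.fit(X_tr, y_tr)
        if t.score(X_tr, y_tr) == 1.0:
            return t

for r in (0.05, 0.1, 0.2):             # training-set ratios
    for seed in range(10):             # 10 randomisation seeds
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, train_size=r, random_state=seed, stratify=y)
        tree = smallest_perfect_tree(X_tr, y_tr)
        # test accuracy typically drops as the tree is forced to be perfect
        print(r, seed, round(tree.score(X_te, y_te), 3))
```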

essghaier.fatma@gmail.com
2 IRIT, Toulouse, France, email: fargier@irit.fr
Abstract. This paper raises the question of solving multi-criteria sequential decision problems under uncertainty. It proposes to extend to possibilistic decision trees the decision rules presented in [1] for non-sequential problems. It presents a series of algorithms for this new framework: Dynamic Programming can be used and provides an optimal strategy for rules that satisfy the property of monotonicity. There is no guarantee of optimality for those that do not; hence the definition of dedicated algorithms. The paper concludes with an empirical comparison of the algorithms.

request) whether the call should be challenged or not. The performance of the policy is evaluated through the feedback of the end-users. Real-time and adaptive refinement of the protection policy requires machine learning techniques that are both: (1) adaptive, in the face of the continuous change of both the attack strategies and the legal activities, and (2) expressive, i.e. able to define concrete signatures. In particular, decision trees are known for their high expressivity, but they support only slight changes over time. Although proposals for adaptive decision trees exist ([10, 2]), we propose a more appropriate approach, similar to [3]: we extract signatures from decision trees learnt in parallel, then the signatures are combined with conflict resolution.
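The signature-extraction step can be sketched as follows (a hedged illustration on a stock dataset, not the deployed system's code; in production the trees would be learnt in parallel on separate traffic batches):

```python
# Hedged sketch: extract textual rule "signatures" from decision trees
# trained on separate data batches; conflict resolution is only indicated.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
batches = [(X[i::3], y[i::3]) for i in range(3)]   # placeholder batches

signatures = []
for Xb, yb in batches:
    tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(Xb, yb)
    signatures.append(export_text(tree))   # one tree's rules as text

# the per-tree signatures would then be merged, with conflicting
# rules resolved (e.g. by recency or majority across trees)
print(signatures[0])
```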

Gembloux Agricultural University, Gembloux, Belgium
Email: brostaux.y@fsagx.ac.be
Abstract
Random forests were introduced by Leo Breiman (2001) as a new learning algorithm, extending the capabilities of decision trees by aggregating and randomising them. We explored the effects of the introduction of noise and irrelevant variables in the training set on the learning curve of a random forest classifier, and compared them to the results of a classical decision tree algorithm inspired by Breiman's CART (1984). This study was carried out by simulating 23 artificial binary concepts presenting a wide range of complexity and dimension (4 to 10 relevant variables), adding different noise and irrelevant-variable rates to learning samples of various sizes (50 to 5000 examples). It appeared that random forests and individual decision trees have different sensitivities to those perturbation factors. The initial slope of the learning curve is more affected by irrelevant variables than by noise for both algorithms but, counterintuitively, random forests show a greater sensitivity to noise than decision trees for this parameter. Globally, average learning speed is quite similar between the two algorithms, but random forests better exploit both small and big samples: their learning curve starts lower and is not affected by the asymptotic limitation shown by single decision trees.

Accurate probability estimation is important for problems like classification and probability-based ranking. Provost and Domingos' Probability Estimation Trees (PETs) [7] turn off pruning and collapsing in C4.5 to keep some branches that may not be useful for classification but are crucial for accurate probability estimation. The final version is called C4.4. PETs also use Laplace smoothing to deal with pure nodes that contain samples from a single class. Instead of assigning a probability of 1 or 0, smoothing methods try to give a more modest estimation. Other smoothing approaches, such as m-Branch [2] and Ling & Yan's algorithm [6], have also been developed. Using a probability density estimator at each leaf is another improvement to tackle the "uniform probability distribution" problem of decision trees. Kohavi [4] proposed the Naïve Bayes Tree (NBTree), in which a naïve Bayes classifier is deployed at each leaf to produce probabilities. The intuition behind it is to take advantage of the leaf attributes A_l(l) for probability estimation, so that p(c|e_t) ≈ p(c|A_p(l), A_l(l)).
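To make the Laplace correction concrete (a standard textbook form, not the exact C4.4 code): for a leaf holding n samples, k of which belong to class c, the raw frequency k/n is replaced by (k+1)/(n+C) for C classes, so a pure leaf no longer yields probability exactly 1 or 0:

```python
# Laplace-smoothed class probabilities at a decision-tree leaf.
from collections import Counter

def leaf_probabilities(labels, classes):
    """labels: class labels of the training samples reaching the leaf."""
    counts = Counter(labels)
    n, C = len(labels), len(classes)
    return {c: (counts[c] + 1) / (n + C) for c in classes}

# a pure leaf with 8 positive samples out of 8:
print(leaf_probabilities(["+"] * 8, ["+", "-"]))   # {'+': 0.9, '-': 0.1}
```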

Abstract
Learning from uncertain data has been drawing increasing attention in recent years. In this paper, we propose a tree induction approach which can not only handle uncertain data, but furthermore reduce epistemic uncertainty by querying the most valuable uncertain instances within the learning procedure. We extend classical decision trees to the framework of belief functions to deal with a variety of uncertainties in the data. In particular, we use entropy intervals extracted from the evidential likelihood to query selected uncertain training instances when needed, in order to improve the selection of the splitting attribute. Our experiments show the good performance of the proposed active belief decision trees under different conditions.

We propose a method for the generic problem of image annotation based on random subwindow extraction and ensembles of decision trees with multiple outputs. The method is evaluated on several datasets representing various types of images (microscope imaging, photographs of natural scenes, etc.).
Image Annotation

2. Optimizing Probability Estimation
In a decision boundary-based theory, an explicit decision boundary is induced from a set of labeled samples, and an unlabeled sample s_t is categorized into class c_j if s_t falls into the decision area corresponding to c_j. However, traditional decision trees, such as C4.5 [20] and ID3 [19], have been observed to produce poor probability estimates [18]. Normally, decision trees produce probabilities by computing the class frequencies from the sample sets at the leaves. For example, assume there are 30 samples at a leaf, 20 of which are in the positive class while the others belong to the negative class. Each unlabeled sample that falls into that leaf will then be assigned the same probability estimates: p̂(+|s_t) = 20/30 ≈ 0.67 and p̂(−|s_t) = 10/30 ≈ 0.33. Equation 4 gives a formal expression.
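In symbols (a reconstruction from the worked example above, since Equation 4 itself is not part of this excerpt): if S_l denotes the set of training samples reaching the leaf l into which s_t falls, the frequency-based estimate is

p̂(c_j | s_t) = |{x ∈ S_l : class(x) = c_j}| / |S_l|,

and the Laplace-smoothed variant (cf. the PET discussion above) replaces this with (|{x ∈ S_l : class(x) = c_j}| + 1) / (|S_l| + C) for C classes.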

Keywords: classification; decision trees; rubber quality; Hevea; belief functions; EM algorithm.
1 Introduction
Learning a classifier from uncertain data necessitates an adequate modelling of this uncertainty; however, learning with uncertain data is rarely straightforward. As data uncertainty is of an epistemic nature, the standard probabilistic framework is not necessarily the best framework to deal with it. More general frameworks have therefore been proposed [1–3] that provide more adequate models for this type of uncertainty. Different classifier learning techniques [4–6] using these models have then been developed.

Key words: Machine learning, decision trees, regression trees, tree ensembles, Random forest
1 Introduction
This chapter focuses on a popular family of machine learning algorithms, called decision trees. The goal of tree-based algorithms is to learn a model, in the form of a decision tree or an ensemble of decision trees, that is able to predict the value of an output variable given the values of some input variables. Tree-based methods have been widely used to solve diverse problems in computational biology, such as DNA sequence annotation or biomarker discovery (see [1–3] for reviews). In particular, several approaches based on decision trees have been developed for the inference of gene regulatory networks (GRNs) from expression data. Decision trees have indeed several advantages that make them attractive for tackling this problem. First, they are potentially able to detect multivariate interacting effects between variables, which makes them well suited for modelling gene regulation, as the regulation of the expres-
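As a rough sketch of how tree ensembles are commonly applied to GRN inference (in the spirit of importance-based methods such as GENIE3, not any specific pipeline from this chapter), one fits a forest per target gene and reads candidate regulatory links off the variable importances; the expression matrix here is synthetic:

```python
# Hedged sketch: rank candidate regulators of each gene by random-forest
# feature importance over a (samples x genes) expression matrix.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
expression = rng.random((100, 5))          # 100 samples, 5 genes
links = []
for target in range(expression.shape[1]):
    inputs = np.delete(expression, target, axis=1)   # all other genes
    rf = RandomForestRegressor(n_estimators=100, random_state=0)
    rf.fit(inputs, expression[:, target])
    regulators = [g for g in range(expression.shape[1]) if g != target]
    for g, w in zip(regulators, rf.feature_importances_):
        links.append((g, target, w))       # candidate edge g -> target
links.sort(key=lambda e: -e[2])            # strongest putative links first
```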

main objective to help the coders and notify them whenever a secondary diagnosis is missing, relying on structured data only. Our hypothesis of fixing the primary diagnosis has helped to enhance the prediction performance.
We raised secondary issues to verify their impact on the decision tree. Concerning the granularity level, as the codification of the diagnoses belongs to a hierarchical classification, it is possible to use different levels of description: either a coarse level with 19 features (which corresponds to general chapters) or a fine level of diagnoses with 126 features (more specific chapters). We compared the performances of two decision trees, each built using a different level of diagnosis granularity. The results showed that by using the fine level of granularity we can improve all the quality measures by 5% to 10% on average, regardless of the predicted diagnosis code. The prediction power seems to be related to the precision of the medical information.

Platt's method to a multi-class context, it is thus in principle possible to build the confidence index that we need while keeping the demonstrated performance of hard classifiers.
On the basis of the Madzarov [10] and Platt [4] algorithms, we present Probabilistic Decision Trees (PDT) as an original approach to the multi-class probabilistic classification problem. The proposed PDT algorithm takes advantage of the decision tree architecture and of the classification posterior probability provided by PSVM. The PDT provides fast classification (logarithmic complexity) along with associated posterior probabilities. At each node of the PDT, an SVM classification associated with a sigmoid function is performed to estimate the probability of membership to each sub-group. A probability function is then built for each leaf, by following the path that the PDT has generated for it. In the context of automobile diagnosis, the PDT is used to generate a diagnostic decision tree in which, from a database of known incidents, the most probable failure is provided. The performance of the PDT is measured on samples from benchmark databases and on an artificial database emulating an actual car breakdown report database. These tests show that the proposed technique is suitable for the automotive problem.
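As a hedged illustration of the node-level mechanics (generic Platt-style calibration, not the authors' PDT implementation): a node's SVM score is mapped through a sigmoid to a branch probability, and a leaf's posterior is the product of the branch probabilities along its path. The data and path values below are placeholders:

```python
# One PDT-style node: an SVM whose output is calibrated to a probability
# via Platt's sigmoid P(branch | x) = 1 / (1 + exp(A*f(x) + B)).
# scikit-learn fits this sigmoid internally when probability=True.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)      # placeholder binary split

node = SVC(probability=True, random_state=0).fit(X, y)
p_branches = node.predict_proba(X[:1])[0]    # P(left | x), P(right | x)

# the posterior attached to a leaf multiplies the branch probabilities
# met along the root-to-leaf path, e.g.:
path_probs = [0.9, 0.8, 0.95]                # illustrative values
leaf_posterior = np.prod(path_probs)         # = 0.684
print(p_branches, leaf_posterior)
```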

b IRIT, UPS-CNRS, 118 route de Narbonne, 31062 Toulouse, France
c MIAT, UR 875, Université de Toulouse, INRA, F-31320 Castanet-Tolosan, France
Abstract
Possibilistic decision theory was proposed twenty years ago and has had several extensions since then. Even though appealing for its ability to handle qualitative decision problems, possibilistic decision theory suffers from an important drawback. Qualitative possibilistic utility criteria compare acts through min and max operators, which leads to a drowning effect. To overcome this lack of decision power of the theory, several refinements have been proposed. Lexicographic refinements are particularly appealing since they allow one to benefit from the Expected Utility background, while remaining qualitative. This article aims at extending lexicographic refinements to sequential decision problems, i.e., to possibilistic decision trees and possibilistic Markov decision processes, when the horizon is finite. We present two criteria that refine qualitative possibilistic utilities and provide dynamic programming algorithms for calculating lexicographically optimal policies.
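For reference, the min/max criteria alluded to here are standardly written as follows (a textbook formulation of the pessimistic and optimistic possibilistic utilities, not an equation quoted from this paper). For an act $a$, a possibility distribution $\pi$ over states $s$, and a qualitative utility $u$:

$$u_{pes}(a) = \min_{s} \max\bigl(1 - \pi(s),\, u(a,s)\bigr), \qquad u_{opt}(a) = \max_{s} \min\bigl(\pi(s),\, u(a,s)\bigr).$$

Because only min and max are applied, acts that differ anywhere other than at the state realizing the optimum receive the same score; this is the drowning effect that lexicographic refinements are designed to break.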

Ensemble Bagged Trees (EDT Bagged)
EDT Bagged is a hybrid model which combines the bagging algorithm and decision trees [44]. Among these methods, the bagging method uses bootstrap sampling to optimize the input training data for learning the decision trees [45]. This process is carried out in several steps: (i) it takes a bootstrap sample from the original training dataset, (ii) it fits a decision tree to the extracted dataset, (iii) it then finds the best-fit models with the optimized dataset, (iv) it thereafter uses each of the fitted models with the optimized dataset to predict the results, and (v) it finally averages the prediction results of all the models [46]. EDT Bagged can significantly improve the prediction accuracy, as the bootstrap aggregation used in the bagging method can reduce the variance of individual prediction methods like decision trees. EDT Bagged has been applied to solve many real-world problems, namely travel time prediction [44], disease prediction [45], and bankruptcy prediction [47]. In this study, we have used EDT Bagged to predict bubble dissolution time when manufacturing solid structures by the SLS technique.
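A minimal sketch of steps (i) to (v) (plain bootstrap aggregation of regression trees, not the authors' exact EDT Bagged configuration); the dataset and target are synthetic:

```python
# Bagging by hand: bootstrap-sample the training set, fit one decision
# tree per sample, then average the trees' predictions.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.random((300, 5))
y = X @ rng.random(5) + 0.1 * rng.normal(size=300)  # placeholder target

trees = []
for _ in range(50):
    idx = rng.integers(0, len(X), len(X))   # (i) bootstrap sample
    trees.append(DecisionTreeRegressor().fit(X[idx], y[idx]))  # (ii)-(iii)

x_new = rng.random((1, 5))
# (iv)-(v): predict with every fitted tree and average the results
prediction = np.mean([t.predict(x_new)[0] for t in trees])
print(prediction)
```

Averaging over bootstrap replicates is what reduces the variance of the otherwise high-variance single tree, which is the property the paragraph above appeals to.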

Figure 6 displays the evolution of the cutting-point parameter variance with the growing set size at the root node, for the omib database and the fuzzy class definition, for ULG decision trees and for refitted (R) and backfitted (B) SDTs. For this dataset, all the trees naturally chose the same attribute at the root node. Due to the adopted fuzzy partitioning approach, the attribute chosen at the root node of a non-backfitted soft decision tree, and its α value, will always coincide with those chosen at the root node of a CART regression tree. For this reason, the cutting-point variance is identical for CART and the refitted SDT at the root node. Once backfitted, a soft decision tree changes its thresholds in all the tree nodes, and thus also its parameter variance. One may observe from Figure 6 that a non-backfitted soft decision tree, like a CART regression tree, presents less parameter variance at the root node than a ULG decision tree. By backfitting, the parameter variance at the root node of an SDT increases with respect to the non-backfitted version. The explanation resides in the fact that, by globally optimizing, the location parameters are no longer restricted to fall in the range [0,1] and are therefore more variable with respect to the average.

at the time the decision is made. Hence we have as many rows of positive learning examples as there are comparisons made at the various decision points at each machine, as per the solution generated by the optimization module.
Finally, the decision tree induced using the learning algorithm can be applied directly to the same JSSP to validate the discovered knowledge, and as a predictive model to predict the target concept. A set of scheduling problem instances, chosen from the database in accordance with their similarity indices, is used as a test dataset for the scheduling knowledge discovered. The overall sequence of operations obtained by these rules is translated to a schedule using a schedule generator. Thus, given any two jobs, the tree will predict which job should be dispatched first, and can be thought of as a new, previously unknown rule. In addition to the prediction, decision trees and decision rules reveal insightful structural knowledge that can be used to further enhance scheduling decisions.
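As an illustrative sketch of such a pairwise dispatching rule (the three features per job and the labelling rule are placeholders, not the paper's attribute set):

```python
# Learn "should job A be dispatched before job B?" from pairwise examples.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# each row: (proc_time_A, due_date_A, remaining_A,
#            proc_time_B, due_date_B, remaining_B) -- hypothetical features
pairs = rng.random((500, 6))
# placeholder labels: dispatch the job with the shorter processing time first
first_is_A = (pairs[:, 0] < pairs[:, 3]).astype(int)

rule = DecisionTreeClassifier(max_depth=4).fit(pairs, first_is_A)
# given two concrete jobs, the tree predicts which should be dispatched first
print(rule.predict(rng.random((1, 6))))
```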
