
Before a machine learning algorithm can be used to make predictions, it must first learn: the algorithm must be trained on many data examples before it can predict anything. The number of examples fed to the algorithm is typically in the thousands or even millions. Once the machine learning algorithm has learned from the training features, it can be used to make forecasts on new data. In a classification model like ours, where the goal is to predict whether a patient has cancer or not, the algorithm attempts to sort data into classes after learning the patterns hidden in the features. This chapter introduces the machine learning algorithms used in this research.

4.3.1 Decision Trees

Decision tree learning is a data mining method that maps observations of input data to conclusions about the data's target value, using a decision tree as a predictive model. A classification tree is a tree model whose target takes a finite set of values, while a regression tree is a decision tree whose target takes continuous values. A decision tree does not express a decision explicitly; rather, the resulting classification tree can be used as an input to decision making. In a decision tree model, a leaf represents a class label, and a branch represents a conjunction of features that leads to a class label.

XGBoost is a decision-tree-based ensemble machine learning algorithm that uses gradient boosting. Gradient boosting is a supervised learning technique that combines the estimates of a series of simpler, weaker models (decision trees) to predict a target variable. It is used for better speed and performance in settings where a single decision tree would be a weak performer.
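
As an illustration, the following is a minimal sketch of a gradient-boosted tree classifier using the xgboost Python package. The scikit-learn breast cancer dataset is used here only as a stand-in for the study's data, and all hyperparameter values are illustrative assumptions, not the configuration used in this research.

    # Minimal sketch: gradient boosting over decision trees with xgboost.
    from xgboost import XGBClassifier
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split

    X, y = load_breast_cancer(return_X_y=True)  # stand-in binary dataset
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Each of the n_estimators shallow trees is a weak learner; every new
    # tree in the sequence tries to correct the errors of the trees so far.
    model = XGBClassifier(n_estimators=100, max_depth=3, learning_rate=0.1)
    model.fit(X_train, y_train)
    print("test accuracy:", model.score(X_test, y_test))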

4.3.2 Random Forests

A tree may be trained by splitting the source data set into subsets based on attribute-value tests. To obtain the complete decision tree model, this procedure is repeated on each subset until the subset at a node contains only one value, or until splitting no longer adds value to the predictions. This technique, known as recursive partitioning, is the most widely used approach for training decision trees. Many variants of the straightforward decision tree combine several decision trees: Random Forests, Rotation Forests, and Bagged Decision Trees are examples of such algorithms, and there are also different algorithms for implementing the individual trees. Decision trees are quite easy to understand and reason with, they require comparatively little data preparation, and, even on large datasets, the necessary computing resources are quite low.
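
A minimal sketch of a Random Forest, using scikit-learn and the same stand-in dataset as above, is shown below; the number of trees is an illustrative assumption.

    # Minimal sketch: an ensemble of recursively partitioned trees.
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import cross_val_score

    X, y = load_breast_cancer(return_X_y=True)  # stand-in binary dataset

    # Each of the 100 trees is grown by recursive partitioning on a
    # bootstrap sample of the data; the forest averages their votes.
    forest = RandomForestClassifier(n_estimators=100, random_state=0)
    print("cv accuracy:", cross_val_score(forest, X, y, cv=5).mean())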

4.3.3 Support Vector Machines

Support vector machines (SVMs) are a supervised learning algorithm that provides an alternative view of logistic regression, the most basic classification algorithm. A support vector machine looks for a model that separates the classes with the same amount of margin on both sides; the samples that lie on the margin are referred to as support vectors. In SVMs, three kernels are commonly used: linear, radial basis function (RBF), and polynomial.
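
A minimal sketch comparing the three commonly used kernels with scikit-learn follows; the dataset and cross-validation setup are illustrative assumptions. Feature scaling is included because SVM margins are sensitive to feature magnitudes.

    # Minimal sketch: maximum-margin classification with three kernels.
    from sklearn.svm import SVC
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import make_pipeline
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import cross_val_score

    X, y = load_breast_cancer(return_X_y=True)  # stand-in binary dataset

    # The support vectors are the training samples lying on the margin.
    for kernel in ("linear", "rbf", "poly"):
        clf = make_pipeline(StandardScaler(), SVC(kernel=kernel))
        print(kernel, round(cross_val_score(clf, X, y, cv=5).mean(), 3))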

4.3.4 Naive Bayes

Naive Bayes classifiers are straightforward probabilistic classifiers based on Bayes' theorem, with the naive assumption that the features are independent of one another given the class. At an abstract level, given a vector representing some features, naive Bayes assigns a probability to every possible outcome (also known as a class); it is a conditional probability model. Based on Bayes' theorem, a credible and computable model of all the required probabilities can be built, and this probability model can then be used to create a classifier. The naive Bayes classifier normally combines the probability model with a decision rule, which determines which hypothesis should be selected. The most common rule is to choose the class with the greatest posterior probability, known as the maximum a posteriori (MAP) rule. Simplicity is one of the advantages of naive Bayes classifiers; they also converge faster than other models when the conditional independence assumptions are fully satisfied. One drawback of conditional independence is that the model cannot appropriately capture feature-to-feature relationships.
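
As a sketch, the Gaussian variant of naive Bayes in scikit-learn is shown below, again on the stand-in dataset; the train/test split is an illustrative assumption.

    # Minimal sketch: a naive Bayes classifier with the MAP decision rule.
    from sklearn.naive_bayes import GaussianNB
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split

    X, y = load_breast_cancer(return_X_y=True)  # stand-in binary dataset
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # GaussianNB treats each feature as conditionally independent given the
    # class; predict() applies the MAP rule, i.e. it returns the class with
    # the greatest posterior probability.
    nb = GaussianNB().fit(X_train, y_train)
    print("test accuracy:", nb.score(X_test, y_test))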

4.3.5 K-Nearest Neighbors

The K-Nearest Neighbors (KNN) algorithm is a supervised learning algorithm that classifies a record by looking at the classes of its K nearest neighbors in the training data. While the algorithm is being trained, the input data and their classes are stored. There are a variety of techniques for determining the distance between records; Euclidean distance can be used with continuous data.

For discrete variables, another metric, such as the Hamming distance, can be used. The most prevalent distance measures are the following (a small sketch computing them appears after the list):

- Euclidean distance
- Squared Euclidean distance
- Manhattan distance
- Chessboard (Chebyshev) distance
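
The sketch below computes the four listed distances between two feature vectors with NumPy; the vectors themselves are arbitrary illustrative values.

    # Minimal sketch: the four distance measures listed above.
    import numpy as np

    a = np.array([1.0, 2.0, 3.0])  # illustrative feature vectors
    b = np.array([4.0, 0.0, 3.0])

    euclidean = np.sqrt(np.sum((a - b) ** 2))
    squared   = np.sum((a - b) ** 2)        # squared Euclidean
    manhattan = np.sum(np.abs(a - b))
    chebyshev = np.max(np.abs(a - b))       # chessboard distance
    print(euclidean, squared, manhattan, chebyshev)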

Increasing the value of k reduces the impact of noise on the prediction. Large values of k, on the other hand, mean that the boundaries between classes become less distinct. In general, the presence of noise reduces the algorithm's accuracy considerably. Irrelevant features are another issue that may limit the algorithm's precision. There are a variety of methods for selecting and scaling features to maximize accuracy; one such method is to use evolutionary algorithms.
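
A minimal sketch illustrating the effect of k with scikit-learn is shown below; the dataset, metric, and values of k are illustrative assumptions.

    # Minimal sketch: KNN classification for several values of k.
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import cross_val_score

    X, y = load_breast_cancer(return_X_y=True)  # stand-in binary dataset

    # A larger k averages over more neighbours, damping noise but
    # blurring the boundary between the classes.
    for k in (1, 5, 15):
        knn = KNeighborsClassifier(n_neighbors=k, metric="euclidean")
        print(k, round(cross_val_score(knn, X, y, cv=5).mean(), 3))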

4.3.6 Multilayer Perceptron (Neural Network)

The multilayer perceptron is one kind of artificial neural network. Artificial neural networks are based on a simplified model of how the human brain works. They consist of layers of nodes (also called neurons), each with its own set of input and output values. A node is turned on by an activation function, which can be triggered in a variety of ways by a combination of inputs. When its activation function fires, the neuron sends its output signal through its outgoing connections; this in turn activates the activation functions in the next layer, and so on until the network output is obtained.

A multilayer perceptron is a feedforward network made up of multiple layers, each of which is connected only to the next. Every neuron in a layer has a non-linear activation function, loosely modeling the way neurons in the biological brain are stimulated. A multilayer perceptron has at least three layers: an input layer, an output layer, and one or more hidden layers; a network with more than one hidden layer is commonly referred to as a deep neural network. A multilayer perceptron is trained using backpropagation. At the start of the training stage, all of the neurons' weights are set to default values; the weights are then adjusted based on the prediction error on each training example.
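
As a sketch, scikit-learn's MLPClassifier trains such a network with backpropagation; the hidden layer size and other settings below are illustrative assumptions, not the architecture used in this research.

    # Minimal sketch: a multilayer perceptron trained by backpropagation.
    from sklearn.neural_network import MLPClassifier
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import make_pipeline
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split

    X, y = load_breast_cancer(return_X_y=True)  # stand-in binary dataset
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # One hidden layer of 32 neurons with a non-linear (ReLU) activation;
    # weights start at random initial values and are adjusted from the
    # prediction error on the training data.
    mlp = make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
    )
    mlp.fit(X_train, y_train)
    print("test accuracy:", mlp.score(X_test, y_test))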

4.3.7 Artificial Neural Network

The artificial neural network (ANN) is a data processing model based on the biological neuron system. It is made up of a large number of strongly interconnected computing elements, known as neurons, that work together to solve problems. It processes data non-linearly and in parallel through all of its nodes. A neural network is a complex, adaptive system: it is adaptive in that it can change the weights of its inputs and thereby change its internal structure. Neural networks were created to solve problems that are simple for humans but complex for computers, such as recognizing pictures of cats and dogs or recognizing handwritten digits. Such problems are described by the term pattern recognition.

Artificial neural networks are divided into two types: feedforward and feedback neural networks. A feedforward neural network is non-recursive: the neurons in a layer are linked only to neurons in the next layer and do not form cycles, so signals can pass in only one direction, toward the output layer. Feedback neural networks contain cycles; by introducing loops into the network, signals can propagate in both directions, and the network's behavior may change over time because of these feedback loops. Feedback neural networks are also known as recurrent neural networks.

4.3.8 Recurrent Neural Network

Recurrent neural networks (RNNs) are more difficult to understand. They save the output of their processing nodes and feed it back into the model, rather than passing information in one direction only. In this way the model learns to predict the outcome of a layer. Each node in an RNN serves as a memory cell, allowing computation to carry over from one step to the next.

During backpropagation, if the network's forecast is incorrect, the network adjusts itself toward the right prediction. An RNN captures the sequential information in the input data. The parameters of an RNN are shared across time steps, which is generally referred to as parameter sharing; as a result, there are fewer parameters to train and the computing cost is smaller.
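
The following NumPy sketch shows the forward pass of a single recurrent cell and makes the parameter sharing explicit; all sizes and weight values are illustrative assumptions, and training (backpropagation through time) is omitted.

    # Minimal sketch: forward pass of a recurrent cell with shared weights.
    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_hidden, T = 4, 8, 5                  # illustrative sizes
    W_x = rng.normal(size=(n_hidden, n_in))      # input-to-hidden weights
    W_h = rng.normal(size=(n_hidden, n_hidden))  # hidden-to-hidden weights
    b = np.zeros(n_hidden)

    xs = rng.normal(size=(T, n_in))  # a toy input sequence of T steps
    h = np.zeros(n_hidden)           # hidden state acts as the memory cell

    # The same W_x, W_h and b are reused at every time step (parameter
    # sharing), and the previous hidden state is fed back into the model.
    for x_t in xs:
        h = np.tanh(W_x @ x_t + W_h @ h + b)
    print(h)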
