en:iot-reloaded:classification_models (revision 2024/12/10 23:38, current) by pczekalski
====== Decision Tree-based Classification Models ======

===== Introduction =====

Classification assigns a class mark to a given object, indicating that the object belongs to the selected class or group. In contrast to clustering, the classes must pre-exist; in many cases, clustering is a prior step to classification. Classification may be understood slightly differently in different contexts; however, in the context of this book, it describes the process of assigning marks of pre-existing classes to objects depending on their features.

Classification is used in almost all domains of modern data analysis, including medicine, signal processing, pattern recognition, and many others.

===== Interpretation of the model output =====

The classification process consists of two steps: first, an existing data sample is used to train the classification model; then, in the second step, the model is used to classify unseen objects, thereby predicting to which class each object belongs. As with any other prediction, in classification, the predicted class may differ from the object's true class, so the quality of the model's output must be assessed.

Depending on a particular output, several cases might be identified:
  * True positive (TP) – the object belongs to the class and is classified as a class member. **Example:** a faulty sensor is classified as faulty.
  * False positive (FP) – the object does not belong to the class but is classified as a class member. **Example:** a healthy sensor is classified as faulty.
  * True negative (TN) – the object does not belong to the class and is classified as not being a member. **Example:** a healthy sensor is classified as healthy.
  * False negative (FN) – the object belongs to the class but is classified as not belonging to it. **Example:** a faulty sensor is classified as healthy.

While training the model and counting the number of training samples falling into the mentioned cases, it is possible to describe its accuracy mathematically. Here are the most commonly used statistics:
  * Sensitivity (true positive rate) = TP/(TP+FN)
  * Specificity (true negative rate) = TN/(TN+FP)
  * Positive predictive value = TP/(TP+FP)
  * Negative predictive value = TN/(TN+FN)
  * Accuracy = (TP+TN)/(TP+TN+FP+FN)
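The statistics above can be computed directly from the four counts. The following sketch illustrates this; the counts (TP = 40, FP = 5, TN = 45, FN = 10) are made-up numbers used only for demonstration:

```python
def classification_stats(tp, fp, tn, fn):
    """Compute the accuracy statistics listed above from the four counts."""
    return {
        "sensitivity": tp / (tp + fn),                # true positive rate
        "specificity": tn / (tn + fp),                # true negative rate
        "ppv": tp / (tp + fp),                        # positive predictive value
        "npv": tn / (tn + fn),                        # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Made-up confusion-matrix counts, for illustration only
stats = classification_stats(tp=40, fp=5, tn=45, fn=10)
print(stats["sensitivity"], stats["specificity"], stats["accuracy"])  # → 0.8 0.9 0.85
```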

===== Training the models =====

The classification model is trained using the initial sample data, which is split into training and testing subsamples. Usually, the training is done using the following steps:
  - The sample is split into training and testing subsamples.
  - The training subsample is used to train the model.
  - The test subsample is used to acquire the accuracy statistics described earlier.
  - Steps 1 – 3 are repeated several times (usually at least 10 – 25) to acquire average model statistics.
The average statistics are used to describe the model.
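The steps above can be sketched as a simple loop. In this sketch, the toy dataset (a hypothetical IoT sensor-fault problem) and the threshold "model" are illustrative assumptions, standing in for real data and a real classifier:

```python
import random

# Toy dataset for a hypothetical sensor-fault classifier: readings
# above 30 are labelled faulty.  Both the data and the threshold
# "model" below are illustrative assumptions, not from the text.
data = [(t, t > 30) for t in range(60)]

def train(train_set):
    # "Training" picks the decision threshold minimising training error,
    # standing in for fitting a real classification model.
    best_thr, best_acc = 0, -1.0
    for thr in range(60):
        acc = sum((t > thr) == label for t, label in train_set) / len(train_set)
        if acc > best_acc:
            best_thr, best_acc = thr, acc
    return best_thr

def accuracy(thr, test_set):
    return sum((t > thr) == label for t, label in test_set) / len(test_set)

random.seed(42)
accuracies = []
for _ in range(10):                      # steps 1-3 repeated 10 times
    sample = data[:]
    random.shuffle(sample)
    cut = int(0.8 * len(sample))         # 80 % training, 20 % testing
    accuracies.append(accuracy(train(sample[:cut]), sample[cut:]))

avg = sum(accuracies) / len(accuracies)  # average model statistic
print(f"average test accuracy over 10 runs: {avg:.2f}")
```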

The model's estimated performance therefore depends on how the initial sample is split into training and testing subsamples.

Unfortunately, in practice, the available data is often scarce, and a single simple split may waste it or produce misleading statistics. The most common splitting strategies, described below, are random sampling, k-folds and one-out cross-validation.

===== Random sample =====

<figure Randomsample>
{{ :
<caption>Random sample</caption>
</figure>

Most of the data is used for training in the random sample case (figure {{ref>Randomsample}}): in every repetition, a randomly selected subset of the sample is held out for testing, while the rest is used for training. Because the split is random, repeating the procedure several times and averaging the statistics gives a more reliable estimate of the model's performance.
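A single random split can be sketched as follows; the 100-sample size and the 80/20 ratio are illustrative choices, not values from the text:

```python
import random

random.seed(7)
indices = list(range(100))            # indices of 100 hypothetical samples
random.shuffle(indices)               # randomise the order before splitting
n_test = int(0.2 * len(indices))      # hold out 20 % for testing
test_idx, train_idx = indices[:n_test], indices[n_test:]
print(len(train_idx), len(test_idx))  # → 80 20
```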

===== K-folds =====

<figure K-folds>
{{ :
<caption>K-folds</caption>
</figure>

This approach splits the sample into k smaller sets called folds (figure {{ref>K-folds}}). Then, for each of the k folds:
  * The model is trained using the other k-1 folds (figure {{ref>K-folds}}, the folds marked as training).
  * The model's performance is then measured on the remaining fold, which was not used for training.
The overall performance for the k-fold cross-validation is the average of the individual performances computed for each fold. It requires extra computing but respects data scarcity, which is why it is used in practical applications.
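Forming the folds can be sketched as below; the 20-sample size and k = 5 are illustrative values chosen for the demonstration:

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    fold_size = n_samples // k
    indices = list(range(n_samples))
    for i in range(k):
        test_idx = indices[i * fold_size:(i + 1) * fold_size]   # one fold for testing
        train_idx = indices[:i * fold_size] + indices[(i + 1) * fold_size:]  # k-1 folds
        yield train_idx, test_idx

# With 20 samples and k = 5, each split tests on 4 samples
# and trains on the remaining 16.
for train_idx, test_idx in k_fold_indices(20, 5):
    print(len(train_idx), len(test_idx))  # → 16 4 (five times)
```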

===== One out =====

<figure One_out>
{{ :
<caption>One out</caption>
</figure>

This approach splits the sample into smaller sets in the same way as the previous methods described here (figure {{ref>One_out}}), but each testing set holds exactly one sample:
  * The model is trained using n-1 samples, and only one sample is used for testing the model's performance.
  * The overall performance for the one-out cross-validation is the average of the individual performances computed for each split. It requires extra computing but respects data scarcity, which is why it is used in practical applications.
This method requires many iterations (one per sample) due to the limitations of the testing set.
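The one-out (leave-one-out) splits can be sketched as follows; n = 5 is an illustrative sample size:

```python
def leave_one_out(n_samples):
    """Yield (train_indices, test_index) pairs, one split per sample."""
    indices = list(range(n_samples))
    for i in indices:
        # Train on all samples except the i-th, which is the sole test sample
        yield indices[:i] + indices[i + 1:], i

# With n = 5 the model is trained five times on four samples each,
# and every sample is used for testing exactly once.
splits = list(leave_one_out(5))
print(splits[0])  # → ([1, 2, 3, 4], 0)
```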

<WRAP excludefrompdf>
Within the following sub-chapters, the following classification models are discussed:

  * [[en:
  * [[en:
</WRAP>