- Mousavi, Seyed Muhammad Hossein, Vincent Charles, and Tatiana Gherman. "An Evolutionary Pentagon Support Vector Finder Method." Expert Systems with Applications 150 (2020): 113284.
- https://www.sciencedirect.com/science/article/pii/S0957417420301093
This method is designed to improve classification performance by reducing data size, removing outliers,
and identifying support vectors using evolutionary algorithms and geometric computations.
Below is a detailed explanation of the method's steps.
-
Load the Dataset:
- Begin with a dataset that contains features and labels.
- Split the dataset into training and testing subsets to evaluate the method on unseen data.
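As a minimal sketch of this step, assuming scikit-learn and the Iris benchmark mentioned in the validation section (the split ratio and random seed are illustrative choices, not the paper's settings):

```python
# Load a benchmark dataset and hold out a test split for evaluation.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)  # features and labels
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)
```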
-
Evolutionary Clustering (ABC + FCM):
- Use the Artificial Bee Colony (ABC) algorithm for optimization; bees play the role of data points during the clustering process.
- Fuzzy C-Means (FCM) is used for soft clustering, assigning each data point a degree of membership in every cluster.
- Replace the Euclidean distance in FCM with the Manhattan distance to improve clustering performance.
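A minimal sketch of the FCM update with the Manhattan distance substitution is shown below; the ABC optimization wrapper is omitted, and the fuzzifier `m`, iteration count, and random initialization are illustrative assumptions rather than the paper's exact settings:

```python
import numpy as np

def fcm_manhattan(X, n_clusters, m=2.0, n_iter=100, seed=0):
    """Fuzzy C-Means using Manhattan (L1) distance instead of Euclidean.

    Returns cluster centers and the membership matrix U (n_samples x n_clusters).
    Note: the full PSV method wraps clustering in an ABC optimization loop,
    which is not reproduced in this sketch.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random membership matrix with rows summing to 1.
    U = rng.random((n, n_clusters))
    U /= U.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        Um = U ** m
        # Fuzzy-weighted cluster centers.
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Manhattan (L1) distance from every sample to every center.
        dist = np.abs(X[:, None, :] - centers[None, :, :]).sum(axis=2)
        dist = np.fmax(dist, 1e-10)  # avoid division by zero
        # Standard FCM membership update.
        inv = dist ** (-2.0 / (m - 1))
        U = inv / inv.sum(axis=1, keepdims=True)
    return centers, U

# X_train comes from the loading sketch above.
centers, U = fcm_manhattan(X_train, n_clusters=3)
hard_assignments = U.argmax(axis=1)  # crisp cluster index per sample
```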
-
Label Clusters with K-Nearest Neighbors (K-NN):
- After clustering, the data points in each cluster need to be labeled for classification.
- Use K-NN to assign labels to clusters based on the proximity of their centers to the original training data.
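One possible sketch of this labeling step, reusing the centers and memberships from the clustering sketch above; the value of k is an assumption for illustration:

```python
from sklearn.neighbors import KNeighborsClassifier

# Fit K-NN on the original training data, then label each cluster center
# by the classes of its nearest training samples.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
center_labels = knn.predict(centers)

# Propagate each center's label to the points assigned to that cluster.
cluster_point_labels = center_labels[hard_assignments]
```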
-
Outlier Removal Using Pentagon Area and Angles:
- Identify outliers by constructing a pentagon:
- Select one sample from the current class and four samples from other classes.
- Compute the area of the pentagon using the coordinates of its vertices.
- Calculate the internal angles of the pentagon.
- Apply thresholds to the computed area and angles; a sample whose pentagon area or angles fall outside these thresholds is flagged as an outlier and removed.
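A sketch of the geometric checks, assuming 2-D feature vectors (e.g., the first two features) and the shoelace formula for area; the vertex ordering, the example coordinates, and the thresholds are illustrative assumptions, since the paper tunes its thresholds per dataset:

```python
import numpy as np

def pentagon_area(vertices):
    """Area of a polygon given ordered 2-D vertices (shoelace formula)."""
    x, y = vertices[:, 0], vertices[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def internal_angles(vertices):
    """Internal angles (degrees) at each vertex of an ordered convex polygon."""
    n = len(vertices)
    angles = []
    for i in range(n):
        prev_v = vertices[(i - 1) % n] - vertices[i]
        next_v = vertices[(i + 1) % n] - vertices[i]
        cos_a = np.dot(prev_v, next_v) / (
            np.linalg.norm(prev_v) * np.linalg.norm(next_v) + 1e-12
        )
        angles.append(np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0))))
    return np.array(angles)

# Hypothetical pentagon: one candidate sample from the current class plus
# four samples from other classes (2-D coordinates assumed).
pentagon = np.array([[1.0, 2.0], [3.0, 2.5], [4.0, 4.0], [2.5, 5.5], [0.5, 4.0]])
area = pentagon_area(pentagon)
angles = internal_angles(pentagon)

# Illustrative thresholds only; the actual values are dataset-dependent.
AREA_MAX, ANGLE_MIN = 10.0, 30.0
is_outlier = area > AREA_MAX or angles.min() < ANGLE_MIN
```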
-
Final Classification:
- Use a Support Vector Machine (SVM) for classification.
- Train the SVM on the reduced dataset (after clustering and outlier removal).
- Compare the classification performance on:
- The original dataset.
- The reduced dataset (processed by the PSV method).
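A sketch of this comparison is below; the RBF kernel is an assumption, and X_reduced / y_reduced stand in for the output of the clustering and outlier-removal steps:

```python
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Placeholder reduced set: in the full method this would be the clustered
# data with outliers removed; here it reuses earlier sketch variables.
X_reduced, y_reduced = X_train, cluster_point_labels

# SVM trained on the original training data.
svm_full = SVC(kernel="rbf").fit(X_train, y_train)
acc_full = accuracy_score(y_test, svm_full.predict(X_test))

# SVM trained on the reduced (PSV-processed) data.
svm_reduced = SVC(kernel="rbf").fit(X_reduced, y_reduced)
acc_reduced = accuracy_score(y_test, svm_reduced.predict(X_test))

print(f"accuracy (original dataset): {acc_full:.3f}")
print(f"accuracy (reduced dataset):  {acc_reduced:.3f}")
```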
-
Validation:
- Perform classification on benchmark datasets like Iris, Wine, and EEG Eye State.
- Compare metrics such as accuracy, precision, recall, and runtime.
- Analyze improvements in classification speed and accuracy.
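A short sketch of collecting these metrics, reusing the models and splits from the previous sketches; macro averaging and wall-clock timing are illustrative choices:

```python
import time
from sklearn.metrics import precision_score, recall_score

# Time the training step on the reduced data and report precision/recall.
start = time.perf_counter()
svm_reduced.fit(X_reduced, y_reduced)
train_time = time.perf_counter() - start

pred = svm_reduced.predict(X_test)
precision = precision_score(y_test, pred, average="macro")
recall = recall_score(y_test, pred, average="macro")
print(f"precision={precision:.3f}  recall={recall:.3f}  train_time={train_time:.4f}s")
```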
-
Key Advantages:
- Reduces computational load by removing unnecessary data points (outliers).
- Retains or improves classification accuracy on certain datasets.
- Incorporates geometric and evolutionary computations for robust data processing.