2. Examples: anomaly detection
Fedot.Industrial offers a comprehensive set of implemented approaches for an anomaly detection task.
The anomaly detection task generally covers one-class classification, point anomaly detection (outlier detection), and changepoint detection, in which the exact moment when the data begins to exhibit abnormal behavior can be pinpointed.
Currently, Fedot.Industrial supports:
- 'stat_detector' - statistical detector
- 'arima_detector' - ARIMA fault detector
- 'iforest_detector' - isolation forest detector
- 'conv_ae_detector' - convolutional autoencoder detector
- 'lstm_ae_detector' - LSTM autoencoder detector
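These labels are the keys of the VALID_LINEAR_DETECTION_PIPELINE constant used in the pipeline example at the end of this section; a minimal sketch for listing the available detector labels programmatically (assuming, as that example suggests, that the constant is a dictionary keyed by detector label):
from fedot_ind.core.repository.constanst_repository import VALID_LINEAR_DETECTION_PIPELINE

# each key is a detector label, each value is the node list of the corresponding linear pipeline
for label in VALID_LINEAR_DETECTION_PIPELINE:
    print(label)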
The Waico development team has collected 34 datasets containing both point and group anomalies to test and compare anomaly detection algorithms. They plan to expand their SKAB (Skoltech Anomaly Benchmark) repository to 300 industrial datasets, which would make it one of the most comprehensive resources for anomaly detection.
For a randomly selected file from the SKAB datasets, the values of the outlier anomaly and the changepoints are as follows (a non-zero value at a particular point indicates an anomaly at that point for the outlier detection task or marks the beginning of a cluster of anomalies for the changepoint prediction task):
import pandas as pd
from sklearn.model_selection import train_test_split

# one file of the SKAB benchmark; the last two columns are the 'anomaly' and 'changepoint' labels
df = pd.read_csv('https://raw.githubusercontent.com/waico/SKAB/master/data/valve1/1.csv',
                 index_col='datetime', sep=';', parse_dates=True)
# keep the temporal order of the series when splitting it into train and test parts
train_data, test_data = train_test_split(df, train_size=0.9, shuffle=False)
# pack each part as a (features, target) tuple, using the 'anomaly' column as the target
train_data = train_data.iloc[:, :-2].values, train_data.iloc[:, -2].values
test_data = test_data.iloc[:, :-2].values, test_data.iloc[:, -2].values
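Before fitting a detector, it is worth checking how many points in the file are actually labeled; a minimal sketch, assuming the SKAB column names 'anomaly' and 'changepoint':
# non-zero values mark outliers ('anomaly') and the beginnings of anomalous segments ('changepoint')
print(df['anomaly'].value_counts())
print(df['changepoint'].value_counts())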
If we wish to define an anomaly detector, we must specify a set of parameters for the FedotIndustrial object:
api_params = dict(
    problem='classification',  # general task (since AD is one-class classification at its core)
    industrial_strategy='anomaly_detection',  # defines a set of methods appropriate for the task
    industrial_task_params={
        'detection_window': 10,  # analogous to forecast_length; length of the detection sliding window
        'data_type': 'time_series',  # data type definition
    },
    metric='accuracy',  # metric for the Fedot API
    pop_size=10,  # initial population size for EvoOptimizer
    timeout=1,  # time for model design (in minutes)
    with_tuning=False,  # whether to apply hyperparameter tuning to the model
    n_jobs=2,  # number of jobs for parallelization
)
from fedot_ind.api.main import FedotIndustrial

detector = FedotIndustrial(**api_params)
detector.fit(train_data)

# binary anomaly labels and class probabilities for the hold-out part of the series
labels = detector.predict(test_data)
probs = detector.predict_proba(test_data)

metrics = detector.get_metrics(target=test_data[1],
                               rounding_order=3,
                               metric_names=('nab', 'accuracy'))
result_dict = dict(industrial_model=detector, labels=labels, metrics=metrics)
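The predicted labels can also be compared with the hold-out target directly; a minimal sketch using scikit-learn, assuming predict returns one label per test observation, as implied by the get_metrics call above:
import numpy as np
from sklearn.metrics import accuracy_score

# flatten both arrays and compute plain accuracy as a quick sanity check
y_true = np.ravel(test_data[1])
y_pred = np.ravel(labels)
print('hold-out accuracy:', accuracy_score(y_true, y_pred))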
If one wishes to evaluate a particular implemented linear pipeline on its own, the following steps can be taken:
from fedot_ind.core.architecture.pipelines.abstract_pipeline import AbstractPipeline
from fedot_ind.core.repository.constanst_repository import VALID_LINEAR_DETECTION_PIPELINE

pipeline_label = 'iforest_detector'  # for example
node_list = VALID_LINEAR_DETECTION_PIPELINE[pipeline_label]
data_dict = dict(benchmark='valve1', dataset='1')
result = AbstractPipeline(task='classification',
                          task_params=dict(industrial_strategy='anomaly_detection',
                                           detection_window=10)).evaluate_pipeline(node_list, data_dict)
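In the same way, every linear detection pipeline listed above can be evaluated on the same file for a rough comparison; a minimal sketch, assuming the keys of VALID_LINEAR_DETECTION_PIPELINE match the detector labels:
results = {}
for label, node_list in VALID_LINEAR_DETECTION_PIPELINE.items():
    pipeline = AbstractPipeline(task='classification',
                                task_params=dict(industrial_strategy='anomaly_detection',
                                                 detection_window=10))
    # store the evaluation result of each linear detection pipeline under its label
    results[label] = pipeline.evaluate_pipeline(node_list, data_dict)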