This repository is the official implementation of Federated Learning for Internet of Things: A Federated Learning Framework for On-device Anomaly Data Detection. Read our paper here: https://arxiv.org/abs/2106.07976
Due to the heterogeneity, diversity, and personalization of IoT networks, Federated Learning (FL) has a promising future in the IoT cybersecurity field. We therefore present FedIoT, an open research platform and benchmark to facilitate FL research in the IoT field. In particular, we propose an autoencoder-based trainer that detects anomalies in IoT traffic data. By using federated learning to aggregate the locally trained models, we obtain an efficient and practical anomaly detection model for various types of devices while preserving the data privacy of each device. In addition, our platform supports three computing paradigms: 1) on-device training for IoT edge devices, 2) distributed computing, and 3) single-machine simulation, to meet algorithmic and system-level research requirements under different deployment scenarios. We hope FedIoT provides an efficient and reproducible means of developing FL implementations in the IoT field.
Check our slides here: https://docs.google.com/presentation/d/1aW0GlOhKOl35jMl1KBDjKafJcYjWB-T9fiUsbdBySd4/edit?usp=sharing
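The overview above describes autoencoder-based anomaly detection. As a rough illustration of the decision rule usually layered on top of an autoencoder, the sketch below flags a sample when its reconstruction error exceeds a threshold fitted on benign data. The threshold rule (mean plus k standard deviations), the function names, and the clipping "reconstructor" are assumptions made for this example; they are not taken from the FedIoT code.

```python
from statistics import mean, pstdev
from typing import Callable, List, Sequence

Vector = Sequence[float]


def reconstruction_error(x: Vector, x_hat: Vector) -> float:
    """Mean squared error between a sample and its reconstruction."""
    if len(x) != len(x_hat):
        raise ValueError("sample and reconstruction must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def fit_threshold(benign_errors: Sequence[float], k: float = 1.0) -> float:
    """Threshold = mean + k * std of errors on benign data (k is an assumption)."""
    return mean(benign_errors) + k * pstdev(benign_errors)


def detect(
    samples: Sequence[Vector],
    reconstruct: Callable[[Vector], Vector],
    threshold: float,
) -> List[bool]:
    """Return True for every sample whose reconstruction error exceeds the threshold."""
    return [reconstruction_error(x, reconstruct(x)) > threshold for x in samples]


def toy_reconstruct(x: Vector) -> List[float]:
    """Hypothetical stand-in for a trained autoencoder.

    It reproduces values inside the benign range [0, 1] and clips the rest,
    so out-of-range inputs are reconstructed poorly.
    """
    return [min(max(v, 0.0), 1.0) for v in x]


# Fit the threshold on benign traffic only, then score unseen samples.
benign = [[0.1, 0.2, 0.3], [0.5, 0.4, 0.9], [1.05, 0.2, 0.0]]
benign_errors = [reconstruction_error(x, toy_reconstruct(x)) for x in benign]
threshold = fit_threshold(benign_errors)
flags = detect([[0.2, 0.3, 0.4], [3.0, 0.1, 0.2]], toy_reconstruct, threshold)
print(flags)
```

In a real setup, toy_reconstruct would be replaced by the forward pass of the trained model, and the threshold would be fitted on held-out benign traffic.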
After cloning this repository with git clone, please run the following commands to install our dependencies.
conda create -n fediot python=3.7
conda activate fediot
pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt
git submodule add https://github.com/FedML-AI/FedML
cd FedML; git submodule init; git submodule update; cd ../;
For the FedML package installation, please check http://doc.fedml.ai/#/installation-distributed-computing
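After installation, a quick sanity check can confirm that the key packages are visible from the fediot environment. The snippet below is a minimal sketch using only the standard library; the helper name check_environment is hypothetical, and the module list simply mirrors the pip command above.

```python
import importlib.util
import sys
from typing import Dict, Iterable


def check_environment(modules: Iterable[str]) -> Dict[str, bool]:
    """Map each top-level module name to whether it can be found (without importing it)."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


status = check_environment(["torch", "torchvision"])
print("Python", sys.version.split()[0])
for name, found in status.items():
    print(f"{name}: {'found' if found else 'missing'}")
```

Only top-level module names should be passed, since looking up a dotted name imports its parent package first.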
Run sh download.sh under data/UCI-MLR to download the dataset.
We select the N_BaIoT data as our evaluation dataset. For details about the data, please see data_readme.md in the data folder.
- FedML: a soft repository link generated using git submodule add https://github.com/FedML-AI/FedML.
- data: provides data downloading scripts and stores the downloaded datasets. Note that FedML/data also contains research datasets, but those are used for evaluating federated optimizers (e.g., FedAvg) and platforms. FedNLP supports more advanced datasets and models.
- data_preprocessing: data loaders, partition methods, and utility functions.
- model: IoT models, for example a VAE for outlier detection.
- training: please define your own trainer.py by inheriting the base class in FedML/fedml-core/trainer/fedavg_trainer.py. Some tasks can share the same trainer.
- experiments/distributed:
  - experiments is the entry point for training. It contains experiments on different platforms. We start from distributed.
  - Every experiment integrates FOUR building blocks: FedML (federated optimizers), data_preprocessing, model, and trainer.
  - To develop new experiments, please refer to the code at experiments/distributed/main_uci_vae.py.
- experiments/Raspberry Pi:
  - This is the code for deployment on the Raspberry Pi 4b.
  - It contains two parts: main_uci_rp.py should run on the edge device, and app.py should run on the server.
  - For the detailed running setup, please see the Resberry_Pi Readme.md in experiments/Resberry_Pi.
- experiments/centralized:
  - Please provide centralized training scripts in this directory.
  - This is used to get the reference model accuracy for FL.
  - You may need to accelerate your training through distributed training on multiple GPUs and multiple machines. Please refer to the code at experiments/centralized/ae_cen_glb_test.py.
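The training entry above asks for a custom trainer that inherits from a base class. The sketch below shows the general shape of that pattern. BaseTrainer is a hypothetical stand-in defined locally, not the FedML class, and its method names are assumptions; check fedavg_trainer.py for the real interface. The "model" is a toy that learns a per-feature center, chosen only so the example runs without PyTorch.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Sequence

Params = Dict[str, List[float]]


class BaseTrainer(ABC):
    """Hypothetical stand-in for a federated base trainer (not the FedML API)."""

    def __init__(self, params: Params) -> None:
        self.params = params

    def get_model_params(self) -> Params:
        """Return a copy of the local parameters to send to the server."""
        return {k: list(v) for k, v in self.params.items()}

    def set_model_params(self, params: Params) -> None:
        """Overwrite local parameters with the global ones from the server."""
        self.params = {k: list(v) for k, v in params.items()}

    @abstractmethod
    def train(self, data: Sequence[Sequence[float]], epochs: int = 1) -> None:
        """Run local training on this client's data."""


class CenterTrainer(BaseTrainer):
    """Toy trainer: fits a per-feature center by gradient descent on squared error."""

    def __init__(self, num_features: int, lr: float = 0.5) -> None:
        super().__init__({"center": [0.0] * num_features})
        self.lr = lr

    def train(self, data: Sequence[Sequence[float]], epochs: int = 1) -> None:
        center = self.params["center"]
        for _ in range(epochs):
            for j in range(len(center)):
                # Gradient of the mean of (x_j - c_j)^2 with respect to c_j.
                grad = sum(2.0 * (center[j] - x[j]) for x in data) / len(data)
                center[j] -= self.lr * grad


trainer = CenterTrainer(num_features=2)
trainer.train([[1.0, 2.0], [3.0, 4.0]], epochs=1)
print(trainer.get_model_params())
```

Keeping parameter exchange (get/set) in the base class and only the local update in the subclass is what lets several tasks share one trainer.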
Please read the experiment section in our paper. The main function under distributed currently only performs model training; please use fl_test to evaluate the performance.
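For orientation, the sketch below computes metrics that are commonly reported for anomaly detection (accuracy, precision, and false positive rate) from binary labels. Whether fl_test reports exactly these quantities is an assumption; the function name detection_metrics and the label convention (1 = anomaly, 0 = benign) are chosen for this example only.

```python
from typing import Dict, Sequence


def detection_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, and false positive rate (1 = anomaly, 0 = benign)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when nothing is predicted anomalous.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Guard against division by zero when there are no benign samples.
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
    }


metrics = detection_metrics(
    y_true=[1, 1, 1, 0, 0, 0, 0, 1],
    y_pred=[1, 1, 0, 0, 0, 1, 0, 1],
)
print(metrics)
```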
Please note that you should change the aggregator weight to w = 1/9 for a better result. It can be changed in FedML/fedml_api/distributed/fedavg/FedAVGAggregator.py, line 77.
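To make the effect of this setting concrete, the sketch below averages client parameters with explicit weights. FedAvg normally weights each client by its share of the training samples; a fixed w = 1/9 would instead give equal weight to each of nine clients. Reading 9 as the number of clients is an assumption, and this code is an illustration only, not the contents of FedAVGAggregator.py.

```python
from typing import Dict, List, Optional, Sequence

Params = Dict[str, List[float]]


def aggregate(client_params: Sequence[Params], weights: Optional[Sequence[float]] = None) -> Params:
    """Weighted average of client parameters; defaults to uniform weights 1/n."""
    n = len(client_params)
    if n == 0:
        raise ValueError("at least one client is required")
    if weights is None:
        weights = [1.0 / n] * n  # with nine clients this is w = 1/9
    if len(weights) != n:
        raise ValueError("one weight per client is required")
    averaged: Params = {}
    for key, first in client_params[0].items():
        averaged[key] = [
            sum(w * params[key][i] for w, params in zip(weights, client_params))
            for i in range(len(first))
        ]
    return averaged


# Nine hypothetical clients, each holding a two-element parameter vector.
clients = [{"w": [float(i), 2.0 * i]} for i in range(1, 10)]
global_params = aggregate(clients)
print(global_params)
```

Passing weights proportional to local sample counts recovers the usual sample-weighted FedAvg rule.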
Please cite our FedIoT and FedML papers if they help your research. You can describe us in your paper like this: "We develop our experiments based on FedIoT [1] and FedML [2]".
@article{Zhang2021FederatedLF,
  title={Federated Learning for Internet of Things: A Federated Learning Framework for On-device Anomaly Data Detection},
  author={Tuo Zhang and Chaoyang He and Tian-Shya Ma and Mark Ma and S. Avestimehr},
  journal={ArXiv},
  year={2021},
  volume={abs/2106.07976}
}
The corresponding authors are:
Tuo Zhang [email protected]
Chaoyang He [email protected] http://chaoyanghe.com
Special Thanks to Tianhao Ma!