Skip to content

claudiocapanema/Multi-model-Federated-Learning

Repository files navigation

Multi-model Federated Learning

This project extends PFLlib (Personalized Federated Learning Algorithm Library) to train multiple models simultaneously.

License: GPL v2 arXiv

Figure 1: An Example for FedAvg. You can create a scenario using generate_DATA.py and run an algorithm using main.py, clientNAME.py, and serverNAME.py.

We expose this user-friendly algorithm library (with an integrated evaluation platform) for beginners who intend to start federated learning (FL) study.

  • 34 traditional FL (tFL) or personalized FL (pFL) algorithms, 3 scenarios, and 20 datasets.

  • Some experimental results are avalible here.

  • Refer to this guide to learn how to use it.

  • This library can simulate scenarios using the 4-layer CNN on Cifar100 for 500 clients on one NVIDIA GeForce RTX 3090 GPU card with only 5.08GB GPU memory cost.

  • PFLlib primarily focuses on data (statistical) heterogeneity. For algorithms and an evaluation platform that address both data and model heterogeneity, please refer to our extended project Heterogeneous Federated Learning (HtFL).

  • As we strive to meet diverse user demands, frequent updates to the project may alter default settings and scenario creation codes, affecting experimental results.

  • Closed issues may help you a lot.

  • When submitting pull requests, please provide sufficient instructions and examples in the comment box.

The origin of the statistical heterogeneity phenomenon is the personalization of users, who generate non-IID (not Independent and Identically Distributed) and unbalanced data. With statistical heterogeneity existing in the FL scenario, a myriad of approaches have been proposed to crack this hard nut. In contrast, the personalized FL (pFL) may take advantage of the statistically heterogeneous data to learn the personalized model for each user.

Thanks to @Stonesjtu, this library can also record the GPU memory usage for the model. By using the package opacus, we introduce DP (differential privacy) into this library (please refer to ./system/flcore/clients/clientavg.py for example). Following FedCG, we also introduce the DLG (Deep Leakage from Gradients) attack and PSNR (Peak Signal-to-Noise Ratio) metric to evaluate the privacy-preserving ability of tFL/pFL algorithms (please refer to ./system/flcore/servers/serveravg.py for example). Now we can train on some clients and evaluate performance on other new clients by setting args.num_new_clients in ./system/main.py. Note that not all the tFL/pFL algorithms support this feature.

Citation

@article{zhang2023pfllib,
  title={PFLlib: Personalized Federated Learning Algorithm Library},
  author={Zhang, Jianqing and Liu, Yang and Hua, Yang and Wang, Hao and Song, Tao and Xue, Zhengui and Ma, Ruhui and Cao, Jian},
  journal={arXiv preprint arXiv:2312.04992},
  year={2023}
}

Algorithms with code (updating)

Traditional FL (tFL)

Personalized FL (pFL)

Datasets and scenarios (updating)

For the label skew scenario, we introduce 14 famous datasets: MNIST, EMNIST, Fashion-MNIST, Cifar10, Cifar100, AG News, Sogou News, Tiny-ImageNet, Country211, Flowers102, GTSRB, Shakespeare, and Stanford Cars, they can be easy split into IID and non-IID version. Since some codes for generating datasets such as splitting are the same for all datasets, we move these codes into ./dataset/utils/dataset_utils.py. In the non-IID scenario, 2 situations exist. The first one is the pathological non-IID scenario, the second one is the practical non-IID scenario. In the pathological non-IID scenario, for example, the data on each client only contains the specific number of labels (maybe only 2 labels), though the data on all clients contains 10 labels such as the MNIST dataset. In the practical non-IID scenario, Dirichlet distribution is utilized (please refer to this paper for details). We can input balance for the iid scenario, where the data are uniformly distributed.

For the feature shift scenario, we use 3 datasets that are widely used in Domain Adaptation: Amazon Review (fetch raw data from this site), Digit5 (fetch raw data from this site), and DomainNet.

For the real-world (or IoT) scenario, we also introduce 3 naturally separated datasets: Omniglot (20 clients, 50 labels), HAR (Human Activity Recognition) (30 clients, 6 labels), PAMAP2 (9 clients, 12 labels). For the details of datasets and FL algorithms in IoT, please refer to my FL-IoT repo.

If you need another data set, just write another code to download it and then use the utils.

Simulate multiple datasets

  • MNIST

    cd ./dataset
    # python generate_MNIST.py iid - - # for iid and unbalanced scenario
    # python generate_MNIST.py iid balance - # for iid and balanced scenario
    # python generate_MNIST.py noniid - pat # for pathological noniid and unbalanced scenario
    python generate_MNIST.py noniid - dir # for practical noniid and unbalanced scenario
    # python generate_MNIST.py noniid - exdir # for Extended Dirichlet strategy 
    
  • CIFAR-10

    python generate_Cifar10.py noniid - dir
    

The output of python generate_MNIST.py noniid - dir

Number of classes: 10
Client 0         Size of data: 2630      Labels:  [0 1 4 5 7 8 9]
                 Samples of labels:  [(0, 140), (1, 890), (4, 1), (5, 319), (7, 29), (8, 1067), (9, 184)]
--------------------------------------------------
Client 1         Size of data: 499       Labels:  [0 2 5 6 8 9]
                 Samples of labels:  [(0, 5), (2, 27), (5, 19), (6, 335), (8, 6), (9, 107)]
--------------------------------------------------
Client 2         Size of data: 1630      Labels:  [0 3 6 9]
                 Samples of labels:  [(0, 3), (3, 143), (6, 1461), (9, 23)]
--------------------------------------------------
Show more
Client 3         Size of data: 2541      Labels:  [0 4 7 8]
                 Samples of labels:  [(0, 155), (4, 1), (7, 2381), (8, 4)]
--------------------------------------------------
Client 4         Size of data: 1917      Labels:  [0 1 3 5 6 8 9]
                 Samples of labels:  [(0, 71), (1, 13), (3, 207), (5, 1129), (6, 6), (8, 40), (9, 451)]
--------------------------------------------------
Client 5         Size of data: 6189      Labels:  [1 3 4 8 9]
                 Samples of labels:  [(1, 38), (3, 1), (4, 39), (8, 25), (9, 6086)]
--------------------------------------------------
Client 6         Size of data: 1256      Labels:  [1 2 3 6 8 9]
                 Samples of labels:  [(1, 873), (2, 176), (3, 46), (6, 42), (8, 13), (9, 106)]
--------------------------------------------------
Client 7         Size of data: 1269      Labels:  [1 2 3 5 7 8]
                 Samples of labels:  [(1, 21), (2, 5), (3, 11), (5, 787), (7, 4), (8, 441)]
--------------------------------------------------
Client 8         Size of data: 3600      Labels:  [0 1]
                 Samples of labels:  [(0, 1), (1, 3599)]
--------------------------------------------------
Client 9         Size of data: 4006      Labels:  [0 1 2 4 6]
                 Samples of labels:  [(0, 633), (1, 1997), (2, 89), (4, 519), (6, 768)]
--------------------------------------------------
Client 10        Size of data: 3116      Labels:  [0 1 2 3 4 5]
                 Samples of labels:  [(0, 920), (1, 2), (2, 1450), (3, 513), (4, 134), (5, 97)]
--------------------------------------------------
Client 11        Size of data: 3772      Labels:  [2 3 5]
                 Samples of labels:  [(2, 159), (3, 3055), (5, 558)]
--------------------------------------------------
Client 12        Size of data: 3613      Labels:  [0 1 2 5]
                 Samples of labels:  [(0, 8), (1, 180), (2, 3277), (5, 148)]
--------------------------------------------------
Client 13        Size of data: 2134      Labels:  [1 2 4 5 7]
                 Samples of labels:  [(1, 237), (2, 343), (4, 6), (5, 453), (7, 1095)]
--------------------------------------------------
Client 14        Size of data: 5730      Labels:  [5 7]
                 Samples of labels:  [(5, 2719), (7, 3011)]
--------------------------------------------------
Client 15        Size of data: 5448      Labels:  [0 3 5 6 7 8]
                 Samples of labels:  [(0, 31), (3, 1785), (5, 16), (6, 4), (7, 756), (8, 2856)]
--------------------------------------------------
Client 16        Size of data: 3628      Labels:  [0]
                 Samples of labels:  [(0, 3628)]
--------------------------------------------------
Client 17        Size of data: 5653      Labels:  [1 2 3 4 5 7 8]
                 Samples of labels:  [(1, 26), (2, 1463), (3, 1379), (4, 335), (5, 60), (7, 17), (8, 2373)]
--------------------------------------------------
Client 18        Size of data: 5266      Labels:  [0 5 6]
                 Samples of labels:  [(0, 998), (5, 8), (6, 4260)]
--------------------------------------------------
Client 19        Size of data: 6103      Labels:  [0 1 2 3 4 9]
                 Samples of labels:  [(0, 310), (1, 1), (2, 1), (3, 1), (4, 5789), (9, 1)]
--------------------------------------------------
Total number of samples: 70000
The number of train samples: [1972, 374, 1222, 1905, 1437, 4641, 942, 951, 2700, 3004, 2337, 2829, 2709, 1600, 4297, 4086, 2721, 4239, 3949, 4577]
The number of test samples: [658, 125, 408, 636, 480, 1548, 314, 318, 900, 1002, 779, 943, 904, 534, 1433, 1362, 907, 1414, 1317, 1526]

Saving to disk.

Finish generating dataset.

Models

Environments

Install CUDA.

Install conda and activate conda.

conda env create -f env_cuda_latest.yaml # You may need to downgrade the torch using pip to match the CUDA version

How to start simulating (examples for Multi-model FedAvg)

  • Create proper environments (see Environments).

  • Download this project to an appropriate location using git.

    git clone https://github.com/TsingZ0/PFLlib.git
  • Build evaluation scenarios (see Datasets and scenarios (updating)).

  • Run evaluation:

    cd ./system
    python main.py -data EMNIST -data Cifar10 -al 0.1 -al 5.0 -m dnn -m cnn -algo FedAvg -gr 10 -did 0 -jr 0.4 # using the MNIST and Cifar-10 datasets, the FedAvg algorithm, and the one DNN and one CNN models

Note: It is preferable to tune algorithm-specific hyper-parameters before using any algorithm on a new machine.

Practical situations

If you need to simulate FL under practical situations, which includes client dropout, slow trainers, slow senders, and network TTL, you can set the following parameters to realize it.

  • -cdr: The dropout rate for total clients. The selected clients will randomly drop at each training round.
  • -tsr and -ssr: The rates for slow trainers and slow senders among all clients. Once a client is selected as a "slow trainer"/"slow sender", for example, it will always train/send slower than the original one.
  • -tth: The threshold for network TTL (ms).

Easy to extend

It is easy to add new algorithms and datasets to this library.

  • To add a new dataset into this library, all you need to do is write the download code and use the utils which is similar to ./dataset/generate_MNIST.py (you can also consider it as the template).

  • To add a new algorithm, you can utilize the class Server and class Client, which are wrote in ./system/flcore/servers/serverbase.py and ./system/flcore/clients/clientbase.py, respectively.

  • To add a new model, just add it into ./system/flcore/trainmodel/models.py.

  • If you have a new optimizer while training, please add it into ./system/flcore/optimizers/fedoptimizer.py

  • The evaluation platform is also convenient for users to build a new platform for specific applications, such as our FL-IoT and HtFL.

Citing

f FedPredict has been useful to you, please cite our papers: IEEE DCOSS-IoT 2023 and IEEE DOCSS-IoT 2024(to appear). The BibTeX is presented as follows:

@inproceedings{capanema2023fedpredict,
  title={FedPredict: Combining Global and Local Parameters in the Prediction Step of Federated Learning},
  author={Capanema, Cl{\'a}udio GS and de Souza, Allan M and Silva, Fabr{\'\i}cio A and Villas, Leandro A and Loureiro, Antonio AF},
  booktitle={2023 19th International Conference on Distributed Computing in Smart Systems and the Internet of Things (DCOSS-IoT)},
  pages={17--24},
  year={2023},
  doi={https://doi.org/10.1109/DCOSS-IoT58021.2023.00012},
  organization={IEEE}
}
@inproceedings{capanema2024modular,
  title={A Modular Plugin for Concept Drift in Federated Learning},
  author={Capanema, Cl{\'a}udio GS and de Souza, Joahannes B D da Costa, Fabr{\'\i}cio A and Villas, Leandro A and Loureiro, Antonio AF Loureiro},
  booktitle={2024 20th International Conference on Distributed Computing in Smart Systems and the Internet of Things (DCOSS-IoT)},
  pubstate={inpress},
  year={2024},
  organization={IEEE}
}

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages