NoisyGL

Official code for NoisyGL: A Comprehensive Benchmark for Graph Neural Networks under Label Noise, accepted at NeurIPS 2024. NoisyGL is a comprehensive benchmark for Graph Neural Networks under label noise (GLN), a family of robust Graph Neural Network (GNN) methods designed to maintain performance when the training labels are noisy.

Overview of the Benchmark

NoisyGL provides a fair and comprehensive platform for evaluating existing learning with label noise (LLN) and GLN methods and for facilitating future GLN research.

[Figure: timeline]

Why NoisyGL?

NoisyGL offers the following features:

  1. A unified data loader module for diverse datasets. You can customize the configuration file of the dataset (located in config/_dataset) to modify data splitting and preprocessing strategies.
  2. Generic noise injection schemes. These schemes (utils.labelnoise), widely used in previous studies, let you evaluate the robustness of each method under several kinds of label noise.
  3. Generic Base_predictor class. NoisyGL provides a generic implementation template and API for different GLN predictors (predictors.Base_predictor). You can develop your own method by overriding specific methods of this class.
  4. Integrated hyperparameter optimization tool. NoisyGL integrates Neural Network Intelligence (NNI) provided by Microsoft (hyperparam_opt.py). You can easily optimize and update hyperparameters for each method based on the instructions in the README.

These features give you both convenience and freedom when using the library. You can modify the implementation details of specific methods, or add new modules to implement your own methods within the framework.

Installation

Note: NoisyGL depends on PyTorch, PyTorch Geometric, PyTorch Sparse and PyTorch Cluster. To streamline the installation, NoisyGL does NOT install these libraries for you. Please install them by following each project's official installation instructions before running NoisyGL.

Required Dependencies:

  • Python 3.11+
  • torch>=2.1.0
  • pyg>=2.5.0
  • torch_sparse>=0.6.18
  • torch_cluster>=1.6.2
  • pandas
  • scipy
  • scikit-learn
  • ruamel
  • ruamel.yaml
  • nni
  • matplotlib
  • numpy
  • xlsxwriter

Quick Start

Run comprehensive benchmark.

python total_exp.py --runs 10 --methods gcn gin --datasets cora citeseer pubmed --noise_type clean uniform pair --noise_rate 0.1 0.2 --device cuda:0 --seed 3000

The command above tests two methods, 'gcn' and 'gin', on three datasets, 'cora', 'citeseer', and 'pubmed', under the listed types and rates of label noise. Each experiment runs 10 times, and the combined results are saved under ./log, named with the current timestamp. You can customize the combination of methods, datasets, noise types, and noise rates by changing the corresponding arguments.
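To get a feel for the size of the grid these arguments describe, the sketch below enumerates the combinations. It assumes a plain Cartesian product of the four argument lists; how total_exp.py treats 'clean' in combination with a noise rate is not stated here, so the actual number of experiments may be smaller. The function name `enumerate_grid` is illustrative and not part of NoisyGL.

```python
from itertools import product

# Argument values taken from the example command above.
methods = ["gcn", "gin"]
datasets = ["cora", "citeseer", "pubmed"]
noise_types = ["clean", "uniform", "pair"]
noise_rates = [0.1, 0.2]
runs = 10


def enumerate_grid(methods, datasets, noise_types, noise_rates):
    """Return every (method, dataset, noise_type, noise_rate) combination.

    Assumption: a full Cartesian product, which may overcount 'clean' settings.
    """
    return list(product(methods, datasets, noise_types, noise_rates))


grid = enumerate_grid(methods, datasets, noise_types, noise_rates)
print(len(grid), "settings,", len(grid) * runs, "training runs in total")
```

Under that assumption, adding one more method or dataset grows the run count multiplicatively, which is worth keeping in mind before launching a large sweep.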

Run single experiment.

python single_exp.py --method gcn --data cora --noise_type uniform --noise_rate 0.1 --device cuda:0 --seed 3000

This command runs a single experiment in debug mode and is mainly intended for debugging. Detailed experiment information is printed to the terminal, which helps you locate problems.

When designing your customized predictor, you can add code blocks that only execute in debug mode in the following way:

if self.conf.training['debug']:
    print("break point")
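As a self-contained illustration of the same pattern, the sketch below uses a toy class in place of a real predictor. `ToyPredictor`, `train_epoch`, and `debug_log` are hypothetical names invented for this example and are not part of NoisyGL; only the `self.conf.training['debug']` lookup comes from the snippet above.

```python
from types import SimpleNamespace


class ToyPredictor:
    """Hypothetical stand-in for a custom predictor; not NoisyGL's Base_predictor."""

    def __init__(self, conf):
        self.conf = conf
        self.debug_log = []

    def train_epoch(self, epoch, loss):
        # Work that always runs would go here.
        if self.conf.training['debug']:
            # Extra diagnostics that only run in debug mode.
            msg = f"epoch {epoch}: loss={loss:.4f}"
            self.debug_log.append(msg)
            print(msg)
        return loss


# A minimal config object exposing the same access pattern as the snippet above.
conf = SimpleNamespace(training={'debug': True})
predictor = ToyPredictor(conf)
predictor.train_epoch(0, 1.2345)
```

Keeping diagnostics behind the flag means the same predictor code can run unchanged in the full benchmark without flooding the logs.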

Hyperparameter optimization.

python hyperparam_opt.py --method gcn --data cora --noise_type uniform --noise_rate 0.1 --device cuda:0 --max_trial_number 20 --trial_concurrency 4 --port 8081 --update_config True

Running the command above starts an NNI manager at http://localhost:8081 and automatically runs 20 HPO trials, each of which calls 'single_exp.py' with different hyperparameters. After all HPO trials have finished, a new config file with the optimized hyperparameters overwrites the original one at "./config/gcn/gcn_cora.yaml". You can optimize hyperparameters for different methods on various datasets and noise types by changing the corresponding arguments.
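The idea of overlaying a trial's sampled hyperparameters onto a base configuration can be sketched as follows. This is not the code in hyperparam_opt.py: the helper `apply_trial_params`, the dotted-key convention, and the config keys shown are all assumptions made for illustration. In an actual NNI trial, the sampled values would typically be obtained with `nni.get_next_parameter()`.

```python
import copy


def apply_trial_params(base_conf, trial_params):
    """Return a copy of base_conf with trial_params merged in.

    Keys in trial_params are dotted paths such as "training.lr"
    (a convention assumed for this sketch).
    """
    conf = copy.deepcopy(base_conf)
    for dotted_key, value in trial_params.items():
        *parents, leaf = dotted_key.split(".")
        node = conf
        for name in parents:
            # Create intermediate sections on demand.
            node = node.setdefault(name, {})
        node[leaf] = value
    return conf


# Hypothetical base configuration; the real YAML layout may differ.
base_conf = {
    "model": {"n_hidden": 64},
    "training": {"lr": 0.01, "weight_decay": 5e-4},
}
# In an NNI trial, this dict would come from nni.get_next_parameter().
trial_params = {"training.lr": 0.005, "model.n_hidden": 128}

tuned = apply_trial_params(base_conf, trial_params)
print(tuned)
```

Working on a deep copy leaves the base configuration untouched, so every trial starts from the same defaults.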

Methods available: gcn, smodel, forward, backward, coteaching, sce, jocor, apl, dgnn, cp, nrgnn, unionnet, rtgnn, clnode, cgnn, pignn, rncgln, crgnn, lcat

Datasets available: cora, citeseer, pubmed, amazoncom, amazonpho, dblp, blogcatalog, flickr, amazon-ratings, roman-empire

| Dataset | # Nodes | # Edges | # Feat. | # Classes | Homophily | Avg. degree |
| --- | --- | --- | --- | --- | --- | --- |
| Cora | 2,708 | 5,278 | 1,433 | 7 | 0.81 | 3.90 |
| Citeseer | 3,327 | 4,552 | 3,703 | 6 | 0.74 | 2.74 |
| Pubmed | 19,717 | 44,324 | 500 | 3 | 0.80 | 4.50 |
| Amazon-Computers | 13,752 | 491,722 | 767 | 10 | 0.78 | 35.8 |
| Amazon-Photos | 7,650 | 238,162 | 745 | 8 | 0.83 | 31.1 |
| DBLP | 17,716 | 105,734 | 1,639 | 4 | 0.83 | 5.97 |
| BlogCatalog | 5,196 | 343,486 | 8,189 | 6 | 0.40 | 66.1 |
| Flickr | 7,575 | 239,738 | 12,047 | 9 | 0.24 | 63.3 |
| Amazon-ratings | 24,492 | 93,050 | 300 | 5 | 0.38 | 7.60 |
| Roman-empire | 22,662 | 32,927 | 300 | 18 | 0.05 | 2.90 |

Noise types available: clean, pair, uniform, random (new)
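For orientation, uniform and pair noise are commonly defined in the label noise literature as follows: uniform noise flips a label to any other class with equal probability, while pair noise flips it to one fixed partner class. The sketch below is a minimal pure-Python illustration of those definitions, not the implementation in utils.labelnoise. The function names, and the choice of "next class index" as the partner class, are assumptions; the 'random' type is not covered.

```python
import random


def uniform_noise(labels, n_classes, rate, rng):
    """Flip each label with probability `rate` to a different class chosen uniformly."""
    noisy = []
    for y in labels:
        if rng.random() < rate:
            # An offset in [1, n_classes - 1] modulo n_classes never maps y to itself.
            y = (y + rng.randrange(1, n_classes)) % n_classes
        noisy.append(y)
    return noisy


def pair_noise(labels, n_classes, rate, rng):
    """Flip each label with probability `rate` to its partner class.

    Assumption: the partner of class c is (c + 1) % n_classes.
    """
    noisy = []
    for y in labels:
        if rng.random() < rate:
            y = (y + 1) % n_classes
        noisy.append(y)
    return noisy


rng = random.Random(3000)
clean = [rng.randrange(7) for _ in range(1000)]
for name, inject in [("uniform", uniform_noise), ("pair", pair_noise)]:
    noisy = inject(clean, n_classes=7, rate=0.2, rng=rng)
    flipped = sum(a != b for a, b in zip(clean, noisy)) / len(clean)
    print(name, "observed flip rate:", flipped)
```

Because a flipped label never equals the original in either scheme, the observed flip rate should be close to the requested noise rate for a large enough label set.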

Performance overview

Test accuracy of LLN and GLN methods on the DBLP dataset under 30% pair and uniform noise, respectively (10 runs).

[Figure: performance overview on DBLP]

Citation

If our work helps your research, please cite: NoisyGL: A Comprehensive Benchmark for Graph Neural Networks under Label Noise

@article{wang2024noisygl,
      title={NoisyGL: A Comprehensive Benchmark for Graph Neural Networks under Label Noise}, 
      author={Zhonghao Wang and Danyu Sun and Sheng Zhou and Haobo Wang and Jiapei Fan and Longtao Huang and Jiajun Bu},
      year={2024},
      eprint={2406.04299},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2406.04299}, 
}

Reference

LLN:

| ID | Paper | Method | Conference/Journal |
| --- | --- | --- | --- |
| 1 | Training deep neural-networks using a noise adaptation layer | S-model | ICLR 2017 |
| 2 | Making deep neural networks robust to label noise: A loss correction approach | Forward | CVPR 2017 |
| 3 | Making deep neural networks robust to label noise: A loss correction approach | Backward | CVPR 2017 |
| 4 | Co-teaching: Robust training of deep neural networks with extremely noisy labels | Co-teaching | NeurIPS 2018 |
| 5 | Symmetric Cross Entropy for Robust Learning With Noisy Labels | SCE | ICCV 2019 |
| 6 | Combating Noisy Labels by Agreement: A Joint Training Method with Co-Regularization | JoCoR | CVPR 2020 |
| 7 | Normalized Loss Functions for Deep Learning with Noisy Labels | APL | ICML 2020 |

GLN:

| ID | Paper | Method | Conference/Journal |
| --- | --- | --- | --- |
| 1 | Learning Graph Neural Networks with Noisy Labels | D-GNN | ICLR 2019 |
| 2 | Adversarial label-flipping attack and defense for graph neural networks | LafAK/CP | ICDM 2020 |
| 3 | NRGNN: Learning a Label Noise Resistant Graph Neural Network on Sparsely and Noisily Labeled Graphs | NRGNN | KDD 2021 |
| 4 | Unified Robust Training for Graph Neural Networks Against Label Noise | Union-Net | PAKDD 2021 |
| 5 | Robust training of graph neural networks via noise governance | RTGNN | WSDM 2023 |
| 6 | CLNode: Curriculum Learning for Node Classification | CLNode | WSDM 2023 |
| 7 | Learning on Graphs under Label Noise | CGNN | ICASSP 2023 |
| 8 | Noise-robust Graph Learning by Estimating and Leveraging Pairwise Interactions | PIGNN | TMLR 2023 |
| 9 | Robust Node Classification on Graph Data with Graph and Label Noise | RNCGLN | AAAI 2024 |
| 10 | Contrastive learning of graphs under label noise | CRGNN | Neural Netw. 2024 |