Fix glem example #9903

Merged: 3 commits merged into pyg-team:master from fix/glem-example on Jan 7, 2025
Conversation

@xnuohz (Contributor) commented on Dec 29, 2024

Closes #9899.

@xnuohz (Contributor, Author) commented on Dec 29, 2024

Namespace(gpu=0, num_runs=10, num_em_iters=1, dataset='arxiv', pl_ratio=0.5, hf_model='prajjwal1/bert-tiny', gnn_model='SAGE', gnn_hidden_channels=256, gnn_num_layers=3, gat_heads=4, lm_batch_size=256, gnn_batch_size=1024, external_pred_path=None, alpha=0.5, beta=0.5, lm_epochs=10, gnn_epochs=50, gnn_lr=0.002, lm_lr=0.001, patience=3, verbose=False, em_order='lm', lm_use_lora=False, token_on_disk=True, out_dir='output/', train_without_ext_pred=True)
Running on: NVIDIA GeForce RTX 3090
/home/ubuntu/Softwares/anaconda3/envs/pyg-dev/lib/python3.9/site-packages/ogb/nodeproppred/dataset_pyg.py:69: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  self.data, self.slices = torch.load(self.processed_paths[0])
/home/ubuntu/Projects/pytorch_geometric/torch_geometric/data/in_memory_dataset.py:300: UserWarning: It is not recommended to directly access the internal storage format `data` of an 'InMemoryDataset'. If you are absolutely certain what you are doing, access the internal storage via `InMemoryDataset._data` instead to suppress this warning. Alternatively, you can access stacked individual attributes of every graph via `dataset.{attr_name}`.
  warnings.warn(msg)
Processing...
Done!
Found tokenized file, loading may take several minutes...
40 ['node-feat.csv.gz', 'node-label.csv.gz', 'ogbn-arxiv.csv', 'num-edge-list.csv.gz', 'num-node-list.csv.gz', 'node-gpt-response.csv.gz', 'edge.csv.gz', 'node_year.csv.gz', 'node-text.csv.gz']
train_idx: 136411, gold_idx: 90941, pseudo labels ratio: 0.5, 0.49999450192982264
Building language model dataloader...-->done
GPU memory usage -- data to gpu: 0.10 GB
build GNN dataloader(GraphSAGE NeighborLoader)--># GNN Params: 217640
2024-12-29 20:10:39.539031: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-12-29 20:10:39.556333: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-12-29 20:10:39.556357: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-12-29 20:10:39.556369: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-12-29 20:10:39.559845: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-12-29 20:10:39.939306: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at prajjwal1/bert-tiny and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
# LM Params: 4391080
pretraining gnn to generate pseudo labels
Epoch: 01 Loss: 2.1609 Approx. Train: 0.4124
Epoch: 02 Loss: 1.5087 Approx. Train: 0.5615
Epoch: 03 Loss: 1.3920 Approx. Train: 0.5874
Epoch: 04 Loss: 1.3226 Approx. Train: 0.6054
Epoch: 05 Loss: 1.2771 Approx. Train: 0.6165
Train: 0.6071, Val: 0.5839
Epoch: 06 Loss: 1.2425 Approx. Train: 0.6260
Train: 0.6161, Val: 0.5913
Epoch: 07 Loss: 1.2123 Approx. Train: 0.6328
Train: 0.6206, Val: 0.5975
Epoch: 08 Loss: 1.1876 Approx. Train: 0.6383
Train: 0.6222, Val: 0.5898
Epoch: 09 Loss: 1.1627 Approx. Train: 0.6455
Train: 0.6310, Val: 0.6027
Epoch: 10 Loss: 1.1414 Approx. Train: 0.6518
Train: 0.6303, Val: 0.6014
Epoch: 11 Loss: 1.1195 Approx. Train: 0.6568
Train: 0.6427, Val: 0.5998
Epoch: 12 Loss: 1.0970 Approx. Train: 0.6620
Train: 0.6413, Val: 0.6035
Epoch: 13 Loss: 1.0788 Approx. Train: 0.6693
Train: 0.6499, Val: 0.6049
Epoch: 14 Loss: 1.0633 Approx. Train: 0.6713
Train: 0.6530, Val: 0.6076
Epoch: 15 Loss: 1.0447 Approx. Train: 0.6768
Train: 0.6590, Val: 0.6068
Pretrain Early stopped by Epoch: 15
Pretrain gnn time: 12.69s
Saved predictions to output/preds/arxiv/gnn_pretrain.pt
Pretraining acc: 0.6590, Val: 0.6068, Test: 0.5470
EM iteration: 1, EM phase: lm
Move lm model from cpu memory
Epoch 01 Loss: 1.5445 Approx. Train: 0.5880
Epoch 02 Loss: 1.0698 Approx. Train: 0.6829
Epoch 03 Loss: 0.8852 Approx. Train: 0.7028
Epoch 04 Loss: 0.7309 Approx. Train: 0.7183
Epoch 05 Loss: 0.6036 Approx. Train: 0.7326
Train: 0.8627, Val: 0.6461,
Epoch 06 Loss: 0.4984 Approx. Train: 0.7450
Train: 0.8854, Val: 0.6458,
Epoch 07 Loss: 0.4191 Approx. Train: 0.7582
Train: 0.9061, Val: 0.6488,
Epoch 08 Loss: 0.3559 Approx. Train: 0.7688
Train: 0.9240, Val: 0.6469,
Epoch 09 Loss: 0.3080 Approx. Train: 0.7767
Train: 0.9327, Val: 0.6338,
Epoch 10 Loss: 0.2675 Approx. Train: 0.7852
Train: 0.9408, Val: 0.6358,
Early stopped by Epoch: 10,                             Best acc: 0.6488472767542535
EM iteration: 2, EM phase: gnn
Move gnn model from cpu memory
Epoch: 01 Loss: 0.8052 Approx. Train: 0.6347
Epoch: 02 Loss: 0.7703 Approx. Train: 0.6367
Epoch: 03 Loss: 0.7489 Approx. Train: 0.6395
Epoch: 04 Loss: 0.7351 Approx. Train: 0.6409
Epoch: 05 Loss: 0.7227 Approx. Train: 0.6441
Train: 0.6630, Val: 0.6099,
Epoch: 06 Loss: 0.7121 Approx. Train: 0.6454
Train: 0.6626, Val: 0.6030,
Epoch: 07 Loss: 0.7043 Approx. Train: 0.6474
Train: 0.6655, Val: 0.6085,
Epoch: 08 Loss: 0.6893 Approx. Train: 0.6496
Train: 0.6652, Val: 0.6050,
Epoch: 09 Loss: 0.6791 Approx. Train: 0.6517
Train: 0.6744, Val: 0.6033,
Early stopped by Epoch: 9,                             Best acc: 0.6098526796201215
Best GNN validation acc: 0.6098526796201215,LM validation acc: 0.6488472767542535
============================
Best test acc: 0.5541633232516512, model: lm
Total running time: 0.08 hours
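
Note that the `torch.load` FutureWarning near the top of the log is raised by OGB's dataset loader, not by this example. As a minimal sketch, the pattern the warning itself recommends looks like the following; the file path is hypothetical, and `add_safe_globals` is only needed if the saved file contains non-tensor Python objects:

```python
import torch

# Current behavior flagged by the warning: full pickle unpickling is allowed.
obj = torch.load("processed/data.pt", weights_only=False)  # hypothetical path

# Direction recommended by the warning: restrict unpickling to tensors and
# explicitly allowlisted types. Any custom classes stored in the file must be
# registered first (SomeCustomClass is a placeholder, not a real PyG/OGB class).
# torch.serialization.add_safe_globals([SomeCustomClass])
obj = torch.load("processed/data.pt", weights_only=True)
```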

@xnuohz (Contributor, Author) commented on Dec 29, 2024

cc @puririshi98 @akihironitta

@puririshi98 (Contributor) left a comment

LGTM, thanks for catching this

@puririshi98 merged commit cb424a6 into pyg-team:master on Jan 7, 2025
16 of 17 checks passed
@xnuohz deleted the fix/glem-example branch on January 8, 2025 at 01:20
Development

Successfully merging this pull request may close these issues.

Run failed when train_without_ext_pred=True in glem example