LLMs_SDOH_Integration

This repo contains the code for our paper "Large Language Models for Integrating Social Determinant of Health Data: A Case Study on Heart Failure 30-Day Readmission Prediction".

Requirements

Heart failure 30-day hospital readmission prediction (HF_readmission_prediction):

python 3.9
imblearn==0.0
joblib==1.2.0
numpy==1.24.4
pandas==2.0.0
pymongo==4.7.0
scikit_learn==1.4.2
shap==0.45.0
tqdm==4.65.0
xgboost==1.7.6

LLMs to annotate SDOH variables (LLM_SDOH_annotation):

python 3.9
datasets==2.11.0
huggingface_hub==0.17.3
numpy==1.24.4
pandas==2.0.0
peft==0.10.0
torch==2.0.0
tqdm==4.65.0
transformers==4.34.1

Datasets

The social determinants of health (SDOH) datasets used in this study can be found below:

Dataset	Number of SDOH variables used
NaNDA	223
AHRQ SDOHD	506

Reproducibility

1. LLM Experiments

For zero-shot and 1-shot inference of SDOH domains for AHRQ and NaNDA variables, use the commands in the LLM_SDOH_annotation/commands folder. For example, to perform one round of inference, run:

python general_LLM_inference_rel_extraction_col_type.py --base_model='meta-llama/Llama-2-7b-chat-hf' --feat_set='a' --num_shots=0 --input_data_file='INPUT_AHRQ_tract_2010-2018.csv' --output_data_file='a_zeroshot_llama7b-chat_domain_AHRQ_outputs.csv'

The arguments in this example are as follows. Language model: Llama-2-7b-chat-hf. Feature set: A (SDOH variable name). Number of shots (inference): 0 (i.e., zero-shot). Input file: AHRQ variables. Output file (optional): will be named automatically based on the other arguments.
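To run both the zero-shot and 1-shot settings without retyping the command, the invocation can be assembled programmatically. The following is a minimal sketch, not part of the repository: the flag names are copied from the example command, while the helper `build_command` and the loop over shot counts are hypothetical. The output file flag is left out because it is optional.

```python
import shlex

# Script and flag names are taken from the example command in this README.
# The helper and the loop over shot counts are illustrative assumptions.
SCRIPT = "general_LLM_inference_rel_extraction_col_type.py"


def build_command(base_model, feat_set, num_shots, input_data_file):
    """Return the argument list for one inference round."""
    return [
        "python",
        SCRIPT,
        f"--base_model={base_model}",
        f"--feat_set={feat_set}",
        f"--num_shots={num_shots}",
        f"--input_data_file={input_data_file}",
    ]


# One command per shot setting: 0 (zero-shot) and 1 (1-shot).
commands = [
    build_command(
        "meta-llama/Llama-2-7b-chat-hf",
        "a",
        shots,
        "INPUT_AHRQ_tract_2010-2018.csv",
    )
    for shots in (0, 1)
]

for cmd in commands:
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

Each list in `commands` can also be passed directly to `subprocess.run` from inside the LLM_SDOH_annotation folder, which avoids shell quoting entirely.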

2. Heart Failure (HF) Readmission Prediction

The patient dataset is unavailable for privacy reasons. However, the following commands demonstrate the steps we used to train and evaluate binary classification models (using clinical and public SDOH data):

To train binary classification models for HF 30-day hospital readmission prediction (choose the classification algorithm and features inside the file):

python bal_allfeats_nosmote_sgs_evaluate_baselines_nestKfold.py
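Because the patient data cannot be shared, the script above cannot be run as-is. Its name suggests nested K-fold cross-validation, so the following is a generic sketch of that technique on synthetic data. It is an assumption-laden illustration, not the repository's code: the logistic regression model, the hyperparameter grid, the fold counts, and the ROC AUC metric are all placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

# Synthetic, imbalanced binary data standing in for the private patient dataset.
X, y = make_classification(
    n_samples=300, n_features=10, weights=[0.8, 0.2], random_state=0
)

# Inner loop: selects hyperparameters on each outer training split.
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
# Outer loop: estimates performance on data never seen during tuning.
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

# Placeholder model and grid; class_weight="balanced" reweights the minority
# class instead of oversampling it.
search = GridSearchCV(
    LogisticRegression(max_iter=1000, class_weight="balanced"),
    param_grid={"C": [0.1, 1.0, 10.0]},
    scoring="roc_auc",
    cv=inner_cv,
)

# cross_val_score refits the whole grid search inside every outer fold.
scores = cross_val_score(search, X, y, cv=outer_cv, scoring="roc_auc")
print(f"Nested CV ROC AUC: {scores.mean():.3f} (std {scores.std():.3f})")
```

Keeping the hyperparameter search inside the outer loop means the reported score is not biased by tuning on the evaluation folds.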

To analyze results of HF models:

python sgs_analyze_baseline.py
