pydtr is a Python library for implementing dynamic treatment regimes (DTR).
A DTR is a paradigm that attempts to select optimal treatments adaptively for individual patients.
Pydtr enables you to implement DTR methods easily by using sklearn-based interfaces.
Method | Single binary treatment | Multiple treatments | Multinomial treatment | Continuous treatment | Modeling flexibility | Interpretability |
---|---|---|---|---|---|---|
IqLearnReg (with sklearn) | ✅ | ✅ | ✅ (with pipeline) | | ✅ (with arbitrary regression models) | |
IqLearnReg (with statsmodels) | ✅ | ✅ | ✅ | | limited to OLS | ✅ (with confidence intervals) |
GEstimation | WIP | WIP | WIP | WIP | WIP |
IqLearnReg refers to the regression-based method of iterative Q-learning.
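To make the idea concrete, here is a minimal sketch of two-stage Q-learning by backward induction, written with plain NumPy and ordinary least squares. It is not pydtr's internal code: the helper names (`fit_ols`, `stage1_design`, `stage2_design`, `best_action`), the simulated data, and the choice of pseudo-outcome (stage-1 outcome plus the best predicted stage-2 value) are all assumptions made for illustration.

```python
import numpy as np


def fit_ols(X, y):
    """Return least-squares coefficients for y ~ X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def stage2_design(l1, a1, y1, a2):
    """Design matrix for the stage-2 Q-function (includes a Y1 x A2 interaction)."""
    return np.column_stack([np.ones_like(l1), l1, a1, y1, a2, y1 * a2])


def stage1_design(l1, a1):
    """Design matrix for the stage-1 Q-function (includes an L1 x A1 interaction)."""
    return np.column_stack([np.ones_like(l1), l1, a1, l1 * a1])


def best_action(q_by_action):
    """Return the index of the best action and its Q-value for each row."""
    q = np.column_stack(q_by_action)
    return q.argmax(axis=1), q.max(axis=1)


# Simulated data (hypothetical): treating helps at stage 1 when L1 > 0
# and at stage 2 when Y1 > 0.
rng = np.random.default_rng(0)
n = 500
l1 = rng.normal(size=n)
a1 = rng.integers(0, 2, size=n).astype(float)
y1 = l1 * (2 * a1 - 1) + rng.normal(scale=0.1, size=n)
a2 = rng.integers(0, 2, size=n).astype(float)
y2 = y1 * (2 * a2 - 1) + rng.normal(scale=0.1, size=n)

# Stage 2 (last stage): regress the observed outcome on history and action,
# then evaluate the fitted Q-function under each candidate action.
beta2 = fit_ols(stage2_design(l1, a1, y1, a2), y2)
q2 = [stage2_design(l1, a1, y1, np.full(n, a)) @ beta2 for a in (0.0, 1.0)]
opt_a2, v2 = best_action(q2)

# Stage 1: the regression target is a pseudo-outcome that adds the best
# achievable stage-2 value to the stage-1 outcome.
pseudo_outcome = y1 + v2
beta1 = fit_ols(stage1_design(l1, a1), pseudo_outcome)
q1 = [stage1_design(l1, np.full(n, a)) @ beta1 for a in (0.0, 1.0)]
opt_a1, _ = best_action(q1)
```

Here `opt_a1` and `opt_a2` hold the estimated optimal action (0 or 1) per row. Working backward from the last stage is what lets the stage-1 regression account for the value of acting optimally later.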
When the independent variables include categorical variables and you use a sklearn model as the regression function, you need to encode the categorical variables before fitting the model.
We recommend encoding categorical variables with category_encoders and combining the encoders with the sklearn model using sklearn.pipeline.
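As a rough illustration of combining an encoder with a regressor, the sketch below builds a pipeline on a hypothetical toy dataframe (`df_cat`). To stay within scikit-learn alone, it swaps in `sklearn.preprocessing.OneHotEncoder` wrapped in a `ColumnTransformer` instead of category_encoders. Encoders from category_encoders follow the same fit/transform convention and could take the encoder's place.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: "L1" is categorical, "A1" is a binary treatment.
df_cat = pd.DataFrame({
    "L1": ["a", "b", "c", "a", "b", "c"],
    "A1": [0, 1, 0, 1, 0, 1],
    "Y1": [0.0, 1.0, 0.5, 1.5, 0.2, 0.9],
})

# One-hot encode the categorical column and pass the other columns through.
encode = ColumnTransformer(
    transformers=[("onehot", OneHotEncoder(handle_unknown="ignore"), ["L1"])],
    remainder="passthrough",
)

# Chain the encoder and the regressor so both are fitted together.
model = Pipeline(steps=[
    ("encode", encode),
    ("regress", RandomForestRegressor(n_estimators=10, random_state=0)),
])

model.fit(df_cat[["L1", "A1"]], df_cat["Y1"])
pred = model.predict(df_cat[["L1", "A1"]])
```

Because the pipeline exposes the usual `fit` and `predict` methods, it can presumably be used as the "model" entry of a stage in `model_info`, as the "with pipeline" note in the table suggests. Treat that as an assumption and check the examples for the exact usage.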
G-estimation, a well-known DTR method, is not yet available.
- python>=3.6
- pandas>=1.1.2
- scikit-learn>=0.23.2
- numpy>=1.19.2
- statsmodels>=0.12.0
Install with pip:
pip install pydtr
Or install from source:
git clone https://github.com/fullflu/pydtr.git
cd pydtr
python setup.py install
You need to import libraries and prepare data.
# import
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from pydtr.iqlearn.regression import IqLearnReg
# create sample dataframe
n = 10
thres = n // 2  # half of the rows
df = pd.DataFrame()
df["L1"] = np.arange(n)
df["A1"] = [0, 1] * thres  # stage-1 action alternates between 0 and 1
df["A2"] = [0] * thres + [1] * thres  # stage-2 action: first half 0, second half 1
df["Y1"] = np.zeros(n)
df["Y2"] = np.zeros(n)
You can use sklearn-based models.
# set model info (one entry per stage, in stage order)
model_info = [
    {
        "model": RandomForestRegressor(),
        "action_dict": {"A1": [0, 1]},
        "feature": ["L1", "A1"],
        "outcome": "Y1"
    },
    {
        "model": RandomForestRegressor(),
        "action_dict": {"A2": [0, 1]},
        "feature": ["L1", "A1", "Y1", "A2"],
        "outcome": "Y2"
    }
]
# fit model
dtr_model = IqLearnReg(
    n_stages=2,
    model_info=model_info
)
dtr_model.fit(df)
# predict optimal actions (the stage index is 0-based)
opt_action_stage_1 = dtr_model.predict(df, 0)
opt_action_stage_2 = dtr_model.predict(df, 1)
opt_action_all_stages = dtr_model.predict_all_stages(df)
You can also use statsmodels-based models.
# set model info ("model" is a formula string with p_outcome as the response)
model_info = [
    {
        "model": "p_outcome ~ L1 * A1",
        "action_dict": {"A1": [0, 1]},
        "feature": ["L1", "A1"],
        "outcome": "Y1"
    },
    {
        "model": "p_outcome ~ L1 + A1 + Y1 * A2",
        "action_dict": {"A2": [0, 1]},
        "feature": ["L1", "A1", "Y1", "A2"],
        "outcome": "Y2"
    }
]
# fit model
dtr_model = IqLearnReg(
    n_stages=2,
    model_info=model_info
)
dtr_model.fit(df)
# predict optimal actions (the stage index is 0-based)
opt_action_stage_1 = dtr_model.predict(df, 0)
opt_action_stage_2 = dtr_model.predict(df, 1)
opt_action_all_stages = dtr_model.predict_all_stages(df)
Please see the examples directory for more information.
Please feel free to open issues or send pull requests!
Once all checks on a pull request have passed, I will merge and release it.
├── .circleci
│   ├── config.yml
├── .github
│   ├── CODEOWNERS
├── LICENSE
├── MANIFEST.IN
├── Makefile
├── README.md
├── examples
│   ├── ...several notebooks...
├── setup.cfg
├── setup.py
├── src
│   ├── pydtr
│   │   ├── __init__.py
│   │   └── iqlearn
│   │       ├── __init__.py
│   │       ├── base.py
│   │       └── regression.py
└── tests
    ├── test_iqlearn_sklearn_predict.py
    └── test_iqlearn_sm_predict.py
- Chakraborty, B, Moodie, EE. Statistical Methods for Dynamic Treatment Regimes. Springer, New York, 2013.