Merge pull request #51 from databricks-industry-solutions/main
updating notebook samples
zavoraad authored Sep 24, 2024
2 parents ca15a87 + 7bd3089 commit dab698f
Showing 10 changed files with 133 additions and 158 deletions.
5 changes: 3 additions & 2 deletions README.md
@@ -16,7 +16,8 @@ This library is designed to provide a low friction entry to performing analytics
[Ex. Reading non-compliant FHIR Data](#ex-reading-non-compliant-fhir-data)
[Ex. Hospital Patient Flow](#usage-seeing-a-patient-flow-in-a-hospital-in-real-time)

[Writing FHIR Data](#usage-writing-fhir-data)
[Writing FHIR Data](#usage-writing-fhir-data-using-no-codelow-code)


[Omop Common Data Model](dbignite/omop)

@@ -340,7 +341,7 @@ result.map(lambda x: json.loads(x)).foreach(lambda x: print(json.dumps(x, indent
"""
```

For limitations and more advanced usage, see [sample notebook](https://github.com/databrickslabs/dbignite/tree/main/dbignite/writer](https://github.com/databrickslabs/dbignite/blob/main/notebooks/dbignite_patient_sample.py)
For limitations and more advanced usage, see [sample notebook](notebooks/dbignite_patient_sample.py#L461-L576)
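Each element produced by the writer is a FHIR bundle serialized as a JSON string. A minimal sketch of consuming one such string (the bundle contents below are illustrative sample data, not the exact dbignite output):

```python
import json

# Illustrative bundle string; in practice each element of the RDD
# returned by b.df_to_fhir(df) is a JSON string shaped like this.
bundle_str = json.dumps({
    "resourceType": "Bundle",
    "entry": [
        {"resource": {"resourceType": "Patient", "id": "pat-1"}},
        {"resource": {"resourceType": "Claim", "id": "clm-1"}},
    ],
})

# Parse the string back into a dict and pull out the resource types.
bundle = json.loads(bundle_str)
resource_types = [e["resource"]["resourceType"] for e in bundle["entry"]]
print(resource_types)  # → ['Patient', 'Claim']
```

The same `json.loads` step is what the `result.map(lambda x: json.loads(x))` call above applies across the whole RDD.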



10 changes: 3 additions & 7 deletions dbignite/omop/README.md
@@ -30,16 +30,12 @@ cdm_database='dbignite_demo'
```
create data model objects:
```
fhir_model=FhirBundles(BUNDLE_PATH)
fhir_model=FhirBundles(path=TEST_BUNDLE_PATH)
cdm_model=OmopCdm(cdm_database)
```
create a transformer:
create a transformer and transform from FHIR to OMOP CDM:
```
fhir2omop_transformer=FhirBundlesToCdm(spark)
```
transform from FHIR to your CDM:
```
fhir2omop_transformer.transform(fhir_model,cdm_model)
FhirBundlesToCdm().transform(fhir_model, cdm_model, True)
```

The returned value of the `cdm` is an OmopCDM object with an associated database (`dbignite_demo`), containing the following tables:
1 change: 1 addition & 0 deletions dbignite/schemas/README.md
@@ -0,0 +1 @@
All FHIR resource definitions from HL7 are translated into schemas in this directory. The schema version in this repository is [IG 6.0.0](https://hl7.org/fhir/us/core/history.html). To regenerate the schemas for a different version, see [this](https://github.com/databricks-industry-solutions/json2spark-schema/blob/main/01_healthcare_FHIR_demo.py) public notebook.
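As a rough illustration of the idea behind that notebook (the helper below is hypothetical, not the json2spark-schema API): dotted FHIR element paths from a resource definition can be folded into a nested schema structure.

```python
import json

# Hypothetical sketch: fold flat (dotted_path, fhir_type) pairs into a
# nested dict, mirroring how element paths become nested schema fields.
def paths_to_schema(elements):
    schema = {}
    for path, ftype in elements:
        node = schema
        parts = path.split(".")
        # Walk/create intermediate nodes, then set the leaf to its type.
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = ftype
    return schema

# A toy subset of the Patient resource definition.
elements = [
    ("Patient.id", "id"),
    ("Patient.birthDate", "date"),
    ("Patient.address.postalCode", "string"),
]
print(json.dumps(paths_to_schema(elements), indent=2))
```

The real schemas here are full Spark StructType definitions stored as JSON, but the nesting principle is the same.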
2 changes: 1 addition & 1 deletion dbignite/version.py
@@ -1,2 +1,2 @@
__version__ = "0.2.3"
__version__ = "0.2.4"

20 changes: 7 additions & 13 deletions notebooks/dbignite-demo.py
@@ -1,9 +1,9 @@
# Databricks notebook source
# MAGIC %md
# MAGIC # Analysis of FHIR Bundles using SQL and Python
# MAGIC
# MAGIC
# MAGIC <img src="http://hl7.org/fhir/assets/images/fhir-logo-www.png" width = 10%>
# MAGIC
# MAGIC
# MAGIC In this demo:
# MAGIC 1. We use the Databricks `dbignite` package to ingest FHIR bundles (in `json` format) into Delta Lake
# MAGIC 2. Create a patient-level dashboard from the bundles
@@ -13,7 +13,7 @@
# MAGIC <br>
# MAGIC </br>
# MAGIC <img src="https://hls-eng-data-public.s3.amazonaws.com/img/FHIR-RA.png" width = 70%>
# MAGIC
# MAGIC
# MAGIC ### Data
# MAGIC The data used in this demo is generated using [synthea](https://synthetichealth.github.io/synthea/). We used [covid infections module](https://github.com/synthetichealth/synthea/blob/master/src/main/resources/modules/covid19/infection.json), which incorporates patient risk factors such as diabetes, hypertension and SDOH in determining outcomes. The data is available at `s3://hls-eng-data-public/data/synthea/fhir/fhir/`.

@@ -89,23 +89,17 @@
# COMMAND ----------

# DBTITLE 1,define fhir and cdm models
fhir_model=FhirBundles(BUNDLE_PATH)
fhir_model=FhirBundles(path=TEST_BUNDLE_PATH)
cdm_model=OmopCdm(cdm_database)

# COMMAND ----------

# DBTITLE 1,define transformer
fhir2omop_transformer=FhirBundlesToCdm(spark)

# COMMAND ----------

# DBTITLE 1,transform from FHIR to CDM
fhir2omop_transformer.transform(fhir_model,cdm_model)
FhirBundlesToCdm().transform(fhir_model, cdm_model, True)

# COMMAND ----------

# DBTITLE 1,Transform from CDM to a patient dashboard
cdm2dash_transformer=CdmToPersonDashboard(spark)
cdm2dash_transformer=CdmToPersonDashboard()
dash_model=PersonDashboard()
cdm2dash_transformer.transform(cdm_model,dash_model)
person_dashboard_df = dash_model.summary()
@@ -226,7 +220,7 @@
# MAGIC %md
# MAGIC ## License
# MAGIC Copyright / License info of the notebook. Copyright Databricks, Inc. [2021]. The source in this notebook is provided subject to the [Databricks License](https://databricks.com/db-license-source). All included or referenced third party libraries are subject to the licenses set forth below.
# MAGIC
# MAGIC
# MAGIC |Library Name|Library License|Library License URL|Library Source URL|
# MAGIC | :-: | :-:| :-: | :-:|
# MAGIC |Synthea|Apache License 2.0|https://github.com/synthetichealth/synthea/blob/master/LICENSE| https://github.com/synthetichealth/synthea|
117 changes: 117 additions & 0 deletions notebooks/dbignite_patient_sample.py
@@ -457,3 +457,120 @@
# MAGIC on patient.bundleUUID = adt.bundleUUID
# MAGIC order by ssn desc, timestamp desc
# MAGIC limit 10

# COMMAND ----------

# MAGIC %md # Writing FHIR Data
# MAGIC
# MAGIC Using the CMS SynPUF (synthetic Medicare claims) dataset
# MAGIC
# MAGIC

# COMMAND ----------

from dbignite.writer.fhir_encoder import *
from dbignite.writer.bundler import *
import json

data = spark.sql("""
select
--Patient info
b.DESYNPUF_ID, --Patient.id
b.BENE_BIRTH_DT, --Patient.birthDate
b.BENE_COUNTY_CD, --Patient.address.postalCode
c.CLM_ID, --Claim.id
c.HCPCS_CD_1, --Claim.procedure.procedureCodeableConcept.coding.code
c.HCPCS_CD_2, --Claim.procedure.procedureCodeableConcept.coding.code
c.ICD9_DGNS_CD_1, --Claim.diagnosis.diagnosisCodeableConcept.coding.code
c.ICD9_DGNS_CD_2, --Claim.diagnosis.diagnosisCodeableConcept.coding.code
"http://www.cms.gov/Medicare/Coding/HCPCSReleaseCodeSets" as hcpcs_cdset
from hls_healthcare.hls_cms_synpuf.ben_sum b
inner join hls_healthcare.hls_cms_synpuf.car_claims c
on c.DESYNPUF_ID = b.DESYNPUF_ID
""")

# COMMAND ----------

maps = [Mapping('DESYNPUF_ID', 'Patient.id'),
Mapping('BENE_BIRTH_DT', 'Patient.birthDate'),
Mapping('BENE_COUNTY_CD', 'Patient.address.postalCode'),
Mapping('CLM_ID', 'Claim.id'),
Mapping('HCPCS_CD_1', 'Claim.procedure.procedureCodeableConcept.coding.code'),
Mapping('HCPCS_CD_2', 'Claim.procedure.procedureCodeableConcept.coding.code'),
#hardcoded values for system of HCPCS
Mapping('ICD9_DGNS_CD_1', 'Claim.diagnosis.diagnosisCodeableConcept.coding.code'),
Mapping('ICD9_DGNS_CD_2', 'Claim.diagnosis.diagnosisCodeableConcept.coding.code')
]

#For the complex mapping of multiple diagnosis and procedure codes, we override the standard mapping functions
em = FhirEncoderManager(
override_encoders ={
"Claim.procedure.procedureCodeableConcept.coding":
FhirEncoder(False, False, lambda x: [{"code": y, "system": "http://www.cms.gov/Medicare/Coding/HCPCSReleaseCodeSets"}
for y in x[0].get("code").split(",")]),
"Claim.diagnosis.diagnosisCodeableConcept.coding":
FhirEncoder(False, False, lambda x: [{"code": y, "system": "http://terminology.hl7.org/CodeSystem/icd9cm"} for y in x[0].get("code").split(",")])
})
m = MappingManager(maps, data.schema, em)
b = Bundle(m)
result = b.df_to_fhir(data)
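The override encoders above are plain Python callables, so their behavior is easy to check in isolation. A standalone sketch of what the HCPCS `coding` override does (the sample codes are illustrative):

```python
# Standalone sketch of the override lambda used above: it splits a
# comma-separated code string into one coding entry per code, attaching
# the HCPCS system URL. The input shape mirrors what the encoder receives.
HCPCS_SYSTEM = "http://www.cms.gov/Medicare/Coding/HCPCSReleaseCodeSets"

def expand_codings(values):
    return [{"code": code, "system": HCPCS_SYSTEM}
            for code in values[0].get("code").split(",")]

# Two HCPCS codes collapsed into one field come out as two coding entries.
print(expand_codings([{"code": "99213,G0008"}]))
```

The ICD-9 override works the same way, substituting the `icd9cm` system URL.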

# COMMAND ----------

#pretty print 10 values
print('\n'.join([str(x) for x in
result.map(lambda x: json.loads(x)).map(lambda x: json.dumps(x, indent=4)).take(10)]))

# COMMAND ----------

# MAGIC %md ## Inspect a single value

# COMMAND ----------

# MAGIC %sql
# MAGIC select
# MAGIC --Patient info
# MAGIC b.DESYNPUF_ID, --Patient.id
# MAGIC b.BENE_BIRTH_DT, --Patient.birthDate
# MAGIC b.BENE_COUNTY_CD, --Patient.address.postalCode
# MAGIC c.CLM_ID, --Claim.id
# MAGIC c.HCPCS_CD_1, --Claim.procedure.procedureCodeableConcept.coding.code
# MAGIC c.HCPCS_CD_2, --Claim.procedure.procedureCodeableConcept.coding.code
# MAGIC c.ICD9_DGNS_CD_1, --Claim.diagnosis.diagnosisCodeableConcept.coding.code
# MAGIC c.ICD9_DGNS_CD_2, --Claim.diagnosis.diagnosisCodeableConcept.coding.code
# MAGIC "http://www.cms.gov/Medicare/Coding/HCPCSReleaseCodeSets" as hcpcs_cdset
# MAGIC from hls_healthcare.hls_cms_synpuf.ben_sum b
# MAGIC inner join hls_healthcare.hls_cms_synpuf.car_claims c
# MAGIC on c.DESYNPUF_ID = b.DESYNPUF_ID
# MAGIC where c.CLM_ID = 737363357976870

# COMMAND ----------


data = spark.sql("""
select
--Patient info
b.DESYNPUF_ID, --Patient.id
b.BENE_BIRTH_DT, --Patient.birthDate
b.BENE_COUNTY_CD, --Patient.address.postalCode
c.CLM_ID, --Claim.id
c.HCPCS_CD_1, --Claim.procedure.procedureCodeableConcept.coding.code
c.HCPCS_CD_2, --Claim.procedure.procedureCodeableConcept.coding.code
c.ICD9_DGNS_CD_1, --Claim.diagnosis.diagnosisCodeableConcept.coding.code
c.ICD9_DGNS_CD_2, --Claim.diagnosis.diagnosisCodeableConcept.coding.code
"http://www.cms.gov/Medicare/Coding/HCPCSReleaseCodeSets" as hcpcs_cdset
from hls_healthcare.hls_cms_synpuf.ben_sum b
inner join hls_healthcare.hls_cms_synpuf.car_claims c
on c.DESYNPUF_ID = b.DESYNPUF_ID
where c.CLM_ID = 737363357976870
""")

m = MappingManager(maps, data.schema, em)
b = Bundle(m)
result = b.df_to_fhir(data)

# COMMAND ----------

#pretty print the single matching value
print('\n'.join([str(x) for x in
result.map(lambda x: json.loads(x)).map(lambda x: json.dumps(x, indent=4)).take(1)]))
77 changes: 0 additions & 77 deletions notebooks/fhir-mapping-demo.py

This file was deleted.

24 changes: 0 additions & 24 deletions notebooks/fhir_us_core.py

This file was deleted.

33 changes: 0 additions & 33 deletions notebooks/save-fhir-schemas.py

This file was deleted.

2 changes: 1 addition & 1 deletion setup.py
@@ -31,7 +31,7 @@
"License :: Other/Proprietary License",
"Operating System :: OS Independent",
],
packages=['dbignite', 'dbignite.omop', 'dbignite.hosp_feeds'],
packages=['dbignite', 'dbignite.omop', 'dbignite.hosp_feeds', 'dbignite.writer'],
package_data={'': ["schemas/*.json"]},
py_modules=['dbignite.data_model']
)
