-
Notifications
You must be signed in to change notification settings - Fork 91
Build a dashboard with FHIR Data Pipes and Apache Superset
Generating dashboards is a common task for many digital health projects.
This tutorial shows you how to use Apache Superset with the FHIR Data Pipes Single Machine Deployment to build a dynamic dashboard, but the principles should apply to any dashboard that supports Apache Spark SQL.
For example, here is a qwiklab that uses the open source version of Metabase for querying and visualziing data from the Spark SQL DWH
- A FHIR Data Pipes single machine deployment using local test
servers as the data source.
- Run the full pipeline at least once before following this guide.
There are several ways to install Superset. A simple option is to install
Superset locally using docker-compose. Once it is set-up, log in by visiting
http://localhost:8088/ and using the default credentials username: admin
password: admin
.
- In the upper-right corner, select Settings > Database Connections.
- In the upper-right corner, select + Database.
-
Under Supported Databases, select Apache Spark SQL.
-
Under Display Name, enter "Single Machine Test Data" or another name your choice.
-
Under SQLAlchemy URI, enter the following:
hive://hive@<IP_address_of_docker_network_gateway>:10001
To find the IP address, run:
docker network inspect bridge --format='{{json .IPAM.Config}}'
-
Select Test Connection. You should see a pop-up in the bottom-right corner that says "Connection looks good!"
- If you get an error, double-check the IP address is correct. You may also need to install the Apache Spark SQL database drivers.
-
Select Connect. Although you see an error: "An error occurred while creating databases: Fatal error" the connection is still created.
-
Close the Connect a database window by clicking outside of it and refresh the Databases page. The database you just created should appear.
Create an empty dashboard so you can add charts to it as they are created.
- At the top of the Superset page select the Dashboard tab, then click + Dashboard.
- Select the title area which reads [ untitled dashboard ] and give the dashboard a name. The examples below use "Sample Dashboard".
- In the upper-right corner, select Save.
- From the top tabs, select SQL > SQL Lab.
- If this is your first time using SQL Lab, you should be in a blank, empty
tab. If not, click New tab or press
Control+T
to create one. - On the left side under Database, select the database you created earlier, for example Single Machine Test Data.
This first chart shows the total number of patients as a "Big Number".
-
Replace the placeholder query
SELECT ...
with the following:SELECT COUNT(*) FROM Patient;
-
Click Run. After a moment, the results of the query should appear.
-
In the Results section, click Create Chart. This brings you to a new chart page.
-
Under the Chart Source pane, click Create a dataset and name it "Patient Count Dataset".
-
Under the Data tab, switch to the 4k (Big Number) chart type.
-
In the Query section, click the Metric section and set the following values:
- Column:
count(1)
- Aggregate:
MAX
- Column:
-
Click Update Chart at the bottom of the pane.
The chart updates to show a single large number of the count of patients.
- Click Save to bring up the Save chart dialog. Name the chart "Patient Count" and under Add to Dashboard select the dashboard you created previously.
The next chart shows the split of patients by gender as a pie chart.
-
Navigate back to SQL Lab. You should see the previous patient count query.
- If you'd like to keep the patient count query, make a new tab and set the database to "Single Machine Test Data" or the name you chose.
-
Enter the following as the query:
SELECT `gender` AS `gender`, count(`gender`) AS `COUNT(gender)` FROM Patient AS P GROUP BY `gender` ORDER BY `COUNT(gender)` DESC;
-
Click Run. After a moment, the results of the query should appear.
-
In the Results section, click Create Chart. This brings you to a new chart page.
-
Under the Chart Source pane, click Create a dataset and name it "Gender Split Dataset".
-
Under the Data tab, switch to the Pie Chart chart type.
-
In the Query section, set the following values:
- Dimensions:
- Column:
gender
- Column:
- Metric:
2. Column:
COUNT(gender)
3. Aggregate:MAX
- Dimensions:
-
Click Update Chart at the bottom of the pane. The chart updates to show a pie chart of patients by gender.
-
Click Save to bring up the Save chart dialog. Name the chart "Gender Split" and under Add to Dashboard select the dashboard you created previously.
The next chart shows number of patients on HIV treatment by year and display that as a bar chart over time. The query is provided below.
- The codes 1255% and 1266% refer to HIV_Tx codes in the dataset.
Learn more about querying data in SQL-on-FHIR.
-
Navigate back to SQL Lab. You should see the previous patient count query.
- If you'd like to keep the patient count query, make a new tab and set the database to "Single Machine Test Data" or the name you chose.
-
Enter the following as the query:
SELECT COUNT(*), YEAR(O.effective.dateTime) FROM Observation AS O LATERAL VIEW EXPLODE(code.coding) AS OCC LATERAL VIEW EXPLODE(O.value.codeableConcept.coding) AS OVCC WHERE OCC.code LIKE '1255%' AND OVCC.code LIKE "1256%" AND YEAR(O.effective.dateTime) < 2023 GROUP BY YEAR(O.effective.dateTime) ORDER BY YEAR(O.effective.dateTime) ASC
NOTE: Using the modifiable pre-defined flat views ), the same query can be simplified to the following:
SELECT COUNT(*), YEAR(o.obs_date)
FROM observation_flat as o
WHERE o.code LIKE '1255%'
AND o.val_code LIKE "1256%"
AND YEAR(o.obs_date) < 2023
GROUP BY YEAR(o.obs_date)
ORDER BY YEAR(o.obs_date) ASC
- Click Run. After a moment, the results of the query should appear.
- In the Results section, click Create Chart. This brings you to a new chart page.
- Under the Chart Source pane, click Create a dataset and name it "HIV Treatment Dataset".
- Under the Data tab, switch to the Bar Chart chart type.
- In the Query section, set the following values:
2. X-Axis:
1. Custom SQL:
YEAR(date\_time)
3. Metric: 2. Column:patient\_id
3. Aggregate:COUNT
- Click Update Chart at the bottom of the pane. The chart updates to show a bar chart of patients undergoing HIV treatment per year.
- Click Save to bring up the Save chart dialog. Name the chart "HIV Treatment" and under Add to Dashboard select the dashboard you created previously.
Navigate to the Dashboards section and select the Sample Dashboard you created. You should see something like this with the charts you created.
Create a dashboard with 2 rows and a header. First click Edit Dashboard then:
- In the right pane select the Layout Elements tab.
- Add a second row by dragging a Row element onto the canvas below the existing row.
- Arrange the charts to have the Patient Count and Gender Split in the top row, and HIV Treatment in the bottom row.
- Resize the top charts to use half of the width by dragging the edge of the chart.
- Resize the HIV Treatment chart to take up the whole width of the row.
- Add a header element at the top and enter "Key Program Metrics".
Your dashboard now looks something like this:
To check that the dashboard is working properly, add a new patient to the FHIR Store and run the incremental pipeline.
-
Add a new patient to the server:
curl -X POST -H "Content-Type: application/fhir+json; charset=utf-8" \ 'http://localhost:8091/fhir/Patient/' -d '{"resourceType": "Patient"}'
-
Alternatively, use the HAPI FHIR server tester; go to the Patient resource tester, click the CRUD Operations tab, paste the following resource into the Contents field of the Create section, then click Create.
{"resourceType": "Patient"}
-
Open the Pipeline Controller at http://localhost:8090 and run the incremental pipeline.
-
Once the pipeline has finished, go to Superset and open your dashboard. Click the ... button next to Edit Dashboard and then click Refresh dashboard. You should see the patient count increase by 1.
For more information about Superset, go to the Superset documentation.