Add realistic test data #24

Merged on Aug 27, 2024 (38 commits)
38 commits
64716e3
Insert dummy data to MEASUREMENT and OBSERVATION for the test db. Cre…
BaptisteBR Aug 16, 2024
e4018f8
Clean CSV files.
BaptisteBR Aug 16, 2024
81172aa
Regenerate realistic test data after cleaning dummy CSVs.
BaptisteBR Aug 16, 2024
18d1c38
Add script to produce the realistic test data.
BaptisteBR Aug 16, 2024
f230e49
Merge branch 'main' into baptistebr/realistic-test-data
milanmlft Aug 19, 2024
a4a1fe6
Update dev/test_db/produce_test_data.R
BaptisteBR Aug 20, 2024
79dcdfe
Update README to explain how the reproduce test data.
BaptisteBR Aug 20, 2024
61db7f5
Merge branch 'main' into baptistebr/realistic-test-data
milanmlft Aug 21, 2024
dcc0ab0
Add 'ORDER BY' clause to produce results.
BaptisteBR Aug 22, 2024
76490cb
Update README with test dataset creation process.
BaptisteBR Aug 22, 2024
ff77490
Merge branch 'baptistebr/realistic-test-data' of https://github.com/U…
BaptisteBR Aug 22, 2024
3e0f7f1
Update README with test dataset creation process.
BaptisteBR Aug 22, 2024
dd205c3
Update README with test dataset creation process.
BaptisteBR Aug 22, 2024
9440093
Merge branch 'main' into baptistebr/realistic-test-data
milanmlft Aug 23, 2024
f8fe889
Make sure DB connections are always closed
milanmlft Aug 23, 2024
667b711
Remove clean up steps
milanmlft Aug 23, 2024
ed59327
Remove more cleanup steps
milanmlft Aug 23, 2024
8043aae
Use test data in data getters
milanmlft Aug 23, 2024
ea859f9
Test consistency of test_data files
milanmlft Aug 23, 2024
40a0d92
Fix indentation
milanmlft Aug 23, 2024
0101496
Improve logging
milanmlft Aug 23, 2024
db91ab6
Update test data
milanmlft Aug 23, 2024
bcd5bfc
Update test data
milanmlft Aug 23, 2024
5baa410
Overwrite existing data to ensure consistency
milanmlft Aug 23, 2024
3907a2d
Add sanity checks for dummy data
milanmlft Aug 23, 2024
490caff
Explicitly specify column types for dummy data to avoid errors later on
milanmlft Aug 23, 2024
3958ec5
Overwrite database tables for summary results as well
milanmlft Aug 23, 2024
3c8499a
Update test data (should be consistent now)
milanmlft Aug 23, 2024
801e811
Use `readr` to write data as we're dealing with tibbles
milanmlft Aug 23, 2024
0aa56af
Update snapshot test data
milanmlft Aug 23, 2024
dc6dd43
Just do a simple `arrange()` across all columns to order data
milanmlft Aug 23, 2024
9997728
Clean up code
milanmlft Aug 23, 2024
92b91e4
Reduce test data size
milanmlft Aug 23, 2024
ba89fbe
Readd sorting of tables
milanmlft Aug 23, 2024
c797006
Update test data (really the last time now)
milanmlft Aug 23, 2024
f7caf0e
Read test data as tibble
milanmlft Aug 23, 2024
08614e6
Add readr as dependency
milanmlft Aug 23, 2024
7129850
Merge branch 'main' into baptistebr/realistic-test-data
milanmlft Aug 27, 2024
2 changes: 1 addition & 1 deletion .Rprofile
@@ -16,7 +16,7 @@ source("renv/activate.R")
 # Path to download Eunomia datasets
 Sys.setenv(EUNOMIA_DATA_FOLDER = file.path("dev/test_db/eunomia"))
 # Name of the synthetic dataset to use
-Sys.setenv(TEST_DB_NAME = "GiBleed")
+Sys.setenv(TEST_DB_NAME = "synthea-allergies-10k")
 # OMOP CDM version
 Sys.setenv(TEST_DB_OMOP_VERSION = "5.3")
 # Schema name for data
1 change: 1 addition & 0 deletions DESCRIPTION
@@ -16,6 +16,7 @@ Imports:
     glue,
     tidyr,
     withr,
+    readr,
     lubridate,
     dplyr
 Suggests:
33 changes: 5 additions & 28 deletions R/utils_get_data.R
@@ -5,19 +5,9 @@
 #' @noRd
 get_concepts_table <- function() {
   if (golem::app_dev()) {
-    return(data.frame(
-      concept_id = c(40213251, 133834, 4057420),
-      concept_name = c(
-        "varicella virus vaccine",
-        "Atopic dermatitis",
-        "Catheter ablation of tissue of heart"
-      ),
-      domain_id = c("Drug", "Condition", "Procedure"),
-      vocabulary_id = c("CVX", "SNOMED", "SNOMED"),
-      concept_class_id = c("CVX", "Clinical Finding", "Procedure"),
-      standard_concept = c("S", "S", "S"),
-      concept_code = c("21", "24079001", "18286008")
-    ))
+    return(
+      readr::read_csv(app_sys("test_data", "calypso_concepts.csv"), show_col_types = FALSE)
+    )
   }
 
   con <- connect_to_test_db()
@@ -28,15 +18,7 @@ get_concepts_table <- function() {
 get_monthly_counts <- function() {
   if (golem::app_dev()) {
     return(
-      data.frame(
-        concept_id = c(
-          rep(c(40213251, 133834, 4057420), each = 3)
-        ),
-        date_year = c(2019L, 2020L, 2020L, 2019L, 2020L, 2020L, 2020L, 2019L, 2019L),
-        date_month = c(4L, 3L, 5L, 5L, 8L, 4L, 11L, 6L, 3L),
-        person_count = c(1, 1, 3, 4, 2, 3, 2, 4, 1),
-        records_per_person = c(1, 1, 1, 1, 1, 1, 1, 1, 1)
-      )
+      readr::read_csv(app_sys("test_data", "calypso_monthly_counts.csv"), show_col_types = FALSE)
     )
   }
 
@@ -48,12 +30,7 @@ get_monthly_counts <- function() {
 get_summary_stats <- function() {
   if (golem::app_dev()) {
     return(
-      data.frame(
-        concept_id = rep(c(40213251, 133834, 4057420), each = 2),
-        summary_attribute = rep(c("mean", "sd"), times = 3),
-        value_as_string = rep(NA, 6),
-        value_as_number = c(1.5, 0.5, 2.5, 0.7, 3.5, 0.8)
-      )
+      readr::read_csv(app_sys("test_data", "calypso_summary_stats.csv"), show_col_types = FALSE)
     )
   }
 
11 changes: 11 additions & 0 deletions README.md
@@ -50,6 +50,17 @@ as it has good support for R package development and Shiny.

The `dev/02_dev.R` script contains a few helper functions to get you started.

Calypso test data can be found in [`inst/test_data`](https://github.com/UCLH-Foundry/omop-data-catalogue/tree/main/inst/test_data). They were generated from the synthetic dataset '[synthea-allergies-10k](https://darwin-eu.github.io/CDMConnector/reference/eunomiaDir.html)', with some [dummy data](https://github.com/UCLH-Foundry/omop-data-catalogue/tree/main/dev/test_db/dummy) added to the MEASUREMENT and OBSERVATION tables so that the `calypso_summary_stats` table contains some records.

If you want to recreate a test dataset, you can run the following R scripts:

```r
source(here::here("dev/test_db/setup_test_db.R"))
source(here::here("dev/test_db/insert_dummy_tables.R"))
source(here::here("dev/omop_analyses/analyse_omop_cdm.R"))
source(here::here("dev/test_db/produce_test_data.R"))
```

### Updating the `renv` lockfile

Make sure to regularly run `renv::status(dev = TRUE)` to check if your local library and the lockfile
11 changes: 8 additions & 3 deletions dev/omop_analyses/analyse_omop_cdm.R
@@ -1,4 +1,8 @@
-library(tidyverse)
+cli::cli_h1("Generating summary statistics")
+
+suppressPackageStartupMessages(
+  library(tidyverse)
+)
 
 dir <- Sys.getenv("EUNOMIA_DATA_FOLDER")
 name <- Sys.getenv("TEST_DB_NAME")
@@ -180,8 +184,7 @@ write_results <- function(data, con, table) {
       table = table
     ),
     value = data,
-    append = TRUE,
-    overwrite = FALSE
+    overwrite = TRUE
   )
 }

@@ -220,3 +223,5 @@ ids <- unique(c(monthly_counts$concept_id, summary_stats$concept_id))
 # Retrieve concept properties from the list of ids
 get_concepts_table(cdm, ids) |>
   write_results(con, "calypso_concepts")
+
+cli::cli_alert_success("Summary statistics generated successfully")
24 changes: 24 additions & 0 deletions dev/test_db/dummy/measurement.csv
@@ -0,0 +1,24 @@
measurement_id,person_id,measurement_concept_id,measurement_date,measurement_datetime,measurement_time,measurement_type_concept_id,operator_concept_id,value_as_number,value_as_concept_id,unit_concept_id,range_low,range_high,provider_id,visit_occurrence_id,visit_detail_id,measurement_source_value,measurement_source_concept_id,unit_source_value,value_source_value
10000000,103,4354252,2021-06-01,2021-06-01T10:45:00Z,NA,32817,0,115,0,8876,NA,NA,NA,NA,NA,NA,NA,NA,NA
10000001,12,4354252,2020-08-25,2020-08-25T14:15:00Z,NA,32817,0,122,0,8876,NA,NA,NA,NA,NA,NA,NA,NA,NA
10000002,866,4354252,2021-11-14,2021-11-14T13:05:00Z,NA,32817,0,125,0,8876,NA,NA,NA,NA,NA,NA,NA,NA,NA
10000003,12,4354252,2019-08-23,2019-08-23T02:28:00Z,NA,32817,0,131,0,8876,NA,NA,NA,NA,NA,NA,NA,NA,NA
10000004,51,4354252,2021-02-11,2021-02-11T15:01:00Z,NA,32817,0,111,0,8876,NA,NA,NA,NA,NA,NA,NA,NA,NA
10000005,3028,4354252,2018-01-18,2018-01-18T12:14:00Z,NA,32817,0,138,0,8876,NA,NA,NA,NA,NA,NA,NA,NA,NA
10000006,7,4248525,2021-10-15,2021-10-15T10:20:00Z,NA,32817,0,169,0,8876,NA,NA,NA,NA,NA,NA,NA,NA,NA
10000007,553,4248525,2015-04-19,2015-04-19T16:47:00Z,NA,32817,0,131,0,8876,NA,NA,NA,NA,NA,NA,NA,NA,NA
10000008,1641,4248525,2019-10-11,2019-10-11T07:00:00Z,NA,32817,0,128,0,8876,NA,NA,NA,NA,NA,NA,NA,NA,NA
10000009,553,4248525,2020-06-26,2020-06-26T00:00:00Z,NA,32817,0,114,0,8876,NA,NA,NA,NA,NA,NA,NA,NA,NA
10000010,12,4248525,2019-05-01,2019-05-01T20:55:00Z,NA,32817,0,122,0,8876,NA,NA,NA,NA,NA,NA,NA,NA,NA
10000011,978,4353843,2002-03-03,2002-03-03T22:00:00Z,NA,32817,0,162,0,8876,NA,NA,NA,NA,NA,NA,NA,NA,NA
10000012,12,4353843,2021-09-18,2021-09-18T02:00:00Z,NA,32817,0,152,0,8876,NA,NA,NA,NA,NA,NA,NA,NA,NA
10000013,6459,4353843,2021-12-28,2021-12-28T02:00:00Z,NA,32817,0,118,0,8876,NA,NA,NA,NA,NA,NA,NA,NA,NA
10000014,995,4353843,2023-04-08,2023-04-08T08:00:00Z,NA,32817,0,99,0,8876,NA,NA,NA,NA,NA,NA,NA,NA,NA
10000015,110,4353843,2015-09-20,2015-09-20T07:00:00Z,NA,32817,0,117,0,8876,NA,NA,NA,NA,NA,NA,NA,NA,NA
10000016,8746,4353843,2021-10-01,2021-10-01T09:00:00Z,NA,32817,0,125,0,8876,NA,NA,NA,NA,NA,NA,NA,NA,NA
10000017,978,4108450,2001-06-15,2001-06-15T07:50:00Z,NA,32817,0,0.6666666666666666,0,8523,NA,NA,NA,NA,NA,NA,NA,NA,NA
10000018,8916,4108450,2019-09-13,2019-09-13T08:29:00Z,NA,32817,0,0.6666666666666666,0,8523,NA,NA,NA,NA,NA,NA,NA,NA,NA
10000019,51,3001079,2020-05-15,2020-05-15T22:44:00Z,NA,32817,0,NA,45878588,0,NA,NA,NA,NA,NA,NA,NA,NA,NA
10000020,909,3001079,2018-03-11,2018-03-11T13:30:00Z,NA,32817,0,NA,45878588,0,NA,NA,NA,NA,NA,NA,NA,NA,NA
10000021,553,4128111,2020-12-02,2020-12-02T00:00:00Z,NA,32817,0,NA,1635564,0,NA,NA,NA,NA,NA,NA,NA,NA,NA
10000022,7,4128111,2020-11-25,2020-11-25T00:00:00Z,NA,32817,0,NA,1633781,0,NA,NA,NA,NA,NA,NA,NA,NA,NA
13 changes: 13 additions & 0 deletions dev/test_db/dummy/observation.csv
@@ -0,0 +1,13 @@
observation_id,person_id,observation_concept_id,observation_date,observation_datetime,observation_type_concept_id,value_as_number,value_as_string,value_as_concept_id,qualifier_concept_id,unit_concept_id,provider_id,visit_occurrence_id,visit_detail_id,observation_source_value,observation_source_concept_id,unit_source_value,qualifier_source_value
10000000,11,45766147,2022-06-24,2022-06-24T09:00:00Z,32817,NA,NA,4086518,NA,0,NA,NA,NA,NA,NA,NA,NA
10000001,59,4257036,2018-09-02,2018-09-02T09:19:00Z,32817,NA,NA,37208662,NA,0,NA,NA,NA,NA,NA,NA,NA
10000002,237,4257036,2014-11-19,2014-11-19T17:10:00Z,32817,NA,NA,37208662,NA,0,NA,NA,NA,NA,NA,NA,NA
10000003,299,4257036,2017-02-17,2017-02-17T11:14:00Z,32817,NA,NA,37208662,NA,0,NA,NA,NA,NA,NA,NA,NA
10000004,673,4216746,2011-03-22,2011-03-22T16:00:00Z,32817,8,NA,0,NA,44777590,NA,NA,NA,NA,NA,NA,NA
10000005,11,4353717,2022-12-05,2022-12-05T17:00:00Z,32817,10.6,NA,0,NA,8698,NA,NA,NA,NA,NA,NA,NA
10000006,1502,4353717,2021-05-13,2021-05-13T11:00:00Z,32817,16.8,NA,0,NA,8698,NA,NA,NA,NA,NA,NA,NA
10000007,986,4216746,2008-09-28,2008-09-28T20:00:00Z,32817,8,NA,0,NA,44777590,NA,NA,NA,NA,NA,NA,NA
10000008,299,4353717,2016-10-09,2016-10-09T01:00:00Z,32817,11,NA,0,NA,8698,NA,NA,NA,NA,NA,NA,NA
10000009,299,4353713,2018-01-12,2018-01-12T15:58:00Z,32817,5,NA,0,NA,44777590,NA,NA,NA,NA,NA,NA,NA
10000010,6288,4353713,2021-07-19,2021-07-19T15:14:00Z,32817,7,NA,0,NA,44777590,NA,NA,NA,NA,NA,NA,NA
10000011,3362,4353713,2019-02-08,2019-02-08T12:25:00Z,32817,5,NA,0,NA,44777590,NA,NA,NA,NA,NA,NA,NA
4 changes: 4 additions & 0 deletions dev/test_db/eunomia/.gitignore
@@ -3,3 +3,7 @@

# duckdb databases
*.duckdb

# duckdb temp files
# (in case of failure)
*.duckdb.wal
79 changes: 79 additions & 0 deletions dev/test_db/insert_dummy_tables.R
@@ -0,0 +1,79 @@
# PRODUCED FOR A SPECIFIC DATASET:
# synthea-allergies-10k
# (but could work for others)

cli::cli_h1("Inserting dummy tables")

library(readr)

dir <- Sys.getenv("EUNOMIA_DATA_FOLDER")
name <- Sys.getenv("TEST_DB_NAME")
version <- Sys.getenv("TEST_DB_OMOP_VERSION")

# Connect to the duckdb test database
con <- DBI::dbConnect(
duckdb::duckdb(dbdir = glue::glue("{dir}/{name}_{version}_1.0.duckdb"))
)

withr::defer(DBI::dbDisconnect(con))

# Function to write data to a table in the cdm schema
write_table <- function(data, con, table) {
# Insert data into the specified table
# (in the cdm schema)
DBI::dbWriteTable(
conn = con,
name = DBI::Id(
schema = Sys.getenv("TEST_DB_CDM_SCHEMA"),
table = table
),
value = data,
overwrite = TRUE
)
}

## Load dummy data and write tables to database
## We explicitly set the column types for columns that are needed later down the pipeline
dummy_measurements <- read_csv(
here::here("dev/test_db/dummy/measurement.csv"),
col_types = cols(
measurement_id = col_double(),
person_id = col_double(),
measurement_concept_id = col_double(),
measurement_date = col_date(),
value_as_number = col_double(),
value_as_concept_id = col_double(),
)
)
write_table(dummy_measurements, con, "measurement")

dummy_observations <- read_csv(here::here(
"dev/test_db/dummy/observation.csv"),
col_types = cols(
observation_id = col_double(),
person_id = col_double(),
observation_concept_id = col_double(),
observation_date = col_date(),
value_as_number = col_double(),
value_as_string = col_logical(),
value_as_concept_id = col_double(),
)
)
write_table(dummy_observations, con, "observation")

# Sanity check: read the data back and make sure it's consistent
db_measurements <- DBI::dbReadTable(con, "measurement")
stopifnot(all.equal(db_measurements, as.data.frame(dummy_measurements)))

db_observations <- DBI::dbReadTable(con, "observation")
stopifnot(all.equal(db_observations, as.data.frame(dummy_observations)))

# Load the CDM object to verify integrity of the schema after insertions
cdm <- CDMConnector::cdm_from_con(
con = con,
cdm_schema = Sys.getenv("TEST_DB_CDM_SCHEMA"),
write_schema = Sys.getenv("TEST_DB_RESULTS_SCHEMA"),
cdm_name = name
)

cli::cli_alert_success("Dummy tables inserted successfully")
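
A note on the sanity checks in this script: `all.equal()` returns `TRUE` on a match but a character vector describing the differences on a mismatch, so it never returns `FALSE`. The following is a minimal base-R sketch of the common `isTRUE(all.equal(...))` wrapper, using toy data frames in place of the CSV and database tables. The helper name `tables_match` is hypothetical and is not part of this PR.

```r
# Toy stand-ins for the data written to the database and the data read back
written <- data.frame(id = c(1, 2, 3), value = c(115, 122, 125))
read_back_ok <- written
read_back_bad <- transform(written, value = value + 1)

# all.equal() gives TRUE on a match and a character description otherwise
match_result <- all.equal(read_back_ok, written)
mismatch_result <- all.equal(read_back_bad, written)

# Hypothetical helper: collapse both outcomes to a single TRUE/FALSE
tables_match <- function(x, y) isTRUE(all.equal(x, y))

stopifnot(tables_match(read_back_ok, written))
stopifnot(!tables_match(read_back_bad, written))
```

A plain logical result like this can be used directly in `if ()` conditions or in test expectations, which the raw `all.equal()` output cannot.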
52 changes: 52 additions & 0 deletions dev/test_db/produce_test_data.R
@@ -0,0 +1,52 @@
cli::cli_h1("Producing test data")

suppressPackageStartupMessages({
library(dplyr)
})

dir <- Sys.getenv("EUNOMIA_DATA_FOLDER")
name <- Sys.getenv("TEST_DB_NAME")
version <- Sys.getenv("TEST_DB_OMOP_VERSION")

# Connect to the duckdb test database
con <- DBI::dbConnect(
duckdb::duckdb(dbdir = glue::glue("{dir}/{name}_{version}_1.0.duckdb"))
)
withr::defer(DBI::dbDisconnect(con))

# Function to read all rows of a results table, sorted for reproducible output
read_table <- function(con, table) {
  schema <- Sys.getenv("TEST_DB_RESULTS_SCHEMA")
  # Get all rows from the table
  query <- glue::glue("SELECT * FROM {schema}.{table}")
  # Run the query and sort the rows by every column
  con |>
    DBI::dbGetQuery(query) |>
    arrange(across(everything()))
}

# Get the relevant tables and filter
table_names <- c("calypso_concepts", "calypso_monthly_counts", "calypso_summary_stats")
tables <- purrr::map(table_names, read_table, con = con)
names(tables) <- table_names

# Keep only concepts for which we have summary statistics
keep_concepts <- tables$calypso_summary_stats$concept_id
tables <- purrr::map(tables, ~ .x[.x$concept_id %in% keep_concepts, ])

# Keep only data from 2019 onwards
monthly_counts <- tables$calypso_monthly_counts
filtered_monthly <- monthly_counts[monthly_counts$date_year >= 2019, ]
tables$calypso_monthly_counts <- filtered_monthly

# Filter the other tables to match the concepts left over after year filtering
tables <- purrr::map(tables, ~ .x[.x$concept_id %in% filtered_monthly$concept_id, ])

# Write all results to the test data folder
purrr::iwalk(tables, function(tbl, name) {
path <- here::here(glue::glue("inst/test_data/{name}.csv"))
cli::cli_alert_info("Writing {name} to {path}")
readr::write_csv(tbl, file = path)
})

cli::cli_alert_success("Test data produced")
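
The three filtering steps in this script (keep concepts with summary statistics, keep monthly counts from 2019 onwards, then re-filter every table to the concepts that survive) can be traced on toy data. This is a base-R sketch only: the data frames and the helper `filter_by_concept` are made up for illustration, and `lapply()` stands in for `purrr::map()`.

```r
# Toy tables sharing a concept_id column (values are illustrative only)
tables <- list(
  concepts = data.frame(concept_id = c(1, 2, 3), concept_name = c("a", "b", "c")),
  monthly_counts = data.frame(
    concept_id = c(1, 1, 2, 3),
    date_year = c(2018, 2020, 2017, 2021)
  ),
  summary_stats = data.frame(concept_id = c(1, 2), summary_attribute = c("mean", "mean"))
)

# Hypothetical helper: keep only rows whose concept_id is in `ids`
filter_by_concept <- function(tables, ids) {
  lapply(tables, function(tbl) tbl[tbl$concept_id %in% ids, , drop = FALSE])
}

# Step 1: keep only concepts that have summary statistics (drops concept 3)
tables <- filter_by_concept(tables, tables$summary_stats$concept_id)

# Step 2: keep only monthly counts from 2019 onwards (concept 2 loses all rows)
monthly <- tables$monthly_counts
tables$monthly_counts <- monthly[monthly$date_year >= 2019, , drop = FALSE]

# Step 3: re-filter every table to the concepts left in the monthly counts
tables <- filter_by_concept(tables, tables$monthly_counts$concept_id)
```

After step 3 only concept 1 remains in all three tables, which shows why the second pass is needed: without it, concept 2 would keep its concept and summary rows but have no monthly counts.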
3 changes: 3 additions & 0 deletions dev/test_db/setup_test_db.R
@@ -1,3 +1,4 @@
+cli::cli_h1("Setting up test database")
 
 # Create a duckdb database from Eunomia datasets
 con <- DBI::dbConnect(
@@ -18,3 +19,5 @@ CDMConnector::cdm_from_con(
   write_schema = Sys.getenv("TEST_DB_RESULTS_SCHEMA"),
   cdm_name = Sys.getenv("TEST_DB_NAME")
 )
+
+cli::cli_alert_success("Test database set up successfully")
10 changes: 10 additions & 0 deletions inst/test_data/calypso_concepts.csv
@@ -0,0 +1,10 @@
concept_id,concept_name,vocabulary_id,domain_id,concept_class_id,standard_concept,concept_code
3001079,Blood group antibody screen [Presence] in Serum or Plasma,LOINC,Measurement,Lab Test,S,890-4
4108450,Inspiration/expiration time ratio,SNOMED,Measurement,Observable Entity,S,250822000
4128111,T - Tumor stage,SNOMED,Observation,Attribute,S,260878002
4248525,Lying systolic blood pressure,SNOMED,Measurement,Observable Entity,S,407556006
4353713,Positive end expiratory pressure,SNOMED,Observation,Observable Entity,S,250854009
4353717,Ventilator delivered minute volume,SNOMED,Observation,Observable Entity,S,250875001
4353843,Invasive systolic arterial pressure,SNOMED,Measurement,Observable Entity,S,251071003
4354252,Non-invasive systolic arterial pressure,SNOMED,Measurement,Observable Entity,S,251070002
45766147,Appearance,SNOMED,Observation,Observable Entity,S,703248002
23 changes: 23 additions & 0 deletions inst/test_data/calypso_monthly_counts.csv
@@ -0,0 +1,23 @@
concept_id,concept_name,date_year,date_month,person_count,records_per_person
3001079,Blood group antibody screen [Presence] in Serum or Plasma,2020,5,1,1
4108450,Inspiration/expiration time ratio,2019,9,1,1
4128111,T - Tumor stage,2020,11,1,1
4128111,T - Tumor stage,2020,12,1,1
4248525,Lying systolic blood pressure,2019,5,1,1
4248525,Lying systolic blood pressure,2019,10,1,1
4248525,Lying systolic blood pressure,2020,6,1,1
4248525,Lying systolic blood pressure,2021,10,1,1
4353713,Positive end expiratory pressure,2019,2,1,1
4353713,Positive end expiratory pressure,2021,7,1,1
4353717,Ventilator delivered minute volume,2021,5,1,1
4353717,Ventilator delivered minute volume,2022,12,1,1
4353843,Invasive systolic arterial pressure,2021,9,1,1
4353843,Invasive systolic arterial pressure,2021,10,1,1
4353843,Invasive systolic arterial pressure,2021,12,1,1
4353843,Invasive systolic arterial pressure,2023,4,1,1
4354252,Non-invasive systolic arterial pressure,2019,8,1,1
4354252,Non-invasive systolic arterial pressure,2020,8,1,1
4354252,Non-invasive systolic arterial pressure,2021,2,1,1
4354252,Non-invasive systolic arterial pressure,2021,6,1,1
4354252,Non-invasive systolic arterial pressure,2021,11,1,1
45766147,Appearance,2022,6,1,1
17 changes: 17 additions & 0 deletions inst/test_data/calypso_summary_stats.csv
@@ -0,0 +1,17 @@
concept_id,concept_name,summary_attribute,value_as_number,value_as_string
3001079,Blood group antibody screen [Presence] in Serum or Plasma,frequency,2,Not present
4108450,Inspiration/expiration time ratio,mean,0.6666666666666666,NA
4108450,Inspiration/expiration time ratio,sd,0,NA
4128111,T - Tumor stage,frequency,1,NA
4128111,T - Tumor stage,frequency,1,NA
4248525,Lying systolic blood pressure,mean,132.8,NA
4248525,Lying systolic blood pressure,sd,21.25323504786977,NA
4353713,Positive end expiratory pressure,mean,5.666666666666667,NA
4353713,Positive end expiratory pressure,sd,1.1547005383792517,NA
4353717,Ventilator delivered minute volume,mean,12.799999999999999,NA
4353717,Ventilator delivered minute volume,sd,3.469870314579495,NA
4353843,Invasive systolic arterial pressure,mean,128.83333333333334,NA
4353843,Invasive systolic arterial pressure,sd,23.65938855225694,NA
4354252,Non-invasive systolic arterial pressure,mean,123.66666666666667,NA
4354252,Non-invasive systolic arterial pressure,sd,9.993331109628395,NA
45766147,Appearance,frequency,1,Well nourished
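
The three files in `inst/test_data` list the same set of concept IDs. Assuming that is the intended invariant, a consistency check could look like the base-R sketch below. It uses a two-concept excerpt of the data with some columns omitted, and the helper `same_concepts` is hypothetical; the PR's actual consistency test is not shown in this diff.

```r
# Excerpts of the three test data files, inlined so the sketch is self-contained
concepts <- read.csv(text = "concept_id,concept_name
4108450,Inspiration/expiration time ratio
4353713,Positive end expiratory pressure")

monthly_counts <- read.csv(text = "concept_id,date_year,date_month
4108450,2019,9
4353713,2019,2
4353713,2021,7")

summary_stats <- read.csv(text = "concept_id,summary_attribute,value_as_number
4108450,mean,0.6666666666666666
4108450,sd,0
4353713,mean,5.666666666666667
4353713,sd,1.1547005383792517")

# Hypothetical helper: TRUE when every table covers the same set of concept IDs
same_concepts <- function(...) {
  id_sets <- lapply(list(...), function(tbl) sort(unique(tbl$concept_id)))
  all(vapply(id_sets[-1], identical, logical(1), id_sets[[1]]))
}

stopifnot(same_concepts(concepts, monthly_counts, summary_stats))
```

Comparing sorted unique IDs keeps the check independent of row order and of how many rows each concept has in each table.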
10 changes: 10 additions & 0 deletions tests/testthat/_snaps/utils_get_data/concepts_table.csv
@@ -0,0 +1,10 @@
concept_id,concept_name,vocabulary_id,domain_id,concept_class_id,standard_concept,concept_code
3001079,Blood group antibody screen [Presence] in Serum or Plasma,LOINC,Measurement,Lab Test,S,890-4
4108450,Inspiration/expiration time ratio,SNOMED,Measurement,Observable Entity,S,250822000
4128111,T - Tumor stage,SNOMED,Observation,Attribute,S,260878002
4248525,Lying systolic blood pressure,SNOMED,Measurement,Observable Entity,S,407556006
4353713,Positive end expiratory pressure,SNOMED,Observation,Observable Entity,S,250854009
4353717,Ventilator delivered minute volume,SNOMED,Observation,Observable Entity,S,250875001
4353843,Invasive systolic arterial pressure,SNOMED,Measurement,Observable Entity,S,251071003
4354252,Non-invasive systolic arterial pressure,SNOMED,Measurement,Observable Entity,S,251070002
45766147,Appearance,SNOMED,Observation,Observable Entity,S,703248002
23 changes: 23 additions & 0 deletions tests/testthat/_snaps/utils_get_data/monthly_counts.csv
@@ -0,0 +1,23 @@
concept_id,concept_name,date_year,date_month,person_count,records_per_person
3001079,Blood group antibody screen [Presence] in Serum or Plasma,2020,5,1,1
4108450,Inspiration/expiration time ratio,2019,9,1,1
4128111,T - Tumor stage,2020,11,1,1
4128111,T - Tumor stage,2020,12,1,1
4248525,Lying systolic blood pressure,2019,5,1,1
4248525,Lying systolic blood pressure,2019,10,1,1
4248525,Lying systolic blood pressure,2020,6,1,1
4248525,Lying systolic blood pressure,2021,10,1,1
4353713,Positive end expiratory pressure,2019,2,1,1
4353713,Positive end expiratory pressure,2021,7,1,1
4353717,Ventilator delivered minute volume,2021,5,1,1
4353717,Ventilator delivered minute volume,2022,12,1,1
4353843,Invasive systolic arterial pressure,2021,9,1,1
4353843,Invasive systolic arterial pressure,2021,10,1,1
4353843,Invasive systolic arterial pressure,2021,12,1,1
4353843,Invasive systolic arterial pressure,2023,4,1,1
4354252,Non-invasive systolic arterial pressure,2019,8,1,1
4354252,Non-invasive systolic arterial pressure,2020,8,1,1
4354252,Non-invasive systolic arterial pressure,2021,2,1,1
4354252,Non-invasive systolic arterial pressure,2021,6,1,1
4354252,Non-invasive systolic arterial pressure,2021,11,1,1
45766147,Appearance,2022,6,1,1