Bring in stream metadata from static CSVs #55

Merged
jeremyestein merged 6 commits into sk/waveform-dev from jeremy/hf-data-metadata
Oct 3, 2024

@jeremyestein commented Sep 19, 2024

Addresses #45

Copy the CSV metadata tables from Elise's repo and use them to fill in some gaps in the HL7 data we encounter (e.g. sampling rate).

A further PR will validate that the sampling rate is what we expect it to be. The gap checking algorithm in collation is already poised to do this, although it won't be able to do much about a mismatch apart from reporting an error that we can investigate.
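As a rough illustration of the gap filling described above (not the code in this PR), a static CSV can be loaded into a lookup keyed by stream id and consulted only when a message lacks a sampling rate. The column names, stream ids, and the `load_stream_metadata` / `fill_sampling_rate` helpers are all hypothetical, made up for this sketch.

```python
import csv
import io

# Hypothetical CSV layout and values; the real metadata tables may differ.
SAMPLE_CSV = """stream_id,name,sampling_rate_hz
stream-a,ECG,300
stream-b,SpO2,50
"""


def load_stream_metadata(csv_text):
    """Return a dict mapping stream id to its metadata row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["stream_id"]: row for row in reader}


def fill_sampling_rate(message, metadata):
    """Return a copy of the message, filling sampling rate from metadata if absent.

    A value already present in the message is left untouched, and an
    unknown stream id leaves the gap in place rather than guessing.
    """
    filled = dict(message)
    if filled.get("sampling_rate_hz") is None:
        row = metadata.get(filled["stream_id"])
        if row is not None:
            filled["sampling_rate_hz"] = int(row["sampling_rate_hz"])
    return filled


metadata = load_stream_metadata(SAMPLE_CSV)
incomplete = {"stream_id": "stream-a", "sampling_rate_hz": None}
print(fill_sampling_rate(incomplete, metadata))
```

Leaving unknown streams unfilled keeps the later validation step meaningful: a missing rate can still be reported as an error to investigate.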

Also adding more threads to try to make it go faster, and holding locks for shorter periods during collation.

if we can get actual parallelism.
Spring Integration uses scheduling so increase that thread pool too.
Effect is not dramatic - will have to come back to this.
location+stream! Because streams can be mixed within a single HL7 message.
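One reading of the notes above, sketched under assumptions: pending data is keyed by the (location, stream) pair rather than by location alone, and the lock is held only long enough to detach a batch, so the slower collation work runs outside it. The `PendingSamples` class and its methods are hypothetical and are not taken from the PR.

```python
import threading
from collections import defaultdict


class PendingSamples:
    """Hypothetical buffer of uncollated samples, keyed by (location, stream id)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = defaultdict(list)

    def add(self, location, stream_id, samples):
        # One HL7 message may carry several streams, so the key includes both parts.
        with self._lock:
            self._pending[(location, stream_id)].extend(samples)

    def take(self, location, stream_id):
        """Detach and return the pending samples for one key.

        The lock covers only the dictionary update; the caller processes
        the returned list without holding it.
        """
        with self._lock:
            return self._pending.pop((location, stream_id), [])


store = PendingSamples()
store.add("bed-1", "stream-a", [1, 2])
store.add("bed-1", "stream-b", [9])
batch = store.take("bed-1", "stream-a")  # only stream-a data for bed-1
```

Whether shorter lock holds translate into real parallelism depends on the thread pools involved, which matches the note that the effect so far is not dramatic.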

PR checklist

Default guide for a PR (if multiple PRs for the work, only keep one version of it and link to it on the other PRs)

  • From the UCLH data science desktop, a validation run has been set off
  • "load times" in UCL teams has been populated with the run information
  • During the run, glowroot has been checked for any queries which are taking a substantial proportion of the
    total processing time. This can be useful to identify indexes that are required.
  • After the run, look for any unexpected errors in the etl_per_message_logging table. The error_search.sql file
    on the shared drive can be used for this: \\sharefs6\UCLH6\EMAP\Shared\EmapSqlScripts\devops\error_search.sql.
    Create an issue if you find an unexpected exception that is not related to the changes you've made; otherwise
    fix it!
  • After the run, populate the end time in "load times"
  • Let Aasiyah know about the completed validation and give her information on the changes and where to start
    with the validation
  • Check validation report and give any feedback to Aasiyah if there are any changes needed on her side,
    iterate on getting the validation to match at least 99% (validation and emap code).

@jeremyestein marked this pull request as ready for review September 25, 2024 16:21
@jeremyestein merged commit 2eba4de into sk/waveform-dev Oct 3, 2024
8 checks passed
@jeremyestein deleted the jeremy/hf-data-metadata branch October 3, 2024 11:29