global-river-databases

This repository contains reformatted and merged versions of the RiverAtlas and GLORICH global river databases. The merged data set is the input for automated, continental-scale estimates of river sediment respiration rates, which are archived in dynamic-learning-rivers. The machine learning (ML) workflow that generates these estimates uses the SuperLearner stacked ensemble of ML models. Data from the WHONDRS project are also included here to document the preprocessing steps necessary for creating a training data set.

The code in ./scripts finds, for each data point in the GLORICH database (river chemistry), the closest RiverAtlas data point (physical information about each river segment). Intermediate data saved at key steps are available in ./GLORICH and ./RiverAtlas, while more information about the data processing and the final results is in ./scripts/. The scripts use GMT; to ensure portability on cloud resources, they run a containerized version of GMT that is described by the Dockerfile in ./container.
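To illustrate the matching idea only, here is a minimal Python sketch of a nearest-point search using great-circle (haversine) distance. It is not the repository's implementation, which relies on GMT; the function names, the dictionary keys (`id`, `lon`, `lat`), and the sample coordinates are all hypothetical and chosen for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_segment(sample, segments):
    """Return (segment_id, distance_km) of the segment point closest to the sample.

    Brute-force search over all candidates: simple, but O(N) per sample.
    """
    best = min(
        segments,
        key=lambda s: haversine_km(sample["lon"], sample["lat"], s["lon"], s["lat"]),
    )
    dist = haversine_km(sample["lon"], sample["lat"], best["lon"], best["lat"])
    return best["id"], dist


# Hypothetical stand-ins for a chemistry sample and river segment points.
chemistry_sample = {"id": "site-1", "lon": -119.0, "lat": 46.0}
river_segments = [
    {"id": "seg-A", "lon": -119.01, "lat": 46.01},
    {"id": "seg-B", "lon": -100.0, "lat": 40.0},
]

segment_id, distance_km = nearest_segment(chemistry_sample, river_segments)
print(segment_id, round(distance_km, 2))
```

A brute-force search like this scales poorly to global data sets; a spatial index or a dedicated geospatial tool is the more practical choice at that size.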

The ./WHONDRS directory contains a copy of the WHONDRS data that can be used to train an ML model predicting river sediment respiration rate. This data is processed by ./notebooks/whondrs_preproc.ipynb into a format that is directly usable by the ML workflow.

The contents of each directory are distributed under different licenses. RiverAtlas is licensed under a Creative Commons Attribution (CC-BY) 4.0 International License. RiverAtlas was created/published by:

Linke, S., Lehner, B., Ouellet Dallaire, C., Ariwi, J., Grill, G., Anand, M., Beames, P., Burchard-Levine, V., Maxwell, S., Moidu, H., Tan, F., Thieme, M. (2019). Global hydro-environmental sub-basin and river reach characteristics at high spatial resolution. Scientific Data 6: 283. doi: https://doi.org/10.1038/s41597-019-0300-6

GLORICH is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC-BY-NC-SA-4.0). GLORICH was created/published by:

Hartmann et al. (2014) A Brief Overview of the GLObal RIver Chemistry Database, GLORICH. Procedia Earth and Planetary Science, 10, 23-27: https://doi.org/10.1016/j.proeps.2014.08.005

WHONDRS is distributed under two different CC licenses, and only partial copies are available here. The FTICR data are under a Creative Commons CC0 1.0 Universal Public Domain Dedication, while the site_data are under a Creative Commons Attribution 4.0 International License. Please see the following citations for the full WHONDRS data set:

Garayburu-Caruso V A ; Goldman A E ; Toyoda J G ; Chu R ; Renteria L ; Stegen J C ; Sengupta A ; Torgeson J M ; Willi K ; Ross M (2022): FTICR-MS Data from Multi-continent River Water and Sediment and from Coastal River Fresh and Saline Sediment Associated with: Dissolved Organic Matter Functional Trait Relationships are Conserved Across Rivers. Early Career Research Program: Watershed Perturbation-Response Traits Derived Through Ecological Theory - Worldwide Hydrobiogeochemistry Observation Network for Dynamic River Systems (WHONDRS), ESS-DIVE repository. Dataset. doi:10.15485/1824222 accessed via https://data.ess-dive.lbl.gov/datasets/doi:10.15485/1824222 on 2024-01-16

Goldman A E ; Arnon S ; Bar-Zeev E ; Chu R K ; Danczak R E ; Daly R A ; Delgado D ; Fansler S ; Forbes B ; Garayburu-Caruso V A ; Graham E B ; Laan M ; McCall M L ; McKever S ; Patel K F ; Ren H ; Renteria L ; Resch C T ; Rod K A ; Tfaily M ; Tolic N ; Torgeson J M ; Toyoda J G ; Wells J ; Wrighton K C ; Stegen J C ; WHONDRS Consortium T (2020): WHONDRS Summer 2019 Sampling Campaign: Global River Corridor Sediment FTICR-MS, Dissolved Organic Carbon, Aerobic Respiration, Elemental Composition, Grain Size, Total Nitrogen and Organic Carbon Content, Bacterial Abundance, and Stable Isotopes (v8). River Corridor and Watershed Biogeochemistry SFA, ESS-DIVE repository. Dataset. doi:10.15485/1729719 accessed via https://data.ess-dive.lbl.gov/datasets/doi:10.15485/1729719 on 2024-01-16

Finally, the data reformatting scripts themselves are distributed under the MIT license.