From 14b13d02c132b44a157b6786ce087654cb31dc00 Mon Sep 17 00:00:00 2001
From: Michael Mattig
Date: Fri, 6 Dec 2024 17:03:11 +0100
Subject: [PATCH] add architecture

---
 .github/workflows/jekyll-gh-pages.yml | 25 ++++++++++-
 README.md                             | 63 ++++++++++++++++++++++++++-
 assets/architecture.mmd               | 51 ++++++++++++++++++++++
 assets/architecture.svg               |  1 +
 4 files changed, 137 insertions(+), 3 deletions(-)
 create mode 100644 assets/architecture.mmd
 create mode 100644 assets/architecture.svg

diff --git a/.github/workflows/jekyll-gh-pages.yml b/.github/workflows/jekyll-gh-pages.yml
index e31d81c..366471a 100644
--- a/.github/workflows/jekyll-gh-pages.yml
+++ b/.github/workflows/jekyll-gh-pages.yml
@@ -1,5 +1,5 @@
 # Sample workflow for building and deploying a Jekyll site to GitHub Pages
-name: Deploy Jekyll with GitHub Pages dependencies preinstalled
+name: Deploy Jekyll with GitHub Pages and Mermaid support
 
 on:
   # Runs on pushes targeting the default branch
@@ -16,7 +16,6 @@ permissions:
   id-token: write
 
 # Allow only one concurrent deployment, skipping runs queued between the run in-progress and latest queued.
-# However, do NOT cancel in-progress runs as we want to allow these production deployments to complete.
 concurrency:
   group: "pages"
   cancel-in-progress: false
@@ -28,13 +27,35 @@ jobs:
     steps:
       - name: Checkout
         uses: actions/checkout@v4
+
       - name: Setup Pages
         uses: actions/configure-pages@v5
+
+      # Set up Ruby to run Jekyll
+      - name: Set up Ruby
+        uses: ruby/setup-ruby@v1
+        with:
+          ruby-version: "3.0" # Ensure the version matches the latest stable Jekyll-compatible version
+          bundler-cache: true
+
+      # Create a Gemfile on the fly and add required gems
+      - name: Create Gemfile
+        run: |
+          echo "source 'https://rubygems.org'" > Gemfile
+          echo "gem 'jekyll', '~> 4.2'" >> Gemfile
+          echo "gem 'jekyll-mermaid'" >> Gemfile
+
+      # Install dependencies (including jekyll and jekyll-mermaid)
+      - name: Install dependencies
+        run: bundle install
+
+      # Build with Jekyll
       - name: Build with Jekyll
         uses: actions/jekyll-build-pages@v1
         with:
           source: ./
           destination: ./_site
+
       - name: Upload artifact
         uses: actions/upload-pages-artifact@v3
 
diff --git a/README.md b/README.md
index 9a8d62c..94a5990 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,62 @@
-# fair-ds-oc1-docs
\ No newline at end of file
+# FAIR-DS Demonstrator: Copernicus Data Space Machine Learning - ECOMETRICS
+
+The ECOMETRICS app is a demonstrator for the FAIR-DS project.
+This document presents the software architecture.
+You can find more information about the app itself in the [FAIR-DS wiki](https://fair-ds4nfdi.github.io/wiki/).
+
+## Architecture
+
+There are four components to the ECOMETRICS app demonstrator: the Data Spaces, the Geo Engine instance, the machine learning in Jupyter and the ECOMETRICS dashboard.
+There are two kinds of users: a Data Scientist who trains and refines a model, and a user who selects an area and time of interest and performs an analysis.
+The following diagram shows the architecture of the ECOMETRICS app.
![ECOMETRICS architecture](./assets/architecture.svg)
+
+{% mermaid %}
+flowchart LR
+subgraph DataSpaces
+NFDI[NFDI 4 Biodiversity] --> Aruna
+Copernicus[Copernicus Data Space Ecosystem] --> Aruna
+end
+
+subgraph GeoEngine
+Aruna[Data Connectors - Aruna & STAC] --> ProcessingEngine[Processing Engine]
+ProcessingEngine --> API[API]
+end
+
+API --> Dashboards[ECOMETRICS Dashboards]
+API --> PythonLibrary[Python Library]
+
+User[User] -->|Select area/time and analyze| Dashboards
+DataScientist[Data Scientist] -->|Train and refine model| PythonLibrary
+
+PythonLibrary --> Jupyter[Jupyter Notebook]
+PythonLibrary --> ScikitLearn[Scikit-Learn]
+PythonLibrary --> ONNX[ONNX]
+
+Dashboards -->|Show insights| User
+{% endmermaid %}
+
+### Data Spaces
+
+The demonstrator connects to the Copernicus Data Space and the NFDI4Biodiversity Data Space.
+The NFDI4Biodiversity Data Space provides training labels for the machine learning model.
+The Copernicus Data Space provides the satellite data.
+
+### Geo Engine
+
+Geo Engine has a data connector for each of the Data Spaces.
+The connectors can browse metadata and access the raw data for analysis.
+They map the files to the Geo Engine data model, which supports temporal and spatial queries.
+The processing engine harmonizes and enriches the data and makes it ready for machine learning.
+Standardized and custom API methods make the data available.
+
+### Machine Learning in Jupyter
+
+The machine learning notebook accesses the Geo Engine API to retrieve the data.
+It uses scikit-learn to train and refine a model.
+The model is stored in the ONNX format and uploaded to Geo Engine, where it is registered and can be used as an operator.
+
+### ECOMETRICS Dashboard
+
+The ECOMETRICS Dashboard is an easy-to-use web app that builds upon the Geo Engine UI toolkit.
+The dashboard has an interactive map and lets the user select an area and time of interest.
+It has an analysis function that triggers Geo Engine to run the machine learning model.
diff --git a/assets/architecture.mmd b/assets/architecture.mmd
new file mode 100644
index 0000000..2bad5ae
--- /dev/null
+++ b/assets/architecture.mmd
@@ -0,0 +1,51 @@
+flowchart TB
+    %% Data Spaces
+    subgraph DataSpaces["Data Spaces"]
+        nfdi["NFDI 4 BIODIVERSITY"]
+        copernicus["Copernicus Data Space Ecosystem"]
+    end
+
+    %% Geo Engine Components
+    subgraph GeoEngine["Geo Engine"]
+        subgraph DataConnectors["Data Connectors"]
+            aruna["ARUNA"]
+            stac["STAC"]
+        end
+        processingEngine["Processing Engine"]
+        api["API"]
+
+        subgraph PythonLibrary["Python Library"]
+            pythonLib["Python Library"]
+            scikit["Scikit-learn"]
+            onnx["ONNX"]
+            jupyter["Jupyter"]
+        end
+    end
+
+    %% ECOMETRICS Dashboards
+    subgraph Dashboards["ECOMETRICS Dashboards"]
+        dashboard["Dashboard"]
+    end
+
+    %% Actors
+    user["User"]
+    dataScientist["Data Scientist"]
+
+    %% Connections within Geo Engine
+    nfdi --> aruna
+    copernicus --> stac
+    aruna --> processingEngine
+    stac --> processingEngine
+    processingEngine --> api
+    api --> dashboard
+    api --> pythonLib
+    pythonLib --> scikit
+    pythonLib --> onnx
+    pythonLib --> jupyter
+
+    %% User Interactions
+    user -->|Select area/time and analyze| dashboard:::interaction
+    dataScientist -->|Train and refine model| pythonLib:::interaction
+
+    %% Styling for better readability (optional)
+    classDef interaction fill:#f9f,stroke:#333,stroke-width:2px;
diff --git a/assets/architecture.svg b/assets/architecture.svg
new file mode 100644
index 0000000..dfa3bef
--- /dev/null
+++ b/assets/architecture.svg
@@ -0,0 +1 @@
+
\ No newline at end of file