
Merge branch 'time-series'
# Conflicts:
#	url_handlers/basic_stats.py
#	url_handlers/coplots_pl.py
#	url_handlers/histogram.py
#	url_handlers/scatter_plot.py
#	webserver.py
Magdalena5 committed Jun 4, 2021
2 parents 4f31c6f + 294ef5d commit e424c6f
Showing 116 changed files with 29,099 additions and 14,434 deletions.
10 changes: 5 additions & 5 deletions Dockerfile
@@ -8,15 +8,15 @@ RUN apt-get update && \
pip install pipenv && \
pipenv install --ignore-pipfile --deploy --system



WORKDIR /app

ENV FLASK_ENV production
# not really needed as waitress ignores this option
ENV FLASK_APP webserver.py

ENV FLASK_RUN_HOST 0.0.0.0
ENV TZ=Europe/Berlin

EXPOSE 5428
EXPOSE 80

-CMD [ "waitress-serve", "--port", "80", "--call", "webserver:main" ]
+CMD [ "waitress-serve","--port","80","--call", "webserver:main" ]
19 changes: 9 additions & 10 deletions Pipfile
@@ -15,10 +15,12 @@ pyyaml = "*"
nltk = "*"
redis = "*"
ipython = "*"
-pybind11 = "*"
+"pybind11" = "*"
python-dotenv = "*"
click = "*"
flask = "*"
misc = {git = "https://bitbucket.org/bmmalone/misc.git"}
autosklearn = {ref = "development",git = "https://github.com/automl/auto-sklearn.git"}
flask-redis = "*"
sklearn = "*"
seaborn = "*"
@@ -31,18 +33,15 @@ pytest = "*"
apscheduler = "*"
waitress = "*"
ipdb = "*"
sqlalchemy = "*"

psycopg2 = "*"
statsmodels = "*"
gower = "*"
umap = "*"
flask_cors="*"
requests = "*"

[dev-packages]
pylint = "*"

[requires]
python_version = "3.7"

[packages.misc]
git = "https://bitbucket.org/bmmalone/misc.git"

[packages.autosklearn]
ref = "development"
git = "https://github.com/automl/auto-sklearn.git"
1,076 changes: 672 additions & 404 deletions Pipfile.lock


63 changes: 8 additions & 55 deletions README.md
@@ -16,76 +16,29 @@ Currently setup for deployment and not development
#### Usage ####
* `docker-compose up`

-### Setup Instructions Development ###
+### Setup Instructions Development [(detailed documentation)](https://github.com/dieterich-lab/medex/tree/PostgreSQL/documentation) ###
Not recommended for pure deployment.

#### Requirements ####
* [Python](https://www.python.org/) >= 3.7
* [pipenv](https://docs.pipenv.org/en/latest/) >= 2018.10.13
* [redis](https://redis.io/) >= 5.x
* [Docker-CE](https://docs.docker.com/install/) >= 18.09.07
* [docker-compose](https://docs.docker.com/compose/overview/) >= 1.24.0
* Linux/MacOS

#### Usage ####
-* `pipenv install` installs the latest depencies
+* `pipenv install` installs the latest dependencies
* `pipenv shell` enters the virtual environment
* `docker-compose up` creates the container for the PostgreSQL database
* `./scripts/start.sh`
* Develop


## Data Import ##
* Database imports run every night at 5:05 and at startup.
* The database is only updated if there is new data to import.

### Importing new data ###
To add new data, place a new `entities.csv` and `dataset.csv` in the `./import` folder.

The files should have the same format as the example files already in that directory.

The current format of the dataset.csv file comes from the research-warehouse export format of the data we analyse with this tool:

`Patient_ID,Billing_ID,Date,Time,Key,Value`

Example file starts like this:
```
f96ae85e2c3598e7eefa593a927fe1c8,d41d8cd98f00b204e9800998ecf8427e,2012-07-13,4:51:9,Gender,male
f96ae85e2c3598e7eefa593a927fe1c8,d41d8cd98f00b204e9800998ecf8427e,1999-03-13,15:26:20,Jitter_rel,0.25546
```
Billing_ID, Date, and Time are currently unused and optional; required are only a unique identifier of the data instance (patient), a parameter name, the respective value, and the six-column format, so this line works as well:
```
Patient1,,,,A_numeric_parameter,5.8
```

An `entities.csv` file is also required; it specifies each entity's data type, which can be String or Double.
In our example that would be a file starting like this:
```
entity,datatype
Gender,String
Jitter_rel,Double
```

Example files can be found in `./dataset_examples`. To test them, copy them to `./import` and restart the tool.
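The two formats above can be checked before an import run. The following is a minimal, hypothetical validation sketch (it is not part of the tool itself; the function and constant names are made up for illustration) that verifies the six-column layout of `dataset.csv` and looks up each parameter's declared datatype from `entities.csv`:

```python
import csv
import io

# Hypothetical validation sketch -- not part of the MedEx codebase.
# Checks the six-column format (Patient_ID, Billing_ID, Date, Time,
# Key, Value) and that every Key is declared in entities.csv, with a
# parseable value when the datatype is Double.

ENTITIES_CSV = """entity,datatype
Gender,String
Jitter_rel,Double
"""

DATASET_CSV = """\
f96ae85e2c3598e7eefa593a927fe1c8,d41d8cd98f00b204e9800998ecf8427e,2012-07-13,4:51:9,Gender,male
Patient1,,,,Jitter_rel,0.25546
"""

def load_entities(text):
    # entities.csv has a header row: entity,datatype
    reader = csv.DictReader(io.StringIO(text))
    return {row["entity"]: row["datatype"] for row in reader}

def validate_dataset(text, entities):
    errors = []
    for lineno, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if len(row) != 6:
            errors.append(f"line {lineno}: expected 6 columns, got {len(row)}")
            continue
        patient_id, _billing, _date, _time, key, value = row
        if not patient_id:
            errors.append(f"line {lineno}: missing patient identifier")
        if key not in entities:
            errors.append(f"line {lineno}: unknown entity {key!r}")
        elif entities[key] == "Double":
            try:
                float(value)
            except ValueError:
                errors.append(f"line {lineno}: {key} expects a number, got {value!r}")
    return errors

entities = load_entities(ENTITIES_CSV)
print(validate_dataset(DATASET_CSV, entities))  # -> [] (both rows are valid)
```

Note that the second sample row exercises the minimal form described above: Billing_ID, Date, and Time are left empty but the six-column shape is kept.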


### Controlling the Data Import Scheduler ###
To learn more about the scheduler and its configuration, read [here](https://apscheduler.readthedocs.io/en/latest/modules/triggers/cron.html#module-apscheduler.triggers.cron).
The scheduler can be controlled by four environment variables:
* `IMPORT_DISABLED` disables the scheduler if set to any value
* `IMPORT_DAY_OF_WEEK` sets the days on which the import runs
* `IMPORT_HOUR` sets the hour it runs at, e.g. 5 means 5 a.m.
* `IMPORT_MINUTE` sets the minute the import runs
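The four variables above map naturally onto APScheduler's cron trigger arguments. A rough sketch of that mapping follows (assumed behaviour for illustration; the actual wiring in `webserver.py` may differ, and `cron_kwargs_from_env` is a made-up helper name):

```python
import os

# Sketch: translate the IMPORT_* environment variables into keyword
# arguments for apscheduler.triggers.cron.CronTrigger, e.g.
#   CronTrigger(**cron_kwargs_from_env(os.environ))
# Returns None when the scheduler is disabled.

def cron_kwargs_from_env(env):
    if "IMPORT_DISABLED" in env:  # any value disables the scheduler
        return None
    return {
        "day_of_week": env.get("IMPORT_DAY_OF_WEEK", "*"),
        "hour": int(env.get("IMPORT_HOUR", 5)),
        "minute": int(env.get("IMPORT_MINUTE", 5)),
    }

print(cron_kwargs_from_env({}))  # defaults: every day at 5:05
```

With no variables set, this reproduces the nightly 5:05 schedule described in the Data Import section.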



## Deploy Debian Based ##
Please keep in mind that the application uses port 80 by default; this can be changed in `./docker-compose.yml`.
The systemd service below manages the application startup automatically, i.e. autostart and restart on crash.

1. Open the service file `./scripts/data-warehouse.service`
2. Change `WorkingDirectory` to the current main directory
3. Copy to systemd folder; for instance, `/etc/systemd/system/`
4. Run `sudo systemctl daemon-reload`
5. Run `sudo systemctl enable data-warehouse.service`
6. Run `sudo systemctl start data-warehouse.service`
7. Check whether it runs properly
* In order to add new data, add a new `header.csv`, `entities.csv`, and `dataset.csv` to the `./import` folder.
* The `header.csv`, `entities.csv`, and `dataset.csv` files should look like those in the `dataset_examples` directory [(detailed documentation)](https://github.com/dieterich-lab/medex/tree/time-series/dataset_examples/Data_import.md).



