Commit
Martin Gleize
committed
Mar 9, 2020
1 parent
90c0ec6
commit b07259b
Showing
914 changed files
with
810,913 additions
and
193,932 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
** | ||
|
||
!Dockerfile | ||
!data | ||
!target/*.war | ||
!*.properties | ||
!LICENSE | ||
!README.md |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,30 @@ | ||
/target/ | ||
/output/ | ||
**/.DS_Store | ||
scripts/bcts.supervised.properties | ||
logs/ | ||
bct-s.properties | ||
scripts/*.properties | ||
scripts/res*.txt | ||
test_ie_index/ | ||
PropertiesExplained.properties | ||
PropertiesExplained | ||
com.ibm.drl.hbcp.inforetrieval.apr/*.properties | ||
trec_eval/trec_eval | ||
|
||
# Binaries | ||
*.class | ||
|
||
# eclipse project file | ||
.settings/ | ||
.classpath | ||
.project | ||
|
||
# IntelliJ files | ||
.idea/ | ||
hbcpIE.iml | ||
/nbproject/ | ||
|
||
# Grobid-related files/folders | ||
/grobid-0.5.3 | ||
grobid.log* |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
# You can validate this file at: https://lint.travis-ci.org | ||
|
||
language: java | ||
jdk: | ||
- openjdk8 | ||
|
||
cache: | ||
directories: | ||
- $HOME/data/ | ||
- $HOME/data_resources/ | ||
|
||
#Use the default script command of travis with maven which is mvn test -B | ||
#Before running script, for maven projects travis executes 'mvn install -DskipTests=true -Dmaven.javadoc.skip=true -B -V' | ||
|
||
before_script: mvn install -DskipTests=true -Dmaven.javadoc.skip=true -B -V | ||
|
||
script: | ||
- mvn javadoc:javadoc | ||
- mvn test -B | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
# OpenJDK JRE for Java 8, with Tomcat 9.0.30 server | ||
FROM tomcat:9.0.30-jdk8-openjdk | ||
# Tomcat server port | ||
EXPOSE 8080 | ||
# places the app in the Tomcat app folder (at the root of the server) | ||
RUN rm -rf /usr/local/tomcat/webapps/* | ||
COPY ./target/hbcpIE-0.0.1-SNAPSHOT.war /usr/local/tomcat/webapps/ROOT.war | ||
# set the max size of the JVM heap | ||
ENV CATALINA_OPTS -Xms512m -Xmx8g | ||
# run catalina (Tomcat's servlet container) | ||
CMD ["catalina.sh", "run"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
Copyright 2018 - UCL | ||
|
||
Licensed under the Apache License, Version 2.0 (the "License"); | ||
you may not use this file except in compliance with the License. | ||
You may obtain a copy of the License at | ||
|
||
http://www.apache.org/licenses/LICENSE-2.0 | ||
|
||
Unless required by applicable law or agreed to in writing, software | ||
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,190 +1,110 @@ | ||
# Human Behaviour Change Project Information Extraction (hbcpIE) | ||
version 2.0.0 | ||
# Human Behaviour Change Project (HBCP) | ||
|
||
The Human Behaviour-Change Project (HBCP) is a collaboration between behavioral scientists, computer scientists and | ||
system architects that aims to revolutionize methods for synthesizing evidence in real time and generate new insights on | ||
behavior change. | ||
|
||
### Update: | ||
demo currently down due to security issue. | ||
The system can be run locally following the instructions below. | ||
This repository includes code for two important tasks in the project: behavior change entity extraction and prediction | ||
for behavior change (e.g., outcome value given a set of population and intervention entities). | ||
|
||
## Getting Started | ||
|
||
17/04/2019 | ||
These instructions will get you a copy of the project up and running on your local machine. | ||
|
||
Copyright (c): Apache License Version 2.0 | ||
### Prerequisites | ||
|
||
Created by HBCP team, IBM Research, Dublin | ||
hbcpIE uses Java 1.8 and needs to have [Maven](https://maven.apache.org/) installed to compile and run the project. | ||
|
||
### Installing | ||
|
||
* Code contributions: Debasis Ganguly, Martin Gleize, Yufang Hou, Charles Jochim, Francesca Bonin | ||
After cloning the project, go to the root `hbcpIE` directory. | ||
|
||
Java version: "1.8.0" | ||
|
||
The project represents the first version of the code of the HBCP project. | ||
It contains two main packages: an information extraction pipeline (extractor) and a prediction pipeline (predictor). Both have APIs that can be tested via a Swagger UI. | ||
|
||
|
||
## Extractor | ||
The package executes the following actions: | ||
1. Indexes a collection of documents | ||
2. Stores them in a Lucene index | ||
3. Extracts pieces of information from arbitrary passages. | ||
|
||
|
||
To cite the work please cite: | ||
|
||
Debasis Ganguly, Léa A. Deleris, Pol Mac Aonghusa, Alison J. Wright, Ailbhe N. Finnerty, Emma Norris, Marta M. Marques, Susan Michie: | ||
Unsupervised Information Extraction from Behaviour Change Literature. MIE 2018: 680-684 | ||
```@inproceedings{DBLP:conf/mie/GangulyAAWFNMM18, | ||
author = {Debasis Ganguly and | ||
L{\'{e}}a A. Deleris and | ||
Pol Mac Aonghusa and | ||
Alison J. Wright and | ||
Ailbhe N. Finnerty and | ||
Emma Norris and | ||
Marta M. Marques and | ||
Susan Michie}, | ||
title = {Unsupervised Information Extraction from Behaviour Change Literature}, | ||
booktitle = {Building Continents of Knowledge in Oceans of Data: The Future of | ||
Co-Created eHealth - Proceedings of {MIE} 2018, Medical Informatics | ||
Europe, Gothenburg, Sweden, April 24-26, 2018}, | ||
pages = {680--684}, | ||
year = {2018} | ||
} | ||
Compile the code: | ||
``` | ||
mvn clean compile -U | ||
``` | ||
[bibtex](https://dblp.uni-trier.de/rec/bibtex/conf/mie/GangulyAAWFNMM18) | ||
|
||
|
||
### Description | ||
The project takes as input PDFs (scientific articles of behaviour change interventions) and returns several attributes encoded by the Behaviour Change Intervention Ontology developed by UCL in the context of the Human Behaviour Change Project (HBCP) (https://www.humanbehaviourchange.org/). A REST API with a Swagger UI is provided. This allows anyone to test the system from a web browser: [http://23.97.177.82:8180/swagger-ui.html](http://23.97.177.82:8180/swagger-ui.html) | ||
|
||
Supported entities for the moment: | ||
|
||
Population characteristics: min age, max age, gender and mean age. | ||
|
||
Behavioural Change Techniques: Goal Setting (Behaviour), Problem Solving, Action Planning, Feedback on behaviour, Self-monitoring of behaviour, Social support (unspecified), Information about health consequences, Information about social and environmental consequences, Pharmacological support and Reduce negative emotions. | ||
|
||
Outcome : outcome value | ||
|
||
## Predictor | ||
The package executes the following actions: | ||
1. Takes as input entities and the documents from which they have been extracted | ||
2. Builds a relation graph where entities are nodes and co-occurrences in the same document are edges | ||
3. Creates an embedding space via Node2Vec | ||
4. Allows querying the space to find an entity given one or more other entities. | ||
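
Steps 1 and 2 of the predictor can be illustrated with a minimal Python sketch. It is not the project's Java implementation: the function name `build_cooccurrence_graph`, the document names, and the `O:outcome:12.5` label are hypothetical, and the Node2Vec embedding of step 3 is deliberately left out. It only shows how entities that appear in the same document become weighted edges.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(doc_entities):
    """Map each unordered entity pair to the number of documents containing both.

    doc_entities: dict of document name -> list of entity labels (hypothetical input shape).
    """
    edges = defaultdict(int)
    for entities in doc_entities.values():
        # Deduplicate and sort so every unordered pair is counted once per document.
        for a, b in combinations(sorted(set(entities)), 2):
            edges[frozenset((a, b))] += 1
    return dict(edges)


# Illustrative data only; the third label is made up for this sketch.
docs = {
    "paper_1.pdf": ["C:4507435:18", "I:3673271:1"],
    "paper_2.pdf": ["C:4507435:18", "I:3673271:1", "O:outcome:12.5"],
}
graph = build_cooccurrence_graph(docs)
```

The edge weights could then be handed to any Node2Vec implementation as the input graph; which one the project uses is not stated here.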
|
||
### Description | ||
The project takes as input PDFs (scientific articles of behaviour change interventions) and a JSON file of either automatically extracted or manually annotated entities. It allows the user to write a query and retrieves the most relevant outcome value for that particular query. A REST API with a Swagger UI is provided. This allows anyone to test the system from a web browser: [http://23.97.177.82:8180/swagger-ui.html](http://23.97.177.82:8180/swagger-ui.html) | ||
|
||
|
||
### Important Note: | ||
Please consider that this is a work in progress and we are currently working on improving/expanding the project. | ||
Feel free to download the code and use it, and to send us feedback and bug reports. | ||
|
||
|
||
## Requirements | ||
Most recent version of Maven: https://maven.apache.org/download.cgi | ||
|
||
## What has been released in this repository | ||
|
||
We are releasing: | ||
- code for supervised, semi-supervised and unsupervised retrieval of a selection of BCTs | ||
- Swagger api facilities | ||
- Java documentation for each class | ||
- 17 fully annotated open access papers | ||
|
||
## Dataset | ||
|
||
In the context of the HBCP (https://www.humanbehaviourchange.org/), 244 behaviour science intervention papers have been annotated for 10 behaviour change techniques and 4 population characteristics (min age, max age, mean age and gender) according to the Behaviour Change Intervention Ontology. Our pre-trained model is trained on a subset of 111 papers. We release 17 papers (that the model has not been trained on) as a sample dataset. Those 17 papers are open access and publicly available. | ||
|
||
|
||
|
||
## Quickstart for behavioural science users | ||
### Information Extraction | ||
## Example Usage | ||
|
||
- Visit this page to demo the system http://23.97.177.82:8180/swagger-ui.html | ||
- Click on extractor-controller | ||
- Choose the entity you want to extract from your pdf: | ||
- all BCT present in the paper (allbcts) | ||
- all BCT present in the paper using a supervised method (allbcts/supervised) | ||
- detect the presence of the bct specified in the parameter "code" (bct) | ||
- detect the presence of the bct specified in the parameter "code" using a supervised method (bct/supervised) | ||
- the gender of participants (gender) | ||
- the minimum age of participants (minage) | ||
- the maximum age of participants (maxage) | ||
- the mean age of participants (meanage) | ||
- Upload the pdf with the : "choose file" button | ||
- Click "Try it out!" | ||
Once you've confirmed that Maven can build the code, there are two ways to quickly test our APIs. | ||
1. [Use with Maven commands](#with-maven-commands) | ||
1. [Use with Docker](#with-docker) | ||
|
||
Results will be shown in this format: | ||
### With Maven commands | ||
|
||
The easiest way to test our entity extraction and prediction APIs is via a [Swagger](https://swagger.io/) interface | ||
using your own PDFs of behavior change literature. Before doing that, we need to build the indexes used by extraction | ||
and prediction. This is done with the following commands: | ||
``` | ||
{"code": "Minage", | ||
"docName": "Volpp 2009 primary paper.pdf", | ||
"extractedValue": "18" | ||
} | ||
mvn exec:java@indexer | ||
``` | ||
and | ||
``` | ||
mvn exec:java@extractor | ||
``` | ||
|
||
I.e., looking for the minimum age in the PDF `Volpp 2009 primary paper.pdf`, the extracted value is 18. | ||
|
||
|
||
### Prediction | ||
Visit this page to demo the system http://23.97.177.82:8180/swagger-ui.html | ||
|
||
Click on predictor-controller | ||
|
||
You can either: | ||
- get one attribute given an ID | ||
- get the entire set of attributes (interventions, gender, settings, etc.) | ||
- predict an outcome given a population query and an intervention query expressed as follows: | ||
|
||
- For population: C:<attributeID>:value, eg. C:4507435:18 for Mean age=18 | ||
- For Interventions: I:<attributeID>:value, eg. I:3673271:1 for Goal settings. | ||
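
The query syntax above can be made concrete with a small Python sketch. This is only an illustration of the `<prefix>:<attributeID>:<value>` shape described in the README, not the project's parser; `QueryTerm`, `KINDS`, and `parse_query_term` are hypothetical names, and the sketch assumes that `C` and `I` are the only prefixes.

```python
from typing import NamedTuple


class QueryTerm(NamedTuple):
    kind: str          # "population" or "intervention"
    attribute_id: str  # kept as a string; the README does not specify the ID type
    value: str


# Prefixes taken from the README; any other prefix is rejected in this sketch.
KINDS = {"C": "population", "I": "intervention"}


def parse_query_term(term: str) -> QueryTerm:
    """Split a term such as 'C:4507435:18' into its three parts."""
    parts = term.split(":", 2)
    if len(parts) != 3:
        raise ValueError(f"expected <prefix>:<attributeID>:<value>, got {term!r}")
    prefix, attribute_id, value = parts
    if prefix not in KINDS:
        raise ValueError(f"unknown prefix {prefix!r} in {term!r}")
    return QueryTerm(KINDS[prefix], attribute_id, value)


population = parse_query_term("C:4507435:18")   # mean age = 18 in the README example
intervention = parse_query_term("I:3673271:1")  # goal setting in the README example
```

Splitting with a maximum of two separators keeps any further colons inside the value, which seemed the safer choice given that the allowed value formats are not documented here.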
(you should see after each of these something like `[INFO] BUILD SUCCESS`) | ||
|
||
Click "Try it out!" | ||
Results will be shown in this format: | ||
Next we will start the server that will allow us to access the Swagger interface: | ||
``` | ||
mvn spring-boot:run | ||
``` | ||
|
||
## Quickstart for coders | ||
This will take several seconds to start. After it has started, open a web browser and go to | ||
http://127.0.0.1:8080/swagger-ui.html. You can then follow the instructions on that page to see how to use the extractor and | ||
predictor APIs. | ||
|
||
### With Docker | ||
|
||
### For information extraction | ||
From command line type the following command to index the collection. | ||
First build the project with: | ||
``` | ||
mvn exec:java@indexer | ||
mvn clean install | ||
``` | ||
The next step is to execute the following command to run the IE pipeline. | ||
Build the docker image with: | ||
``` | ||
mvn exec:java@extractor | ||
docker build . | ||
``` | ||
The only thing you might need is to increase your Docker runtime memory option to at least 8GB. | ||
|
||
### For predictions | ||
Run a docker container exposing the API on port 8080: | ||
``` | ||
mvn exec:java@predict | ||
docker run -t -p 8080:8080 [your_image_id] | ||
``` | ||
It will take a while, as it is indexing and running the extraction. After it has started, open a web browser and go to http://127.0.0.1:8080/swagger-ui.html. You can then follow the instructions on that page to see how to use the extractor and predictor APIs. | ||
|
||
### Test REST API with Swagger UI | ||
The project uses spring boot and spring fox with swagger ui. | ||
## Publications | ||
|
||
From the command line, locate yourself in the project HOME (hbcpIE/) and execute | ||
``` | ||
mvn spring-boot:run | ||
``` | ||
Open a browser and go to this [URL](http://localhost:8180/swagger-ui.html) | ||
If you use the extractor please cite: | ||
|
||
* Debasis Ganguly, Yufang Hou, Léa A. Deleris, Francesca Bonin: | ||
[Information Extraction of Behavior Change Intervention Descriptions](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6568066/). AMIA Joint Summits on Translational Science | ||
proceedings 2019:182–191. | ||
|
||
## Java Documentation | ||
[Javadoc](apidocs/index.html) | ||
#### Other Related Publications | ||
* Yufang Hou, Debasis Ganguly, Léa A. Deleris, Francesca Bonin: | ||
[Extracting Factual Min/Max Age Information from Clinical Trial Studies](https://www.aclweb.org/anthology/W19-1914/). | ||
Proceedings of the 2nd Clinical Natural Language Processing Workshop 2019: 107-116. | ||
* Debasis Ganguly, Léa A. Deleris, Pol Mac Aonghusa, Alison J. Wright, Ailbhe N. Finnerty, Emma Norris, Marta M. Marques, Susan Michie: | ||
[Unsupervised Information Extraction from Behaviour Change Literature](http://ebooks.iospress.nl/publication/48878). MIE 2018: 680-684 | ||
* Susan Michie, James Thomas, Marie Johnston, Pol Mac Aonghusa, John Shawe-Taylor, Michael P. Kelly, Léa A. Deleris, | ||
Ailbhe N. Finnerty, Marta M. Marques, Emma Norris, Alison O’Mara-Eves, Robert West: [The Human Behaviour-Change Project: | ||
harnessing the power of artificial intelligence and machine learning for evidence synthesis and interpretation](https://doi.org/10.1186/s13012-017-0641-5). | ||
Implementation Science 12, 121 (2017). | ||
|
||
|
||
## THANKS | ||
Thanks to the UCL annotators that developed the Behaviour Change Intervention Ontology. | ||
## Team Members | ||
* Debasis Ganguly | ||
* Martin Gleize | ||
* Yufang Hou | ||
* Charles Jochim | ||
* Francesca Bonin | ||
* Pierpaolo Tommasi | ||
|
||
## CHANGES | ||
- version 2.0 | ||
## Acknowledgments | ||
Thanks to the UCL annotators that developed the Behaviour Change Intervention Ontology. | ||
|
||
## LICENSE | ||
This program is free software; you can redistribute it and/or | ||
modify it under the terms of the Apache License Version 2.0. | ||
## License | ||
This program is free software; you can redistribute it and/or modify it under the terms of the [Apache License | ||
Version 2.0](./LICENSE). | ||
|
||
## Contact | ||
|
||
For help or issues using the HBCP code, please submit a GitHub issue. | ||
For personal communication related to the project, please contact the HBCP team ([email protected]) |