
Commit

Update documentation
Hernán Vargas committed Mar 20, 2024
1 parent cb6ac2c commit 4cf1122
Showing 7 changed files with 54 additions and 39 deletions.
10 changes: 5 additions & 5 deletions docs/concepts.md
@@ -1,12 +1,12 @@
# Core Concepts in DISK

- **Hypothesis:** A hypothesis statement is a set of assertions about entities that can be tested. A hypothesis can be tested by analyzing relevant data.
- **Hypothesis or Goal:** A hypothesis or goal is a set of assertions about entities that can be tested. A hypothesis can be tested by analyzing relevant data, as defined in Lines of Inquiry.
- **Question:** A statement that represents the goal of a scientific investigation.
- **Hypothesis or Question Template:** A general pattern that can be instantiated to create a particular hypothesis or question.
- **Workflow:** Workflows specify multi-step computations to carry out a type of data analysis.
- **Question Template:** A general pattern that can be instantiated to create a particular hypothesis or goal.
- **Workflow:** Workflows specify multi-step computations defined to carry out a type of data analysis.
- **Method:** A general approach that is followed to test a hypothesis.
- **Line of Inquiry (LOI):** [DISK](https://disk.isi.edu) represents in an LOI *how* a hypothesis will be tested through a computational experiment. An LOI specifies: 1) A hypothesis template, 2) A query to retrieve relevant data from accessible data sources in [DISK](https://disk.isi.edu), 3) One or more workflows to analyze the data retrieved from the query, and 4) A meta-workflow to combine the results of all the workflows and synthesize findings.
- **Triggered Line of Inquiry:** When the user specifies a hypothesis, it is matched against the hypothesis templates of all LOIs. The matched LOI is then triggered for execution.
- **Line of Inquiry (LOI):** Represents *how* a hypothesis will be tested through a computational experiment. An LOI specifies: 1) A question template, 2) A query to retrieve relevant data from accessible data sources in [DISK](https://disk.isi.edu), 3) One or more workflows to analyze the data retrieved from the query, and 4) A meta-workflow to combine the results of all the workflows and synthesize findings.
- **Triggered Line of Inquiry:** When a Line of Inquiry finds a Hypothesis that matches its question template, a new Triggered Line of Inquiry (TLOI) is created. TLOIs store data query results and workflow execution outputs.
- **Provenance:** [DISK](https://disk.isi.edu) records the provenance of all results so that they can be inspected and reproduced.
- **Metadata:** [DISK](https://disk.isi.edu) accesses data sources that contain datasets that are well described with appropriate metadata that can be used in specifying queries.
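
To make the four parts of an LOI concrete, here is a minimal sketch in Java. The class `LoiSketch`, its nested `LineOfInquiry` type, the field names, and the sample values are all hypothetical and chosen for illustration; they are not the classes used by the DISK code base.

```java
import java.util.List;

class LoiSketch {
    // Hypothetical container for the four parts of an LOI; names are illustrative only.
    static class LineOfInquiry {
        final String questionTemplate;
        final String dataQuery;
        final List<String> workflows;
        final String metaWorkflow;

        LineOfInquiry(String questionTemplate, String dataQuery,
                      List<String> workflows, String metaWorkflow) {
            this.questionTemplate = questionTemplate;
            this.dataQuery = dataQuery;
            this.workflows = List.copyOf(workflows); // defensive, immutable copy
            this.metaWorkflow = metaWorkflow;
        }

        // True when all four parts are present, with at least one workflow.
        boolean hasAllParts() {
            return !questionTemplate.isEmpty() && !dataQuery.isEmpty()
                && !workflows.isEmpty() && !metaWorkflow.isEmpty();
        }
    }

    public static void main(String[] args) {
        LineOfInquiry loi = new LineOfInquiry(
            "Is ?gene associated with ?disease?",          // 1) question template
            "SELECT ?file WHERE { ?file ?about ?gene }",   // 2) data query (placeholder text)
            List.of("association-test"),                   // 3) one or more workflows
            "combine-results");                            // 4) meta-workflow
        System.out.println("LOI complete: " + loi.hasAllParts());
    }
}
```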

9 changes: 6 additions & 3 deletions docs/developer-guide/architecture.md
@@ -7,12 +7,12 @@ The same is true for each method provider (workflow systems), a [Method Adapter]

To understand how this is done, we can look at how data flows through the system.

![Disk API interactions](../figures/DISK-adapters.png "DISK API interactions")
![Disk API interactions](../figures/DISK-arq.png "DISK API interactions")

DISK provides abstract classes to implement both [method adapters](method-adapter) and [data adapters](data-adapter).
Check their respective pages for a detailed explanation of how to create a new adapter.

The current implementation of the [DISK](https://disk.isi.edu) system includes two method adapters: [WINGS Workflow System](https://www.wings-workflows.org) adapter and [AirFlow](https://airflow.apache.org) adapter (WIP); And one data adapter: the [Semantic Media Wiki adapter](#).
The current implementation of the [DISK](https://disk.isi.edu) system includes two method adapters, the [WINGS Workflow System](https://www.wings-workflows.org) adapter and the [AirFlow](https://airflow.apache.org) adapter (WIP), and two data adapters, the [Semantic Media Wiki](https://www.semantic-mediawiki.org/wiki/Semantic_MediaWiki) adapter and a generic Graph DB [SPARQL](https://www.w3.org/TR/sparql11-query/) adapter.

## Implemented adapters

@@ -28,4 +28,7 @@ The current implementation of the [DISK](https://disk.isi.edu) system includes t
### Semantic Media Wiki Adapter
- Provides a `SPARQL` endpoint to search for data.
- SMW can be configured to use the same ontologies as [DISK](https://disk.isi.edu).
- SMW can be used to add files and metadata.
- SMW can be used to add files and metadata.

### Generic GraphDB SPARQL Adapter
- Provides a `SPARQL` endpoint to search for data.
34 changes: 15 additions & 19 deletions docs/developer-guide/data-adapter.md
@@ -9,32 +9,28 @@ The data adapter must be able to perform at least the following operations:
- Obtain `SPARQL` results to use as options.
- Get information about files in the repository (hashes, dates, etc).


A simple overview of the abstract class is provided below; code examples are available in [our repository](https://github.com/KnowledgeCaptureAndDiscovery/DISK-API/blob/main/server/src/main/java/org/diskproject/server/adapters/GraphDBAdapter.java).

```java
public abstract class DataAdapter {
// Basic endpoint information, getters and setters omitted for simplicity
private String endpointUrl, name, username, password, prefix, namespace;

public DataAdapter (String URI, String name);
public DataAdapter (String URI, String name, String username, String password);

public String getEndpointUrl ();
public String getName ();
protected String getUsername ();
protected String getPassword ();

public void setPrefix (String prefix, String namespace);
public String getPrefix ();
public String getNamespace ();

public abstract List<DataResult> query (String queryString);

//This data query must return two variable names:

// Query data to endpoint, return as DataResult or as a csv file as byte[]
public abstract List<DataResult> query (String queryString) throws Exception;
public abstract byte[] queryCSV(String queryString) throws Exception;

// Query the available options for variable varName; the query returns a URI and, optionally, a label.
static public String VARURI = "uri";
static public String VARLABEL = "label";
public abstract List<DataResult> queryOptions (String varname, String constraintQuery);

// file -> hash
public abstract Map<String, String> getFileHashes (List<String> dsurls);
public abstract List<DataResult> queryOptions (String varName, String constraintQuery) throws Exception;

// Check that a LOI is correctly configured for this adapter
public abstract boolean validateLOI (LineOfInquiry loi, Map<String, String> values);
// Get the hash (or e-tag) of a list of files to check if they have changed
public abstract Map<String, String> getFileHashes (List<String> dsUrls) throws Exception;
public abstract Map<String, String> getFileHashesByETag(List<String> dsUrls) throws Exception;
}
```
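
As a rough illustration of the contract above, the following sketch implements a trimmed-down, in-memory adapter. Everything in it is an assumption made for the example: `Row` is a hypothetical stand-in for `DataResult`, `MiniDataAdapter` keeps only two of the operations and drops the `throws Exception` clauses, and the choice of SHA-256 for file hashes is illustrative. The actual abstract class and its implementations live in the DISK-API repository.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DataAdapterSketch {
    // Hypothetical stand-in for DataResult: one result row, variable name -> value.
    static class Row {
        private final Map<String, String> values = new LinkedHashMap<>();
        Row with(String variable, String value) { values.put(variable, value); return this; }
        String get(String variable) { return values.get(variable); }
    }

    // Trimmed-down contract (assumption): only two operations, no checked exceptions.
    abstract static class MiniDataAdapter {
        static final String VARURI = "uri";
        static final String VARLABEL = "label";
        abstract List<Row> queryOptions(String varName, String constraintQuery);
        abstract Map<String, String> getFileHashes(List<String> dsUrls);
    }

    // Keeps options and file contents in maps instead of talking to a SPARQL endpoint.
    static class InMemoryAdapter extends MiniDataAdapter {
        private final Map<String, String> options = new LinkedHashMap<>(); // uri -> label
        private final Map<String, String> files = new LinkedHashMap<>();   // url -> content

        InMemoryAdapter addOption(String uri, String label) { options.put(uri, label); return this; }
        InMemoryAdapter addFile(String url, String content) { files.put(url, content); return this; }

        @Override
        List<Row> queryOptions(String varName, String constraintQuery) {
            // A real adapter would query the endpoint; both arguments are ignored here.
            List<Row> rows = new ArrayList<>();
            for (Map.Entry<String, String> e : options.entrySet()) {
                rows.add(new Row().with(VARURI, e.getKey()).with(VARLABEL, e.getValue()));
            }
            return rows;
        }

        @Override
        Map<String, String> getFileHashes(List<String> dsUrls) {
            Map<String, String> hashes = new LinkedHashMap<>();
            for (String url : dsUrls) {
                String content = files.get(url);
                if (content != null) hashes.put(url, sha256Hex(content)); // unknown URLs are skipped
            }
            return hashes;
        }
    }

    // Hex-encoded SHA-256 digest of a UTF-8 string.
    static String sha256Hex(String text) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }

    public static void main(String[] args) {
        InMemoryAdapter adapter = new InMemoryAdapter()
            .addOption("http://example.org/gene/APOE", "APOE")
            .addFile("http://example.org/data/a.csv", "id,value\n1,42\n");
        for (Row row : adapter.queryOptions("gene", "")) {
            System.out.println(row.get(MiniDataAdapter.VARURI) + " (" + row.get(MiniDataAdapter.VARLABEL) + ")");
        }
        System.out.println(adapter.getFileHashes(
            List.of("http://example.org/data/a.csv", "http://example.org/missing.csv")));
    }
}
```

Comparing the map returned by `getFileHashes` with the hashes stored from a previous run is one way a caller could detect that a dataset has changed.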
38 changes: 27 additions & 11 deletions docs/developer-guide/method-adapter.md
@@ -1,23 +1,39 @@
# Method Adapter

A [DISK](https://disk.isi.edu) Method adapter is the implementation of the `MethodAdapter` abstract class (at the end).
This adapters are used to gain control of the workflow system from [DISK](https://disk.isi.edu).
A [DISK](https://disk.isi.edu) Method adapter is the implementation of the `MethodAdapter` abstract class (shown at the end).
These adapters are used to give [DISK](https://disk.isi.edu) control of the workflow system.

The method adapter must be able to perform at least the following operations:

- Get a list of methods
- Get details of parameters
- Send a workflow execution
- Monitor workflows
- Monitor a workflow's execution

Code examples available in [our repository](https://github.com/KnowledgeCaptureAndDiscovery/DISK-API/blob/main/server/src/main/java/org/diskproject/server/adapters/AirFlowAdapter.java).


```java
public abstract class MethodAdapter {
public MethodAdapter () {}

public List<Method> ListMethods ();
public Method GetMethodInfo (String methodid);
public boolean RunMethod (Method method);

// Check that a LOI is correctly configured for this adapter
public abstract boolean validateLOI (LineOfInquiry loi, Map<String, String> values);
// Basic endpoint information, getters and setters omitted for simplicity
private String name, id, endpointUrl, username, password, description;
private Float version;

public MethodAdapter (String adapterName, String url);
public MethodAdapter (String adapterName, String url, String username, String password);

// Get workflows and input parameters
public abstract List<WorkflowTemplate> getWorkflowList();
public abstract List<WorkflowVariable> getWorkflowVariables(String id);
// Checks whether the listed input files are available to the workflow manager.
public abstract List<String> areFilesAvailable (Set<String> fileList, String dType);
// Uploads/register data into the workflow manager
public abstract String addData (String url, String name, String dType) throws Exception;
// Runs a workflow
public abstract List<String> runWorkflow (String wfId, List<VariableBinding> vBindings, Map<String, WorkflowVariable> inputVariables);
// Monitor workflow
public abstract Execution getRunStatus (String runId);
// Download an output file
public abstract FileAndMeta fetchData (String dataId);
}
```
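
The "send a workflow execution" and "monitor" operations are typically used together: a caller starts a run and then polls its status until it finishes. The sketch below shows that pattern under stated assumptions. `MiniMethodAdapter`, `FakeAdapter`, the `Status` enum, and `waitForRun` are hypothetical stand-ins with simplified signatures; they are not the DISK `MethodAdapter`, `Execution`, or `VariableBinding` types.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MethodAdapterSketch {
    // Hypothetical run states; the real Execution type may carry more detail.
    enum Status { QUEUED, RUNNING, SUCCESSFUL, FAILED }

    // Trimmed-down stand-in for the run and monitor operations (assumption).
    interface MiniMethodAdapter {
        List<String> runWorkflow(String workflowId, Map<String, String> bindings);
        Status getRunStatus(String runId);
    }

    // Fake workflow system: each run reports RUNNING twice, then SUCCESSFUL.
    static class FakeAdapter implements MiniMethodAdapter {
        private final Map<String, Integer> polls = new HashMap<>();
        private int nextId = 1;

        @Override
        public List<String> runWorkflow(String workflowId, Map<String, String> bindings) {
            String runId = workflowId + "-run-" + nextId++;
            polls.put(runId, 0);
            return List.of(runId);
        }

        @Override
        public Status getRunStatus(String runId) {
            Integer seen = polls.get(runId);
            if (seen == null) return Status.FAILED; // unknown run id
            polls.put(runId, seen + 1);
            return seen < 2 ? Status.RUNNING : Status.SUCCESSFUL;
        }
    }

    // Polls until the run reaches a final state or maxPolls is exhausted.
    // A real client would sleep between polls; that is omitted to keep the sketch deterministic.
    static Status waitForRun(MiniMethodAdapter adapter, String runId, int maxPolls) {
        Status status = Status.QUEUED;
        for (int i = 0; i < maxPolls; i++) {
            status = adapter.getRunStatus(runId);
            if (status == Status.SUCCESSFUL || status == Status.FAILED) return status;
        }
        return status; // last status seen; the run has not finished yet
    }

    public static void main(String[] args) {
        FakeAdapter adapter = new FakeAdapter();
        String runId = adapter.runWorkflow("wf1", Map.of("input", "a.csv")).get(0);
        System.out.println(runId + " -> " + waitForRun(adapter, runId, 10));
    }
}
```

Bounding the number of polls keeps a stuck run from blocking the caller indefinitely; the caller can retry later with the same run id.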
Binary file added docs/figures/DISK-arq.png
Binary file modified docs/figures/DISK-overview.png
2 changes: 1 addition & 1 deletion docs/index.md
@@ -41,7 +41,7 @@ User-defined hypotheses are re-run when new data or methods become available.

### Overview of DISK Architecture

![Disk API interactions](figures/DISK-adapters.png "DISK API interactions")
![Disk API interactions](figures/DISK-arq.png "DISK API interactions")

For more information about the [DISK](https://disk.isi.edu) architecture, please check the [architecture](developer-guide/architecture/) page.
