Workflow Runner API
✅ Status
The first implementation of this API interfaces with the Taverna Server and is under development. A deployment of the latest snapshot is available at http://sandbox.wf4ever-project.org/runner/default/ which accesses http://sandbox.wf4ever-project.org/taverna-server/
Research Objects, for the purpose of Wf4Ever, will generally contain workflows. In order to assess if a workflow is functional, it is generally useful to be able to (re)-execute a workflow.
Different workflow systems have different ways of running a workflow. For instance, Taverna has the Taverna Server, while Wings has a portal and a Pegasus/Condor engine in the backend. This API intends to provide a common lightweight interface within Wf4Ever for features such as "Run this workflow please" and "Show me the data from that workflow run".
At its heart, this API mirrors the RODL API, but the ROs exposed by this service each represent a particular workflow run, structured to show inputs, outputs, console logs, provenance and annotations containing wfprov and wfdesc mappings. The intent is that existing RODL-compatible tools can be used with this service, for instance adding from the RO command line tool, browsing with the Portal or transforming to wfdesc using the Workflow Transformer service.
Accessing the root of the service, in this specification exemplified as http://example.com/runner, SHOULD redirect to a default server runs resource. From here the client may either:
- POST a new workflow run, providing as a minimum the workflow definition
- GET a list of existing workflow runs
- DELETE existing workflow runs
A client may also create a new run by uploading a workflow definition, providing inputs and then initiating the workflow run.
See the Resources and formats below for details.
Resources are located using specific properties in the RO manifest for the workflow run.
Property | Description |
---|---|
runner:workflow | Used in the workflow run description to link the workflow run with the main workflow to run, such as uploaded on RO creation. It is a subproperty of ore:aggregates. |
runner:inputs | Used in the workflow run description to link the workflow run with the list of required workflow inputs, if any. It is a subproperty of ore:aggregates. |
runner:outputs | Used in the workflow run description to link the workflow run with the list of (expected or actual) workflow outputs, if any. It is a subproperty of ore:aggregates. |
runner:logs | Used in the workflow run description to link the workflow run with the list of logs, such as stdout, if any. It is a subproperty of ore:aggregates. |
runner:provenance | Used in the workflow run description to link the workflow run with the list of provenance related resources, if any. It is a subproperty of ore:aggregates. |
runner:workingDirectory | Used in the workflow run description to link the workflow run with the list of working directory and its files, if any. It is a subproperty of ore:aggregates. |
All formats are based on RDF in text/turtle and application/rdf+xml (by content negotiation) unless noted otherwise.
The resource types are listed below. Specifically, a compliant implementation of the Workflow runner API SHOULD support:
- Finding default workspace to redirect to the RO workspace of the default server
- Retrieve runs in workspace to see current runs
- Submit new run to workspace to create a new run
- Retrieve run to view a run and its resources
- Retrieving the workflow status to check the current status
- Changing the workflow status to initiate running of the workflow
- Retrieving the outputs when the workflow has status http://purl.org/wf4ever/runner#Finished
Resource type | Description |
---|---|
Workspace | Represents a list of workflow runs, similarly to how an RO service specified a list of research objects. The only format available is text/uri-list, which returns a list of URIs that SHOULD point to research objects representing workflow runs. |
Workflow run | A workflow run is represented as a research object and as such it shares the format of the research object as defined in the RO API. The preferred format is RDF; the support for ZIP and HTML formats is optional. The RDF format may be subject to content negotiation. |
Workflow | The workflow as posted by the creator. It may be a workflow description as an RDF file (format subject to content negotiation) or the actual workflow file, such as application/vnd.taverna.t2flow+xml in case of a Taverna 2 workflow. |
Workflow status | A one-element list of URIs, in which the URI is one of predefined values indicating the status of the workflow run. The format is text/uri-list. |
Inputs | Any resource that has been submitted as an input to the workflow run. When submitting an input, it is possible to specify an external reference by using the text/uri-list format. |
Outputs | Any outputs generated by the workflow run. Special formats can be used to indicate an error in generating the specific output, such as application/vnd.wf4ever.runner.error. |
Provenance | An ro:Folder aggregating provenance resources. |
Working Directory | An ro:Folder whose content will be (or was) the current directory (./) when running the workflow. |
Logs | An ro:Folder aggregating the log files. |
HEAD or GET on this entry point SHOULD redirect to a workspace of workflow runs on the default server:

    C: HEAD http://example.com/runner HTTP/1.1
    C: Accept: text/turtle

    S: HTTP/1.1 303 See Other
    S: Location: http://example.com/runner/default/

The returned location MUST point to a workspace (see [#Retrieve] below).
The service MAY return 405 Method Not Allowed if it has no default server, in which case it MUST support browsing of explicit servers (see below).
The service MAY support browsing workflow servers other than the default, by POSTing a text/uri-list specifying the service:

    C: POST http://example.com/runner HTTP/1.1
    C: Content-Type: text/uri-list
    C:
    C: http://galaxy.example.net/server/

    S: HTTP/1.1 303 See Other
    S: Location: http://example.com/runner,galaxy=/server/

The returned location MUST point to a workspace of workflow runs.

The service SHOULD return 400 Bad Request if more than one URI was included, or the URI was malformed.
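The validation rule above could be sketched on the server side roughly as follows. This is only an illustration: the function name `check_server_selection` is hypothetical, and restricting the accepted schemes to `http` and `https` is an assumption that the specification does not make.

```python
from urllib.parse import urlsplit


def check_server_selection(body):
    """Return (HTTP status, selected URI or None) for a POSTed text/uri-list."""
    # Ignore blank lines and '#' comment lines of the uri-list format.
    uris = [line.strip() for line in body.splitlines()
            if line.strip() and not line.startswith("#")]
    if len(uris) != 1:
        return 400, None  # none or more than one URI
    uri = uris[0]
    try:
        parts = urlsplit(uri)
    except ValueError:
        return 400, None  # the parser rejected the URI outright
    # Assumption: only absolute http(s) URIs without raw spaces are accepted.
    if parts.scheme not in ("http", "https") or not parts.netloc or " " in uri:
        return 400, None
    return 303, uri
```

A real service would then map the accepted URI to a workspace location of its own choosing before redirecting.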
✅ This specification does not require any particular URI templates for the redirection. It is an implementation detail how the Workflow Runner service relates the request to the actual, underlying workflow execution service.
⚠️ Clients MUST ensure that the submitted URI is encoded according to RFC 3986, for instance http://example.net/fred%20and%20me/ rather than http://example.net/fred and me/.
Servers MAY use the submitted URI as a basis for constructing the returned URL, but MUST then ensure that it is likewise properly escaped.
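As an illustration of the escaping requirement, a client could percent-encode a URI before submitting it along these lines. The helper name `encode_uri` is hypothetical, and the choice of characters left unescaped is one reasonable reading of RFC 3986 rather than something this specification prescribes.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched; '%' is included so that input which is
# already escaped is not encoded a second time.
_SAFE = "/?%:@!$&'()*+,;="


def encode_uri(raw):
    """Percent-encode the path, query and fragment of a URI."""
    parts = urlsplit(raw)
    path = quote(parts.path, safe=_SAFE.replace("?", ""))
    query = quote(parts.query, safe=_SAFE)
    fragment = quote(parts.fragment, safe=_SAFE)
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))
```

Applied to the example above, `encode_uri("http://example.net/fred and me/")` yields `http://example.net/fred%20and%20me/`, and an already encoded URI passes through unchanged.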
The list of server runs is represented as a RODL workspace, where each RO represents a run.

    C: GET http://example.com/runner/default/ HTTP/1.1
    C: Accept: text/uri-list
    C: Authorization: Bearer h480djs93hd8

    S: HTTP/1.1 200 OK
    S: Content-Type: text/uri-list
    S:
    S: http://example.com/runner/default/1/
    S: http://example.com/runner/default/2/
    S: http://example.com/runner/default/4/

Each URI returned, if any, SHOULD point to a research object representing a workflow run.
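A client needs to turn the text/uri-list body into a list of run URIs. A minimal sketch, assuming the usual uri-list conventions (one URI per line, lines starting with `#` are comments); the function name `parse_uri_list` is hypothetical:

```python
from typing import List


def parse_uri_list(body: str) -> List[str]:
    """Return the URIs of a text/uri-list body, skipping comments and blanks."""
    uris = []
    for line in body.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        uris.append(line)
    return uris
```

An empty workspace simply yields an empty list.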
Creating a new run is similar to creating a new research object, but requires the content-type text/uri-list to include the URL for the workflow definition to run.

    C: POST http://example.com/runner/default/ HTTP/1.1
    C: Content-Type: text/uri-list
    C: Content-Length: ...
    C: Slug: 1337
    C: Authorization: Bearer h480djs93hd8
    C:
    C: http://example.net/workflow.t2flow

    S: HTTP/1.1 201 Created
    S: Location: http://example.com/runner/default/1337/

The returned location refers to a research object representing the run.
The client MAY provide the Slug: header to suggest a name to include in the created run, which the service MAY support. The service SHOULD ensure the returned run URI is unique, even if multiple POSTs submit the same workflow URL.
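The request for creating a run could be assembled with the Python standard library as sketched below. Nothing is sent here; the function name `build_new_run_request` and its parameters are hypothetical, and bearer-token authorization is only one possibility, taken from the example transcripts.

```python
import urllib.request


def build_new_run_request(workspace_url, workflow_url, slug=None, token=None):
    """Build (but do not send) the POST that creates a new workflow run."""
    body = (workflow_url + "\r\n").encode("utf-8")
    request = urllib.request.Request(workspace_url, data=body, method="POST")
    request.add_header("Content-Type", "text/uri-list")
    if slug is not None:
        # Only a suggestion; the service may ignore it.
        request.add_header("Slug", slug)
    if token is not None:
        request.add_header("Authorization", "Bearer " + token)
    return request
```

Passing the result to `urllib.request.urlopen` would perform the call; on success the `Location` header of the 201 response identifies the new run.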
The service SHOULD attempt to retrieve the provided workflow definition before responding to the request.
The service SHOULD NOT start running the workflow immediately, but wait for the client to modify its status. (See below).
The service SHOULD fail with 502 Bad Gateway if it is unable to retrieve the submitted workflow definition due to network issues or HTTP errors (including 404), or 504 Gateway Timeout if the request for the definition timed out. The service SHOULD include an error message in the response body to indicate the nature of this failure.
The service SHOULD fail with 501 Not Implemented if the service did successfully retrieve the workflow definition, but the underlying workflow server does not support its format. The server MAY include an error message in the response body to indicate supported workflow definition formats and/or media types.
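The error handling above amounts to a small decision table. One way a service might encode it is sketched here; the `FetchOutcome` enum and the `response_status` function are hypothetical names introduced only for this illustration.

```python
from enum import Enum


class FetchOutcome(Enum):
    OK = "ok"                        # definition retrieved
    NETWORK_ERROR = "network_error"  # connection refused, DNS failure, ...
    HTTP_ERROR = "http_error"        # upstream answered with an error such as 404
    TIMEOUT = "timeout"              # upstream did not answer in time


def response_status(outcome, format_supported):
    """Choose the HTTP status for a 'create run' request."""
    if outcome is FetchOutcome.TIMEOUT:
        return 504  # Gateway Timeout
    if outcome in (FetchOutcome.NETWORK_ERROR, FetchOutcome.HTTP_ERROR):
        return 502  # Bad Gateway
    if not format_supported:
        return 501  # Not Implemented: workflow format not supported
    return 201      # Created
```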
A workflow run is represented as a research object, thus retrieving it will redirect to a manifest listing its constituent resources.

    C: GET http://example.com/runner/default/1337/ HTTP/1.1
    C: Accept: text/turtle

    S: HTTP/1.1 303 See Other
    S: Location: http://example.com/runner/default/1337/manifest

    C: GET http://example.com/runner/default/1337/manifest HTTP/1.1
    C: Accept: text/turtle

    S: HTTP/1.1 200 OK
    S: Content-Type: text/turtle
    S:
    S: @base <http://example.com/runner/default/1337/> .
    S: @prefix ro: <http://purl.org/wf4ever/ro#> .
    S: @prefix ore: <http://www.openarchives.org/ore/terms/> .
    S: # ..
    S: <http://example.com/runner/default/1337/> a ro:ResearchObject ;
    S:     ore:aggregates <workflow>, <status> .
    S: # ...

The manifest MUST include Workflow Runner specific extensions to indicate the corresponding Workflow Runner API specific resources that are supported by the service. These are declared in the namespace http://purl.org/wf4ever/runner# (prefix runner: below) and associated with the research object, which MUST be of the type runner:WorkflowRun.

    @prefix runner: <http://purl.org/wf4ever/runner#> .
    @prefix wfdesc: <http://purl.org/wf4ever/wfdesc#> .
    @prefix wf4ever: <http://purl.org/wf4ever/wf4ever#> .
    @base <http://example.com/runner/default/1337/> .
    # ..
    <> a runner:WorkflowRun, ro:ResearchObject, wf4ever:WorkflowResearchObject ;
        ore:aggregates <workflow>, <status>, <inputs>, <outputs>, <logs> ;
        runner:workflow <workflow> ;
        runner:status <status> ;
        runner:inputs <inputs> ;
        runner:outputs <outputs> ;
        runner:logs <logs> .
    <workflow> a runner:Workflow, ro:Resource .
    <status> a runner:Status, ro:Resource .
    <inputs> a runner:Inputs, ro:Folder ;
        ore:isDescribedBy <inputs/> .
    <outputs> a runner:Outputs, ro:Folder ;
        ore:isDescribedBy <outputs/> .
    <logs> a runner:Logs, ro:Folder ;
        ore:isDescribedBy <logs/> .
    # ... proxies, annotations et al.

Supported properties and types:
Property | Type | Superclass | Description |
---|---|---|---|
None | runner:WorkflowRun | wf4ever:WorkflowResearchObject | A research object that represents a particular workflow run |
runner:workflow | runner:Workflow | wfdesc:Workflow | (Required) The main workflow to run, such as uploaded on RO creation |
runner:status | runner:Status | ro:Resource | (Required) The status of the workflow, such as 'Running' or 'Finished' |
runner:inputs | runner:Inputs | ro:Folder | List of required workflow inputs, if any |
runner:outputs | runner:Outputs | ro:Folder | List of (expected or actual) workflow outputs, if any |
runner:logs | runner:Logs | ro:Folder | List of logs, such as stdout, if any |
runner:provenance | runner:Provenance | ro:Folder | List of provenance related resources, if any |
runner:workingDirectory | runner:WorkingDirectory | ro:Folder | List of working directory and its files, if any |
All of the properties above are subproperties of ore:aggregates and have domain runner:WorkflowRun.
See resources below for details of each type.
Retrieving the resource indicated with runner:workflow in the manifest SHOULD return the workflow definition originally posted.

    C: GET http://example.com/runner/default/1337/workflow HTTP/1.1

    S: HTTP/1.1 200 OK
    S: Content-Type: application/vnd.taverna.t2flow+xml
    S:
    S: <?xml version="1.0"?> ...

The service MAY return the workflow definition directly (as in the example above), or MAY redirect with 303 See Other to the URI originally submitted when creating the RO.
The service MAY support replacing the workflow definition with PUT, but this is not covered by this specification, as it has ramifications for the other resources of the research object.
The manifest SHOULD include an annotation on the native runner:Workflow to provide a wfdesc:Workflow description of the workflow structure:

    @prefix ao: <http://purl.org/ao/> .
    <> a runner:WorkflowRun, ro:ResearchObject, wf4ever:WorkflowResearchObject ;
        ore:aggregates <workflow>, <status>, :wfdesc, <workflow.wfdesc> ;
        runner:workflow <workflow> .
    <workflow> a runner:Workflow, ro:Resource .
    <workflow.wfdesc> a ro:Resource .
    :wfdesc a ao:Annotation ;
        ao:topic wfdesc:Workflow ;
        ao:annotatesResource <workflow> ;
        ao:body <workflow.wfdesc> .
✅ The Research Object model intends to move from using AO to the unified Open Annotation Model, where the above annotation is better rendered as:

    @prefix oa: <http://www.w3.org/ns/openannotation/core/> .
    @prefix oax: <http://www.w3.org/ns/openannotation/extension/> .
    :wfdesc a oa:Annotation ;
        oax:hasSemanticTag wfdesc:Workflow ;
        oa:hasBody <workflow.wfdesc> ;
        oa:hasTarget <workflow> .

Retrieving the wfdesc:

    C: GET http://example.com/runner/default/1337/workflow.wfdesc HTTP/1.1
    C: Accept: text/turtle

    S: HTTP/1.1 200 OK
    S: Content-Type: text/turtle
    S:
    S: @base <http://ns.taverna.org.uk/2010/workflowBundle/8781d5f4-d0ba-48a8-a1d1-14281bd8a917/workflow/Hello_World/> .
    S: @prefix wfdesc: <http://purl.org/wf4ever/wfdesc#> .
    S: @prefix wf4ever: <http://purl.org/wf4ever/wf4ever#> .
    S: @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    S: <> a wfdesc:Workflow , wfdesc:Description , wfdesc:Process ;
    S:     rdfs:label "Hello_World" ;
    S:     wfdesc:hasOutput <out/greeting> ;
    S:     wfdesc:hasSubProcess <processor/hello/> ;
    S:     wfdesc:hasDataLink <datalink?from=processor/hello/out/value&to=out/greeting> .
    S: (..)

As the service might need to use the Wf-RO transformation service to create the wfdesc, this resource might not be available within a reasonable amount of time. The service SHOULD in this case respond with 504 Gateway Timeout. The client MAY then try retrieving the resource again after a small delay.
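The retry behaviour suggested above can be sketched on the client side as a small loop. The helper `fetch_with_retry` is hypothetical: it takes any zero-argument callable returning a `(status, body)` pair, so the HTTP library is left open, and the attempt count and delay are arbitrary assumptions.

```python
import time


def fetch_with_retry(fetch, attempts=5, delay=2.0, sleep=time.sleep):
    """Call fetch() until it returns something other than 504.

    Returns the last (status, body) pair, which is still a 504 if every
    attempt timed out.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        status, body = fetch()
        if status != 504:
            return status, body
        if attempt < attempts - 1:
            sleep(delay)  # wait a little before asking again
    return status, body
```

The `sleep` parameter exists only so that the delay can be replaced in tests.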
Retrieving the resource indicated with runner:status in the manifest MUST return the current status of the workflow run.

    C: GET http://example.com/runner/default/1337/status HTTP/1.1
    C: Accept: text/uri-list

    S: HTTP/1.1 200 OK
    S: Content-Type: text/uri-list
    S:
    S: http://purl.org/wf4ever/runner#Initialized

The returned URI list MUST include one and only one of these URIs:
URI | Label | Description |
---|---|---|
http://purl.org/wf4ever/runner#Initialized | Initialized | The research object has been created (the RO is considered an roevo:LiveRO) |
http://purl.org/wf4ever/runner#Ready | Ready | All required inputs and resources are provided, the workflow is ready to run (i.e. the RO is a wfdesc:WorkflowInstance) |
http://purl.org/wf4ever/runner#Queued | Queued | The workflow is in the queue, waiting to be run by the underlying workflow server |
http://purl.org/wf4ever/runner#Running | Running | The workflow is actively running on the workflow server |
http://purl.org/wf4ever/runner#Failed | Failed | The workflow could not run, or failed while running |
http://purl.org/wf4ever/runner#Finished | Finished | The workflow completed running |
http://purl.org/wf4ever/runner#Cancelled | Cancelled | The workflow run was cancelled, for instance by the client or by a server time out |
http://purl.org/wf4ever/runner#Archived | Archived | The workflow runner service has finished post-run processing (the RO is now considered an roevo:ArchivedRO) |
The service might not support all of the above status types, but MUST support Initialized, Running and Archived.
The service MAY do its own state transitions, like Initialized to Ready or Finished to Archived, but SHOULD NOT start the workflow as Running unless the client has requested Queued or Running (see below).
The state Archived means that the Research Object has a complete view of the workflow run. Until the workflow run is in this state, requests for resources such as outputs, the manifest and provenance MAY give incomplete results or 404 Not Found. The workflow service SHOULD automatically transition from Finished to Archived, but SHOULD NOT do this transition from failure states such as Failed or Cancelled.
⚠️ Should Archived be an additional state, to keep the final state of Finished, Cancelled or Failed?
The client can request a desired state transition by PUT-ing to the status resource:

    C: PUT http://example.com/runner/default/1337/status HTTP/1.1
    C: Content-Type: text/uri-list
    C:
    C: http://purl.org/wf4ever/runner#Running

    S: HTTP/1.1 200 OK
    S: Content-Type: text/uri-list
    S:
    S: http://purl.org/wf4ever/runner#Queued

The client MUST include one and only one of the above listed Workflow Runner statuses, but MAY also include third-party statuses.
The service SHOULD ignore third-party statuses that it does not support. The service MAY return an error if it understands a third-party status but refuses to fulfil the request.
The service MUST return the current status, which MAY be different from the requested status (as in the above example).
The service MAY respond with 202 Accepted if transitioning to the new state is not immediate, for instance if it takes a while to cancel a workflow run. (Note however that the state Queued is intended for the state transition to Running.)
If the state transition is not valid according to the current state, like from Failed to Finished, the service MUST fail with 409 Conflict.
If the client requests a change to a state that is not supported by the service, like Cancelled, the service SHOULD fail with 501 Not Implemented. Alternatively, the service MAY change the status to a similar status (like Finished instead of Cancelled) and return 200 OK.
The service MUST support the following state transitions:
Current status | Client requests | Possible returns |
---|---|---|
Initialized | Ready | Initialized (it was not ready), Ready, 501 Not Implemented (service can't check readiness without running) |
Initialized | Running | Initialized (it was not ready), Queued, Running, Failed, Finished, Cancelled, Archived |
Ready | Running | Queued, Running, Failed, Finished, Cancelled, Archived |
Finished | Archived | Archived, 202 Accepted |
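The table of required transitions lends itself to a lookup structure. The sketch below shows one possible encoding that a client or a test suite might use; the names `REQUIRED_TRANSITIONS`, `label_of`, `is_required_transition` and `is_listed_return` are hypothetical, and HTTP-level outcomes from the table (501 Not Implemented, 202 Accepted) are deliberately left out.

```python
RUNNER_NS = "http://purl.org/wf4ever/runner#"

# (current status, requested status) -> status labels listed as possible returns.
REQUIRED_TRANSITIONS = {
    ("Initialized", "Ready"): {"Initialized", "Ready"},
    ("Initialized", "Running"): {"Initialized", "Queued", "Running", "Failed",
                                 "Finished", "Cancelled", "Archived"},
    ("Ready", "Running"): {"Queued", "Running", "Failed", "Finished",
                           "Cancelled", "Archived"},
    ("Finished", "Archived"): {"Archived"},
}


def label_of(status_uri):
    """Strip the runner namespace, e.g. '...runner#Queued' -> 'Queued'."""
    if not status_uri.startswith(RUNNER_NS):
        raise ValueError("not a Workflow Runner status: " + status_uri)
    return status_uri[len(RUNNER_NS):]


def is_required_transition(current_uri, requested_uri):
    """True if every compliant service must support this transition."""
    return (label_of(current_uri), label_of(requested_uri)) in REQUIRED_TRANSITIONS


def is_listed_return(current_uri, requested_uri, returned_uri):
    """True if the returned status is among those the table lists."""
    key = (label_of(current_uri), label_of(requested_uri))
    return label_of(returned_uri) in REQUIRED_TRANSITIONS.get(key, set())
```

Transitions that are absent from the table are not necessarily invalid; a service may support more than the required set.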
Retrieving the folder identified using runner:inputs in the manifest:

    C: GET http://example.com/runner/default/1337/inputs HTTP/1.1
    C: Accept: text/turtle

    S: HTTP/1.1 303 See Other
    S: Location: http://example.com/runner/default/1337/inputs/

    C: GET http://example.com/runner/default/1337/inputs/ HTTP/1.1
    C: Accept: text/turtle

    S: HTTP/1.1 200 OK
    S: Content-Type: text/turtle
    S:
    S: @prefix ro: <http://purl.org/wf4ever/ro#> .
    S: # ..
    S: <> a ro:Folder, runner:Inputs ;
    S:     ore:aggregates <in1>, <in2> .
    S: :in1 a ro:FolderEntry ;
    S:     ore:proxyIn <> ;
    S:     ore:proxyFor <in1> ;
    S:     ro:entryName "in1" .
    S: :in2 a ro:FolderEntry ;
    S:     ore:proxyIn <> ;
    S:     ore:proxyFor <in2> ;
    S:     ro:entryName "in2" .
    S: <in1> a ro:Resource .
    S: <in2> a ro:Folder ;
    S:     ore:isDescribedBy <in2/> .
⚠️ Expected Inputs
Note that the expected inputs listed might not yet exist, so a GET on <in1> above would then give a 404 until it has been uploaded with PUT.
The service MAY provide a list of expected inputs (such as in the example above). An attempt by the client to retrieve these before PUTing them SHOULD give a 404 error unless they contain a default from the workflow definition.
The service MAY expect nested inputs (i.e. a list of values, or list of lists of values, etc). Such inputs are indicated by being a ro:Folder rather than ro:Resource.
The client SHOULD provide inputs for all input resources:

    C: PUT http://example.com/runner/default/1337/inputs/in1
    C: Content-Type: text/plain
    C:
    C: A textual value

    S: HTTP/1.1 204 No Content

The service MAY respond with 415 Unsupported Media Type if the content type is not supported, for instance because it requires an input to be a URI or a file, as shown below.
The client MAY attempt to change the state to Ready to see if the inputs provided are sufficient to run the workflow.
The client MAY provide input to be retrieved from a URI:

    C: PUT http://example.com/runner/default/1337/inputs/in1
    C: Content-Type: text/uri-list
    C:
    C: http://example.org/external.txt

    S: HTTP/1.1 204 No Content

The service SHOULD respond with a 415 Unsupported Media Type if it does not support input from a URI (that is, the URL would be interpreted as a literal by the workflow system).
The service SHOULD respond with a 400 Bad Request if the URI given is not valid or not supported, for instance ftp://example.com/file.txt.
The client MAY provide input to be retrieved from a file uploaded to the working directory (see below):

    C: PUT http://example.com/runner/default/1337/inputs/in1
    C: Content-Type: text/uri-list
    C:
    C: http://example.com/runner/default/1337/workingDirectory/uploaded.txt

    S: HTTP/1.1 204 No Content

The service MAY in this case recognize the prefix for the working directory as given by runner:workingDirectory in the manifest, and replace the URL with the relative file path uploaded.txt when running the workflow.
The client is not required to have already uploaded the file, for instance this file could be written by the workflow itself or by the client at a later stage. However the service MAY in this case refuse to run the workflow if it does not support this feature.
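The prefix replacement described above could be sketched as follows. The function name `relative_working_path` is hypothetical, and rejecting `..` segments is a defensive assumption added here; the specification itself does not mention it.

```python
from urllib.parse import unquote


def relative_working_path(input_uri, working_dir_uri):
    """Map a URI below the working directory to a relative file path.

    Returns None if the URI does not point into the working directory.
    """
    prefix = working_dir_uri if working_dir_uri.endswith("/") else working_dir_uri + "/"
    if not input_uri.startswith(prefix):
        return None
    relative = unquote(input_uri[len(prefix):])
    # Reject empty paths and attempts to escape the working directory.
    if not relative or relative.startswith("/") or ".." in relative.split("/"):
        return None
    return relative
```

With the example above, the working directory `http://example.com/runner/default/1337/workingDirectory` and the submitted input URI give the relative path `uploaded.txt`.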
Outputs are shown as a folder structure, similar to inputs, by following the runner:outputs link in the manifest.

    C: GET http://example.com/runner/default/1337/outputs HTTP/1.1
    C: Accept: text/turtle

    S: HTTP/1.1 303 See Other
    S: Location: http://example.com/runner/default/1337/outputs/

    C: GET http://example.com/runner/default/1337/outputs/ HTTP/1.1
    C: Accept: text/turtle

    S: HTTP/1.1 200 OK
    S: Content-Type: text/turtle
    S:
    S: @prefix ro: <http://purl.org/wf4ever/ro#> .
    S: # ..
    S: <> a ro:Folder, runner:Outputs ;
    S:     ore:aggregates <out1>, <out2> .
    S: :out1 a ro:FolderEntry ;
    S:     ore:proxyIn <> ;
    S:     ore:proxyFor <out1> ;
    S:     ro:entryName "out1" .
    S: :out2 a ro:FolderEntry ;
    S:     ore:proxyIn <> ;
    S:     ore:proxyFor <out2> ;
    S:     ro:entryName "out2" .
    S: <out1> a ro:Resource .
    S: <out2> a ro:Folder ;
    S:     ore:isDescribedBy <out2/> .

The service MAY show expected output resources before the workflow has been in state Running, but attempting to resolve any of the resources at that stage SHOULD give a 404 Not Found.
If the service does not support expected outputs, it SHOULD give a 404 Not Found on attempt to resolve the runner:Outputs folder, as indicating an empty folder would wrongly suggest that the workflow is predicted to have no outputs.
As for inputs, outputs MAY be nested. The extent of the nesting might not be known at the Initialized state, so an output previously indicated as a ro:Resource might be a ro:Folder at the time the workflow is Finished or Archived.
The service MAY expose outputs before the workflow has reached the Finished state, for instance if the workflow engine provides partial outputs before completion, or some outputs were produced even though the workflow was Cancelled.
The client can retrieve outputs by following the links:

    C: GET http://example.com/runner/default/1337/outputs/out1 HTTP/1.1

    S: HTTP/1.1 200 OK
    S: Content-Type: text/plain
    S:
    S: The result is here

If the service does not know the correct content type of the output, it SHOULD fall back to text/plain; charset="utf-8" or application/octet-stream accordingly.
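One possible reading of the fallback rule is to report textual content when the bytes decode as UTF-8 and binary content otherwise. The sketch below follows that interpretation, which is an assumption, and the function name `fallback_content_type` is hypothetical.

```python
def fallback_content_type(payload: bytes) -> str:
    """Pick a fallback media type for an output of unknown type."""
    try:
        payload.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, so treat the output as opaque binary data.
        return "application/octet-stream"
    return 'text/plain; charset="utf-8"'
```

For large outputs a service would probably inspect only a leading chunk of the data rather than decoding everything.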
Some workflow systems can indicate a (partial) error on a particular output. For instance, out1 might be produced fine, while out2 contains an error rather than a value. It is currently out of scope of this specification how to indicate such errors to clients of the Workflow Runner, but it is recommended to use a custom media type in the response, like application/vnd.wf4ever.runner.error, rather than an HTTP error.
Retrieving a nested output yields another folder:

    C: GET http://example.com/runner/default/1337/outputs/out2 HTTP/1.1
    C: Accept: text/turtle

    S: HTTP/1.1 303 See Other
    S: Location: http://example.com/runner/default/1337/outputs/out2/

    C: GET http://example.com/runner/default/1337/outputs/out2/ HTTP/1.1
    C: Accept: text/turtle

    S: HTTP/1.1 200 OK
    S: Content-Type: text/turtle
    S:
    S: @prefix ro: <http://purl.org/wf4ever/ro#> .
    S: # ..
    S: <> a ro:Folder ;
    S:     ore:aggregates <1>, <2>, <3> .
    S: :1 a ro:FolderEntry ;
    S:     ore:proxyIn <> ;
    S:     ore:proxyFor <1> ;
    S:     ro:entryName "1" .
    S: :2 a ro:FolderEntry ;
    S:     ore:proxyIn <> ;
    S:     ore:proxyFor <2> ;
    S:     ro:entryName "2" .
    S: :3 a ro:FolderEntry ;
    S:     ore:proxyIn <> ;
    S:     ore:proxyFor <3> ;
    S:     ro:entryName "3" .
    S: <1> a ro:Resource .
    S: <2> a ro:Resource .
    S: <3> a ro:Resource .
⚠️ Name of nested outputs
This specification does not put any requirements on the file names of nested output entry names (beyond them being unique within the folder). Server implementations might however have particular naming schemes such as increasing integers with gaps, including gaps for missing values.
%% To be done (also a folder - but with annotation to wfprov)
%% To be done, folder
%% To be done, folder - some standards for stdout/stderr
The service SHOULD include appropriate cache control/expiry headers when such are available. For instance, if a workflow is Running and it is not possible to change the inputs after this state, then the Inputs resources can be given a long cache lifetime.
Some resources are transient in their nature, such as the Status. The service SHOULD provide cache headers for the status where appropriate; for instance, if it only checks the underlying server status every 5 seconds, then the status resource should have a similar expiry time set.
When the research object is in status Archived, then the cache headers SHOULD show a long expiration time for all resources.
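The caching guidance could be turned into a header policy roughly like the one below. All concrete values are assumptions made for illustration: the 30-day "long" lifetime, the resource names `status` and `inputs`, the `no-cache` default and the function name `cache_control` are not taken from the specification.

```python
LONG_LIVED = 30 * 24 * 3600  # assumed "long" lifetime: 30 days, in seconds


def cache_control(run_status, resource, poll_interval=5, inputs_frozen=True):
    """Suggest a Cache-Control header value for a resource of a run."""
    if run_status == "Archived":
        # Archived runs are complete, so everything can be cached for long.
        return "max-age=%d" % LONG_LIVED
    if resource == "status":
        # No point in clients polling faster than the service itself does.
        return "max-age=%d" % poll_interval
    if resource == "inputs" and run_status == "Running" and inputs_frozen:
        return "max-age=%d" % LONG_LIVED
    return "no-cache"
```

A service that allows inputs to change while running would pass `inputs_frozen=False` and so fall back to the conservative default.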
The service MAY expire research objects from any state after a reasonable or configured period of time (like 48 hours). The service SHOULD respond 410 Gone for requests to an expired RO or any of its resources.
This specification does not specify the authentication mechanism for accessing the service or the underlying workflow system. It is envisioned that a system using OAuth 2.0 with common users on both the Workflow Runner API and the underlying workflow system would provide reasonable authentication measures.
The service SHOULD NOT expose workflow runs or their data that the authenticated user should not have access to.
This service, by its nature, allows execution of arbitrary workflows of the supported workflow system. Depending on the workflow system, this might give the client execution rights on the underlying workflow server, which might be used to expose the data of other users of the service, in addition to being a platform for further exploits. This service might allow uploading of arbitrary data as workflows, workflow inputs and files in a working directory, which could be used by attackers for hosting unwanted content such as spam links and pornographic content. Even pre-approved workflows might in some cases be the subject of abuse if the service allows execution with arbitrary content, for instance to cause out of memory exceptions or SQL injections.
Implementations should ensure that the underlying workflow server is subject to additional security constraints, such as firewalls, user isolation (sudo) and use of virtual machines with snapshot rollback. Implementations should prevent workflow executions from accessing any security tokens needed by the Workflow Runner Service, for instance to prevent a malicious workflow from submitting additional workflow runs.
This service should preferably only allow execution by a pre-approved list of accountable users, i.e. users who could otherwise be given direct execution rights on the underlying execution platform, although it may allow unauthenticated execution on third-party workflow systems whose authentication details are provided by the client. It is outside the scope of this specification how to provide these details to the service.
The first implementation of this service will interface the Taverna Server using its REST API and will be made available on the Wf4Ever sandbox.