-
Notifications
You must be signed in to change notification settings - Fork 9
/
Copy pathpractices.tex
134 lines (112 loc) · 11.8 KB
/
practices.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
% -----------------------------------------------------------------------------
\section{Recommended Best Practices}
\label{sec:bestpractices}
% -----------------------------------------------------------------------------
\twozeroone{\subsection{SBOL Namespaces}
Namespaces for different versions of SBOL SHOULD NOT be semantically mixed in the same document. For example, an SBOL 3.x \sbol{ComponentInstance} SHOULD NOT refer to an SBOL 1.x DNAComponent. However, namespaces for different versions MAY be present in the same document so long as they are semantically independent (that is, so long as objects belonging to these namespaces do not refer to each other). When multiple SBOL namespaces are present in a document, libraries SHOULD default to interpreting as SBOL only those objects belonging to the namespace for the most recent version of SBOL in the document. Any remaining objects belonging to any other SBOL namespace SHOULD be interpreted as \sbol{GenericTopLevel}s and custom \sbol{Annotation}s.}
\subsection{Use of the Version Property}
Once an SBOL object has been published where others might have access it (e.g., to an online repository), it might be the case that other objects come to depend on the particular contents of the published object.
Thus, in order to avoid creating conflicting data, if a person wants to change the properties of a published object, they SHOULD do so by making a new copy of object that incorporates the change and has an \sbol{identity} property that contains a new \sbol{URI}.
The relationship between the old and new objects (i.e., that the new object was derived from the old object), however, is not visible unless it is explicitly declared. This is RECOMMENDED to be done using the \sbol{persistentIdentity}, and \sbol{version} properties. The preferred practice for declaring such a relationship is to use the same \sbol{persistentIdentity} for both objects, but give the newer object a later \sbol{version}. Then, when the new object is published, it can be clear to both humans and machines that this object is intended to update the previously published object. In this way, when a user wants the latest version of an object, they can obtain it by referencing the object via its \sbol{persistentIdentity} and rely on a tool to find the object with that \sbol{persistentIdentity} and the latest \sbol{version}.
As stated in \ref{sec:version}, it is RECOMMENDED that version numbering follow
the conventions of semantic versions (\url{http://semver.org/}), particularly as implemented by Maven (\url{http://maven.apache.org/}).
This convention represents versions as sequences of numbers and qualifiers separated by the characters {\tt .} and {\tt -} and compared in lexicographical order (for example, 1 < 1.3.1 < 2.0-beta). For a full explanation, see the linked resources.
\subsection{Compliant SBOL Objects}
\label{sec:compliant}
Maintaining unique \sbol{identity} \sbol{URI}s for all SBOL objects can be a very challenging implementation task. To reduce this burden, users of SBOL 3.x are encouraged to follow a few simple rules when constructing the \sbol{identity} properties and related properties for SBOL objects. When these rules are followed in constructing an SBOL object, we say that this object is \emph{compliant}. These rules are as follows:
\begin{enumerate}
\item The \sbol{identity} of a compliant SBOL object MUST begin with a \emph{URI prefix} that maps to a domain over which the user has control. Namely, the user can guarantee uniqueness of identities within this domain.
\item The \sbol{persistentIdentity} and \sbol{displayId} properties are REQUIRED of a compliant SBOL object.
\item The \sbol{persistentIdentity} of a compliant \sbol{TopLevel} object MUST end with a delimiter (`/', `\#', or `:') followed by the \sbol{displayId} of the object.
\item The \sbol{persistentIdentity} of a compliant SBOL object that is not also a \sbol{TopLevel} object MUST begin with the \sbol{persistentIdentity} of its parent object and be immediately followed by a delimiter (`/', `\#', or `:') and the \sbol{displayId} of the compliant object.
\item If a compliant SBOL object is not given a \sbol{version}, then its \sbol{identity} and \sbol{persistentIdentity} properties MUST contain the same \sbol{URI}.
\item If a compliant SBOL object has a \sbol{version}, then its \sbol{identity} property MUST contain a \sbol{URI} of the form ``\refObj{persistentIdentity}/\refObj{version}''.
\item The \sbol{version} of a compliant SBOL object that is not also a \sbol{TopLevel} object MUST contain the same \sbol{String} as the \sbol{version} property of the compliant object's parent object.
\item The \sbol{identity}, \sbol{persistentIdentity}, \sbol{displayId}, and \sbol{version} properties of a compliant SBOL object MUST NOT be changed once set.
\end{enumerate}
All examples in this specification use compliant \sbol{URI}s.
\subsection{Annotations: Embedded Objects vs. External References}
When annotating an SBOL document with additional information, there are
two general methods that can be used:
\begin{itemize}
\item Embed the information in the SBOL document, either as non-SBOL
properties or \sbol{GenericTopLevel} objects.
\item Store the information separately and annotate the SBOL document
with \sbol{URI}s that point to it.
\end{itemize}
In theory, either method can be used in any case. (Note that a third case not
discussed here is to use SBOL to annotate external objects with linking
to SBOL documents, rather than annotate SBOL documents with links external objects.)
In practice,
embedding massive amounts of non-SBOL data into SBOL documents is likely
to cause problems for people and software tools trying to manage and
exchange such documents. Therefore, it is RECOMMENDED that small amounts of information (e.g., design notes or preferred graphical layout) be embedded in the SBOL model, while large amounts of information (e.g., the contents of the scientific publication from which a model was derived or flow cytometry data that characterizes performance) be linked with URIs pointing to external resources. The boundary between ``small'' and ``large'' is left deliberately vague, recognizing that it will likely depend on the particulars of a given SBOL application.
\subsection{Completeness and Validation}
RDF documents containing serialized SBOL objects might or might not be
entirely self-contained. A SBOL document is self-contained or ``complete'' if every SBOL object referred to in the document is contained in the document. It is RECOMMENDED that serializations be complete whenever practical. In order words, when serializing an SBOL object, serialize all of the other objects that it points to, then serialize all of the other objects that these objects point to, etc., until the document is complete.
It is important to note that there is no guarantee that an RDF document
contains valid SBOL. When an RDF document is deserialized to SBOL
objects, the program doing so SHOULD verify that all of the property
values encoded therein have the right data type (e.g., that the object
pointed to by the \sbol{sequences} property of a
\sbol{Component} is really a \sbol{Sequence}).
For complete files, this validation can be carried out entirely locally. For files that are not complete, an implementation either needs to
have a means of validating those external references (e.g., by
retrieving them from various repositories), or it needs to mark them as
unverified and not depend on their correctness.
\subsection{Recommended Ontologies for External Terms}
\label{sec:recomm_ontologies}
External ontologies and controlled vocabularies are an integral part of SBOL. SBOL uses \sbol{URI}s to access existing biological information through these resources. New SBOL specific terms are defined only when necessary. For example, \sbol{Component} \sbolmult{types:CD}{types}, such as DNA or protein, are described using SBO terms. Similarly, the roles of a DNA \twozeroone{or RNA} \sbol{Component} are described via SO terms. Although RECOMMENDED ontologies have been indicated in relevant sections where possible, other resources providing similar terms can also be used. A summary of these external sources can be found in \ref{tbl:preferred_external_resources}.
\begin{table}[ht]
\begin{edtable}{tabular}{p{3cm}p{1.5cm}p{4.5cm}p{6cm}}
\toprule
\textbf{SBOL Entity} & \textbf{Property} & \textbf{Preferred External Resource}
& \textbf{More Information} \\
\midrule
\textbf{Component} & types & SBO & \url{https://www.ebi.ac.uk/sbo/main/}\\
& types & SO (nucleic acid topology)& \url{http://www.sequenceontology.org}\\
& roles & SO (\textit{DNA} or \textit{RNA}) & \url{http://www.sequenceontology.org} \\
& roles & CHEBI (\textit{small molecule}) & \url{https://www.ebi.ac.uk/chebi/} \\
% & roles & UniProt (if type is \textit{protein}??) \\
\textbf{Interaction} & types & SBO (occurring entity branch) &
\url{http://www.ebi.ac.uk/sbo/main/} \\
\textbf{Participation} & roles & SBO (participant roles branch) &
\url{http://www.ebi.ac.uk/sbo/main/} \\
\textbf{Model} & language & EDAM & \url{http://bioportal.bioontology.org/ontologies/EDAM} \\
& framework & SBO (modeling framework branch) &
\url{http://www.ebi.ac.uk/sbo/main/} \\
\bottomrule
\end{edtable}
\caption{Preferred external resources from which to draw values for various SBOL properties.}
\label{tbl:preferred_external_resources}
\end{table}
\twoonezero{The URIs for ontological terms SHOULD come from identifiers.org. However, it is acceptable to use terms from purl.org as an alternative, for example when RDF tooling requires URIs to be represented as compliant QNames. SBOL software may convert between these forms as required.}
\twozeroone{\subsection{Annotating Entities with Date \& Time}\label{sec:DateTime}
Entities in an SBOL document can be annotated with creation and modification date. It is RECOMMENDED that predicates, or properties, from DCMI Metadata Terms SHOULD be used to include date and time information. The \texttt{created} and \texttt{modified} terms SHOULD respectively be used to annotate SBOL entities with creation and modification dates. Date and time values SHOULD be expressed using the XML Schema \texttt{DateTime} datatype~\citep{Biron2004}. For example, "\texttt{2016-03-16T20:12:00Z}" specifies that the day is 16 March 2016 and the time is 20:12pm in UTC (Coordinated Universal Time). An example serialization is shown below:}
\lstsetsbol
\begin{lstlisting}
<?xml version="1.0" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:prov="http://www.w3.org/ns/prov#" xmlns:sbol="http://sbols.org/v2#">
<sbol:CLASS_NAME rdf:about="...">
<dcterms:created>DATE_TIME_STRING</dcterms:created>
<dcterms:modified>DATE_TIME_STRING</dcterms:modified>
...
</sbol:CLASS_NAME>
...
</rdf:RDF>
\end{lstlisting}
\twotwoone{\subsection{Annotating Entities with Authorship information}\label{sec:Authorship}
Authorship information should ideally be added to \sbol{TopLevel} entities, where possible. It is RECOMMENDED that the \texttt{creator} DCMI Metadata term SHOULD be used to annotate SBOL entities with authorship information. This property can be repeated for each author. The example below shows the use of this property for two authors and the values shown are free text \texttt{String} literals.
}
\lstsetsbol
\begin{lstlisting}
<?xml version="1.0" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:prov="http://www.w3.org/ns/prov#" xmlns:sbol="http://sbols.org/v2#">
<sbol:CLASS_NAME rdf:about="...">
<dcterms:creator>First author</dcterms:creator>
<dcterms:creator>Second author</dcterms:creator>
...
</sbol:CLASS_NAME>
...
</rdf:RDF>
\end{lstlisting}