migrate in samtools, kraken, vcflib #1148

Merged
merged 40 commits into from
Apr 28, 2017
Changes from 4 commits
4d079c8
migrate in kraken, vflib, samtools
martenson Jan 24, 2017
a0eb535
change shed.yml: remote repo url to iuc
martenson Jan 24, 2017
d9c1bf7
update the rest of the remote repo urls
martenson Jan 24, 2017
c345e50
set chunk size to 4 temporarily
martenson Jan 24, 2017
2a298cb
fix samtools stats schema lint
martenson Jan 24, 2017
70de720
remove tool_dependencies.xml for samtools, let conda do it
martenson Jan 30, 2017
b392dc3
samtools linting errors
yhoogstrate Feb 6, 2017
c5da921
samtools mpileup - attempt to get testing and linting working
yhoogstrate Feb 6, 2017
ae39c24
samtools mpileup - attempt 2 to get testing and linting working
yhoogstrate Feb 6, 2017
b637cef
samtools stats+sam_to_bam - small changes to get tests working
yhoogstrate Feb 6, 2017
eee57a1
Samtools: update to version_command
yhoogstrate Feb 15, 2017
840024e
Samtools: update to version_command - attempt 2
yhoogstrate Feb 15, 2017
eaf496a
kraken: minor changes
yhoogstrate Feb 15, 2017
d99fc25
Kraken: removal of tool-dependencies
yhoogstrate Feb 15, 2017
9c88880
samtools bam_to_cram small changes
yhoogstrate Feb 15, 2017
223a8ac
add changes from @nsoranzo
martenson Feb 17, 2017
990f27c
add more improvements by :eagle::eyes:
martenson Feb 17, 2017
9cfe78d
typo breaking planemo test
yhoogstrate Feb 24, 2017
d0682b1
samtools bedcov: proper usage of multiple bam files
yhoogstrate Feb 24, 2017
21b0a8f
updates test data of samtools flagstat and updates iterator in samtoo…
yhoogstrate Feb 24, 2017
ca83496
textual changes
yhoogstrate Feb 24, 2017
8a279e5
kraken: more textual changes
yhoogstrate Feb 24, 2017
8215ba7
samtools: small change
yhoogstrate Feb 24, 2017
4dc44ce
samtools: a few more textual changes
yhoogstrate Feb 25, 2017
361b7a9
samtoools: typo
yhoogstrate Feb 25, 2017
abc67ae
samtools idxstats: fix symlinking bai file
yhoogstrate Feb 27, 2017
cde9556
More changes thanks to Nicola
yhoogstrate Feb 27, 2017
a44101b
fix shedyaml to point at iuc repo
martenson Feb 28, 2017
6e1b060
fix shedyaml to point at iuc repo
martenson Feb 28, 2017
6c16017
fix shedyaml to point at iuc repo
martenson Feb 28, 2017
f8fb9de
update rmdup to samtools 1.3 and tiny improvements, all thanks to :ea…
martenson Mar 7, 2017
9bb0b6d
Make test of cram/bam tools pass
yhoogstrate Apr 25, 2017
5ce99d2
Fix more tests
bgruening Apr 25, 2017
96d9878
test fixes
bgruening Apr 25, 2017
b758595
Change test data
bgruening Apr 25, 2017
2f64c9d
removal of not used test files
yhoogstrate Apr 26, 2017
c339ded
upgraded mpileup tests
yhoogstrate Apr 26, 2017
8e28b30
samtools reheader: allow testing with 1 line difference to correct fo…
yhoogstrate Apr 26, 2017
9cf113c
samtools stats also reports stats on "N" and "0"
yhoogstrate Apr 28, 2017
46b0568
samtools stats: moving test-data from sub-directory to allow properly…
yhoogstrate Apr 28, 2017
4 changes: 3 additions & 1 deletion .travis.yml
Original file line number Diff line number Diff line change
@@ -6,6 +6,8 @@ python: 2.7
env:
- CHUNK=0
- CHUNK=1
+ - CHUNK=2
+ - CHUNK=3

before_install:
- export GALAXY_REPO=https://github.com/galaxyproject/galaxy
@@ -32,7 +34,7 @@ install:
planemo ci_find_repos --exclude_from .tt_blacklist \
--exclude packages --exclude data_managers \
--changed_in_commit_range "$TRAVIS_COMMIT_RANGE" \
- --chunk_count 2 --chunk "${CHUNK}" \
+ --chunk_count 4 --chunk "${CHUNK}" \
--output changed_repositories_chunk.list
- cat changed_repositories_chunk.list
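
Note on the change above: the build matrix grows from two to four `CHUNK` values and `--chunk_count` is raised to match, so each Travis job lints and tests roughly a quarter of the changed repositories. How planemo assigns repositories to chunks is not shown in this diff. The sketch below assumes a plain round-robin split over a sorted list; `select_chunk` is a hypothetical helper, not a planemo function.

```python
def select_chunk(repos, chunk_count, chunk):
    """Return the repositories assigned to one chunk (round-robin assumption)."""
    if not 0 <= chunk < chunk_count:
        raise ValueError("chunk must be in range(chunk_count)")
    # Sort first so every CI job sees the same ordering and the
    # chunks are guaranteed to be disjoint.
    return sorted(repos)[chunk::chunk_count]


# Repository paths taken from the review discussion in this PR.
repos = [
    "tool_collections/kraken/kraken",
    "tool_collections/kraken/kraken_filter",
    "tool_collections/kraken/kraken_report",
    "tool_collections/kraken/kraken_translate",
]

# One list per Travis job, mirroring CHUNK=0..3 in the env matrix.
chunks = [select_chunk(repos, 4, i) for i in range(4)]
```

Under this assumption every repository lands in exactly one chunk, which is the property the CI setup relies on regardless of the exact assignment rule.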

92 changes: 92 additions & 0 deletions tool_collections/kraken/README.rst
@@ -0,0 +1,92 @@
Introduction
============

`Kraken <http://ccb.jhu.edu/software/kraken/>`__ is a taxonomic sequence
classifier that assigns taxonomic labels to short DNA reads. It does
this by examining the :math:`k`-mers within a read and querying a
database with those :math:`k`-mers. This database contains a mapping of
every :math:`k`-mer in
`Kraken <http://ccb.jhu.edu/software/kraken/>`__'s genomic library to
the lowest common ancestor (LCA) in a taxonomic tree of all genomes that
contain that :math:`k`-mer. The set of LCA taxa that correspond to the
:math:`k`-mers in a read are then analyzed to create a single taxonomic
label for the read; this label can be any of the nodes in the taxonomic
tree. `Kraken <http://ccb.jhu.edu/software/kraken/>`__ is designed to be
rapid, sensitive, and highly precise. Our tests on various real and
simulated data have shown
`Kraken <http://ccb.jhu.edu/software/kraken/>`__ to have sensitivity
slightly lower than Megablast with precision being slightly higher. On a
set of simulated 100 bp reads,
`Kraken <http://ccb.jhu.edu/software/kraken/>`__ processed over 1.3
million reads per minute on a single core in normal operation, and over
4.1 million reads per minute in quick operation.

The latest released version of Kraken will be available at the `Kraken
website <http://ccb.jhu.edu/software/kraken/>`__, and the latest updates
to the Kraken source code are available at the `Kraken GitHub
repository <https://github.com/DerrickWood/kraken>`__.

If you use `Kraken <http://ccb.jhu.edu/software/kraken/>`__ in your
research, please cite the `Kraken
paper <http://genomebiology.com/2014/15/3/R46>`__. Thank you!
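
Note: the database described above (every k-mer mapped to the lowest common ancestor of all genomes containing it) can be illustrated with a toy example. This is a minimal sketch, not Kraken's implementation: the taxonomy IDs and sequences are made up, the function names are hypothetical, and it ignores details such as reverse complements and the later step that turns per-k-mer LCAs into one label per read.

```python
def lineage(taxon, parent):
    """Return the path from a taxon up to the root, taxon first."""
    path = [taxon]
    while taxon in parent:
        taxon = parent[taxon]
        path.append(taxon)
    return path


def lca(a, b, parent):
    """Lowest common ancestor of two taxa in a tree given as child -> parent."""
    ancestors_of_a = set(lineage(a, parent))
    for node in lineage(b, parent):
        if node in ancestors_of_a:
            return node
    raise ValueError("taxa do not share a root")


def build_kmer_lca(genomes, parent, k):
    """Map every k-mer to the LCA of all taxa whose genome contains it."""
    table = {}
    for taxon, sequence in genomes.items():
        for i in range(len(sequence) - k + 1):
            kmer = sequence[i:i + k]
            if kmer in table:
                table[kmer] = lca(table[kmer], taxon, parent)
            else:
                table[kmer] = taxon
    return table


# Toy taxonomy: root (1) -> genus (10) -> species 11 and 12. IDs are made up.
parent = {10: 1, 11: 10, 12: 10}
genomes = {11: "ACGTAC", 12: "CGTACC"}
table = build_kmer_lca(genomes, parent, k=4)
```

In this toy data the k-mers shared by both species map to the genus node, while the k-mers unique to one species keep that species as their label.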

System Requirements
===================

Note: Users concerned about the disk or memory requirements should read
the paragraph about MiniKraken, below.

- **Disk space**: Construction of Kraken's standard database will
require at least 160 GB of disk space. Customized databases may
require more or less space. Disk space used is linearly proportional
to the number of distinct :math:`k`-mers; as of Feb. 2015, Kraken's
default database contains just under 6 billion (6e9) distinct
:math:`k`-mers.

In addition, the disk used to store the database should be
locally-attached storage. Storing the database on a network
filesystem (NFS) partition can cause Kraken's operation to be very
slow, or to be stopped completely. As NFS accesses are much slower
than local disk accesses, both preloading and database building will
be slowed by use of NFS.

- **Memory**: To run efficiently, Kraken requires enough free memory to
hold the database in RAM. While this can be accomplished using a
ramdisk, Kraken supplies a utility for loading the database into RAM
via the OS cache. The default database size is 75 GB (as of Feb.
2015), and so you will need at least that much RAM if you want to
build or run with the default database.

- **Dependencies**: Kraken currently makes extensive use of Linux
utilities such as sed, find, and wget. Many scripts are written using
the Bash shell, and the main scripts are written using Perl. Core
programs needed to build the database and run the classifier are
written in C++, and need to be compiled using g++. Multithreading is
handled using OpenMP. Downloads of NCBI data are performed by wget
and in some cases, by rsync. Most Linux systems that have any sort of
development package installed will have all of the above listed
programs and libraries available.

Finally, if you want to build your own database, you will need to
install the
`Jellyfish <http://www.cbcb.umd.edu/software/jellyfish/>`__
:math:`k`-mer counter. Note that Kraken only supports use of
Jellyfish version 1. Jellyfish version 2 is not yet compatible with
Kraken.

- **Network connectivity**: Kraken's standard database build and
download commands expect unfettered FTP and rsync access to the NCBI
FTP server. If you're working behind a proxy, you may need to set
certain environment variables (such as ``ftp_proxy`` or
``RSYNC_PROXY``) in order to get these commands to work properly.

- **MiniKraken**: To allow users with low-memory computing environments
to use Kraken, we supply a reduced standard database that can be
downloaded from the Kraken web site. When Kraken is run with a
reduced database, we call it MiniKraken.

The database we make available is only 4 GB in size, and should run
well on computers with as little as 8 GB of RAM. Disk space required
for this database is also only 4 GB.


Binary file added tool_collections/kraken/database.idx
Binary file not shown.
Binary file added tool_collections/kraken/database.kdb
Binary file not shown.
16 changes: 16 additions & 0 deletions tool_collections/kraken/kraken/.shed.yml
@@ -0,0 +1,16 @@
categories:
- Metagenomics
description: Kraken for taxonomic designation.
homepage_url: http://ccb.jhu.edu/software/kraken/
long_description: |
Kraken is a system for assigning taxonomic labels to short DNA
sequences, usually obtained through metagenomic studies. Previous attempts by other
bioinformatics software to accomplish this task have often used sequence alignment
or machine learning techniques that were quite slow, leading to the development
of less sensitive but much faster abundance estimation programs. Kraken aims to
achieve high sensitivity and high speed by utilizing exact alignments of k-mers
and a novel classification algorithm.
name: kraken
owner: devteam
remote_repository_url: https://github.com/galaxyproject/tools-iuc/blob/master/tool_collections/kraken/kraken/
Member

@martenson The kraken wrappers are under devteam and the others under IUC. Is it ok to choose either devteam or IUC for all of them?

Member Author

@yhoogstrate I do not understand the question. Please rephrase.

Member

@martenson, oops I think I clicked at the wrong file to explain myself. The following .shed.yml files have their remote_repository_url: linked to tools-devteam:

tool_collections/kraken/kraken_filter/.shed.yml:remote_repository_url: https://github.com/galaxyproject/tools-devteam/blob/master/tool_collections/kraken/kraken_filter/
tool_collections/kraken/kraken_report/.shed.yml:remote_repository_url: https://github.com/galaxyproject/tools-devteam/blob/master/tool_collections/kraken/kraken_report/
tool_collections/kraken/kraken_translate/.shed.yml:remote_repository_url: https://github.com/galaxyproject/tools-devteam/blob/master/tool_collections/kraken/kraken_translate/

All other tools have this linked to tools-iuc. Is this on purpose?

Member

I'm pretty sure these are just leftovers; all remote_repository_url values should link to tools-iuc.

Member Author

@yhoogstrate I think @nsoranzo is right. I fixed those 3.

type: unrestricted
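
Note: the leftover `tools-devteam` URLs discussed in the thread above are easy to miss by eye. A check along the following lines could catch them; this is a sketch under assumptions, with `find_stale_urls` a hypothetical helper and the `.shed.yml` files read line by line rather than with a YAML parser, which is enough for a flat top-level `remote_repository_url:` key.

```python
from pathlib import Path


def find_stale_urls(root, expected="galaxyproject/tools-iuc"):
    """Return (path, url) pairs whose remote_repository_url lacks `expected`."""
    stale = []
    for shed_file in sorted(Path(root).rglob(".shed.yml")):
        for line in shed_file.read_text().splitlines():
            # Split on the first colon only; the URL itself contains colons.
            key, _, value = line.partition(":")
            if key.strip() == "remote_repository_url":
                url = value.strip()
                if expected not in url:
                    stale.append((str(shed_file), url))
    return stale
```

Calling `find_stale_urls("tool_collections")` from the repository root would list every `.shed.yml` whose URL still points somewhere other than tools-iuc.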
1 change: 1 addition & 0 deletions tool_collections/kraken/kraken/README.rst
164 changes: 164 additions & 0 deletions tool_collections/kraken/kraken/kraken.xml
@@ -0,0 +1,164 @@
<?xml version="1.0"?>
<tool id="kraken" name="Kraken" version="1.2.1">
<description>
assign taxonomic labels to sequencing reads
</description>
<macros>
<import>macros.xml</import>
</macros>
<expand macro="requirements" />
<expand macro="stdio" />
<expand macro="version_command" />
<command>
<![CDATA[
@SET_DATABASE_PATH@ &&
kraken --threads \${GALAXY_SLOTS:-1} @INPUT_DATABASE@

${only_classified_output}

#if str( $quick_operation.quick ) == "yes":
--quick
--min-hits ${quick_operation.min_hits}

#end if

#if $single_paired.single_paired_selector == "yes":
Member

        #if $single_paired.single_paired.single_paired_selector == 'yes'

#if $forward_input.is_of_type( 'fastq' ):
--fastq-input
#else:
--fasta-input
#end if
"$forward_input" "$reverse_input"
Member
nsoranzo Feb 17, 2017

            '${single_paired.forward_input}' '${single_paired.reverse_input}'

${single_paired.check_names}
#elif $single_paired.single_paired_selector == "collection":
#if $single_paired.input_pair.forward.is_of_type( 'fastq' ):
--fastq-input
#else:
--fasta-input
#end if
"${single_paired.input_pair.forward}" "${single_paired.input_pair.reverse}"
${single_paired.check_names}
#else:
#if $input_sequences.is_of_type( 'fastq' ):
Member

            #if $single_paired.input_sequences.is_of_type('fastq')

--fastq-input
#else:
--fasta-input
#end if
"$input_sequences"
Member

            '${single_paired.input_sequences}'

#end if

#if $split_reads:
--classified-out "${classified_out}" --unclassified-out "${unclassified_out}"
#end if

## The --output option was changed to redirect as it does not work properly in some situations. For example, on the test database the tool classifies 4 reads but does not write them into a file if --output is specified. It does however print correct output into STDOUT. This behavior can be re-created with the test database provided in the test-data/test_db/ folder. This is the reason for incrementing version number from 1.1.2 to 1.1.3.

> "${output}"
##kraken-translate --db ${kraken_database.fields.name} "${output}" > "${translated}"
]]>
</command>
<inputs>
<conditional name="single_paired">
<param name="single_paired_selector" type="select" label="Single or paired reads" help="--paired">
<option value="collection">Collection</option>
<option value="yes">Paired</option>
<option selected="True" value="no">Single</option>
</param>
<when value="collection">
<param format="fasta,fastq" label="Collection of paired reads" name="input_pair" type="data_collection" collection_type="paired" help="FASTA or FASTQ datasets" />
<param name="check_names" type="boolean" checked="False" truevalue="--paired --check-names" falsevalue="--paired" label="Verify read names match" help="--check-names" />
</when>
<when value="yes">
<param format="fasta,fastq" label="Forward strand" name="forward_input" type="data" help="FASTA or FASTQ dataset"/>
<param format="fasta,fastq" label="Reverse strand" name="reverse_input" type="data" help="FASTA or FASTQ dataset"/>
<param name="check_names" type="boolean" checked="False" truevalue="--paired --check-names" falsevalue="--paired" label="Verify read names match" help="--check-names" />
</when>
<when value="no">
<param format="fasta,fastq" label="Input sequences" name="input_sequences" type="data" help="FASTA or FASTQ datasets"/>
</when>

</conditional>
<param label="Output classified and unclassified reads?" name="split_reads" type="boolean" help="Sets --unclassified-out and --classified-out"/>

<conditional name="quick_operation">
<param name="quick" type="select" label="Enable quick operation?" help="--quick; Rather than searching all k-mers in a sequence, stop classification after a specified number of database hits">
<option value="yes">Yes</option>
<option selected="True" value="no">No</option>
</param>
<when value="yes">
<param name="min_hits" type="integer" value="1" label="Number of hits required for classification" help="--min-hits; min-hits will allow you to require multiple hits before declaring a sequence classified, which can be especially useful with custom databases when testing to see if sequences either do or do not belong to a particular genome; default=1"/>
</when>
<when value="no">
<!-- Do absolutely nothing -->
</when>
</conditional>

<param name="only_classified_output" type="boolean" checked="False" truevalue="--only-classified-output" falsevalue="" label="Print no Kraken output for unclassified sequences" help="--only-classified-output"/>

<expand macro="input_database" />
</inputs>
<outputs>
<data format_source="input_sequences" label="${tool.name} on ${on_string}: Classified reads" name="classified_out">
<filter>(split_reads)</filter>
</data>
<data format_source="input_sequences" label="${tool.name} on ${on_string}: Unclassified reads" name="unclassified_out">
<filter>(split_reads)</filter>
</data>
<data format="tabular" label="${tool.name} on ${on_string}: Classification" name="output" />
<!--<data format="tabular" label="${tool.name} on ${on_string}: Translated classification" name="translated" />-->
</outputs>

<tests>
<test>
<param name="single_paired_selector" value="no"/>
<param name="input_sequences" value="kraken_test1.fa" ftype="fasta"/>
<param name="split_reads" value="false"/>
<param name="quick" value="no"/>
<param name="only-classified-output" value="false"/>
<param name="kraken_database" value="test_db"/>
<output name="output" file="kraken_test1_output.tab" ftype="tabular"/>
</test>
</tests>
<help>
<![CDATA[
**What it does**

Kraken is a taxonomic sequence classifier that assigns taxonomic labels to short DNA reads. It does this by examining the k-mers within a read and querying a database with those k-mers. This database contains a mapping of every k-mer in Kraken's genomic library to the lowest common ancestor (LCA) in a taxonomic tree of all genomes that contain that k-mer. The set of LCA taxa that correspond to the k-mers in a read are then analyzed to create a single taxonomic label for the read; this label can be any of the nodes in the taxonomic tree. Kraken is designed to be rapid, sensitive, and highly precise.

-----

**Kraken options**

The Galaxy version of Kraken implements the following options::


--fasta-input Input is FASTA format
--fastq-input Input is FASTQ format
--quick Quick operation (use first hit or hits)
--min-hits NUM In quick op., number of hits req'd for classification
NOTE: this is ignored if --quick is not specified
--unclassified-out Print unclassified sequences to filename
--classified-out Print classified sequences to filename

--only-classified-output Print no Kraken output for unclassified sequences

------
Member

All options are already documented inside the <inputs> section, duplicating here only complicates the wrapper maintenance.


**Output Format**

Each sequence classified by Kraken results in a single line of output. Output lines contain five tab-delimited fields; from left to right, they are::

1. "C"/"U": one letter code indicating that the sequence was either classified or unclassified.
2. The sequence ID, obtained from the FASTA/FASTQ header.
3. The taxonomy ID Kraken used to label the sequence; this is 0 if the sequence is unclassified.
4. The length of the sequence in bp.
5. A space-delimited list indicating the LCA mapping of each k-mer in the sequence. For example, "562:13 561:4 A:31 0:1 562:3" would indicate that:
a) the first 13 k-mers mapped to taxonomy ID #562
b) the next 4 k-mers mapped to taxonomy ID #561
c) the next 31 k-mers contained an ambiguous nucleotide
d) the next k-mer was not in the database
e) the last 3 k-mers mapped to taxonomy ID #562
]]>
</help>
<expand macro="citations" />
</tool>
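
Note: the branching in the `<command>` template above can be hard to follow with the review comments interleaved. The sketch below restates the same decision logic in Python. It is only an illustration: `build_kraken_args` is a hypothetical function, the database option is left out because `@INPUT_DATABASE@` is defined in `macros.xml`, which is not shown here, and the final shell redirect to the output file is omitted.

```python
def build_kraken_args(mode, is_fastq, inputs, threads=1, quick=False,
                      min_hits=1, check_names=False, only_classified=False,
                      split_outputs=None):
    """Assemble Kraken arguments following the branches of the template.

    mode is the single_paired_selector value: "no", "yes" or "collection".
    """
    expected = 1 if mode == "no" else 2
    if len(inputs) != expected:
        raise ValueError("mode %r needs %d input file(s)" % (mode, expected))

    args = ["kraken", "--threads", str(threads)]
    if only_classified:
        args.append("--only-classified-output")
    if quick:
        # --min-hits is only emitted together with --quick.
        args += ["--quick", "--min-hits", str(min_hits)]
    args.append("--fastq-input" if is_fastq else "--fasta-input")
    args += list(inputs)
    if mode in ("yes", "collection"):
        # Paired data always gets --paired; --check-names is optional.
        args.append("--paired")
        if check_names:
            args.append("--check-names")
    if split_outputs is not None:
        classified, unclassified = split_outputs
        args += ["--classified-out", classified,
                 "--unclassified-out", unclassified]
    return args
```

Both paired branches ("yes" and "collection") produce the same flags and differ only in where the two input files come from, which is why the reviewers' comments concentrate on the variable paths rather than the options.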
1 change: 1 addition & 0 deletions tool_collections/kraken/kraken/macros.xml
70 changes: 70 additions & 0 deletions tool_collections/kraken/kraken/test-data/kraken_test1.fa
@@ -0,0 +1,70 @@
>gi|145231|gb|M33724.1|ECOALPHOA Escherichia coli K-12 truncated PhoA (phoA) gene, partial cds; and transposon Mu dI, partial sequence
CAAAGCTCCGGGCCTCACCCAGGCGCTAAATACCAAAGATGGCGCAGTGATGGTGATGAGTTACGGGAAC
TCCGAAGAGGATTCACAAGAACATACCGGCAGTCAGTTGCGTATTGCGGCGTATGGCCCGCATGCCGCCA
ATGAAGCGGCGCACGAAAAACGCGAAAGCGT

>gi|145232|gb|M33725.1|ECOALPHOB Escherichia coli K12 phoA pseudogene and transposon Mu dl-R, partial sequence
CTGTCATAAAGTTGTCACGGCCGAGACTTATAGTCGCTTTGTTTTTATTTTTTAATGTATTTGTACATGG
AGAAAATAAAGTGAAACAAAGCACTATTGCACTGGCACTCTTACCGTTACTGTTTACCCCTGTGACAAAA
GCCCGGACACCAGTGAAGCGGCGCACGAAAAACGCGAAAGCGT

>gi|145234|gb|M33727.1|ECOALPHOE Escherichia coli K12 upstream sequence of psiA5::Mu dI. is identical to psiA30 upstream sequence; putative (phoA) pseudogene and transposon Mu dl-R, partial sequence
TTGTTTTTATTTTTTAATGTATTTGTACATGGAGAAAATAAAGTGAAACAAAGCACTATTGCACTGGTGA
AGCGGCGCACGAAAAACGCGAAAGCGT

>gi|146195|gb|J01619.1|ECOGLTA Eschericia coli gltA gene, sdhCDAB operon and sucABCD operons, complete sequence
GAATTCGACCGCCATTGCGCAAGGCATCGCCATGACCAGGCAGGATACAAAAGAGAGTCGATAAATATTC
ACGGTGTCCATACCTGATAAATATTTTATGAAAGGCGGCGATGATGCCGCAAAATAATACTTATTTATAA
TCCAGCACGTAGGTTGCGTTAGCGGTTACTTCACCTGCCGTGACATCGACTGCATTATCAATTTGTTCCA
TCCAGGCGAAAAAGTTCAGCGTCTGTTCTGATGAGCTTGCATCCAGGTCAAGATCTGGCGCGGCTGAACC
TAATACGATGTTACCGTCATTTTTGTCCATCAGTCGTACACCGACCCCAGTTGCTTCGCCTGCACTGGTG
TTGCTCAACAAAGGCGTAGCACCAGTTGTCTTAGCCGTGCTATCGAAGGTTACGCCAAACTTTGGATACC
GGCATTCCGCTACCGTTGTCAGAAGCAGGCAGATCACAGTTGATCAAGCGAATGTCGACGGCCACTTTAT
TGCTATGATGCTCCCGGTTTATATGGGTTGTCGTGACTTGTCCAAGATCTATGTTTTTATCAATATCTTC
TGGATGAATTTCACAAGGTGCTTCAATAACCTCCCCCTTAAAGTGAATTTCGCCAGAACCTTCATCAGCA
GCATAAACAGGTGCAGTGAACAGCAGAGATACGGCCAGTGCGGCCAATGTTTTTTGTCCTTTAAACATAA
CAGAGTCCTTTAAGGATATAGAATAGGGGTATAGCTACGCCAGAATATCGTATTTGATTATTGCTAGTTT
TTAGTTTTGCTTAAAAAATATTGTTAGTTTTATTAAATTGGAAAACTAAATTATTGGTATCATGAATTGT
TGTATGATGATAAATATAGGGGGGATATGATAGACGTCATTTTCATAGGGTTATAAAATGCGACTACCAT
GAAGTTTTTAATTCAAAGTATTGGGTTGCTGATAATTTGAGCTGTTCTATTCTTTTTAAATATCTATATA
GGTCTGTTAATGGATTTTATTTTTACAAGTTTTTTGTGTTTAGGCATATAAAAATCAAGCCCGCCATATG
AACGGCGGGTTAAAATATTTACAACTTAGCAATCGAACCATTAACGCTTGATATCGCTTTTAAAGTCGCG
TTTTTCATATCCTGTATACAGCTGACGCGGACGGGCAATCTTCATACCGTCACTGTGCATTTCGCTCCAG
TGGGCGATCCAGCCAACGGTACGTGCCATTGCGAAAATGACGGTGAACATGGAAGACGGAATACCCATCG
CTTTCAGGATGATACCAGAGTAGAAATCGACGTTCGGGTACAGTTTCTTCTCGATAAAGTACGGGTCGTT
CAGCGCGATGTTTTCCAGCTCCATAGCCACTTCCAGCAGGTCATCCTTCGTGCCCAGCTCTTTCAGCACT
TCATGGCAGGTTTCACGCATTACGGTGGCGCGCGGGTCGTAATTTTTGTACACGCGGTGACCGAAGCCCA
TCAGGCGGAAAGAATCATTTTTGTCTTTCGCACGACGAAAAAATTCCGGAATGTGTTTAACGGAGCTGAT
TTCTTCCAGCATTTTCAGCGCCGCTTCGTTAGCACCGCCGTGCGCAGGTCCCCACAGTGAAGCAATACCT
GCTGCGATACAGGCAAACGGGTTCGCACCCGAAGAGCCAGCGGTACGCACGGTGGAGGTAGAGGCGTTCT
GTTCATGGTCAGCGTGCAGGATCAGAATACGGTCCATAGCACGTTCCAGAATCGGATTAACTTCATACGG
TTCGCACGGCGTGGAGAACATCATATTCAGGAAGTTACCGGCGTAGGAGAGATCGTTGCGCGGGTAAACA
AATGGCTGACCAATGGAATACTTGTAACACATCGCGGCCATGGTCGGCATTTTCGACAGCAGGCGGAACG
CGGCAATTTCACGGTGACGAGGATTGTTAACATCCAGCGAGTCGTGATAGAACGCCGCCAGCGCGCCGGT
AATACCACACATGACTGCCATTGGATGCGAGTCGCGACGGAAAGCATGGAACAGACGGGTAATCTGCTCG
TGGATCATGGTATGACGGGTCACCGTAGTTTTAAATTCGTCATACTGTTCCTGAGTCGGTTTTTCACCAT
TCAGCAGGATGTAACAAACTTCCAGGTAGTTAGAATCGGTCGCCAGCTGATCGATCGGGAAACCGCGGTG
CAGCAAAATACCTTCATCACCATCAATAAAAGTAATTTTAGATTCGCAGGATGCGGTTGAAGTGAAGCCT
GGGTCAAAGGTGAACACACCTTTTGAACCGAGAGTACGGATATCAATAACATCTTGACCCAGCGTGCCTT
TCAGCACATCCAGTTCAACAGCTGTATCCCCGTTGAGGGTGAGTTTTGCTTTTGTATCAGCCATTTAAGG
TCTCCTTAGCGCCTTATTGCGTAAGACTGCCGGAACTTAAATTTGCCTTCGCACATCAACCTGGCTTTAC
CCGTTTTTTATTTGGCTCGCCGCTCTGTGAAAGAGGGGAAAACCTGGGTACAGAGCTCTGGGCGCTTGCA
GGTAAAGGATCCATTGATGACGAATAAATGGCGAATCAAGTACTTAGCAATCCGAATTATTAAACTTGTC
TACCACTAATAACTGTCCCGAATGAATTGGTCAATACTCCACACTGTTACATAAGTTAATCTTAGGTGAA
ATACCGACTTCATAACTTTTACGCATTATATGCTTTTCCTGGTAATGTTTGTAACAACTTTGTTGAATGA
TTGTCAAATTAGATGATTAAAAATTAAATAAATGTTGTTATCGTGACCTGGATCACTGTTCAGGATAAAA
CCCGACAAACTATATGTAGGTTAATTGTAATGATTTTGTGAACAGCCTATACTGCCGCCAGTCTCCGGAA
CACCCTGCAATCCCGAGCCACCCAGCGTTGTAACGTGTCGTTTTCGCATCTGGAAGCAGTGTTTTGCATG
ACGCGCAGTTATAGAAAGGACGCTGTCTGACCCGCAAGCAGACCGGAGGAAGGAAATCCCGACGTCTCCA
GGTAACAGAAAGTTAACCTCTGTGCCCGTAGTCCCCAGGGAATAATAAGAACAGCATGTGGGCGTTATTC
ATGATAAGAAATGTGAAAAAACAAAGACCTGTTAATCTGGACCTACAGACCATCCGGTTCCCCATCACGG
CGATAGCGTCCATTCTCCATCGCGTTTCCGGTGTGATCACCTTTGTTGCAGTGGGCATCCTGCTGTGGCT
TCTGGGTACCAGCCTCTCTTCCCCTGAAGGTTTCGAGCAAGCTTCCGCGATTATGGGCAGCTTCTTCGTC
AAATTTATCATGTGGGGCATCCTTACCGCTCTGGCGTATCACGTCGTCGTAGGTATTCGCCACATGATGA
TGGATTTTGGCTATCTGGAAGAAACATTCGAAGCGGGTAAACGCTCCGCCAAAATCTCCTTTGTTATTAC
TGTCGTGCTTTCACTTCTCGCAGGAGTCCTCGTATGGTAAGCAACGCCTCCGCATTAGGACGCAATGGCG
TACATGATTTCATCCTCGTTCGCGCTACCGCTATCGTCCTGACGCTCTACATCATTTATATGGTCGGTTT
TTTCGCTACCAGTGGCGAGCTGACATATGAAGTCTGGATCGGTTTCTTCGCCTCTGCGTTCACCAAAGTG
TTCACCCTGCTGGCGCTGTTTTCTATCTTGATCCATGCCTGGATCGGCATGTGGCAGGTGTTGACCGACT
ACGTTAAACCGCTGGCTTTGCGCCTGATGCTGCAACTGGTGATTGTCGTTGCACTGGTGGTTTACGTGAT
TTATGGATTCGTTGTGGTGTGGGGTGTGTGATGAAATTGCCAGTCAGAGAATTTGATGCAGTTGTGATTG
@@ -0,0 +1,4 @@
C gi|145231|gb|M33724.1|ECOALPHOA 83333 171 83333:162
C gi|145232|gb|M33725.1|ECOALPHOB 83333 183 83333:174
C gi|145234|gb|M33727.1|ECOALPHOE 83333 97 83333:88
C gi|146195|gb|J01619.1|ECOGLTA 83333 3850 83333:3841
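
Note: the four expected-output lines above follow the five-field format described in the tool help. A small parser such as the following can be used to inspect them; it is a sketch that assumes tab-delimited single-read lines as described in the help text, and `parse_kraken_line` and `KrakenRecord` are hypothetical names. Taxon keys in the last field are kept as strings because the help text allows the non-numeric key `A` for ambiguous nucleotides.

```python
from collections import namedtuple

KrakenRecord = namedtuple(
    "KrakenRecord", "classified seq_id tax_id length lca_runs"
)


def parse_kraken_line(line):
    """Parse one five-field, tab-delimited Kraken output line."""
    status, seq_id, tax_id, length, mapping = line.rstrip("\n").split("\t")
    runs = []
    for token in mapping.split():
        # Each token is "<taxon>:<number of consecutive k-mers>".
        taxon, _, count = token.rpartition(":")
        runs.append((taxon, int(count)))
    return KrakenRecord(status == "C", seq_id, int(tax_id), int(length), runs)


# First expected-output line from the test data, with explicit tabs.
line = "C\tgi|145231|gb|M33724.1|ECOALPHOA\t83333\t171\t83333:162"
record = parse_kraken_line(line)

# A read of length L contains L - k + 1 k-mers, so k can be recovered.
kmer_count = sum(count for _, count in record.lca_runs)
implied_k = record.length - kmer_count + 1
```

For this line the arithmetic gives 171 - 162 + 1 = 10, and the other three lines give the same value, which is consistent with the test database having been built with a k-mer length of 10.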
1 change: 1 addition & 0 deletions tool_collections/kraken/kraken/test-data/test_database.loc