Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate in all devteam's DMs #1151

Merged
merged 8 commits into from
Apr 4, 2017

Conversation

martenson
Copy link
Member

part of galaxyproject/tools-devteam#406
xref: #1060

this is an effort to address part of galaxyproject/tools-devteam#416

@bgruening
Copy link
Member

We have no way to test data managers yet, and the tests are failing because of pep8 warnings. @nsoranzo can we exclude dm's from pep8 testing. I don't feel good to change so much without testing in place.

@nsoranzo
Copy link
Member

nsoranzo commented Jan 30, 2017

@bgruening I started linting them, it's straightforward, the only issue is that they don't specify a Python requirement (which is the case also for other tools in this and other tool repos), so I think they may be executed under miniconda3's Python3 when using 17.01.
We have 3 options:

  1. support both Python 2/3, which sometimes requires adding a six requirement
  2. support only Python 2 by adding the corresponding requirement (less code changes, but not future-proof)
  3. migrate them to Python3 and add the corresponding requirement.

I'd vote for option 1 or 3. Ping @galaxyproject/iuc for opinions.

@mvdbeek
Copy link
Member

mvdbeek commented Jan 30, 2017

Given that we primarily support conda in the IUC and the ease with which we can install requirements I think there's no reason the keep python2 around IMO, so I'm voting to port the code to python3.

@bgruening
Copy link
Member

I'm also for porting, but I really scared about the missing test coverage ...

@nsoranzo
Copy link
Member

I'm not sure it makes sense to adopt data_manager_hisat_index_builder, this should probably be deprecated in favour of hisat2 and data_manager_hisat2_index_builder.

@nsoranzo
Copy link
Member

Similarly, data_manager_bwa_index_builder should be deprecated in favour of data_manager_bwa_mem_index_builder.

@nsoranzo
Copy link
Member

And we may also drop data_manager_fetch_genome_all_fasta in favour of the more complete data_manager_fetch_genome_dbkeys_all_fasta. Opionions @galaxyproject/iuc?

@mvdbeek
Copy link
Member

mvdbeek commented Mar 31, 2017

Apart from the compressed portion of data_manager_fetch_genome_dbkeys_all_fasta being broken right now (I'll fix this ASAP) I'd be in favor of that.

- Remove some tool_dependencies.xml
- flake8 Python files
- Fix Python3 compatibility
- Single-quote data and text parameters in <command>
- Reorder attributes per IUC coding style.
- Update bowtie2 requirement to 2.3.0
- Update picard requirement to 2.7.1
@nsoranzo
Copy link
Member

I've pushed a big cleanup which should fix the PEP8 errors for all data managers except the ones which I think we should drop (see above), adding also Python3 compatibility.

Copy link
Member

@bgruening bgruening left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome contribution. Thanks a lot!
A few comments inline and I think we should really ensure that we increase the version number for all tools.

@@ -84,13 +96,14 @@ def _move_and_index_fasta_for_sorting( fasta_filename ):
line = line.split( None, 1 )[0][1:]
fasta_offsets[ line ] = offset
unsorted_fh.close()
current_order = map( lambda x: x[1], sorted( map( lambda x: ( x[1], x[0] ), fasta_offsets.items() ) ) )
current_order = [_[1] for _ in sorted( ( _[1], _[0] ) for _ in fasta_offsets.items() )]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice one! :)

@@ -1,67 +1,60 @@
<tool id="data_manager_fetch_genome_all_fasta" name="Reference Genome" version="0.0.1" tool_type="manage_data">
<description>fetching</description>
<command interpreter="python">data_manager_fetch_genome_all_fasta.py "${out_file}" --dbkey_description ${ dbkey.get_display_text() }</command>
<command detect_errors="exit_code">python '$__tool_directory__/data_manager_fetch_genome_all_fasta.py' '${out_file}' --dbkey_description '${dbkey.get_display_text()}'</command>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we break this into multiple lines.

@@ -4,30 +4,28 @@
<requirement type="package" version="0.1.18">samtools</requirement>
<requirement type="package" version="1.56.0">picard</requirement>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

picard and samtools update?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit scared about that, that's why I haven't removed the tool_dependencies.xml. If you want to take a stab at it, feel free to add commits to this branch.

<exit_code range="1:" err_level="fatal" />
<exit_code range=":-1" err_level="fatal" />
</stdio>
<command detect_errors="exit_code">python '$__tool_directory__/data_manager_rsync.py' '${out_file}'</command>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

multiple lines?

label="Choose Desired Data Tables Entries" dynamic_options="galaxy_code_get_available_data_tables_entries( __trans__, dbkey, data_table_names )"
refresh_on_change="dbkey"/>
<param name="data_table_names" type="select" display="checkboxes" multiple="True" optional="True" refresh_on_change="dbkey"
label="Choose Desired Data Tables" dynamic_options="galaxy_code_get_available_data_tables( __trans__ )" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are adding dynamic_options here into an IUC repo ;)

@@ -3,23 +3,21 @@
<requirements>
<requirement type="package" version="0.1.19">samtools</requirement>
</requirements>
<command interpreter="python">data_manager_sam_fasta_index_builder.py "${out_file}" --fasta_filename "${all_fasta_source.fields.path}" --fasta_dbkey "${all_fasta_source.fields.dbkey}" --fasta_description "${all_fasta_source.fields.name}" --data_table_name "fasta_indexes"</command>
<command detect_errors="exit_code">python '$__tool_directory__/data_manager_sam_fasta_index_builder.py' '${out_file}' --fasta_filename '${all_fasta_source.fields.path}' --fasta_dbkey '${all_fasta_source.fields.dbkey}' --fasta_description '${all_fasta_source.fields.name}' --data_table_name fasta_indexes</command>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

multiple lines

@@ -3,14 +3,13 @@
<requirement type="package" version="324">ucsc-fatotwobit</requirement>
</requirements>
<description>builder</description>
<command interpreter="python">twobit_builder.py "${out_file}" --fasta_filename "${all_fasta_source.fields.path}" --fasta_dbkey "${all_fasta_source.fields.dbkey}" --fasta_description "${all_fasta_source.fields.name}"</command>
<command detect_errors="exit_code">python '$__tool_directory__/twobit_builder.py' '${out_file}' --fasta_filename '${all_fasta_source.fields.path}' --fasta_dbkey '${all_fasta_source.fields.dbkey}' --fasta_description '${all_fasta_source.fields.name}'</command>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

multiple lines

by checking first whether a file is tar compressed.
Separate download methods from methods that add to data tables, so that
the download methods can be tested in isolation.
@mvdbeek
Copy link
Member

mvdbeek commented Apr 2, 2017

160dd11 should fix the problem @bgruening reported in galaxyproject/tools-devteam#458 and also introduces some doctests at least for the download methods.

@bgruening
Copy link
Member

Pretty cool, thanks for fixing! Any explanation why this broke? Did UCSC changed something?

@mvdbeek
Copy link
Member

mvdbeek commented Apr 3, 2017

tar.gz/tar.bz2 downloads had been broken since I added this functionality (How did I not notice that? I even added test-data to test it ...). I think UCSC previously used zips, but I'm not sure about that.

- flake8 and fix Python3 compatibility
- Single-quote data and text parameters in <command>
- Reorder attributes per IUC coding style.
@bgruening
Copy link
Member

@martenson can you grant IUC permissions on all these repositories.
@galaxyproject/iuc I excluded data_managers from automatic upload, because we can not test them. But on the other hand no one will test them if they are on master, people will test them on this branch, before merging. So do you think we should enable automatic pushed for DM as well?

@nsoranzo
Copy link
Member

nsoranzo commented Apr 3, 2017

So do you think we should enable automatic pushed for DM as well?

Yes, and for packages too (even if there are very few of them being added/updated now).

@mvdbeek
Copy link
Member

mvdbeek commented Apr 3, 2017

Yes, I think we should push automatically. The master branch should always be the latest and greatest, and I think it would be super annoying to fix bugs if on top of a potential bug there is the uncertainty of whether we may have already fixed that in the master branch.

@bgruening
Copy link
Member

Thanks @martenson, @nsoranzo, @mvdbeek! I'm ready for merging - anything else I miss?

@nsoranzo
Copy link
Member

nsoranzo commented Apr 3, 2017

As mentioned above, I've removed the deprecated data_manager_bwa_index_builder, data_manager_fetch_genome_all_fasta and data_manager_hisat_index_builder from this PR, they can safely remain in tools-devteam's legacy/ directory.

Unfortunately Travis has issues today, so the build logs seem empty, I've restarted one job.

@bgruening
Copy link
Member

Looks good. @martenson feel free to merge as soon as you have granted the permissions to IUC.
Thanks.

@martenson
Copy link
Member Author

All permissions granted. Should we deprecate the left devteam's DMs?

@martenson
Copy link
Member Author

Thanks all for help!

@martenson martenson merged commit 8652f36 into galaxyproject:master Apr 4, 2017
@martenson martenson deleted the migrate-in-dms branch April 4, 2017 17:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants