-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Missing decoration items with UCSC genome browser #48
Comments
working on this, should have a solution soon |
Sorry to be a bother but have you had a chance to look at this? |
Sorry for the delay - just published bigtools 0.5.1 which should fix the fieldCount and definedFieldCount when an autosql is passed. |
Awesome, it's working for me. I'll approve the bioconda PR once it passes tests :) |
Hi @jackh726, Found an issue with the new release that I think is requiring the autosql argument to be passed everytime. I have a standard three column bedfile and when I try:
I get
But if I add the SQL argument, it works:
And here
I am also noticing that I have to filter out lines starting with the BED comment character ( For now, I can add the Cheers, |
v0.5.2 should at least not error when not passing an autosql (this only happened when passing the bed via stdin) I still need to add the following:
|
I have not had time to put together a test case to share yet, but now it seems to me that the
? |
Ah no, that should work but I made a mistake. Will update with fix either later today or tomorrow. |
Great thank you! Sorry for all the questions/comments, but I think I am also noticing that setting the zoom level on the command line is not respected when I check after the file is created with bigbedinfo. |
No worries - the extra use/testing is helpful. I'll take a look at that too |
Awesome thanks! I was playing with the zoom level because I thought it might fix an issue I am seeing where some data will disappear as I zoom in when using decorator tracks. |
Ah that's weird, if you can get a reproduction, I can look into it. |
Thanks, I made a test case here that reproduces the issue in this region of hg38 chrX:47191183-47195924, and uploaded it here: (and my commands are in |
Just wondering if you got a chance to look at this? Thanks in advance! |
Sorry, this fell off my radar. I'm looking at this now - will let you know. |
So, I ran out of time looking at this for now, but here are my general thoughts: It doesn't seem like this is an issue with the data that is written. I.e. it doesn't look like there's a bug that is causing some data to not be written to the final bigBed. Rather, I think this is more likely a quark fundamentally attributed to zooms. My hunch is that because you're writing the bigBed from stdin, the heuristics for calculating/including zoom levels just has a poor interaction here. And really, I think it all comes down to the decoration bigBed. I think likely the calculated zoom sizes are just too big. The easiest check here is just write the declaration bigBed with nzooms=0, which should force the genome browser to use the primary data. I have wanted to add an option to be a little bit more explicit about what zooms to write, so this might be the push to do that. I expect you would then want to manually set the initial zoom to be small. |
I'll try to get back to this maybe later today or this week and test out my theory about writing the decoration bigBed with no zooms (if you don't get to it). |
Thanks for starting to look into this! To help out a little I just tested creating from stdin and from a pre-made bed file, and both with a variety of zoom levels (including 0), and unfortunately none of the combinations made these items visible on UCSC. I extended the script and trackhub so that it also contains a ucsc test just to confirm that it works with the OG tooling: So I don't think it is something I can fix by forcing the zoom level. (reminder: I am using bigtools 0.5.1 since 0.5.2 since the |
Thanks, this is helpful. I'll keep looking into it, but I'm really not sure! |
Just being a squeaky wheel. Thanks in advance! |
Sorry! Gotten back to this a bit. Still looking, but it's definitely a zoom-related issue with the decoration track. Using the ucsc read bigBed with the bigtools decoration bigBed results in the same pattern that you see above (no zooms). Manually specifying 4 zooms (haven't yet added the ability to specify zoom sizes, so the default sizes are 40,160,640,2560) actually results in no decorated items showing. Next step here is for me to manually specify the zooms and see if that fixes it. As a side point, I'm noting that the |
Just making a note that I am still interested in this. I'd very much like to move this into our production pipeline since it is a fantastic speed up. Thanks again for the tool! |
Gah! Sorry! I keep forgetting to get back to this... really sorry about the slow progress here. |
So, have been investigating this a bit further. And, well, I'm perplexed. To recap: previously, making a hub with the ucsc data file and bigtools decoration file did not work, but ucsc decoration file and bigtools data file did work. I added the functionality to manually specify zooms (instead of bigtools calculating them, which is a bit different than ucsc), tried that and it didn't work. So, these none of 0 zooms, 4 zooms (different), or 4 matched zooms work.
I then looked into this, in case it's related. Which ended up being it's own rabbit hole. Essentially, rather than focusing on Along the way, I tried making a bigBed with a single zoom of 1 bp, and realized there was a bug in bigtools (essentially, when there are overlapping regions, I was not counting past the first region). I fixed this, but it still doesn't fix the decoration problem (which, is not that surprising given that no zooms also doesn't work). So...still working on it. (I have a testing branch here with ongoing fixes: https://github.com/jackh726/bigtools/tree/dec-testing2) |
Okay next weird thing - subsetting the input bed file where the fourth column == "m6A" seems to result in identical outputs... |
Thanks for starting to investigate! I will say the second thing doesn't surprise me that much. For example, depending on the size of the region I was making for the test case different fibers in different regions disappeared. Would it be helpful to ask UCSC anything? I know a person or two on the browser team. |
Yeah, at this point its probably worth asking the ucsc folks. I've tried "minimizing" the decoration file in different ways prior to running it through bigtools (subsetting to only m6A, removing FIRE lines, only including regions overlapping the target window) and each time i can't reproduce the problem in the genome browser. Subsetting the data file to one entry that is missing the decorations and one that isn't does reproduce the problem though. I might have some time to poke at this a bit more today. I'm really just perplexed because it doesn't seem like a file format issue (given that just subsetting the initial bed file has yet to reproduce the issue) and some of the decorations are correct even in the reproduction... |
I reached out the to creator of the decorator tracks and linked the issue and asked for any advise he might have. I do know he's has since started working on his masters degree so not sure he'll have time to look. Did I understand you correctly in that you were able to get a single read version that recreates the issue? If so I'd add it to the track hub if you could share the file. |
I did have one idea. There is an option in ucsc bedtobigbed that allows a 1bp overlap between blocks in bed12 files which is something I take advantage of with the m6a track specifically. Is that something that might hit some corner case in bigtools? |
My test hub is here: https://users.abdenlab.org/hueyj/bigtools-test/hub.txt re. 1bp overlap (#49?) I think that makes no difference - ucsc does some checks that needs that flag, and bigtools doesn't do those |
For my hub (as it is right now, might be down - we're having server troubles):
|
I was starting to dig through the kent genome browser code to see exactly how the decorator tracks are loaded/processed to see if there's something special happening that relies on exact bigBed layout or something, then we started having server trouble. |
Btw, I just published version 0.5.3, which should fix #48 (comment) and a few other things I found when testing this. It doesn't fix the main issue though, which I unfortunately still haven't found the cause of. |
Great, I will update to that version for the other fixes! One more corner case I have thought of for this data. These bed12 files do sometimes have a zero-length "block/exon". Do you think that could be causing an issue in bigtools? |
Nevermind, on the zero length. That is not something I am doing here, got confused with another tool I am working on |
-a
with bedtobigbed
Went ahead and renamed the issue to reflect the current state of things - figured there's enough background that even it's better to just keep it here than split it out (even though the original issue is solved) |
Hello,
I have found a discrepancy in the number of fields and some other metadata when using bigtools vs ucsc and it is having an impact on how things appear in the browser.
Basically, when I test with
bigbedinfo
I see differences in field number and other metadata columns (the correct number of fields is 17):I have uploaded all the data in my test to:
https://s3-us-west-2.amazonaws.com/stergachis-public1/index.html?prefix=Mitchell/temp/bigtools-test/
And here is the script I am running:
Any help with an obvious error on my part would be great!
Cheers,
Mitchell
The text was updated successfully, but these errors were encountered: