Select position range in BigWig #61

Open
multimeric opened this issue Nov 13, 2024 · 3 comments
Comments

@multimeric

I'm hoping to "cut down" a BigWig by selecting only a subset of its chromosomes. It would be nice if there were a bigtools bigSelect --chroms chr1,chr2 that did so, and/or a bigtools bigSelect --bed regions.bed for more advanced use cases.

Of course this can be done using the Rust/Python API, but I'm after a more user-friendly solution I can suggest to others who want to do this.

@jackh726
Owner

Yeah, this certainly would be helpful! There are the chrom/start/end options available in bigwigtobedgraph, but that isn't quite as powerful as what you propose. There are also bigtools intersect and bigtools chromintersect, which are mostly undocumented and not in good shape, where I previously did a little work in this area. (But they are far from what you want!)

It's pretty trivial to have a tool that does a read -> filter -> write, but it would be better to be able to efficiently copy over entire blocks of the file and just reindex as needed. For chromosomes this is really easy, but for specific regions it's a bit more difficult, since you have to decide whether you want to just copy over unfiltered blocks at the expense of some blocks being smaller, or whether you want to maintain the fact that almost all blocks are full.
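For illustration, the filter step of a read -> filter -> write pipeline might look roughly like the sketch below. It is plain Python over bedGraph-style records and does not use the bigtools API; the `Interval` type and the `filter_intervals` function are hypothetical names, and the regions are assumed to be non-overlapping, half-open (start, end) ranges.

```python
from typing import Iterable, Iterator, NamedTuple, Tuple


class Interval(NamedTuple):
    """Hypothetical bedGraph-style record: half-open [start, end) with a value."""
    chrom: str
    start: int
    end: int
    value: float


Region = Tuple[str, int, int]  # (chrom, start, end), half-open


def filter_intervals(
    records: Iterable[Interval], regions: Iterable[Region]
) -> Iterator[Interval]:
    """Yield the part of each record that falls inside any requested region.

    Assumes the regions do not overlap each other; otherwise the same
    stretch of a record would be emitted more than once.
    """
    regions = list(regions)
    for rec in records:
        for chrom, r_start, r_end in regions:
            if rec.chrom != chrom:
                continue
            # Clip the record to the region boundaries.
            start = max(rec.start, r_start)
            end = min(rec.end, r_end)
            if start < end:
                yield Interval(rec.chrom, start, end, rec.value)


records = [
    Interval("chr1", 0, 100, 1.0),
    Interval("chr1", 100, 200, 2.0),
    Interval("chr2", 0, 50, 3.0),
]
selected = list(filter_intervals(records, [("chr1", 50, 150)]))
```

Selecting a whole chromosome is then just the special case of a region that spans the chromosome's full length. The read and write steps on either side would still decode and re-encode every value, which is the cost that copying whole blocks would avoid.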

I'm not sure that I'll get to this very quickly, but when I find some time, I'd be happy to take a stab at this.

@multimeric
Author

If you can specifically optimise selecting chromosomes, then I think it's worth making a separate subcommand for that. The BED file selection is more complex and not actually as important for my use case.
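To make the difference between the two cases concrete, here is a rough sketch of how a tool could decide what to do with each data block. It is not how bigtools is implemented: the `Block` type, `plan_block`, and `plan_chrom_block` are hypothetical names, and the sketch assumes each block covers a single chromosome and a known half-open (start, end) range.

```python
from typing import Iterable, NamedTuple, Set, Tuple


class Block(NamedTuple):
    """Hypothetical extent of one data block: a single chromosome, half-open range."""
    chrom: str
    start: int
    end: int


Region = Tuple[str, int, int]  # (chrom, start, end), half-open


def plan_block(block: Block, regions: Iterable[Region]) -> str:
    """Return 'copy', 'drop', or 'reencode' for a block under region selection.

    A block that lies entirely inside one region can be copied verbatim.
    A block that only partly overlaps must be decoded, filtered, and written
    again. A block covered only by the union of several regions is treated
    as 'reencode', which is conservative but safe.
    """
    overlaps = False
    for chrom, r_start, r_end in regions:
        if block.chrom != chrom:
            continue
        if r_start <= block.start and block.end <= r_end:
            return "copy"
        if block.start < r_end and r_start < block.end:
            overlaps = True
    return "reencode" if overlaps else "drop"


def plan_chrom_block(block: Block, chroms: Set[str]) -> str:
    """Chromosome selection never splits a block: keep it whole or drop it."""
    return "copy" if block.chrom in chroms else "drop"
```

Under these assumptions, chromosome selection only ever produces "copy" or "drop", so the remaining work is rebuilding the index. Region selection adds the "reencode" case, which is where the choice between smaller blocks and repacked full blocks comes up.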

@ghuls
Contributor

ghuls commented Dec 12, 2024

> It's pretty trivial to have a tool that does a read -> filter -> write, but it would be better to be able to efficiently copy over entire blocks of the file and just reindex as needed. For chromosomes this is really easy, but for specific regions it's a bit more difficult, since you have to decide whether you want to just copy over unfiltered blocks at the expense of some blocks being smaller, or whether you want to maintain the fact that almost all blocks are full.

Copying whole blocks would also be useful for:
#52


3 participants