Any chance of hosting the VoxCeleb datasets? #34
Comments
wow -- https://www.robots.ox.ac.uk/~vgg/data is an awesome collection of datasets. I think it would make sense to collect them under http://datasets.datalad.org/?dir=/labs/vgg (or maybe even straight on the top level?). Some are an easy job for the crawler. Running now
to see what happens for the voxconverse one... You can see the result at https://github.com/yarikoptic/demo-vgg-voxconverse (I am not redistributing any data files there, so to
VoxCeleb is trickier because of all the split archives. Our crawler can fetch them, but then we would really need to redistribute the extracted files after manually "cat"ing the parts together (attn @joonson). Overall, the chance exists, but it needs thinking and time investment to make it happen. Interested in joining the effort? ;-)
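For reference, the manual "cat" step mentioned above could be sketched roughly as below. This is only a minimal illustration, not what the crawler does: the function name `join_parts` and the file names in the usage note are hypothetical, and it assumes the parts are plain byte-wise splits whose sorted names give the correct order.

```python
import shutil
from pathlib import Path


def join_parts(part_paths, output_path):
    """Concatenate split archive parts, in the order given, into one file.

    Hypothetical helper: assumes the parts are plain byte-wise splits.
    """
    output_path = Path(output_path)
    with output_path.open("wb") as out:
        for part in part_paths:
            with Path(part).open("rb") as src:
                # Stream in chunks so a large part never has to fit in memory.
                shutil.copyfileobj(src, out)
    return output_path
```

Usage would look something like `join_parts(sorted(Path("downloads").glob("archive_part*")), "archive.zip")`, where both the directory and the part naming pattern are placeholders. The joined archive would still have to be extracted before individual files could be offered for download.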
It would be awesome to be able to download a certain number of files instead of the giant archives.
I do get the incentive and it should be possible
The downloads from their site are very slow, the mirrors do not always work, and their Google Drive link is dead.
https://www.robots.ox.ac.uk/~vgg/data/voxceleb/