Create test sets for all languages. #158
Comments
Yes, or we incorporate the code here completely.
Ah, well we'd have to update the notebooks as well, as they point directly to the forked version.
Some languages, e.g.
test_letter_a_new.zip
Got to
Oh, it's "British Sign Language". What the heck? https://en.wikipedia.org/wiki/British_Sign_Language
test_ba_thru_btg_new.zip
Oh yeah, maybe we should do a blacklist for all the language codes that have issues according to the tables in the appendix of https://arxiv.org/abs/2103.12028.
Following #157, check which languages are not covered in https://github.com/juliakreutzer/masakhane/tree/master/jw300_utils/test, and create custom test sets for those. @juliakreutzer I think I can give this a go, but do I need to open a pull request against... your forked version of masakhane-mt?
Alternate language code list, looks the same: https://opus.nlpl.eu/opusapi/?languages=True&corpus=JW300
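For what it's worth, the "which languages still need a test set" check could be sketched roughly like this. It is only a sketch under assumptions: the function name `uncovered_languages` is made up for illustration, the language codes are placeholders (not real JW300 codes), and it assumes the full code list, the already-covered codes, and the blacklist have each been collected by hand or by some other script.

```python
def uncovered_languages(all_codes, covered_codes, blacklist=()):
    """Return the sorted codes that are in the corpus list, have no test set yet,
    and are not on the blacklist of problematic codes."""
    return sorted(set(all_codes) - set(covered_codes) - set(blacklist))


# Placeholder inputs, purely illustrative (not real coverage data).
all_codes = ["lang_a", "lang_b", "lang_c", "lang_d"]  # e.g. from the corpus language list
covered_codes = ["lang_a"]                            # codes that already have a test set
blacklist = ["lang_c"]                                # codes flagged as having issues

missing = uncovered_languages(all_codes, covered_codes, blacklist)
print(missing)  # ['lang_b', 'lang_d']
```

Using plain sets keeps the comparison independent of how the three lists are obtained, so the blacklist can grow without touching the rest.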