Languages other than English #3
Comments
Sure. So ExplainToMe currently does 3 things.
Currently #1 and #2 do not care about language; they mostly deal with HTML and webpage metadata. #3 cares about language, but mostly dealing with Most likely start by supporting those languages. I am interested in supporting non-Romance languages, but we'll see how far we get.
Cool. I take it you only use sumy as the summarisation platform? It seems to support Czech, French, German, Portuguese, Slovak, and Spanish out-of-the-box (the stop words for these languages are included in the package).
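To illustrate why the summarisation step is the language-sensitive one, here is a minimal sketch of a frequency-based extractive summariser that swaps its stop-word list per language. It is not sumy's implementation or ExplainToMe's code: the `STOP_WORDS` table, `split_sentences`, and `summarize` are hypothetical names, and the stop-word lists are deliberately tiny placeholders.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word lists for illustration only;
# real lists (such as the ones bundled with a summarisation library) are much larger.
STOP_WORDS = {
    "english": {"the", "a", "is", "of", "and", "in", "to", "it"},
    "portuguese": {"o", "a", "os", "as", "é", "de", "e", "em", "um", "uma", "do", "da"},
}


def split_sentences(text):
    """Naive sentence splitter: break on whitespace that follows ., ! or ?."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, language, sentence_count=1):
    """Return the highest-scoring sentences, kept in their original order.

    A sentence's score is the sum of the document-wide frequencies of its
    non-stop-words, so longer sentences tend to score higher.
    """
    stop_words = STOP_WORDS[language]
    sentences = split_sentences(text)
    words_per_sentence = [
        [w for w in re.findall(r"\w+", s.lower()) if w not in stop_words]
        for s in sentences
    ]
    freq = Counter(w for words in words_per_sentence for w in words)
    scores = [sum(freq[w] for w in words) for words in words_per_sentence]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen = sorted(ranked[:sentence_count])
    return [sentences[i] for i in chosen]


if __name__ == "__main__":
    sample = "O gato dorme no sofá. O gato come peixe. A chuva cai."
    print(summarize(sample, "portuguese", sentence_count=1))
```

Without the right stop-word list, very common function words dominate the frequency counts, which is one reason out-of-the-box language support matters for this step.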
Correct. Sumy provides the right framework for building a document summarizer, and it implements the most popular techniques. My main concern about adding more languages is that I can't really attest to their accuracy in an intuitive way. My experience with cross-language NLP is that techniques vary in effectiveness based on latent cultural features.
I'd love to help with Portuguese (Brazilian Portuguese).
Awesome. Where I would start looking is under
Heads up: I'm making some changes that will be pushed upstream maybe this week or next. It shouldn't affect any code in The code however does move a lot of files around. Mostly I've split the application into a Flask server that only displays the webpage and a summarization backend that runs asynchronously on AWS Lambda. I've mostly been running the public Heroku server as a demo, but it's getting costly to maintain, even if it's not that much every month.
Hi there, thanks for the cool project. The bottom of the README says the support for other languages is a thing to look forward to -- could you elaborate on it a bit? Any particular plans? Let me know if you're looking for contributors that could handle different languages.