-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Automatically choose preposition-article contraction in Portuguese #294
base: master
Are you sure you want to change the base?
Conversation
@@ -110,6 +114,8 @@ var abbreviations = { | |||
'hu': abbreviationsHu, | |||
'lt': abbreviationsLt, | |||
'nl': abbreviationsNl, | |||
'pt-BR': abbreviationsPt, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And maybe also item for "generic" Portuguese 'pt': abbreviationsPt,
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR copies the abbreviations object to two keys for consistency with the translation files, but the client is expected to perform some sort of locale matching. Even if we were to set pt
, the environment’s locale may be something else like pt-AO
or even pt-US
. There are plenty of libraries that can perform locale matching, such as locale-utils.
I’m not necessarily opposed to setting the language-only locale, but I think we’d want to do so consistently for all languages and resource types, and I’m not sure that would be feasible for a situation like zh
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I don't see the problem with this grammar use for all Portuguese dialects - grammar expressions will match only Portuguese street names even if pt-US
will be used in US with English names. Or there is a difference in Portuguese articles usage inpt-BR
and pt-US
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I think abbreviations and grammar rules should be available regardless of the specific Portuguese locale in use, but it should be up to the client to choose a country to default to. For now the two locales share the same abbreviations and grammar rules, but that wouldn’t necessarily be the case in the future for all languages, so I’d be hesitant to create an expectation that clients can look up grammars without performing locale matching first, which they have to do when getting a translated instruction.
If it’s a major inconvenience for clients to perform locale matching themselves, then we could have this library depend on locale-utils, but there are larger libraries with more robust locale matching and I wouldn’t want to force clients to use the more rudimentary logic in locale-utils.
5971e65
to
b641a6b
Compare
Issue
In #283 (review), @xendez and @jppcel contributed new translations for Portuguese and pointed out that certain names would need a different preposition-article contraction depending on the grammatical gender of the name. This PR adapts the French grammatical system that @yuryleb implemented in #252.
I came up with the list of rules and tests by querying OpenStreetMap for
name
tags onhighway
ways in Lisbon and Rio de Janeiro. (I’m not sure if those two cities are representative of road type designations elsewhere in Lusophone countries, but we can always manually add more road type designations.) I isolated the road type designations but stripping all but first word of each multiword name, then removing duplicates, given names, and acronyms. Finally, I looked up the grammatical gender of each word in the English and Portuguese Wiktionaries and identified a limited set of lexical patterns. (Hopefully my limited Spanish didn’t bias the final rules in any way.)Because I started with road names, these rules are somewhat unlikely to have great results against the place names that one would see in the
destination
andwaypoint_name
variables. But again it’s just a start.A starter list of abbreviations has also been added based on some common abbreviations I found on street names in Lisbon and Rio de Janeiro.
Before merging, it’d be great to get feedback on the following:
destination
,junction_name
,way_name
, orwaypoint_name
?Tasklist
Requirements / Relations
Depends on #283.
/cc @danpaz