Change US-specific spellings #590

Open
wants to merge 3 commits into
base: master
from

tovmasharrison
Contributor

Description

I have changed US-specific spellings.

Checklist:

Collaborator

cefoo left a comment

Hi @tovmasharrison!

Thank you so much for your PR!

I'm not sure about some of the introduced changes; perhaps @bittlingmayer could give us his perspective.

I've left some comments below. Please let me know what you think. Thanks!

@@ -40,7 +40,7 @@ It was organised by **TAUS**.
| --- | --- |
| 13:00 - 17:00 | **Preconference session:** [**Quality estimation workshop**](https://www.taus.net/events/conferences/quality-estimation-workshop) <br>Amir Kamran, Anderson Vaz, Stephen Tyler, Adam Bittlingmayer, Artur Aleksanyan |
| 13:00 - 17:00 | **Preconference session:** [**Unleashing the power of advanced speech technology: A tutorial**](https://www.taus.net/events/conferences/unleashing-the-power-of-advanced-speech-technology-a-tutorial) <br>Sravya Popuri, Paco Guzman |
-| 13:00 - 18:00 | **Preconference session:** [**Introduction to GenAI in localization**](https://www.taus.net/events/conferences/genai-in-localization) <br>Konstantin Dranch |
+| 13:00 - 18:00 | **Preconference session:** [**Introduction to GenAI in localisation**](https://www.taus.net/events/conferences/genai-in-localisation) <br>Konstantin Dranch |
Collaborator

The updated link doesn't work.

Contributor Author

My bad

@@ -74,7 +74,7 @@ seo:
* **[MT Summit 2023](https://machinetranslate.org/mtsummit2023)** - 4 - 8 September - Macau 🇲🇴
* **[WAT 2023](https://machinetranslate.org/wat2023)** - 4 September - Macau 🇲🇴
* **[CoCo4MT 2023](https://machinetranslate.org/coco4mt-2)** - 4 - 5 September - Macau 🇲🇴
-* **Seattle area localization meetup** - 13 September - Seattle 🇺🇸
+* **Seattle area localisation meetup** - 13 September - Seattle 🇺🇸
Collaborator

I'm not sure about changing the event name to conform with UK spelling.

Contributor Author

In that case, shouldn't it be written with capital letters, as "Seattle Area Localization Meetup"? That way, it won't be detected.

Collaborator

Yes, we should write Seattle Localization User Group meetup

@@ -74,7 +74,7 @@ This pre-recorded course is designed for developers with intermediate knowledge

## Localization Institute

-The **Machine Translation Master Class** is offered by the Localization Institute and taught by Peng Wang.
+The **Machine Translation Master Class** is offered by the Localisation Institute and taught by Peng Wang.
Collaborator

I'm not sure about changing an organization name to conform with UK spelling.

Collaborator

Right, we should not

@@ -1,14 +1,14 @@
---
parent: Events
-title: Seattle Localization User Group meetup
-description: "Panel Discussion: Applications of Large Language Models (LLMs) in Localization"
+title: Seattle Localisation User Group meetup
Collaborator

I'm not sure about changing an event and an organization name to conform with UK spelling.

Collaborator

Ditto

@@ -14,11 +14,11 @@ BERTScore was invented as an improvement on [n-gram](/n-gram)-based metrics like
>
> [...] First, such methods often fail to robustly match paraphrases.
>
-> [...] Second, n-gram models fail to capture distant dependencies and penalize semantically-critical ordering changes.
+> [...] Second, n-gram models fail to capture distant dependencies and penalise semantically-critical ordering changes.
Collaborator

These are quotations taken directly from the paper. In this and several other cases throughout this PR, there are instances of literal quotations. I'm not sure about changing their spellings.

Collaborator

Right, we should not.

@@ -19,15 +19,15 @@ seo:
type: VirtualLocation
url: https://eamt.org

-organizer:
+eder:
Collaborator

typo

@tovmasharrison
Contributor Author

Hi @cefoo !

Unfortunately, the changes I made for words that were part of an organisation name were my mistake. The test for checking US-specific words does not detect words that are part of a name.

I will revert those changes. However, there are some cases, such as the word "penalize", that are not part of a name and will be detected. As you mentioned, maybe @bittlingmayer can give us his perspective.

@bittlingmayer
Collaborator

Well, I said it's a hard problem. :-)

Luckily, we only need to solve it within our very limited scope, and for now it's only a suggestion, and not running in CI.

I think even the following will deal with 99% of it:

  • List of patterns to ignore (e.g. lines that start with >, or words followed by <!-- lint-ignore-us-spelling -->)
  • List of names to ignore (e.g. Localization Institute)
  • List of pages to ignore (e.g. events.md)
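
The three ignore lists above could be sketched roughly as follows. This is only an illustration, not the project's actual test: the function name `find_us_spellings`, the `US_TO_UK` word list, and the constant names are all hypothetical, and the entries are taken from the examples in this thread.

```python
import re

# Hypothetical US -> UK word list; the project's real list is not shown in this thread.
US_TO_UK = {
    "localization": "localisation",
    "penalize": "penalise",
    "standardized": "standardised",
    "globalized": "globalised",
}

# Assumed names for the three ignore lists suggested above.
IGNORE_MARKER = "<!-- lint-ignore-us-spelling -->"
IGNORED_NAMES = ["Localization Institute", "Seattle Localization User Group"]
IGNORED_PAGES = {"events.md"}

WORD_RE = re.compile(r"[A-Za-z]+")


def find_us_spellings(page, text):
    """Return (line_number, word, suggestion) tuples for flagged words."""
    # Pages to ignore: skip the whole file.
    if page in IGNORED_PAGES:
        return []
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        # Pattern to ignore: blockquote lines are literal quotations.
        if line.lstrip().startswith(">"):
            continue
        # Names to ignore: blank them out so their words are never matched.
        for name in IGNORED_NAMES:
            line = line.replace(name, " " * len(name))
        for match in WORD_RE.finditer(line):
            word = match.group()
            suggestion = US_TO_UK.get(word.lower())
            if suggestion is None:
                continue
            # Pattern to ignore: word directly followed by the inline marker.
            if line[match.end():].lstrip().startswith(IGNORE_MARKER):
                continue
            findings.append((line_number, word, suggestion))
    return findings
```

With these assumptions, a line such as "The Localization Institute offers a localization course." would flag only the lowercase "localization", while blockquoted lines and words tagged with the marker are skipped.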

@bittlingmayer
Collaborator

/newsletter can also be ignored, I'd say.

@@ -74,7 +74,7 @@ seo:
* **[MT Summit 2023](https://machinetranslate.org/mtsummit2023)** - 4 - 8 September - Macau 🇲🇴
* **[WAT 2023](https://machinetranslate.org/wat2023)** - 4 September - Macau 🇲🇴
* **[CoCo4MT 2023](https://machinetranslate.org/coco4mt-2)** - 4 - 5 September - Macau 🇲🇴
-* **Seattle area localization meetup** - 13 September - Seattle 🇺🇸
+* **Seattle Area Localization Meetup** - 13 September - Seattle 🇺🇸
Collaborator

should really just be

Seattle Localization User Group meetup

@@ -104,13 +104,13 @@ There was not enough time to answer all of the audience questions during the mee
> >
> > As a reviewer, I push back when people label “ablated” datasets, that is, smaller versions of a larger dataset, as low-resource.
> >
-> > Real low-resource languages are noisier, include code-switching, have different scripts, non standardized orthography (that is, same word can be spelled differently in the same dataset).
+> > Real low-resource languages are noisier, include code-switching, have different scripts, non standardised orthography (that is, same word can be spelled differently in the same dataset).
Collaborator

We shouldn't touch anything in a blockquote

@@ -27,7 +27,7 @@ seo:

The online symposium was held by the Department of Translation and Interpeting Studies at Bar Ilan University.

-> Machine translation (MT) has had an increasing effect on multilingual communication and understanding in a globalized world.
+> Machine translation (MT) has had an increasing effect on multilingual communication and understanding in a globalised world.
Collaborator

We shouldn't touch anything in a blockquote

3 participants