fix: re-pull Atlas translations for mounted edx-platform #1024

kdmccormick · 2024-03-21T21:53:22Z

This follows up on #993

Description

The new Atlas-based translations system stores important translations artifacts in edx-platform.

Unfortunately, when edx-platform is bind-mounted, these translations artifacts are overwritten. This means that edx-platform developers will see 404s on various i18n-related assets in Django-rendered LMS and CMS pages.

We work around this in the same way we do for static assets, egg_info, and node_modules: Just re-generate the translations artifacts as part of the dev init job, using mounted-directories.sh . This is a waste of time and bandwidth, but it's the system we have, for now at least.

Reviewer notes

@OmarIthawi , I've lifted the entire i18n block from the Dockerfile into mounted-directories.sh. That means that we will re-run every i18n command when a user is provisioning a bind-mounted edx-platform directory with tutor dev launch.

Does that look right? Do we need all of those lines, or can we take some out? Keep in mind that the i18n lines from the Dockerfile will have already been executed, so we only need to re-run whose artifacts are within bind-mounted directories, namely edx-platform.

I've also added rm -rf calls, because Atlas told me that I can't pull into an existing directory. Does that seem right?

Long-term notes

Heads up @regisb and @DawoudSheraz , this adds even more time and repetative downloads to tutor dev launch. I think we should move to a system where the init scripts copy egg_info, node_modules, assets, and i18n files from the Docker image instead of re-downloading and re-generating them. I have a version of this working on my experimental branch but I haven't had the time yet to polish it and make a PR.

The new Atlas-based translations system stores important translations artifacts in edx-platform. Unfortunately, when edx-platform is bind-mounted, these translations artifacts are overwritten. This means that edx-platform developers will see 404s on various i18n-related assets in Django-rendered LMS and CMS pages. We work around this in the same way we do for static assets, egg_info, and node_modules: Just re-generate the translations artifacts as part of the dev init job, via mounted-directories.sh. This is a waste of time and bandwidth, but it's the system we have, for now at least.

kdmccormick · 2024-03-21T22:22:22Z

Wait, I just realized that this fixes LMS pages, but not Studio pages. In order to fix Studio, we need to run this command in the cms container, not the lms container:

./manage.py cms compilejsi18n

The mounted-directories.sh script currently only runs in lms. I think we should have two scripts: mounted-directories-lms.sh and mounted-directories-cms.sh. The bulk of the script would remain in the former, but ./manage.py cms compilejsi18n would go in the latter.

DawoudSheraz · 2024-03-22T06:16:49Z

Thanks @kdmccormick . I agree that mounted-platform does need to perform redundant operations and if we can leverage the already completed operations in Dockerfile, that can certainly improve DevEx (or maybe skip processing in Dockerfile if platform is mounted?).

regisb · 2024-03-22T09:28:38Z

😭😭😭😭😭😭😭😭😭

regisb · 2024-03-22T09:33:51Z

This issue is very depressing. It would be great if we could work on a long-term solution; copying existing egg_info/node_modules/ assets/i18n from the Docker container would only partially resolve the situation, as we would need to update those based on the new values in edx-platform. I.e: if edx-platform modifies npm requirements, we should update the values in node_modules. I don't know how to do that in a way that is both reliable and time-efficient.

kdmccormick · 2024-03-22T12:36:55Z

@regisb right, I have struggled with this issue for a very long time 😅 Copying the artifacts out of the image is only the latest on a long list of things I've tried. I agree it is not ideal.

What do I think the perfect solution would be? Simply put, we would need to do 2 things:

move the artifacts outside of the repository
make the image a bit more cache-friendly

That way, any change to package.json, asset sources, or translation inputs would just trigger a rebuild of the relevant image layers, thus updating the artifacts. In other words, we should use Docker the way it's intended 🙂

(2) is actually quite doable, it's (1) that's been hard. Our tools just really want to write their artifacts into the repo. Give that, what I would try is:

in the Dockerfile: generate the /openedx/edx-platform artifacts, and copy them to /openedx/artifacts
in the bind-mounted repo: delete any existing artifacts, and replace them with symlinks to /openedx/artifacts

Worth trying?

brian-smith-tcril

Considering the context of

We work around this in the same way we do for static assets, egg_info, and node_modules

This approach seems reasonable to me.

I'm curious as to where else in tutor dev this may become an issue considering the other places we've needed to make tutor changes for FC-12

OmarIthawi

Thanks @kdmccormick!

😭😭😭😭😭😭😭😭😭

@regisb Yes, I don't think we've accounted for all and every use case for Tutor atlas pull. It's indeed difficult to say. There are parts of Tutor that I don't fully understand :/

For what's it worth, if we're going to add all those lines twice in Tutor, we should consider making use of the edx-platform make file.

The Tutor openedx Dockerfile should also be refactored imho.

OmarIthawi · 2024-03-22T15:40:02Z

tutor/templates/jobs/init/mounted-directories.sh

+./manage.py lms pull_plugin_translations --verbose --repository='{{ ATLAS_REPOSITORY }}' --revision='{{ ATLAS_REVISION }}' {{ ATLAS_OPTIONS }}
+./manage.py lms pull_xblock_translations --repository='{{ ATLAS_REPOSITORY }}' --revision='{{ ATLAS_REVISION }}' {{ ATLAS_OPTIONS }}
+atlas pull --repository='{{ ATLAS_REPOSITORY }}' --revision='{{ ATLAS_REVISION }}' {{ ATLAS_OPTIONS }} \
+	translations/edx-platform/conf/locale:conf/locale \
+	translations/studio-frontend/src/i18n/messages:conf/plugins-locale/studio-frontend
+./manage.py lms compile_xblock_translations
+./manage.py cms compile_xblock_translations
+./manage.py lms compile_plugin_translations
+./manage.py lms compilemessages -v1
+./manage.py lms compilejsi18n
+./manage.py cms compilejsi18n


Suggested change

./manage.py lms pull_plugin_translations --verbose --repository='{{ ATLAS_REPOSITORY }}' --revision='{{ ATLAS_REVISION }}' {{ ATLAS_OPTIONS }}

./manage.py lms pull_xblock_translations --repository='{{ ATLAS_REPOSITORY }}' --revision='{{ ATLAS_REVISION }}' {{ ATLAS_OPTIONS }}

atlas pull --repository='{{ ATLAS_REPOSITORY }}' --revision='{{ ATLAS_REVISION }}' {{ ATLAS_OPTIONS }} \

translations/edx-platform/conf/locale:conf/locale \

translations/studio-frontend/src/i18n/messages:conf/plugins-locale/studio-frontend

./manage.py lms compile_xblock_translations

./manage.py cms compile_xblock_translations

./manage.py lms compile_plugin_translations

./manage.py lms compilemessages -v1

./manage.py lms compilejsi18n

./manage.py cms compilejsi18n

make ATLAS_OPTIONS="--repository={{ ATLAS_REPOSITORY }} --revision={{ ATLAS_REVISION }} {{ ATLAS_OPTIONS }}" pull_translations

DawoudSheraz · 2024-03-22T16:53:56Z

@regisb right, I have struggled with this issue for a very long time 😅 Copying the artifacts out of the image is only the latest on a long list of things I've tried. I agree it is not ideal.

What do I think the perfect solution would be? Simply put, we would need to do 2 things:

move the artifacts outside of the repository

make the image a bit more cache-friendly

That way, any change to package.json, asset sources, or translation inputs would just trigger a rebuild of the relevant image layers, thus updating the artifacts. In other words, we should use Docker the way it's intended 🙂

(2) is actually quite doable, it's (1) that's been hard. Our tools just really want to write their artifacts into the repo. Give that, what I would try is:

in the Dockerfile: generate the /openedx/edx-platform artifacts, and copy them to /openedx/artifacts

in the bind-mounted repo: delete any existing artifacts, and replace them with symlinks to /openedx/artifacts

Worth trying?

Some thoughts

in the Dockerfile: generate the /openedx/edx-platform artifacts, and copy them to /openedx/artifacts

These will be generated against mounted platform, I suppose (?). Does this mean we might be able to get rid of mounted-directories.sh, or atleast platform part of it?

One small thing that I faced recently, echoing similar to what Régis mentioned above "we would need to update those based on the new values in edx-platform. I.e: if edx-platform modifies npm requirements, we should update the values in node_modules.". I mounted the edx-platform master branch (not update to latest master) and ran tutor on nightly. The image build fine but during init task, npm install failed due to connect error. When I attempted launch again, only the starting parts of image build were used from cached. All the steps involving edx-platform requirements were re-executed. I understand once image was built but failed during init, tutor dev do init and tutor dev start -d would have been worked. But given the fact that many layers in image were re-built because of init failing midway but doing enough to change edx-platform that caused Image to rebuild parts of it, making the process cache friendly can be weirdly challenging.

kdmccormick · 2024-03-22T17:11:41Z

But given the fact that many layers in image were re-built because of init failing midway but doing enough to change edx-platform that caused Image to rebuild parts of it, making the process cache friendly can be weirdly challenging.

Yes, this exactly our caching problem right now--we rebuild static assets and various other stages if anything in edx-platform changes!

It does not have to be like that. In kdmccormick#34, I have made it so that assets are only rebuilt if asset sources change, python reqs are only reinstalled if requirements change, etc.

OmarIthawi · 2024-03-23T03:56:51Z

@regisb @kdmccormick I've thought about it and I think this pull request shouldn't be merged as is.

1- This bug is a problem in the edx-platform for not providing a fallback to gettext. It should provide a dummy fallback for both LMS and CMS so this pull request shouldn't be needed.

2- If this problem is related to both how building Docker image and mounting volumes works, it might make sense to change the settings of conf/locale directories to be in /openedx/artificats or somewhere else. Unlike node_modules, it's perfectly fine for conf/locale and conf/pluings-locale to be out of date, like one year out of date. So, unless the developer is particularly interested in previewing the latest translations, we shouldn't bother running atlas pull except in build time.

TL;DR; mounted-directories.sh shouldn't run atlas pull to compensate an edx-platform gettext bug.

OmarIthawi · 2024-03-23T05:27:14Z

@regisb @kdmccormick I think the proposed fix below makes this pull request redundant:

feat: fix missing gettext functions if translations aren't pulled in development environments openedx/edx-platform#34416

I recommend not merging this pull request because it's not a Tutor bug.

kdmccormick · 2024-03-26T19:27:28Z

Superseded by openedx/edx-platform#34416

brian-smith-tcril reviewed Mar 22, 2024

View reviewed changes

OmarIthawi reviewed Mar 22, 2024

View reviewed changes

OmarIthawi mentioned this pull request Mar 23, 2024

feat: fix missing gettext functions if translations aren't pulled in development environments openedx/edx-platform#34416

Merged

4 tasks

kdmccormick force-pushed the kdmccormick/mounted-i18n branch from df11483 to d913781 Compare March 25, 2024 20:52

kdmccormick closed this Mar 26, 2024

kdmccormick deleted the kdmccormick/mounted-i18n branch March 26, 2024 19:27

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: re-pull Atlas translations for mounted edx-platform #1024

fix: re-pull Atlas translations for mounted edx-platform #1024

kdmccormick commented Mar 21, 2024 •

edited

Loading

kdmccormick commented Mar 21, 2024

DawoudSheraz commented Mar 22, 2024

regisb commented Mar 22, 2024

regisb commented Mar 22, 2024

kdmccormick commented Mar 22, 2024

brian-smith-tcril left a comment

OmarIthawi left a comment •

edited

Loading

OmarIthawi Mar 22, 2024

DawoudSheraz commented Mar 22, 2024

kdmccormick commented Mar 22, 2024

OmarIthawi commented Mar 23, 2024

OmarIthawi commented Mar 23, 2024 •

edited

Loading

kdmccormick commented Mar 26, 2024

fix: re-pull Atlas translations for mounted edx-platform #1024

fix: re-pull Atlas translations for mounted edx-platform #1024

Conversation

kdmccormick commented Mar 21, 2024 • edited Loading

Description

Reviewer notes

Long-term notes

kdmccormick commented Mar 21, 2024

DawoudSheraz commented Mar 22, 2024

regisb commented Mar 22, 2024

regisb commented Mar 22, 2024

kdmccormick commented Mar 22, 2024

brian-smith-tcril left a comment

Choose a reason for hiding this comment

OmarIthawi left a comment • edited Loading

Choose a reason for hiding this comment

OmarIthawi Mar 22, 2024

Choose a reason for hiding this comment

DawoudSheraz commented Mar 22, 2024

kdmccormick commented Mar 22, 2024

OmarIthawi commented Mar 23, 2024

OmarIthawi commented Mar 23, 2024 • edited Loading

kdmccormick commented Mar 26, 2024

kdmccormick commented Mar 21, 2024 •

edited

Loading

OmarIthawi left a comment •

edited

Loading

OmarIthawi commented Mar 23, 2024 •

edited

Loading