Fix directory targets with empty subdirs #11226

Open
wants to merge 9 commits into
base: main
from

ElectreAAS
Collaborator

@ElectreAAS ElectreAAS commented Dec 18, 2024

Incorporates and replaces #11213, replaces #10931 and #11203, fixes #10609, #11117, #11214.
Work mostly done by myself, with various helpful insights by @maiste and @art-w.

The core of the changes is in dune_targets.ml, where I changed the representation of targets (Targets.Produced.t) from a flat structure to a hierarchical one, mirroring the file system. This makes all traversals easier to reason about and prevents problems such as hitting dirA/dirB/fileB before dirA/fileA, in addition to the intended feature of actually processing directories properly.
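
To illustrate the idea for readers, here is a minimal sketch of a hierarchical target tree. It is a simplified model and not the actual definition of Targets.Produced.t: the names `String_map`, `dir`, `add_file`, `add_dir` and `paths` are all hypothetical, and paths are plain lists of components rather than Dune paths.

```ocaml
(* Hypothetical, simplified model; NOT Dune's actual Targets.Produced.t. *)
module String_map = Map.Make (String)

(* A directory holds its own files and its subdirectories separately,
   so an empty directory is representable as a node with no entries. *)
type 'a dir =
  { files : 'a String_map.t
  ; subdirs : 'a dir String_map.t
  }

let empty = { files = String_map.empty; subdirs = String_map.empty }

(* Look up a subdirectory, defaulting to an empty one if it is absent. *)
let find_sub name dir =
  Option.value (String_map.find_opt name dir.subdirs) ~default:empty

(* Record a directory (possibly empty), creating parents as needed. *)
let rec add_dir path dir =
  match path with
  | [] -> dir
  | name :: rest ->
    let sub = add_dir rest (find_sub name dir) in
    { dir with subdirs = String_map.add name sub dir.subdirs }

(* Record a file with some payload (e.g. a digest), creating parents. *)
let rec add_file path payload dir =
  match path with
  | [] -> invalid_arg "add_file: empty path"
  | [ name ] -> { dir with files = String_map.add name payload dir.files }
  | name :: rest ->
    let sub = add_file rest payload (find_sub name dir) in
    { dir with subdirs = String_map.add name sub dir.subdirs }

(* List every path: a directory first, then its files, then its
   subdirectories. Directories are marked with a trailing slash. *)
let paths root =
  let rec go prefix dir acc =
    let join name = if prefix = "" then name else prefix ^ "/" ^ name in
    let acc =
      String_map.fold (fun name _ acc -> join name :: acc) dir.files acc
    in
    String_map.fold
      (fun name sub acc -> go (join name) sub ((join name ^ "/") :: acc))
      dir.subdirs
      acc
  in
  List.rev (go "" root [])
```

In this sketch the files of a directory are always listed before anything in its subdirectories, so dirA/fileA comes before dirA/dirB/fileB, and an empty subdirectory still shows up in the traversal.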

Other noteworthy changes are:

  • Local.Target.create now also creates directories, not just files
  • Build_system.file_exists now also checks for directories, not just files
  • Dune_cache_storage.Artifacts.Metadata_entry.{to,of}_sexp have slightly changed semantics: I added the possibility to have no digest but instead a simple "<dir>" to indicate a directory. I wouldn't be surprised to hear this has far-reaching ramifications. Same concern for Targets.Produced.{digest,to_dyn}.
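
As a rough illustration of the last point, the sketch below shows one way an entry with an optional digest could round-trip through an s-expression using a "<dir>" marker. Everything here is an assumption made for the example: the `sexp` type, the `entry` record, its field names and the error message are hypothetical and do not reproduce the real Metadata_entry code. It also assumes that no real digest can be the literal string "<dir>".

```ocaml
(* Hypothetical stand-in for an s-expression type. *)
type sexp =
  | Atom of string
  | List of sexp list

(* Hypothetical simplified entry; NOT Dune's actual Metadata_entry.t.
   [digest = None] means the entry is a directory. *)
type entry =
  { path : string
  ; digest : string option
  }

(* Marker written in place of a digest for directories. *)
let dir_marker = "<dir>"

(* Files serialize their digest; directories serialize the marker. *)
let to_sexp { path; digest } =
  List [ Atom path; Atom (Option.value digest ~default:dir_marker) ]

(* Reading back: the marker becomes [None], anything else is a digest. *)
let of_sexp = function
  | List [ Atom path; Atom d ] ->
    Ok { path; digest = (if String.equal d dir_marker then None else Some d) }
  | _ -> Error "expected (<path> <digest-or-marker>)"
```

The design trade-off in such a scheme is that readers of older metadata never see the marker, while older readers given new metadata would try to interpret "<dir>" as a digest, which is presumably the kind of ramification the bullet above worries about.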

I'll add for the record that a lot of functions named do_something_file also work on directories, but not all, and that is rather confusing. If I'm not the only one with this sentiment I could do a refactor to make this clear everywhere.

@ElectreAAS ElectreAAS changed the title WIP: directory targets with empty subdirs WIP: fix directory targets with empty subdirs Dec 18, 2024
@ElectreAAS
Collaborator Author

ElectreAAS commented Dec 21, 2024

Below are the initial text of the PR and a comment I made after working on this for a week. They are now outdated; I'm just keeping them here for posterity.

This PR is related to... a lot of things actually.

For some context, let's take this test file from #11116 & #11117:
Why isn't the empty directory output/child restored from cache like output/file?

I'm not really sure, and the original author of this piece of code (rleshchinskiy) doesn't seem very active in this project at this time.
What I am sure of is that breaking the logic of collect is easy to do accidentally, but that at least fixes one problem.

Debugging this is tricky, as a lot of tests fail with a cryptic

Error: Is a directory
-> required by _build/default/a
[1]

I'll note, however, that the problem likely originates in either #9407, #9470, or #9535.


Belated update before the holidays:
I changed the internal representation of targets (Targets.Produced.t) to be hierarchical - it felt like the right move to detect broken logic, and was mentioned in the comments as something to be done at some point.
From that change I saw that the 'empty-dir' test passed, but a whole lot of other tests didn't. I thought I could work backwards from that to a point where all would pass, but the explanation was harsh: I had broken the caching mechanism, and the only reason the 'empty-dir' test passed was because nothing was cached and everything was rerun 🤦.
When I finally fixed everything else, obviously the 'empty-dir' test didn't pass (the cache worked like it did before my changes, wow!).
At this point in time I have pushed my changes, and they don't break anything, but also don't fix anything...
My hunch now is that this traversal only stores files as metadata, which means that this entries list only contains metadata about files, which means they're the only things getting restored. I tried making a of_files_plus_dirs to be used at that call site, but couldn't get it to work so far. Also I don't know how to generate digests of empty directories to actually store them in metadata...

Hope this progress report isn't too arcane for reviewers 🙂

@maiste maiste added the shared-cache Shared artefacts cache label Dec 23, 2024
@ElectreAAS ElectreAAS force-pushed the cache-empty-dir branch 2 times, most recently from ed32398 to c181da2 Compare December 24, 2024 14:40
@ElectreAAS ElectreAAS force-pushed the cache-empty-dir branch 7 times, most recently from f720346 to 9852aad Compare January 10, 2025 15:59
@ElectreAAS ElectreAAS marked this pull request as ready for review January 10, 2025 16:05
@ElectreAAS ElectreAAS changed the title WIP: fix directory targets with empty subdirs Fix directory targets with empty subdirs Jan 10, 2025
Signed-off-by: Ambre Austen Suhamy <[email protected]>
Labels
bug engine shared-cache Shared artefacts cache
Development

Successfully merging this pull request may close these issues.

[regression] promotion of directory targets has trouble with directories starting in @
2 participants