-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change log generator misses some changes #107
Comments
This is an important bug to have fixed as it is a bit disrupting for the workflow we are envisaging for ALP and Minimal-VM. |
There shouldn't be any wrong entries, just missing ones. Since the composed image description may have many different source fragments from different source files, keg tracks from which source files and lines keys where imported that end up in the final dictionary and writes them to a file. The change log generator uses that information to generate the log (or logs specific to profiles in case of multi-build). Simply using the commit message of all involved files may include unrelated commits, hence the source tracking by keg. However, as stated, there are some cases that slip through, i.e. when a key value was truncated as in the above sample, or removed entirely. Keg cannot know from looking at the current state that there used to be a key in a previous state. You could work around this by replacing removed trailing lines in keys with empty lines instead of completely cutting them. Removed keys could be replaced by Fixing this for good is tricky. Essentially we would need to check out the previous version from the recipes repo, build the image dictionary with keg and dump it somehow, then generate the image dictionary of the current version and look for deletions and somehow figure out which commits removed them. This would make change log generation significantly more complicated than it already is. I'll try to look into this some time, but I don't see a quick solution. |
I've spent some time pondering a few approaches and I don't see a good path forward to dealing with those missed changes. To reiterate, the change log generator does not detect removed entries at the end of a key, e.g. the removal of the last package from a list of packages in a namespace, or the removal of a key. We could rebuild the previous image dictionary and find drop entries and keys, but finding the commits in the commit history that deleted them is tricky. You can ask git for commits matching a string or a regex in a file, but this could return false positives or only part of the relevant commit history, and as such couldn't be considered reliable. I don't think it's worth spending the effort to implement it. When the keg compose_kiwi_description service thinks a newly generated image description has no changes compared to the existing one even though it may have some is a bit of a nuisance to deal with manually, but once we have an automated way to refresh image descriptions in the build service this becomes a more serious issue. So I suggest the following approach: We generally try not to make changes in keg-recipes that would slip by the change log generator. That means when removing a list entry at the end of a key, we replace it with a newline or better a comment. That means _namespace_foo:
package:
- pkg1
- pkg2
- pkg_to_remove becomes _namespace_foo:
package:
- pkg1
- pkg2
# removed; comment for source tracking And instead of simply removing keys, they get set to Null: _namespace_foo: Null # removed; comment for source tracking This way the removed lines are still captured by the source tracker and the change log generator will find the appropriate commits. There may still be commits slipping through that don't adhere to this. To address this, I suggest to implement additional checks in the compose_kiwi_description service that compare the generated files with the existing versions in case no changes were detected, to avoid running into the situation where we miss a description refresh. @rjschwei What do you think? |
I am not a fan of such conventions. In #148 a different solution emerged :D |
#148 only makes sure that the image description is regenerated if there are any changes, but it still may miss change log entries as described above. Those conventions would be in place to prevent that. It's not ideal because it is easily overlooked, but I think it's the most pragmatic approach. |
I don't quite understand. We look for file changes to generate the image description. Once we have the files that actually change we can also know what changed, comprehending a |
It's trivial to figure out what has changed from comparing the generated to the existing files, but that doesn't give you the information which commits caused the changes for generating the change log. The change log generator operates on the source tracking information that keg logs. Keg keeps track of where each line in the dictionary it constructs originated from. With that information the change log generator can collect all commits that affected all source lines of a profile. But as said, in case a key was removed or truncated, there is nothing in the source log that would point to the commit that removed it, as it's simply not there. So unless that commit also changed another part of the profile, the generated change log will lack that commit. It's a bit of a corner case and can be avoided as described, and with the latest update |
Sorry for being dense about this, but I still do not understand. Let's construct a hypothetical example in the hopes it will help me along. Lets say I have a yaml file that contains
Form your original example and that was in the commit with sha 12345. This is used in the image generation. now in the next commit we end up with
so there's a new sha for this commit, say sha 12346. Since we already know that the file that contained this information has changed I can git diff the commits. In the diff
then the git diff will show me that What am I missing? |
Keg doesn't inspect differences between yaml files. To generate a change log for a given image description, or rather profile of an image description, as profiles are separate images in the end, keg keeps track of the files and line numbers each key in the image dictionary originated from and writes that to a log. The change log generator reads that information and queries git for all commits that affected each dictionary key since the last image description refresh. So if we have a file
it will ask git for commits that affected lines 1-3 in Keg could theoretically look at differences between the old and new image dictionary, and sift through the commit diffs to figure out which commits match the changes, but I think this would be tricky to do, possibly error prone, and IMHO not really worth the effort considering that the missed deletion situation is somewhat of a corner case, and can be avoided. |
I've bashed my head against this for a while and tried a few things. Diffing the image dictionaries is tricky, because there are keys within lists and comparing lists doesn't really work well. However, there might be another approach. Keg produces logs of all source lines used in the dictionary for each profile. The source logs could be recreated by compose_image_description (as we capture the commit ID that was used for the previous image description). The changelog generator could use that information to find lines that were included in the old image description but are absent in the new image description. 'git blame --reverse' can be used to find the commit when a given line was present last. From there it should be possible to work out the next commit, which naturally should be the one that removed the line, and include the commit message in the change log. It may be a bit cumbersome but I think it should work. What do you think @rjschwei ? |
Keg source tracking stores the source information of each key in the internal dictionary which can be used to generated a change log from the commit info. Not all changes are caught though. If a key was changed from e.g.:
to
the source info generated by keg for
some_key
would be line 1 to 3 for whatever filesome_key
was imported from. The only line that changed was line 4 though, so querying the git commit history for changes in line 1-3 would not return the commit, and hence no change log entry is generated for that change.The text was updated successfully, but these errors were encountered: