Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core: Scanning only splits up the layer #1375

Open
wants to merge 4 commits into
base: develop
Choose a base branch
from

Conversation

ikelos
Copy link
Member

@ikelos ikelos commented Nov 30, 2024

Scanning had been rigged to chop up a layer's readable bits into the chunks that make it up from the lower layer. Unfortunately it did this on chunks in the lower layer, rather than chunks readable in the higher layer. This meant chunks in Intel were already read in page sized blocks. This vastly simplifies the reading of data and brings non-linear layer searching in sync with linear layer searching (because there's no good reason they should be different).

This doesn't directly help with the yarascanning issue that spawned its investigation, but it will vastly simplify things. The big question is whether it speeds things up or slows them down...

Since this is core functionality I'd like all eyes on it before it lands please...

Scanning had been rigged to chop up a layer's readable bits into the
chunks that make it up from the lower layer.  Unfortunately it did this
on chunks in the lower layer, rather than chunks readable in the higher
layer.  This meant chunks in Intel were already read in page sized
blocks.  This vastly simplifies the reading of data and brings
non-linear layer searching in sync with linear layer searching (because
there's no good reason they should be different).
@ikelos ikelos force-pushed the feature/scan-one-layer-deep branch from df7b8a4 to 90c048f Compare November 30, 2024 14:43
Copy link
Contributor

@gcmoreira gcmoreira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll take a closer look later, but from a quick glance, it seems like this new implementation will always yield a list with just one tuple, which doesn't seem ideal. Is there any chance we can improve that as well? Also, I think these changes will also require updates to the docstring.

@ikelos
Copy link
Member Author

ikelos commented Dec 1, 2024

Probably it will, but I think that's ok. It allows for the gathering of data from disparate places, which might be useful one day for a different type of layer. I think that going back and ripping out how it used to work may not provide much benefit, and I'd rather do it when we do major renovations on the scanning system instead. This might not be a very useful PR, everything works as is, I don't think it provides significant speed-up (particularly since it'll need to do re-lookups again) but it's also not detrimental and should make the scanning easier to understand (even though it took me 3 attempts to get it all rigth). Happy to leave this sit here, but a useful exercise in case we ever hit similar situations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants