Enhancement: Introduce a Method to Automatically Identify the Physical Memory Layer #1351

Open
eve-mem opened this issue Nov 14, 2024 · 7 comments

Comments

@eve-mem
Contributor

eve-mem commented Nov 14, 2024

Description

Currently, in Volatility3, there is no automatic mechanism to identify which layer represents the 'physical layer' in a given memory image. While a few plugins attempt to infer the physical layer in roundabout ways (e.g., finding the intel layer and getting the next lowest), it would be good to standardize it.

A standardized method for determining the physical layer would improve plugin reliability and reduce redundancy in plugin-specific logic.

Motivation

A few plugins require knowledge of the physical layer for accurate memory analysis. The lack of a uniform mechanism to identify it leads to some repetitive code across plugins, and might lead to some inaccuracies if assumptions about the physical layer are incorrect. It would be great if there were a central way to do this in vol.

As support for more architectures and swap grows, identifying the 'physical layer' becomes increasingly important, and it's not as straightforward as it might initially appear.

Additional Context

This enhancement would help avoid future pitfalls of the current strategies used by some plugins and parts of the framework. For example:

(At least I think all of these examples could benefit from some central mechanism, happy to be shown I'm wrong..!)

Also affects this currently open PR: #1321

Thanks
🦊

@ikelos
Member

ikelos commented Nov 14, 2024

The core problem here is that we all have a concept of what a "physical layer" should be, and in general we can all come to an agreement about it, but that's not specific enough when it comes to certain possibilities, more specifically where a memory layer is actually made up of several different components (such as swap, or compressed memory regions).

Situations involving nesting (such as virtualization, where physical memory of the guest can live within the virtual memory of the host) can usually be dealt with by "one layer below the paging layer", and that tends to work; mostly that's how people have gotten past the situation. This again runs up against the problem of layers being a tree and not a simple one-on-one stack. How do you choose which parent you actually needed? Did they want the swap, or the RAM, or both in some weird accessible way? How should we stitch them together? This is why there is not, and has not been, work towards providing a single unified mechanism. Layers expose which sub-layers make them up (through the dependencies field), and specific, well-known layers (like intel) have named children (memory_layer). That's why those techniques are used: on the whole they work, but there are certain situations where they don't, which would then require massively hacky solutions to handle, and we'd be right back where we were...
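To make the gap concrete, here is a minimal sketch of the two lookups described above. It does not import Volatility 3: `StandInLayer`, `layer_below`, and the layer names `ram`, `swap0`, `swap1`, and `primary` are hypothetical stand-ins, and only the `config["memory_layer"]` entry and the `dependencies` list mirror the names mentioned in this thread.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StandInLayer:
    """Hypothetical stand-in for a layer; NOT the real Volatility 3 class."""

    name: str
    config: Dict[str, str] = field(default_factory=dict)
    dependencies: List[str] = field(default_factory=list)


# Assumed example: a paging layer backed by RAM plus two swap layers.
layers = {
    "ram": StandInLayer("ram"),
    "swap0": StandInLayer("swap0"),
    "swap1": StandInLayer("swap1"),
    "primary": StandInLayer(
        "primary",
        config={"memory_layer": "ram"},
        dependencies=["ram", "swap0", "swap1"],
    ),
}


def layer_below(all_layers: Dict[str, StandInLayer], layer_name: str) -> str:
    """Named-child technique: raises KeyError for layers without 'memory_layer'."""
    return all_layers[layer_name].config["memory_layer"]


named_child = layer_below(layers, "primary")
all_children = layers["primary"].dependencies
# Everything the named-child lookup never sees (here, the swap layers).
missed = [name for name in all_children if name != named_child]
```

In this toy tree the named-child lookup yields a single layer, while the dependencies list also carries the swap layers, which is the ambiguity the comment describes.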

This gets worse if the model were a more flexible graph structure, where one layer could store encoded (for example compressed) chunks of the virtual layer next to unencoded chunks of the virtual layer.

So I'm happy to discuss mechanisms that could be used to describe and allow access to these things appropriately, but I haven't found a good one that can completely describe all possible situations accurately yet...

@eve-mem
Contributor Author

eve-mem commented Nov 14, 2024

Yes, I completely agree. It seems like it should be easy until you really start thinking about it.

E.g. if you're scanning for something and there is a normal memory layer but also a few swaps, you probably would actually scan them all. Probably not as some weird contiguous thing but you would scan them all.

Maybe it's something like adding a get physical layer function that returns a list of layer names. With intel layers maybe that could return the layer below as we do now?
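One possible reading of that idea, as a rough sketch only: a function that returns a list of layer names, and a scanner that walks each returned layer separately rather than as one contiguous space. Everything here is hypothetical (`StandInLayer`, `get_physical_layers`, `scan_each`, and the choice to return the direct dependencies); it is not an existing Volatility 3 API.

```python
from typing import Dict, Iterator, List, Optional, Tuple


class StandInLayer:
    """Hypothetical stand-in holding raw bytes; NOT the real Volatility 3 class."""

    def __init__(
        self, name: str, data: bytes = b"", dependencies: Optional[List[str]] = None
    ) -> None:
        self.name = name
        self.data = data
        self.dependencies = list(dependencies or [])


def get_physical_layers(layers: Dict[str, StandInLayer], layer_name: str) -> List[str]:
    """Assumed policy: return the direct dependencies, or the layer itself if it has none."""
    deps = layers[layer_name].dependencies
    return list(deps) if deps else [layer_name]


def scan_each(
    layers: Dict[str, StandInLayer], layer_name: str, needle: bytes
) -> Iterator[Tuple[str, int]]:
    """Scan every physical layer on its own, yielding (layer name, offset) pairs."""
    for phys_name in get_physical_layers(layers, layer_name):
        data = layers[phys_name].data
        offset = data.find(needle)
        while offset != -1:
            yield phys_name, offset
            offset = data.find(needle, offset + 1)


layers = {
    "ram": StandInLayer("ram", data=b"\x00\x00MZ\x00"),
    "swap0": StandInLayer("swap0", data=b"MZ\x00MZ"),
    "swap1": StandInLayer("swap1", data=b"\x00\x00"),
    "primary": StandInLayer("primary", dependencies=["ram", "swap0", "swap1"]),
}
hits = list(scan_each(layers, "primary", b"MZ"))
```

Each hit keeps the name of the layer it came from, so offsets from RAM and from swap are never mixed into one address space.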

But it probably needs thinking about and mapping out the different options, so people can agree on what they mean by "physical layer".

I don't think this needs to be a high priority, especially not above the parity bits.

Does need tracking and we can start referencing this issue in TODOs etc so things don't get lost.

@ikelos
Member

ikelos commented Nov 14, 2024

Every layer has a dependencies property that contains all the layers that it depends upon, so that should already be doable? The order isn't guaranteed I believe though, so you'd need to test each item to figure out what type of layer it was?

@ikelos
Member

ikelos commented Nov 14, 2024

We could have a helper function that takes a parent layer and a method with signature (context, layer_name) that would then run against each dependency? It could maybe pre-filter on specific classes of layer? Sounds fairly straightforward to knock up, but it would need each "thing you do to a physical layer" splitting off. Or we just implement the loops in each place that uses a physical layer? I'm definitely keen to get clear of using the specific name memory_layer in the code. It'll work, but only for intel layers and it's not very dynamic...
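A helper along those lines might look roughly like the sketch below. It is an illustration under stated assumptions, not framework code: `for_each_dependency`, `StandInContext`, `StandInLayer`, and `StandInSwapLayer` are hypothetical names, and the only things borrowed from the discussion are the `(context, layer_name)` callback signature, the `dependencies` list, and the optional pre-filter on layer classes.

```python
from typing import Any, Callable, Dict, Iterable, Optional, Tuple, Type


class StandInLayer:
    """Hypothetical stand-in for a layer; NOT the real Volatility 3 class."""

    def __init__(self, name: str, dependencies: Iterable[str] = ()) -> None:
        self.name = name
        self.dependencies = list(dependencies)


class StandInSwapLayer(StandInLayer):
    """Hypothetical subclass used only to demonstrate class-based filtering."""


class StandInContext:
    """Hypothetical context exposing layers by name."""

    def __init__(self, layers: Iterable[StandInLayer]) -> None:
        self.layers: Dict[str, StandInLayer] = {layer.name: layer for layer in layers}


def for_each_dependency(
    context: StandInContext,
    parent_layer_name: str,
    method: Callable[[StandInContext, str], Any],
    layer_classes: Optional[Tuple[Type[StandInLayer], ...]] = None,
) -> Dict[str, Any]:
    """Run method(context, layer_name) on each dependency, optionally filtered by class."""
    results: Dict[str, Any] = {}
    for dep_name in context.layers[parent_layer_name].dependencies:
        dep_layer = context.layers[dep_name]
        # The type check means the result does not rely on dependency order.
        if layer_classes is not None and not isinstance(dep_layer, layer_classes):
            continue
        results[dep_name] = method(context, dep_name)
    return results


context = StandInContext(
    [
        StandInLayer("ram"),
        StandInSwapLayer("swap0"),
        StandInLayer("primary", dependencies=["ram", "swap0"]),
    ]
)
everything = for_each_dependency(context, "primary", lambda ctx, name: name.upper())
swap_only = for_each_dependency(
    context, "primary", lambda ctx, name: name.upper(), layer_classes=(StandInSwapLayer,)
)
```

Returning a dictionary keyed by layer name is just one design choice; a generator of `(layer_name, result)` pairs would fit equally well if callers want to stop early.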

@eve-mem
Contributor Author

eve-mem commented Nov 14, 2024

Yeah, at the moment it feels like a helper function like that, plus a for-each loop, would work.

@Abyss-W4tcher
Contributor

Additional discussion (#1506 (comment)):

Need to be really careful about this. I feel at some point soon we're going to have to go back and find all these instances that get the physical layer that way (this ignores swap space and hardwires the expectation of a memory_layer into the code, which is only on intel and could theoretically change in the future). Technically it should go through the .dependencies of the layer and choose one of them (somehow). I won't push that at the moment, and I think we could find all instances of this by looking for the string memory_layer, just putting in a heads-up here...
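The string search mentioned above could be done with grep, or with a small script such as this sketch. The function name `find_string` and the decision to look only at `*.py` files are assumptions made for illustration; the root directory is whatever checkout you point it at.

```python
from pathlib import Path
from typing import List, Tuple


def find_string(root: Path, needle: str = "memory_layer") -> List[Tuple[str, int, str]]:
    """Return (relative path, line number, stripped line) for each line containing needle."""
    hits: List[Tuple[str, int, str]] = []
    for path in sorted(root.rglob("*.py")):
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            # Unreadable files are skipped rather than aborting the search.
            continue
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((path.relative_to(root).as_posix(), lineno, line.strip()))
    return hits
```

A plain substring match will also report comments and docstrings, so the output is a list of candidates to review by hand rather than a definitive set of call sites.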


3 participants