-
Notifications
You must be signed in to change notification settings - Fork 481
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Core, plugin] : Virtual mappings dumping and caching #1237
base: develop
Are you sure you want to change the base?
[Core, plugin] : Virtual mappings dumping and caching #1237
Conversation
Thanks, it looks like an interesting idea. I'm afraid it's another one that will likely take me some time to consider the implications of, and I'm also on holiday next week, but do please nudge me after a month if I haven't added any comments on it by then... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I'm very squeamish about this whole thing. It's a nice idea but it may easily lead to us returning incurrate results, and it's kinda bolted on the side of the framework. I think there might be nicer way of doing this, where basically whenever a layer would go to do this work, it checks if the value is already cached and if not it generates the cache as it does the manual bit. Then next run the cache will have a hit? We already have a caching mechanism/plan built in for dealing with symbol tables (which hardly gets used these days), so I feel with judicious use of uniquely naming the data so it can be found again we might be able to make this all automatic/in the background and workable for everyone.
The difficulty will be identifying the specific image we've got and making sure we accurately map it to the cache. I was going to suggest the dtb and a hash of the top level table should unqiuely identify a particular image, but this applies to all translation layers of any type. 5:S Some of them will have extremely large section_start/section_end segments, so it's really only intel that has so many of them and I'm surprised this saves that much time since the mapping method should just read the page tables from the layer and use that pretty efficiently. It depends how often the tables are reread, but I believe we cache them during a run?
Looking at your stats, this takes 7' 11" total for a task that normally takes 4' 36", so it isn't profitable until the second run at the earliest and really takes about 3 runs to start really being useful. I'd be very interested to see from a profiler where the bulk of that time is spent, both without and the process of building and then using the map.
My biggest concern is that it's either very manual (which may lead to a lot of user error and make the feature not worth using) or we make it automatic (and then if it makes a mistake it's a pig to identify and guide users through fixing it).
Either way, I would definitely need to see repeated runs of this with the results being verified as identical, and with the wrong file being passed in to see if it can detect errors if we decide to use it automatically (such as two paused VM snapshots of the same machine where lots of different processes have been started between the runs).
Caching stuff is fraught with issues which is why I'm not going to be rushing this in and even when we do get it in a smoother state, I'll probably still be debating whether it's worthwhile. This might take me some time to figure out what's the worst performing bit of the existing code, and then if there's anything that can be done to improve that specific aspect...
# Avoid decompressing and deserializing the file | ||
# more than once. Saves time, but costs more RAM. | ||
if not hasattr(self, "_virtmap_cache_dict"): | ||
with open(filepath, "rb") as f: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh, file handling is tricky. If this is running as a web interface on a remote server, standard open operations won't work. This should be a constructed ResourceAccessor
class's open
method so that it can open URL and compressed files automatically. It will then accept a URL rather than a file path.
@@ -551,6 +597,12 @@ def _scan_iterator( | |||
assumed to have no holes | |||
""" | |||
for section_start, section_length in sections: | |||
# Check the virtmap cache and use it if available |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is essentially going to generate it, why don't we... try to look it up in a cache and if it fails, do the reset of the code and then store those results into the cache? Rather than a completely separate plugin, that speeds up everything without costing any additional time? It also means the data can be stored in the local on-disk cache (like processed symbol file) which saves us having to worry about the user getting the data into the system, or messing with the UI at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generating it here would fit if an automatic mapping cache generation is implemented. This would result in filling the cache and adding processing time without the user requesting it ?
An additional plugin, also allows to unify a single method to save the cache, instead of it being randomly generated depending on the plugin first ran. As said, it is not a "ground-breaking" feature, from which the explicit core integration might come with concerns for users (this PR doesn't really add much more core logic, if the feature isn't turned on).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But, isn't that exactly what this is generating here? Why not just save it rather than generating it in a different plugin as part of a separate step? This could also build up the cache as needed, so if parts haven't been computed yet, it does the full task, but if they are it uses the cache?
Implementing a cache (that might be wrong unless it's matched to the image very carefully) already seems like quite a deep-rooted feature to me...
@@ -398,6 +406,49 @@ def run(self): | |||
plugin_config_path, | |||
interfaces.configuration.HierarchicalDict(json_val), | |||
) | |||
if args.virtmap_cache_path: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just not so keen adding this functionality this way. What happens if we discovered we've missed something in the file format, or there's a more efficient way of storing the data, or something else that crops up. The more we expose this to the user and make them do work to get it, the more can go wrong. There's a comment further down that suggests a different way of working the cache that might solve a bunch of problems?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, however both situation (manual / automatic) might produce different issues. With a manual implementation, at least it is easy to remove the argument and see if it was the problem (although it hopefully shouldn't, but I understand the concern).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, but it'll still be bug reports we're trying to diagnose it from, so recommending --clear-cache
seems as straightforward.
A list containing mappings for a specific section of this layer""" | ||
|
||
# Check if layer is fully constructed first | ||
if self.context.config.get( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What object has the requirement on this? It should be that the layer that uses it has an optional requirement on it, so that it'll get saved into any config files that get constructed. The layer identifier isn't even close to unique (almost all plugins use the same layer name, and class) so this will go badly when you have multiple layers you want to use this on (or a config you want to work for multiple images).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The layer_identifier should be unique ? Example layer_identifier
s :
volatility3.framework.layers.intel.WindowsIntel32e.layer_name
-> identifies the kernel layer (TranslationLayerRequirement
name)volatility3.framework.layers.intel.WindowsIntel32e.layer_name_Process5948
-> identifies the process 5948
I might have missed something, but it shouldn't be possible to have a duplicate layer string identifier in the layers pool ?
This specific "config"/"cache" is intended to be used for a unique memory capture, as even a dump from the same kernel a few seconds later would have different mappings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Layer name is unique for a run of volatility, but they'll likely all say primary1
or memory_layer1
or something. Across runs they're unlikely to even be different.
The process layers, similarly, won't be different across different images that have processes with the same pid... I don't think a dump from a few seconds later would have a different cache? The layer name would likely be the same, and many of the process layer names would too, but also, it could match a wildly different image...
@@ -247,7 +250,12 @@ def run(self): | |||
default=[], | |||
action="append", | |||
) | |||
|
|||
parser.add_argument( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not yet comfortable with this machinery. It seems really bodged into the workings. 5:S
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, that's not a helpful comment. It's just a feeling I get but it doesn't feel like it's been smoothly integrated into how everything works, but rather has lots of external moving bits (like files the user has to pass in, which get stashed in a non-unique place in the config to get used where they're needed. As I say, I think there may be a better way that evades both the file issues, improves on the uniqueness and wouldn't require user interaction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of this cache might silently fill the filesystem without noticing the user, although the generated file is xz
compressed. Indeed, everything could be stored in the cache, but then this would imply that --clear-cache
command might also remove it, when trying to fix a symbol cache problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The unintentionally filling the disk is a potential issue, but we can build cache checks on runs to clear out old files if it becomes a problem. What's the expected filesize for one of these xz files once it's made?
We'd want --clear-cache
to do exactly that, it's supposed to set you back to square 1, so you can rule the cache out as a contributing factor to a problem.
Thanks for the review :) The main idea, was to provide a way for users to use an optional functionality that would allow saving time by avoiding generating the same deterministic data (input and logic never changes) every plugin run. As you pointed out, it is not integrated "automatically" in the framework, as I wanted it to be managed by the user, leaving him the task of selecting the right cache file for the right dump. This hasn't been tested out on a lot of samples, but basically it calls internal framework api and stores the result (like an exported A great application case you could consider, might be when a developer implements a scanning plugin, and needs a lot of trial and error to find needles in the virtual pages. This allows him to save quite some time on each test.
If this plugin doesn't output the same results twice, then the problem would reside inside the With the wrong file being passed, I am afraid the time spent to verify the correctness of the mappings would require to regenerate the mappings, hence losing the initial idea of this feature (saving time and CPU cycles) ? Making a hash of the DTB would work, but for samples taken from a different runtime kernel :/ . I wrapped my head around this problem a bit, but I couldn't find an easy solution. Making a SHA256 of the sample would be too long, and be necessary on each run (storing it in a metadata circles the problem back) ... |
Sure, I applaud the idea and I think it could be worthwhile using it all the time, but that would need three things:
|
Alright, I am processing all your comments. I am thinking of an automated design, but the core of the problem relies on identifying a memory sample and more granularly a layer. Additionnaly, instead of storing all the layers (of a sample) inside one cache file, requiring to load it entirely when needed, an approach where each layer cache is directly written to a file would speed-up the search and access. Even if we get layers from different samples "mixed" in the cache, this shouldn't be a problem if the layer identifier is strong. For example, we might consider the following sample code : stacked_layers = [f"{l.config['class']}.{l.name}" for l in current_layer.context.layers.values()]
"""['volatility3.framework.layers.physical.FileLayer.base_layer', 'volatility3.framework.layers.crash.WindowsCrashDump64Layer.memory_layer', 'volatility3.framework.layers.arm.WindowsIntel32e.layer_name']"""
layer_identifier = hashlib.sha256("unique identifier") # DTB, first mapped page ...
virtmap_filename = os.path.join(
constants.CACHE_PATH,
"virtmap_" + layer_identifier.hexdigest() + ".cache",
)
cache_content = {"metadata":{"stacked_layers":stacked_layers}, "mappings" : layer_mappings} # layer_mappings format is {"section": _scan_iterator_results}
with open(virtmap_filename, "w+") as f:
json.dump(cache_content, f)
The
Determining if a cache file already exists is straightforward, and does not require a direct content scan. Of course, this relies on the cache filenames not being messed with. As you pointed out, this automatic feature should be easily togglable (on/off), and also never enabled if the cache is not available (if #410 gets merged one day). What are your thoughts on this design ? This would pop the plugin, most of the cli code, to mostly concentrate inside |
Yeah, so a per-layer design is ok, but identifying the layers is going to be quite a task. It's probably easiest to do it on a hash of a certain number of bytes, but if that's not spread across the entire layer then things like multiple memory snapshots may come back reporting as the layer cache hit. Honestly, it's going to be really difficult identifying one layer from another layer reliably. If there's a clever way of doing that, then go for it. Am I right in thinking that these maps will be dependent on the DTB (and therefore a process layer won't have the same sections as a kernel layer) or not? |
These maps depend on the DTB (per-process and kernel), and the number of sections depends on the virtual pages allocated to the layer. This is a difficult task, because highest level mappings might look the same between two samples, but PTE's could have been mapped and unmapped, resulting in small skipped sections. Apart from doing a SHA256 of the entire sample on each Volatility run, relying explicitely on the sample filepath (like the config ?), or trying sketchy things like gathering system clocks from the upper layer, there isn't for now an efficient way to uniquely identify a layer. Initially, I designed this feature to work the same way as the In the current state, I am afraid that the automatic path will introduce more issues than benefits... So, I guess it should be left as a side feature for now, as increasing the risk of returning inaccurate results automatically is what we really want to avoid ? |
Hi 👋,
Following an observation made while investigating WIndowsAArch64 (#161 (comment)), I noticed that multiple plugins were scanning and physically mapping the entire virtual address space :
volatility3/volatility3/framework/interfaces/layers.py
Lines 235 to 238 in 6b739f6
to later do bytes searching on the virtual (kernel) layer directly. However, on large memory samples, or heavily populated address spaces, this process takes quite some time, and needs to be reproduced on each plugin run.
As the input (the section to scan) and output (the virtual->physical mappings) are constant for a unique memory sample, there is no need to reproduce the calculations.
This PR introduces a new plugin
windows.virtmapscanner
and a new framework argument--virtmap-cache-path
. Here is a demo :windows.filescan
run :windows.virtmapscanner
, and save it to a file :windows.filescan
plugin, providing the framework with the virtual mappings cache filepath :The time spent running
windows.virtmap
is already profitable ! This does not only applies to thewindows.filescan
plugins, but any that does an entire virtual layer scanning (windows.netscan
,windows.threads
...).I'll be glad to hear your thoughts, and improvements, about this idea !
Notes :
volshell
integration hasn't been done yet