-
Notifications
You must be signed in to change notification settings - Fork 481
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix for windows.strings revmap offsets #1043
base: develop
Are you sure you want to change the base?
Fix for windows.strings revmap offsets #1043
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks very much for this, but there's a couple of points about it. Mostly, why does dropping the shifting work? There's a couple bits where we should use the layer mask functions rather than the (quick and easy, but potentially wrong in the future) fixed page sizes, and lastly there's a couple of list updates that potentially double entries up which definitely need a look. Those, sadly, are showstoppers, so this can't get merged until they're resolved.
Thanks for giving this another pass. I'm still really uneasy about the indexing of the revmap, and it feels like we're not quite certain how it's working or how it's supposed to work. Hopefully one of these evenings or on the weekend, I'll get a chance to dig into it and figure out what's going wrong, but this feels too much of a "it just works now" solution and I much prefer understanding how and why and making sure it's doing the right thing (otherwise we'll just have to come back and fix it in the future when that one corner case turns up and takes us hours to figure out)... 5:) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something to consider (that unfortunately I only just thought of) is that if there are multiple layers (like a QEMU layer, or AVML layer or similar) then the offsets in the strings file won't map up to the "physical" addresses returned. I'm not sure how best to deal with that, but it probably need a little consideration.
Options include generating our own strings style tool, or jiggering the virtual to physical map to translate down to the lowest possible layer (DataLayer
) or adding a debug message pointing out that the mapping will have been inaccurate if the physical layer isn't a DataLayer
?
cur_set.append( | ||
{ | ||
"region": "Kernel", | ||
"pid": -1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be a BaseAbsentValue
derived class. Maybe NotApplicable
?
( | ||
str(string.strip(), "latin-1"), | ||
item.get("region", "Unallocated"), | ||
item.get("pid", -1), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also should default to a BaseAbsentValue
derived class.
_phy_layer_name, | ||
) = mapval | ||
for offset_to_page_within_mapping in range( | ||
0, phy_mapping_size, 0x1000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is hardcoding the page size (which then agrees with the value 12 also hardcoded everywhere). That's ok, but I think it would be better to pull them out and make them constants in the plugin (just so they stay the same if the code ever changes).
|
||
# for each page within the mapping we need to store the phy_offset and | ||
# the matching virt_offset | ||
for offset_to_page_within_mapping in range(0, phy_mapping_size, 0x1000): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hard coded 0x1000
# physical_page number. | ||
physical_page = ( | ||
phy_mapping_size + offset_to_page_within_mapping | ||
) >> 12 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hard coded 12
(f"Process {process.UniqueProcessId}", offset) | ||
physical_page = ( | ||
phy_offset + offset_to_page_within_mapping | ||
) >> 12 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hard coded 12
offset_within_page = phys_offset & 0xFFF | ||
|
||
mapping_entry = revmap.get( | ||
phys_offset >> 12, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hard coded 12
There is then also the issue of multiple physical layers (eg, swap space, etc). At the moment, I think we throw the layer name from the mapping value away. I don't know whether to hold this PR up to get all these edge cases sorted or just ok it with the BaseAbsentValue and hardcoding changes? |
It would be pretty easy to make our own version of "strings". You could import from the various yarascan plugins using a very basic regex yara signature of X number of printable characters. That'll handle all the address translation etc, and would allow pid filtering etc. Nothing stopping a user doing that right now (i know i have myself in the past) so it would really be a convenience function than anything new. I could make a rough example if that sounds like it might be useful? It would mean that if yarascan supports the page files etc, then so does this plugin. It also means it won't work at all unless the user has installed the extra Yara packages. |
Actually if you wanted to go the "own version of the strings tool" type plugin, rather than "reading the results of the actual strings utility" type plugin then you could even use the RegexScanner to make a basic strings type plugin - therefore not needing yara extremely rough example here: https://github.com/eve-mem/volatility3/tree/regex_based_strings (would need to also handle unallocated space, etc, etc) On e benefit I see with keeping the current "reading the results of the actual strings utility" type plugin is that as a user I can provided a curated list of strings I'm interested in translating - rather than just everything in one go. |
Yep, so the curated list is important and valuable (string scanning can be notoriously slow for python over the So then the question still exists, do we merge this before doing a rework of the strings plugin, or do we just dive straight into the rework (and use this as motivation to get it written)? I'm happy with either, but I would still like any changes mentioned above making if we are going to merge this in (I think they're mainly consolidating fixed values so they're changing in one location, or derived from the space itself)... |
It's a really good question @ikelos, right now the strings plugin doesn't really working at all - it would be great to sort that - the changes here sort a lot of that. Thanks to the hard work from @kevthehermit. So at least windows strings works mostly as expected with some edge cases missed. If redoing the strings again is going to take "a while" which it feels like it might, e.g. probably not before the next release? Then it makes sense to me to merge this, then rehaul it all more generically covering all the bases in slower time (3-9 months?) To rehaul would need to design how users input a strings file from a page file etc, so they could provide 5+ strings files which all line up with different layers in vol - so how do we make it easy to do that on the command line etc. That's just one part of it, with other bits that need thinking about too. To me it just feels like it might take a while? I am just a random (admittedly keen!) user, I'm happy with whatever you and the other core devs think is the best way to approach it. Give it some thought and if there are bits i can help with let me know. |
I agree with everything here :) The current strings plugin is broken, this "patches that" but in all my testing i have been restricting strings to a specific pid, when i try to run it without defining a pid it never completed, I had it running on a 4Gb Win11 dump ~ 145 Pids for 18 hours and it had still not finished with output, So yes i think it would be better to start again and think about a more sensible way thats also more agnostic against OS. But in the mean time this does at least make it usable if not "perfect" |
Building the reverse map will take a long time, if you know of a quicker way to determine which memory block belong to which processes, I'm all for it (the maps should all share the kernel, so there may be something we can do there to scan the kernel and then only the differing entries from the process maps?). I'm happy to merge this, just need those last batch of comments addressed first please... |
Hello @ikelos and @kevthehermit, TL;DR - if we use an interval tree we might be able to speed up the strings plugin when there are manageable number of strings to lookup, but it means adding an optional dependency to vol. Caveat: Lots more testing needed on many other samples! I was trying to work out different ways to make the creating the rev maps faster. Currently the strings plugin makes a dict that makes every page of physical memory to the matching virtual pages. Once it's made, its extremely fast as it's all just look ups. The downside is it can take a long time to create, on the test 512MB windows sample I normally use it takes about 5 minutes. I'm also testing on a16GB sample now but it's taking a very long time to finish so I don't have results for that yet. With @kevthehermit having a sample still creating a reverse map after 18 hours I thought it might be useful to try some other options. I've made some rough proof of concept examples here for two different ways that could be possible options (they'll need more work if they seem like good ideas): develop...eve-mem:volatility3:windows_strings_intervaltree It's based on the current changes for this PR, my git-fu is not good enough to let me make a branch from this though before it's committed... The changes give three options for creating the reverse maps. The current method, a tree method (using interval tree), and something I've called "ordered". I'm not suggesting for the actual plugin these options exist, I think we should pick one and stick with it. This just felt like an easy way to share my experiment. The tree method uses an interval tree (ref: https://pypi.org/project/intervaltree/ and https://en.wikipedia.org/wiki/Interval_tree) as it's a really lovely way to build a structure that has start points and end points. You simply add all the ranges to the tree and then later you can ask which ranges can be found at point X. The tree takes care of making that all efficient and ensuring you get the right answer. It would mean adding an optional dependency though. If we wanted to, we could probably make a tree type structure ourselves that would be quite fast. The ordered method is my attempt at something not using any extra dependency's. It's very hard to read and is likely extremely inefficient for what it is. The idea is to put all the mappings from all the processes in order, and then place all the strings to lookup in order too. That way as you loop though checking for matches you can begin to remove mappings from the list once they're no longer possible and you can stop early too. However you're still making many many comparisons per string until you find the correct mapping. As far as I can tell each method gives the same results. I run some tests, using each method to look up 1,10, 100, 1,000, 10,000, and 100,000 strings and timed how long they took on this 512MB sample. Just once for each, it would be better to do more rounds and average them out. I didn't use a config either so vol is scanning for the symbols each time. To me it's pretty clear that the current method is not really affected by the number of strings to lookup. Once that reverse map is created it's extremely fast - the only issue being that it can take a long time to create in the first place. The interval tree method seems to be pretty fast and it's the approach I prefer personally. The crazy "ordered" approach also seems to work, and maybe with some effort we could make an approach to lookup up offsets that wasn't this crazy and was still a little faster. I'm sure if I added more and more lines to search both the interval tree and the "ordered" method would end up being slower overall as they require more post processing per string, with enough lookups I'd imagine it could end up being significantly faster to stick with the current method. Equally 100,000 is quite a lot of strings - I'd guess that in most cases that would be useful enough - I know that's more than I'd personally need. With this small memory sample we're just talking about the difference between 40 seconds and 5 minutes which isn't that huge, but with the larger memory samples that initial creation time with the current method is prohibitive. I've managed to run a test with the tree method on my 16GB sample but I've not yet been able to finish the current method. I've not looked at multi threading the reverse map - that would help speed things up too. What do you both think? Should I do more testing and experiments around this? Any particular method that would be preferable? Raw data if it's useful:
|
Well, I still think building a reverse mapping will be useful for lots of things. I'm interested in the speed savings you get from the interval tree, but it sounds like it's reimplementing grep, so I need to figure out what tricks it's doing. I still think if we do proper some jiggery pokery just involving the page tables, we should be able to much more efficiently build up a suitable structure letting us look up an offset and determine which processes and where that offset maps up to. Unfortunately I haven't really had time to dedicate to it, so, I guess keep going with this. I'm not overly keen on a whole dependency for what's presumably quite a small component, particularly if the technology is static and not constantly being improved, however if it's used often and in other places, or it's too complicated to easily write ourselves/verify what we've written then I'd be ok with it. Hopefully that gives you an idea of my thinking? This also is a plugin specific dependency, not a framework-wide one, and we can already deal with that as with Yara and the crypt plugins, so again, less of a huge concern... |
Thanks for the quick reply! The interval tree is a fast data structure for this kind of problem rather than a grep type thing. I think it was originally made with dates and time spans in mind. It's not actually searching for any of the strings if you see what I mean. I agree with you though about adding a whole dependency just for strings. I also thought there might be some cleverness to be had in the page maps, just looking at the top level you can see all the holes where there would be no possible mappings, and you know what virtual address ranges they cover. By making a allow/deny list by looking at the top (or maybe also the second level) for the pages maps per process you should have a, hopefully, quick way of knowing which virtual address will translate. That only helps in the VtoP direction, I've not got any good ideas for PtoV yet - I'll keep thinking! Please shout if you get any good ideas. In the mean time this probably isn't worth holding up this PR for - probably better that strings actually gives the correct answer even if it does take a while. |
Ok, I'm still not sure I understand how it's faster then, but I'll have a dig through when I get some time. It was more that we have as many maps as there are processes, but a huge amount of the map should be the same between the kernel map and the process map, although thinking about it, we must already cope with that somehow. Once we work on unifying processes, we might be able to make the plugin generic too. In general, my thinking was just take the diff from the kernel map as the "this is really in this process" map? We could also later apply OS specific knowledge to know what's actually part of the process and what isn't (VADs, etc)? I'll keep having a think, but I feel sure there are improvements to be made... |
Yes, I keep thinking there is something that can be done too. I have a few more ideas I'll play with, if you hear nothing from me it's because they were terrible :D |
Hello again @ikelos and @kevthehermit, sorry for spamming this PR. I have had some luck with a very basic implementation of a interval tree style way of doing the reverse mappings, but just using basic python therefore not needing any extra dependencies. It can't do all the fancy things a real interval tree can do but it seems to do the part I needed reasonably well. It doesn't try to produce a balanced tree or anything clever like that. I've added it to the same experimental branch here: develop...eve-mem:volatility3:windows_strings_intervaltree It's under the In terms of speed it seems to be coming pretty close to the real interval tree on the smaller memory sample. What do you think of this approach? Raw numbers again if it helps:
|
So I've had a chance to look at this, and I really like the custom version. It doesn't require additional modules but gives a a massive speed up (I've tested it out and it's blazingly fast). It appears to give the same/better results (although, it tends to go through the strings a process at a time, whereas it might be better to generate all the maps and then start outputting so that all string entries are next to each other (although sorting can be done by the UI/later). I'd ditch the other two methods since they all achieve the same results (and there's no point having a flag between the methods if they produce identical results). At the moment we throw away the actual layer it was read from (making the assumption there's only one physical layer, which isn't always true, for example swap spaces, etc), but certainly eve's custom one is better than the current strings implementation. Next steps are to get it its own PR, so it's tidied up and the other methods aren't cluttering it, then once it goes through review we can get it merged, and if it proves @kevthehermit 's works out faster we can study it to see why and what tweaks we might want. Does that sound ok to everyone? |
(Also, I think the custom version can inherit from dataclasses.dataclass and get a bunch of memory saving for free that way). |
Thanks for taking a look @ikelos. I'll have a go at working this into a separate PR and we can go from there. |
@eve-mem, I forget, did a separate PR come in for this? I've pushed a testing branch Happy if you've got one in the werks to review it, but this was a pretty signficant speed-up you and @kevthehermit came up with and I'd like to get it in the tree... |
@ikelos - I never got finished with something that worked in all cases. I had some problems with that speed up code hitting a recursion limit on some windows samples and I've been tinkering away trying to come up with ways around it without much luck yet (both a brain power and not enough time problem). Using the actual interval tree package seems to work fine - but that's a whole new dependency for quite a niche use. |
Why would their intervaltree work, but yours not? Could you take a look in their code and see if they've got some special recursion handling? I didn't hit any of these on my spin on your patch ( |
Essentially there is an actual interval tree, while mine is quite a basic tree. I thought I'd be able to cut out lots of parts of an interval tree that aren't needed, and the result is something that's hardly an interval tree at all. I'm still keen to get this resolved, a strings plugin that works and is quick enough to use is really important. Sorry it's taking me so long! |
This PR fixes a bug with
windows.strings
ref: #876Output has been compared from Vol2.6 to ensure the correct offsets are being returned now.
This PR also changes the output to better fit the new renderer separating results into parts instead of a string
.join
that was used before.As a significant change to the output, the version has also been bumped to
2.0.0
Example of these changes working and compared to vol2.6 can be seen in the issue thread