Generic: Add first attempt at pgdscan plugin #1321

eve-mem · 2024-10-25T15:57:39Z

Hi! 👋

This PR adds a basic pgdscan plugin.

I often find myself in the situation where no ISF is available for my Linux sample. The debugging symbols are not provided, and no information such as the system map or kallsyms was collected when the memory capture was performed. I know it's sometimes a similar situation for others.

Without an ISF vol is quite limited in what it can do, and rightly so! You need the information on the complex structures in order to correctly parse the memory.

This plugin is designed to help in this disappointingly common ISF-less situation. It scans through memory and heuristically locates what are likely to be the PGDs of the various processes. Because there is no ISF, it cannot tell you the pid, comm, etc.
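As a rough illustration of what "heuristically locate a PGD" can mean, here is a minimal sketch of a plausibility test for a single 4 KiB page on 4-level x86-64 paging. It is not the plugin's actual logic: the function name `looks_like_pgd`, the `max_phys_addr` parameter and the specific checks are assumptions chosen for illustration. It only relies on the architectural layout of a top-level entry (present bit, reserved bit 7, physical address in bits 12 to 51) and on the expectation that a Linux address space also maps the kernel in the upper half.

```python
import struct

PAGE_SIZE = 4096
ENTRIES = PAGE_SIZE // 8          # 512 eight-byte entries per table page
PRESENT = 1 << 0                  # bit 0: entry is valid
PS_BIT = 1 << 7                   # bit 7: reserved (must be 0) in a top-level entry
ADDR_MASK = 0x000FFFFFFFFFF000    # bits 12..51: physical address of the next table


def looks_like_pgd(page: bytes, max_phys_addr: int) -> bool:
    """Rough plausibility test for a 4-level x86-64 top-level page table.

    Hypothetical helper: the checks are illustrative, not the plugin's own.
    """
    if len(page) != PAGE_SIZE:
        return False
    entries = struct.unpack("<%dQ" % ENTRIES, page)
    present = [entry for entry in entries if entry & PRESENT]
    if not present:
        return False
    for entry in present:
        # Bit 7 is reserved at the top level, so a set bit rules the page out.
        if entry & PS_BIT:
            return False
        # The next-level table must sit inside the captured physical memory.
        if (entry & ADDR_MASK) >= max_phys_addr:
            return False
    # Assumption: a real address space also maps the kernel (entries 256-511).
    return any(entry & PRESENT for entry in entries[ENTRIES // 2:])
```

A page of ASCII text fails because the decoded "addresses" point far beyond physical memory, while an all-zero page fails because nothing is present. A real scanner would need further checks to keep false positives down.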

The user part of the address space can then be dumped out allowing analysis in other tools (e.g. strings, yara, hex editor, ghidra, etc). While not as powerful as vol with a full ISF it allows you to explore the user address space in a way that would have otherwise been impossible.

Sometimes all you really need to do is find the user process you care about and dig into its private memory - and this plugin should help with that.

It currently only supports Intel32e. I've tried my best to make it generic by reading as much information as possible from the intel layer. I simply don't have many samples of 32-bit OSes to test with.

It would be possible to modify existing plugins such as linux.bash or linux.vmayarascan to accept an offset to a PGD and still provide the same results. Any plugin that focuses on scanning private memory to find results and doesn't rely on the kernel ISF (other than to parse the pslist etc) could be made to work this way.

Here is some example output:

(volatility3) eve@xps:~/Documents/volatility3$ python vol.py -r pretty -f linux-sample-1.dmp pgdscan
Volatility 3 Framework 2.11.0
Formatting...0.00               PDB scanning finished                      
  | PGD offset |     size | config
* |  0x1605000 |        0 |      -
* |  0x1ee6000 |  4239360 |      -
* |  0x4407000 |  4268032 |      -
* |  0x450a000 |   544768 |      -
* |  0x4572000 |   835584 |      -
* |  0x4590000 |  2850816 |      -
* | 0x1ac16000 |  2031616 |      -
* | 0x1aca1000 |  4517888 |      -
* | 0x1acf5000 |   200704 |      -
<snip>

N.B. the size 0 PGD is for the kernel itself and so it's actually an expected result.

This example shows dumping out the memory regions for one of the recovered PGDs and running file on the results. (I think there are probably improvements to be made to ensure that pages that are close together get mapped to a single file. At the moment I just use the output of mapping() directly.)

(volatility3) eve@xps:~/Documents/volatility3$ python vol.py -r pretty -f linux-sample-1.dmp pgdscan --dump --offset 0x4572000
Volatility 3 Framework 2.11.0
Formatting...0.00               PDB scanning finished                      
  | PGD offset |   size | config
* |  0x4572000 | 835584 |      -
(volatility3) eve@xps:~/Documents/volatility3$ file pgd.0x4572000.start.0x*
pgd.0x4572000.start.0x1223000.dmp:      data
pgd.0x4572000.start.0x1224000.dmp:      data
pgd.0x4572000.start.0x1225000.dmp:      data
pgd.0x4572000.start.0x1226000.dmp:      data
pgd.0x4572000.start.0x400000.dmp:       ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, missing section headers at 14768
pgd.0x4572000.start.0x401000.dmp:       data
pgd.0x4572000.start.0x402000.dmp:       data
pgd.0x4572000.start.0x602000.dmp:       data
pgd.0x4572000.start.0x603000.dmp:       data
pgd.0x4572000.start.0x7fd341093000.dmp: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter *empty*, missing section headers at 47552
pgd.0x4572000.start.0x7fd341094000.dmp: data
pgd.0x4572000.start.0x7fd34109c000.dmp: data
<snip>

Here is an example of saving out a config for that same PGD and dropping into volshell with the config. In volshell we can then investigate the private memory as normal.

(volatility3) eve@xps:~/Documents/volatility3$ python vol.py -f linux-sample-1.dmp pgdscan --save-configs --offset 0x4572000
Volatility 3 Framework 2.11.0
Progress:  100.00               PDB scanning finished                      
PGD offset      size    config
Progress:    0.00               Scanning memory_layer using PageGlobalDirectoryScanner
0x4572000       835584  pgd.0x4572000.json
(volatility3) eve@xps:~/Documents/volatility3$ python volshell.py -c pgd.0x4572000.json
Volshell (Volatility 3 Framework) 2.11.0
Readline imported successfully  PDB scanning finished  

    Call help() to see available functions

    Volshell mode        : Generic
    Current Layer        : primary
    Current Symbol Table : None
    Current Kernel Name  : None

(primary) >>> db(0x400000)
0x400000    7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00    .ELF............
0x400010    02 00 3e 00 01 00 00 00 64 17 40 00 00 00 00 00    ..>.....d.@.....
0x400020    40 00 00 00 00 00 00 00 f0 32 00 00 00 00 00 00    @........2......
0x400030    00 00 00 00 40 00 38 00 09 00 40 00 1c 00 1b 00    ....@.8...@.....
0x400040    06 00 00 00 05 00 00 00 40 00 00 00 00 00 00 00    ........@.......
0x400050    40 00 40 00 00 00 00 00 40 00 40 00 00 00 00 00    @.@.....@.@.....
0x400060    f8 01 00 00 00 00 00 00 f8 01 00 00 00 00 00 00    ................
0x400070    08 00 00 00 00 00 00 00 03 00 00 00 04 00 00 00    ................
(primary) >>>
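The first 64 bytes shown by db(0x400000) above form a complete ELF64 header, so they can be decoded to confirm what the dumped region contains. The following is a small sketch using only the standard library; the field names follow the ELF specification, and the bytes are copied from the hex dump above. The `Elf64Header` tuple is just an illustrative container.

```python
import struct
from collections import namedtuple

# First 64 bytes at 0x400000, copied from the db() output above.
header = bytes.fromhex(
    "7f454c46020101000000000000000000"
    "02003e00010000006417400000000000"
    "4000000000000000f032000000000000"
    "0000000040003800090040001c001b00"
)

# Little-endian Elf64_Ehdr layout: 16-byte ident followed by fixed-width fields.
Elf64Header = namedtuple(
    "Elf64Header",
    "e_ident e_type e_machine e_version e_entry e_phoff e_shoff "
    "e_flags e_ehsize e_phentsize e_phnum e_shentsize e_shnum e_shstrndx",
)
ehdr = Elf64Header._make(struct.unpack("<16sHHIQQQIHHHHHH", header))

print(ehdr.e_ident[:4])      # ELF magic
print(ehdr.e_ident[4])       # 2 means 64-bit class
print(ehdr.e_type)           # 2 means ET_EXEC (executable)
print(hex(ehdr.e_machine))   # 0x3e means x86-64
print(hex(ehdr.e_entry))     # entry point virtual address
print(ehdr.e_phnum)          # number of program headers
```

This agrees with what file reports for pgd.0x4572000.start.0x400000.dmp: a 64-bit LSB x86-64 executable.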

I'm not happy with how I've messed with build_configuration() in order to produce a config file that can be loaded into volshell. It feels like there must be an easier way...!

I'm messing with the guts of a config and reading private values out of the intel layer; there are likely to be lots of ways to do this better/smarter.... 🙈

I welcome any pointers or advice!

Thanks again!
🦊

volatility3/framework/plugins/pgdscan.py

eve-mem · 2024-11-12T06:42:44Z

I've now updated this to merge output files when pages are 'close' enough together. E.g. before, the region from 0x400000 was saved to three files, as those are the results from the intel layer mappings, whereas now they become a single file.
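For readers following along, the merging idea can be sketched as below. This is an illustration, not the code in the PR: the function name, signature and the `max_gap` parameter are assumptions, and the threshold the plugin actually uses for 'close' is not stated in this thread. Regions are (start, length) pairs and ends are treated as exclusive.

```python
from typing import Iterable, List, Tuple

Mapping = Tuple[int, int]  # (start, length)


def merge_mappings_with_gap(mappings: Iterable[Mapping], max_gap: int = 0) -> List[Mapping]:
    """Merge (start, length) regions whose gap is at most max_gap bytes.

    Hypothetical sketch; max_gap=0 merges only regions that touch or overlap.
    """
    sorted_mappings = sorted(mappings)
    if not sorted_mappings:
        return []

    merged: List[Mapping] = []
    current_start, current_length = sorted_mappings[0]
    current_end = current_start + current_length  # exclusive end

    for start, length in sorted_mappings[1:]:
        if start - current_end <= max_gap:
            # Close enough: extend the current region (max() handles overlaps).
            current_end = max(current_end, start + length)
        else:
            merged.append((current_start, current_end - current_start))
            current_start, current_end = start, start + length

    merged.append((current_start, current_end - current_start))
    return merged


pages = [(0x400000, 0x1000), (0x401000, 0x1000), (0x402000, 0x1000), (0x602000, 0x1000)]
for start, length in merge_mappings_with_gap(pages):
    print(hex(start), hex(length))
```

With the page addresses from the first example output, the three pages starting at 0x400000 collapse into one region while 0x602000 stays separate. If a non-zero gap is allowed, the code writing the file also has to decide how to fill the unmapped bytes.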

I've fixed the imports too.

Output example:

(volatility3) eve@xps:~/Documents/volatility3$ python vol.py -r pretty -f linux-sample-1.dmp pgdscan --dump --offset 0x4572000
Volatility 3 Framework 2.11.0
Formatting...0.00               PDB scanning finished                      
  | PGD offset |   size | config
* |  0x4572000 | 835584 |      -
(volatility3) eve@xps:~/Documents/volatility3$ file pgd.0x4572000.start.0x*
pgd.0x4572000.start.0x1223000.dmp:      data
pgd.0x4572000.start.0x400000.dmp:       ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, missing section headers at 14768
pgd.0x4572000.start.0x602000.dmp:       data
pgd.0x4572000.start.0x7fd341093000.dmp: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, missing section headers at 47552
pgd.0x4572000.start.0x7fd34129d000.dmp: data
pgd.0x4572000.start.0x7fd3414a8000.dmp: data
pgd.0x4572000.start.0x7fd3414b9000.dmp: data
pgd.0x4572000.start.0x7fd3416be000.dmp: data
pgd.0x4572000.start.0x7fd3418c8000.dmp: data
pgd.0x4572000.start.0x7fd3418f5000.dmp: data
pgd.0x4572000.start.0x7fd34190d000.dmp: data
pgd.0x4572000.start.0x7fd341921000.dmp: data
pgd.0x4572000.start.0x7fd341931000.dmp: data
pgd.0x4572000.start.0x7fd341967000.dmp: data
pgd.0x4572000.start.0x7fd341973000.dmp: zlib compressed data
pgd.0x4572000.start.0x7fd341999000.dmp: data
pgd.0x4572000.start.0x7fd3419b2000.dmp: data
pgd.0x4572000.start.0x7fd3419d6000.dmp: data
pgd.0x4572000.start.0x7fd341a02000.dmp: data
pgd.0x4572000.start.0x7fd341a0f000.dmp: data
pgd.0x4572000.start.0x7fd341c4b000.dmp: data
pgd.0x4572000.start.0x7fd341e56000.dmp: data
pgd.0x4572000.start.0x7fd342060000.dmp: data
pgd.0x4572000.start.0x7fd342075000.dmp: data
pgd.0x4572000.start.0x7fffc716c000.dmp: data
pgd.0x4572000.start.0x7fffc71ff000.dmp: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=528965576148051e8930732ea044bbf35982a785, stripped

Thanks! 🦊

ikelos

It somewhat feels like this plugin is jumping through hoops to make use of the scanner framework? If it's not useful (because it's agnostic of the layer it's scanning) then we can just implement something similar that runs through all the pages manually? The main benefit of the scanner is page overlaps and that will never be a problem here, so it's almost overkill to try and use it?

Otherwise this seems pretty cool. There's a number of places where you should carefully double check that start + length and end do mean the same thing and there isn't an off by one error. They usually only turn up years down the line, which is why I'm mentioning checking them twice now before it goes in.
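One way to make the start + length versus end check concrete is to pick a single convention and encode it in small helpers. The sketch below assumes half-open regions, where end is the first address after the region; the helper names are hypothetical and are not taken from the plugin.

```python
from typing import Tuple

Region = Tuple[int, int]  # (start, length)


def region_end(region: Region) -> int:
    """Exclusive end: the first address after the region."""
    start, length = region
    return start + length


def last_byte(region: Region) -> int:
    """Inclusive end: the last address that belongs to the region."""
    return region_end(region) - 1


def touches(first: Region, second: Region) -> bool:
    """True when second begins exactly where first stops (no gap, no overlap)."""
    return region_end(first) == second[0]


a = (0x1223000, 0x1000)
b = (0x1224000, 0x1000)
print(hex(region_end(a)), hex(last_byte(a)), touches(a, b))
```

Mixing the two conventions, for example comparing last_byte(a) with the start of b, would report a one-byte gap between pages that are in fact adjacent.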

It feels like you should be able to use the structures to identify different types of tables if needed, but I don't feel this will be too hard to extend out to other architectures. Just one minor check needs adding and then I think it can go in if you're happy with it?

ikelos · 2024-11-12T23:12:07Z

volatility3/framework/plugins/pgdscan.py

+        current_start, current_length = sorted_mappings[0]
+        current_end = current_start + current_length
+
+        for start, length in sorted_mappings[1:]:


This requires that sorted_mappings contains at least one element, since sorted_mappings[0] raises an IndexError on an empty list. Might be worth a check before we call this (and if there is only one element, presumably you can just return that one).

ikelos · 2024-11-12T23:18:56Z

volatility3/framework/plugins/pgdscan.py

+        # this is the string used page struct to pack the full page of pointers into ints
+        self._pack_string = (
+            self._intel_class._entry_format[0]
+            + self._intel_class._entry_format[1] * self._number_of_pointers_per_page


Also somewhat hacky, but you could presumably just copy the last character _number_of_pointers_per_page - 1 times. Still kinda hacky (and it still relies on the format being a single letter), but that's likely, and it allows for both an alignment and a no-alignment value.

Yup that's a nice idea.
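A minimal sketch of that suggestion, using only the standard struct module. The helper name `build_page_format` is hypothetical, and the example entry format "<Q" (a little-endian 64-bit entry) is an assumption about what the layer class provides.

```python
import struct


def build_page_format(entry_format: str, entries_per_page: int) -> str:
    """Repeat the type character so one format string covers a whole page.

    Works with or without a leading byte-order character ("<Q" or "Q"),
    but still assumes the type itself is a single letter.
    """
    return entry_format + entry_format[-1] * (entries_per_page - 1)


fmt = build_page_format("<Q", 512)
print(struct.calcsize(fmt))        # bytes consumed by one unpack call

page = bytes(4096)
entries = struct.unpack(fmt, page)
print(len(entries))                # number of decoded entries
```

struct also accepts a repeat count, so "<512Q" describes the same layout; building it would mean inserting the count between the byte-order character and the type letter.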

ikelos · 2024-11-12T23:23:41Z

volatility3/framework/plugins/pgdscan.py

+        ):
+            return None
+
+        # read size from layer strcutre


Typo: structure

ikelos · 2024-11-12T23:26:41Z

volatility3/framework/plugins/pgdscan.py

+            ),
+            requirements.BooleanRequirement(
+                name="save-configs",
+                description="Save configuration JSON file to a file for each recovered PGD",


Keep an eye out for enhancements to the config system that should allow configs to be more reusable across plugins that have different requirements (TranslationLayerRequirement rather than ModuleRequirement, for example).

ikelos · 2024-11-12T23:29:05Z

volatility3/framework/plugins/pgdscan.py

+        layer = self.context.layers[self.config["primary"]]
+
+        # Try to move down to the highest physical layer
+        if layer.config.get("memory_layer"):


We don't yet have a suitable way of guaranteeing this is the lower layer (and this may not work if the lower layer has been swapped out, etc), but until we have something better this is ok. Be nice to flag it with a FIXME or a TODO, just so we can find it again in the future...

Yes - it's something that does pop up a fair bit. I couldn't see an issue tracking it. Do you think it's worthwhile making one? (e.g. so that it's "TODO: Re issue XXXX update to a more suitable way of guaranteeing this is the lower layer")

Yeah, we never explicitly made one, but it might be good to see how many other issues might depend on it? Happy for you to spin that up, or shout and I can do it too...
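To make the workaround under discussion concrete, here is a sketch that follows "memory_layer" references downwards until a layer has none. It deliberately uses a stand-in `FakeLayer` class rather than real framework objects, so every name in it except the "memory_layer" config key is hypothetical. It has the same limitation noted above: nothing guarantees the layer reached this way is really the physical one.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FakeLayer:
    """Hypothetical stand-in for a layer; only its config dict matters here."""
    name: str
    config: Dict[str, str] = field(default_factory=dict)


def lowest_layer_name(layers: Dict[str, FakeLayer], start: str) -> str:
    """Follow 'memory_layer' config entries down as far as they go."""
    # TODO: replace with a reliable way of identifying the physical layer.
    seen = set()
    name = start
    while name not in seen:
        seen.add(name)
        below = layers[name].config.get("memory_layer")
        if not below or below not in layers:
            return name
        name = below
    raise ValueError("layer configuration contains a cycle")


layers = {
    "primary": FakeLayer("primary", {"memory_layer": "memory_layer"}),
    "memory_layer": FakeLayer("memory_layer", {"memory_layer": "base_layer"}),
    "base_layer": FakeLayer("base_layer"),
}
print(lowest_layer_name(layers, "primary"))
```

The diff above takes a single step down with an if; the loop here is a variation that keeps descending through stacked layers.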

ikelos · 2024-11-12T23:34:08Z

volatility3/framework/plugins/pgdscan.py

+            # build a new layer for this likely pgd
+            temp_context = self.context.clone()
+            temp_layer_name = self.context.layers.free_layer_name("IntelLayer")
+            # temp_layer_name = "primary" # I would like to use the name primary but not sure how?


I think if you just use a prefix of primary rather than IntelLayer, it should do it as long as that layer doesn't already exist (otherwise it'll come out as primary1).

I will have a play - from memory I think a layer with the 'primary' name already exists (at least in my test samples)

ikelos · 2024-11-12T23:37:58Z

volatility3/framework/plugins/pgdscan.py

+                # TODO: Fix this. It seems like an ungly hack and must to the wrong way
+                # to make a new config with a new primary layer?
+                conf = {}
+                for key, value in dict(temp_layer.build_configuration()).items():


That's how I would do it (and have done it). Definitely kind of hacky, but I'm working on making the components of a config more reusable (by tagging their requirement type so it can be applied to "best guess" requirements of a similar type).

    new_config = {}
    config_dict = dict(primary.build_configuration())
    for entry in config_dict:
        # Volatility 1.2 support
        new_config["kernel.layer_name." + entry] = config_dict[entry]
        # Volatility <1.2 support
        new_config["primary." + entry] = config_dict[entry]
    json_str = json.dumps(new_config, sort_keys=True, indent=2)

I mean if it's how you would have thought to do it that's got to be a compliment! :D I'll reword the TODO so it's worded more professionally and make a note to revisit it when you get time to add those config bits.

eve-mem · 2024-11-13T09:21:38Z

Yeah, I made the scanner mostly because it seemed like the right thing to do, but I could just run through the layer manually (that's exactly what my scruffy volshell script that inspired this plugin does). I'll rejig it.

Re other architectures, I do think it would be fairly easy to add them - I just don't have any samples to test with (and I've been too lazy so far to make one). I've also got much, much less experience with them. I think in the last 5 years I've only ever seen Intel32e... 🙈

Generic: Add first attempt at pgdscan plugin

e4072c7

github-advanced-security bot found potential problems Oct 25, 2024

eve-mem added 2 commits November 12, 2024 06:34

Add _merge_mappings_with_gap to pgdscan

6195eeb

Fix unused an duplicate imports in pgdscan

18ecbc7

ikelos reviewed Nov 12, 2024

eve-mem mentioned this pull request Nov 14, 2024

Enhancement: Introduce a Method to Automatically Identify the Physical Memory Layer #1351

Open
