
Speedup page class attributes accesses with shared lru_cache #1528

Open
Wants to merge 1 commit into develop.

Conversation

Abyss-W4tcher (Contributor)

Hi,

While working on new capabilities involving the instantiation of thousands of page objects, I noticed that _intel_vmemmap_start was a severe bottleneck in the get_contents call:

# Nice tool to trace execution and spot functions execution time
$ viztracer --log_func_exec "get_contents" -- vol.py -f sample.bin linux.pagecache.DumpFs
  • get_contents time: ~31 ms
    • _intel_vmemmap_start: ~30ms

We were previously using functools.cached_property, but it is only effective per page instance, which doesn't help here: most of the time we only dump a page's contents before discarding the object. To improve the design, I tied each page object's hash to its vmlinux hash, which functools checks to decide whether to hit the cache. This prevents overlaps between multiple vmlinux in the same sample, and only applies to the needed attributes.

After this patch, get_contents execution time dropped to ~1 ms (on cache hits after the first call).
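In lru_cache terms, the idea can be sketched as follows. This is a minimal standalone sketch: Page, vmemmap_start, and the stand-in computation are illustrative names, not the real volatility3 API.

```python
import functools

# Count how many times the expensive computation actually runs.
CALLS = {"count": 0}

class Page:
    """Illustrative page object whose hash is tied to its vmlinux."""

    def __init__(self, vmlinux):
        self.vmlinux = vmlinux

    def __hash__(self):
        # Pages from the same vmlinux hash identically, so they share
        # one lru_cache slot.
        return hash(self.vmlinux)

    def __eq__(self, other):
        return isinstance(other, Page) and hash(self) == hash(other)

    @property
    @functools.lru_cache
    def vmemmap_start(self):
        # Stand-in for the expensive (~30 ms) _intel_vmemmap_start scan.
        CALLS["count"] += 1
        return sum(ord(c) for c in self.vmlinux) * 0x1000

page1 = Page("vmlinux-A")
page2 = Page("vmlinux-A")   # same kernel -> shared cache hit
page3 = Page("vmlinux-B")   # different kernel -> recomputed

_ = page1.vmemmap_start, page2.vmemmap_start, page3.vmemmap_start
print(CALLS["count"])  # 2: one computation per distinct vmlinux
```

Note that lru_cache keys on the instance itself, so the overridden __hash__ and __eq__ are exactly what make the cache shared across instances.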

@Abyss-W4tcher Abyss-W4tcher changed the title Speedup page attributes accesses with shared lru_cache Speedup page class attributes accesses with shared lru_cache Jan 4, 2025
@ikelos (Member) left a comment


This one looks OK to me, but I'd like a second opinion from @atcuno or @gcmoreira, who understand the Linux side of things better. If these values are per layer, then shouldn't the caller cache them? I'm just wary of over-engineering the solution, and __eq__ and __hash__ need careful planning when being overridden to make sure unexpected consequences don't crop up. I would think getting the module from the object in the hash would be very slow? If the caching isn't helping, how does using lru_cache improve the situation?

@ikelos ikelos requested review from atcuno and gcmoreira January 5, 2025 21:29
Abyss-W4tcher (Contributor, Author) commented Jan 5, 2025

  • This design is a glorified PAGE_ATTRS_GLOBAL_CACHE = {} setup, in which we could have {vmlinux.__hash__(): {"vmemmap_start": XXX, "pageflags_enum": YYY}}. I understand the need for careful planning, as overriding __hash__ and __eq__ could be problematic if someone tried to compare page instances as Python objects (instead of by my_page.vol.offset, which is the right way to do it).

  • As observed with viztracer, the call to retrieve the vmlinux module took on the order of microseconds. Indeed, if we look at get_module_from_volobj_type, it only iterates over self._modules and does simple attribute accesses.

  • Here is the current caching mechanism (pseudo-code ish):

page1 = page(vmlinux=myvmlinux)
# no cache hit, ~30 ms
page1._intel_vmemmap_start
# cache hit thanks to functools.cached_property, instant result
page1._intel_vmemmap_start

page2 = page(vmlinux=myvmlinux)
# no cache hit, ~30 ms
page2._intel_vmemmap_start
# cache hit thanks to functools.cached_property, instant result
page2._intel_vmemmap_start

As you can see, the cache only applies to a single page instance, but we know for sure that _intel_vmemmap_start will be identical for every page tied to a specific vmlinux. So, in the end, with the proposed tweak and functools.lru_cache:

page1 = page(vmlinux=myvmlinux)
# no cache hit, ~30 ms
page1._intel_vmemmap_start
# cache hit thanks to functools.lru_cache, instant result
page1._intel_vmemmap_start

page2 = page(vmlinux=myvmlinux)
# cache hit thanks to functools.lru_cache, instant result <--
page2._intel_vmemmap_start

# Different vmlinux
page3 = page(vmlinux=myvmlinux_2)
# no cache hit, ~30 ms
page3._intel_vmemmap_start
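The "glorified PAGE_ATTRS_GLOBAL_CACHE" analogy mentioned above can be written out as a plain dict keyed on the vmlinux hash. Illustrative names and values, not real kernel data:

```python
# Dict-based equivalent of what the shared lru_cache achieves:
# one entry per distinct vmlinux, holding the cached attributes.
PAGE_ATTRS_GLOBAL_CACHE = {}

def _expensive_scan(vmlinux):
    # Stand-in for the ~30 ms vmemmap_start computation.
    return sum(ord(c) for c in vmlinux) * 0x1000

def vmemmap_start_for(vmlinux):
    entry = PAGE_ATTRS_GLOBAL_CACHE.setdefault(hash(vmlinux), {})
    if "vmemmap_start" not in entry:
        # Only the first lookup per vmlinux pays the full cost.
        entry["vmemmap_start"] = _expensive_scan(vmlinux)
    return entry["vmemmap_start"]

a = vmemmap_start_for("vmlinux-A")  # computed
b = vmemmap_start_for("vmlinux-A")  # cache hit
print(a == b)  # True
```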

@ikelos (Member) commented Jan 7, 2025

But in your example, page1 == page2 is True, isn't it? That seems very bad to me. Overriding __eq__ for a slight speed-up, so that equality no longer means what it's supposed to mean, is very bad. No other object in the framework explicitly requires that you verify the offsets match, because that's what __eq__ is supposed to do!

Surely the caller can cache that more effectively than altering the reality of equality within the framework for that type of object alone, to save some milliseconds? If it isn't going to change, then looking it up once for the kernel should be fine?

It probably also shouldn't be part of the object at all; the giveaway is that self is only used within the method to recover the kernel. This is another reason why giving objects methods they don't need isn't always useful, even if it's convenient (regarding our conversations around module). So the more I look at this, the more I think that method is completely misplaced and needs moving, rather than shuffling around the equality and hashing of the object...
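The equality concern is easy to reproduce in isolation. A hypothetical minimal Page class mirroring the PR's __hash__/__eq__ override:

```python
# Sketch of the equality hazard: with __eq__ keyed on the vmlinux
# rather than the page offset, two distinct pages compare equal.
class Page:
    def __init__(self, vmlinux, offset):
        self.vmlinux = vmlinux
        self.offset = offset

    def __hash__(self):
        return hash(self.vmlinux)

    def __eq__(self, other):
        return isinstance(other, Page) and hash(self) == hash(other)

p1 = Page("vmlinux-A", offset=0x1000)
p2 = Page("vmlinux-A", offset=0x2000)
print(p1 == p2)  # True, even though the pages live at different offsets
```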

@ikelos ikelos self-requested a review January 7, 2025 00:59
@ikelos (Member) left a comment

I can't let this go in, there are too many things that don't work about it, sorry... 5:S

return isinstance(other, page) and self.__hash__() == other.__hash__()

# Use the vmlinux to identify cache pools.
def __hash__(self):
ikelos (Member):
Redefining these is bad (see main comments).

@@ -2545,7 +2566,8 @@ def pageflags_enum(self) -> Dict:

return pageflags_enum

@functools.cached_property
@property
@functools.lru_cache
ikelos (Member):
This method doesn't need to be a property at all, it only uses self to find the kernel, if anything this should be a LinuxUtility that takes in a kernel, or live somewhere else, there's no benefit to it living here.
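A sketch of that alternative, assuming a free function keyed on the kernel object (illustrative names, not the real volatility3 API): lru_cache then keys on the kernel itself, and page objects keep their default equality and hashing semantics.

```python
import functools

# Track how many times the expensive scan actually runs.
calls = []

def _expensive_scan(vmlinux):
    # Stand-in for the real ~30 ms vmemmap_start computation.
    calls.append(vmlinux)
    return hash(vmlinux) & ~0xFFF

@functools.lru_cache(maxsize=None)
def intel_vmemmap_start(vmlinux):
    # Cached once per distinct kernel object; no __eq__/__hash__
    # overrides needed anywhere.
    return _expensive_scan(vmlinux)

a = intel_vmemmap_start("vmlinux-A")  # computed
b = intel_vmemmap_start("vmlinux-A")  # cache hit, no second scan
print(len(calls))  # 1
```

Callers pass the kernel explicitly, which also makes the per-kernel lifetime of the cached value obvious at the call site.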

Abyss-W4tcher (Contributor, Author) commented Jan 7, 2025

I understand, I will think about moving it somewhere else, maybe directly in this PR. It could end up being a good example of what a first idea looks like and how it is finally implemented.
