-
Notifications
You must be signed in to change notification settings - Fork 481
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new plugin: windows.notepad.Notepad #1082
base: develop
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've got to say, I'm pretty uncomfortable about this plugin? I wouldn't ever add it to framework/plugins
. It feels as though it's entirely heuristic based without any formal forensic analysis or backing. It might work occasionally but I'm concerned about adding it to what is otherwise a forensic framework that may be relied upon for evidence. It also feels as though this could be applied to programs other than notepad and still return partial results? Is this notepad specific or just a general text search through program VADs?
At the moment this is still in limbo and I'm not sure whether it'll ever get added, I need to think over it and discuss it with the others a bit I'm afraid...
proc_id = "Unknown" | ||
try: | ||
proc_id = proc.UniqueProcessId | ||
proc_layer_name = proc.add_process_layer() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer proc_layer_name being given a bogus default initalization, much like proc_id
above, although proc_id
above should initialize to a number or None
. To support the unknown
name, change the debug message to be proc_id or 'Unknown'
although note that if proc_id
is then 0 this will result in the layer being referred to as Unknown.
if private and protection == 'PAGE_READWRITE' and vad_size < 16777216 and tag in ('VadS', 'Vadm'): | ||
try: | ||
data = proc_layer.read(vad_start, vad_size, pad=True) | ||
if not data: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're padding, so this will always return a binary string of vad_size
, so this check should'nt ever fire unless vad_size
is 0, which you could check before attempting the read, so this doesn't seem useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see, that makes sense. It was part of the code from the VadInfo plugin that read the VADs in chunks. I've removed the chunking and forgot to this check, so it wasn't intentional, and even if it would fire, it would break out of the loop over the VAD list, which I certaintly didn't intend to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot to add this first time round, but this much data doesn't belong in the program code, it should be in auxiliary file. Also pointing out the code where it came from, where the code is just another arbitrary program doesn't really act as evidence that this is a suitable or accurate means of identifying appropriate strings...
|
||
# I need this huge set for filtering out garbage strings, as I rely on a memory pattern to find the displayed text | ||
# Taken from https://github.com/glmcdona/binary2strings/blob/02f0c983aeede7ed9acbee96a43f9142dc7ba404/src/binary2strings.hpp#L57 | ||
COMMONCRAWL_CHARS = {0x20, 0x65, 0x61, 0x69, 0x6f, 0x74, 0x6e, 0x72, 0x73, 0xa, 0x6c, 0x64, 0x63, 0x75, 0x68, 0x6d, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This large a chunk of data doesn't belong in the plugin source, it should be an additional data file.
Yeah, totally understandable, I agree with you. Being based exclusively on heuristics of where the searched text might be, this is not the best approach for a forensics framework that aims to produce reliable output. I plan on studying NT and LFH heap internals so maybe I could figure something out with that, but for now I think it would be best to make this PR a draft. I'm very much a beginner in writing code to work with memory dumps, and I'm eager to learn, so I hope I don't abandon this idea :D |
That sounds like the best course of action, that way people can find and use the code if they want it, but it's clearly not part of the main codebase. Good luck with the heap support! I believe the volatility 2 notepad plugin only works up to a particular version of windows, but if you can find one that legitimately reads the right structures to get the actual contents then we'd love to have it included as a core plugin... Thanks very much for your contribution, we're happy to leave it here in draft for people to find and we look forward to seeing what you come up with in the future! 5:D |
Adds a new plugin that is an alternative to volatility2's notepad plugin. Instead of parsing heap structures, it uses simple memory pattern matching to find potential places where the displayed text might reside. It can produce false positives, but I've tried to reduce them to a minimum by filtering strings shorter or equal to 8 characters, and using a table of unicode characters seen in CommonCrawl to remove strings that contain unicode characters that are technically valid, but didn't appear even once in the wild web.
Partially addresses issue #710.