Linux basic support for scanned tasks #1012

Draft
wants to merge 3 commits into
base: develop

eve-mem
Contributor

@eve-mem eve-mem commented Oct 5, 2023

Hello!

This PR adds some basic support for tasks found via scanning to some of the existing Linux plugins. I've not done them all as I wanted views on this first. Is this a good way to do it? Would it be better to allow users to provide a physical address to the plugins rather than using the linux.psscan plugin?

It would help work towards 924.

I've added the support in slightly different ways across the existing plugins; perhaps one method is best, or perhaps it doesn't really matter.

Here is an example of what this would allow, using the linux-sample-5 file (sha1: d1edf3635f2726033a81fab12364e03a111bba74).

First see that pid 2282 is not found with the normal pslist plugin:

python vol.py -f linux-sample-5.dmp linux.pslist --pid 2282
Volatility 3 Framework 2.5.1
Progress:  100.00               Stacking attempts finished
OFFSET (V)      PID     TID     PPID    COMM    File output

It therefore can't be used with lsof etc., as this pid isn't found:

python vol.py -f linux-sample-5.dmp linux.lsof --pid 2282
Volatility 3 Framework 2.5.1
Progress:  100.00               Stacking attempts finished
PID     Process FD      Path

<no results>

python vol.py -f linux-sample-5.dmp linux.elfs --pid 2282
Volatility 3 Framework 2.5.1
Progress:  100.00               Stacking attempts finished
PID     Process Start   End     File Path       File Output

<no results>

python vol.py -f linux-sample-5.dmp linux.psaux --pid 2282
Volatility 3 Framework 2.5.1
Progress:  100.00               Stacking attempts finished
PID     PPID    COMM    ARGS

<no results>

However it is found with psscan:

$ python vol.py -f linux-sample-5.dmp linux.psscan --pid 2282
Volatility 3 Framework 2.5.1
Progress:  100.00               Stacking attempts finished
OFFSET (P)      PID     TID     PPID    COMM    EXIT_STATE

0x1d24c7c0      2262    2282    2257    apache2 TASK_RUNNING

Passing the --scan option to the plugins that have been modified (lsof, psaux, and elfs) means that the results for this task are displayed:

lsof:

python vol.py -f linux-sample-5.dmp linux.lsof --scan --pid 2282
Volatility 3 Framework 2.5.1
Progress:  100.00               Stacking attempts finished
PID     Process FD      Path

2282    apache2 0       /dev/null
2282    apache2 1       /dev/null
2282    apache2 2       /var/log/apache2/error.log
2282    apache2 3       socket:[6225]
2282    apache2 4       socket:[6226]
2282    apache2 5       pipe:[6237]
2282    apache2 6       pipe:[6237]
2282    apache2 7       /var/log/apache2/other_vhosts_access.log
2282    apache2 8       /var/log/apache2/access.log
2282    apache2 10      anon_inode:[1518]

psaux:

python vol.py -f linux-sample-5.dmp linux.psaux --scan --pid 2282
Volatility 3 Framework 2.5.1
Progress:  100.00               Stacking attempts finished
PID     PPID    COMM    ARGS

2282    2257    apache2 /usr/sbin/apache2 -k start

elfs:

python vol.py -f linux-sample-5.dmp linux.elfs --scan --pid 2282
Volatility 3 Framework 2.5.1
Progress:  100.00               Stacking attempts finished
PID     Process Start   End     File Path       File Output

2282    apache2 0x7fd330a17000  0x7fd330a2c000  /lib/x86_64-linux-gnu/libgcc_s.so.1     Disabled
2282    apache2 0x7fd33e45a000  0x7fd33e465000  /lib/x86_64-linux-gnu/libnss_files-2.13.so      Disabled
2282    apache2 0x7fd33e666000  0x7fd33e670000  /lib/x86_64-linux-gnu/libnss_nis-2.13.so        Disabled
2282    apache2 0x7fd33e871000  0x7fd33e886000  /lib/x86_64-linux-gnu/libnsl-2.13.so    Disabled
2282    apache2 0x7fd33ea89000  0x7fd33ea90000  /lib/x86_64-linux-gnu/libnss_compat-2.13.so     Disabled
2282    apache2 0x7fd33ec91000  0x7fd33ec95000  /usr/lib/apache2/modules/mod_status.so  Disabled
2282    apache2 0x7fd33ee97000  0x7fd33ee9a000  /usr/lib/apache2/modules/mod_setenvif.so        Disabled
2282    apache2 0x7fd33f09b000  0x7fd33f09e000  /usr/lib/apache2/modules/mod_reqtimeout.so      Disabled
2282    apache2 0x7fd33f29f000  0x7fd33f2a7000  /usr/lib/apache2/modules/mod_negotiation.so     Disabled
<SNIP>

@eve-mem eve-mem marked this pull request as draft October 5, 2023 09:08
@eve-mem
Contributor Author

eve-mem commented Oct 5, 2023

Sorry - as soon as I submitted this I wanted to double check that this apache task wasn't actually a thread and that's why lsof, etc can't see it.

It turns out it is a thread, so when the --pid filter is passed to them they can't find the task, because of how pid vs tgid is displayed by different plugins (e.g. issue #981).

So it's not the ability to scan for tasks that is helping find these extra bits of information; they would have been displayed if no pid filter had been given to the existing plugins. I need to find a good example where scanning for tasks actually finds more information before this should be looked at.

@gcmoreira
Contributor

Hey @eve-mem! There is still something weird here.

In your first output:

$ python vol.py -f linux-sample-5.dmp linux.pslist --pid 2282
Volatility 3 Framework 2.5.1
Progress:  100.00               Stacking attempts finished
OFFSET (V)      PID     TID     PPID    COMM    File output

The create_pid_filter function filters by task.pid, so even if 2282 is a TID, it should appear there.
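As a rough illustration of that behaviour, here is a minimal sketch of a filter keyed on task.pid. The `Task` dataclass and the `make_pid_filter` helper are hypothetical stand-ins written for this comment, not Volatility's actual create_pid_filter, and the sketch assumes the predicate returns True for tasks that should be filtered out.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Task:
    """Hypothetical stand-in for a kernel task_struct."""
    pid: int   # kernel task identifier (shown as TID by pslist)
    tgid: int  # thread group identifier (shown as PID by pslist)
    comm: str


def make_pid_filter(pid_list: Optional[Iterable[int]] = None) -> Callable[[Task], bool]:
    """Return a predicate that is True for tasks to be filtered OUT (assumed convention)."""
    wanted = set(pid_list or [])
    if not wanted:
        # No identifiers requested: exclude nothing.
        return lambda task: False
    return lambda task: task.pid not in wanted


# Two tasks from the thread group discussed in this thread.
tasks = [
    Task(pid=2262, tgid=2262, comm="apache2"),
    Task(pid=2282, tgid=2262, comm="apache2"),
]

is_excluded = make_pid_filter([2282])
kept = [t for t in tasks if not is_excluded(t)]
print(kept)
```

With a filter like this, asking for 2282 keeps exactly one task, provided the task source handed to the filter includes threads in the first place.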

The other strange thing in your report is that linux.psscan currently doesn't accept any arguments. Have you modified your code to add one? I'm not sure why you didn't get the following error:

$ python3 vol.py -f linux-sample-5.dmp linux.psscan --pid 2282
Volatility 3 Framework 2.5.2
usage: volatility [-h] [-c CONFIG] [--parallelism [{processes,threads,off}]] [-e EXTEND] [-p PLUGIN_DIRS] [-s SYMBOL_DIRS] [-v]
                  [-l LOG] [-o OUTPUT_DIR] [-q] [-r RENDERER] [-f FILE] [--write-config] [--save-config SAVE_CONFIG] [--clear-cache]
                  [--cache-path CACHE_PATH] [--offline] [--single-location SINGLE_LOCATION] [--stackers [STACKERS ...]]
                  [--single-swap-locations [SINGLE_SWAP_LOCATIONS ...]]
                  plugin ...
volatility: error: unrecognized arguments: --pid 2282

Could you please double-check all this again?

@eve-mem
Contributor Author

eve-mem commented Dec 12, 2023

Hello @gcmoreira - thanks for taking a look at this - even while it was marked as draft - I really appreciate it.

The --pid option for psscan is also added in this PR, which is why it isn't working for you there. Sorry for the confusion.

2282 doesn't appear in pslist because it's a thread. Filter for 2262 with --threads and you will see it there:

$ python vol.py -f linux-sample-5.dmp linux.pslist --pid 2262 --threads
Volatility 3 Framework 2.5.2
Progress:  100.00               Stacking attempts finished
OFFSET (V)      PID     TID     PPID    COMM    File output

0x88001b7ff140  2262    2262    2257    apache2 Disabled
0x88001b5e7840  2262    2267    2257    apache2 Disabled
0x88001ec540c0  2262    2268    2257    apache2 Disabled
0x88001b5fd880  2262    2269    2257    apache2 Disabled
0x880019c618c0  2262    2270    2257    apache2 Disabled
0x88001ee08140  2262    2271    2257    apache2 Disabled
0x88001b7ff840  2262    2272    2257    apache2 Disabled
0x88001f6360c0  2262    2273    2257    apache2 Disabled
0x88001ec25080  2262    2274    2257    apache2 Disabled
0x88001f4fa800  2262    2275    2257    apache2 Disabled
0x88001d2438c0  2262    2276    2257    apache2 Disabled
0x88001d2431c0  2262    2277    2257    apache2 Disabled
0x88001d246740  2262    2278    2257    apache2 Disabled
0x88001d246040  2262    2279    2257    apache2 Disabled
0x88001d249780  2262    2280    2257    apache2 Disabled
0x88001d249080  2262    2281    2257    apache2 Disabled
0x88001d24c7c0  2262    2282    2257    apache2 Disabled          <---- here it is
0x88001d24c0c0  2262    2283    2257    apache2 Disabled
0x88001d24f800  2262    2284    2257    apache2 Disabled
0x88001d24f100  2262    2285    2257    apache2 Disabled
0x88001d253840  2262    2286    2257    apache2 Disabled
0x88001d253140  2262    2287    2257    apache2 Disabled
0x88001d257880  2262    2288    2257    apache2 Disabled
0x88001d257180  2262    2289    2257    apache2 Disabled
0x88001d25a8c0  2262    2290    2257    apache2 Disabled
0x88001d25a1c0  2262    2291    2257    apache2 Disabled
0x88001d25d740  2262    2292    2257    apache2 Disabled

If you use lsof, for example, with no filter, we don't see 2282 either, but we do see 2262. When lsof etc. get their tasks from pslist, they don't pass include_threads as True.

2262    apache2 0       /dev/null
2262    apache2 1       /dev/null
2262    apache2 2       /var/log/apache2/error.log
2262    apache2 3       socket:[6225]
2262    apache2 4       socket:[6226]
2262    apache2 5       pipe:[6237]
2262    apache2 6       pipe:[6237]
2262    apache2 7       /var/log/apache2/other_vhosts_access.log
2262    apache2 8       /var/log/apache2/access.log
2262    apache2 10      anon_inode:[1518]
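The effect of not passing include_threads can be sketched with a toy task lister. `Task` and `list_tasks` here are hypothetical simplifications for illustration; only the `include_threads` name comes from the discussion above, and the real pslist code walks kernel structures rather than Python lists.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List


@dataclass
class Task:
    """Hypothetical stand-in for a task: a leader carries its other threads."""
    pid: int
    tgid: int
    threads: List["Task"] = field(default_factory=list)


def list_tasks(leaders: Iterable[Task], include_threads: bool = False) -> Iterator[Task]:
    """Yield thread group leaders, and their threads only when asked to."""
    for leader in leaders:
        yield leader
        if include_threads:
            yield from leader.threads


# A cut-down version of the apache2 thread group from the output above.
leader = Task(2262, 2262, threads=[Task(2267, 2262), Task(2282, 2262)])

default_pids = [t.pid for t in list_tasks([leader])]
with_threads = [t.pid for t in list_tasks([leader], include_threads=True)]
print(default_pids, with_threads)
```

A plugin that consumes the default listing never sees 2282 at all, so no filter applied afterwards can select it.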

What's happening is that when psscan finds the task struct for 2282, we can then pass it to lsof and the other plugins and they'll happily extract the information for it. But for all the tasks that aren't the leaders it's not really useful: we can get the same information from the leader (I don't think threads can have different open files etc.? Maybe I am wrong on that).

So if we continue with this PR, it should add checks to show the difference between threads and leaders clearly. I still need to find a sample where it's possible to scan for an actual task whose mm etc. haven't yet been cleared. If that doesn't happen, then it's not really useful to add the --scan option to the other plugins.
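The kind of check meant here could look roughly like the sketch below: keep only scanned tasks that are thread group leaders, are missing from the walked list, and still have a non-null mm. `ScannedTask` and `scan_only_candidates` are hypothetical names, and the entries for pids 4000 and 4100 are invented purely to exercise the logic; they are not from any sample.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class ScannedTask:
    """Hypothetical record for a task_struct found by scanning."""
    offset: int  # physical offset where the struct was found
    pid: int     # kernel task identifier
    tgid: int    # thread group identifier
    mm: int      # address of the mm_struct; 0 means NULL / already cleared


def scan_only_candidates(scanned: Iterable[ScannedTask],
                         listed_pids: Set[int]) -> List[ScannedTask]:
    """Scanned tasks that might yield information the list walk cannot."""
    return [
        t for t in scanned
        if t.pid == t.tgid             # thread group leader, not just a thread
        and t.pid not in listed_pids   # not already reachable via the task list
        and t.mm != 0                  # address space not yet cleared
    ]


scanned = [
    ScannedTask(0x1D24C7C0, pid=2282, tgid=2262, mm=0x1000),  # thread from this PR
    ScannedTask(0x20000000, pid=4000, tgid=4000, mm=0),       # invented: mm cleared
    ScannedTask(0x20001000, pid=4100, tgid=4100, mm=0x2000),  # invented: useful hit
]
listed_pids = {2262, 2282}  # identifiers seen by the list walk, threads included

candidates = scan_only_candidates(scanned, listed_pids)
print([hex(t.offset) for t in candidates])
```

Note that mm is also NULL for kernel threads, so a check like this would skip those as well.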

@gcmoreira
Contributor

@eve-mem very sorry, I got to this issue through another ticket and didn't notice that it actually had the draft label.

Anyway, I know it's a thread. What I am saying is:

  1. TIDs (task.pid) are unique; task.pid is actually "the" kernel task identifier. User PIDs (task.tgid) are not unique; task.tgid is a group identifier.
  2. The task filter function currently filters by TID (task.pid), which IMO is correct.

Given these two statements, and provided you haven't changed the task filter function, I believe there is a bug somewhere, because using "--pid" cannot return more than one task/row.

Given your output, it seems that either vol3 already or your new code is, for some reason, using 'task.tgid' as the task filter instead of 'task.pid'.
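The reasoning can be checked against a few rows copied from the `--pid 2262 --threads` output earlier in this thread. The `Row` tuple and `matches` helper below are hypothetical and exist only for this illustration.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    """One pslist output row, reduced to the two identifier columns."""
    tgid: int  # the PID column
    pid: int   # the TID column


# A few rows copied from the pslist --threads output above.
rows = [Row(2262, 2262), Row(2262, 2267), Row(2262, 2282), Row(2262, 2283)]


def matches(rows: List[Row], field_name: str, value: int) -> List[Row]:
    """Rows whose given identifier field equals the requested value."""
    return [r for r in rows if getattr(r, field_name) == value]


by_pid = matches(rows, "pid", 2262)    # filter on task.pid
by_tgid = matches(rows, "tgid", 2262)  # filter on task.tgid
print(len(by_pid), len(by_tgid))
```

A filter on task.pid selects a single row, while a filter on task.tgid selects the whole thread group, which is the shape of the output that was posted.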

I hope I explained it better, but don't worry. It's better you finish with your changes and we will check this once it's ready. Sorry again 🙏🏻

@eve-mem
Contributor Author

eve-mem commented Dec 13, 2023

Hello again @gcmoreira

Firstly re my comment about it being a draft PR.

I'm worried I came across as passive aggressive without meaning to. I really truly appreciate all of your, @ikelos's, and @atcuno's help, advice, and guidance. (And of course everyone else who points me in the right direction, but you three have been very generous with your time.)

The way I see it if your place of work offered a volatility3 plugin/code review service I'd be looking at spending many 1000s to get the same level of help.

Instead you're all here giving up your own time - for free - to patiently explain things to me, time you could be spending with family or on other things you'd rather be doing. So as honestly as I can get across in text form - thank you! I really mean it. Hence this extremely long comment...!

I view "ready for review" to mean: @eve-mem thinks this is probably good and correct, but if there are any mistakes or ways of doing this better I'd love to hear them.

"Draft" to mean: @eve-mem knows this PR isn't good enough to waste people's time with, but would welcome any and all comments. Bits that I think probably do need doing, but right now it's not hitting the mark.

Then "closed" on my PRs to mean: @eve-mem thought this was a good idea, but further thought or some obvious issues pointed out by others show it really isn't the right way to be doing things and it should just be forgotten completely.

Next re this PR and change as a whole.

I set this to draft as soon as I realised that the "extra" information wasn't actually new - vol was already displaying it.

I had thought I'd found a sample where being able to pass tasks found by scanning could provide extra information that wasn't easily accessible before - hurray, something useful! But I was wrong. I'd just found a thread, and vol would already show this information - so I set it to draft.

It's why I try to include snippets of output, volshell, etc. - so that when I make a mistake it's easy to point out. I also try to use the "linux-sample-N.dmp" files in the examples as they're somewhat easy to get hold of, meaning that it's easier for people to "trust but verify" by running against the same image. Hopefully that makes it easier to spot my mistakes etc.

My motivation for this PR is to slowly get all of vol2's capabilities moved across where it makes sense. I see many of the vol2 Windows plugins being able to accept an offset to objects and then provide the information on those (e.g. to save people jumping into volshell, and to lower that barrier to entry).

I thought it would make sense for the linux plugins to be able to work off tasks that have been found by scanning; maybe one found that way and not by walking the list could have some useful information. Spurred on by the linux psxview plugin from vol2, I was hoping that in some cases something is hiding from the list, or exited just recently enough, that having this option in vol3 would mean finding those extra bits of information.

However, right now I've not seen it. Maybe it'll never happen, or it's so rare that it's not worth people's time on this PR and that time is better spent on other parts of vol3.

If I do find an example where this option genuinely finds more information I'll probably make it ready again once I can prove to myself it is helping, rather than just finding threads and thinking they are whole new processes as I did before.

Right now all the tasks I see that aren't in the normal list have had mm cleared etc., meaning that even if you pointed vol3 at them there is no extra information to get - it's already been cleared.

If you or anyone else knows it'll never be possible to get extra information this way I'd happily close the PR too. Or if keeping this on the backlog as draft is cluttering things we can close it too.

Then lastly the pid/tid/tgid discussion

I'm worried I've come across as saying "volatility3 is wrong, we need to change it right away!!!!" and that's really not what I mean. I've not used the best examples (some being proxies, but actually completely wrong) and not explained myself well enough. I think for the most part everything in core is completely correct - it's just a finessing point that I'd love to see. I'm going to try and add a comment on the issue I'd raised to really cleanly explain what I mean. It'll just take me a little bit of time to make sure I'm really being clear about what I actually mean.

Thank you for spending lots of your own time responding with examples etc, that all takes a long time to do and I really appreciate it.

@ikelos
Member

ikelos commented Apr 28, 2024

Any progress on how volatility should handle pid/tid/tgid? The bug that referenced this was marked stale, and I'm keen that it doesn't languish if there's already been work done on it. A lot of our plugins should have had the functionality separated out into class methods that can just be handed a process, so hopefully changing how the processes are handed in shouldn't be too tough (the linux pslist just has simple output, but it's shared between different ways of getting the process list, if I recall correctly). So it sounds like it should be possible to get something going; the question was whether we wanted to take a bigger bite out of the whole abstract process object idea that was going to wrap all platform processes in the same (or similar) API. Can't remember where we got to with this though, so figured I'd ask... :)
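As a rough picture of that separation, the sketch below keeps the per-process logic in class methods that accept any iterable of tasks, so the caller decides whether those tasks come from walking the list or from scanning. `Task`, `LsofSketch`, and both task sources are hypothetical and heavily simplified; they do not mirror the real plugin classes.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, Tuple


@dataclass
class Task:
    """Hypothetical stand-in for a process object handed to a plugin."""
    pid: int
    comm: str
    fds: Dict[int, str] = field(default_factory=dict)


class LsofSketch:
    """Per-task logic lives in class methods and never fetches tasks itself."""

    @classmethod
    def list_fds(cls, task: Task) -> Iterator[Tuple[int, str, int, str]]:
        # Emit one row per open file descriptor, in descriptor order.
        for fd, path in sorted(task.fds.items()):
            yield task.pid, task.comm, fd, path

    @classmethod
    def rows(cls, tasks: Iterable[Task]) -> Iterator[Tuple[int, str, int, str]]:
        # Works on any task source: a list walk, a scan, or a single offset.
        for task in tasks:
            yield from cls.list_fds(task)


apache = Task(2282, "apache2", {0: "/dev/null", 2: "/var/log/apache2/error.log"})
tasks_from_list = [apache]  # stand-in for a list-walking source
tasks_from_scan = [apache]  # stand-in for a scanning source

list_rows = list(LsofSketch.rows(tasks_from_list))
scan_rows = list(LsofSketch.rows(tasks_from_scan))
print(list_rows)
```

Because the row-producing code never chooses the task source, swapping in scanned tasks would only change the caller.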

@eve-mem
Contributor Author

eve-mem commented Apr 30, 2024

Hello @ikelos - I think that the final consensus was to leave PID meaning TGID in the pslist plugin and then PID as PID in other plugins. I still feel a little weird about it, but I can see where people are coming from.

Re actually adding support for scanned tasks, it is something I'd still like to add. I've yet to find a solid example where it does actually find new and useful data that I can share and use as a reference.

I did have a go at making an overly simple generic process here, to have a single way to get some details about a process. Although it looks like I never marked it ready for review... 🤦
#1000

3 participants