Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Linux: Add support for threads in both lsof and sockstat plugins. #1263

Conversation

gcmoreira
Copy link
Contributor

This PR includes:

  • lsof plugin: source code refactored.
  • lsof plugin: Added the device column to complete the inode information. An inode number is specific to the filesystem/device it belongs to.
  • lsof/sockstat plugins: Add threads support.
    • Threads may or may not share the file descriptor table with the thread group leader, depending on whether the CLONE_FILES flag is included in the clone() syscall.
    • Also, once started, a thread can unshare the fd table with its parent. Refer to the unshare() libc syscall wrapper man page, unshare(2).
    • Additionally, note that the Linux lsof command in user space includes thread listings by default as well.
    • Now, there are two columns to identify the thread group ID (PID) and the task/thread ID (TID).
  • Added inode getters from both, the dentry and file structs. From kernels +3.9 the file struct caches the inode pointer. So, when possible, we get this value.
  • Improve smear protection in these plugins and various APIs (Pointers smear memory protection #1243)
pyhon3 ./vol.py -r pretty \
    ../linux-sample-1.bin \
    linux.lsof 
Volatility 3 Framework 2.10.0
  |  PID |  TID |         Process |  FD |                       Path | Device |      Inode | Type |       Mode |                        Changed |                       Modified |                       Accessed |    Size
* |  322 |  322 |           udevd |   0 |                  /dev/null |    0:5 |       1520 |  CHR | crw-rw-rw- | 2014-06-24 10:22:36.468000 UTC | 2014-06-24 10:22:36.468000 UTC | 2014-06-24 10:22:36.468000 UTC |       0
...
* | 3968 | 3973 |           gdbus |  17 |             /tmp/vteNVRRHX |    8:1 |       7132 |  REG | -rw------- | 2014-06-24 10:41:49.976800 UTC | 2014-06-24 10:41:49.976800 UTC | 2014-06-24 10:41:46.448800 UTC |   29281
* | 3968 | 3974 |    dconf worker |   1 | /home/vol/.xsession-errors |    8:1 |     928155 |  REG | -rw------- | 2014-06-24 13:00:20.836818 UTC | 2014-06-24 13:00:20.836818 UTC | 2014-06-24 10:29:34.176798 UTC |    5249
...
python3 ./vol.py -r pretty \
    ../linux-sample-1.bin \
    linux.sockstat.Sockstat 
Volatility 3 Framework 2.10.0     
  | NetNS |  PID |  TID | FD |    Sock Offset |       Family |      Type |                  Proto |                             Source Addr | Source Port |                       Destination Addr | Destination Port |       State |                                         Filter
* |     - |  322 |  322 |  3 | 0x88001b61b0c0 |      AF_UNIX | SEQPACKET |                      - |                       /run/udev/control |        3693 |                                      - |                - | UNCONNECTED |                                              -
...
* |     - | 2652 | 2657 |  7 | 0x88001adf0880 |      AF_UNIX |    STREAM |                      - |                                       - |        7183 |        /var/run/dbus/system_bus_socket |             7184 | ESTABLISHED |                                              -
* |     - | 2652 | 2657 |  9 | 0x88001adf0200 |      AF_UNIX |    STREAM |                      - |                                       - |        7216 |        /var/run/dbus/system_bus_socket |             7217 | ESTABLISHED |                                              -

- lsof plugin: source code refactored
- lsof plugin: Added the 'device' column to complete the inode information. An inode number is specific to the filesystem/device it belongs to.
- lsof/sockstat plugins: Add threads support. Threads may or may not share the file descriptor table with the thread group leader, depending on whether the CLONE_FILES flag is included in the clone() syscall. Also, once started, a thread can unshare the fd table with its parent. Refer to the unshare() libc syscall wrapper man page, unshare(2). Additionally, note that the Linux lsof command in user space includes thread listings by default as well. Now, there are two columns to identify the thread group ID  (PID) and the task/thread ID (TID).
- Added inode getters from both, the dentry and file structs. From kernels +3.9 the file struct cached the inode pointer. So, when possible, we get this value.
- Improve smear protection in these plugins and various APIs (volatilityfoundation#1243)
Copy link
Member

@ikelos ikelos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good, just needs the merge conflicts (around process name and the pid/tid return) fixing up please...

from typing import List, Callable
import logging
from datetime import datetime
from dataclasses import dataclass, astuple, field
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know these are built-in, but please can we stick to the style guide we use, which says all imports should be modules rather than methods, classes or objects please (typing gets a free pass)? This prevents people importing lsof.astuple and thinking that's where it's defined. See https://google.github.io/styleguide/pyguide.html#22-import

This also applies to datetime which was previously correctly imported.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm sorry but that's the Google style guide.. is it mentioned anywhere that we stick to Google's? why?

We should stick to the Python's styles guide which is the PEP8 ... PEP 8 – Style Guide for Python Code , for which that's perfectly fine.. actually there is an example that shows that's correct:

# Correct:
from subprocess import Popen, PIPE

Also for classes:

When importing a class from a class-containing module, it’s usually okay to spell this:

from myclass import MyClass
from foo.bar.yourclass import YourClass

If this spelling causes local name clashes, then spell them explicitly:

import myclass
import foo.bar.yourclass

and use myclass.MyClass and foo.bar.yourclass.YourClass.

Copy link
Member

@ikelos ikelos Oct 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually from the Volatility style guide from 2014 which is mostly in alignment with PEP8 with some extra requests. I thought that was a bit terse and difficult to read, so I pointed you at Google's style guide which is longer with more discussion, but here's the original text:

  • Importing specific objects or functions is discouraged.
    ** It pollutes the namespace and causes confusion. Previous versions of volatility allowed people to import functions from files they weren't defined in.
  • "from blah import *" is strongly discouraged.
    ** Again, namespace pollution and inappropriate imports.

The entire rest of the code base is formatted that way and it's better to have a style guide than not. I honestly didn't expect this request to be a big sticking point? Can you explain why allowing namespace pollution is a better stylistic option than our current one, or is your disagreement just that the documentation was provided by Google and that code comes out slightly longer?

The style guide needs updating (it still includes formatting that's handled by black now, rather than yapf at the time) and it could do with reformatting since some justifications are at the same level as the bullet points they're justifying.

Copy link
Contributor Author

@gcmoreira gcmoreira Oct 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries. I will take care of this.

I'm not saying it's better, but it's unclear to me why this is a concern now. I've never heard it mentioned in any of my previous contributions or by others.

We'll also need to fix this, if we want to make an exception with Typing, in about 50 other imports. Otherwise, there are 244 imports to address.

Also, Volatility2 is a separate project, so we can't expect people to refer to it for guidance on writing code for Volatility3. Volatility3 should have its own dedicated style guide page, or clear instructions of which style guide we use.

I wonder if there's a way to include that as a rule in Black. Any idea? By the way, Black follows Python's PEP8 style guide.

I also learned we have a .style.yapf file, which configures the Yapf Python formatted from Google.

This .style.yapf file doesn't specify any based_on_style, which means it's defaulting to PEP8 as the style guide. However, it seems we can switch it to Google's style guide if preferred. See formatting-style

based_on_style = google

Maybe, if we are going to use Google's style guide we should use Yapf instead of Black.
Otherwise, it might be a good idea to remove that file, since it creates confusion about which formatter the project actually uses.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can go back through previous pull requests. Where I've spotted it I should have always asked for it to be corrected (and provided the rationale). Most people stick with it, I'm disappointed I missed it for 244 imports! 5:( It's possible I had quibbled a lot about the PRs and was feeling sorry for them having to fix so many things, but I feel it's saved us for lots of cross-imports between plugins and the core so I think it's a worthwhile guideline to have.

Yep, fair point, the style guide never got pulled across to vol 3 I don't think and I didn't want to sit down and rewrite it again (I didn't even tidy up the one we had), but we should.

I don't think black has that as a flag (I don't think black has many flags, to be fair). If you find one, it would be great to apply it to our automatic black check, so that it's applied to all PRs evenly when people submit them!

Yeah, that's the style mechanism we used very early on, it was deprecated years ago but I guess I didn't remove the file. I've taken care of that now. Yapf was very bad, it would reweight something, and then because of the reweighting running it again would change it back, meaning it never reached a stable equilibrium. It's the reason we moved to black.

We don't use the Google style guide exclusively, but there are a couple of points in it that make sense, such as only importing modules (in fact, that's the main one I can think of), which we've carried over. As mentioned the main style guide is in general PEP 8.

@gcmoreira gcmoreira force-pushed the linux_lsof_refactoring_fixes_and_improvements branch from ddd8de5 to 32eaeeb Compare October 4, 2024 01:27
@gcmoreira
Copy link
Contributor Author

Ok, after a very long day fighting with my friend GIT and the changes in #1271 + related commits.. I think I managed to resolve all the conflicts

@gcmoreira
Copy link
Contributor Author

@ikelos all done

Copy link
Member

@ikelos ikelos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks for making the changes!

@ikelos ikelos merged commit eb74437 into volatilityfoundation:develop Oct 7, 2024
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants