Merge pull request #83 from trailofbits/readme-variant
More usability and doc improvements
Boyan-MILANOV authored Dec 23, 2023
2 parents 08f98f2 + ba83176 commit 16aa3bd
Showing 12 changed files with 208 additions and 74 deletions.
6 changes: 0 additions & 6 deletions .github/workflows/lint.yml
@@ -13,17 +13,11 @@ jobs:
      language: "python"
      python-version: "3.8"

-  lint-markdown:
-    uses: trailofbits/.github/.github/workflows/[email protected]
-    with:
-      language: "markdown"
-
  all-lints-pass:
    if: always()

    needs:
      - lint-python
-      - lint-markdown

    runs-on: ubuntu-latest

214 changes: 176 additions & 38 deletions README.md
@@ -1,40 +1,150 @@
# Fickling

![Fickling image](./fickling_image.png)

Fickling is a decompiler, static analyzer, and bytecode rewriter for Python
[pickle](https://docs.python.org/3/library/pickle.html) object serializations.
You can use fickling to detect, analyze, reverse engineer, or even create
malicious pickle or pickle-based files, including PyTorch files.

Fickling can be used both as a **Python library** and a **CLI**.

* [Installation](#installation)
* [Malicious file detection](#malicious-file-detection)
* [Advanced usage](#advanced-usage)
    * [Trace pickle execution](#trace-pickle-execution)
    * [Pickle code injection](#pickle-code-injection)
    * [Pickle decompilation](#pickle-decompilation)
    * [PyTorch polyglots](#pytorch-polyglots)
* [About pickle](#about-pickle)
* [Contact](#contact)

## Installation

Fickling has been tested on Python 3.8 through Python 3.11 and has very few dependencies.
Both the library and command line utility can be installed through pip:

```bash
python -m pip install fickling
```

## Malicious file detection

Fickling can be seamlessly integrated into your codebase to detect and halt the loading of
malicious files at runtime. Under the hood, it hooks the `pickle` module and adds safety checks,
so that loading a pickle file raises an `UnsafeFileError` exception if malicious content is
detected in the file.

Below are the different ways you can use fickling to enforce safety checks on pickle files.

#### Option 1 (recommended): check safety of all pickle files loaded

```python
import pickle

import fickling

# Enforce safety checks every time pickle.load() is used
fickling.always_check_safety()

# Attempting to load an unsafe file now raises an exception
with open("file.pkl", "rb") as f:
    try:
        pickle.load(f)
    except fickling.UnsafeFileError:
        print("Unsafe file!")
```

#### Option 2: use a context manager

```python
import pickle

import fickling

with fickling.check_safety():
    # All pickle files loaded within the context manager are checked for safety
    try:
        with open("file.pkl", "rb") as f:
            pickle.load(f)
    except fickling.UnsafeFileError:
        print("Unsafe file!")

# Files loaded outside of the context manager are NOT checked
with open("file.pkl", "rb") as f:
    pickle.load(f)
```

#### Option 3: check and load a single file

```python
import fickling

# Use fickling.load() in place of pickle.load() to check safety and load a single pickle file
try:
    fickling.load("file.pkl")
except fickling.UnsafeFileError:
    print("Unsafe file!")
```

#### Option 4: only check pickle file safety without loading

```python
import fickling

# Perform a safety check on a pickle file without loading it
if not fickling.is_likely_safe("file.pkl"):
    print("Unsafe file!")
```

#### Accessing the safety analysis results

You can access the details of fickling's safety analysis from within the raised exception:

```python
>>> try:
...     fickling.load("unsafe.pkl")
... except fickling.UnsafeFileError as e:
...     print(e.info)
...
{
    "severity": "OVERTLY_MALICIOUS",
    "analysis": "Call to `eval(b'[5, 6, 7, 8]')` is almost certainly evidence of a malicious pickle file. Variable `_var0` is assigned value `eval(b'[5, 6, 7, 8]')` but unused afterward; this is suspicious and indicative of a malicious pickle file",
    "detailed_results": {
        "AnalysisResult": {
            "OvertlyBadEval": "eval(b'[5, 6, 7, 8]')",
            "UnusedVariables": [
                "_var0",
                "eval(b'[5, 6, 7, 8]')"
            ]
        }
    }
}
```

If you are using a language other than Python, you can still use fickling's CLI to
check the safety of pickle files:

```console
fickling --check-safety -p pickled.data
```

## Advanced usage

### Trace pickle execution

Fickling's CLI allows you to safely trace the execution of the Pickle virtual machine without
exercising any malicious code:

```console
fickling --trace file.pkl
```

### Pickle code injection

Fickling allows you to inject arbitrary code into a pickle file that will run every time the file is loaded:

```console
fickling --inject "print('Malicious')" file.pkl
```
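
The same injection can be done programmatically through the library. Below is a minimal sketch;
the `insert_python_exec()` method name is our assumption, so check fickling's injection examples
for the exact API (`Pickled.load()` and `dumps()` appear elsewhere in this diff):

```python
import pickle

from fickling.fickle import Pickled

# Decompile an existing pickle stream (insert_python_exec() is assumed here)
with open("file.pkl", "rb") as f:
    fickled = Pickled.load(f)

# Splice in a payload that will execute on unpickling
fickled.insert_python_exec("print('Malicious')")

# Serialize the modified pickle back out
with open("backdoored.pkl", "wb") as f:
    f.write(fickled.dumps())
```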

### Pickle decompilation

Fickling can be used to decompile a pickle file for further analysis:

```python
>>> import ast, pickle
>>> from fickling.fickle import Pickled
>>> fickled_object = Pickled.load(pickle.dumps([1, 2, 3, 4]))
>>> print(ast.dump(fickled_object.ast, indent=4))
Module(
    body=[
        Assign(
            targets=[
                Name(id='result', ctx=Store())],
            value=List(
                elts=[
                    Constant(value=1),
                    Constant(value=2),
                    Constant(value=3),
                    Constant(value=4)],
                ctx=Load()))],
    type_ignores=[])
```
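
Since the decompilation is a standard `ast.Module`, it can be turned back into readable source
with the standard library's `ast.unparse()` (Python 3.9+), continuing the session above:

```python
>>> print(ast.unparse(fickled_object.ast))
result = [1, 2, 3, 4]
```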

### PyTorch polyglots

We currently support inspecting, identifying, and creating file polyglots between the
following PyTorch file formats:

* **PyTorch v0.1.1**: Tar file with sys_info, pickle, storages, and tensors
* **PyTorch v0.1.10**: Stacked pickle files
* **TorchScript v1.0**: ZIP file with model.json and constants.pkl (a JSON file and a pickle file)
* **TorchScript v1.1**: ZIP file with model.json and attribute.pkl (a JSON file and a pickle file)
* **TorchScript v1.3**: ZIP file with data.pkl and constants.pkl (2 pickle files)
* **TorchScript v1.4**: ZIP file with data.pkl, constants.pkl, and version (2 pickle files and a folder)
* **PyTorch v1.3**: ZIP file containing data.pkl (1 pickle file)
* **PyTorch model archive format**: ZIP file that includes Python code files and pickle files

```python
>>> import torch
>>> import torchvision.models as models
>>> from fickling.pytorch import PyTorchModelWrapper
>>> model = models.mobilenet_v2()
>>> torch.save(model, "mobilenet.pth")
>>> fickled_model = PyTorchModelWrapper("mobilenet.pth")
>>> print(fickled_model.formats)
Your file is most likely of this format: PyTorch v1.3
['PyTorch v1.3']
```
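
Polyglots can also be identified and created programmatically. The sketch below assumes a
`fickling.polyglot` module with `identify_pytorch_file_format()` and `create_polyglot()`
helpers; these names are our guess at the API, so consult the examples linked below for the
exact functions:

```python
# Module and function names below are assumptions based on the feature description
from fickling import polyglot

# Identify which PyTorch format(s) a file conforms to
print(polyglot.identify_pytorch_file_format("mobilenet.pth"))

# Attempt to build a single file that parses as two formats at once
polyglot.create_polyglot("mobilenet.pth", "other_model.pt")
```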

Check out [our examples](https://github.com/trailofbits/fickling/tree/master/example)
to learn more about using fickling!

## About pickle

Pickled Python objects are in fact bytecode that is interpreted by a stack-based
virtual machine built into Python called the "Pickle Machine". Fickling can take
pickled data streams and decompile them into human-readable Python code that,
when executed, will deserialize to the original serialized object. This is made
possible by Fickling’s custom implementation of the PM. Fickling is safe to run
on potentially malicious files because its PM symbolically executes code rather
than overtly executing it.
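
For illustration, Python's standard `pickletools` module can disassemble a pickle stream into
the opcodes that the PM executes:

```python
import pickle
import pickletools

# Show the Pickle Machine bytecode behind an ordinary pickle
pickletools.dis(pickle.dumps([1, 2, 3]))
```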

The authors do not prescribe any meaning to the “F” in Fickling; it could stand
for “fickle,” … or something else. Divining its meaning is a personal journey
in discretion and is left as an exercise to the reader.

Learn more about fickling in our
[blog post](https://blog.trailofbits.com/2021/03/15/never-a-dill-moment-exploiting-machine-learning-pickle-files/)
and [DEF CON AI Village 2021 talk](https://www.youtube.com/watch?v=bZ0m_H_dEJI).

## Contact

If you'd like to file a bug report or feature request, please use our
[issues](https://github.com/trailofbits/fickling/issues) page.
Feel free to contact us or reach out in
[Empire Hacking](https://slack.empirehacking.nyc/) for help using or extending fickling.

## License

7 changes: 4 additions & 3 deletions example/hook_functions.py
@@ -3,10 +3,11 @@

import numpy

-import fickling.hook as hook
+import fickling

# Set up global fickling hook
-hook.run_hook()
+fickling.always_check_safety()
+# Equivalent to fickling.hook.run_hook()

# Fickling can check a pickle file for safety prior to running it
test_list = [1, 2, 3]
@@ -41,5 +42,5 @@ def __reduce__(self):

# This hook works when pickle.load is called under the hood in Python as well
# Note that this does not always work for torch.load()
-# This should raise "SafetyError"
+# This should raise "UnsafeFileError"
numpy.load("unsafe.pkl", allow_pickle=True)
2 changes: 1 addition & 1 deletion example/pytorch_poc.py
@@ -19,7 +19,7 @@
# Define model
class TheModelClass(nn.Module):
    def __init__(self):
-        super(TheModelClass, self).__init__()
+        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 13, 5)
2 changes: 2 additions & 0 deletions fickling/__init__.py
@@ -1,6 +1,8 @@
# fmt: off
from .loader import load #noqa
from .context import check_safety #noqa
+from .hook import always_check_safety #noqa
+from .analysis import is_likely_safe # noqa
# fmt: on

# The above lines enable `fickling.load()` and `with fickling.check_safety()`
5 changes: 5 additions & 0 deletions fickling/analysis.py
@@ -322,3 +322,8 @@ def check_safety(
        with open(json_output_path, "a") as json_file:
            json.dump(severity_data, json_file, indent=4)
    return results
+
+
+def is_likely_safe(filepath: str):
+    with open(filepath, "rb") as f:
+        return check_safety(Pickled.load(f)).severity == Severity.LIKELY_SAFE
8 changes: 8 additions & 0 deletions fickling/exception.py
@@ -0,0 +1,8 @@
+class UnsafeFileError(Exception):
+    def __init__(self, filepath, info):
+        super().__init__()
+        self.filepath = filepath
+        self.info = info
+
+    def __str__(self):
+        return f"Safety results for {self.filepath} : {str(self.info)}"
16 changes: 0 additions & 16 deletions fickling/fickle.py
@@ -3,7 +3,6 @@
import re
import struct
import sys
-import warnings
from abc import ABC, abstractmethod
from collections.abc import MutableSequence, Sequence
from enum import Enum
@@ -701,21 +700,6 @@ def has_non_setstate_call(self) -> bool:
        object.__setstate__"""
        return bool(self.properties.non_setstate_calls)

-    def check_safety(self):
-        from fickling.analysis import check_safety  # noqa
-
-        safety_results = check_safety(self)
-        return safety_results
-
-    def is_likely_safe(self):
-        warnings.warn(
-            "The attribute .is_likely_safe will be deprecated."
-            "Use the attribute .check_safety instead.",
-            DeprecationWarning,
-            stacklevel=2,
-        )
-        return self.check_safety(self)
-
    def unsafe_imports(self) -> Iterator[Union[ast.Import, ast.ImportFrom]]:
        for node in self.properties.imports:
            if node.module in (
9 changes: 8 additions & 1 deletion fickling/hook.py
@@ -4,5 +4,12 @@


def run_hook():
-    # This is the global function hook
+    """Replace pickle.load() by fickling's load()"""
    pickle.load = loader.load
+
+
+def always_check_safety():
+    """
+    Alias for run_hook()
+    """
+    run_hook()
9 changes: 2 additions & 7 deletions fickling/loader.py
@@ -1,15 +1,10 @@
import pickle

from fickling.analysis import Severity, check_safety
+from fickling.exception import UnsafeFileError
from fickling.fickle import Pickled


-class SafetyError(Exception):
-    """Exception raised when a file is deemed unsafe by fickling."""
-
-    pass
-
-
def load(
    file,
    max_acceptable_severity=Severity.LIKELY_SAFE,
@@ -27,4 +22,4 @@ def load(
        # loaded after the analysis.
        return pickle.loads(pickled_data.dumps(), *args, **kwargs)
    else:
-        raise SafetyError(f"File is unsafe: {result.severity.name}")
+        raise UnsafeFileError(file, result.to_dict())
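
As the `load()` signature above shows, callers can tune the `max_acceptable_severity` threshold.
A small sketch using only names visible in this diff:

```python
from fickling import load
from fickling.analysis import Severity

# Raises UnsafeFileError if the analysis reports anything worse than LIKELY_SAFE
obj = load("file.pkl", max_acceptable_severity=Severity.LIKELY_SAFE)
```
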
Binary file added fickling_image.png
