Merge pull request #56 from nsosio/feat/linter
Added python linter
nsosio authored Nov 16, 2023
2 parents cd9760a + d3281ef commit b32365e
Showing 26 changed files with 290 additions and 188 deletions.
13 changes: 13 additions & 0 deletions .github/workflows/precommit.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
name: pre-commit

on:
  pull_request:
    branches: [main]

jobs:
  pre-commit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v3
      - uses: pre-commit/[email protected]
37 changes: 37 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,37 @@
default_stages: [commit]

repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: end-of-file-fixer
      - id: check-toml
      - id: check-xml
      - id: debug-statements
      - id: check-builtin-literals
      - id: check-case-conflict

  - repo: https://github.com/psf/black
    rev: 23.11.0
    hooks:
      - id: black

  - repo: https://github.com/PyCQA/isort
    rev: 5.12.0
    hooks:
      - id: isort

  - repo: https://github.com/PyCQA/flake8
    rev: 6.1.0
    hooks:
      - id: flake8
        args: ["--config=setup.cfg"]
        additional_dependencies: [flake8-isort]

ci:
  autoupdate_schedule: weekly
  skip: []
  submodules: false
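
Note that the hook list for `pre-commit-hooks` above names `end-of-file-fixer` twice. A minimal sketch of how such repeats could be spotted, assuming the config has already been parsed into a plain dictionary (for instance with a YAML loader). `duplicate_hook_ids` and the hand-written `config` excerpt are hypothetical and not part of this commit:

```python
from __future__ import annotations

from collections import Counter


def duplicate_hook_ids(config: dict) -> dict[str, list[str]]:
    """Map each repo URL to the hook ids it lists more than once."""
    duplicates = {}
    for repo in config.get("repos", []):
        counts = Counter(hook["id"] for hook in repo.get("hooks", []))
        repeated = sorted(hook_id for hook_id, n in counts.items() if n > 1)
        if repeated:
            duplicates[repo["repo"]] = repeated
    return duplicates


# Hand-written excerpt mirroring part of the first repo entry above.
config = {
    "repos": [
        {
            "repo": "https://github.com/pre-commit/pre-commit-hooks",
            "rev": "v4.5.0",
            "hooks": [
                {"id": "trailing-whitespace"},
                {"id": "end-of-file-fixer"},
                {"id": "check-yaml"},
                {"id": "end-of-file-fixer"},
            ],
        }
    ]
}

print(duplicate_hook_ids(config))
```

Running the same hook twice is probably harmless here, since the second pass should find nothing left to fix, but dropping the repeat would keep the config easier to read.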
2 changes: 1 addition & 1 deletion NOTES.md
@@ -14,7 +14,7 @@ Currently working on requirement and understanding different project constraints
## Early Investigation

The overall investigation assumes PyTorch as the performance base and all relevant understanding should be built in the context of that,
the specific benchmark might not be directly comparable to each other but it should provide a rough picture of the state of
the specific benchmark might not be directly comparable to each other but it should provide a rough picture of the state of
open source ML framework performance.

This generally is to port, add and support new things into burn and other platforms.
2 changes: 1 addition & 1 deletion README.md
@@ -115,4 +115,4 @@ Command: `./benchmark.sh --repetitions 10 --max_tokens 100 --device gpu --prompt
| ctranslate | - | - | - | - |
| tinygrad | - | 29.78 ± 1.18 | - | - |

*(data updated: 15th November 2023)
*(data updated: 15th November 2023)
11 changes: 6 additions & 5 deletions bench.py
@@ -1,12 +1,12 @@
import argparse
from collections import defaultdict
import logging
import sys
from collections import defaultdict

import numpy as np

from python_bench.llama_cpp import LlamaCPPBenchmark
from python_bench.ctranslate import CTranslateBenchmark, get_compute_types
from python_bench.llama_cpp import LlamaCPPBenchmark
from python_bench.tinygrad import TinyGradBenchmark

logging.basicConfig(
@@ -53,7 +53,8 @@
args = parser.parse_args()

logging.info(
f"Running benchmark with: max_tokens={args.max_tokens} prompt={args.prompt} repetitions={args.repetitions} gpu={args.gpu} nvidia={args.gpu}"
f"Running benchmark with: max_tokens={args.max_tokens} prompt={args.prompt} "
+ f"repetitions={args.repetitions} gpu={args.gpu} nvidia={args.nvidia}"
)
report = defaultdict(lambda: defaultdict(float))
for quantize in ("Q8_0", "Q4_0"):
@@ -74,7 +75,7 @@
for compute_type in compute_types.intersection({"float16", "int8"}):
logging.info(f"Running ctranslate benchmark with {compute_type}")
ctranslate_bench = CTranslateBenchmark(
f"./models/llama-2-7b-hf-float16",
"./models/llama-2-7b-hf-float16",
gpu=args.gpu,
compute_type=compute_type,
).load_model()
@@ -86,7 +87,7 @@
"std": np.std(ctranslate_bench.results),
}

logging.info(f"Running tinygrad benchmark")
logging.info("Running tinygrad benchmark")
tinygrad_bench = TinyGradBenchmark(
"./models/llama-2-7b-hf",
quantize=False,
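
The hunks above store `np.std` of each benchmark's `results` in a nested `report`, and the README table reports figures in the form `29.78 ± 1.18`. A rough standard-library sketch of that aggregation, assuming `results` is a list of tokens-per-second floats. The `summarize` helper, the `example_engine` key, and the sample values are made up for illustration, and `statistics.pstdev` stands in for `np.std`, whose default is also the population standard deviation:

```python
from collections import defaultdict
from statistics import fmean, pstdev


def summarize(results):
    """Return the mean and population standard deviation of the samples."""
    return {"mean": fmean(results), "std": pstdev(results)}


# Same nested layout as `report` in bench.py: engine -> variant -> stats.
report = defaultdict(lambda: defaultdict(float))
report["example_engine"]["float16"] = summarize([30.0, 28.0, 32.0])

entry = report["example_engine"]["float16"]
print(f"{entry['mean']:.2f} ± {entry['std']:.2f}")  # prints "30.00 ± 1.63"
```

With only a handful of repetitions the population and sample standard deviations differ noticeably, so which of the two a table reports is worth stating alongside the numbers.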
10 changes: 5 additions & 5 deletions benchmark.sh
@@ -2,8 +2,8 @@

##############################################################################################
# Script: run_benchmarks.sh
# Description: This script runs benchmarks for a transformer model using both
# Rust and Python implementations. It provides options to customize the
# Description: This script runs benchmarks for a transformer model using both
# Rust and Python implementations. It provides options to customize the
# benchmarks, such as the prompt, repetitions, maximum tokens, device, and NVIDIA flag.
#
# Usage: ./run_benchmarks.sh [OPTIONS]
@@ -150,8 +150,8 @@ run_benchmarks() {
--prompt "$PROMPT" \
--sample-len $MAX_TOKENS \
--log-file $LOG_FILENAME
fi
fi

# Set options based on $DEVICE and $USE_NVIDIA
[ "$DEVICE" == "gpu" ] && PYTHON_DEVICE="--gpu"
[ "$USE_NVIDIA" == true ] && PYTHON_NVIDIA="--nvidia"
@@ -235,4 +235,4 @@ check_rust
check_jq
download_models
setup
run_benchmarks "$PROMPT" "$REPETITIONS" "$MAX_TOKENS" "$DEVICE" $USE_NVIDIA "$log_filename"
run_benchmarks "$PROMPT" "$REPETITIONS" "$MAX_TOKENS" "$DEVICE" $USE_NVIDIA "$log_filename"
7 changes: 3 additions & 4 deletions convert_to_safetensors.py
@@ -1,9 +1,8 @@
import argparse
import os
import logging
from collections import defaultdict
from typing import List
import os
import shutil
from collections import defaultdict

import torch
from safetensors.torch import load_file, save_file
@@ -80,7 +79,7 @@ def convert_file(pt_filename: str, sf_filename: str):
raise RuntimeError(f"The output tensors do not match for key {k}")


def convert_multi(input_dir: str, output_dir: str) -> List[str]:
def convert_multi(input_dir: str, output_dir: str) -> list[str]:
if os.path.exists(output_dir):
logging.warning(f"{output_dir} already exists!")
return []
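
The annotation change from `typing.List[str]` to the builtin `list[str]` relies on PEP 585, which Python supports at runtime from 3.9 onward. A small sketch of the pattern follows. `collect_suffix` is a hypothetical helper, not part of the repository, and the `from __future__ import annotations` line (which `python_bench/benchmark.py` also uses) stores annotations as strings so they are not evaluated when the function is defined:

```python
from __future__ import annotations


def collect_suffix(filenames: list[str], suffix: str) -> list[str]:
    """Return the filenames ending with `suffix`, preserving input order."""
    return [name for name in filenames if name.endswith(suffix)]


print(collect_suffix(["a.bin", "b.safetensors", "c.bin"], ".bin"))
```

Without the `__future__` import, the same signature would raise a `TypeError` at definition time on interpreters older than 3.9, so the builtin form implicitly sets a minimum Python version for the script.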
6 changes: 3 additions & 3 deletions download.sh
@@ -2,9 +2,9 @@

################################################################################
# Script: download.sh
# Description: Downloads files from a list of URLs specified in a JSON file.
# The JSON file should contain an array of objects, each with a 'url', 'file',
# and 'folder' property. The script checks if the file already exists before
# Description: Downloads files from a list of URLs specified in a JSON file.
# The JSON file should contain an array of objects, each with a 'url', 'file',
# and 'folder' property. The script checks if the file already exists before
# downloading it.
#
# Usage: ./download.sh --models <json_file> --cache <cache_file> --force-download
2 changes: 1 addition & 1 deletion models.json
@@ -19,4 +19,4 @@
"file": "llama-2-7b-raw.zip",
"folder": "./models/llama-2-7b-raw"
}
]
]
3 changes: 2 additions & 1 deletion python_bench/benchmark.py
@@ -1,6 +1,7 @@
from __future__ import annotations
from abc import ABC, abstractmethod

import logging
from abc import ABC, abstractmethod

logger = logging.getLogger(__name__)

2 changes: 1 addition & 1 deletion python_bench/ctranslate.py
@@ -1,5 +1,5 @@
import os
import logging
import os
import time

import ctranslate2
6 changes: 4 additions & 2 deletions python_bench/llama_cpp.py
@@ -1,8 +1,10 @@
import time
import logging
from python_bench.benchmark import Benchmark
import time

from llama_cpp import Llama

from python_bench.benchmark import Benchmark

logging.getLogger("llama_cpp").setLevel(logging.ERROR)


44 changes: 20 additions & 24 deletions python_bench/tinygrad.py
@@ -1,21 +1,17 @@
import json
import logging
import os
import time
from pathlib import Path
import json
from typing import Optional, Union

import numpy as np
from typing import Optional, Tuple, Union
from tinygrad.shape.symbolic import Variable
from tinygrad.jit import TinyJit, JIT_SUPPORTED_DEVICE
from tinygrad.nn.state import safe_load, torch_load, load_state_dict
from tinygrad.helpers import CI, dtypes, getenv
from tinygrad.jit import JIT_SUPPORTED_DEVICE, TinyJit
from tinygrad.nn import Embedding, Linear
from tinygrad.nn.state import load_state_dict, safe_load, torch_load
from tinygrad.shape.symbolic import Variable
from tinygrad.tensor import Tensor
from tinygrad.helpers import getenv, dtypes, CI
from typing import Optional, Tuple
from pathlib import Path
import json
import time
import numpy as np
from pathlib import Path
import logging

from python_bench.benchmark import Benchmark

@@ -43,7 +39,7 @@ def complex_mult(A, c, d):
return ro.cat(co, dim=-1)


def apply_rotary_emb(xq, xk, freqs_cis) -> Tuple[Tensor, Tensor]:
def apply_rotary_emb(xq, xk, freqs_cis) -> tuple[Tensor, Tensor]:
assert (
freqs_cis.shape[1] == xq.shape[1] and freqs_cis.shape[1] == xk.shape[1]
), f"freqs_cis shape mismatch {freqs_cis.shape} xq:{xq.shape} xk:{xk.shape}"
@@ -183,7 +179,7 @@ def __call__(
x: Tensor,
start_pos: Union[Variable, int],
freqs_cis: Tensor,
mask: Optional[Tensor],
mask: Union[Tensor, None],
):
h = x + self.attention(self.attention_norm(x), start_pos, freqs_cis, mask)
return (h + self.feed_forward(self.ffn_norm(h))).realize()
@@ -513,22 +509,22 @@ def convert_from_huggingface(weights, model):
keymap = {
"model.embed_tokens.weight": "tok_embeddings.weight",
**{
f"model.layers.{l}.input_layernorm.weight": f"layers.{l}.attention_norm.weight"
for l in range(len(model.layers))
f"model.layers.{layer}.input_layernorm.weight": f"layers.{layer}.attention_norm.weight"
for layer in range(len(model.layers))
},
**{
f"model.layers.{l}.self_attn.{x}_proj.weight": f"layers.{l}.attention.w{x}.weight"
f"model.layers.{layer}.self_attn.{x}_proj.weight": f"layers.{layer}.attention.w{x}.weight"
for x in ["q", "k", "v", "o"]
for l in range(len(model.layers))
for layer in range(len(model.layers))
},
**{
f"model.layers.{l}.post_attention_layernorm.weight": f"layers.{l}.ffn_norm.weight"
for l in range(len(model.layers))
f"model.layers.{layer}.post_attention_layernorm.weight": f"layers.{layer}.ffn_norm.weight"
for layer in range(len(model.layers))
},
**{
f"model.layers.{l}.mlp.{x}_proj.weight": f"layers.{l}.feed_forward.w{y}.weight"
f"model.layers.{layer}.mlp.{x}_proj.weight": f"layers.{layer}.feed_forward.w{y}.weight"
for x, y in {"gate": "1", "down": "2", "up": "3"}.items()
for l in range(len(model.layers))
for layer in range(len(model.layers))
},
"model.norm.weight": "norm.weight",
"lm_head.weight": "output.weight",
@@ -538,7 +534,7 @@ def convert_from_huggingface(weights, model):

class AbsmaxQuantizedLinear:
def __init__(self, in_features, out_features, bias=False):
assert bias == False
assert not bias
self.weight = Tensor.ones(out_features, in_features, dtype=dtypes.int8)
self.scale = Tensor.ones(out_features, dtype=dtypes.half)
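
The `keymap` in `convert_from_huggingface` translates Hugging Face parameter names into the names the tinygrad model expects, one group of entries per layer. A self-contained sketch of the same mapping, written as explicit loops and covering only the entries visible in the hunk above. `build_keymap` and its `n_layers` argument are hypothetical stand-ins for the original's `len(model.layers)`:

```python
def build_keymap(n_layers: int) -> dict:
    """Map Hugging Face weight names to the names shown in the diff."""
    keymap = {"model.embed_tokens.weight": "tok_embeddings.weight"}
    for layer in range(n_layers):
        src = f"model.layers.{layer}"
        dst = f"layers.{layer}"
        keymap[f"{src}.input_layernorm.weight"] = f"{dst}.attention_norm.weight"
        # Attention projections keep their letter: q_proj -> wq, and so on.
        for x in ["q", "k", "v", "o"]:
            keymap[f"{src}.self_attn.{x}_proj.weight"] = f"{dst}.attention.w{x}.weight"
        keymap[f"{src}.post_attention_layernorm.weight"] = f"{dst}.ffn_norm.weight"
        # MLP projections are renumbered: gate -> w1, down -> w2, up -> w3.
        for x, y in {"gate": "1", "down": "2", "up": "3"}.items():
            keymap[f"{src}.mlp.{x}_proj.weight"] = f"{dst}.feed_forward.w{y}.weight"
    keymap["model.norm.weight"] = "norm.weight"
    keymap["lm_head.weight"] = "output.weight"
    return keymap


keymap = build_keymap(n_layers=2)
print(len(keymap))  # prints 21: 3 global entries plus 9 per layer
print(keymap["model.layers.1.mlp.gate_proj.weight"])  # prints layers.1.feed_forward.w1.weight
```

The dictionary-comprehension form in the diff builds the same pairs. The loop form is only easier to step through when checking an individual name.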

4 changes: 2 additions & 2 deletions requirements.txt
@@ -1,9 +1,9 @@
llama_cpp_python==0.2.15
sentencepiece==0.1.99
ctranslate2==3.20.0
huggingface-hub==0.17.3
huggingface-hub==0.17.3
transformers==4.35.0
torch==2.1.0
# Using fixed commit (a72b3700) for tinygrad to ensure stability in benchmarking.
# Helps maintain reproducibility and guards against potential breaking changes.
git+https://github.com/tinygrad/tinygrad.git@a72b370066837af5b4d44eeb5c4fb30aebf5c502
git+https://github.com/tinygrad/tinygrad.git@a72b370066837af5b4d44eeb5c4fb30aebf5c502
8 changes: 4 additions & 4 deletions rust_bench/llama2-burn/README.md
@@ -61,7 +61,7 @@ python3 dump_model.py <model_dir> <tokenizer_path>
```
Example: `python3 dump_model.py llama2-7b-chat tokenizer.model`

3. **Test the Tokenizer**: Finally, run the `test_tokenizer.py` script to load the tokenizer.model file and verify an example encoding and decoding. This script should be run in the same directory as the tokenizer file. Execute this script using the command:
3. **Test the Tokenizer**: Finally, run the `test_tokenizer.py` script to load the tokenizer.model file and verify an example encoding and decoding. This script should be run in the same directory as the tokenizer file. Execute this script using the command:
```
python3 test_tokenizer.py
```
@@ -70,7 +70,7 @@ python3 test_tokenizer.py

Inside the 'src/bin' folder, you will find Rust binaries: `convert`, `sample`, and `test`.

1. **Converting Dumped Weights**: The `convert` binary converts dumped weights into burn's model format. It saves them for further use. Execute this using the following command:
1. **Converting Dumped Weights**: The `convert` binary converts dumped weights into burn's model format. It saves them for further use. Execute this using the following command:
```
cargo run --bin convert <dump_path> <burn_model_name>
```
@@ -82,11 +82,11 @@ cargo run --bin test <tokenizer_filepath> <dump_path>
```
Example: `cargo run --release --bin test tokenizer.model params`

3. **Sampling Text**: The `sample` binary loads the converted burn model file and generates a sample output based on an input prompt. The model can run on either the cpu or gpu. Execute this using the following command:
3. **Sampling Text**: The `sample` binary loads the converted burn model file and generates a sample output based on an input prompt. The model can run on either the cpu or gpu. Execute this using the following command:
```
cargo run --bin sample <model_name> <tokenizer_filepath> <prompt> <n_tokens>
```
Example:
Example:
```
#export TORCH_CUDA_VERSION=cu113 # if running on gpu
cargo run --release --bin sample llama2-7b-chat tokenizer.model "Hello, I am " 10 cpu