perf(parser): reduce `Token` size to 8 bytes from 16 #8153

branchseer · 2024-12-28T05:50:44Z

Replace `end: u32` with `len: u16`. Ends of long tokens (which are rare) are stored in `lexer.long_token_ends`.
Pack the bools into bitflags.
Now that `end` is calculated from `start + len`, `start` must be set correctly. In some places it was not; this PR fixes those and adds a debug-assertion check in the lexer.
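For readers who want a concrete picture of the layout described above, here is a minimal sketch. It is not the code from this PR: the struct name `PackedToken`, the flag constants, the `u16::MAX` sentinel, and the use of a `HashMap` keyed by `start` for `long_token_ends` are all assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::mem::size_of;

// Hypothetical flag bits; the PR's actual names and values may differ.
const ON_NEW_LINE: u8 = 1 << 0;
const ESCAPED: u8 = 1 << 1;
// Assumed sentinel: a `len` of u16::MAX means "look the end up in the side table".
const LONG_SENTINEL: u16 = u16::MAX;

/// Illustrative 8-byte token: 4 (start) + 2 (len) + 1 (kind) + 1 (flags).
#[allow(dead_code)]
#[derive(Clone, Copy, Debug)]
struct PackedToken {
    start: u32,
    len: u16,
    kind: u8,
    flags: u8,
}

#[derive(Default)]
struct Lexer {
    // Assumed shape of the side table: token start -> token end.
    long_token_ends: HashMap<u32, u32>,
}

impl Lexer {
    fn make_token(&mut self, kind: u8, start: u32, end: u32, flags: u8) -> PackedToken {
        // Mirrors the idea of the debug assertion: `start` must be valid,
        // because `end` is no longer stored and is derived from it.
        debug_assert!(start <= end, "token start must be set before finishing");
        let full_len = end - start;
        let len = if full_len >= u32::from(LONG_SENTINEL) {
            // Rare path: the length does not fit, so remember the real end.
            self.long_token_ends.insert(start, end);
            LONG_SENTINEL
        } else {
            full_len as u16
        };
        PackedToken { start, len, kind, flags }
    }

    fn end_of(&self, token: &PackedToken) -> u32 {
        if token.len == LONG_SENTINEL {
            self.long_token_ends[&token.start]
        } else {
            token.start + u32::from(token.len)
        }
    }
}

fn main() {
    let mut lexer = Lexer::default();
    let short = lexer.make_token(1, 10, 15, ON_NEW_LINE);
    let long = lexer.make_token(2, 100, 70_100, ESCAPED);
    assert_eq!(size_of::<PackedToken>(), 8);
    assert_eq!(lexer.end_of(&short), 15);
    assert_eq!(lexer.end_of(&long), 70_100);
    assert!(short.flags & ON_NEW_LINE != 0);
}
```

The sketch also shows why a wrong `start` becomes a correctness problem: every `end` is now derived from it, so an unset `start` silently shifts the whole span.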

graphite-app · 2024-12-28T05:50:53Z

How to use the Graphite Merge Queue

Add either label to this PR to merge it via the merge queue:

0-merge - adds this PR to the back of the merge queue
hotfix - for urgent hot fixes, skip the queue and merge this PR next

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository. Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

codspeed-hq · 2024-12-28T05:57:36Z

CodSpeed Performance Report

Merging #8153 will degrade performances by 13.2%

Comparing branchseer:token_eight_bytes (84bc0de) with main (a69d15f)

Summary

❌ 5 regressions
✅ 24 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| | Benchmark | `main` | `branchseer:token_eight_bytes` | Change |
|---|---|---|---|---|
| ❌ | `lexer[RadixUIAdoptionSection.jsx]` | 23.9 µs | 26.3 µs | -9.05% |
| ❌ | `lexer[antd.js]` | 22.3 ms | 25.3 ms | -11.95% |
| ❌ | `lexer[cal.com.tsx]` | 5.5 ms | 6.3 ms | -12.27% |
| ❌ | `lexer[checker.ts]` | 13.2 ms | 14.8 ms | -10.59% |
| ❌ | `lexer[pdf.mjs]` | 3.6 ms | 4.1 ms | -13.2% |


branchseer · 2025-01-08T12:53:12Z

My local bench run shows the same regression in the lexer, but it also shows noticeable improvements in the parser.

I guess the lexer regression makes sense, since the lexer now does more calculation per token but barely copies tokens on its own.
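To make that trade-off concrete, here is a rough sketch under stated assumptions. Both struct layouts, their field names, and the `finish_*` and `bytes_moved` helpers are hypothetical and are not taken from the oxc source. The point is only that producing a packed token costs a subtraction, a range check, and an occasional fallback, while every later copy of the token moves half as many bytes.

```rust
use std::mem::size_of;

/// Hypothetical 16-byte layout: explicit `end`, one bool per flag.
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct WideToken {
    start: u32,
    end: u32,
    kind: u8,
    is_on_new_line: bool,
    escaped: bool,
    flag_c: bool, // placeholder flag, name is illustrative
    flag_d: bool, // placeholder flag, name is illustrative
}

/// Hypothetical 8-byte layout: `len` instead of `end`, flags in one byte.
#[allow(dead_code)]
#[derive(Clone, Copy)]
struct PackedToken {
    start: u32,
    len: u16,
    kind: u8,
    flags: u8,
}

/// Producer side, wide layout: the end offset is stored as-is.
fn finish_wide(start: u32, end: u32, kind: u8) -> WideToken {
    WideToken {
        start,
        end,
        kind,
        is_on_new_line: false,
        escaped: false,
        flag_c: false,
        flag_d: false,
    }
}

/// Producer side, packed layout: extra subtraction, range check,
/// and a rare fallback into a side list for over-long tokens.
fn finish_packed(
    start: u32,
    end: u32,
    kind: u8,
    long_ends: &mut Vec<(u32, u32)>,
) -> PackedToken {
    debug_assert!(start <= end);
    let len = match u16::try_from(end - start) {
        Ok(len) if len != u16::MAX => len,
        _ => {
            long_ends.push((start, end));
            u16::MAX // assumed sentinel for "end stored elsewhere"
        }
    };
    PackedToken { start, len, kind, flags: 0 }
}

/// Consumer side: bytes moved when `count` tokens are copied by value.
fn bytes_moved<T>(count: usize) -> usize {
    count * size_of::<T>()
}

fn main() {
    let mut long_ends = Vec::new();
    let wide = finish_wide(3, 9, 1);
    let packed = finish_packed(3, 9, 1, &mut long_ends);
    assert_eq!(wide.end - wide.start, u32::from(packed.len));
    assert!(long_ends.is_empty());
    assert_eq!(bytes_moved::<WideToken>(1_000), 16_000);
    assert_eq!(bytes_moved::<PackedToken>(1_000), 8_000);
}
```

Whether the saved copies outweigh the extra per-token work depends on how often each path runs, which is consistent with the lexer-only benchmark regressing while the parser benchmark improves.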

Here's my local bench result of parser:

cargo bench --bench parser --no-default-features --features parser -- --baseline arm64
    Finished `bench` profile [optimized] target(s) in 0.30s
     Running benches/parser.rs (target/release/deps/parser-63bb1ee5f39a6738)
parser/checker.ts       time:   [9.0814 ms 9.1062 ms 9.1461 ms]
                        change: [−2.0830% −1.7180% −1.2430%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
  9 (9.00%) high mild
  5 (5.00%) high severe
parser/cal.com.tsx      time:   [5.0674 ms 5.0756 ms 5.0841 ms]
                        change: [−3.0402% −2.8004% −2.5644%] (p = 0.00 < 0.05)
                        Performance has improved.
parser/RadixUIAdoptionSection.jsx
                        time:   [6.1189 µs 6.1512 µs 6.2121 µs]
                        change: [−7.3368% −6.5958% −5.7452%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  6 (6.00%) high mild
  6 (6.00%) high severe
parser/pdf.mjs          time:   [2.9626 ms 2.9652 ms 2.9681 ms]
                        change: [−0.8735% −0.7438% −0.6150%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
  7 (7.00%) high mild
  1 (1.00%) high severe
parser/antd.js          time:   [18.797 ms 18.815 ms 18.835 ms]
                        change: [−1.3170% −1.1375% −0.9604%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) low mild
  2 (2.00%) high mild
  3 (3.00%) high severe

I suspected it was a cpu arch thing so I ran the parser bench under rosetta 2, only to see even more improvements:

cargo bench --bench parser --no-default-features --features parser --target x86_64-apple-darwin -- --baseline x86_64
    Finished `bench` profile [optimized] target(s) in 0.11s
     Running benches/parser.rs (target/x86_64-apple-darwin/release/deps/parser-fbba37f46cde4093)
parser/checker.ts       time:   [13.672 ms 13.703 ms 13.739 ms]
                        change: [−2.7911% −2.4699% −2.1505%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  7 (7.00%) high mild
  6 (6.00%) high severe
parser/cal.com.tsx      time:   [7.2164 ms 7.2259 ms 7.2374 ms]
                        change: [−4.1010% −3.9361% −3.7532%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  2 (2.00%) high mild
  10 (10.00%) high severe
parser/RadixUIAdoptionSection.jsx
                        time:   [10.517 µs 10.525 µs 10.533 µs]
                        change: [−14.409% −14.120% −13.846%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) high mild
  4 (4.00%) high severe
parser/pdf.mjs          time:   [4.4398 ms 4.4467 ms 4.4581 ms]
                        change: [−2.3575% −2.1154% −1.8130%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  6 (6.00%) high mild
  7 (7.00%) high severe
parser/antd.js          time:   [27.019 ms 27.038 ms 27.062 ms]
                        change: [−3.3556% −3.2324% −3.1183%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  2 (2.00%) high severe

Now I'm lost. @overlookmotel any insight on this?

overlookmotel · 2025-01-08T13:53:54Z

Thanks for investigating further. I did spend a couple of hours looking at this yesterday and scratching my head. I had to stop because I could feel a rabbit hole coming on and I had other tasks I needed to get on with!

Please give me a few days to mull it over and I'll come back to you with some ideas.

Also, #8298 may have an effect, as lots of work on Token involves converting it to Span. I have some stuff to investigate on that PR before it's ready to merge, but once it is merged, it may affect perf on this PR too (hopefully positively!).

One question in the meantime:

> I suspected it was a cpu arch thing so I ran the parser bench under rosetta 2, only to see even more improvements

What effect did you expect Rosetta 2 to have? Rosetta is an x86_64 emulator, right? (just checking I do know what I think I know!)

branchseer · 2025-01-08T16:42:21Z

Yeah, I ran Rosetta 2 to check the bench result under x86_64. It was wishful thinking on my part: if Rosetta 2 had given the same result as CodSpeed, that would have shown that the improvements occur only on specific CPU architectures (Apple arm64).

branchseer added 4 commits December 28, 2024 12:46

replace token.end with token.len and pack bools

ab62c23

update token usages

c3d0ec3

cargo fmt

557c1be

fix cargo lint warnings

93f4f1a

github-actions bot added the labels A-parser (Area - Parser) and C-performance (Category - Solution not expected to change functional behavior, only performance) Dec 28, 2024

branchseer changed the title ~~perf(parser): reduce Token size to 8 bytes from 12~~ perf(parser): reduce Token size to 8 bytes from 16 Dec 28, 2024

Boshen marked this pull request as draft December 28, 2024 06:03

Boshen assigned overlookmotel Dec 28, 2024

Merge branch 'main' into token_eight_bytes

84bc0de

# Conflicts: # crates/oxc_parser/src/lexer/token.rs # crates/oxc_parser/src/modifiers.rs

overlookmotel self-requested a review January 8, 2025 13:58
