Build time regression #14256
Comments
A few more details: The build command I use is The latest timing file is (rename it to HTML, restriction from GitHub): Cargo Timing Jan 23 2025.txt The |
After |
related
Did the amount of code double since v38? I doubt that, but @waynexia, could you chart build time against the amount of code?
The biggest jump is at the beginning of August 2024. Is the X axis position based on Commit Date of the current tip commit? |
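One way to get the "amount of code" for each data point is to count the lines in all `*.rs` files at a given commit. The following is a minimal sketch rather than anything used in this thread: the function name `count_rust_lines` is hypothetical, it counts raw lines (comments and blank lines included), and it assumes it is run from the repository root.

```shell
#!/bin/bash
# Hypothetical helper: count the total number of lines in all *.rs files
# at a given revision, without checking that revision out.
# Assumes the current directory is the root of the git repository.
count_rust_lines() {
    local rev="$1"
    # List every file tracked at $rev, keep the Rust sources,
    # print each file's contents at that revision, and count the lines.
    git ls-tree -r --name-only "$rev" \
        | grep '\.rs$' \
        | while IFS= read -r path; do
              git show "$rev:$path"
          done \
        | wc -l \
        | tr -d ' '
}
```

Calling `count_rust_lines "$commit_hash"` for each measured commit would give a line count that can be plotted next to the build time, which helps separate growth in code size from regressions in per-line compile cost.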
Yes, 100%. Splitting out datasource is the next thing I would love to see (and I think it will help build times massively). Thank you @waynexia for working on this project; it will be appreciated by all 🙏 As @jayzhan211 says, once we complete this epic (@buraksenn and @berkaysynnada are pretty close), I can help organize an effort to break out the data sources. |
100% making build time better would be really appreciated |
Looking forward to the ongoing refactors!
Here is the script; I removed the target after every build:
#!/bin/bash

# Configuration
OUTPUT_FILE="build-times.csv"
START_DATE=$(date -d "8 months ago" +%Y-%m-%d)
END_DATE=$(date +%Y-%m-%d)

# Initialize CSV output
echo "date,commit_hash,build_time_seconds,success" > "$OUTPUT_FILE"

# Iterate day-by-day
current_date="$START_DATE"
while [[ "$current_date" < "$END_DATE" ]]; do
    # Find latest commit for the day
    git checkout main
    commit_hash=$(git log --until="$current_date 23:59" --first-parent -1 --format=%H)

    if [[ -n "$commit_hash" ]]; then
        echo "Processing $current_date ($commit_hash)..."

        # Checkout commit (detached HEAD)
        git checkout --force "$commit_hash" > /dev/null 2>&1

        # Clean and build
        rm -rf target/release
        cargo metadata --quiet > /dev/null
        start_time=$(date +%s)
        cargo build --release --timings --lib --quiet
        build_exit_code=$?
        end_time=$(date +%s)
        build_time=$((end_time - start_time))

        # Record results
        echo "$current_date,$commit_hash,$build_time,$build_exit_code" >> "$OUTPUT_FILE"
    else
        echo "No commit found for $current_date"
    fi

    # Move to next day
    current_date=$(date -d "$current_date + 1 day" +%Y-%m-%d)
done

# Return to original branch
git checkout main
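To locate the "jumps" in the resulting data, the CSV can be scanned for the largest increases between consecutive successful builds. This is a rough sketch, not part of the original script: the function name `largest_jumps` is hypothetical, and it assumes the column layout written above, where the last column holds cargo's exit code (0 means the build succeeded).

```shell
#!/bin/bash
# Hypothetical helper: print the N largest build-time increases between
# consecutive successful rows of the CSV produced by the script above.
# Output columns: delta_seconds,from_date,to_date,commit_hash
largest_jumps() {
    local csv="$1" top="${2:-5}"
    awk -F, '
        NR == 1 { next }    # skip the header row
        $4 != 0 { next }    # skip builds whose exit code was non-zero
        have_prev { printf "%d,%s,%s,%s\n", $3 - prev_time, prev_date, $1, $2 }
        { prev_time = $3; prev_date = $1; have_prev = 1 }
    ' "$csv" | sort -t, -k1,1 -nr | head -n "$top"
}
```

For example, `largest_jumps build-times.csv 10` lists the ten largest day-over-day increases together with the commit at which each was first measured. Because the script samples one commit per day, each reported commit is only an upper bound on where the regression landed; bisecting within that day's range would still be needed to pin down the responsible change.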
I wonder if the new analyzer rule caused the jump 🤔 ( One theory could be that the tree node walking code is substantial and slow to compile (it is a lot of templates 🤔 ) However, a new analyzer rule in |
After removing the One guess is that both I have some thoughts on this specific case, like putting the expansion somewhere else, or changing some Fn type parameters to trait objects. But I can't tell how good it will be, and the balance between runtime overhead and build time consumption. Or we can wait for those refactors until we figure out why |
Is your feature request related to a problem or challenge?
We observed a huge build time increase after upgrading datafusion (GreptimeTeam/greptimedb#5417). I ran a script to measure how the build time changed day by day since v38.0, and here is the result:
The build time keeps increasing and has almost doubled since v38.0. Since the codebase keeps growing, an upward trend is expected, but there are still some obvious "plateaus" and "jumps", which might be abnormal.
This is the raw data:
build-times.csv
Describe the solution you'd like
No response
Describe alternatives you've considered
No response
Additional context
No response