
Plot expectation value benchmarks #168

Open · wants to merge 20 commits into main

Conversation

@natestemen (Member) commented Jan 14, 2025

Description

This PR refactors the expectation value benchmarking script to ensure it works with the run_benchmarks.sh script. It also introduces new circuits to broaden the coverage of the expectation value tests. Scripts to visualize relative and absolute errors across different compilers over time are added, with one plot added to the README.

@jordandsullivan we have the option to plot relative or absolute error, but the relative error is much higher for ucc, and I don't understand why.

@Misty-W linked an issue Jan 14, 2025 that may be closed by this pull request
@natestemen marked this pull request as ready for review January 15, 2025 05:33
@jordandsullivan (Collaborator) left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work on this! Great getting to hack in person together.

I'm wondering why the relative errors are in the hundreds in the first place. What are the actual expectation values? I'd say we want to report errors as percentages.

@natestemen (Member, Author)

> why are the relative errors in the hundreds

I think it's because the ideal values are coming out so close to 0 that the relative errors are blowing up. You can find all the most recent results in benchmarks/results/expval_2025-01-14_20.csv. E.g. for QFT the ideal expectation value (last column) is $\mathcal{O}(10^{-21})$ so it's easy to be $>100\%$ off.

compiler,circuit,observable,simulated,absolute_error,relative_error,ideal
ucc,qft,ZZZZZZZZZZ,4.7704895589362195e-18,4.771491994589455e-18,4759.8985323285315,-1.002435653235501e-21
qiskit,qft,ZZZZZZZZZZ,8.673617379884035e-19,8.68364173641639e-19,866.2542786051877,-1.002435653235501e-21
pytket,qft,ZZZZZZZZZZ,-3.2526065174565133e-18,3.2516040818032778e-18,3243.703544769454,-1.002435653235501e-21
cirq,qft,ZZZZZZZZZZ,2.168404344971009e-19,2.178428701503364e-19,217.31356965129692,-1.002435653235501e-21

Not sure what the best course of action here is.
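For concreteness, here is a minimal plain-Python sketch (numbers copied from the ucc/qft row above) of how a near-zero ideal value inflates the relative error even when the absolute error is tiny:

```python
# Values taken from the ucc/qft row of expval_2025-01-14_20.csv.
simulated = 4.7704895589362195e-18
ideal = -1.002435653235501e-21

absolute_error = abs(simulated - ideal)       # ~4.77e-18: tiny
relative_error = absolute_error / abs(ideal)  # ~4759.9: blown up

print(f"absolute error: {absolute_error:.3e}")  # absolute error: 4.771e-18
print(f"relative error: {relative_error:.1f}")  # relative error: 4759.9
```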

@jordandsullivan (Collaborator) commented Jan 15, 2025

Okay, just as a sanity check, can you plot the simulated and ideal expectation values and standard deviation, similar to what I did for #58 (where I was running on real hardware)?
[screenshot: expectation value plot from #58]

This reminds me, we can also simply measure an array of observables in addition to ZZZZZZ, like I did there. Maybe just add in some that measure XIIIIIII or XXXXXZZZZZ, etc.
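Something like this hypothetical pandas/matplotlib sketch would do for the sanity check (the column names are assumed from the CSV rows quoted above; the results file doesn't yet record a standard deviation, so error bars are omitted here):

```python
import matplotlib.pyplot as plt
import pandas as pd

# Column names assumed from the CSV rows quoted earlier in this thread.
cols = ["compiler", "circuit", "observable",
        "simulated", "abs_error", "rel_error", "ideal"]
df = pd.read_csv("benchmarks/results/expval_2025-01-14_20.csv", names=cols)

fig, ax = plt.subplots()
for compiler, group in df.groupby("compiler"):
    ax.scatter(group["circuit"], group["simulated"], label=compiler)
# Overlay the ideal values for comparison.
ax.scatter(df["circuit"], df["ideal"], marker="x", color="black", label="ideal")
ax.set_xlabel("circuit")
ax.set_ylabel("expectation value")
ax.legend()
fig.savefig("expval_sanity_check.png")
```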

@jordandsullivan (Collaborator) left a comment

Looking good, a few suggestions.

@jordandsullivan (Collaborator) left a review comment

Trying to understand why the relative errors are in the hundreds. What are the actual expectation values we're getting? Are they what we'd expect?

@jordandsullivan (Collaborator) left a review comment

Perhaps we shouldn't use the same observable on all circuits if it is giving answers that don't seem meaningful.

@jordandsullivan (Collaborator) left a review comment

Per @willzeng, we want to split off the customization of specific observables for the different benchmarks into a separate issue. We can complete this issue with the current all-Z observable.
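For that follow-up issue, the per-benchmark customization could be as small as a lookup table. A hypothetical sketch, using the observable strings floated above purely as placeholder values:

```python
# Hypothetical per-benchmark observable table for the follow-up issue.
# "ZZZZZZZZZZ" is the current default; the others are examples from this
# thread, not settled choices.
OBSERVABLES: dict[str, list[str]] = {
    "qft": ["ZZZZZZZZZZ", "XIIIIIIIII"],
    "qaoa": ["ZZZZZZZZZZ", "XXXXXZZZZZ"],
}

def observables_for(circuit_name: str) -> list[str]:
    """Fall back to the all-Z observable for unlisted benchmarks."""
    return OBSERVABLES.get(circuit_name, ["ZZZZZZZZZZ"])
```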

Review comments on benchmarks/average_relative_error_over_time.png, benchmarks/latest_expval_benchmark_by_compiler.png, and benchmarks/latest_relative_absolute_errors_by_circuit.png (outdated, resolved).
@natestemen (Member, Author) left a review comment

It's a little suspicious how the QAOA, QV, and QCNN circuit results are all basically identical. We should make sure this is real and not an artifact of how we perform simulation (or something else)!

@jordandsullivan (Collaborator) left a review comment

Did you plot the expectation values themselves and standard deviations as suggested above?

@jordandsullivan (Collaborator) left a review comment

Let's compare the compiled gate counts between the different compilers in case they are returning approximately the same circuits.
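A rough sketch of that comparison, assuming we have handles to the compiled circuit objects each frontend returns (the variable names here are hypothetical):

```python
# compiled_qiskit, compiled_cirq, and compiled_tket are hypothetical handles
# to the circuits produced by each compiler in the benchmark run.
qiskit_counts = compiled_qiskit.count_ops()  # dict mapping gate name -> count
cirq_count = sum(1 for _ in compiled_cirq.all_operations())
tket_count = compiled_tket.n_gates

print("qiskit:", sum(qiskit_counts.values()), dict(qiskit_counts))
print("cirq:  ", cirq_count)
print("pytket:", tket_count)
```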

Successfully merging this pull request may close these issues:

Add expectation value plot to GH benchmarks pipeline