Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resolve Boogie test deadlocks #932

Merged
merged 20 commits into from
Aug 13, 2024

Conversation

keyboardDrummer
Copy link
Collaborator

@keyboardDrummer keyboardDrummer commented Aug 8, 2024

Changes

  • Made a change to the Dispose method of CustomStackSizePoolTaskScheduler.cs, so it now interrupts all its thread instead of joining them. I think this was causing the entire Boogie process to hang after it was finished, causing tests to deadlock.
  • Add blame-hang-timeout to the dotnet test invocation that runs the NUnit tests, which allows identifying which test fails when the unit tests timeout.
  • Add an undocumented option /processTimeLimit that can be used by tests to let Boogie stop itself after a certain timespan, allowing developers to get a stacktrace of where Boogie was at the point of timeout.
  • Remove APIs for cancelling requests, which were only available programmatically and not used by Dafny.
  • Fix a bug that could cause a concurrency related exception in QuantifierInstantiationInfo, which was introduced by [Civl] Improvements to pool-based quantifier instantiation #862.
  • Use an invariant culture somewhere in printing debug output, so the Boogie tests locally pass on my machine, despite my European culture.
  • Some renaming of variations of VerifyImplementation, to make the differences more clear.
  • Configure a 30 seconds timelimit per VC for all Boogie tests by default. Added an exception for a few slow tests
  • Let Boogie cancel checking of a VC after its timelimit has passed, regardless of what the solver is doing

Testing

  • Did not add any additional tests
  • Let the test-suite succeed without retries 4 times, in an attempt to determine that it resolves the test instabilities we've had.

@keyboardDrummer keyboardDrummer changed the title Set timeout to see where deadlocks are coming from Set timeouts in Boogie to see where deadlocks are coming from Aug 8, 2024
@keyboardDrummer keyboardDrummer requested review from atomb and shazqadeer and removed request for atomb August 9, 2024 09:17
@keyboardDrummer keyboardDrummer marked this pull request as ready for review August 9, 2024 09:17
@keyboardDrummer keyboardDrummer marked this pull request as draft August 9, 2024 09:33
@keyboardDrummer keyboardDrummer changed the title Set timeouts in Boogie to see where deadlocks are coming from Resolve Boogie test deadlocks Aug 9, 2024
Copy link
Collaborator

@atomb atomb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a nice improvement. I saw two tiny things in the tests that might be worth a second look.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to check in this file?

@@ -1,5 +1,5 @@
// We use boogie here because parallel-boogie doesn't work well with -proverLog
// RUN: %boogie -rlimit:800000 -proverLog:"%t.smt2" "%s"
// RUN: %boogie -processTimeLimit:0 -timeLimit:0 -rlimit:800000 -proverLog:"%t.smt2" "%s"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth having a process time limit here, too, since there's no strict guarantee that Z3 will ever terminate given a resource limit.

@keyboardDrummer keyboardDrummer marked this pull request as ready for review August 10, 2024 08:19
@keyboardDrummer keyboardDrummer requested a review from atomb August 10, 2024 08:19
@keyboardDrummer keyboardDrummer enabled auto-merge (squash) August 10, 2024 12:28
@keyboardDrummer keyboardDrummer mentioned this pull request Aug 12, 2024
@shazqadeer
Copy link
Contributor

@keyboardDrummer : Thanks for digging into the flaky CI issues. I had also noticed these small tests failing in CI with a timeout and had no idea what was going on.

Fix a bug that could cause a concurrency related exception in QuantifierInstantiationInfo, which was introduced by #862.

To improve my understanding, could you say a bit more about the bug introduced by my aforementioned PR?

@keyboardDrummer
Copy link
Collaborator Author

To improve my understanding, could you say a bit more about the bug introduced by my aforementioned PR?

The fix is in this commit: c79011b
While running tests, I saw a concurrent modification exception during a call to HashSet.Add done from FindInstantiationHints. Looking at the git history, FindInstantiationHints was originally designed to be able to be called concurrently, but since the introduction of the HashSet it no longer supported this. I replaced the use of HashSet with a ConcurrentBag so it can handle concurrent Add calls.

@keyboardDrummer keyboardDrummer merged commit 004e283 into boogie-org:master Aug 13, 2024
5 checks passed
@keyboardDrummer keyboardDrummer deleted the setTimeout branch August 13, 2024 16:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants