Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve shutdown logic #5514

Conversation

jtorin
Copy link
Contributor

@jtorin jtorin commented Nov 16, 2023

Summary

  • Don't execute updater teardown logic twice on shutdown.
  • Consistently handle the edge-case where the JVM is already on its way to terminate when the shutdown code is scheduled.
  • Improve Azure ServiceBus subscriptions removal.

Details on the implementation and answers to questions that may arise while reading it

  • The code design would for sure look more satisfactory if just teardown() was called on each updater in a simple loop.
  • However, some of the updaters performs external remote calls, which may run for a long time - potentially never succeeding.
  • If executed synchronously in a serialized matter, this could mean that some updaters never get the chance to execute.
  • Therefore, these updaters could/should execute their code in a thread. Currently this is done by scheduling their own JVM shutdown hook, which makes them asynchronous.
  • The issue (that this PR solves) is that if the teardown() method is used as the payload in the shutdown hook it will be executed twice: once by the JVM shutdown hook and once by the GraphUpdaterManager calling teardown() as part of the normal shutdown process.
  • In the case of at least the AzureSiriUpdater, this leads to error messages and dreary stacktraces.
  • Since the GraphUpdaterManager loop is done synchronously, this may delay the shutdown.
  • ApplicationShutdownSupport has been applied to a couple of other usages as well, closing the window when the JVM is already shutting down.
  • Empty teardown() methods have been removed (there is an empty default implementation).
  • The main OTP shutdown is now executed immediately if the JVM is currently shutting down. This could potentially be controversial.
  • The event processor in AzureSiriUpdater is now closed instead of stopped, suppressing errors when removing subscriptions.

Bonus question:

  • How is WebsocketGtfsRealtimeUpdater terminated? Seems like an endless while(true) loop.

Unit tests

This is mostly untestable (for once).

Johan Torin added 7 commits November 16, 2023 09:44
Call close() instead of stop() on the event processor to make
sure it has stopped using the subscription before removal.
Otherwise errors may lead to stale subscriptions lingering.
At least in the case of the AzureSiriUpdater this leads to errors the second time.

* Provide an alternative API for adding system shutdown hooks that takes a Runnable directly.
* State the shutdown logic directly to the shutdown hook, remove the teardown method.
* Handle the execute-hook-directly-if-shutting-down logic internally in ApplicationShutdownSupport.
This avoids running it twice on shutdown.
Notice that this adds immediate execution in the case that the JVM is already shutting down.
If the JVM is already shutting down, the graceful shutdown of the HTTP server should
be executed immediately. Previously the server could potentially be started anyway.
If the JVM is already shutting down when the hook is scheduled, the message informing of
this is logged immediately, previously it could potentially be lost.
@jtorin jtorin requested a review from a team as a code owner November 16, 2023 08:58
Copy link

codecov bot commented Nov 16, 2023

Codecov Report

Attention: 29 lines in your changes are missing coverage. Please review.

Comparison is base (2686fcb) 66.91% compared to head (eafd577) 66.92%.

Files Patch % Lines
...mework/application/ApplicationShutdownSupport.java 0.00% 9 Missing ⚠️
...er/ext/siri/updater/SiriETGooglePubsubUpdater.java 0.00% 6 Missing ⚠️
...t/siri/updater/azure/AbstractAzureSiriUpdater.java 0.00% 6 Missing ⚠️
...entripplanner/standalone/server/GrizzlyServer.java 0.00% 4 Missing ⚠️
...n/java/org/opentripplanner/standalone/OTPMain.java 0.00% 2 Missing ⚠️
...org/opentripplanner/standalone/OtpStartupInfo.java 0.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             dev-2.x    #5514   +/-   ##
==========================================
  Coverage      66.91%   66.92%           
+ Complexity     15691    15690    -1     
==========================================
  Files           1819     1819           
  Lines          70233    70228    -5     
  Branches        7392     7391    -1     
==========================================
- Hits           46998    46997    -1     
+ Misses         20777    20772    -5     
- Partials        2458     2459    +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@t2gran t2gran requested a review from vpaturet November 16, 2023 14:46
@vpaturet
Copy link
Contributor

This makes the shutdown logic much simpler.
I tested locally and did not see any issue, but this is inherently difficult to test.
Is this code already running on your systems?

@leonardehrenfried
Copy link
Member

Skane will test this on their test/prod environements and report back.

@t2gran
Copy link
Member

t2gran commented Nov 21, 2023

Nice! I like it :-)

@t2gran
Copy link
Member

t2gran commented Nov 21, 2023

How is WebsocketGtfsRealtimeUpdater terminated? Seems like an endless while(true) loop.

The code is from 2013, are someone using this?

@leonardehrenfried
Copy link
Member

The websocket updater was originally written for the Dutch project and as far as I'm aware not used anywhere.

It had to be rewritten in #3900 and we couldn't find anyone that could test it.

@Arilith Are you using it in the Netherlands?

@Arilith
Copy link
Contributor

Arilith commented Nov 23, 2023

The websocket updater was originally written for the Dutch project and as far as I'm aware not used anywhere.

It had to be rewritten in #3900 and we couldn't find anyone that could test it.

@Arilith Are you using it in the Netherlands?

Personally, I do not know of anyone. I'd also like to tag @skinkie for completeness sake.

@skinkie
Copy link
Contributor

skinkie commented Nov 23, 2023

@sven4all is Moop using this interface? It was used before.

@leonardehrenfried
Copy link
Member

This is ready to merge.

@jtorin jtorin merged commit 8904671 into opentripplanner:dev-2.x Dec 12, 2023
5 checks passed
@jtorin jtorin deleted the updater-shutdown-azure-subscriptions-removals branch December 12, 2023 11:13
t2gran pushed a commit that referenced this pull request Dec 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants