Adjust queues to prioritize account syncs, handle missing current day values #1682

Merged
merged 1 commit into from
Jan 24, 2025

Conversation

zachgoll
Collaborator

@zachgoll zachgoll commented Jan 24, 2025

Several users have reported in #1524 that certain account graphs do not include the latest day's balance. I have not been able to reproduce this, but one potential cause is self-hosted queues skipping the background jobs that generate the balances.

Insufficient DB connections

Currently, we run self-hosted GoodJob jobs in async mode, which means the queue runs in the same process as the web server. This is sufficient for most self-hosted instances, since the number of requests to the server is very low (~1-5 total users).

That said, when in async mode, the required DB pool size increases:

  • 3 connections for the Puma web server
  • 5 connections for GoodJob queue threads
  • 3 connections for GoodJob listener, cron, and executor

This means we need 11 total DB connections (3 + 5 + 3) to support a self-hosted instance. My guess is that the missing / sporadic graph balances are a result of jobs exhausting the connection pool, timing out, and not syncing the latest balances.

This PR increases the default DB pool size to 11 to accommodate this.

If running GoodJob in external mode (default for production, and used for our hosted instance), each process requires a different number of DB connections:

DB_POOL_SIZE=8 bundle exec good_job start
DB_POOL_SIZE=3 bundle exec rails server
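The arithmetic behind these pool sizes can be sketched in a few lines of Ruby. This is only an illustration: the per-component counts come from the list above, while the constant names, the method name, and the mode symbols are hypothetical and do not exist in the codebase or in GoodJob.

```ruby
# Hypothetical helper: sums the DB connections listed in this PR description.
# Names are made up for illustration; only the counts come from the text.
PUMA_CONNECTIONS = 3             # Puma web server
GOOD_JOB_QUEUE_CONNECTIONS = 5   # GoodJob queue threads
GOOD_JOB_UTILITY_CONNECTIONS = 3 # GoodJob listener, cron, and executor

def required_pool_size(mode)
  case mode
  when :async
    # Web server and queue share one process, so they share one pool.
    PUMA_CONNECTIONS + GOOD_JOB_QUEUE_CONNECTIONS + GOOD_JOB_UTILITY_CONNECTIONS
  when :external_worker
    # Separate `good_job start` process: queue threads plus utility connections.
    GOOD_JOB_QUEUE_CONNECTIONS + GOOD_JOB_UTILITY_CONNECTIONS
  when :external_web
    # Separate `rails server` process: only Puma needs connections.
    PUMA_CONNECTIONS
  else
    raise ArgumentError, "unknown mode: #{mode.inspect}"
  end
end
```

Under these assumptions, the async case gives the new default of 11, and the two external cases give the 8 and 3 used in the commands above.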

Loading data state

To further protect against the case where an account is missing / not properly syncing the latest balances, I have added a "loading" state to the graphs to let the user know that we're still calculating the latest balances.

Ideally, this shouldn't be shown often, because we have logic to sync user accounts immediately when they log in. The thinking here is that we'd rather show "loading" than an invalid balance graph that leads users to think their finances are different from what they actually are.
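As a rough illustration of the idea, the check below treats a balance series as "loading" when it has no entry for the current day. This is a sketch under assumptions, not the code in this PR: the method name `balances_loading?` and the `Date => amount` hash shape are hypothetical.

```ruby
require "date"

# Hypothetical check: a series is considered "loading" when it is empty
# or its most recent balance is older than today.
# `balances` is assumed to be a Hash mapping Date => amount.
def balances_loading?(balances, today: Date.today)
  balances.empty? || balances.keys.max < today
end
```

A view could use a check like this to decide between rendering the graph and rendering the loading state.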

[Screenshot: CleanShot 2025-01-24 at 13 24 48]

Queue latencies

In addition to the above changes, I have implemented 3 queues based on latency:

  • latency_low - jobs that take ~30 seconds or less
  • latency_medium - jobs that take 1-2 minutes (e.g. the account sync job)
  • latency_high - jobs that take 5+ minutes (e.g. EnrichDataJob)

This should drastically speed up account syncs by offloading the expensive and slow EnrichDataJob to a separate worker pool entirely.
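To make the split concrete, here is a minimal sketch that picks a queue from a job's expected duration. The queue names come from the list above; the method name `queue_for` and the exact cut-offs are assumptions, since the description only gives rough ranges (~30 seconds, 1-2 minutes, 5+ minutes) and says nothing about durations that fall between them.

```ruby
# Hypothetical mapping from expected duration (in seconds) to queue name.
# The cut-offs between the documented ranges are assumptions.
def queue_for(expected_seconds)
  if expected_seconds <= 30
    :latency_low
  elsif expected_seconds < 300
    :latency_medium
  else
    :latency_high
  end
end
```

In Active Job, a job class normally declares its queue statically with `queue_as`, so a helper like this mostly serves as a rule of thumb for deciding which queue a new job belongs in.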

@zachgoll zachgoll linked an issue Jan 24, 2025 that may be closed by this pull request
@zachgoll zachgoll merged commit 3140835 into main Jan 24, 2025
5 checks passed
@zachgoll zachgoll deleted the 1524-bug-some-graphs-do-not-show-up-to-current-date branch January 24, 2025 18:39
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug: Some graphs do not show up to current date
1 participant