Job Retry
When creating job objects you have the option of setting the `retryMax` and `retryDelay` values. Together these values determine what happens when a job fails, either because a worker node stops responding or because the job process itself fails.
Every job has a property called `dateRetry`, which is used to determine whether the job is ready for processing after a failure has occurred. The `dateRetry` value is set when the job is retrieved from the database for processing. The retrieval query will not return jobs whose `dateRetry` value is later than the current date and time.
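The retrieval rule can be illustrated with a small filter over plain objects. This is only a sketch: `isReadyForProcessing` is a hypothetical helper, and the queue itself applies this check inside the database query rather than in application code.

```javascript
// Hypothetical helper: mirrors the rule that jobs whose dateRetry is
// still in the future are not returned for processing.
function isReadyForProcessing (job, now = new Date()) {
  return job.dateRetry.getTime() <= now.getTime()
}

const reviewTime = new Date('2016-01-01T00:10:00Z')
const storedJobs = [
  { id: 'a', dateRetry: new Date('2016-01-01T00:05:00Z') }, // already due
  { id: 'b', dateRetry: new Date('2016-01-01T00:15:00Z') } // not yet due
]

const readyJobs = storedJobs.filter(job => isReadyForProcessing(job, reviewTime))
console.log(readyJobs.map(job => job.id)) // [ 'a' ]
```

Job `b` is skipped at this review and only becomes eligible once the current time passes its `dateRetry` value.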
Currently the formula used to set the `dateRetry` value during the job retrieval process is:

`now() + job.timeout + ( job.retryDelay * job.retryCount )`
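As a worked example, the formula can be written as a pair of plain functions. The function names are hypothetical, and the sketch assumes all durations are in seconds, matching the values used on this page.

```javascript
// Sketch of the dateRetry formula. Durations are assumed to be in seconds.
function secondsUntilRetry (job) {
  return job.timeout + (job.retryDelay * job.retryCount)
}

// Hypothetical helper: converts the offset into an absolute date.
function calculateDateRetry (job, now = new Date()) {
  return new Date(now.getTime() + secondsUntilRetry(job) * 1000)
}

const exampleJob = { timeout: 300, retryDelay: 600, retryCount: 2 }
const retrievedAt = new Date('2016-01-01T00:00:00Z')

console.log(secondsUntilRetry(exampleJob)) // 1500
console.log(calculateDateRetry(exampleJob, retrievedAt).toISOString()) // 2016-01-01T00:25:00.000Z
```

Because `retryCount` multiplies `retryDelay`, the wait grows linearly with each retry. With `retryDelay` at zero, the offset is always equal to `timeout`.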
The plan is to move this to an exponential formula once RethinkDB has a `power` function.
As you can see, to disable the retry delay and make jobs retry as soon as possible, simply set `retryDelay` to zero. The `dateRetry` value is then `now() + job.timeout` on every retrieval.
If we take the default job properties and the Master Queue default `masterReviewPeriod` of 310 seconds, the following sequence of events will occur:
- The job has never been processed and has default properties.
  - `status = 'waiting'`
  - `timeout = 300`
  - `retryCount = 0`
  - `retryMax = 3`
  - `retryDelay = 600`
- The job is retrieved from the database, setting the `dateRetry` value.
  - `status = 'active'`
  - `dateRetry = now() + timeout`
  - The job fails for some reason.
  - `status = 'retry'`
  - `retryCount = 1`
- The job remains inactive within the database until after 300 seconds.
- The Master Queue database review is initiated and the job is retrieved from the database for the first retry.
  - `status = 'active'`
  - `dateRetry = now() + timeout + (retryDelay * retryCount)`
  - The job fails again for some reason.
  - `status = 'retry'`
  - `retryCount = 2`
- The job remains inactive within the database until after 900 seconds.
- The Master Queue database review is initiated and the job is retrieved from the database for the second retry.
  - `status = 'active'`
  - `dateRetry = now() + timeout + (retryDelay * retryCount)`
  - The job fails again. What is wrong with this job?
  - `status = 'retry'`
  - `retryCount = 3`
- The job remains inactive within the database until after 1500 seconds.
- The Master Queue database review is initiated and the job is retrieved from the database for the third and final retry.
  - `status = 'active'`
  - `dateRetry = now() + timeout + (retryDelay * retryCount)` _this is redundant however still set_
  - The job fails for the last time.
  - `status = 'failed'`
  - Because the job status is set to failed it will no longer be retrieved from the database.
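The sequence above can be reproduced with a small simulation over a plain object. This is a sketch under stated assumptions, not the queue's own code: `retrieveJob` and `failJob` are hypothetical helpers, durations are in seconds, the rule "retry while `retryCount` is less than `retryMax`" is inferred from the sequence, and the Master Queue review interval is ignored, so each retrieval happens at the earliest possible moment.

```javascript
// Simulation of the retry sequence using plain objects; no database involved.
// Hypothetical helper: marks the job active and applies the dateRetry formula.
function retrieveJob (job, nowSeconds) {
  job.status = 'active'
  job.dateRetry = nowSeconds + job.timeout + (job.retryDelay * job.retryCount)
}

// Hypothetical helper: assumed failure rule inferred from the sequence above.
function failJob (job) {
  if (job.retryCount < job.retryMax) {
    job.status = 'retry'
    job.retryCount++
  } else {
    job.status = 'failed'
  }
}

const simulatedJob = { status: 'waiting', timeout: 300, retryCount: 0, retryMax: 3, retryDelay: 600 }
const waits = []
let clock = 0

while (simulatedJob.status !== 'failed') {
  retrieveJob(simulatedJob, clock)
  waits.push(simulatedJob.dateRetry - clock) // seconds before the job is eligible again
  failJob(simulatedJob)
  clock = simulatedJob.dateRetry // earliest time the job can be retrieved again
}

console.log(waits) // [ 300, 900, 1500, 2100 ]
console.log(simulatedJob.status) // failed
```

The first three values match the inactive periods listed above. The final value, 2100, corresponds to the redundant `dateRetry` set during the last retry, which is never used because the job ends in the `failed` state.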