Job Retry

Grant Carthew edited this page Aug 16, 2016 · 24 revisions

When creating job objects you can configure the timeout, retryMax, and retryDelay options. These options determine what happens when a job fails, either because the Node.js process stops responding or because the job's processing fails. See the Job Options document for more detail.

Every job has a property called dateRetry which is used to determine if the job is ready for processing after a failure has occurred. The dateRetry value is set when the job is retrieved from the database for processing. The retrieval query will not return jobs where the dateRetry value is greater than the current date time value.
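The retrieval rule can be pictured as a simple filter. The following is a minimal in-memory sketch, not the query the library actually runs against the database; `isReadyForRetrieval` and the sample jobs are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: a job is eligible only when its dateRetry is not in the future.
function isReadyForRetrieval(job, now) {
  return job.dateRetry.getTime() <= now.getTime();
}

const currentTime = new Date('2016-08-16T00:00:00Z');
const sampleJobs = [
  { id: 'a', dateRetry: new Date('2016-08-15T23:55:00Z') }, // due five minutes ago
  { id: 'b', dateRetry: new Date('2016-08-16T00:05:00Z') }  // due in five minutes
];

const readyIds = sampleJobs
  .filter((job) => isReadyForRetrieval(job, currentTime))
  .map((job) => job.id);

console.log(readyIds); // [ 'a' ]
```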

Currently the formula used to set the dateRetry value during the job retrieval process is:

now() + job.timeout + ( job.retryDelay * job.retryCount )
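As a rough illustration, the same formula can be written in plain JavaScript. This is a sketch that assumes `timeout` and `retryDelay` are expressed in seconds, as in the walkthrough on this page; `computeDateRetry` is a hypothetical name, not part of the library.

```javascript
// Sketch of the dateRetry formula: now() + timeout + (retryDelay * retryCount).
// timeout and retryDelay are assumed to be in seconds.
function computeDateRetry(job, now) {
  const delaySeconds = job.timeout + job.retryDelay * job.retryCount;
  return new Date(now.getTime() + delaySeconds * 1000);
}

const retrievedAt = new Date(0); // epoch start keeps the arithmetic easy to read
const exampleJob = { timeout: 300, retryDelay: 600, retryCount: 2 };
const exampleDateRetry = computeDateRetry(exampleJob, retrievedAt);

// 300 + (600 * 2) = 1500 seconds after retrieval
console.log(exampleDateRetry.getTime() / 1000); // 1500
```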

The future plan is to move this to an exponential formula once RethinkDB has a power function.

As the formula shows, setting retryDelay to zero removes the extra retry delay, so a failed job becomes eligible for retry as soon as its timeout period has passed.
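To make the effect of retryDelay concrete, the sketch below compares the wait added at each retrieval for a delay of 600 seconds and a delay of zero. The 300 second timeout is the value used in the walkthrough on this page; `retryWaitSeconds` is a hypothetical helper.

```javascript
// Seconds added to now() at retrieval: timeout + (retryDelay * retryCount).
function retryWaitSeconds(timeout, retryDelay, retryCount) {
  return timeout + retryDelay * retryCount;
}

const retryCounts = [0, 1, 2, 3];
const withDelay = retryCounts.map((count) => retryWaitSeconds(300, 600, count));
const withoutDelay = retryCounts.map((count) => retryWaitSeconds(300, 0, count));

console.log(withDelay);    // [ 300, 900, 1500, 2100 ]
console.log(withoutDelay); // [ 300, 300, 300, 300 ]
```

With a zero retryDelay the wait never grows beyond the timeout, whatever the retryCount.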

If we take the job default properties and the Queue Master default 'masterInterval' value, which is 310 seconds, then the following sequence of events will occur:

1.  The job has never been processed and has default properties. It has been added to the queue.
    -   `status = 'added'`
    -   `timeout = 300`
    -   `retryCount = 0`
    -   `retryMax = 3`
    -   `retryDelay = 600`
2.  The job is retrieved from the database setting the dateRetry value.
    -   `status = 'active'`
    -   `timeout = 300`
    -   `retryCount = 0`
    -   `retryMax = 3`
    -   `retryDelay = 600`
    -   `dateRetry = now() + timeout`
3.  The job fails for some reason.
    -   `status = 'failed'`
    -   `timeout = 300`
    -   `retryCount = 1`
    -   `retryMax = 3`
    -   `retryDelay = 600`
4.  The job remains inactive within the database until after dateRetry or approximately 300 seconds.
5.  The Queue Master database review is initiated and the job is retrieved from the database for the first retry.
    -   `status = 'active'`
    -   `timeout = 300`
    -   `retryCount = 1`
    -   `retryMax = 3`
    -   `retryDelay = 600`
    -   `dateRetry = now() + timeout + (retryDelay * retryCount)`
6.  The job fails again for some reason.
    -   `status = 'failed'`
    -   `timeout = 300`
    -   `retryCount = 2`
    -   `retryMax = 3`
    -   `retryDelay = 600`
7.  The job remains inactive within the database until after dateRetry or approximately 900 seconds.
8.  The Queue Master database review is initiated and the job is retrieved from the database for the second retry.
    -   `status = 'active'`
    -   `timeout = 300`
    -   `retryCount = 2`
    -   `retryMax = 3`
    -   `retryDelay = 600`
    -   `dateRetry = now() + timeout + (retryDelay * retryCount)`
9.  The job fails again. What is wrong with this job?
    -   `status = 'failed'`
    -   `timeout = 300`
    -   `retryCount = 3`
    -   `retryMax = 3`
    -   `retryDelay = 600`
10. The job remains inactive within the database until after dateRetry or approximately 1500 seconds.
11. The Queue Master database review is initiated and the job is retrieved from the database for the third and final retry.
    -   `status = 'active'`
    -   `timeout = 300`
    -   `retryCount = 3`
    -   `retryMax = 3`
    -   `retryDelay = 600`
    -   `dateRetry = now() + timeout + (retryDelay * retryCount)` _this is redundant however still set_
12. The job fails for the last time.
    -   `status = 'terminated'`
13. Because the job status is set to terminated it will no longer be retrieved from the database.
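The sequence above can be replayed with a small simulation. This is only a sketch of the state changes described in the list, assuming a job that fails on every attempt; `retrieve` and `fail` are hypothetical stand-ins, not functions from the library, and the values are in seconds.

```javascript
// Default values taken from the walkthrough (seconds).
const simulatedJob = { status: 'added', timeout: 300, retryCount: 0, retryMax: 3, retryDelay: 600 };
const waits = [];

// Hypothetical stand-in for retrieval: marks the job active and returns
// the number of seconds between retrieval and the new dateRetry.
function retrieve(job) {
  job.status = 'active';
  return job.timeout + job.retryDelay * job.retryCount;
}

// Hypothetical stand-in for failure handling: the job is terminated once
// retryCount has reached retryMax, otherwise it is marked failed and counted.
function fail(job) {
  if (job.retryCount >= job.retryMax) {
    job.status = 'terminated';
  } else {
    job.status = 'failed';
    job.retryCount += 1;
  }
}

while (simulatedJob.status !== 'terminated') {
  waits.push(retrieve(simulatedJob));
  fail(simulatedJob);
}

console.log(waits); // [ 300, 900, 1500, 2100 ]
console.log(simulatedJob.status, simulatedJob.retryCount); // terminated 3
```

The last value, 2100, corresponds to the dateRetry set during the final retrieval, which is never used because the job is terminated.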

Job Progress

As a final note, please review the Job.progress document. This document explains how your job handling function can report progress updates. These progress updates will extend the job timeout counter and update the dateRetry property. This will prevent the job from failing due to extended job processing time.
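A rough picture of why progress updates prevent a timeout failure is sketched below. It assumes that a progress update recomputes dateRetry from the current time using the same formula as job retrieval; that detail is an assumption here, so check the Job.progress document for the actual behaviour. `reportProgress` is a hypothetical function, and the values are in seconds.

```javascript
// Hypothetical sketch: recording progress pushes dateRetry forward from "now".
// Assumes the same formula as retrieval: now + timeout + (retryDelay * retryCount).
function reportProgress(job, percent, now) {
  const seconds = job.timeout + job.retryDelay * job.retryCount;
  job.progress = percent;
  job.dateRetry = new Date(now.getTime() + seconds * 1000);
  return job;
}

// Job retrieved at t = 0, so its dateRetry is initially t = 300 seconds.
const longJob = { timeout: 300, retryDelay: 600, retryCount: 0, progress: 0, dateRetry: new Date(300 * 1000) };

// Progress reported at t = 200 seconds moves dateRetry to t = 500 seconds.
reportProgress(longJob, 50, new Date(200 * 1000));
console.log(longJob.dateRetry.getTime() / 1000); // 500
```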
