The block generation job has custom output logic to allow each reducer to output to multiple block files.
When speculative execution is enabled, this can result in two copies of the same block file being generated (one of which may be incomplete). This can be worked around by setting `mapreduce.reduce.speculative=false`.
When a reducer attempt fails, the partial output files will not be cleaned up. I'm not aware of an easy workaround for this beyond manually cleaning up the files after the job completes.
We should have each reducer attempt write to its own staging directory and move the output files to their final location only after the attempt completes successfully.
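As a rough illustration of the staging idea, here is a minimal sketch. It uses `java.nio.file` on a local filesystem rather than the Hadoop `FileSystem` API, and the class name `StagedBlockWriter`, the `_staging` directory name, and the `attemptId` parameter are all hypothetical, not taken from the job's actual code. Each attempt writes into its own staging directory, files are moved into the final directory only once every block has been written, and the staging directory is removed whether the attempt succeeds or fails.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: stage block files per attempt and publish them only on success. */
public class StagedBlockWriter {

    /**
     * Writes every block into finalDir/_staging/attemptId, then moves each file
     * into finalDir. If any write fails, nothing is published.
     */
    public static void writeBlocks(Path finalDir, String attemptId, Map<String, String> blocks)
            throws IOException {
        Path staging = finalDir.resolve("_staging").resolve(attemptId);
        Files.createDirectories(staging);
        try {
            for (Map.Entry<String, String> block : blocks.entrySet()) {
                Files.write(staging.resolve(block.getKey()),
                        block.getValue().getBytes(StandardCharsets.UTF_8));
            }
            // Publish only after all blocks are complete. Whether an existing
            // target is replaced by ATOMIC_MOVE is implementation specific, so a
            // real job would still need a policy for duplicate attempts.
            for (String name : blocks.keySet()) {
                Files.move(staging.resolve(name), finalDir.resolve(name),
                        StandardCopyOption.ATOMIC_MOVE);
            }
        } finally {
            // Runs on success and on failure, so partial files never linger.
            deleteRecursively(staging);
        }
    }

    /** Deletes dir and everything under it, children before parents. */
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }
}
```

In the real job the same pattern would presumably go through the Hadoop `FileSystem` rename call and the task attempt ID, or through an `OutputCommitter`, but that wiring is not shown here.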