Compound Mode Prediction Appendix

1. Description of the algorithm

The general idea behind compound prediction is to generate a weighted average of two different predictions of the same block to develop a final prediction. Let Prediction_1 and Prediction_2 denote two different predictions of the same block. Sample p(i,j) in the compound prediction is then generated using sample p1(i,j) from Prediction_1, sample p2(i,j) from Prediction_2 and weight m(i,j) as follows:

p(i,j) = m(i,j)p1(i,j) + (1-m(i,j))p2(i,j)
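As a minimal sketch of the equation above, the snippet below blends two predictions sample by sample. The function name `blend_compound` and the floating-point weights in [0, 1] are assumptions made for illustration; an encoder would typically use fixed-point masks.

```python
def blend_compound(pred1, pred2, mask):
    """Blend two predictions sample by sample: p = m*p1 + (1 - m)*p2.

    pred1, pred2 and mask are 2-D lists with identical dimensions;
    each mask value is the weight in [0, 1] applied to pred1.
    """
    blended = []
    for row1, row2, row_m in zip(pred1, pred2, mask):
        blended.append([m * a + (1.0 - m) * b
                        for a, b, m in zip(row1, row2, row_m)])
    return blended


p1 = [[100, 100], [100, 100]]
p2 = [[50, 50], [50, 50]]
m = [[1.0, 0.5], [0.5, 0.0]]
print(blend_compound(p1, p2, m))  # [[100.0, 75.0], [75.0, 50.0]]
```

A weight of 1 selects Prediction_1 only, a weight of 0 selects Prediction_2 only, and intermediate weights mix the two.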

Figure 1 illustrates the process of generating compound mode predictions.

comp_mode_pred_fig1

Figure 1. Compound mode prediction generation.

Four different compound prediction types are supported:

  • Inter-Intra prediction: The mask (i.e. the set of weights) is based on the sample position relative to the block boundary.

  • Wedge prediction: The mask is based on a wedge codebook. It can be used for inter-inter or inter-intra prediction.

  • Distance-weighted compound prediction: The weights are based on the distance between the current frame and the reference frame.

  • Difference-weighted compound prediction: The weights are based on the difference between the two inter predictions.

Compound Inter-Intra Prediction

The compound inter-intra prediction mode is useful in blocks that contain previously occluded areas. Inter prediction is usually preferred for non-occluded content, whereas intra prediction is helpful in uncovered areas. Combining the two produces a prediction that benefits from both. Only the H_PRED, V_PRED, DC_PRED and SMOOTH_PRED intra modes are supported. The mask for the intra prediction P1(i, j) applies a smoothly decaying weight in the direction of intra prediction. The mask is inferred from a primitive 128-tap 1-D decaying function ii_weights1d(.):

  • m_DC(i,j) = 0.5,

  • m_H(i,j) = ii_weights1d(a*j),

  • m_V(i,j) = ii_weights1d(a*i),

  • m_SMOOTH(i,j) = ii_weights1d(a*min(i, j)),

where a = 128/size_of_long_edge(block_size), and m_DC, m_H, m_V and m_SMOOTH are the masks for the inter-intra smooth modes involving the DC, horizontal, vertical and smooth intra prediction modes, respectively. The array ii_weights1d is given below.

ii_weights1d(.):
60, 58, 56, 54, 52, 50, 48, 47, 45, 44, 42, 41, 39, 38,
37, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23,
22, 22, 21, 20, 19, 19, 18, 18, 17, 16, 16, 15, 15, 14,
14, 13, 13, 12, 12, 12, 11, 11, 10, 10, 10, 9, 9, 9, 8,
8, 8, 8, 7, 7, 7, 7, 6, 6, 6, 6, 6, 5, 5, 5, 5, 5, 4,
4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
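The mask construction described above can be sketched as follows. This is an illustration under stated assumptions, not the encoder's implementation: the function name `smooth_interintra_mask` is hypothetical, i is taken to be the row index and j the column index, and the table entries are assumed to be in units of 1/64 so that the DC weight of 0.5 corresponds to 32. The demo uses a stand-in table so that the snippet is self-contained; the 128 values listed above should be substituted for real use.

```python
def smooth_interintra_mask(mode, width, height, weights_1d):
    """Return the intra weight mask (height x width) for a smooth mode.

    mode is one of "II_DC_PRED", "II_H_PRED", "II_V_PRED", "II_SMOOTH_PRED".
    weights_1d is the 128-entry decaying table (assumed 1/64 units).
    """
    a = 128 // max(width, height)  # table step per sample along the long edge
    mask = []
    for i in range(height):        # i: row index (assumption)
        row = []
        for j in range(width):     # j: column index (assumption)
            if mode == "II_DC_PRED":
                w = 32             # constant weight of 0.5
            elif mode == "II_H_PRED":
                w = weights_1d[a * j]
            elif mode == "II_V_PRED":
                w = weights_1d[a * i]
            elif mode == "II_SMOOTH_PRED":
                w = weights_1d[a * min(i, j)]
            else:
                raise ValueError("unsupported inter-intra mode: " + mode)
            row.append(w)
        mask.append(row)
    return mask


# Stand-in decaying table for the demo only; NOT the real ii_weights1d values.
demo_weights = [max(1, round(60 * 0.96 ** k)) for k in range(128)]
mask = smooth_interintra_mask("II_V_PRED", 8, 8, demo_weights)
```

For an 8x8 block the step is a = 16, so row i of the vertical mask reads table entry 16*i and the intra weight decays with distance from the top edge.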

Compound Wedge Prediction

The general idea in compound wedge prediction is to generate a better prediction of areas around edges by combining two different predictions of the block. The feature makes use of a wedge codebook where wedge orientations are either horizontal, vertical or oblique with slopes: 2, -2, 0.5 and -0.5 for square and rectangular blocks, as shown in Figure 2.

Wedge_Codebook_2020_01_11

Figure 2. Wedge Codebook.

Using two predictions Prediction_1 and Prediction_2 for the block, a final prediction p(i,j) for sample (i,j) in the block is generated by weighting the two predictions:

p(i,j) = m(i,j) p1(i,j) + (1 - m(i,j)) p2(i,j)

where m(i,j) is a function of the distance of the pixel to the wedge line. The two predictors can both be inter predictions, or one can be inter and the other intra, in which case the intra mode is constrained to be DC, V, H or Smooth.
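To make the dependence of m(i,j) on the distance to the wedge line concrete, here is a hedged sketch that builds a soft mask from the signed distance to a line through the block centre. Everything in it is an assumption for illustration: the function name `soft_wedge_mask`, the linear ramp, the ramp width and the 0 to 64 weight range. The actual wedge masks come from the codebook in Figure 2, not from this formula.

```python
import math


def soft_wedge_mask(width, height, angle_deg, ramp=4.0, max_weight=64):
    """Soft wedge mask from the signed distance to a line through the centre.

    angle_deg gives the direction of the line's unit normal. Samples far on
    the positive side get max_weight, samples far on the negative side get 0,
    and samples within `ramp` of the line get a linearly interpolated weight.
    """
    nx = math.cos(math.radians(angle_deg))
    ny = math.sin(math.radians(angle_deg))
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    mask = []
    for i in range(height):
        row = []
        for j in range(width):
            dist = (j - cx) * nx + (i - cy) * ny          # signed distance
            t = min(1.0, max(0.0, 0.5 + dist / (2.0 * ramp)))
            row.append(int(round(t * max_weight)))
        mask.append(row)
    return mask


mask = soft_wedge_mask(16, 8, 0.0)   # vertical wedge line
inverse = [[64 - m for m in row] for row in mask]
```

The inverse mask swaps the roles of the two predictions, which is what a wedge sign would select.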

Difference-based Compound Prediction

The difference-based compound prediction mode addresses cases where wedge prediction is not good enough, for example when a block contains a moving edge that is not straight. It considers two different predictions Prediction_1 and Prediction_2 of the same block, computes the pixel-wise differences between the two predictions, generates a mask for each of the two predictions based on those differences, and applies the masks to the two predictions to generate the final prediction for the block.

The mask for sample (i,j) is given by m(i,j) = b + a * |p1(i,j) - p2(i,j)|, where b controls the strength of the weight and a smooths the variation of the mask values around b.

The prediction is generated using p(i,j) = m(i,j) * p1(i,j) + (1 - m(i,j)) * p2(i,j). Both the mask m(i,j) and its inverse (1 - m(i,j)) are evaluated, and the one that provides the best RD cost is selected.
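The following sketch illustrates the mask computation and the choice between the mask and its inverse. It is written under several assumptions that are not stated in the text: masks are expressed in 1/64 units and clamped to [0, 64], the defaults b = 38 and a = 1/16 are illustrative, and plain SSE stands in for the full RD cost. All function names are hypothetical.

```python
def diffwtd_mask(pred1, pred2, b=38, a=1.0 / 16, max_weight=64):
    """m(i,j) = clamp(b + a*|p1 - p2|, 0, max_weight), in 1/64 units."""
    return [[min(max_weight, max(0, int(b + a * abs(x - y))))
             for x, y in zip(r1, r2)]
            for r1, r2 in zip(pred1, pred2)]


def blend64(pred1, pred2, mask):
    """Fixed-point blend: (m*p1 + (64 - m)*p2 + 32) // 64."""
    return [[(m * x + (64 - m) * y + 32) // 64
             for x, y, m in zip(r1, r2, rm)]
            for r1, r2, rm in zip(pred1, pred2, mask)]


def sse(block_a, block_b):
    return sum((x - y) ** 2
               for ra, rb in zip(block_a, block_b)
               for x, y in zip(ra, rb))


def pick_diffwtd_mask(source, pred1, pred2):
    """Return (inverted, mask) for the lower-distortion of m and 64 - m."""
    mask = diffwtd_mask(pred1, pred2)
    inverse = [[64 - m for m in row] for row in mask]
    cost_mask = sse(source, blend64(pred1, pred2, mask))
    cost_inverse = sse(source, blend64(pred1, pred2, inverse))
    if cost_inverse < cost_mask:
        return True, inverse
    return False, mask


pred_a = [[100, 200]]
pred_b = [[100, 40]]
print(diffwtd_mask(pred_a, pred_b))  # [[38, 48]]
```

Where the two predictions agree the mask stays at b; where they differ strongly, the weight shifts further towards one of the predictions, and the inverse mask covers the case where the other prediction is the better match.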

Distance-based Compound Prediction

In distance-based prediction mode, the weighting applied to the two inter predictions is a function of the distance between the reference frames and the current frame. The idea is to give more weight to the prediction from the closer reference frame. Let d0 and d1 denote the distances from the current frame to the forward and backward reference pictures, respectively. The weights depend on the ratio d1/d0 and on a set of thresholds for that ratio. Let fwd_offset and bck_offset be the weights used in the distance-based compound prediction. Then we have the following:

  • Case where d0>d1: The fwd_offset and bck_offset weights correspond to the largest Threshold value for which d1/d0>Threshold is true.

  • Case where d0<=d1: The fwd_offset and bck_offset weights correspond to the smallest Threshold value for which d1/d0<Threshold is true.

  • When d0=0 or d1=0, if d0<=d1 then fwd_offset = 13 and bck_offset = 3, else fwd_offset = 3 and bck_offset = 13

Table 1 below provides the weights as a function of d1/d0.

Table 1. fwd_offset and bck_offset as a function of the ratio d1/d0 for the case where d0>0 and d1>0.

comp_mode_pred_table1
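The selection rules above can be sketched as follows. The threshold rows in this snippet are placeholders chosen only so that the example runs; they are not the entries of Table 1 and should be replaced with the real values. The function names, the assumption that fwd_offset weights the forward prediction, and the assumption that the two offsets always sum to 16 (as 13 + 3 does) are likewise illustrative.

```python
import math

# Placeholder rows (threshold, fwd_offset, bck_offset); NOT the Table 1 values.
ROWS_D0_GT_D1 = [(0.0, 3, 13), (1 / 3.5, 4, 12), (1 / 2.5, 5, 11), (1 / 1.5, 7, 9)]
ROWS_D0_LE_D1 = [(1.5, 9, 7), (2.5, 11, 5), (3.5, 12, 4), (math.inf, 13, 3)]


def dist_wtd_offsets(d0, d1):
    """Return (fwd_offset, bck_offset) from the frame distances d0 and d1."""
    if d0 == 0 or d1 == 0:
        return (13, 3) if d0 <= d1 else (3, 13)
    ratio = d1 / d0
    if d0 > d1:
        # Largest threshold for which d1/d0 > threshold holds.
        matches = [row for row in ROWS_D0_GT_D1 if ratio > row[0]]
        _, fwd, bck = max(matches, key=lambda row: row[0])
    else:
        # Smallest threshold for which d1/d0 < threshold holds.
        matches = [row for row in ROWS_D0_LE_D1 if ratio < row[0]]
        _, fwd, bck = min(matches, key=lambda row: row[0])
    return fwd, bck


def blend_dist_wtd(sample_fwd, sample_bck, fwd_offset, bck_offset):
    """Weighted average with rounding, assuming the offsets sum to 16."""
    return (fwd_offset * sample_fwd + bck_offset * sample_bck + 8) >> 4


fwd, bck = dist_wtd_offsets(1, 4)   # forward reference is much closer
print(fwd, bck, blend_dist_wtd(160, 0, fwd, bck))
```

With these placeholder rows, the closer reference always receives the larger weight, which is the intent stated in the text.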

2. Implementation of the algorithm

Control macros/flags:

Table 2. Inter-intra-related control flags.

comp_mode_pred_table2

Table 3. Control flag for wedge prediction.

comp_mode_pred_table3

Table 4. Control flags related to inter-inter compound mode prediction.

comp_mode_pred_table4

Details of the implementation

The main function calls associated with compound mode prediction in mode decision are indicated in Figure 3.

comp_mode_pred_fig5

Figure 3. Function calls associated with compound mode prediction in mode decision.

The generation of coded blocks using the compound mode involves three main steps, namely the injection of the compound mode candidates, the processing of those candidates in MD stages 0 to 3, and the final encoding of selected compound mode candidates in the encode pass.

Step 1: Injection of compound mode candidates.

The three main functions associated with compound mode prediction at the candidate injection stage are precompute_intra_pred_for_inter_intra, inter_intra_search and determine_compound_mode. The first two are related to the generation of inter-intra compound candidates. The third is related to the injection of inter-inter compound candidates.

  1. precompute_intra_pred_for_inter_intra

For a given block, the function generates the DC, Vertical, Horizontal and Smooth intra predictions that are used in subsequent stages of the compound mode candidate injection process.

  2. inter_intra_search

For a given block, the generation of inter-intra wedge prediction and the smooth inter-intra prediction is performed using the function inter_intra_search. The function is invoked only for the case of single reference inter predictions. The steps involved in the inter-intra search are outlined below.

  • Perform inter prediction through the function call av1_inter_prediction. Only luma prediction is computed.

  • Determine if wedge prediction could be used for the given block size using the function is_interintra_wedge_used. Only 8x8, 8x16, 16x8, 16x16, 16x32, 32x16, 32x32, 8x32 and 32x8 block sizes are allowed.

  • Enable the flag enable_smooth_interintra.

  • Loop over the intra prediction modes: II_DC_PRED, II_V_PRED, II_H_PRED, II_SMOOTH_PRED

    • Perform smooth filtering of the inter prediction and the intra prediction through the function call combine_interintra_highbd or combine_interintra based on the already computed inter predictions and intra predictions. The intra predictions are already generated in the function precompute_intra_pred_for_inter_intra.

    • Compute the associated RD cost and keep track of the best RD cost and the corresponding intra prediction mode.

  • Perform inter-intra wedge prediction based on the best intra prediction mode from the smooth intra search step above using the function pick_interintra_wedge. The details of the function are included below.
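The search outlined above can be sketched as a simple loop over the four intra modes, followed by a block-size check for the wedge step. This is an illustration only: the helper names `wedge_allowed` and `search_smooth_interintra` are hypothetical, the `combine` and `cost` callables are supplied by the caller, and the demo uses a plain average and SSE in place of the real blending and RD cost.

```python
WEDGE_BLOCK_SIZES = {(8, 8), (8, 16), (16, 8), (16, 16), (16, 32),
                     (32, 16), (32, 32), (8, 32), (32, 8)}
II_MODES = ["II_DC_PRED", "II_V_PRED", "II_H_PRED", "II_SMOOTH_PRED"]


def wedge_allowed(width, height):
    """Block-size check corresponding to the list of allowed sizes above."""
    return (width, height) in WEDGE_BLOCK_SIZES


def search_smooth_interintra(source, inter_pred, intra_preds, combine, cost):
    """Return (best_mode, best_cost) over the smooth inter-intra modes.

    intra_preds maps each mode name to its precomputed intra prediction.
    """
    best_mode, best_cost = None, float("inf")
    for mode in II_MODES:
        pred = combine(mode, inter_pred, intra_preds[mode])
        c = cost(source, pred)
        if c < best_cost:
            best_mode, best_cost = mode, c
    return best_mode, best_cost


# Toy demo on 1-D "blocks" with an averaging combiner and SSE cost.
def average(mode, inter, intra):
    return [(a + b) / 2 for a, b in zip(inter, intra)]


def sse(src, pred):
    return sum((s - p) ** 2 for s, p in zip(src, pred))


source = [10, 20]
inter = [10, 10]
intra = {"II_DC_PRED": [0, 0], "II_V_PRED": [10, 30],
         "II_H_PRED": [50, 50], "II_SMOOTH_PRED": [10, 28]}
best_mode, best_cost = search_smooth_interintra(source, inter, intra, average, sse)
```

The best mode found here would then be passed on to the wedge search, but only when `wedge_allowed` returns true for the block size.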

pick_interintra_wedge: Determines the best wedge option in the inter-intra wedge prediction. Returns the wedge index and its associated cost.

  • The search is allowed only for block sizes 8x8, 8x16, 16x8, 16x16, 16x32, 32x16, 32x32, 8x32 and 32x8. (is_interintra_wedge_used)

  • Compute the residual for intra prediction and the difference between the inter prediction and the intra prediction. (aom_highbd_subtract_block / aom_subtract_block)

  • Determine the best wedge option to use based on the above computed residuals and difference. (pick_wedge_fixed_sign). The details of the function are included below.

pick_wedge_fixed_sign: Determines the best wedge option for a fixed wedge sign (0).

  • Check if inter_intra wedge is allowed, as described above. (is_interintra_wedge_used)

  • Loop over the available wedge prediction options

    • Determine the mask associated with the current wedge option. (av1_get_contiguous_soft_mask)

    • Compute the corresponding prediction residuals based on the intra prediction residual and the difference between the inter prediction residuals and the intra prediction residuals. (av1_wedge_sse_from_residuals)

    • Compute the R-D cost and keep track of the best option. (pick_wedge_fixed_sign and other computations.)
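The reason the search works from residuals rather than re-blending for every wedge option can be shown with a short sketch. If p0 receives weight m/64 and p1 receives (64 - m)/64, the blended residual equals r1 + m*d/64, where r1 = source - p1 and d = p1 - p0. The code below is an illustration of that identity, not the library routine: the names `wedge_sse_sketch` and `pick_wedge_sketch` are hypothetical, the 1/64 mask scale is an assumption, and distortion alone stands in for the R-D cost.

```python
def wedge_sse_sketch(r1, d, mask):
    """SSE of the blended prediction, scaled by 64*64 to stay an exact integer.

    r1[k] = source[k] - p1[k], d[k] = p1[k] - p0[k], and mask[k] in [0, 64]
    is the weight given to p0.
    """
    return sum((64 * r + m * diff) ** 2 for r, diff, m in zip(r1, d, mask))


def pick_wedge_sketch(r1, d, candidate_masks):
    """Return (index, scaled_sse) of the candidate mask with the lowest SSE."""
    costs = [wedge_sse_sketch(r1, d, m) for m in candidate_masks]
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs[best]


source = [100, 100, 20, 20]
p0 = [100, 100, 90, 90]
p1 = [30, 30, 20, 20]
r1 = [s - b for s, b in zip(source, p1)]
d = [b - a for a, b in zip(p0, p1)]
masks = [[64, 64, 0, 0], [0, 0, 64, 64], [32, 32, 32, 32]]
best_index, best_sse = pick_wedge_sketch(r1, d, masks)
```

In this toy case the first mask takes p0 for the left half and p1 for the right half, which reproduces the source exactly, so it is selected with zero distortion.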

  3. determine_compound_mode

The main function calls starting at determine_compound_mode are outlined in Figure 4.

comp_mode_pred_fig6

Figure 4. Continuation of Figure 3 showing the main function calls starting with determine_compound_mode.

The generation of COMPOUND_WEDGE and COMPOUND_DIFFWTD predictions is performed using the function determine_compound_mode, which calls the function search_compound_diff_wedge. The rest of the details are outlined in the following.

For a given block, the generation of the single reference inter predictions is performed in the function av1_inter_prediction / av1_inter_prediction_hbd. Only luma predictions are generated.

Generate the residuals associated with the prediction from List1 reference picture, as well as the difference between the residuals corresponding to the predictions from List0 and List1 reference pictures, respectively.

In the function pick_interinter_mask, in the case of COMPOUND_WEDGE, the function pick_interinter_wedge is called. In the case of COMPOUND_DIFFWTD, the function pick_interinter_seg is called.

pick_interinter_wedge generates the prediction for the case of inter-inter COMPOUND_WEDGE and updates the best COMPOUND_WEDGE prediction mode and corresponding cost. This is allowed only for block sizes 8x8, 8x16, 16x8, 16x16, 16x32, 32x16, 32x32, 8x32 and 32x8. In this function, both the nominal mask and its inverse are evaluated and the best mask is selected. The best mask also indicates the mask sign.

pick_interinter_seg generates the prediction for the case of inter-inter COMPOUND_DIFFWTD and updates the best COMPOUND_DIFFWTD mask. Block size should be at least 8x8 for bipred to be allowed.

As an example, consider the flow below for the function inject_mvp_candidates_II:

  1. Check if compound reference mode is allowed, i.e. the candidate should not be a single-reference candidate and the block size should be at least 8x8 for bipred to be allowed.

  2. Determine the number of compound modes to try:

    • If 8x8 <= block size <= 32x32, then compound modes to try = compound_types_to_try

    • else

      • If (compound_types_to_try == MD_COMP_WEDGE)

        compound modes to try = MD_COMP_DIFF0

      • else compound modes to try = compound_types_to_try

  3. Optimize further the number of modes to evaluate based on the variance of the source block. If the variance of the source block is smaller than a given threshold (inter_inter_wedge_variance_th), then MD_COMP_WEDGE is not considered in the search and compound modes to try is limited to MIN(compound modes to try, MD_COMP_DIFF0)

  4. Single reference case

    • Check if inter-intra is allowed: svt_is_interintra_allowed

      • enable_inter_intra flag should be set.

      • Block size should be at least 8x8 and at most 32x32. (is_interintra_allowed_bsize)

      • Only NEARESTMV, NEARMV, GLOBALMV and NEWMV modes are allowed. (is_interintra_allowed_mode)

      • (rf[0] > INTRA_FRAME) && (rf[1] <= INTRA_FRAME). (is_interintra_allowed_ref);

      If inter_intra is allowed, the total number of candidates to check is 3 (single-reference inter mode, inter-intra wedge, smooth inter-intra); otherwise it is set to 1 (single-reference inter mode only).

    • Loop over the NEARESTMV candidate and all the NEARMV candidates.

      • Update the candidate parameters.

      • Determine the intra prediction mode that yields the best smooth inter-intra prediction, and determine the best inter-intra wedge prediction option based on the best intra prediction mode from the smooth inter-intra prediction search. (inter_intra_search)

  5. Compound reference case

    For all NEARESTMV_NEARESTMV and NEAR_NEARMV candidates, loop over all selected compound prediction modes

    • Update the candidate parameters

    • Determine the best wedge option for the case of COMPOUND_WEDGE or the best difference weighted prediction mask for the case of COMPOUND_DIFFWTD. (pick_interinter_mask)
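Steps 2 and 3 of the flow above can be sketched as follows. The enum ordering (MD_COMP_AVG < MD_COMP_DIST < MD_COMP_DIFF0 < MD_COMP_WEDGE) is an assumption inferred from the use of MIN in step 3, the block-size test is interpreted as both dimensions lying between 8 and 32, and the function name `compound_modes_to_try` is hypothetical.

```python
# Assumed ordering of the compound types, from least to most complex.
MD_COMP_AVG, MD_COMP_DIST, MD_COMP_DIFF0, MD_COMP_WEDGE = range(4)


def compound_modes_to_try(width, height, compound_types_to_try,
                          source_variance, inter_inter_wedge_variance_th):
    """Return the highest compound type to evaluate for this block."""
    # Step 2: wedge is only available for block sizes from 8x8 to 32x32.
    in_wedge_range = min(width, height) >= 8 and max(width, height) <= 32
    if in_wedge_range:
        modes = compound_types_to_try
    elif compound_types_to_try == MD_COMP_WEDGE:
        modes = MD_COMP_DIFF0
    else:
        modes = compound_types_to_try

    # Step 3: skip wedge for low-variance source blocks.
    if source_variance < inter_inter_wedge_variance_th:
        modes = min(modes, MD_COMP_DIFF0)
    return modes


print(compound_modes_to_try(64, 64, MD_COMP_WEDGE, 1000, 100) == MD_COMP_DIFF0)
```

Both restrictions only ever lower the setting, so a block never evaluates more compound types than compound_types_to_try allows.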

Step 2: Generate compound mode candidates in MD stages 0, 1 and 2.

The two main functions involved in generating compound mode candidates in MD stages 0, 1 and 2 are warped_motion_prediction and av1_inter_prediction.

comp_mode_pred_fig7

Figure 5. Continuation of Figure 3 showing the main function calls associated with compound modes in the case of warped motion prediction.

  1. warped_motion_prediction

  • plane_warped_motion_prediction: Generates the luma and chroma warped predictions. The chroma predictions are generated for blocks that are 16x16 or larger.

    • av1_dist_wtd_comp_weight_assign: Returns the forward offset and backward offset for the case of compound reference candidates where the inter-inter compound prediction mode is COMPOUND_DISTWTD. The forward offset and backward offset are used as weights in the generation of the final prediction.

    • av1_make_masked_warp_inter_predictor: Called only in the case of a compound reference candidate where the inter-inter compound type is COMPOUND_WEDGE or COMPOUND_DIFFWTD. Generates the predictions for both of those compound types. The first step is to build the mask for the COMPOUND_DIFFWTD inter-inter compound type using the function av1_build_compound_diffwtd_mask_d16. The next step is to generate the predictions using the function build_masked_compound_no_round as follows:

      • The function av1_get_compound_type_mask is called and returns the mask for either COMPOUND_DIFFWTD or COMPOUND_WEDGE. The function av1_get_contiguous_soft_mask returns the mask for the case of COMPOUND_WEDGE. For the case of COMPOUND_DIFFWTD, the mask is computed in the step above.

      • The function aom_highbd_blend_a64_d16_mask / aom_lowbd_blend_a64_d16_mask is then called to blend the two inter predictions using the generated mask.

    • eb_av1_warp_plane is invoked in the case of BIPRED where the inter-inter compound type is COMPOUND_DISTWTD. In this case the function highbd_warp_plane / warp_plane is called and in turn calls the function eb_av1_highbd_warp_affine / eb_av1_warp_affine. The latter applies the affine transform and generates the warped motion prediction using the forward offset and backward offset weights associated with the COMPOUND_DISTWTD mode.

  • chroma_plane_warped_motion_prediction_sub8x8: Generates chroma warped motion predictions for blocks that are smaller than 16x16. The function av1_dist_wtd_comp_weight_assign is first called to obtain the forward offset and backward offset for the COMPOUND_DISTWTD case. The appropriate function in the function array convolve[][][] / convolveHbd[][][] is then called to generate the prediction using the forward offset and the backward offset weights.

  2. av1_inter_prediction

comp_mode_pred_fig8

Figure 6. Continuation of Figure 3 showing the main function calls in av1_inter_prediction associated with the compound mode.

When the inter prediction motion mode is different from WARPED_CAUSAL, the function av1_inter_prediction is called to generate the inter prediction. The main function calls associated with compound mode prediction are av1_dist_wtd_comp_weight_assign, av1_make_masked_inter_predictor and combine_interintra, which are described above.

Step 3: Generate the final compound mode predictions in the encode pass. The two main relevant functions are warped_motion_prediction and av1_inter_prediction. The two functions are described above.

3. Optimization of the algorithm

Inter-intra prediction

The settings for the different flags associated with inter-intra prediction mode are outlined in Table 5 below.

Table 5. Optimization settings for inter-intra compound prediction.

comp_mode_pred_fig9

The flag md_enable_inter_intra is used to control when the inter-intra modes are allowed as a function of the PD pass and of the flag enable_inter_intra. In the default mode, the latter is active only for MR mode and for the M0 preset; otherwise, it is active only if the config flag inter_intra_compound is active.

Inter-inter compound prediction

The flags compound_level and compound_mode control the complexity-quality tradeoff of the inter-inter compound prediction modes.

Table 6. Settings for compound_level in inter-inter compound prediction.

comp_mode_pred_fig10

Table 7. Optimization settings for the inter-inter compound prediction.

comp_mode_pred_fig11

The flag compound_types_to_try indicates the inter-inter compound mode to evaluate as a function of the PD pass and of the flag picture_control_set_ptr->parent_pcs_ptr->compound_mode. The setting for the latter in the default mode depends on sequence_control_set_ptr->compound_mode, the encoder preset and the sc_content_detected flag; otherwise, it is set to the config input value compound_level. In the default configuration, the flag sequence_control_set_ptr->compound_mode depends on the encoder preset; otherwise, it is set to the config input value compound_level.

Inter-inter wedge prediction

For the case of inter-inter wedge prediction, the flag wedge_mode decides on the tradeoff between complexity and quality for inter-inter wedge prediction. The settings for the flag are given in Table 8.

Table 8. wedge_mode settings and description.

comp_mode_pred_fig12

Currently, wedge_mode is set to 0, i.e. full search is performed all the time as indicated in Table 9.

Table 9. wedge_mode settings.

comp_mode_pred_fig13

Whether to include wedge prediction in the case of inter-inter compound prediction is also controlled by the variance of the source block. If the variance of the source block is smaller than a given threshold (inter_inter_wedge_variance_th), then MD_COMP_WEDGE is not considered in the search and the compound modes to try are limited to at most MD_COMP_DIST and MD_COMP_DIFF0.
