JCTVC-G076 CE7: Cross-check of Tool 4 - DST for Intra Chroma Residual Coding [C. Yeo, Y. H. Tan, Z. Li (I2R)]

JCTVC-G376 Cross-check report of CE7 Tool 4, Mode-Dependent DCT/DST for 4x4 Chroma Blocks (G107) [Y. Shibahara, T. Nishi (Panasonic)]

4.7.3 Discussion and Conclusions

4.8 CE8: Non-deblocking loop filtering

4.8.1 Summary

JCTVC-G038 CE8: Summary report of Core Experiment on non-deblocking loop filtering [T. Yamakage, I. S. Chong, M. Narroschke, Y.-W. Huang]

A summary of core experiment 8 (CE8) on non-deblocking loop filtering is reported. There are eight Subtests in CE8, Subtest a: Block level filter adaptation with directional feature (three proposals), Subtest b: Sub-frame delay adaptive loop filter (one proposal), Subtest c: Line Memory reduction in ALF and SAO decoding (seven proposals), Subtest d: Filter shapes and coefficient constraints, for both luma and chroma (three proposals), Subtest e: Prediction of filter coefficients (three proposals), Subtest f: Filter switching reduction at LCU boundary for LCU friendly decoding (one proposal), Subtest g: Chroma filter control (one proposal) and Subtest h: Other in-loop filters (one proposal). These are evaluated based on the common test conditions in JCTVC-F900 and additional conditions in JCTVC-F908. All mandatory results are verified by cross-checkers.

Note: G038 v4 includes a powerpoint deck with the results of the informal viewing.

Based on the cross-checkers’ comments and on further study such as unification and harmonization in some subtests (i.e., CE8.a and CE8.c), it is suggested to treat those unification and harmonization efforts as additional CE contributions. As for CE8.c, CE8 is planning to conduct an informal subjective viewing (at least for CE8.c.5, Non-CE8.c.6 and Non-CE8.c.7).

Results on “subtest 0” (encoder only) show that 1-pass ALF can be operated with 0.2% bit rate increase, and 2-pass with 0.1% bit rate increase, compared to the current 14-pass in HM (G146 also refers to that). Decide to use one of these (or switchable) for upcoming default settings (this was reviewed in the BoG on ALF and decided after that). Adopt the method according to JCTVC-G038 table 2 right column (two-pass method) for the common test conditions (see also JCTVC-G1023).

Subtest A:

Adaptation is on level of 4x4 blocks. G609 (-0.1/-0.2/-0.2/-0.4 for AI/RA/LDB/LDP). Has advantage of reduced line buffer from 5 to 4 due to reduced vertical filter size. The proposal uses 8 points instead of 4 for BA classification. In BA mode classification, it uses diagonal instead of hor/vert analysis for the star-shaped filter (hor/vert for cross-star-shaped still). With proper implementation (re-usage of results) this increases number of additions by 1.5. G316 (0.0/-0.1/-0.1/-0.1) proposes 2D merging instead of 1D (imposes syntax change); does this increase latency? G691 is a combination of both which shows that the gains are additive.

G656 is another combination of tools that was considered in this context.

G647 uses 8 classes in BA classification such that 8 filters are used instead of 16. This produces loss of 0.1%. Opinions: For high resolutions, larger number of classes may be needed. Restrictions could eventually be considered in the context of level definition

Decision: Adopt 4x4 classification, but not the other elements of G609 and G316 which would increase complexity. Keep number of classes 16 in BA classification.

General note: In ALF, modifications should mainly target complexity reductions; only proposals which give a decent gain/complexity benefit (regardless of direction) should be considered; i.e., a complexity decrease should not penalize performance and a complexity increase should give sufficient gain.

Subtest B:

G498 Simplified ALF design. Only 5x5 diamond shape, no pixel classification, no DC offset. More coarse quantization and coding.

Current implementation does not process chroma and does not include slice boundary processing.

Looking at the loss of 1%, several experts express the opinion that we should not replace the current ALF by this. Note: G499 is an improved version with lower loss.

Subtest C:

G212, presented in this context, is a combination of G208/G206 (“option 1”) and something new on SAO LB reduction (“option 2”). G211 was also considered here.

G208 uses “virtual boundary processing” = specific boundary padding method with some irregularity which could also be implemented differently e.g. by adjusting filter coefficients

Investigate visual quality of G212 option 1 against HM, adopt when it does not produce artifacts.

20 viewers took part. A “score based method” was used, where experts gave 1 point each to anchor and proposal when they were equal, and otherwise 2/0 or 0/2 if one was better.

(anchor/proposal) BQM 18/22, Cactus 22/18, Vidyo3 15/25, Vidyo4 20/20

Conclusion: No visual difference. Even 15/25 means that still 75% of the subjects thought both are equal.
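The scoring rule described above can be made concrete with a small sketch. The function name and the vote labels are illustrative assumptions and do not come from the report. The vote breakdown used below is one breakdown consistent with the reported 15/25, assuming that no viewer preferred the anchor.

```python
def score_votes(votes):
    """Return (anchor_points, proposal_points) for a list of votes.

    Each vote is "equal", "anchor" or "proposal" (hypothetical labels).
    An "equal" vote gives 1 point to each side; otherwise the preferred
    side gets 2 points and the other side gets 0.
    """
    anchor = proposal = 0
    for vote in votes:
        if vote == "equal":
            anchor += 1
            proposal += 1
        elif vote == "anchor":
            anchor += 2
        elif vote == "proposal":
            proposal += 2
        else:
            raise ValueError(f"unknown vote: {vote!r}")
    return anchor, proposal


# One breakdown of 20 viewers that reproduces the Vidyo3 result (15/25):
# 15 "equal" votes (75%) and 5 votes for the proposal, none for the anchor.
vidyo3_votes = ["equal"] * 15 + ["proposal"] * 5
print(score_votes(vidyo3_votes))
```

Each viewer distributes 2 points, so with 20 viewers every anchor/proposal pair sums to 40, as all four reported pairs do. Breakdowns with fewer "equal" votes can also yield 15/25 (for example 13 equal, 6 for the proposal and 1 for the anchor), so 75% is the share of "equal" votes only when no viewer preferred the anchor.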

Decision: Adopt G212 option 1

G207 (c.5) proposes a method to perform padding for SAO at slice boundaries (which would obviously increase the complexity). This shall also be investigated in the subjective test to identify whether there is a problem with SAO at slice boundaries (in case where across-boundary processing is disabled), but we would not adopt it at this meeting as it appears inconsistent to have something for SAO but not for de-blocking where the same problem occurs. There may also be other solutions such as post-processing of boundaries. (Note G194 is also related). Result of subjective viewing: It looks better than anchor.

Investigate combined solution for ALF and SAO for slice and tile boundaries in CE, for de-blocking currently no proposal on the table, but a solution which also includes de-blocking would be desirable.

Subtest D:

G208 (9x9 cross shape) performs best and has better performance than G648 (7x11). G208 will be tested together with G206 such that reducing line buffers by decreasing vertical filter size is not relevant anymore (same applies to G130 which reduces vertical filter size only for chroma and produces losses)

Subtest E:

(G665) Prediction of filter coefficients from other coefficients of the same filter (instead of from one filter to the next). Gain of 0.1% observed in LD B and P cases. Note: G610 is a similar idea that provides more gain.

Does not add or remove complexity (or stability). Several experts support it as the design of prediction from the same filter seems to be more straightforward than from different filter. The gain would increase when APS is sent more frequently for error resilience. Revisited in context of G610. Decision: Adopt.
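The report does not detail the predictor used in G665. As a purely illustrative sketch of predicting one coefficient from the other coefficients of the same filter, assume a filter with integer taps whose sum is expected to be close to a fixed unit gain (64 here, a hypothetical value). The centre tap can then be predicted from the remaining taps, and only the residual needs to be coded. The function names and the unit-gain assumption are not taken from the proposal.

```python
def center_residual(coeffs, center_index, unit=64):
    """Residual to transmit for the centre tap.

    Assumption (hypothetical): the taps sum to roughly `unit`, so the
    centre tap is predicted as `unit` minus the sum of all other taps.
    """
    others = sum(coeffs) - coeffs[center_index]
    predicted = unit - others
    return coeffs[center_index] - predicted


def reconstruct_center(other_coeffs, residual, unit=64):
    """Decoder side: rebuild the centre tap from the other taps and the residual."""
    return unit - sum(other_coeffs) + residual


# Twelve outer taps of 2 and a centre tap of 41 (sum 65, slightly off unit gain).
taps = [2] * 12 + [41]
res = center_residual(taps, center_index=12)
print(res, reconstruct_center(taps[:12], res))
```

In this sketch the predictor depends only on taps of the filter being coded, so no state from a previously transmitted filter is required.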

Subtest G:

Adds a third mode where, within a slice, the ALF applied to chroma is invoked whenever luma is filtered (currently it is either entirely on or off). Only marginal gain (0.2% for chroma only), and it requires one more encoder decision.

No support by other companies. No action.

Note: Question was raised what would be the performance when chroma always follows luma, and some experts expressed that they would like such a solution, however other experts expressed it might be dangerous to do this, as there may be good reasons to switch it entirely on/off.

Subtest H (G235): NLM filter.

Benefit around 0.1-0.3% for different test cases, encoder/decoder runtime increase up to 12% (enc) / 5% (dec). NLM operated alternatively with ALF. In worst case, runtime increase could be dramatically higher (if it were used everywhere).

No support by other experts – no action.

4.8.2 Contributions

JCTVC-G1023 CE8 subset 0: Improved ALF N pass encoding [I. S. Chong, M. Karczewicz, T. Yamakage, T. Watanabe, T. Chujoh, C.-Y. Chen, C.-M. Fu, C.-Y. Tsai, Y.-W. Huang, S. Lei] [late]

This evaluates a modified ALF N pass encoding algorithm. This includes improvement and bugfix of ALF N pass encoding. Coding efficiency gain for luma is 0.0 %, 0.0 %, 0.1 % and 0.2 % in HE-AI, RA, LB, and LP without encoding/decoding time increase on average. (Non-normative.)

Subtest A

JCTVC-G316 CE8.a.1: 2-D mergeable syntax [T. Ikai (Sharp), I. S. Chong, M. Karczewicz (Qualcomm), T. Yamakage, T. Watanabe, T. Chujoh (Toshiba), C.-Y. Chen, C.-M. Fu, C.-Y. Tsai, Y.-W. Huang, S. Lei (MediaTek)]

This contribution reports evaluation results of the 2-D mergeable syntax technique [JCTVC-F384]. This technique aims to free the restriction on block classification in the HM-4.0 ALF. Coding gains of 0.0 %, 0.1 %, 0.1 % and 0.1 % in HE-AI, RA, LB, and LP were reported. The decoding time ratio was 100 % to 101 % and the encoding time ratio was 99 % to 100 % in the HE case. The proposal was cross-checked by Samsung (JCTVC-G649). Harmonization with the other proposal “CE8.a.2: Directional feature calculation on subset of pixels” (JCTVC-G609) is also tested in JCTVC-G649.

JCTVC-G649 CE8 Subtest a: Cross-check of Tool 1 (JCTVC-G316) 2-D mergeable syntax [P. Lai, F. C. A. Fernandes (Samsung)]

JCTVC-G609 CE8.a.2: Directional feature calculation on subset of pixels [T. Yamakage, T. Watanabe, T. Chujoh (Toshiba), C.-Y. Chen, C.-M. Fu, C.-Y. Tsai, Y.-W. Huang, S. Lei (MediaTek), M. Karczewicz, I. S. Chong (Qualcomm), T. Ikai, A. Segall, T. Yamamoto (Sharp)]

This evaluates a directional feature calculation on a subset of pixels to improve coding efficiency. This technique uses the internal and the surrounding pixels (window pixels) of the 4x4 block unit, which have different complexity and performance trade-offs. It also evaluates changing the computational direction from horizontal/vertical directions to diagonal directions. Coding efficiency gain for luma is 0.1 %, 0.2 %, 0.2 % and 0.4 % in HE-AI, RA, LB, and LP without encoding/decoding time increase on average.

When merged with CE8.a.1, the coding efficiency is improved by 0.2 %, 0.3 %, 0.3 % and 0.4 % in HE-AI, RA, LB, and LP.

JCTVC-G464 CE8 subtest a tool 2: crosscheck of Directional feature calculation on subset of pixels [K. Sugimoto, K. Miyazawa, A. Minezawa, S. Sekiguchi (Mitsubishi)]

JCTVC-G650 CE8 Subtest a: Cross-check of Tool 2 (JCTVC-G609) Directional feature calculation on subset of pixels [P. Lai, F. C. A. Fernandes (Samsung)]

JCTVC-G647 CE8 Subtest a, Tool 3: Block-based filter adaptation with up to 8 filters (HV8) [P. Lai, F. C. A. Fernandes, I.-K. Kim (Samsung)]

This contribution presents filter adaptation design with unified and reduced number of initial filter classes for both block-based and region-based filter adaptations (BA and RA). For BA, 2 directional classes are produced by directly comparing directional features, each directional class has 4 magnitude levels, leading to 8 initial filter classes; while HM4.0 has 3 directional classes with 5 magnitude levels (15 initial classes). RA is modified to partition a frame into 8 regions corresponding to 8 initial filter classes. Such design reduces the number of testing in filter-class merging process at encoder, and unifies the syntax of merging flags under BA and RA.

The proposed method reports 0.1% BD-rate loss for the AI, RA and LDB structures, and 0.2% BD-rate loss for the LDP structure. The encoding and decoding times show no measurable change compared to HM4.0.
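The two class layouts described above (3 directional classes with 5 magnitude levels, versus 2 directional classes with 4 magnitude levels) can be illustrated with a minimal sketch. Several details here are assumptions made for illustration and are not taken from the contribution or from HM: the use of 1-D Laplacians as the directional feature, the 2:1 ratio test for the three-way direction decision, the magnitude thresholds, and the 6x6 window (a 4x4 block plus a one-sample border).

```python
def directional_features(window):
    """Sum absolute 1-D Laplacians over the interior samples of `window`.

    For a 6x6 window the interior is the 4x4 block. Returns (hor, ver).
    The Laplacian feature itself is an assumption of this sketch.
    """
    hor = ver = 0
    for y in range(1, len(window) - 1):
        for x in range(1, len(window[0]) - 1):
            c = 2 * window[y][x]
            hor += abs(c - window[y][x - 1] - window[y][x + 1])
            ver += abs(c - window[y - 1][x] - window[y + 1][x])
    return hor, ver


def magnitude_level(activity, thresholds):
    """Quantize activity into len(thresholds) + 1 levels (thresholds hypothetical)."""
    return sum(activity >= t for t in thresholds)


def class_3x5(hor, ver):
    """3 directional classes x 5 magnitude levels: class index in 0..14."""
    if hor > 2 * ver:
        direction = 1
    elif ver > 2 * hor:
        direction = 2
    else:
        direction = 0
    return direction * 5 + magnitude_level(hor + ver, (8, 16, 32, 64))


def class_2x4(hor, ver):
    """2 directional classes (direct comparison) x 4 magnitude levels: 0..7."""
    direction = 0 if hor >= ver else 1
    return direction * 4 + magnitude_level(hor + ver, (8, 32, 64))


# Vertical stripes: strong horizontal variation, no vertical variation.
stripes = [[100 * (x % 2) for x in range(6)] for _ in range(6)]
hor, ver = directional_features(stripes)
print(hor, ver, class_3x5(hor, ver), class_2x4(hor, ver))
```

The two-class variant needs a single comparison per block for the direction decision, and fewer initial classes mean fewer candidates to test in the encoder's class-merging step.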

JCTVC-G611 CE8.a.3: Cross check of Samsung's directional feature calculation on subset of pixels [I. S. Chong, M. Karczewicz (Qualcomm)]

Subtest B

JCTVC-G498 CE8: ALF with low latency and reduced complexity [A. Fuldseth, G. Bjøntegaard (Cisco)]

The document describes a low complexity ALF technique suitable for low latency applications. One single set of ALF filter coefficients is computed, quantized and transmitted sequentially for each block using a single pass technique. The proposed ALF also has low complexity by using only a 5x5 diamond shape and no decoder-side variance calculations. When applied to low complexity configurations, BD-rate gains between 1.4 % and 3.7 % are reported. When applied to high efficiency configurations, BD-rate losses between 0.5 % and 1.1 % are reported.
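To illustrate what a 5x5 diamond shape means in practice, the following minimal sketch enumerates the taps and applies such a filter to one sample. The coefficient values, the 6-bit fixed-point precision and the edge clamping are assumptions made for illustration and are not taken from G498.

```python
def diamond_offsets(radius=2):
    """Offsets (dy, dx) with |dy| + |dx| <= radius; radius 2 is a 5x5 diamond."""
    return [(dy, dx)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if abs(dy) + abs(dx) <= radius]


def filter_sample(img, y, x, coeffs, shift=6):
    """Filter one sample with integer coefficients given as {(dy, dx): c}.

    Coordinates are clamped at the picture edge (an assumed padding rule).
    Coefficients are assumed to sum to 1 << shift; no DC offset is added,
    only the usual rounding term.
    """
    h, w = len(img), len(img[0])
    acc = 1 << (shift - 1)  # rounding term
    for (dy, dx), c in coeffs.items():
        yy = min(max(y + dy, 0), h - 1)
        xx = min(max(x + dx, 0), w - 1)
        acc += c * img[yy][xx]
    return acc >> shift


# Hypothetical coefficients: centre tap 40, the 12 outer taps 2 each (sum 64).
taps = diamond_offsets()
coeffs = {off: (40 if off == (0, 0) else 2) for off in taps}
flat = [[50] * 8 for _ in range(8)]
print(len(taps), filter_sample(flat, 0, 0, coeffs))
```

The diamond has 13 taps, compared with 25 for a full 5x5 square, and reaches at most two rows above and below the current sample.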

JCTVC-G371 CE8 Subset b: cross-check for Cisco’s ALF (JCTVC-G498) [J. Xu, A. Tabatabai (Sony)]

JCTVC-G864 CE8 Subset b.1: Cross check of Cisco's ALF with low latency and reduced complexity [I. S. Chong, M. Karczewicz]

Subtest C