Recent years have witnessed significant advancements in video compression technology. The implementation of modern video coding standards, such as H.264/AVC [wiegand2003overview], HEVC [sullivan2012overview], VP9 [mukherjee2013latest], and AV1 [chen2018overview], has greatly benefited high-quality video streaming applications over bandwidth-constrained networks. Despite the improved capabilities of video codecs, banding artifacts remain a dominant visual impairment of high-quality, high-definition compressed videos. Banding artifacts, which present as sharply defined color bands on otherwise smoothly varying regions, are often quite noticeable, in part because of the tendency of the visual apparatus to enhance sharp gradients, as exemplified by the Mach bands illusion [ratliff1965mach]. To further optimize the perceptual quality of compressed user-generated videos [tu2020ugc, tu2020comparative], developing ways to detect and reduce banding artifacts is a problem of pressing interest.
[Fig. 1: (a) Original; (b) Compressed; (c) Debanded]
Prior approaches to banding reduction fall into three categories. If debanding is attempted on the source content before encoding, it is a pre-processing step [roberts1962picture, joy1996reducing, daly2004decontouring, daly2003bit]. However, these methods have generally been developed for decontouring heavily quantized pictures rather than compressed/banded video content. A second approach is in-loop processing, whereby the quantization process is adjusted inside the encoder to reduce banding effects [yoo2009loop, casali2015adaptive]. The third approach, post-filtering, has been the most widely studied, since it offers maximum freedom for decoder implementations, i.e., the design of post-filters can be relatively unconstrained and flexible. Most banding removal algorithms follow a two-step procedure: first, banding regions are detected and located in the source video frame; then, spatially local filtering is applied to reduce the banding artifacts, sometimes with dithering incorporated. For the banding detection stage, some methods [daly2004decontouring, ahn2005flat, choi2006false, lee2006two, huang2016understanding, bhagavathy2009multiscale] exploit local features, either pixel- or block-wise, such as the image gradient, contrast, or entropy, to measure potential banding statistics. Other methods utilize image segmentation techniques [wang2016perceptual, baugh2014advanced]. Either way, banding artifacts are subsequently suppressed by applying low-pass smoothing filters [daly2004decontouring, lee2006two, choi2006false], dithering techniques [wang2014multi, bhagavathy2009multiscale, jin2011composite, ahn2005flat], or combinations of these [baugh2014advanced, huang2016understanding].
We deem post-filtering a better approach to handling banding artifacts, since it can be performed outside of the codec loop, e.g., in the display buffer, hence offering maximum freedom of design. Moreover, state-of-the-art adaptive loop filters like those implemented in VP9 or HEVC have not been observed to suppress or have any effect on banding artifacts [wang2016perceptual, huang2016understanding]. Here we propose a new adaptive debanding filter, which we dub the AdaDeband method, as a post-processing solution to reduce perceived banding artifacts in compressed videos. Unlike many prior debanding/decontouring models, we recast the problem as one of reconstruction and re-quantization: we first employ an adaptive interpolation filter along each banded region to estimate the “ideal” plane at a higher bit-depth, then we re-quantize the signal to 8 bits via a dithering technique that suppresses quantization error, yielding a visually pleasing enhancement of the banded region. Fig. 1 shows an example of banding artifacts in a VP9-compressed frame and the corresponding AdaDeband-filtered output. We demonstrate that our proposed method outperforms other recent debanding methods, both qualitatively and quantitatively.
II. Adaptive Debanding Filter
Fig. 2 illustrates a schematic of AdaDeband, which comprises three modules. First, a banding detector is deployed to localize the banding edges and band segments with pixel precision. A content-aware, size-varying smoothing filter is then applied to reconstruct the gradients within band segments at a higher bit-depth, while preserving image texture details. The final step re-quantizes the reconstructed smooth regions to 8-bit SDR resolution using a dithering process that reduces quantization error, yielding a perceptually pleasing debanded image.
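As a minimal illustration, the three-stage pipeline above can be sketched as a simple composition of callables. The three function arguments are hypothetical stand-ins for the detection, reconstruction, and re-quantization modules; this is a structural sketch, not the actual implementation.

```python
import numpy as np

def adadeband(frame, detect, reconstruct, requantize):
    """Sketch of the three-stage AdaDeband pipeline. `detect`,
    `reconstruct`, and `requantize` are hypothetical callables
    standing in for the three modules described above."""
    bem, bm = detect(frame)                      # 1. banding edge map + band map
    high_bitdepth = reconstruct(frame, bem, bm)  # 2. higher-bit-depth estimate
    return requantize(high_bitdepth)             # 3. dithered 8-bit re-quantization
```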
II-A. Banding Region Detection
We utilize the Canny-inspired blind banding detector proposed in [tu2020bband], with the same parameters, for banding edge extraction. Only its first two modules, pre-processing and banding edge extraction, are performed to obtain a banding edge map (BEM). For convenience, we restate some definitions from [tu2020bband]: after self-guided pre-filtering, pixels whose gradient magnitudes fall below a low threshold are labelled flat pixels (FP), while pixels whose gradient magnitudes exceed a higher threshold are marked textured pixels (TP). The remaining pixels are grouped into a candidate banding pixel (CBP) set, from which the BEM is extracted. Connected-component (CC) labeling is applied to the set of non-textured pixels (FP ∪ CBP), thereby generating a band map (BM). Band edges (BE) define the boundaries of adjacent bands (B); i.e., the bands (B) are framed by band edges (BE), as shown in Fig. 3. In this way we define and extract two component sets, BEM and BM, which together compose all the banded regions of a given video frame.
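The pixel classification and band-map extraction described above can be sketched as follows. The threshold values and the Sobel gradient are assumptions of this sketch; the actual detector of [tu2020bband] uses its own pre-filtering and parameters.

```python
import numpy as np
from scipy import ndimage

def classify_pixels(luma, t_flat=2.0, t_texture=12.0):
    """Illustrative sketch: label flat (FP), textured (TP), and candidate
    banding (CBP) pixels by gradient magnitude, then form the band map (BM)
    by connected-component labeling of the non-textured pixels (FP u CBP).
    Threshold values here are assumptions, not the detector's parameters."""
    gx = ndimage.sobel(luma.astype(np.float64), axis=1)
    gy = ndimage.sobel(luma.astype(np.float64), axis=0)
    grad = np.hypot(gx, gy)
    fp = grad < t_flat          # flat pixels
    tp = grad > t_texture       # textured pixels
    cbp = ~(fp | tp)            # candidate banding pixels
    bm, n_bands = ndimage.label(fp | cbp)  # each label is one band B_j
    return fp, tp, cbp, bm, n_bands
```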
II-B. Banding Segment Reconstruction
Banding artifacts usually occur in regions of small (but non-zero) gradients, appearing as bands separated by steps of color and/or luminance. As depicted in Fig. 3, pixels lying within a band have very similar colors and luminances. A simple way to restore an ‘original’ smooth local image is to apply a low-pass filter (LPF) that interpolates across bands. A traditional 2D LPF may be formulated as:
$$\hat{I}(x, y) = \sum_{(m, n) \in W_{x,y}} w_{m,n}\, I(x - m,\, y - n), \tag{1}$$

where $I(x, y)$ are input pixel (luminance or color) values at spatial locations $(x, y)$, and $w_{m,n}$ are the filter coefficients, where $\sum_{(m,n) \in W_{x,y}} w_{m,n} = 1$. Here we use the simple moving average filter ($w_{m,n} = 1/|W_{x,y}|$) as the smoothing LPF, although one may use any other smoothing filter, such as a Gaussian-shaped filter, or even a nonlinear device such as a median filter.
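As a minimal illustration, the moving-average instance of Eq. (1) corresponds to a uniform filter over a square window, assuming scipy is available:

```python
import numpy as np
from scipy import ndimage

def moving_average(img, r):
    """Moving-average instance of Eq. (1): every coefficient equals
    1/|W|, so the filter is a uniform average over a (2r+1)x(2r+1)
    window centered at each pixel."""
    return ndimage.uniform_filter(img.astype(np.float64), size=2 * r + 1)
```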
To effectively smooth band discontinuities, the span of the LPF should be adequately wide relative to the band width. In Fig. 3(a), for example, only a small LPF is needed on the pink band, whereas a larger LPF is required for the blue segment. Thus, we determine the filter size dynamically for each band, or for each segment of a band if its width varies along its axis, based on the extracted BEM and BM. Let $B_j$ be an exemplar band, framed by the detected band edges $BE_{i-1}$ and $BE_i$, as depicted in Fig. 3(a). For all pixels located within band $B_j$, the spatial extent of the LPF is defined in terms of the ratio of the band area to the lengths of the adjacent BEs. In the unusual case where a band is enclosed by a single BE, the ratio is scaled by four. Both cases are expressed here:
$$S_j =
\begin{cases}
\displaystyle |B_j| \Big/ \frac{1}{|\mathcal{E}_j|} \sum_{i \in \mathcal{E}_j} |BE_i|, & |\mathcal{E}_j| > 1, \\[6pt]
\displaystyle 4\,|B_j| \,/\, |BE_i|, & \mathcal{E}_j = \{i\},
\end{cases} \tag{2}$$

where $|\cdot|$ is the cardinality of the pixel set (the area of a band, or the length of a band edge), and $\mathcal{E}_j$ denotes the set of band edges that enclose $B_j$. Finally, define the space-varying radius of the LPF window at $(x, y) \in B_j$ to be half of $S_j$:

$$r(x, y) = \lceil S_j / 2 \rceil, \qquad (x, y) \in B_j. \tag{3}$$
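The band-adaptive window rule described above can be sketched as follows. How the ratio is aggregated when a band has multiple adjacent edges is an assumption of this sketch (here, the mean edge length is used).

```python
import math
import numpy as np

def band_lpf_radius(band_area, edge_lengths):
    """Sketch of the band-adaptive window rule: the LPF span for band B_j
    is the ratio of its area to the length of its adjacent band edges,
    scaled by 4 when a single edge encloses the band; the window radius
    is half the span. Averaging over multiple edges is an assumption."""
    if len(edge_lengths) == 1:
        span = 4.0 * band_area / edge_lengths[0]
    else:
        span = band_area / float(np.mean(edge_lengths))
    return max(1, math.ceil(span / 2.0))
```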
This ensures that the span of the LPF adapts to the local geometry of the banded areas, as shown in Fig. 3(a). Nevertheless, it must also be recognized that a large LPF smoothing an extended banding region may sweep across visually important textures or object boundaries. This kind of content blurring is not acceptable. Accordingly, we constrain the LPF not to include any TP in its sampling window, which is achieved by recursively halving the filter size (as in Fig. 3(b)):
$$r(x, y) \leftarrow \lfloor r(x, y) / 2 \rfloor \quad \text{while} \quad W_{x,y} \cap \mathrm{TP} \neq \emptyset, \tag{4}$$

where $W_{x,y}$ is the set of indices of the (square) filter window centered at $(x, y)$, with linear dimensions $2\,r(x, y) + 1$.
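The texture-aware shrinking step can be sketched directly: halve the radius until the square window centered at the pixel contains no textured pixel.

```python
import numpy as np

def shrink_radius(tp_mask, x, y, r):
    """Halve the window radius until the square window centered at
    (x, y) contains no textured pixel (TP), as described above.
    `tp_mask` is a boolean map of textured pixels."""
    h, w = tp_mask.shape
    while r >= 1:
        y0, y1 = max(0, y - r), min(h, y + r + 1)
        x0, x1 = max(0, x - r), min(w, x + r + 1)
        if not tp_mask[y0:y1, x0:x1].any():
            return r
        r //= 2  # window touches texture: halve and retry
    return 0
```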
The above filter-size determination is performed on every banded pixel in the CBP set to generate a window-size map for the subsequent deployment of size-varying LPFs. We observed, however, that the estimated window-size map tends to be noisy, due to occasional instability of the banding detector. Thus, a median filter is applied to further “denoise” the estimated window-size map, after which the LPFs are applied via Eq. (1).
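The denoising of the window-size map amounts to a single median-filter pass; the 3×3 neighborhood used here is an assumption of this sketch.

```python
import numpy as np
from scipy import ndimage

def denoise_radius_map(radius_map):
    """Median-filter the per-pixel LPF radius map to remove isolated
    spurious values left by unstable detection. The 3x3 neighborhood
    is an assumed parameter."""
    return ndimage.median_filter(radius_map, size=3)
```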
[Fig. 3: (a) Size variability; (b) Content (texture) awareness]
II-C. Re-quantization with Dithering
Dithering techniques are used in a variety of ways in the design of debanding algorithms. Some methods, for example, only apply filtering without dithering to remove false contours [daly2004decontouring, lee2006two, choi2006false]. Other methods apply only a very small amount of dither, either by adding random noise (noise-shaping) [yoo2009loop, bhagavathy2009multiscale, jin2011composite, wang2014multi], or by stochastic shuffling of pixels [ahn2005flat, huang2016understanding], but without any smoothing filter. Among those that combine filtering with dithering, Baugh [baugh2014advanced] proposed to apply dithering after smoothing on banded regions, while the authors of [huang2016understanding] suggested probabilistic dithering prior to average filtering. Here we will show that dithered re-quantization very effectively ameliorates banding artifacts arising from compression.
Fig. 4 shows an example of the effects of processing banded areas with and without dithering. It may be observed that dithered quantization of the reconstructed banded regions reduces re-quantization error, yielding a pixel distribution similar to the original, whereas direct quantization without dithering retains the bands. Formally, randomized (dithered) quantization [wannamaker2000theory] may be expressed as:
$$\tilde{I}(x, y) = Q\big(\hat{I}(x, y) + Z(x, y)\big), \tag{5}$$

where $\hat{I}$ is the filtered image calculated via Eq. (1), $Z$ is a 2D noise image, and $Q(\cdot)$ is an N-to-8-bit quantizer.
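The dithered re-quantization above can be sketched as follows; the ±0.5-level uniform dither amplitude is an assumption of this sketch, not the noise pattern actually adopted (discussed below).

```python
import numpy as np

def dither_quantize(recon, rng=None):
    """Dithered N-to-8-bit re-quantization: add a noise image Z before
    uniform rounding so that the quantization error loses its banded
    structure. The one-step-wide uniform dither is an assumption."""
    rng = np.random.default_rng(0) if rng is None else rng
    z = rng.uniform(-0.5, 0.5, size=recon.shape)
    return np.clip(np.round(recon + z), 0, 255).astype(np.uint8)
```

A useful property of this scheme is that the local mean of the quantized output tracks the fractional level of the high-bit-depth input, which plain rounding cannot do.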
[Fig. 4: (a) Original and compressed; (b) Estimated reconstruction; (c) Uniform quantization; (d) Dithered quantization]
[Fig. 5: (a) FCDR; (b) FFmpeg-deband; (c) AdaDeband-UN; (d) AdaDeband-GUN; (e) AdaDeband-…; (f) AdaDeband-…. PSNR/SSIM on Gaming_1080P-71a5: 30.49/0.888, 30.49/0.888, 30.46/0.885, 30.48/0.887; on LyricVideo_1080P-5a1f: 45.12/0.994, 44.93/0.993, 44.45/0.990, 44.35/0.991.]
It should be noted that the pattern of the noise image used for dithering shapes the texture of the resulting debanded regions, hence the type of noise must be carefully selected. We demonstrate several well-known noise patterns and their corresponding dithered outcomes, both visually and quantitatively, in Fig. 5. Among commonly used methods [ulichney1988dithering, lippel1971effect, ulichney1999review], we chose to employ Gaussian-blurred uniform white noise, as shown in Fig. 5(d); other effective options include uniform noise (Fig. 5(c)) and the pattern shown in Fig. 5(e). It may also be observed from Fig. 5 that AdaDeband outperforms FCDR and FFmpeg-deband when smoothing the banded staircase, yielding a more pleasing visual enhancement.
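Generating the adopted dither pattern, Gaussian-blurred uniform white noise, can be sketched as below; the amplitude and blur sigma are assumed parameters for illustration.

```python
import numpy as np
from scipy import ndimage

def gaussian_blurred_uniform_noise(shape, amp=0.5, sigma=1.0, seed=0):
    """Sketch of the adopted dither pattern: uniform white noise
    low-pass filtered by a Gaussian. `amp`, `sigma`, and `seed` are
    assumed parameters, not values from the paper."""
    rng = np.random.default_rng(seed)
    noise = rng.uniform(-amp, amp, size=shape)
    return ndimage.gaussian_filter(noise, sigma=sigma)
```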
We compared our proposed adaptive debanding filter (AdaDeband) against two recent debanding/decontouring methods proposed for compressed videos, FCDR [huang2016understanding] and the FFmpeg-deband filter [ffmpeg-deband], on ten selected videos from the YouTube UGC dataset [wang2019youtube] (filenames: Gaming_1080P-71a5, NewsClip_1080P-2eb0, Sports_1080P-19d8, Vlog_720P-60f8, LyricVideo_1080P-3b60, LyricVideo_1080P-5a1f, MusicVideo_720P-44c1, MusicVideo_720P-3698, Sports_720P-058f, Vlog_720P-32b2). All test sequences were scaled to 720p for computational convenience, and we compressed the videos using the VP9 constrained quality (CQ) mode with -crf 39 to generate noticeable banding artifacts. In our implementation, only the luma channel was filtered, since much less banding was observed on the Cb/Cr channels; however, AdaDeband can also be applied to each color or chroma component.
Visual comparisons of the debanding methods are shown in Fig. 6. AdaDeband effectively smoothed the banding while leaving edges and textures well preserved. FFmpeg-deband, in contrast, tended to over-smooth weak textures and under-smooth relatively large banded regions. Another advantage of AdaDeband is its adaptiveness: it can remove bands of any scale or shape, whereas FFmpeg-deband and FCDR require the specification of a set of filter parameters, which may limit performance in scenarios for which they were not tuned.
To further verify the adaptiveness of AdaDeband to different quantization parameters, we generated different levels of banding using the VP9 CQ mode, with -crf ranging from 1 to 51, on the example video in Fig. 1. As Fig. 7 shows, our proposed debanding filter produced robust BBAND scores across the different quantization levels without any further tuning of filter parameters.
In addition to the visual results, we also quantitatively compared the debanded videos using several common video quality models. It has been shown, however, that traditional video quality metrics like PSNR and even the SSIM family [wang2004image, pei2015image, wang2003multiscale] do not align well with human perception of banding [wang2016perceptual]. Moreover, if the original video already contains banding artifacts, which is often the case for high-resolution videos, full-reference quality models become less reliable, since greater similarity to the original does not necessarily indicate better perceptual quality. Therefore, we also compare using a blind banding assessment model, the BBAND index [tu2020bband], which has been shown to deliver predictions consistent with subjective judgments of banding. Table I shows that AdaDeband outperformed FCDR and is on par with FFmpeg-deband in terms of BBAND scores, indicating that it produces perceptually favorable results. Moreover, AdaDeband yields slightly better PSNR and SSIM scores than FFmpeg-deband, indicating less distortion relative to the original.
Table I. Mean (standard deviation) quality scores on the ten test videos.
|Method||PSNR (std)||SSIM (std)||BBAND (std)|
|FCDR||39.08 (5.10)||0.9709 (0.0308)||0.5790 (0.2494)|
|FFmpeg||38.84 (4.96)||0.9677 (0.0311)||0.2264 (0.0903)|
|AdaDeband||38.97 (4.97)||0.9699 (0.0309)||0.2206 (0.0895)|
We proposed an adaptive post-debanding filter to remove banding artifacts resulting from coarse video compression. The efficacy of the algorithm can be attributed to accurate detection of banding regions, content-aware band reconstruction, and dithered re-quantization. Both visual and quantitative results demonstrate significant performance improvements over prior debanding methods. Indeed, AdaDeband can be regarded as a visual enhancement algorithm, whose effects could be accounted for when performing rate-distortion-optimized video encoding. Since it is a post-processing model, it may also be optimized for efficient, low-power implementation in real-time applications.