Project: Video Denoising and Enhancement via Dynamic Sparse + Low-Rank Matrix Decomposition
Main idea
Video denoising refers to the problem of removing “noise” from a video sequence. Here the term “noise” is used in a broad sense to refer to any corruption, outlier, or interference that is not the quantity of interest. In this work, we develop a novel approach to video denoising based on the idea that many noisy or corrupted videos can be split into three parts: the “low-rank layer”, the “sparse layer”, and everything else (which is small and bounded).
Our proposed algorithm consists of two parts. At each time instant, it first separates the video frame into “noisy” versions of the two layers: a low-rank layer estimate $\hat{\ell}_t$ and a sparse layer estimate $\hat{s}_t$. It then applies an existing state-of-the-art denoising algorithm, VBM3D [1], to each layer. Because VBM3D can exploit the specific characteristics of each layer, it is able to find more matched blocks to filter over, resulting in better denoising performance.
Example applications
Why Low-Rank + Sparse? Here are some examples:
In very low-light videos of moving targets/objects (where the moving target is barely visible), the denoising goal is to “see” the barely visible moving targets (the sparse layer). These are hard to see because they are corrupted by slowly changing background images (well modeled as forming the low-rank layer plus the residual). See the following dark video example:
In a traditional denoising scenario, consider slowly changing videos that are corrupted by salt-and-pepper noise (or other impulsive noise). For this type of video, the large-magnitude part of the noise forms the “sparse layer”, while the video of interest (slowly changing in many applications, e.g., a waterfall) forms the approximate “low-rank layer”. The error in the low-rank approximation forms the “small bounded residual”. See the following waterfall-salt-pepper video for an example of this:
More generally, consider slowly changing videos corrupted by white Gaussian noise of very large variance. As we explain in the paper, large Gaussian noise can, with high probability, be split into a very sparse noise component plus bounded noise. See the following waterfall-large-Gaussian video for an example of this:
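As a numerical aside to the Gaussian-noise case above, the split into a sparse part plus a bounded part can be illustrated with a simple magnitude cutoff. This is only a sketch: the function name `split_gaussian_noise`, the cutoff of three standard deviations, and the sample size are assumptions made for illustration, not values taken from the paper.

```python
import numpy as np

def split_gaussian_noise(noise, threshold):
    """Split noise into a sparse large-magnitude part and a bounded remainder."""
    large = np.abs(noise) > threshold
    sparse_part = np.where(large, noise, 0.0)   # nonzero only where |noise| is large
    bounded_part = noise - sparse_part          # every entry has magnitude <= threshold
    return sparse_part, bounded_part

rng = np.random.default_rng(0)
sigma = 70.0                    # illustrative noise standard deviation
noise = rng.normal(0.0, sigma, size=100_000)
threshold = 3.0 * sigma         # hypothetical cutoff, chosen for illustration only

sparse_part, bounded_part = split_gaussian_noise(noise, threshold)
fraction_large = np.count_nonzero(sparse_part) / noise.size
print(f"fraction of entries in the sparse part: {fraction_large:.4f}")
```

With a cutoff of this kind, only a small fraction of the entries ends up in the sparse part, while the remainder is bounded by the cutoff by construction.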
Problem formulation
Let $m_t$ denote the image at time $t$ arranged as a 1D vector. We consider denoising for videos in which each image can be split as $m_t = \ell_t + s_t + w_t$, where $s_t$ is a sparse vector, the $\ell_t$'s lie in a fixed or slowly changing low-dimensional subspace so that the matrix $L_t := [\ell_1, \ell_2, \dots, \ell_t]$ is low-rank, and $w_t$ is the residual noise, which is small and bounded.
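To make the three-part model concrete, the following sketch builds a small synthetic "video" whose frames are the sum of a low-rank layer, a sparse layer, and a small bounded residual. All sizes, magnitudes, and variable names here are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_frames, rank = 400, 60, 3   # hypothetical sizes

# Low-rank layer: every frame (column) lies in the same 3-dimensional subspace.
basis, _ = np.linalg.qr(rng.normal(size=(n_pixels, rank)))
low_rank = basis @ rng.normal(scale=50.0, size=(rank, n_frames))

# Sparse layer: a few large-magnitude entries per frame.
sparse = np.zeros((n_pixels, n_frames))
for t in range(n_frames):
    support = rng.choice(n_pixels, size=10, replace=False)
    sparse[support, t] = rng.choice([-1.0, 1.0], size=10) * 200.0

# Residual: small and bounded (here, uniform in [-1, 1]).
residual = rng.uniform(-1.0, 1.0, size=(n_pixels, n_frames))

# Observed frames, one vectorized image per column.
frames = low_rank + sparse + residual
print(np.linalg.matrix_rank(low_rank), np.count_nonzero(sparse))
```

Each column of `frames` plays the role of one vectorized image; the denoising task is to recover the layers of interest from `frames` alone.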
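For each time $t$, run an appropriately modified ReProCS algorithm [3]:
Split the video frame into a noisy low-rank layer estimate $\hat{\ell}_t$ and a noisy sparse layer estimate $\hat{s}_t$.
Once every fixed number of frames, perform a subspace update, i.e., update the estimate of the low-dimensional subspace.
Denoise each layer using VBM3D [1].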
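The overall flow of the steps above (split each frame into two layers, update the subspace periodically, then denoise each layer) can be sketched as follows. This is a heavily simplified illustration and not the ReProCS algorithm itself: the sparse layer is estimated by hard-thresholding the frame after projecting out the current subspace, the subspace update is a plain SVD of recent low-rank estimates, and `temporal_smooth` is a placeholder standing in for VBM3D. All function names and parameters are hypothetical.

```python
import numpy as np

def top_subspace(matrix, rank):
    """Orthonormal basis for the top-`rank` left singular vectors."""
    u, _, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :rank]

def temporal_smooth(layer):
    """Placeholder denoiser (NOT VBM3D): 3-frame moving average over time."""
    padded = np.pad(layer, ((0, 0), (1, 1)), mode="edge")
    return (padded[:, :-2] + padded[:, 1:-1] + padded[:, 2:]) / 3.0

def layered_denoise(frames, train_frames, rank, threshold, update_every):
    """Split each frame into two layers, then denoise each layer separately."""
    basis = top_subspace(train_frames, rank)      # initial subspace estimate
    low_rank_est = np.zeros_like(frames)
    sparse_est = np.zeros_like(frames)
    for t in range(frames.shape[1]):
        frame = frames[:, t]
        # Remove the part of the frame explained by the current subspace.
        projected = frame - basis @ (basis.T @ frame)
        # Crude sparse-layer estimate: keep only the large projected entries.
        sparse_est[:, t] = np.where(np.abs(projected) > threshold, projected, 0.0)
        low_rank_est[:, t] = frame - sparse_est[:, t]
        # Periodic subspace update from the most recent low-rank estimates.
        if (t + 1) % update_every == 0:
            recent = low_rank_est[:, t + 1 - update_every : t + 1]
            basis = top_subspace(recent, rank)
    return (temporal_smooth(low_rank_est), temporal_smooth(sparse_est),
            low_rank_est, sparse_est)

# Small synthetic usage example (all values are illustrative).
rng = np.random.default_rng(0)
true_basis, _ = np.linalg.qr(rng.normal(size=(200, 2)))
video = true_basis @ rng.normal(scale=50.0, size=(2, 40))
video += rng.normal(scale=1.0, size=video.shape)
video[5, 20] += 300.0                             # one large outlier pixel

low_den, sparse_den, low_raw, sparse_raw = layered_denoise(
    video[:, 10:], video[:, :10], rank=2, threshold=100.0, update_every=10)
print(sparse_raw[5, 10])   # the outlier lands in the sparse layer estimate
```

The first ten frames serve as training data for the initial subspace; in this toy setup the single outlier pixel is captured by the sparse layer estimate while the slowly varying background stays in the low-rank layer estimate.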
Experiments
We compare our algorithm, ReLD, with VBM3D [1], SLMA [4], and MLP [5] on several datasets. The following table summarizes the experimental results; each entry is the PSNR, with the running time in seconds in parentheses:
Dataset     Noise level σ   ReLD            VBM3D           MLP              SLMA
Waterfall   25              35.00 (73.54)   32.02 (24.83)   28.26 (477.22)   *
Waterfall   30              34.51 (73.33)   30.96 (23.96)   26.96 (474.26)   *
Waterfall   50              33.08 (73.14)   27.99 (24.14)   18.87 (477.60)   *
Waterfall   70              29.25 (69.77)   24.42 (21.01)   15.03 (478.73)   *
Escalator   25              31.01 (16.64)   30.32 (5.34)    25.53 (107.51)   21.17 ()
Escalator   30              30.27 (16.45)   29.29 (5.38)    24.54 (108.65)   20.49 ()
Escalator   50              27.84 (16.03)   25.10 (5.27)    18.83 (109.40)   17.98 ()
Escalator   70              25.15 (15.28)   20.20 (4.72)    15.20 (108.78)   15.90 ()
Fountain    25              32.67 (16.70)   31.18 (5.44)    26.86 (105.64)   22.93 ()
Fountain    30              32.25 (15.84)   30.26 (5.17)    25.67 (107.41)   21.85 ()
Fountain    50              30.53 (15.82)   26.55 (5.24)    18.53 (109.79)   18.55 ()
Fountain    70              27.53 (15.03)   22.08 (4.69)    14.85 (107.52)   16.25 ()
Curtain     25              35.47 (16.78)   34.60 (4.15)    31.14 (189.14)   23.28 ()
Curtain     30              34.58 (17.35)   33.59 (4.37)    28.90 (191.14)   22.74 ()
Curtain     50              31.91 (17.17)   30.29 (4.42)    18.58 (188.30)   19.12 ()
Curtain     70              28.10 (16.50)   26.15 (3.85)    14.73 (192.00)   16.68 ()
Lobby       25              39.78 (57.96)   35.00 (19.57)   29.22 (384.11)   23.43 ()
Lobby       30              38.76 (57.99)   33.64 (19.09)   27.72 (395.67)   21.15 ()
Lobby       50              35.15 (58.41)   29.23 (19.35)   18.66 (403.59)   18.21 ()
Lobby       70              29.68 (56.51)   24.90 (17.00)   14.85 (401.29)   16.82 ()
*: The Waterfall dataset is a long sequence; using the code provided by the authors of SLMA, we were unable to obtain any results because it ran extremely slowly.
As can be seen from the table, our algorithm ReLD achieves the highest PSNR in every case, at roughly three to four times the running time of VBM3D.
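For readers who want to compute a comparable quality score on their own data, a minimal PSNR sketch is given below. It uses the standard definition of PSNR; the peak value of 255 (8-bit images) is an assumption, since the exact settings used to produce the table are not stated here.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between a clean and a denoised video."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)   # mean squared error over all pixels
    if mse == 0:
        return float("inf")                      # identical inputs
    return 10.0 * np.log10(peak ** 2 / mse)

# Usage: a clean video and a copy that is off by one gray level everywhere.
clean = np.zeros((4, 4, 3))
off_by_one = clean + 1.0
print(psnr(clean, off_by_one))
```

A higher PSNR means the denoised video is closer to the clean reference in the mean-squared-error sense.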
Next, we visually compare the denoising performance. Since ReLD and VBM3D are the two best of the above algorithms, we show results only for these two for ease of display. The added noise is i.i.d. Gaussian.
Comparison on Waterfall Dataset:
Comparison on Escalator Dataset:
Comparison on Fountain Dataset:
Comparison on Curtain Dataset:
Comparison on Lobby Dataset:
Note that denoising at this noise level is a difficult task: both ReLD and VBM3D inevitably introduce some blurring. However, as can be seen from the above videos, ReLD preserves more details than VBM3D, e.g., the white board in the Curtain dataset and the book shelf in the Lobby dataset.
[1] Kostadin Dabov, Alessandro Foi, and Karen Egiazarian, “Video denoising by sparse 3D transform-domain collaborative filtering,” 2007.
[2] Emmanuel J. Candès, Xiaodong Li, Yi Ma, and John Wright, “Robust Principal Component Analysis?,” Journal of the ACM, 2011.
[3] Han Guo, Chenlu Qiu, and Namrata Vaswani, “An Online Algorithm for Separating Sparse and Low-dimensional Signal Sequences from Their Sum,” IEEE Trans. on Sig. Proc., 2014.
[4] Hui Ji, Sibin Huang, Zuowei Shen, and Yuhong Xu, “Robust Video Restoration by Joint Sparse and Low Rank Matrix Approximation,” SIAM Journal on Imaging Sciences, 2011.
[5] Harold C. Burger, Christian J. Schuler, and Stefan Harmeling, “Image Denoising: Can Plain Neural Networks Compete with BM3D?,” CVPR 2012.