Project: Video Denoising and Enhancement via Dynamic Sparse + Low-Rank Matrix Decomposition

Main idea

Video denoising refers to the problem of removing “noise” from a video sequence. Here the term “noise” is used in a broad sense to refer to any corruption, outlier, or interference that is not the quantity of interest. In this work, we develop a novel approach to video denoising based on the idea that many noisy or corrupted videos can be split into three parts: the “low-rank layer”, the “sparse layer”, and everything else (which is small and bounded).

Our proposed algorithm consists of two parts. At each time instant, it first separates the video frame into noisy versions of the two layers, $\ell_t$ and $s_t$. It then applies an existing state-of-the-art denoising algorithm, VBM3D [], to each layer. In doing this, VBM3D exploits the specific characteristics of each layer and, hence, is able to find more matched blocks to filter over, resulting in better denoising performance.

Example applications

Why Low-Rank + Sparse? Here are some examples:

In very low-light videos of moving targets/objects (where the moving target is barely visible), the denoising goal is to “see” the barely visible moving targets (sparse). These are hard to see because they are corrupted by slowly changing background images (well modeled as forming the low-rank layer plus the residual). See the following dark video example:

In a traditional denoising scenario, consider slowly changing videos that are corrupted by salt-and-pepper noise (or other impulsive noise). For this type of video, the large-magnitude part of the noise forms the “sparse layer”, while the video of interest (slowly changing in many applications, e.g., a waterfall) forms the approximate “low-rank layer”. The approximation error in the low-rank approximation forms the “small bounded residual”. See the following waterfall-salt-pepper video for an example:

More generally, consider slowly changing videos corrupted by white Gaussian noise of very large variance. As we explain in the paper, large Gaussian noise can, with high probability, be split into a very sparse noise component plus bounded noise. See the following waterfall-large-Gaussian video for an example:
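The split of large-variance Gaussian noise into a sparse part and a bounded part can be illustrated numerically. The sketch below is only an illustration and not necessarily the construction used in the paper: it assumes the split is done by simple magnitude thresholding, and the function name split_gaussian_noise and the threshold b = 2.5 sigma are hypothetical choices made for this example.

```python
import numpy as np

def split_gaussian_noise(noise, b):
    """Split noise into a sparse part (entries with |value| > b) and a bounded part."""
    large = np.abs(noise) > b
    sparse_part = np.where(large, noise, 0.0)
    bounded_part = noise - sparse_part  # every entry has magnitude <= b
    return sparse_part, bounded_part

rng = np.random.default_rng(0)
sigma = 70.0                       # noise standard deviation
noise = sigma * rng.standard_normal(100_000)
b = 2.5 * sigma                    # hypothetical threshold, for illustration only

sparse_part, bounded_part = split_gaussian_noise(noise, b)
fraction_nonzero = np.count_nonzero(sparse_part) / noise.size

print(f"fraction of nonzero entries in the sparse part: {fraction_nonzero:.4f}")
print(f"largest magnitude in the bounded part: {np.abs(bounded_part).max():.2f}")
```

Raising the threshold makes the sparse part sparser at the cost of a larger bound on the remaining noise, so the choice of b trades one against the other.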

Problem formulation

Let $m_t$ denote the image at time $t$, arranged as a 1D vector. We consider denoising for videos in which each image can be split as $m_t = \ell_t + s_t + w_t$, where $s_t$ is a sparse vector, the $\ell_t$'s lie in a fixed or slowly changing low-dimensional subspace so that the matrix $L := [\ell_1, \ell_2, \ldots, \ell_{t_{\max}}]$ is low-rank, and $w_t$ is the residual noise, which satisfies $\| w_t \|_{\infty} \leq b_w$.
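To make the model concrete, the following sketch generates synthetic data that satisfies it. All sizes and magnitudes (n, t_max, r, support_size, b_w, and the range of the sparse entries) are hypothetical values chosen for illustration, and the subspace is kept fixed for simplicity, whereas the formulation also allows it to change slowly.

```python
import numpy as np

rng = np.random.default_rng(0)
n, t_max, r = 400, 200, 5      # pixels per frame, number of frames, subspace dimension (hypothetical)
support_size, b_w = 20, 0.1    # nonzeros per sparse vector, residual bound (hypothetical)

# Low-rank layer: every ell_t lies in the span of the r orthonormal columns of P.
P, _ = np.linalg.qr(rng.standard_normal((n, r)))
L = P @ rng.standard_normal((r, t_max))

# Sparse layer: each s_t has support_size nonzero entries at random locations.
S = np.zeros((n, t_max))
for t in range(t_max):
    idx = rng.choice(n, size=support_size, replace=False)
    S[idx, t] = rng.uniform(5.0, 10.0, size=support_size)

# Residual: entries bounded in magnitude by b_w, so ||w_t||_inf <= b_w.
W = rng.uniform(-b_w, b_w, size=(n, t_max))

# Column t of M is m_t = ell_t + s_t + w_t.
M = L + S + W

print("rank of L:", np.linalg.matrix_rank(L))
print("nonzeros per column of S:", np.count_nonzero(S, axis=0).max())
print("largest |w_t| entry:", np.abs(W).max())
```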

Overall algorithm – ReProCS-based Layering Denoising (ReLD)

For the initial frames, initialize using PCP [].
For all subsequent frames, run an appropriately modified ReProCS algorithm []:
- Split the video frame into layers $\hat{\ell}_t$ and $\hat{s}_t$
- At regular intervals of frames, perform a subspace update, i.e., update $\hat{P}_t$
Denoise each layer using VBM3D
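The initialization step relies on PCP, which separates a data matrix into a low-rank and a sparse component. The sketch below shows one common way to solve PCP, the fixed-penalty augmented Lagrangian iteration described by Candès et al., with their suggested defaults for lam and mu. It is an assumption that a solver of this kind is used here; the function names shrink, svd_threshold, and pcp, and the small synthetic test matrix, are hypothetical. It does not implement the ReProCS or VBM3D steps.

```python
import numpy as np

def shrink(X, tau):
    """Entrywise soft-thresholding."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_threshold(X, tau):
    """Soft-threshold the singular values of X."""
    U, sig, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(sig, tau)) @ Vt

def pcp(M, lam=None, mu=None, tol=1e-7, max_iter=1000):
    """Minimize ||L||_* + lam * ||S||_1 subject to L + S = M."""
    n1, n2 = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(n1, n2))
    if mu is None:
        mu = n1 * n2 / (4.0 * np.abs(M).sum())
    S = np.zeros_like(M)
    Y = np.zeros_like(M)           # Lagrange multiplier
    norm_M = np.linalg.norm(M)
    for _ in range(max_iter):
        L = svd_threshold(M - S + Y / mu, 1.0 / mu)
        S = shrink(M - L + Y / mu, lam / mu)
        residual = M - L - S
        Y = Y + mu * residual
        if np.linalg.norm(residual) <= tol * norm_M:
            break
    return L, S

# Small synthetic check: rank-2 matrix plus 5% large sparse corruptions.
rng = np.random.default_rng(0)
n, r = 50, 2
L0 = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
S0 = np.zeros((n, n))
mask = rng.random((n, n)) < 0.05
S0[mask] = 10.0 * rng.choice([-1.0, 1.0], size=mask.sum())
M = L0 + S0

L_hat, S_hat = pcp(M)
rel_err = np.linalg.norm(L_hat - L0) / np.linalg.norm(L0)
print(f"relative error of the low-rank estimate: {rel_err:.2e}")
```

Each iteration costs one full SVD, which is acceptable for a short initial batch of frames but is one reason a recursive method is attractive for the rest of the sequence.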

Experiments

We compare the denoising performance of ReLD with that of VBM3D [], SLMA [], and MLP [] on several datasets. The following table summarizes the results for each dataset and noise level $\sigma$ (PSNR, with running time in seconds in parentheses):

Dataset	$\sigma$	ReLD	VBM3D	MLP	SLMA
Waterfall	25	35.00 (73.54)	32.02 (24.83)	28.26 (477.22)	*
Waterfall	30	34.51 (73.33)	30.96 (23.96)	26.96 (474.26)	*
Waterfall	50	33.08 (73.14)	27.99 (24.14)	18.87 (477.60)	*
Waterfall	70	29.25 (69.77)	24.42 (21.01)	15.03 (478.73)	*

Escalator	25	31.01 (16.64)	30.32 (5.34)	25.53 (107.51)	21.17 ()
Escalator	30	30.27 (16.45)	29.29 (5.38)	24.54 (108.65)	20.49 ()
Escalator	50	27.84 (16.03)	25.10 (5.27)	18.83 (109.40)	17.98 ()
Escalator	70	25.15 (15.28)	20.20 (4.72)	15.20 (108.78)	15.90 ()

Fountain	25	32.67 (16.70)	31.18 (5.44)	26.86 (105.64)	22.93 ()
Fountain	30	32.25 (15.84)	30.26 (5.17)	25.67 (107.41)	21.85 ()
Fountain	50	30.53 (15.82)	26.55 (5.24)	18.53 (109.79)	18.55 ()
Fountain	70	27.53 (15.03)	22.08 (4.69)	14.85 (107.52)	16.25 ()

Curtain	25	35.47 (16.78)	34.60 (4.15)	31.14 (189.14)	23.28 ()
Curtain	30	34.58 (17.35)	33.59 (4.37)	28.90 (191.14)	22.74 ()
Curtain	50	31.91 (17.17)	30.29 (4.42)	18.58 (188.30)	19.12 ()
Curtain	70	28.10 (16.50)	26.15 (3.85)	14.73 (192.00)	16.68 ()

Lobby	25	39.78 (57.96)	35.00 (19.57)	29.22 (384.11)	23.43 ()
Lobby	30	38.76 (57.99)	33.64 (19.09)	27.72 (395.67)	21.15 ()
Lobby	50	35.15 (58.41)	29.23 (19.35)	18.66 (403.59)	18.21 ()
Lobby	70	29.68 (56.51)	24.90 (17.00)	14.85 (401.29)	16.82 ()

*: The Waterfall dataset is a long sequence, and the code provided by the authors of SLMA ran too slowly on it for us to obtain any results.

As can be seen from the table, ReLD achieves the highest PSNR in every case, at roughly three to four times the running time of VBM3D.
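For readers who want to compute a comparable metric on their own sequences, here is a minimal PSNR sketch. It assumes 8-bit intensities (peak value 255) and a mean squared error taken over all pixels of all frames; the exact convention behind the table above is not stated here, so treat both choices, and the function name psnr, as assumptions.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two arrays of the same shape."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

# Synthetic stand-in for a video: 10 frames of 64 x 64 pixels (hypothetical data).
rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 255.0, size=(10, 64, 64))
sigma = 25.0
noisy = clean + sigma * rng.standard_normal(clean.shape)

print(f"PSNR of the noisy input: {psnr(clean, noisy):.2f} dB")
```

Without clipping, the PSNR of the noisy input is close to 10 log10(255^2 / sigma^2), which gives a baseline against which a denoiser's output can be judged.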

Next, we compare the denoising performance visually. Since ReLD and VBM3D are the best two of the above algorithms, for ease of display we show results only for these two. The added noise is i.i.d. Gaussian with $\sigma = 70$.

Comparison on Waterfall Dataset:

Comparison on Escalator Dataset:

Comparison on Fountain Dataset:

Comparison on Curtain Dataset:

Comparison on Lobby Dataset:

Note that denoising at noise level $\sigma = 70$ is a difficult task: both ReLD and VBM3D inevitably introduce some blurring. However, as can be seen from the videos above, ReLD preserves more detail than VBM3D, e.g., the whiteboard in the Curtain dataset and the bookshelf in the Lobby dataset.

Demo code

Click here.

References

[] Kostadin Dabov, Alessandro Foi, and Karen Egiazarian, “Video denoising by sparse 3D transform-domain collaborative filtering,” 2007.
[] Emmanuel J Candès, Xiaodong Li, Yi Ma, and John Wright, “Robust Principal Component Analysis?,” Journal of the ACM, 2011.
[] Han Guo, Chenlu Qiu, and Namrata Vaswani, “An Online Algorithm for Separating Sparse and Low-dimensional Signal Sequences from Their Sum,” IEEE Trans. on Sig. Proc., 2014.
[] Hui Ji, Sibin Huang, Zuowei Shen, and Yuhong Xu, “Robust Video Restoration by Joint Sparse and Low Rank Matrix Approximation,” SIAM Journal on Imaging Sciences, 2011.
[] Harold C Burger, Christian J Schuler, and Stefan Harmeling, “Image Denoising: Can Plain Neural Networks Compete with BM3D?,” CVPR 2012.