1 Introduction
Inverse problems arising in image restoration require prior knowledge on images in order to determine the most likely solutions among an infinity of possibilities. Traditional optimization methods rely on priors modeled as convex regularization functions such as the total variation, encouraging smoothness, or the ℓ1 norm, encouraging sparsity. In this context, convex optimization algorithms have played an important role and their convergence properties have been well established. However, methods based on “handcrafted” convex priors are now significantly outperformed by deep neural networks that directly learn an inverse mapping from the degraded measurements to the solution space. Here, the prior knowledge on images and the degradation to recover from do not need a formal mathematical definition. Instead, they are implicitly taken into account when training the network on a large dataset of degraded images (i.e. the network's input) along with their original versions (i.e. the network's output). This approach has enabled a significant leap in performance for common image restoration problems including denoising, demosaicing, super-resolution, etc. However, a different neural network must be carefully designed and trained for each specific problem. Furthermore, it is usually impossible to interpret what task is performed by the different layers or sub-modules of the trained network, which acts as a black box.
In order to increase the interpretability and genericity of deep neural networks, the field is evolving towards methods that combine them with traditional optimization algorithms. A popular approach, seen for example in [1, 2, 3, 4, 5, 6, 7], consists in defining so-called “unrolled” neural networks that simulate several iterations of an optimization algorithm. For instance, in unrolled gradient descent, the gradient of the data-fidelity term can be computed knowing its closed form expression, while a convolutional neural network (CNN) is used to simulate the gradient computation of a regularization term. End-to-end training of this regularizing network is then performed by “unrolling” a given number of iterations of the optimization algorithm. Here, the trained CNN is meant to represent only prior knowledge on images, which does not depend on a specific image restoration task. However, in practice these networks require retraining for each task, meaning that task-specific features are also learnt in the end-to-end unrolled training.
A more universal approach, referred to as “Plug-and-Play” (PnP), relies on proximal algorithms such as the Alternating Direction Method of Multipliers (ADMM) [8] or Half Quadratic Splitting (HQS) [9], where the proximal operator of the regularization term is usually replaced by a denoiser as in [10, 11, 12, 13, 14]. While Venkatakrishnan et al. [10] introduced this approach using traditional denoising techniques including BM3D [15] or K-SVD [16], more recent methods attain higher performance thanks to neural networks trained for denoising. The main limitation, however, is that the objective function to minimize is typically non-convex when the proximal operator of the regularization term is defined as a high-performance denoiser. Hence the proximal algorithms used may not provide the optimal solution.
In this paper, we propose a preconditioned ADMM in order to improve the algorithm’s performance in this challenging “PlugandPlay” scenario. While the proposed formulation remains mathematically equivalent to the original problem, the preconditioning makes it possible to use a denoiser that takes into account spatially varying noise levels. This is particularly advantageous for applications where the variance of the approximation error made at each iteration varies spatially in a predictable way. For instance, in image demosaicing, completion, or interpolation, only the unknown pixels that require interpolation are expected to have an approximation error. Hence, the pattern of known and unknown pixels can be used in our scheme to define a sensible preconditioning matrix. Likewise, when an image is corrupted by Poisson noise, the noise variance at each pixel is known to be proportional to the noiseless pixel intensity. In summary, the main contributions of our paper are as follows:

We formulate a preconditioned ADMM, which enables the use of a denoiser that takes into account a spatially varying noise standard deviation.

We propose a procedure for training a denoising neural network that takes as input a map of pixel-wise noise standard deviations, along with the image to be denoised.

We demonstrate the effectiveness of the approach and define suitable preconditioning schemes for several applications: image completion, interpolation, demosaicing, and Poisson denoising. Note that for Poisson denoising, we present further derivations accounting for the negative log-likelihood of the Poisson distribution, which replaces the conventional least-squares data fidelity term (only suitable for Gaussian noise).
2 Background: Plug-and-Play ADMM
2.1 Plug-and-Play Prior
Let us consider the linear inverse problem of recovering a clean image from its degraded measurements obtained with the degradation model b = Ax + n, where A is the degradation matrix, n is additive white Gaussian noise with standard deviation σ, and x is the unknown ground truth image arranged as a vector. The problem is generally formulated as the Maximum a Posteriori (MAP) estimation of x, given its prior probability density function p(x). The MAP estimate is given by:

(1)  x̂ = argmax_x p(x | b)
(2)     = argmax_x p(b | x) p(x)
(3)     = argmin_x (1/(2σ²)) ‖Ax − b‖² + λ s(x)
In practice, the prior distribution p(x) is not used directly, but is represented by the regularizer s(x), proportional to −log p(x), which penalizes unlikely solutions.
The PnP approach introduced in [10] goes one step further by removing the need for an explicit definition of the regularizer. This is made possible by the use of proximal algorithms such as ADMM, in which the regularizer only appears in the evaluation of the proximal operator of αs for some scalar factor α. This operator is defined as:

(4)  prox_{αs}(z) = argmin_x (1/2) ‖x − z‖² + α s(x)

It can be noted that the expression of prox_{αs} is a particular case of Eq. (3), where the degradation matrix A is the identity matrix, i.e. the degradation model only consists in the addition of white Gaussian noise. In other words, prox_{αs} is the MAP denoiser assuming additive white Gaussian noise of variance α (taking λ = 1) and given the underlying prior p(x). Based on this observation, the authors of [10] replaced prox_{αs} by a standard image denoiser which implicitly determines the prior distribution. In the following, we will note the denoiser D as a function of the image and the noise standard deviation:

(5)  D(z, σ) = prox_{σ²s}(z)
More recent works use a trained CNN to represent the proximal operator. For instance, in [13], adversarial training is used to learn a projection operator. Here, the assumption is that the regularizer is the indicator function of the set of “natural images”. Hence, the corresponding proximal operator projects the input image to the closest “natural image”, which can be seen as a form of blind denoising. While this avoids the need for a parameter σ, it does not allow for controlling the denoising “strength” within the algorithm. In [11], a neural network called DRUNet, which combines ResNet [17] and U-Net [18] architectures, is trained for the task of Gaussian denoising. The noise standard deviation is given as a constant input map in addition to the noisy image. Hence, compared to approaches based on denoising CNNs trained for a single noise level [19], or for blind denoising [13, 20, 21], the method in [11] can control the algorithm's parameters more precisely at each iteration, while keeping the highest denoising performance. Since a single network is used, it also simplifies the method in [12], where 25 denoisers are trained for different standard deviations in order to cover a larger range of noise levels.
In this paper, we argue that a denoiser parameterized with an input map for standard deviation can even be trained to control the noise level at each pixel independently. Such a denoiser further improves the PnP scheme for several applications thanks to the proposed preconditioning.
2.2 ADMM Formulation
In this section, we present the classical ADMM formulation in the case of the least-squares inverse problem in Eq. (3). The same notations will be used throughout the paper. In order to use the ADMM, the problem is first cast into a constrained optimization problem by splitting the two terms:
(6)  min_{x,y} (1/(2σ²)) ‖Ax − b‖² + λ s(y)
subject to x = y
The constraint is included in the optimization by defining the augmented Lagrangian function L_ρ, which introduces a dual variable μ and a penalty parameter ρ:

(7)  L_ρ(x, y, μ) = (1/(2σ²)) ‖Ax − b‖² + λ s(y) + μᵀ(x − y) + (ρ/2) ‖x − y‖²
(8)              = (1/(2σ²)) ‖Ax − b‖² + λ s(y) + (ρ/2) ‖x − y + u‖² − (ρ/2) ‖u‖²

where u = μ/ρ is the scaled dual variable.
The ADMM method consists in alternately minimizing the augmented Lagrangian with respect to the variables x and y separately, and in updating the dual variable u. In practice, the penalty parameter may also be increased at each iteration to accelerate the convergence. The ADMM iteration thus reads:

(9)   x^(k+1) = argmin_x (1/(2σ²)) ‖Ax − b‖² + (ρ_k/2) ‖x − y^k + u^k‖²
(10)  y^(k+1) = argmin_y λ s(y) + (ρ_k/2) ‖x^(k+1) − y + u^k‖²
(11)  u^(k+1) = u^k + x^(k+1) − y^(k+1)
(12)  ρ_(k+1) = η ρ_k, with η ≥ 1

where the dual variable u is typically zero-initialized. An initialization of y⁰ is also required and is often performed with a crude inverse of the degradation model (e.g. y⁰ = Aᵀb), but a finer initialization strategy may be used depending on the problem. Note that the y-update in Eq. (10) amounts to evaluating prox_{(λ/ρ_k)s}(x^(k+1) + u^k), i.e. a denoising step D(x^(k+1) + u^k, √(λ/ρ_k)) in the PnP framework.
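To make the alternation above concrete, the following NumPy sketch runs PnP-ADMM for the least-squares data term. It is a minimal illustration, not the authors' implementation: the `denoise` callback stands in for the learned proximal operator and, assuming λ = 1, receives the noise level √(1/ρ_k).

```python
import numpy as np

def pnp_admm(A, b, denoise, sigma, rho, eta, n_iter):
    """Sketch of PnP-ADMM for min_x 1/(2 sigma^2)||Ax - b||^2 + s(x).

    denoise(z, noise_std) plays the role of the proximal operator of the
    regularizer in the y-update; rho is increased by the factor eta at
    each iteration as in Eq. (12).
    """
    n = A.shape[1]
    y = A.T @ b                 # crude initialization y0 = A^T b
    u = np.zeros(n)             # zero-initialized (scaled) dual variable
    for _ in range(n_iter):
        # x-update (Eq. 9): closed-form least-squares solve
        lhs = A.T @ A / sigma**2 + rho * np.eye(n)
        rhs = A.T @ b / sigma**2 + rho * (y - u)
        x = np.linalg.solve(lhs, rhs)
        # y-update (Eq. 10): denoising at level sqrt(lambda/rho), lambda = 1
        y = denoise(x + u, np.sqrt(1.0 / rho))
        # dual update (Eq. 11) and penalty increase (Eq. 12)
        u = u + x - y
        rho = eta * rho
    return y
```

With `denoise` set to an identity mapping, the iteration reduces to a plain least-squares solver, which is a convenient sanity check.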
3 Preconditioned Plug-and-Play ADMM
In several problems, we have additional knowledge about the locally varying noise level. For instance, in image completion or interpolation problems, the pixels already sampled in b are known without error. Thus, at a given iteration, only the pixels at unknown positions need to be denoised. However, in the original ADMM formulation, the denoising step assumes the same noise level at each pixel. Hence, in order to finely tune the denoising effect and improve the performance of the PnP-ADMM, we reformulate the problem with a preconditioner that takes this knowledge of a variable noise level into account.
3.1 ADMM Reformulation
Let us consider the following problem, mathematically equivalent to Eq. (3):

(14)  x̃ = argmin_z (1/(2σ²)) ‖APz − b‖² + λ s(Pz)
(15)  x̂ = P x̃

where P is a diagonal preconditioning matrix.
Similarly to Section 2.2, we can express the ADMM algorithm for solving Eq. (14) by replacing the updates of x and y in Eqs. (9) and (10) with, respectively:

(16)  x^(k+1) = argmin_x (1/(2σ²)) ‖APx − b‖² + (ρ_k/2) ‖x − y^k + u^k‖²
(17)  y^(k+1) = argmin_y λ s(Py) + (ρ_k/2) ‖x^(k+1) − y + u^k‖²
The variable x can be updated directly using the well-known closed form solution of Eq. (16):

(18)  x^(k+1) = (PᵀAᵀAP/σ² + ρ_k I)⁻¹ (PᵀAᵀb/σ² + ρ_k (y^k − u^k))
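In code, the closed form of Eq. (18) is a single linear solve per iteration; a minimal NumPy sketch (a hypothetical helper, with the diagonal P passed as a vector):

```python
import numpy as np

def x_update(A, b, P_diag, y, u, sigma, rho):
    """Closed-form x-update of Eq. (18) for the preconditioned problem.

    Solves (P A^T A P / sigma^2 + rho I) x = P A^T b / sigma^2 + rho (y - u),
    where P is diagonal (passed as the vector P_diag).
    """
    n = P_diag.size
    PAt = P_diag[:, None] * A.T                        # P A^T
    lhs = PAt @ A * P_diag[None, :] / sigma**2 + rho * np.eye(n)
    rhs = PAt @ b / sigma**2 + rho * (y - u)
    return np.linalg.solve(lhs, rhs)
```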
However, for updating y, the denoiser D from Eq. (5) is no longer suitable because of the matrix P within the regularization term. The next section presents how the y-update can still be performed using a more general denoiser that considers Gaussian noise with spatially variable variance.
3.2 New Prior Term Sub-Problem
3.2.1 General Gaussian Denoiser
Let us first define the general expression of a denoiser for Gaussian noise with covariance matrix Σ. The MAP estimate from a noisy image z can be derived similarly to Eqs. (1)-(3), where the likelihood p(z | x) is a multivariate Gaussian distribution, i.e. p(z | x) ∝ exp(−(1/2)(z − x)ᵀΣ⁻¹(z − x)). Including the prior and taking the negative logarithm gives the expression of the denoiser:

(19)  D_Σ(z) = argmin_x (1/2) (x − z)ᵀΣ⁻¹(x − z) + s(x)

Here, we will only consider the case of a diagonal matrix Σ. Hence, we still assume independent noise at each pixel (but not identically distributed), and the diagonal of Σ corresponds to the pixel-wise variance. Note that the denoiser D(·, σ) of Eq. (5) is equivalently defined as D_{σ²I} for a constant standard deviation σ. A practical image denoiser can be obtained by training a CNN that takes as input a map of the pixel-wise noise levels in addition to the noisy image. A training procedure for such a denoiser is presented in Section 4.
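To make Eq. (19) concrete, consider the toy prior s(x) = ‖x‖²/2 instead of a learned one: with a diagonal Σ the MAP estimate decouples per pixel and has a simple closed form. This is a hypothetical illustration, not the trained denoiser used in the paper.

```python
import numpy as np

def map_denoiser_quadratic(z, var_map):
    """Instance of the denoiser of Eq. (19) for the toy prior s(x) = ||x||^2 / 2.

    Setting the per-pixel gradient (x_i - z_i)/var_i + x_i to zero gives
    x_i = z_i / (1 + var_i): pixels with a high noise variance are shrunk
    more strongly toward the prior mean (zero).
    """
    return z / (1.0 + var_map)
```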
3.2.2 Sub-Problem Solution
Now, let us rewrite Eq. (17) using the change of variable ỹ = Py, which gives y^(k+1) = P⁻¹ ỹ^(k+1). Thus, we have:

(20)  ỹ^(k+1) = argmin_{ỹ} λ s(ỹ) + (ρ_k/2) (ỹ − z)ᵀ P⁻² (ỹ − z)
(21)  with z = P(x^(k+1) + u^k)

Given the denoiser in Eq. (19), we can then rewrite the y-update step as:

(22)  y^(k+1) = P⁻¹ D_Σ(z), with Σ = (λ/ρ_k) P²
Therefore, the preconditioned ADMM still solves the original problem but introduces a denoising step that assumes noise with spatially varying standard deviation. The standard deviation can be adjusted per pixel via the diagonal matrix P and the global parameters λ and ρ_k: the pixel-wise standard deviation map is √(λ/ρ_k) · diag(P). A suitable choice of the preconditioning matrix can then be made depending on the problem in order to improve the performance of the PnP-ADMM.
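In code, the y-update of Eq. (22) is a rescaling, a call to the variable-noise denoiser, and an inverse rescaling (a sketch; `denoiser(z, std_map)` is assumed to follow the interface of the network described in Section 4):

```python
import numpy as np

def y_update(x, u, P_diag, rho, lam, denoiser):
    """Preconditioned y-update of Eq. (22), with diagonal P given as a vector.

    The denoiser input P(x + u) lives in the natural-image domain; the
    per-pixel noise standard deviation map is sqrt(lam / rho) * P_diag.
    """
    z = P_diag * (x + u)                   # rescale to the image domain
    std_map = np.sqrt(lam / rho) * P_diag  # diagonal of Sigma^(1/2)
    return denoiser(z, std_map) / P_diag   # back to the preconditioned domain
```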
3.3 Variables Interpretation and Initialization
It should be noted that in the preconditioned problem, an image y^k only estimates the intermediate variable x̃, but the final result is x̂ = P x̃ (see Eq. (15)). Hence, y^k (and similarly x^k) may not be a plausible image. However, the input of the denoiser is z = P(x^(k+1) + u^k), where the multiplication by P rescales the image pixels consistently with the “natural image” x̂, as expected by the denoiser.
Inversely, for the initialization, the inverse scaling must be performed. Given an initial estimate x⁰ of the true image, a good initialization for our problem is thus y⁰ = P⁻¹ x⁰.
4 Deep Locally Adjustable Denoiser
4.1 Denoising Network for Known Noise Level
Our algorithm relies on a deep neural network for solving the prior term sub-problem, which corresponds to a Gaussian denoising problem where the noise level (i.e. standard deviation) is known and can be adjusted for each pixel. Here, we use the DRUNet architecture proposed in [11]. The overall structure consists of a U-Net in which each level contains sequences of four residual blocks and either a downsampling layer (first branch of the U) or an upsampling layer (second branch).
The DRUNet network structure in [11] conforms to the definition of the denoisers D (Eq. (5)) or D_Σ (Eq. (19)) by taking as input the concatenation (in the channel dimension) of the noisy image and a noise level map whose pixel values are equal to the noise standard deviation. The advantage of using an input noise level map is that a single generic model can be trained to perform optimally over a large range of noise levels. However, in [11] the model is only trained with constant noise level maps since their algorithm only uses the constant denoiser D. In the rest of the paper, we refer to this network as DRUNet-cst.
In order to use our preconditioning approach, a locally adjustable denoiser is required. Therefore, the training process must be adapted so that each input sample consists of an arbitrary noise level map along with an image corrupted with the corresponding Gaussian noise level for each pixel. We describe in the next section how to randomly generate suitable patterns so that the trained model generalizes well for any arbitrary noise level map.
4.2 Noise Level Map Generation
Let us first consider the case of a constant noise level map as in [11]. Here, all the pixels are equal to the same random variable σ_c that can be simply defined as:

(23)  σ_c = σ̄ ν

where ν is a uniformly distributed random variable in the range [0, 2]. The parameter σ̄ is equal to the expectation of the random variable σ_c. It can be selected to train a denoiser that is sufficiently generic for all the noise levels in the range [0, 2σ̄].

Now, in order to train a locally adjustable denoiser, we generate a random map for each training image using the following random process:

(24)  σ_i = 2σ̄ ((1 − w)·u_i + w·v)

where i is the pixel index, and w, v and u_i (for each pixel i) are independent random variables with uniform distributions in the range [0, 1]. One can verify that σ̄ is the expectation of σ_i, and the range of possible values of σ_i is [0, 2σ̄], similarly to the previous case with constant noise level.

The random weight w allows us to adjust the variance of the noise level map (i.e. a lower weight corresponding to a higher variance). The random offset v is also necessary to reduce the correlation between the variance and the mean of the generated maps. In particular, the offset enables the generation of maps with a low variance but a high average level, so that the trained denoiser generalizes well even in the constant noise level case. Hence, the method makes it possible to train the network with noise level maps covering a wide range of first and second order statistics, which prevents overfitting for a specific type of pattern. In the paper, we refer to the network trained with this method as DRUNet-var.
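The random process of Eq. (24) can be sketched as follows, assuming (as described above) that the weight and the offset are drawn once per map while the per-pixel variables are drawn independently:

```python
import numpy as np

def random_noise_level_map(shape, sigma_mean, seed=None):
    """Generate a per-pixel noise-level map following the process of Eq. (24).

    w (weight) and v (offset) are drawn once per map, u_i once per pixel,
    all uniform on [0, 1]. A lower w yields a more spatially varying map.
    The result has expectation sigma_mean and values in [0, 2*sigma_mean].
    """
    rng = np.random.default_rng(seed)
    w = rng.uniform()
    v = rng.uniform()
    u = rng.uniform(size=shape)
    return 2.0 * sigma_mean * ((1.0 - w) * u + w * v)
```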
For some applications, the noise standard deviation may also need to be adjusted depending on the color component. For these applications, we train another network, called DRUNet-var-RGB, that takes an input noise level map for each color component. For the training, the noise level maps are generated for each component separately using the random process in Eq. (24).
4.3 Training Details
The training is performed by generating a noise level map randomly for each input image, as described in Section 4.2. Considering pixel data in the range [0, 1], we use the parameter σ̄ in Eq. (23) (for the DRUNet-cst denoiser) or Eq. (24) (for the DRUNet-var and DRUNet-var-RGB denoisers). The maximum noise level is thus 2σ̄. Gaussian noise with standard deviation defined by the noise level map is then added to the input image.
The remaining details of the training procedure are the same as described in [11]. We use the same large dataset of 8694 images, including the Berkeley Segmentation Dataset BSDS500 (without the validation images) [22], the Waterloo Exploration Database [23], the DIV2K dataset [24], and the Flickr2K dataset [25]. We minimize the loss between the denoiser's output and the ground truth image using the ADAM optimizer [26], with a learning rate decreased by half every 100,000 iterations; the training is stopped when the learning rate falls below a fixed threshold. Finally, each iteration of the training uses a batch of 16 patches of size 128×128 randomly selected from the images of the training dataset.
4.4 Denoising Performance
Here, we evaluate the performance of our denoisers DRUNet-var and DRUNet-var-RGB, generalized for variable standard deviation, in comparison with DRUNet-cst [11] as well as other state-of-the-art denoisers that assume a constant noise level. These include the BM3D denoiser (noted CBM3D for color images) [15] and the two recent CNN-based methods FFDNet [27] and RNAN [28]. Similarly to DRUNet-cst, the FFDNet denoiser can be parameterized at inference time knowing the noise standard deviation. On the other hand, RNAN requires a separate training for each noise level, but has shown higher denoising performance thanks to a non-local attention module.
4.4.1 Constant Noise Level
First, let us consider input images corrupted with a constant noise level. The results in Table I show that our locally adjustable denoisers DRUNet-var and DRUNet-var-RGB do not have significantly degraded performance compared to the reference network DRUNet-cst, despite being trained for more generic noise level maps. Only a very moderate loss is observed for the DRUNet-var-RGB denoiser compared to DRUNet-cst, mostly for the lowest and highest noise levels (10 or 50). Nevertheless, DRUNet-var-RGB still outperforms the other state-of-the-art methods, even the RNAN denoiser trained specifically for each noise level.
Dataset                  CBSD68                  Set5
noise level           10     30     50       10     30     50
CBM3D [15]          35.90  29.91  27.51    36.02  30.97  28.75
FFDNet [27]         36.14  30.32  27.97    36.16  31.35  29.24
RNAN [28]           36.43  30.65  28.30    36.62  31.77  29.61
DRUNet-cst [11]     36.51  30.79  28.48    36.67  31.90  29.80
DRUNet-var          36.51  30.80  28.48    36.67  31.91  29.79
DRUNet-var-RGB      36.47  30.78  28.46    36.61  31.90  29.77
4.4.2 Spatially Varying Noise Level
                DRUNet-cst [11]                    DRUNet-var
Dataset     Avg-std   True-std   PnP-ADMM          True-std
Set5         29.01     29.03      34.35             34.48
BSD68        28.14     28.22      32.75             32.90

Set5         24.45     24.37      32.03             32.24
BSD68        23.59     23.60      30.09             30.27

Set5         19.82     19.58      29.69             30.06
BSD68        19.03     18.88      27.73             28.01
Now, let us show the advantage of the proposed training procedure by comparing the performance of our DRUNet-var denoiser with the DRUNet-cst version when the input noise standard deviation varies spatially. For these tests, we randomly select a standard deviation within a given range for each pixel. For a complete comparison with DRUNet-cst, we have tested this network in three different ways:

Avg-std: using the average noise level as input.

True-std: using the true noise level map as input.

PnP-ADMM: solving the problem of denoising with variable noise level (Eq. (19)) using the PnP-ADMM without preconditioning, based on the DRUNet-cst network.
For the PnP-ADMM, we set the parameters using Eq. (33) so that the noise of highest standard deviation is removed at the first iteration, while successive iterations refine the result by decreasing the noise level of the denoiser down to the lowest standard deviation. The number of iterations and the initial noise level were tuned to give the best results for this application.
Table II and Fig. 1 present the results of DRUNet-cst in each of the three schemes, and of DRUNet-var in the True-std scheme. As expected, the results using only the average standard deviation are unsatisfactory, since the pixels with an above-average noise level are not sufficiently denoised, while details are not optimally preserved in the less noisy regions. The same behavior is observed when using the true noise level map as input to DRUNet-cst. This confirms that a denoiser trained for a constant noise level does not generalize when the noise standard deviation strongly varies between pixels. More satisfying results are obtained when using DRUNet-cst in the PnP-ADMM scheme. However, our DRUNet-var denoiser trained directly for this task can still retrieve more details, which indicates that the PnP-ADMM without preconditioning is suboptimal. More realistic applications are presented in the rest of the paper to demonstrate that our locally adjustable denoisers also allow for improved performance in practical scenarios thanks to the preconditioned PnP-ADMM.
5 Applications
In this section, we present several examples of applications in image restoration. Although the proposed denoiser allows us to adjust the variance locally, it still assumes an independent noise distribution for each pixel. Hence, in order to demonstrate the advantage of our preconditioning for the PnP-ADMM scheme, we restrict our study to applications where the error variance is expected to vary only in the pixel domain. While this excludes deblurring, or super-resolution from an anti-aliased downsampled image, many applications of high practical interest are still concerned. These include image completion, interpolation, demosaicing and Poisson denoising.
5.1 Image Completion and Interpolation
In the problems of image completion and interpolation, a subset of the original image pixels are sampled in the input vector b, while the remaining ones are unknown and must be determined. In both problems, the sampling can be expressed in matrix form with a sampling matrix A such that each row contains only zeros, except one element equal to 1 at the index of a sampled pixel.
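For illustration, such a sampling matrix can be built directly from the mask of known pixels (a small helper, dense for clarity; a sparse matrix would be preferable in practice):

```python
import numpy as np

def sampling_matrix(mask_flat):
    """Build the sampling matrix A: one row per known pixel, containing a
    single 1 at that pixel's index in the vectorized image."""
    idx = np.flatnonzero(mask_flat)
    A = np.zeros((idx.size, mask_flat.size))
    A[np.arange(idx.size), idx] = 1.0
    return A
```

Applying A to a vectorized image then extracts exactly the sampled pixels.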
When solving the problem with the ADMM, all the unknown pixels are expected to have the highest estimation error at the initialization stage. However, over the iterations, more reliable estimates should be obtained, especially for pixels close to a sampled pixel. Hence, the denoiser within our preconditioned PnP-ADMM should be adjusted locally depending on the proximity to a sampled pixel. Following this intuition, we define a preconditioning matrix that varies over the iterations according to the following formula for each iteration k:

(25)  P_k = (α I + (1 − α) diag(m^(k)))⁻¹

where α is a scalar parameter and m^(k) is defined recursively by blurring m^(k−1), where m^(0) = m is the binary mask indicating the known pixels:

(26)  m^(k) = g_{σ_b} * m^(k−1)

Here, * is the 2D convolution operation and g_{σ_b} is a 2D Gaussian filter of parameter σ_b. Note that it is equivalent to have m^(k) = g_{σ_b√k} * m. We can thus determine σ_b so that the preconditioning at the last iteration K does not depend on K, but only on a fixed blur parameter σ_f at the last iteration, by taking σ_b = σ_f/√K.
Using this definition, the preconditioning values are equal to 1 for the known pixels (i.e. m^(k)_i = 1), and higher than 1 for the other pixels, which results in a stronger denoising of the unknown pixels in Eq. (22). The parameter α prevents too high preconditioning values that would make the denoising impractical: the highest preconditioning value is 1/α. Additionally, thanks to the Gaussian blur, the preconditioning values of unknown pixels depend on their distance from sampled pixels. Therefore, the least reliable pixels (i.e. those far from sampled pixels) are denoised more.
Note that we only use non-zero preconditioning values (and thus a non-zero standard deviation in the denoiser) so that the matrix P_k remains invertible. However, ideally, no denoising should be performed at the sampled pixel positions, to prevent any unnecessary loss of information. Therefore, we force the denoiser's output to be equal to its input for these pixels.
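The construction above can be sketched in pure NumPy as follows (a sketch under the stated assumptions: `gaussian_blur` is a hypothetical separable Gaussian filter, and the preconditioning value is 1 where the blurred mask equals 1 and grows up to 1/α far from any sample):

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with a truncated, normalized kernel."""
    r = max(1, int(3 * sigma))
    t = np.arange(-r, r + 1)
    k = np.exp(-t**2 / (2.0 * sigma**2))
    k /= k.sum()
    pad = np.pad(img, r, mode="edge")
    out = np.apply_along_axis(lambda v: np.convolve(v, k, "valid"), 0, pad)
    return np.apply_along_axis(lambda v: np.convolve(v, k, "valid"), 1, out)

def completion_preconditioner(mask, alpha, sigma_b, k):
    """Diagonal of the preconditioner after k mask-blurring updates:
    1 where the blurred mask is 1, growing up to 1/alpha far from samples."""
    m = mask.astype(float)
    for _ in range(k):
        m = gaussian_blur(m, sigma_b)   # recursive blurring of the mask
    return 1.0 / (alpha + (1.0 - alpha) * m)
```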
5.2 Demosaicing
Digital cameras typically capture colors using a sensor with a color filter array (CFA). Because of the mosaic formed by the CFA, neighboring pixels on the sensor record different color information. Knowing the CFA pattern, the full color information can then be retrieved by solving an inverse problem called demosaicing.
Red, green and blue filters are generally used, so that each pixel directly records one of the RGB components. Demosaicing can then be seen as an interpolation problem where each pixel's R, G or B value is either known or unknown. Therefore, we use the same preconditioning strategy as described in Section 5.1. However, in a realistic scenario, sensor data may also contain noise, which is preferably removed jointly with the demosaicing step [31]. In the case of noisy data, our method applies similarly, but we do not force the denoiser to keep the sampled pixels unchanged as explained in Section 5.1.
Furthermore, unlike the problems of completion and interpolation, demosaicing uses a different mask of sampled pixels for each of the R, G and B components. The preconditioning matrix thus takes different values for each component of the same pixel, which requires a denoiser parameterized with an RGB noise level map. Our DRUNet-var-RGB network trained for arbitrary noise level patterns is thus suitable for this application.
However, in order to illustrate the trade-off between the genericity of the denoiser and the optimal performance, we also trained a more specialized denoiser for demosaicing that we call DRUNet-dem. For training this network, instead of generating noise level maps with the generic random process in Eq. (24), we generate the patterns from the CFA mask that is also used in the preconditioned ADMM. These patterns are defined by Eqs. (25) and (26), where a mask is defined for each color component by the CFA pattern. Experimental results using either the generic network DRUNet-var-RGB or the specialized one DRUNet-dem are given in Section 6.4.
5.3 Poisson Denoising
In many practical scenarios, the assumption that images are corrupted by additive white Gaussian noise is inaccurate. Camera noise is typically better modelled by a Poisson process. In order to formulate a MAP optimization for the problem of denoising an image corrupted with Poisson noise, the least-squares data term in Eq. (3) must be modified. Here, we re-derive our preconditioned ADMM algorithm for Poisson denoising.
5.3.1 Preconditioned ADMM for Poisson Denoising
Applying Poisson noise to an original image x gives a noisy image b such that the likelihood distribution is:

(27)  p(b | x) = ∏_i (x_i^(b_i) e^(−x_i)) / (b_i!)

As seen in Eqs. (1)-(3), the data term f is the negative log-likelihood, which can be derived as:

(28)  f(x) = −ln p(b | x) = Σ_i (x_i − b_i ln(x_i)) + C
where C is a constant term that can be ignored for the minimization. The preconditioned Poisson denoising problem is then formulated by substituting x with Px̃ and by adding the regularization term as in Eq. (14):

(29)  min_{x̃} f(Px̃) + λ s(Px̃)
The problem is solved using the ADMM, by splitting the data and regularization terms following the methodology in Section 2.2. Since the regularization term remains the same as in Section 3, the y-update in Eq. (22) still applies and is performed using the same Gaussian denoiser D_Σ. However, given the new data term, the x-update becomes:

(30)  x^(k+1) = argmin_x f(Px) + (ρ_k/2) ‖x − z‖²

where z = y^k − u^k.
Knowing that P is a diagonal matrix, this problem can be solved independently for each pixel. The following closed form solution is derived in Appendix A (similar derivations are also found in [14] for the case P = I):

(31)  x_i^(k+1) = ( ρ_k z_i − P_ii + √((ρ_k z_i − P_ii)² + 4 ρ_k b_i) ) / (2 ρ_k)
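The per-pixel update amounts to taking the positive root of a quadratic; a minimal sketch (with the diagonal P passed as a vector):

```python
import numpy as np

def poisson_x_update(b, z, P_diag, rho):
    """Per-pixel x-update for the Poisson data term (Eq. (31)).

    Each pixel minimizes  P_ii x - b_i log(P_ii x) + rho/2 (x - z_i)^2,
    i.e. solves  rho x^2 + (P_ii - rho z_i) x - b_i = 0  for its positive root.
    """
    t = rho * z - P_diag
    return (t + np.sqrt(t**2 + 4.0 * rho * b)) / (2.0 * rho)
```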
It is worth noting that, unlike the Gaussian case, the noise level only depends on the data and is not controlled by a parameter. However, in photography the amount of Poisson noise can be reduced by increasing the exposure. This effect can be simulated by applying Poisson noise to a ground truth image rescaled to the range [0, peak]. A high peak value thus simulates a high exposure and less noise. However, for an input image with values in the range [0, peak], the algorithm must be modified, since the denoiser in Eq. (22) assumes data in the range [0, 1]. The y-update should then be performed by replacing D_Σ with the denoiser adapted to the range [0, peak] by rescaling the input and output as:

(32)  D_Σ^peak(z) = peak · D_{Σ/peak²}(z / peak)

Similarly, the final ADMM result should be divided by peak to recover an image in the range [0, 1].
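The range adaptation of Eq. (32) is a pair of rescalings (a sketch; note that dividing the image by peak also divides the noise standard deviation map by peak):

```python
def denoise_peak(denoiser, z, std_map, peak):
    """Apply a denoiser trained on [0, 1] data to an image in [0, peak]:
    rescale the input and the noise-level map down, the output back up."""
    return peak * denoiser(z / peak, std_map / peak)
```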
5.3.2 Choice of Preconditioning Matrix
It is well known that the variance and the expected value of a random variable with the Poisson distribution are both equal to the distribution's parameter, i.e. the noiseless image pixel. This means that, for each pixel of the image, the noise standard deviation is equal to the square root of the pixel value in the ground truth image. Although this image is unknown, the noisy image provides a first estimate of the noise variance, which can be refined at subsequent iterations, as we obtain a better estimate of the noiseless image.
Therefore, in this problem, we use λ/ρ_k = 1 for each iteration, so that the denoiser is only controlled using the preconditioning matrix, which is initialized as P_0 = diag(√b). Here, the square root is applied element-wise and the notation diag(·) forms a diagonal matrix from the vector. Following the initialization strategy in Section 3.3, the variable y is then initialized such that P_0 y⁰ = b, and thus y⁰ = P_0⁻¹ b = √b.
Then, at each iteration, a more accurate estimate of the noiseless image is given by P_k y^(k+1), obtained after the denoising step in Eq. (22). Note that P_k y^(k+1) is directly the output of the denoiser (before applying P_k⁻¹) and is thus obtained without further matrix multiplication. Hence, the preconditioning matrix can be updated to better estimate the noise standard deviation using P_(k+1) = diag(√(P_k y^(k+1))).
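The resulting update rule is a single element-wise square root of the current noiseless-image estimate (a sketch; the small floor `eps` is a hypothetical safeguard keeping the matrix invertible):

```python
import numpy as np

def update_poisson_preconditioner(x_est, eps=1e-3):
    """Diagonal of the updated preconditioner for Poisson denoising:
    the per-pixel noise std estimate is the square root of the current
    estimate of the noiseless image (here x_est plays the role of the
    denoiser output)."""
    return np.sqrt(np.maximum(x_est, eps))
```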
6 Experimental Results
6.1 ADMM Parameters Setting
The main hyperparameters of the ADMM are the number of iterations , the initial penalty parameter , and it’s increase factor in Eq. (12). The additional parameter in Eq (14) depends on the problem and corresponds to the Gaussian noise standard deviation in the measurement vector . For noisefree applications (i.e. completion, interpolation, noisefree demosaicing), using the theoretical value would remove the regularization term. Therefore, in order to keep the benefit of the regularization in these applications, we use a small value . Also note that this parameter does not appear in the case of Poisson denoising in Eq. (29) since the noise standard deviation only depends on the ground truth image.
6.1.1 Completion, Interpolation and Demosaicing
For the completion, interpolation and demosaicing problems (either with or without noise), the parameters setting is inspired from [11]: since the noise level of the denoiser is globally controlled at each iteration k by σ_k = √(λ/ρ_k), we choose ρ_1 such that σ_1 is sufficiently large to remove initialization noise or artifacts at the first iteration (regardless of the loss in details). For example, in the completion problem, we use zero-initialisation and a large value of σ_1. For the interpolation and demosaicing problems, a more accurate initialisation can be performed, so we use a smaller value of σ_1. The parameter η is then determined so that σ_k decreases down to σ_K = σ, in order to best preserve the details in the final image. Therefore, we have σ_k = √(λ/(ρ_1 η^(k−1))). Using this equation for k = 1 and k = K, we can compute the parameters values:

(33)  ρ_1 = λ/σ_1²,   η = (σ_1/σ_K)^(2/(K−1))

Note that the noise level assumed by the denoiser is also controlled locally by the preconditioning matrix in Eq. (22). However, our preconditioning strategy for these problems is such that P_ii = 1 if i is the index of a sampled pixel (see Section 5.1). Hence, at the last iteration, the image is denoised assuming the correct noise standard deviation σ for the sampled pixels. In our experiments, we can thus compare our preconditioned PnP-ADMM to the non-preconditioned version using the same parameterization of ρ_1 and η.
The preconditioning matrix definition in Eqs. (25) and (26) introduces the additional parameters σ_b, controlling the blurring of the mask at each iteration, and α, controlling the maximum preconditioning value. In all the experiments, we use the same maximum preconditioning value (see Table III). The parameter σ_b is set according to the number of iterations using σ_b = σ_f/√K, as explained in Section 5.1, where σ_f is the blurring parameter at the last iteration that we set empirically for each application (see details in the respective sections and the parameters summary in Table III).
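Assuming the relation σ_k = √(λ/ρ_k) and a geometric penalty increase, the parameter computation of Eq. (33) can be sketched as:

```python
import numpy as np

def admm_schedule(sigma_1, sigma_K, lam, K):
    """Compute rho_1 and the increase factor eta so that the denoiser's
    noise level decreases geometrically from sigma_1 to sigma_K over K
    iterations, with sigma_k = sqrt(lam / rho_k) and rho_{k+1} = eta rho_k."""
    rho_1 = lam / sigma_1**2
    eta = (sigma_1 / sigma_K) ** (2.0 / (K - 1))
    return rho_1, eta
```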
6.1.2 Poisson Denoising
For the problem of Poisson denoising, the noise level at each pixel and each iteration is fully controlled by the preconditioning matrix as explained in Section 5.3, which removes the need for the penalty parameter. Therefore, we simply use and .
Nevertheless, in order to evaluate the advantage of the preconditioning, we also implement the non-preconditioned version by taking and using the DRUNet-cst network. In this case, the noise standard deviation assumed by the denoiser in Eq. (22) at each iteration is . Given input image data in the range and with Poisson noise, the highest possible noise standard deviation is . Hence, we initialize so that . We found empirically that the best results are always obtained by decreasing the denoiser's standard deviation down to at the last iteration . Therefore, in our experiments without preconditioning, we always use and .
A summary of the parameter settings can be found in Table III. It does not include the number of iterations , which is studied in more detail for each application in the following sections.
Table III: parameter summary (without preco. | with preco.)
completion: 1 | 1 | 0 | 10
interpolation: 0.4 | 10
demosaicing (noise-free): 0 or 0.3 | 10
demosaicing (with noise): 0 or 0.3 | 10
without preco. | with preco.: 1 | 1
6.2 Interpolation
Table IV: interpolation results, average PSNR (dB).
                               Set5    CBSD68
x2   bicubic interpolation     31.63   27.85
     PnP-ADMM w/o preco.       33.34   28.85
     PnP-ADMM w/ preco.        33.47   29.00
x4   bicubic interpolation     25.66   23.20
     PnP-ADMM w/o preco.       26.36   23.65
     PnP-ADMM w/ preco.        26.94   23.90
The interpolation problem can be seen as a particular case of super-resolution in which the low-resolution input image was generated without applying an anti-aliasing filter before subsampling the ground-truth image pixels. As a result, the main difficulty of the interpolation problem is to remove aliasing effects. First, we show in Fig. 2 that our method better removes aliasing when we update the preconditioning matrix by blurring the mask of sampled pixels, as described in Section 5.1 (see Eqs. (25)-(26)). In this example, the result of the bicubic interpolation in Fig. 2(a) displays strong aliasing artifacts. In Fig. 2(b), using our preconditioned PnP-ADMM with (i.e. no update of and no blurring of the mask) partially removes the aliasing. However, better results are obtained in Fig. 2(c), when updating using .
Although many super-resolution algorithms exist and excellent performance has been obtained with deep learning techniques, these methods are generally trained assuming that the degradation already includes a given anti-aliasing filter. Therefore, these methods are not suitable for interpolation. Hence, we only compare our approach to the non-preconditioned PnP-ADMM, as well as to the bicubic interpolation, which we also use as the initialisation of the ADMM. The results of the PnP-ADMM, either with or without preconditioning, depend on the chosen number of iterations . Fig. 3(a)-(b) shows that without preconditioning, the best results are obtained using respectively and iterations for x2 and x4 interpolation, while with our preconditioning, we only need and respectively. Using these values for the parameter in each case, the results in Table IV and Fig. 4 show that, in addition to the faster convergence, our preconditioning improves the final results of the PnP-ADMM in both x2 and x4 interpolation.
6.3 Completion
Table V: completion results, average PSNR (dB), for the two sampling rates (one block per rate).
                                  Set5    CBSD68
Total Variation regularization    26.11   24.80
Edge Enhancing Diffusion          28.61   25.55
PnP-ADMM w/o preco.               30.20   26.75
PnP-ADMM w/ preco.                30.94   27.43
Total Variation regularization    22.86   22.66
Edge Enhancing Diffusion          25.85   23.43
PnP-ADMM w/o preco.               26.20   24.06
PnP-ADMM w/ preco.                27.52   24.95
Table VI: noise-free demosaicing, average PSNR (dB). The last three columns are the PnP-ADMM without preconditioning (DRUNet-cst) and with preconditioning using DRUNet-varRBG and DRUNet-dem.
Kodak     34.68  41.85  42.05  40.26 | 41.45  42.37  42.32
McMaster  34.46  39.13  39.50  36.84 | 39.24  39.27  39.31
Table VII: demosaicing with noise, average PSNR (dB). The last three columns are the PnP-ADMM without preconditioning (DRUNet-cst) and with preconditioning using DRUNet-varRBG and DRUNet-dem.
Kodak     27.54  33.27  30.96 | 33.24  33.90  34.18
McMaster  27.69  33.18  30.19 | 33.35  33.89  34.06
Kodak     22.38  30.04  23.81 | 29.92  30.84  31.18
McMaster  22.75  30.18  23.65 | 30.09  31.09  31.36
For the completion problem, we evaluate the results using a rate of either or of known pixels. Unlike the interpolation problem, the known pixels do not form a regular grid; they are instead selected randomly, which prevents aliasing artifacts. For this reason, and contrary to what we observed for interpolation in Fig. 2, updating the preconditioning matrix with a blurred version of the mask does not improve the completion results. Hence, we use in the following experiments.
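For reference, randomly sampled measurements of this kind can be simulated as follows (a sketch with our own function name and a fixed NumPy seed; the paper's sampling rates are not reproduced here):

```python
import numpy as np

def random_completion(img, rate, seed=0):
    """Keep a random fraction `rate` of pixels (the rest are zeroed out).
    Returns the degraded image and the binary sampling mask."""
    rng = np.random.default_rng(seed)
    mask = (rng.random(img.shape) < rate).astype(img.dtype)
    return img * mask, mask
```

The same mask then defines the data-fidelity term and, in our scheme, the diagonal preconditioning values.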
Based on the results in Fig. 3(c)-(d), we use and iterations respectively for the and completion problems when using our preconditioning. Similarly to the interpolation problem, the convergence of the non-preconditioned PnP-ADMM is significantly slower: the best results on the Set5 dataset are obtained using respectively and for and completion.
For further comparison, we also provide completion results obtained with the conventional total variation (TV) regularization, using the implementation in [34]. We also compare our results to a state-of-the-art diffusion-based method. A review of diffusion methods in [35] has shown that the best completion performance is obtained with a nonlinear anisotropic diffusion scheme referred to as edge-enhancing diffusion (EED). EED has also been exploited within compression schemes in more recent works (e.g. [36], [37]), where only a small percentage of pixels are encoded and the remaining ones must be recovered by the decoder. Therefore, we use the EED scheme for our comparisons.
The results in Fig. 5 show that, while EED better recovers large image structures than solving the inverse problem with TV regularization, the PnP-ADMM scheme retrieves finer details and sharper edges thanks to the advanced regularization provided by the trained denoiser. However, without our preconditioning, some important structures of the image may be lost, as shown in Fig. 5(e) (e.g. the white spots on the butterfly, the zebras' stripes). On the other hand, our preconditioned version recovers all the structures that can be inferred from the input data, resulting in a large gain in PSNR. Note that this observation is consistent with the interpolation results in Fig. 4, where important details of the image are lost by the non-preconditioned PnP-ADMM but preserved with our preconditioning. The average PSNR results in Table V confirm the superiority of the proposed preconditioned PnP-ADMM over the other approaches for image completion at both sampling rates.
6.4 Demosaicing
For demosaicing, we evaluate the results both in the noise-free scenario and in the presence of Gaussian noise of standard deviation or . In each case, the input images are filtered by the traditional Bayer CFA, i.e. only one of the red, green or blue components is known at each pixel. The Bayer CFA forms a repetitive pattern of 2x2 blocks, each containing one red, one blue and two green pixels. For our preconditioned PnP-ADMM method, we evaluate the results using either the generic denoising network DRUNet-varRBG or the specialized network DRUNet-dem that was trained on noise level patterns generated from the Bayer CFA, as explained in Section 5.2.
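The 2x2 Bayer structure described above can be encoded as per-channel sampling masks. A small sketch, assuming the common RGGB arrangement (one of several possible Bayer orderings):

```python
import numpy as np

def bayer_masks(h, w):
    """Per-channel binary sampling masks for an RGGB Bayer CFA:
    each 2x2 block has R at (0,0), G at (0,1) and (1,0), B at (1,1)."""
    m = np.zeros((h, w, 3))
    m[0::2, 0::2, 0] = 1  # red
    m[0::2, 1::2, 1] = 1  # green
    m[1::2, 0::2, 1] = 1  # green
    m[1::2, 1::2, 2] = 1  # blue
    return m
```

Exactly one channel is sampled at every pixel, and the green mask covers half of the pixels, matching the block composition described above.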
Similarly to the interpolation problem, the repetition of the 2x2 block pattern in the Bayer CFA may cause artifacts comparable to aliasing. When using the generic network DRUNet-varRBG in our preconditioning scheme, these artifacts are better removed by updating the matrix with a blurred version of the mask. In this case, we use to control the blur at each iteration. However, when using the denoiser DRUNet-dem specialized for such patterns, the aliasing artifacts from the Bayer CFA are naturally removed without the need to alter the preconditioning matrix. Thus, we use in this case.
Note that for all the experiments, a fast initialisation is performed with Matlab's demosaicing method [32], which uses bilinear interpolation and gradient-based corrections. Given this initialisation, we fix the number of iterations of the PnP-ADMM according to the preliminary experiment in Fig. 7. Using our preconditioning (Fig. 7(a)), close-to-optimal results are obtained with , either with or without noise, and for both the DRUNet-varRBG and DRUNet-dem networks. More iterations are needed for the non-preconditioned PnP-ADMM, which reaches its best performance using in the noise-free scenario (), and in both noisy settings.
For the evaluation, we use the Kodak [38] and McMaster [39] test datasets, which are widely used for demosaicing benchmarks as they contain natural images with challenging features for this application. We compare both versions of our preconditioned PnP-ADMM with the non-preconditioned version (based on the DRUNet-cst denoiser), as well as with other reference methods including Matlab's demosaicing method by Malvar et al. [32] and the deep-learning-based methods of Gharbi et al. (DeepJoint) [31], Henz et al. [21] and Kokkinos et al. (MMNet) [33].
Table VI and Fig. 6 present the results in the noise-free scenario. Our method is among the best in average PSNR, indicating that it successfully recovers the fine details overall, and with greater accuracy than the non-preconditioned PnP-ADMM. However, in very challenging areas with high-frequency patterns (see Fig. 6), our preconditioning based on the generic network DRUNet-varRBG leaves some zipper and color artifacts. Note that, on closer inspection, similar artifacts also remain in the results of the Henz et al. and DeepJoint methods. However, using the specialized denoiser DRUNet-dem in our scheme completely solves this issue. This shows that it was not a limitation of the preconditioning or of the PnP-ADMM scheme itself, but rather a limitation of the DRUNet-varRBG denoiser, which does not perform optimally when the noise level map is a structured pattern constructed from the Bayer CFA.
6.5 Poisson Denoising
For Poisson denoising, we compare the results of the PnP-ADMM either with or without our preconditioning. Note that the non-preconditioned version was previously described in [14] for this application, where the BM3D denoiser was used. For a fair evaluation of our approach, we use instead the DRUNet-cst network, which gives the best performance when no preconditioning is required. For our preconditioned scheme, we use the DRUNet-varRBG denoiser, since the Poisson noise variance depends not only on the pixels' positions, but also on the colour component.
First, we analyse the results for moderate to high levels of Poisson noise (using , , ). Table VIII and Fig. 10(a) show that the proposed preconditioning improves the results compared to the original PnP-ADMM, regardless of the fixed number of iterations . Furthermore, better convergence is obtained with our approach, as shown in Fig. 10. While the results of the non-preconditioned PnP-ADMM start degrading when using more than iterations, our method successfully converges to the best result. The improved convergence is confirmed in Fig. 10(b), which shows that the equality constraint between the two ADMM variables is better satisfied with our preconditioning. The visual results in Fig. 9 also show that our approach recovers finer details. Using a large number of iterations further improves the PSNR in this case, although the visual results are similar (see Fig. 9(e,f)). On the other hand, in the non-preconditioned PnP-ADMM, using too many iterations visually degrades the results by removing details in the bright regions while leaving more noise in the dark regions (see Fig. 9(c,d)).
Let us now evaluate our method at an extreme noise level, i.e. using . In this situation, the noisy image is a very inaccurate estimate of the noise variance, and it cannot be used to initialize the preconditioning matrix. Instead, we compute an initial estimate using variance stabilization and Gaussian denoising: the Anscombe transform [40] is first applied to obtain an image with approximately constant noise variance; our Gaussian denoiser is then used assuming a constant standard deviation; finally, the inverse Anscombe transform is applied after denoising. The preconditioning matrix is then computed as the square root of this initial estimate. We also disabled the update of , which degraded our results in this case. The PSNR results using this initialization method are presented in Table IX, and visual comparisons are shown in Fig. 11. Here, the best results are obtained after convergence, either with or without the preconditioning. Hence, the results are given for a large number of iterations . Our approach significantly outperforms both the non-preconditioned PnP-ADMM and the Anscombe variance stabilization method used for the initialization.
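The variance-stabilization step can be sketched as follows (the simple algebraic inverse is shown here; an unbiased inverse is usually preferred at very low counts, and the function names are ours):

```python
import math

def anscombe(x):
    """Forward Anscombe transform: maps Poisson-distributed data to
    values with approximately unit-variance Gaussian noise."""
    return 2.0 * math.sqrt(x + 3.0 / 8.0)

def inverse_anscombe(z):
    """Direct algebraic inverse of the forward transform."""
    return (z / 2.0) ** 2 - 3.0 / 8.0
```

A Gaussian denoiser with a constant noise level can then be applied between the two transforms, as in the initialization procedure described above.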
Table VIII: Poisson denoising, average PSNR (dB), for the three noise levels (one block per level).
          Without Preconditioning | With Preconditioning
Set5      36.54  36.33 | 36.82  36.81
CBSD68    36.35  36.25 | 36.58  36.57
Set5      33.34  33.13 | 33.65  33.69
CBSD68    32.36  32.19 | 32.72  32.74
Set5      31.73  31.41 | 31.94  32.06
CBSD68    30.42  30.20 | 30.81  30.89
Table IX: Poisson denoising at an extreme noise level, average PSNR (dB).
Set5      17.50  22.04  23.82
CBSD68    17.30  20.50  23.05
7 Conclusion
In this paper, we have proposed a new preconditioning approach for Plug-and-Play optimization methods in the context of image restoration. Existing PnP schemes perform regularization of inverse problems using any conventional denoiser that assumes independent and identically distributed Gaussian noise. On the theoretical level, we have shown that our preconditioning removes this i.i.d. assumption by introducing into the algorithm a denoiser parameterized with a covariance matrix (directly related to the chosen preconditioning matrix) instead of a single standard deviation parameter. Greater control is thus granted for better adapting the algorithm to each task by accounting for the specific error distribution of the current image estimate.
For our practical implementation, we have proposed a training procedure that generalizes a state-of-the-art CNN denoiser, enabling its parameterization with an arbitrary noise level map. While this restricts our study to diagonal covariance and preconditioning matrices, it is sufficient for many applications in which the input image is degraded independently at each pixel. Such applications include image interpolation, completion, demosaicing and Poisson denoising. For each of these tasks, we have defined a suitable preconditioning scheme that significantly outperforms the non-preconditioned version in both convergence speed and image quality. The proposed method also preserves the genericity of the PnP approach, since the same denoiser is used to solve several inverse problems. Nevertheless, as we have shown with demosaicing, our method may be further specialized and improved for a given task by training the denoiser with the corresponding noise level maps.
A possible direction for future work would be to develop denoisers suitable for correlated noise (i.e. a non-diagonal covariance matrix), hence extending the use of our preconditioned PnP scheme to a broader range of applications.
Appendix A Poisson data term subproblem
Let us find the closed-form solution of the minimization in Eq. (30), which we rewrite here without the iteration numbers for simplicity of notation:
(34) 
where is a diagonal matrix. Here, we additionally assume , and .
Since is diagonal, the problem is equivalently expressed for each element independently. Using the scalar notations , and , the problem is to find the value that minimizes the function defined as:
(35) 
The function is defined for and is convex. Therefore, finding such that gives the global minimum. Differentiating gives:
(36)  
(37) 
Therefore, if we can find such that is a root of the second-order polynomial , then minimizes the function . For , the solution always exists and is:
(38)  
(39) 
Note that if and , then the highest root is , which is not in the domain of . However, when , we can simply redefine using , which is defined for . So in practice, we can extend the domain of to the value (negative values are not valid pixel values and remain excluded). With this definition, Eq. (39) gives the solution in every case.
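As an illustration of the closed form derived above, here is a sketch under a generic scalar parameterization (our own notation; in the paper, the per-pixel weights come from the preconditioning matrix): we minimize f(x) = (rho/2)(x - z)^2 + x - y log(x) over x > 0, whose stationary point is the non-negative root of the quadratic rho x^2 + (1 - rho z) x - y = 0.

```python
import math

def poisson_prox(z, y, rho):
    """Minimizer over x > 0 of f(x) = (rho/2)*(x - z)**2 + x - y*log(x):
    the non-negative root of rho*x**2 + (1 - rho*z)*x - y = 0.
    (Generic scalar sketch of the closed form.)"""
    b = rho * z - 1.0  # quadratic rewritten as rho*x**2 - b*x - y = 0
    return (b + math.sqrt(b * b + 4.0 * rho * y)) / (2.0 * rho)
```

For y = 0 the root degenerates to max(z - 1/rho, 0), matching the boundary case discussed above where the domain is extended to include zero.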
Acknowledgments
This work was supported in part by the French ANR research agency in the context of the artificial intelligence project DeepCIM, and in part by the EU H2020 Research and Innovation Programme under grant agreement No 694122 (ERC advanced grant CLIM).
References
 [1] S. Diamond, V. Sitzmann, F. Heide, and G. Wetzstein, “Unrolled optimization with deep priors,” arXiv preprint arXiv:1705.08041, 2017.
 [2] M. Mardani, Q. Sun, S. Vasanawala, V. Papyan, H. Monajemi, J. Pauly, and D. Donoho, “Neural proximal gradient descent for compressive imaging,” arXiv preprint arXiv:1806.03963, 2018.
 [3] Y. Yang, J. Sun, H. Li, and Z. Xu, “Deep ADMM-Net for compressive sensing MRI,” in Proceedings of the 30th International Conference on Neural Information Processing Systems, 2016, pp. 10–18.
 [4] D. Gilton, G. Ongie, and R. Willett, “Neumann networks for linear inverse problems in imaging,” IEEE Transactions on Computational Imaging, vol. 6, pp. 328–343, 2019.
 [5] Q. Ning, W. Dong, G. Shi, L. Li, and X. Li, “Accurate and lightweight image superresolution with modelguided deep unfolding network,” IEEE Journal of Selected Topics in Signal Processing, 2020.

 [6] S. Lefkimmiatis, “Universal denoising networks: A novel CNN architecture for image denoising,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2018, pp. 3204–3213.
 [7] W. Dong, P. Wang, W. Yin, G. Shi, F. Wu, and X. Lu, “Denoising prior driven deep neural network for image restoration,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 10, pp. 2305–2318, 2019.
 [8] S. Boyd, N. Parikh, and E. Chu, Distributed optimization and statistical learning via the alternating direction method of multipliers. Now Publishers Inc, 2011.
 [9] D. Geman and C. Yang, “Nonlinear image recovery with halfquadratic regularization,” IEEE transactions on Image Processing, vol. 4, no. 7, pp. 932–946, 1995.
 [10] S. V. Venkatakrishnan, C. A. Bouman, and B. Wohlberg, “Plugandplay priors for model based reconstruction,” in 2013 IEEE Global Conference on Signal and Information Processing, 2013, pp. 945–948.
 [11] K. Zhang, Y. Li, W. Zuo, L. Zhang, L. Van Gool, and R. Timofte, “Plugandplay image restoration with deep denoiser prior,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021.
 [12] K. Zhang, W. Zuo, S. Gu, and L. Zhang, “Learning deep cnn denoiser prior for image restoration,” in IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 3929–3938.
 [13] J. R. Chang, C.-L. Li, B. Póczos, B. Vijaya Kumar, and A. C. Sankaranarayanan, “One network to solve them all — solving linear inverse problems using deep projection models,” in 2017 IEEE International Conference on Computer Vision (ICCV), 2017, pp. 5889–5898.
 [14] A. Rond, R. Giryes, and M. Elad, “Poisson inverse problems by the plugandplay scheme,” Journal of Visual Communication and Image Representation, vol. 41, pp. 96–108, 2016.
 [15] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image denoising by sparse 3-D transform-domain collaborative filtering,” IEEE Transactions on Image Processing, vol. 16, no. 8, pp. 2080–2095, 2007.
 [16] M. Aharon, M. Elad, and A. Bruckstein, “K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation,” IEEE Transactions on Signal Processing, vol. 54, no. 11, pp. 4311–4322, 2006.
 [17] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770–778.
 [18] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in Medical Image Computing and Computer-Assisted Intervention, 2015, pp. 234–241.
 [19] T. Meinhardt, M. Moeller, C. Hazirbas, and D. Cremers, “Learning proximal operators: Using denoising networks for regularizing inverse imaging problems,” in International Conference on Computer Vision (ICCV), Oct. 2017, pp. 1799–1808.
 [20] K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang, “Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising,” IEEE Transactions on Image Processing, vol. 26, no. 7, pp. 3142–3155, Jul. 2017.
 [21] B. Henz, E. S. Gastal, and M. M. Oliveira, “Deep joint design of color filter arrays and demosaicing,” in Computer Graphics Forum, vol. 37, 2018, pp. 389–399.
 [22] P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik, “Contour detection and hierarchical image segmentation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 5, pp. 898–916, May 2011.
 [23] K. Ma, Z. Duanmu, Q. Wu, Z. Wang, H. Yong, H. Li, and L. Zhang, “Waterloo exploration database: New challenges for image quality assessment models,” IEEE Transactions on Image Processing, vol. 26, no. 2, pp. 1004–1016, 2017.
 [24] E. Agustsson and R. Timofte, “NTIRE 2017 challenge on single image super-resolution: Dataset and study,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Jul. 2017.
 [25] B. Lim, S. Son, H. Kim, S. Nah, and K. M. Lee, “Enhanced deep residual networks for single image super-resolution,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2017, pp. 1132–1140.
 [26] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in ICLR (Poster), 2015.
 [27] K. Zhang, W. Zuo, and L. Zhang, “FFDNet: Toward a fast and flexible solution for CNN-based image denoising,” IEEE Transactions on Image Processing, vol. 27, no. 9, pp. 4608–4622, 2018.
 [28] Y. Zhang, K. Li, K. Li, B. Zhong, and Y. Fu, “Residual nonlocal attention networks for image restoration,” in International Conference on Learning Representations (ICLR), May 2019.
 [29] M. Bevilacqua, A. Roumy, C. Guillemot, and M.-L. Alberi-Morel, “Low-complexity single-image super-resolution based on nonnegative neighbor embedding,” in Proceedings of the British Machine Vision Conference, 2012, pp. 135.1–135.10.
 [30] D. Martin, C. Fowlkes, D. Tal, and J. Malik, “A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics,” in Proc. 8th Int’l Conf. Computer Vision, vol. 2, Jul. 2001, pp. 416–423.
 [31] M. Gharbi, G. Chaurasia, S. Paris, and F. Durand, “Deep joint demosaicking and denoising,” ACM Transactions on Graphics (TOG), vol. 35, no. 6, pp. 1–12, 2016.
 [32] H. S. Malvar, L.-W. He, and R. Cutler, “High-quality linear interpolation for demosaicing of Bayer-patterned color images,” in Proc. IEEE ICASSP, 2004.
 [33] F. Kokkinos and S. Lefkimmiatis, “Iterative joint image demosaicking and denoising using a residual denoising network,” IEEE Transactions on Image Processing, vol. 28, no. 8, pp. 4177–4188, 2019.
 [34] J. Dahl, P. C. Hansen, S. H. Jensen, and T. L. Jensen, “Algorithms and software for total variation image reconstruction via first-order methods,” Numerical Algorithms, vol. 53, no. 1, pp. 67–92, Jul. 2009.
 [35] I. Galić, J. Weickert, M. Welk, A. Bruhn, A. Belyaev, and H.-P. Seidel, “Image compression with anisotropic diffusion,” Journal of Mathematical Imaging and Vision, vol. 31, no. 2, pp. 255–269, Apr. 2008.
 [36] S. Andris, P. Peter, and J. Weickert, “A proof-of-concept framework for PDE-based video compression,” in Picture Coding Symposium (PCS), 2016, pp. 1–5.
 [37] C. Schmaltz, P. Peter, M. Mainberger, F. Ebel, J. Weickert, and A. Bruhn, “Understanding, optimising, and extending data compression with anisotropic diffusion,” Int. J. Comput. Vision, vol. 108, no. 3, pp. 222–240, Jul. 2014.
 [38] R. Franzen, “Kodak lossless true color image suites,” 1999, source: http://r0k.us/graphics/kodak.
 [39] L. Zhang, X. Wu, A. Buades, and X. Li, “Color demosaicking by local directional interpolation and nonlocal adaptive thresholding,” Journal of Electronic Imaging, vol. 20, no. 2, pp. 1–17, 2011.
 [40] F. Anscombe, “The transformation of Poisson, binomial and negative-binomial data,” Biometrika, vol. 35, no. 3–4, pp. 246–254, Dec. 1948.