In this paper, we propose a method for wavelet denoising of signals contaminated with Gaussian noise when prior information about the

The proposed methodology is particularly well suited to denoising tasks in which the signal-to-noise ratio is low, as illustrated by simulations on a battery of standard test functions. Comparisons with several commonly used wavelet shrinkage methods are provided.

Function estimation by using shrinkage estimators in the wavelet domain was initiated in the works of David Donoho and Iain Johnstone in the early 1990s. Since then, various shrinkage methods have been proposed. In this article, we propose and investigate a novel approach for estimation in the wavelet domain. This approach brings together the classic principle of Γ-minimaxity, Huber’s famous

For the rest of this introductory section, we review fundamentals of Γ-minimax estimation, wavelet shrinkage, and Bayesian approaches to wavelet shrinkage. This section is mostly expository and gives background and the needed motivation.

The Γ-minimax paradigm, originally proposed by Robbins [

The Γ-minimax paradigm incorporates the prior information about the statistical model by a family of plausible priors, denoted by Γ, rather than by a single prior. Elicitation of “prior families” is often encountered in practice. Given the family of priors, the decision maker selects a rule that is optimal with respect to the least favorable prior in the family.
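The recipe of "minimize the worst Bayes risk over Γ" can be made concrete with a toy numerical illustration, which is not the paper's model: estimating a normal mean with linear rules under symmetric two-point priors. The class of linear rules, the bound M, and the grid search are our own illustrative choices.

```python
import numpy as np

# Toy Γ-minimax example (illustrative only): X ~ N(theta, 1), linear rules
# delta_c(x) = c * x, and Γ = symmetric two-point priors with mass 1/2 at
# +/- m, for 0 <= m <= M. The Bayes risk of delta_c under the prior at
# +/- m works out to r(c, m) = c^2 + m^2 * (1 - c)^2.
M = 2.0
cs = np.linspace(0.0, 1.0, 1001)
ms = np.linspace(0.0, M, 1001)
risk = cs[:, None] ** 2 + ms[None, :] ** 2 * (1.0 - cs[:, None]) ** 2

worst = risk.max(axis=1)        # least favorable prior in Γ for each rule
c_star = cs[worst.argmin()]     # Γ-minimax rule within the linear class
# The analytic answer is c* = M^2 / (1 + M^2), i.e., 0.8 for M = 2: the
# least favorable prior pushes the mass to the boundary m = M.
```

The grid search recovers the analytic solution because the inner maximum over m is attained at the boundary, which is exactly the "least favorable prior" mechanism the Γ-minimax paradigm formalizes.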

We note that least favorable priors typically do not exist on unbounded parameter spaces. For example, there is no least favorable prior for estimating an unconstrained normal or Poisson mean. A least favorable prior does exist for estimating a binomial parameter, whose parameter space is compact, but no amount of intuition can predict its form. Moreover, as soon as the loss is changed to a normalized squared error, the least favorable prior changes. There are no portable links or intuitions regarding least favorable priors; the problem is delicate and highly problem-specific. For example, in a quite general setup, Clarke and Barron in [

Inference of this kind is often interpreted in terms of game theory. Formally, if

In a nutshell, the Γ-minimax is a philosophical compromise between subjective Bayes with a single prior and the conservative approach of not having a prior at all.

We apply the Γ-minimax approach to the classical nonparametric regression problem
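As background for the wavelet-domain setting, the forward and inverse transform steps can be sketched with a single level of the orthonormal Haar transform; this is a minimal sketch of the generic pipeline, not the paper's Γ-minimax rules.

```python
import math

def haar_step(x):
    """One level of the orthonormal Haar transform (len(x) must be even)."""
    s = math.sqrt(2.0)
    approx = [(a + b) / s for a, b in zip(x[0::2], x[1::2])]
    detail = [(a - b) / s for a, b in zip(x[0::2], x[1::2])]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Inverse of haar_step: perfect reconstruction from (approx, detail)."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / s, (a - d) / s])
    return x

# Denoising sketch: transform, zero out small detail coefficients, invert.
y = [1.0, 1.0, 2.0, 2.0, 5.0, 5.0, 1.0, 1.0]
approx, detail = haar_step(y)
detail_thr = [d if abs(d) > 0.5 else 0.0 for d in detail]
y_hat = haar_inverse_step(approx, detail_thr)
```

In practice one iterates the transform over several resolution levels and applies a level-dependent shrinkage rule to the detail coefficients.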

Wavelet shrinkage rules have been extensively studied in the literature, but mostly when no additional information on the parameter space Θ is available. For the implementation of wavelet methods in nonparametric regression problems, we also refer to [

Bayesian shrinkage methods in wavelet domains have received considerable attention in recent years. For suitable priors, Bayes rules are shrinkage rules. The shrinkage process is defined as follows: A shrinkage rule is applied in the wavelet domain and the observed wavelet coefficients
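Two standard non-Bayesian shrinkage rules, hard and soft thresholding, illustrate what "applying a rule in the wavelet domain" means coefficient by coefficient; they serve here only as familiar reference points.

```python
def hard_threshold(d, lam):
    # Keep-or-kill: coefficients with magnitude at most lam are set to zero,
    # all others are kept unchanged.
    return d if abs(d) > lam else 0.0

def soft_threshold(d, lam):
    # Kill small coefficients and shrink the survivors toward zero by lam
    # (the Donoho-Johnstone soft-thresholding rule).
    if d > lam:
        return d - lam
    if d < -lam:
        return d + lam
    return 0.0
```

Bayes rules induced by priors peaked at zero behave like smooth compromises between these two extremes.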

Bayesian models on the wavelet coefficients have proved capable of incorporating some prior information about the unknown signal, such as smoothness, periodicity, sparseness, self-similarity and, for some particular bases (e.g., Haar), monotonicity.

The shrinkage is usually achieved by eliciting a single prior distribution

It is well known that most of the noiseless signals encountered in practical applications have (for each resolution level) empirical distributions of wavelet coefficients centered around zero and peaked at zero. A realistic Bayesian model that takes into account this prior knowledge should consider a prior distribution for which the prior predictive distribution produces a reasonable agreement with observations. A realistic prior distribution on the wavelet coefficient

Our approach differs from all of the prior literature; the motivation is as follows.

It is clear that specifying a single prior distribution

In this paper, we incorporate prior information on the boundedness of the energy of the signal (the

It is well known that estimating a bounded normal mean represents a challenging problem. In our context, if the structure of the prior (

Let Γ denote the family

We consider two models; both assume that the wavelet coefficients follow a normal distribution (a statement about the distribution of the noise),

The rest of the paper is organized as follows. Section

In this section, we discuss the existence and characterization of Γ-minimax shrinkage rules that are Bayes with respect to least favorable priors on the interval

We emphasize that, below, we treat Γ-minimax estimation over general compact convex sets in finite-dimensional Euclidean spaces. By giving a complete and self-contained derivation of the existence of a least favorable prior and the Γ-minimax estimator in this generality, provided the parameter space is sufficiently small, we obtain a useful unification of the Γ-minimax problem.

Γ-Minimax Rule for Model I.

Figure

Values of

0.0  | 1.05674 | 0.81758
0.1  | 1.15020 | 0.91678
0.2  | 1.27739 | 1.05298
0.3  | 1.46988 | 1.25773
0.4  | 1.84922 | 1.52579
0.5  | 2.28384 | 1.74714
0.6  | 2.41918 | 1.91515
0.7  | 2.50918 | 2.05511
0.8  | 2.58807 | 2.19721
0.9  | 2.69942 | 2.40872
0.95 | 2.81605 | 2.63323
0.99 | 3.10039 | 3.24539

In Model II, the variance

The model is:

The marginal likelihood is double exponential, arising as an exponential scale mixture of normals,
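This classical identity (a normal with an exponentially distributed variance has a double-exponential, i.e., Laplace, marginal) can be verified numerically; the specific rate parameterization below is our own illustrative choice.

```python
import math

def laplace_pdf(x, b):
    # Double-exponential (Laplace) density with scale b.
    return math.exp(-abs(x) / b) / (2.0 * b)

def mixture_pdf(x, b, n=100000, vmax=200.0):
    # Integrate N(x | 0, v) against an exponential mixing density for the
    # variance v with rate 1/(2 b^2), using the midpoint rule. The result
    # should match the Laplace density with scale b.
    rate = 1.0 / (2.0 * b * b)
    h = vmax / n
    total = 0.0
    for i in range(n):
        v = (i + 0.5) * h
        normal = math.exp(-x * x / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
        total += normal * rate * math.exp(-rate * v)
    return total * h
```

For example, `mixture_pdf(1.0, 1.0)` agrees with `laplace_pdf(1.0, 1.0)` to several decimal places, confirming the scale-mixture representation.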

Sketches of proofs of Theorems

Γ-Minimax Rule for Model II. Left: Rules for

Risk of Γ-Minimax rule for

Frequentist risk of a rule
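The risk and its squared-bias/variance decomposition can be computed by a small quadrature routine. Soft thresholding is used below purely as an illustrative stand-in rule; the paper's Γ-minimax rules would be substituted in practice.

```python
import math

def soft(x, lam):
    # Illustrative shrinkage rule: soft thresholding at level lam.
    return math.copysign(max(abs(x) - lam, 0.0), x)

def risk(theta, lam, sigma=1.0, n=4001, width=10.0):
    """Frequentist risk R(theta) = E[(delta(X) - theta)^2], X ~ N(theta, sigma^2),
    by midpoint quadrature, with its bias^2 + variance decomposition."""
    h = 2.0 * width * sigma / n
    mean = m2 = 0.0
    for i in range(n):
        x = theta - width * sigma + (i + 0.5) * h
        w = h * math.exp(-((x - theta) / sigma) ** 2 / 2.0) \
              / (sigma * math.sqrt(2.0 * math.pi))
        d = soft(x, lam)
        mean += w * d
        m2 += w * (d - theta) ** 2
    bias2 = (mean - theta) ** 2
    return m2, bias2, m2 - bias2   # risk, squared bias, variance
```

For soft thresholding the risk grows with |theta|: near zero the rule benefits from shrinkage, while for large |theta| the risk approaches sigma^2 + lam^2 because of the systematic bias of size lam.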

To explore the behavior of the two risk components in the context of Models I and II, we selected the risk of the Γ-minimax rule for

Risk components of Γ-minimax rule for

The most interesting finding is the behavior of the risk function of the rule. The risk function has a wavy shape with maxima at locations depending on

The proposed Bayesian shrinkage procedures with three-point priors depend on three parameters,

As we indicated,

The nine test signals used in the simulation study.

In the simulation study, we assessed the performance of the proposed shrinkage procedures on a battery of standard test signals. We used nine different test signals (

We generated noisy data samples of the nine test signals by adding normal noise with zero mean and variance
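A sketch of this data-generation step is given below, using the standard Doppler test function from the Donoho-Johnstone battery. The SNR convention sd(signal)/sigma is our own assumption; the definition used in the paper may differ.

```python
import math, random

def doppler(t):
    # The standard Doppler test function from the Donoho-Johnstone battery.
    return math.sqrt(t * (1 - t)) * math.sin(2 * math.pi * 1.05 / (t + 0.05))

def add_noise(signal, snr, seed=0):
    # Choose the noise level so that sd(signal) / sigma = snr
    # (one common SNR convention, assumed here for illustration).
    n = len(signal)
    mean = sum(signal) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in signal) / n)
    sigma = sd / snr
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, sigma) for s in signal], sigma

n = 1024
f = [doppler(i / n) for i in range(n)]
y, sigma = add_noise(f, snr=3.0)   # noisy observations and the noise level
```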

Noisy versions of the nine signals from Fig.

Change in average MSE (AMSE) as a function of hyper-parameters

The shrinkage procedure was applied to each test signal and the AMSE was computed for a range of parameter values of

Based on our simulations, the optimal hyper-parameter values of

In general, we suggest

Estimation of

We compared the performance of the proposed three-point prior estimator with eight existing estimation techniques, including: Bayesian adaptive multiresolution shrinker [

In the simulation study, we computed the AMSE using the parameter values of

The box plots of MSE for the ten estimation methods: (1) Rule-I, (2) Rule-II, (3) Bayesian adaptive multiresolution shrinker (BAMS), (4) Decompsh, (5) Block-median, (6) Block-mean, (7) Hybrid version of the block-median procedure, (8) BlockJS, (9) Visu-Shrink, and (10) Generalized cross-validation. The MSE was computed by using

The same plot as in Fig.

We proposed a method for wavelet denoising of signals when prior information about the size of the signal is available. Simple, level-dependent shrinkage rules that are Bayes with respect to a prior supported on three discrete points are at the same time Γ-minimax for the set of all priors with bounded support, symmetric about zero, with a fixed point mass at zero. This statement is true when the signal is bounded on

As demonstrated by simulations on a battery of standard test functions, the performance of the proposed shrinkage rules in terms of AMSE is comparable, and for some signals superior, to that of state-of-the-art methods when the SNR is low and the signals are smooth.

Let

Define the family of prior distributions

Consider the classes

This shows that a least favorable prior distribution exists for general

Denote the least favorable prior on

Model I: Frequentist risks of W-, VVV-, and V-shape,

Now consider the special case that

In the entire one parameter exponential family, the risk function

In the setup of Model I, the point mass at 0 is a part of every prior; the second component

The corresponding Bayes rule
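A numerical sketch of a Bayes rule of this kind, the posterior mean under a symmetric three-point prior with likelihood N(theta, sigma^2), is given below; the particular parameter values are illustrative, not the calibrated values from the paper.

```python
import math

def phi(z):
    # Standard normal density.
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def bayes_rule(x, m, p0, sigma=1.0):
    """Posterior mean of theta under the symmetric three-point prior
    p0 * delta_0 + (1 - p0)/2 * (delta_{-m} + delta_{m}),
    with an N(theta, sigma^2) likelihood for the observed coefficient x."""
    lm = phi((x - m) / sigma)   # likelihood at theta = +m
    l0 = phi(x / sigma)         # likelihood at theta = 0
    lp = phi((x + m) / sigma)   # likelihood at theta = -m
    num = m * (1.0 - p0) / 2.0 * (lm - lp)
    den = p0 * l0 + (1.0 - p0) / 2.0 * (lm + lp)
    return num / den
```

The rule is odd, maps 0 to 0, shrinks every observation into (-m, m), and approaches the boundary points +/- m for large |x|, which is the qualitative behavior of a shrinkage rule supported by a bounded prior.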

To find values of

For small values of

This is a typical shape for the risk under the least favorable distribution in a class of all priors bounded on

For values of

The case when

If in Model II the normal likelihood is replaced by the marginal likelihood, after

Model II: Frequentist risk of the Γ-minimax rule,

The argument is similar to that for Model I, for the W-shape (Left panel in Fig.

The authors thank the editor and the three anonymous referees for careful reading of the manuscript and constructive comments.