awsdens {aws} R Documentation

## local constant density estimation

### Description

This function uses adaptive weights smoothing for density estimation. The estimate is constructed using poisson regression and the structural assumption of a local constant density. Implementation is for 1D, 2D and 3D problems, although it applications may be interesting only in 2D and 3D.

### Usage

```awsdens(y, ngrid = NULL, nempty = NULL, qlambda = NULL, eta = 0.5,
lkern = "Triangle", fu = NULL, hinit = 1, hincr = 1.2,
hmax = NULL, graph = FALSE, demo = FALSE, symmetric = TRUE)
```

### Arguments

 `y` `y` contains the observed random variables or vectors. In case of random vectors `y` is supposed to be an observation matrix with columns containing the observed vectors `ngrid` `ngrid` determines the size of the grid used for binning. If `ngrid` is a vector its components are supposed to determine the number of intervals for the corresponding component of `y`; if it is a scalar `rep(ngrid,dim(y))` is used. If `is.null(ngrid)` a default of `2*n^{1/d}`, with `dim(y)=c(d,n)`, is chosen, i.e. a grid with `2^d*n` bins. `nempty` determines the width of a boundary region of the grid containing only empty bins. `nempty` defaults to `0.1*ngrid`. The grid generated is equispaced with `prod(ngrid)` bins and grid-coordinates `(nempty+1)[i]` and `(ngrid+1-nempty)[i]` determined by `min(y[i,])` and `max(y[i,])`. `qlambda` `qlambda` determines the scale parameter `qlambda` for the stochastic penalty. The scaling parameter in the stochastic penalty `lambda` is choosen as the `qlambda`-quantile of a Chi-square-distribution with number of parameters in the polynomial model as degrees of freedom. If `qlambda=NULL` a standard value depending on `model` and `symmetric` is choosen. `eta` `eta` is a memory parameter used to stabilize the procedure. `eta` has to be between `0` and `1`, with `eta=.5` being the default. `lkern` `lkern` determines the location kernel to be used. Options are `"Uniform"`, `"Triangle"`, `"Quadratic"`, `"Cubic"` and `"Exponential"`. Default is `"Triangle"`. The Kernel operates on the squared distance, so `"Triangle"` corresponds to the use of an Epanechnikov kernel in kernel smoothing. `"Exponential"` requires larger values of `hmax` and therefore more iterations to reach comparable results. `fu` `fu` can be used to specify a function to calculate the values of the true density on the grid. This may be used for test purposes to calculate Mean Squared Error (MSE) and Mean Absolute Error (MAE) `hinit` `hinit` Initial bandwidth for the location penalty. Appropriate value is choosen in case of `hinit==NULL` `hincr` `hincr` `hincr^(1/d)`, with `d` the dimensionality of the design, is used as a factor to increase the bandwidth between iterations. Defauts to `hincr=1.2` `hmax` `hmax` Maximal bandwidth to be used. Determines the number of iterations and is used as the stopping rule. `graph` `graph` if `TRUE` results are displayed after each iteration step. `demo` `demo` if `TRUE` after each iteration step results are displayed and the process waits for user interaction. `symmetric` If `symmetric==TRUE` the stochastic penalty is symmetrized, i.e. `(sij + sji)/lambda` is used instead of `sij/lambda`. See references for details.

### Details

The density estimate ist constructed using an approach proposed by Lindsay (1974a,b). 1, 2 or 3-dimensional binning is used to produce counts, i.e. numbers of observations, per bin, with bins defined by a regular grid. THe bin-counts are then considered as poisson random variables with intensity depending on the location within the grid. Adaptive Weights Smoothing for poisson regression models, i.e. function `laws` with parameter `model="Poisson"`, is used to construct a location dependent intensity estimate which is then transformed into a density estimate, see Section 6 in Polzehl and Spokoiny (2002b) for details.

### Value

The returned object is a list with components

 `bin ` Bin counts `dens ` density values for the bin. Values correspond to the grid-centers determined by component `xgrid` `xgrid ` list with components containing the coordinates of bin-centers. The `dim(y)` components of `xgrid` correspond to the components of the grid-coordinates. `call ` actual function call

### Author(s)

Joerg Polzehl, polzehl@wias-berlin.de

### References

Polzehl, J. and Spokoiny, V. (2002). Local likelihood modelling by adaptive weights smoothing, WIAS-Preprint 787
Polzehl, J. and Spokoiny, V. (2000). Adaptive Weights Smoothing with applications to image restoration, J.R.Statist.Soc. B, 62, Part 2, pp.335-354
Lindsay, J. (1974a). Comparison of probabbility distributions, J. Royal Statist. Soc. Ser. B 36, 38-47.
Lindsay, J. (1974b). Construction and comparison of statistical models, J. Royal Statist. Soc. Ser. B 36, 418-425.

SEE ALSO `aws`, `laws`

### Examples

```###
###   univariate density estimation
###
f0 <- function(x) 1.5*(x>=0)-(x>=.25)+(x>=.75)-1.5*(x>1)
set.seed(1)
m <- 1000
x1 <- runif(m)
ind <- (runif(m)<f0(x1)/1.5)
n <- 200
y <- x1[ind][1:n]
tmp0 <- awsdens(y,440,20,hmax=2000)
plot(tmp0\$xgrid[],tmp0\$dens,type="l",lwd=2,
ylim =range(0,1.5,tmp0\$dens),xlab="x",ylab="Density")
lines(tmp0\$xgrid[],f0(tmp0\$xgrid[]),lty=3,col=2,lwd=2)
title("Estimated and true density (n=200)")
###
###   bivariate density estimation
###
f1 <- function(x,y) 7.5*pmax(x*(1-x^2-y^2),0)*(x>0)*(y>0)
set.seed(1)
m <- 10000
x1 <- runif(m)
x2 <- runif(m)
fx12 <- f1(x1,x2)
ind <- (runif(m)<f1(x1,x2)/.385/7.5)
n <- 2500
y <- rbind(x1[ind],x2[ind])[,1:n]
tmp <- awsdens(y,120,10,hmax=5)
image(tmp\$xgrid[],tmp\$xgrid[],tmp\$dens,xlab="x1",ylab="x2",col=gray(.5+(255:0)/511),lwd=2)
#   thats the estimated density on the grid
lines(seq(0,1,.01),sqrt(1-seq(0,1,.01)^2),col=2,lty=2,lwd=2)
lines(c(0,1),c(0,0),col=2,lty=2,lwd=2)
lines(c(0,0),c(0,1),col=2,lty=2,lwd=2)
#    thats the boundary of the support
title("Estimated density (n=2500)")
#    now add contour lines of the estimated density