awsdens {aws}R Documentation

local constant density estimation

Description

This function uses adaptive weights smoothing for density estimation. The estimate is constructed using poisson regression and the structural assumption of a local constant density. Implementation is for 1D, 2D and 3D problems, although it applications may be interesting only in 2D and 3D.

Usage

awsdens(y, ngrid = NULL, nempty = NULL, qlambda = NULL, eta = 0.5, 
        lkern = "Triangle", fu = NULL, hinit = 1, hincr = 1.2, 
    hmax = NULL, graph = FALSE, demo = FALSE, symmetric = TRUE)

Arguments

y y contains the observed random variables or vectors. In case of random vectors y is supposed to be an observation matrix with columns containing the observed vectors
ngrid ngrid determines the size of the grid used for binning. If ngrid is a vector its components are supposed to determine the number of intervals for the corresponding component of y; if it is a scalar rep(ngrid,dim(y)[1]) is used. If is.null(ngrid) a default of 2*n^{1/d}, with dim(y)=c(d,n), is chosen, i.e. a grid with 2^d*n bins.
nempty determines the width of a boundary region of the grid containing only empty bins. nempty defaults to 0.1*ngrid. The grid generated is equispaced with prod(ngrid) bins and grid-coordinates (nempty+1)[i] and (ngrid+1-nempty)[i] determined by min(y[i,]) and max(y[i,]).
qlambda qlambda determines the scale parameter qlambda for the stochastic penalty. The scaling parameter in the stochastic penalty lambda is choosen as the qlambda-quantile of a Chi-square-distribution with number of parameters in the polynomial model as degrees of freedom. If qlambda=NULL a standard value depending on model and symmetric is choosen.
eta eta is a memory parameter used to stabilize the procedure. eta has to be between 0 and 1, with eta=.5 being the default.
lkern lkern determines the location kernel to be used. Options are "Uniform", "Triangle", "Quadratic", "Cubic" and "Exponential". Default is "Triangle". The Kernel operates on the squared distance, so "Triangle" corresponds to the use of an Epanechnikov kernel in kernel smoothing. "Exponential" requires larger values of hmax and therefore more iterations to reach comparable results.
fu fu can be used to specify a function to calculate the values of the true density on the grid. This may be used for test purposes to calculate Mean Squared Error (MSE) and Mean Absolute Error (MAE)
hinit hinit Initial bandwidth for the location penalty. Appropriate value is choosen in case of hinit==NULL
hincr hincr hincr^(1/d), with d the dimensionality of the design, is used as a factor to increase the bandwidth between iterations. Defauts to hincr=1.2
hmax hmax Maximal bandwidth to be used. Determines the number of iterations and is used as the stopping rule.
graph graph if TRUE results are displayed after each iteration step.
demo demo if TRUE after each iteration step results are displayed and the process waits for user interaction.
symmetric If symmetric==TRUE the stochastic penalty is symmetrized, i.e. (sij + sji)/lambda is used instead of sij/lambda. See references for details.

Details

The density estimate ist constructed using an approach proposed by Lindsay (1974a,b). 1, 2 or 3-dimensional binning is used to produce counts, i.e. numbers of observations, per bin, with bins defined by a regular grid. THe bin-counts are then considered as poisson random variables with intensity depending on the location within the grid. Adaptive Weights Smoothing for poisson regression models, i.e. function laws with parameter model="Poisson", is used to construct a location dependent intensity estimate which is then transformed into a density estimate, see Section 6 in Polzehl and Spokoiny (2002b) for details.

Value

The returned object is a list with components

bin Bin counts
dens density values for the bin. Values correspond to the grid-centers determined by component xgrid
xgrid list with components containing the coordinates of bin-centers. The dim(y)[1] components of xgrid correspond to the components of the grid-coordinates.
call actual function call

Author(s)

Joerg Polzehl, polzehl@wias-berlin.de

References

Polzehl, J. and Spokoiny, V. (2002). Local likelihood modelling by adaptive weights smoothing, WIAS-Preprint 787
Polzehl, J. and Spokoiny, V. (2000). Adaptive Weights Smoothing with applications to image restoration, J.R.Statist.Soc. B, 62, Part 2, pp.335-354
Lindsay, J. (1974a). Comparison of probabbility distributions, J. Royal Statist. Soc. Ser. B 36, 38-47.
Lindsay, J. (1974b). Construction and comparison of statistical models, J. Royal Statist. Soc. Ser. B 36, 418-425.

See Also

SEE ALSO aws, laws

Examples

###
###   univariate density estimation
###
f0 <- function(x) 1.5*(x>=0)-(x>=.25)+(x>=.75)-1.5*(x>1)
set.seed(1)
m <- 1000
x1 <- runif(m)
ind <- (runif(m)<f0(x1)/1.5)
n <- 200
y <- x1[ind][1:n]
tmp0 <- awsdens(y,440,20,hmax=2000)
plot(tmp0$xgrid[[1]],tmp0$dens,type="l",lwd=2,
     ylim =range(0,1.5,tmp0$dens),xlab="x",ylab="Density")
lines(tmp0$xgrid[[1]],f0(tmp0$xgrid[[1]]),lty=3,col=2,lwd=2)
title("Estimated and true density (n=200)")
###
###   bivariate density estimation
###
f1 <- function(x,y) 7.5*pmax(x*(1-x^2-y^2),0)*(x>0)*(y>0)
set.seed(1)
m <- 10000
x1 <- runif(m)
x2 <- runif(m)
fx12 <- f1(x1,x2)
ind <- (runif(m)<f1(x1,x2)/.385/7.5)
n <- 2500
y <- rbind(x1[ind],x2[ind])[,1:n]
tmp <- awsdens(y,120,10,hmax=5)
image(tmp$xgrid[[1]],tmp$xgrid[[2]],tmp$dens,xlab="x1",ylab="x2",col=gray(.5+(255:0)/511),lwd=2)
#   thats the estimated density on the grid
lines(seq(0,1,.01),sqrt(1-seq(0,1,.01)^2),col=2,lty=2,lwd=2)
lines(c(0,1),c(0,0),col=2,lty=2,lwd=2)
lines(c(0,0),c(0,1),col=2,lty=2,lwd=2)
#    thats the boundary of the support
title("Estimated density (n=2500)")
#    now add contour lines of the estimated density
contour(tmp$xgrid[[1]],tmp$xgrid[[2]],tmp$dens,nlevels=20,add=TRUE,col=1,lty=1,labcex=.1)
rm(f0,m,x1,ind,n,y,tmp0,f1,x2,fx12,tmp)

[Package aws version 1.3-0 Index]