varset {ipred}    R Documentation

Simulation Model


Three sets of variables are calculated: explanatory, intermediate and response variables.


varset(N, sigma=0.1, theta=90, threshold=0, u=1:3)


N number of simulated observations.
sigma standard deviation of the error term.
theta angle between two u vectors.
threshold cutpoint for classifying to 0 or 1.
u starting values.


For each observation, values of two explanatory variables x = (x_1, x_2)^\top and of two responses y = (y_1, y_2)^\top are simulated according to the formula:

y = U x + e = \begin{pmatrix} u_1^\top \\ u_2^\top \end{pmatrix} x + e

where x is a realisation of a standard normal random variable and e is drawn from a normal distribution with standard deviation sigma. U is a 2 x 2 matrix, where

u_1 = \begin{pmatrix} u_{1,1} \\ u_{1,2} \end{pmatrix}, u_2 = \begin{pmatrix} u_{2,1} \\ u_{2,2} \end{pmatrix}, \|u_1\| = \|u_2\| = 1,

i.e. a matrix of two normalised vectors.
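
To see what theta controls, the covariance of y can be worked out from the model. This is a short derivation under assumptions that this page does not state explicitly: the components of x are independent with unit variance, the components of e are independent with variance sigma^2 and independent of x, and theta is the angle in degrees between u_1 and u_2.

```latex
\operatorname{Cov}(y)
  = U \operatorname{Cov}(x)\, U^\top + \operatorname{Cov}(e)
  = U U^\top + \sigma^2 I
  = \begin{pmatrix}
      1 + \sigma^2 & u_1^\top u_2 \\
      u_1^\top u_2 & 1 + \sigma^2
    \end{pmatrix},
\qquad
u_1^\top u_2 = \|u_1\|\,\|u_2\| \cos\theta = \cos\theta ,

\operatorname{Corr}(y_1, y_2) = \frac{\cos\theta}{1 + \sigma^2}.
```

Under these assumptions, theta = 90 gives uncorrelated intermediate variables, while theta = 0 with sigma = 0.1 gives a correlation of 1/1.01, roughly 0.99.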


A list containing the following components:

explanatory N*2 matrix of 2 explanatory variables.
intermediate N*2 matrix of 2 intermediate variables.
response response vectors with values 0 or 1.
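
The model described above can be sketched in base R. This is a minimal illustration, not the ipred implementation: the helper name `simulate_sketch` is hypothetical, U is built directly from theta (in degrees, an assumption) rather than from the starting values u, and each intermediate variable is classified separately against the threshold, which is also an assumption about how the responses are formed.

```r
# Hypothetical helper: a sketch of the simulation model, NOT ipred::varset.
simulate_sketch <- function(N, sigma = 0.1, theta = 90, threshold = 0) {
  rad <- theta * pi / 180
  # Assumed construction: rows of U are unit vectors separated by angle theta.
  U <- rbind(c(1, 0), c(cos(rad), sin(rad)))
  # One row per observation: standard normal x and noise e with sd sigma.
  x <- matrix(rnorm(2 * N), ncol = 2)
  e <- matrix(rnorm(2 * N, sd = sigma), ncol = 2)
  # Row i of y equals U %*% x_i + e_i.
  y <- x %*% t(U) + e
  # Assumed rule: code each intermediate value as 1 if above the threshold, else 0.
  z <- (y > threshold) * 1L
  list(explanatory = x, intermediate = y, response = z)
}

set.seed(1)
sim <- simulate_sketch(N = 500, theta = 90)
str(sim)
```

The returned list mirrors the three components listed above, so `sim$intermediate` can be plotted or tabulated in the same way as the output of `varset`.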


Andrea Peters


David J. Hand, Hua Gui Li, Niall M. Adams (2001), Supervised classification with structured class definitions. Computational Statistics & Data Analysis 36, 209–225.


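theta90 <- varset(N = 1000, sigma = 0.1, theta = 90, threshold = 0)
theta0 <- varset(N = 1000, sigma = 0.1, theta = 0, threshold = 0)
# Compare the intermediate variables of both settings side by side
par(mfrow = c(1, 2))
plot(theta0$intermediate)
plot(theta90$intermediate)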

[Package ipred version 0.8-1 Index]