bars {BARS} | R Documentation |
Solves the generalized nonparametric regression (curve-fitting) problem by assuming the function f(x) may be approximated with a cubic spline. Utilizes reversible-jump MCMC on the sets of knots, allowing the spline to adapt to the data across the domain of x.
bars<-(x,y,iknots=25,prior="uniform", priorparam=c(1,60),burnin=200,sims=2000,tau=50.0,c=0.4,fits=TRUE, peak=FALSE,conf=0.95,gridfit=FALSE,ngridpoints=150)
Required: x,y
Optional:
Initial knots settings: iknots
Settings on prior for knots: prior, priorparam
MCMC settings: burnin, nsims, tau, c
Output settings: fits, peak, conf
Other: gridfit, ngridpoints
x |
a vector of the independent variable, in increasing order |
y |
a vector of the dependent variable |
iknots |
the initial number of knots for the spline [default = 25] |
prior |
the type of prior being used for the knots; the only acceptable answers are "poisson", "uniform", and "user" [default is "uniform"] |
priorparam |
the parameter for the prior (See Details) |
burnin |
the desired length of the burn-in for the MCMC chain [default = 200] |
nsims |
the number of simulations (number of draws from the posterior) desired for the MCMC chain [default = 2000] |
tau |
a parameter that controls the spread for the knot proposal distribution [default = 50.0] |
c |
a paramater that controls the probability of birth and death candidates [default = 0.4] |
fits |
a logical. The default 'TRUE' will cause the program to return the fitted values for each data point for each run of the simulation. Please note that if the number of data points and/or simulations is large, there may be a lengthy delay as the necessary data are read. |
peak |
a logical. 'TRUE' will cause the program to return the location and height of the highest point on the fitted curve [default = 'FALSE'] |
conf |
for use with peak. Sets the probability for the credible intervals for the location and height of the peak [default = 0.95 for 95% credible intervals] |
ngridpoints |
the number of bins one desires the x-axis to be divided into. Used to handle unbinned data and calculate posterior modes [default = 150] |
Only x and y are required arguments. Unspecified arguments will use default values:
bars(x, y)
bars(x, y, iknots=5)
For the parameter for the prior:
a. if using Poisson, the choice for lambda = mean
b. if using Uniform, a vector of length 2 which includes the minimum number of knots followed by the maximum number of knots [default = c(1,60)]
c. if using user-defined prior, a matrix with 2 columns. The first column should be the number of knots and the second column should be the probability of obtaining this number of knots. Note the following example:
2 | 0.05 |
3 | 0.15 |
4 | 0.30 |
5 | 0.30 |
6 | 0.10 |
7 | 0.10 |
Note that due to the large amount of output generated by the program, it is desirable to save the results of the program into a variable, such as:
out <- bars(x,y,.....)
a list containing the following: Always: postmeans, postmodes, simnum, nknots, sampknots, sampBICs, sampllikes Optional, if fits = TRUE: sampfits Optional, if gridfit = TRUE: gridmeans, gridmodes Optional, if peak = TRUE: samplpeaks, samphpeaks, peaklocationquantile, peaklocationmean, peaklocationmode, peakheightquantile, peakheightmean, peakheightmode
postmeans |
vector of the posterior means of f(x) evaluated at the x values |
postmodes |
vector of the posterior modes evaluated at the x values |
simnum |
vector of each simulation number (the number of that draw from the posterior), beginning at burnin + 1 and ending at burnin + nsims |
nknots |
vector of the number of knots used at each draw from the posterior (does not include burnin iterations) |
sampknots |
matrix containing the position of the knots at each iteration. Length of the matrix is equal to the number of iterations, not including burnin iterations, with the width of the matrix equal to the maximum number of knots at any iteration. NAs are used to fill in the matrix at iteration numbers that have less than the maximum number of knots. |
sampBICs |
vector of the calculated BIC at each draw from the posterior (does not include burnin iterations) |
sampllikes |
vector of the calculated loglikelihood at each draw from the posterior (does not include burnin iterations) |
sampfits |
optional, if fits='TRUE': matrix of fits for each draw from the posterior (does not include burnin iterations), with the rows of the matrix corresponding to the individual draw from the posterior. The columns represent the fits at each value of x. |
gridmeans |
optional, if gridfit='TRUE': vector of the posterior means evaluated at equally spaced intervals with number of intervals determined by value of "ngridpoints" |
gridmodes |
optional, if gridfit='TRUE': vector of the posterior modes evaluated at equally spaced intervals with number of intervals determined by value of "ngridpoints" |
samplpeaks |
optional, if peaks='TRUE': vector of the x location of the highest point in the fitted curve for each draw from the posterior (does not include burnin iterations) |
samphpeaks |
optional, if peaks='TRUE': vector of the y value (height) of the highest point in the fitted curve for each draw from the posterior (does not include burnin iterations) |
peaklocationquantile |
optional, if peaks='TRUE': a credible interval for the x location of the highest peak; width of the interval is dependent upon the setting chosen for conf |
peaklocationmean |
optional, if peaks='TRUE': the mean x location for the highest peak |
peaklocationmode |
optional, if peaks='TRUE': the mode x location for the highest peak |
peakheightquantile |
optional, if peaks='TRUE': a credible interval for the y value (height) of the heighest peak; width of the interval is dependent upon the setting chosen for conf |
peakheightmean |
optional, if peaks='TRUE': the mean y value (height) of the heighest peak |
peakheightmode |
optional, if peaks='TRUE': the mode y value (height) of the highest peak |
Garrick Wallstrom
This function is designed to work with Wallstrom and Kass's Barslib, (C) 2003.
Barslib uses Hansen and Kooperberg's Logspline, (C) 1997, as the default method for selecting initial knots.
Barslib uses Bates and Venables Routines for manipulating B-splines, (C) 1998. The routines were released under the GNU General Public License version
see http://www.stat.cmu.edu/~kass/bars/bars.html
x<-c(1:110) mu<-20*abs(exp(-x/100)*cos(x/5)) y<-rnorm(110,mu,1) out = bars(x,y,prior="uniform",priorparam=c(1,15),fits=TRUE) upper=apply(out$sampfits,2,quantile,0.975) lower=apply(out$sampfits,2,quantile,0.025) matplot(x,cbind(upper,out$postmeans,lower),lty=c(2,1,2),type='l') points(x,y)