Function collection p to w #265

Open
wants to merge 12 commits into
base: master
3 changes: 2 additions & 1 deletion BayesianTools/R/MAP.R
@@ -1,8 +1,9 @@

#' calculates the Maxiumum APosteriori value (MAP)
#' @author Florian Hartig
#' @param bayesianOutput an object of class BayesianOutput (mcmcSampler, smcSampler, or mcmcList)
#' @param ... optional values to be passed on to the getSample function
#' @details Currently, this function simply returns the parameter combination with the highest posterior in the chain. A more refined option would be to take the MCMC sample and do additional calculations, e.g. use an optimizer, a kerne delnsity estimator, or some other tool to search / interpolate around the best value in the chain
#' @details Currently, this function simply returns the parameter combination with the highest posterior in the chain. A more refined option would be to take the MCMC sample and do additional calculations, e.g. use an optimizer, a kernel density estimator, or some other tool to search / interpolate around the best value in the chain.
#' @seealso \code{\link{WAIC}}, \code{\link{DIC}}, \code{\link{marginalLikelihood}}
#' @export
MAP <- function(bayesianOutput, ...){
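The behaviour described in the details above (return the parameter combination with the highest posterior in the chain) can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: `mapFromChain` and the column name `LP` are hypothetical names chosen for this example.

```r
# Hypothetical helper: pick the sampled parameter vector with the highest
# log posterior. Assumes `chain` is a numeric matrix whose column `lpCol`
# holds the log posterior of each row (an assumption of this sketch).
mapFromChain <- function(chain, lpCol = "LP") {
  best <- which.max(chain[, lpCol])
  pars <- chain[best, colnames(chain) != lpCol, drop = TRUE]
  list(parametersMAP = pars, logPosteriorMAP = unname(chain[best, lpCol]))
}

# Toy chain: two parameters plus their log posterior values
chain <- cbind(a  = c(0.1, 0.5, 0.9),
               b  = c(1.0, 2.0, 3.0),
               LP = c(-3.2, -1.1, -2.7))
res <- mapFromChain(chain)
```

A refinement along the lines mentioned in the details would start an optimizer or a density estimate from `res$parametersMAP` rather than stopping at the best sampled value.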
6 changes: 2 additions & 4 deletions BayesianTools/R/SBC.R
@@ -1,11 +1,9 @@
#' Simulation-based calibration tests
#'
#' This function performs simulation-based calibration tests based on the idea that posteriors averaged over the prior should yield the prior.
#'
#' @param posteriorList a list with posterior samples. List items must be of a class that is supported by \code{\link{getSample}}. This includes BayesianTools objects, but also matrix and data.frame
#' @param priorDraws a matrix with parameter values, drawn from the prior, that were used to simulate the data underlying the posteriorList. If colnames are provided, these will be used in the plots
#' @param posteriorList a list of posterior samples. List items must be of a class that is supported by \code{\link{getSample}}. This includes BayesianTools objects, but also matrix and data.frame objects.
#' @param priorDraws a matrix of parameter values, drawn from the prior, that were used to simulate the data underlying the posteriorList. If colnames are provided, they are used in the plots
#' @param ... arguments to be passed to \code{\link{getSample}}. Consider in particular the thinning option.
#'
#' @details The purpose of this function is to evaluate the results of a simulation-based calibration of an MCMC analysis.
#'
#' Briefly, the idea is to repeatedly
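The idea that posteriors averaged over the prior should yield the prior can be illustrated with a small base-R simulation. The sketch below follows the general simulation-based calibration recipe and does not call the package's function. The conjugate normal model, the variable names, and the sample sizes are all assumptions made for this example.

```r
# Sketch of simulation-based calibration for a conjugate normal model.
# Assumptions of this example: prior mu ~ N(0, 1), data y_i ~ N(mu, 1),
# so the posterior is N(sum(y) / (n + 1), 1 / (n + 1)) in closed form.
set.seed(1)
nSim  <- 500  # number of prior draws / simulated data sets
nObs  <- 10   # observations per data set
nPost <- 99   # posterior draws per data set

ranks <- integer(nSim)
for (i in seq_len(nSim)) {
  muTrue <- rnorm(1, 0, 1)                     # 1) draw from the prior
  y <- rnorm(nObs, muTrue, 1)                  # 2) simulate data
  post <- rnorm(nPost, sum(y) / (nObs + 1),    # 3) sample the posterior
                sqrt(1 / (nObs + 1)))
  ranks[i] <- sum(post < muTrue)               # 4) rank of the prior draw
}

# With a calibrated sampler, the ranks are roughly uniform on 0..nPost
rankCounts <- table(cut(ranks, breaks = seq(-0.5, nPost + 0.5, length.out = 11)))
```

Strongly uneven `rankCounts` (for example a U shape or a peak in the middle) would point to a posterior that is too narrow or too wide relative to the prior draws.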
7 changes: 4 additions & 3 deletions BayesianTools/R/SMC.R
@@ -1,13 +1,14 @@

#' SMC sampler
#' @author Florian Hartig
#' @description Sequential Monte Carlo Sampler
#' @param bayesianSetup either an object of class bayesianSetup created by \code{\link{createBayesianSetup}} (recommended), or a log target function
#' @param initialParticles initial particles - either a draw from the prior, provided as a matrix with the single parameters as columns and each row being one particle (parameter vector), or a numeric value with the number of desired particles. In this case, the sampling option must be provided in the prior of the BayesianSetup.
#' @param iterations number of iterations
#' @param resampling if new particles should be created at each iteration
#' @param resampling logical, specifies whether new particles should be created at each iteration
#' @param resamplingSteps how many resampling (MCMC) steps between the iterations
#' @param proposal optional proposal class
#' @param adaptive should the covariance of the proposal be adapted during sampling
#' @param proposal optional, proposal class
#' @param adaptive logical, should the covariance of the proposal be adapted during sampling?
#' @param proposalScale scaling factor for the proposal generation. Can be adapted if there is too much / too little rejection
#' @details The sampler can be used for rejection sampling as well as for sequential Monte Carlo. For the former case set the iterations to one.
#'
9 changes: 5 additions & 4 deletions BayesianTools/R/VSEM.R
@@ -1,9 +1,10 @@

#' Very simple ecosystem model
#' @description A very simple ecosystem model, based on three carbon pools and a basic LUE model
#' @param pars a parameter vector with parameters and initial states
#' @param PAR Forcing, photosynthetically active radiation (PAR) MJ /m2 /day
#' @param C switch to choose whether to use the C or R version of the model. C is much faster.
#' @return a matrix with colums NEE, CV, CR and CS units and explanations see details
#' @return a matrix with columns NEE, CV, CR and CS; for units and explanations, see details
#' @import Rcpp
#' @useDynLib BayesianTools, .registration = TRUE
#' @details This Very Simple Ecosystem Model (VSEM) is a 'toy' model designed to be very simple but yet bear some resemblance to deterministic processed based ecosystem models (PBMs) that are commonly used in forest modelling.
@@ -12,7 +13,7 @@
#'
#' The model calculates Gross Primary Productivity (GPP) using a very simple light-use efficiency (LUE) formulation multiplied by light interception. Light interception is calculated via Beer's law with a constant light extinction coefficient operating on Leaf Area Index (LAI).
#'
#' A parameter (GAMMA) determines the fraction of GPP that is autotrophic respiration. The Net Primary Productivity (NPP) is then allocated to above and below-ground vegetation via a fixed allocation fraction. Carbon is lost from the plant pools to a single soil pool via fixed turnover rates. Heterotropic respiration in the soil is determined via a soil turnover rate.
#' A parameter (GAMMA) determines the fraction of GPP that is autotrophic respiration. The Net Primary Productivity (NPP) is then allocated to above and below-ground vegetation via a fixed allocation fraction. Carbon is lost from the plant pools to a single soil pool via fixed turnover rates. Heterotrophic respiration in the soil is determined via a soil turnover rate.
#'
#' The model equations are
#'
@@ -166,8 +167,8 @@ VSEMcreatePAR <- function(days = 1:(3*365)){

#' Create an example dataset, and from that a likelihood or posterior for the VSEM model
#' @author Florian Hartig
#' @param likelihoodOnly switch to devide whether to create only a likelihood, or a full bayesianSetup with uniform priors.
#' @param plot switch to decide whether data should be plotted
#' @param likelihoodOnly logical, decides whether to create only a likelihood, or a full bayesianSetup with uniform priors.
#' @param plot logical, decides whether data should be plotted
#' @param selection vector containing the indices of the selected parameters
#' @details The purpose of this function is to be able to conveniently create a likelihood for the VSEM model for demonstration purposes. The function creates example data --> likelihood --> BayesianSetup, where the latter is the
#' @export
3 changes: 1 addition & 2 deletions BayesianTools/R/WAIC.R
@@ -1,4 +1,3 @@

# TODO - implement WAIC as AIC, can look at https://github.com/jrnold/mcmcStats/blob/master/R/waic.R, check against http://finzi.psych.upenn.edu/library/blmeco/html/WAIC.html, https://cran.r-project.org/web/packages/loo/index.html, http://stats.stackexchange.com/questions/173128/watanabe-akaike-widely-applicable-information-criterion-waic-using-pymc


@@ -7,7 +6,7 @@
#' @param bayesianOutput an object of class BayesianOutput. Must implement a log-likelihood density function that can return point-wise log-likelihood values ("sum" argument).
#' @param numSamples the number of samples to calculate the WAIC
#' @param ... optional values to be passed on to the getSample function
#' @note The function requires that the likelihood passed on to BayesianSetup contains the option sum = T/F, with defaul F. If set to true, the likelihood for each data point must be returned.
#' @note The function requires that the likelihood passed on to BayesianSetup contains the option sum = T/F, with default F. If set to true, the likelihood for each data point must be returned.
#' @details
#'
#'
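To show why point-wise log-likelihood values are needed, the following base-R sketch computes WAIC from a matrix of point-wise log-likelihoods using the standard definition (log point-wise predictive density minus the summed posterior variance of the log-likelihood). It is not the package's implementation. The name `waicFromPointwise`, the matrix layout, and the toy data are assumptions of this example.

```r
# Sketch: WAIC from a matrix of point-wise log-likelihoods.
# Assumption: `ll` has one row per posterior sample and one column per
# data point; `waicFromPointwise` is a hypothetical name.
waicFromPointwise <- function(ll) {
  # log point-wise predictive density, computed stably via log-sum-exp
  lppd <- sum(apply(ll, 2, function(x) {
    m <- max(x)
    m + log(mean(exp(x - m)))
  }))
  # effective number of parameters: posterior variance of the log-likelihood
  pWAIC <- sum(apply(ll, 2, var))
  c(lppd = lppd, pWAIC = pWAIC, WAIC = -2 * (lppd - pWAIC))
}

set.seed(42)
y  <- rnorm(20, mean = 1)                   # toy data
mu <- rnorm(1000, mean(y), 1 / sqrt(20))    # toy posterior draws for the mean
ll <- sapply(y, function(yi) dnorm(yi, mean = mu, sd = 1, log = TRUE))
waicResult <- waicFromPointwise(ll)
```

A likelihood that only returns the summed log-likelihood cannot provide the columns of `ll`, which is why the note above asks for a likelihood with a `sum` option.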
4 changes: 2 additions & 2 deletions BayesianTools/R/blockUpdate.R
@@ -2,7 +2,7 @@
#' Determine the groups of correlated parameters
#' @author Stefan Paul
#' @param chain MCMC chain including only the parameters (not logP,ll, logP)
#' @param blockSettings list with settings
#' @param blockSettings a list with settings
#' @return groups
#' @keywords internal
updateGroups <- function(chain,blockSettings){
@@ -60,7 +60,7 @@ getBlock <- function(blockSettings){


#' getblockSettings
#' @description Transforms the original settings in settings used in the model runs
#' @description Transforms the original settings to settings used in the model runs
#' @param blockUpdate input settings
#' @return list with block settings
#' @keywords internal
13 changes: 6 additions & 7 deletions BayesianTools/R/classBayesianOutput.R
@@ -1,21 +1,20 @@
# NOTE: The functions in this class are just templates that are to be implemented for all subclasses of BayesianOutput. They are not functional.


#' Extracts the sample from a bayesianOutput
#' @author Florian Hartig
#' @param sampler an object of class mcmcSampler, mcmcSamplerList, smcSampler, smcSamplerList, mcmc, mcmc.list, double, numeric
#' @param parametersOnly for a BT output, if F, likelihood, posterior and prior values are also provided in the output
#' @param coda works only for mcmc classes - provides output as a coda object. Note: if mcmcSamplerList contains mcmc samplers such as DE that have several chains, the internal chains will be collapsed. This may not be the desired behavior for all applications.
#' @param start for mcmc samplers start value in the chain. For SMC samplers, start particle
#' @param coda works only for mcmc classes - returns output as a coda object. Note: if mcmcSamplerList contains mcmc samplers such as DE that have several chains, the internal chains will be collapsed. This may not be desired for all applications.
#' @param start for mcmc samplers, start value in the chain. For SMC samplers, start particle
#' @param end for mcmc samplers end value in the chain. For SMC samplers, end particle
#' @param thin thinning parameter. Either an integer determining the thinning intervall (default is 1) or "auto" for automatic thinning.
#' @param numSamples sample size (only used if thin = 1). If you want to use numSamples set thin to 1.
#' @param thin thinning parameter. Either an integer determining the thinning interval (default is 1) or "auto" for automatic thinning.
#' @param numSamples sample size (only used if thin = 1). If you want to use numSamples, set thin to 1.
#' @param whichParameters possibility to select parameters by index
#' @param reportDiagnostics logical, determines whether settings should be included in the output
#' @param ... further arguments
#' @example /inst/examples/getSampleHelp.R
#' @details If thin is greater than the total number of samples in the sampler object the first and the last element (of each chain if a sampler with multiples chains is used) are sampled. If numSamples is greater than the total number of samples all samples are selected. In both cases a warning is displayed.
#' @details If thin and numSamples is passed, the function will use the thin argument if it is valid and greater than 1, else numSamples will be used.
#' @details If thin is greater than the total number of samples in the sampler object, the first and the last element (of each chain if a sampler with multiple chains is used) are sampled. If numSamples is greater than the total number of samples, all samples are selected. A warning will be displayed in both cases.
#' @details If both thin and numSamples are provided, the function will use thin only if it is valid and greater than 1; otherwise, numSamples will be used.
#' @export
getSample <- function(sampler, parametersOnly = T, coda = F, start = 1, end = NULL, thin = 1, numSamples = NULL, whichParameters = NULL, reportDiagnostics = FALSE, ...) UseMethod("getSample")

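The documented precedence between `thin` and `numSamples` can be made concrete with a small base-R sketch. `selectIndices` is a hypothetical helper, not part of the package. How `numSamples` spreads the selected rows over the chain is an assumption of this sketch (evenly spaced indices), and `thin = "auto"` is not covered.

```r
# Hypothetical helper illustrating the documented precedence of `thin`
# over `numSamples`. The even spacing used for numSamples is an
# assumption of this sketch.
selectIndices <- function(n, thin = 1, numSamples = NULL) {
  if (is.numeric(thin) && thin > 1) {
    if (thin > n) {
      warning("thin is greater than the number of samples")
      return(unique(c(1, n)))              # first and last element only
    }
    return(seq(1, n, by = thin))
  }
  if (!is.null(numSamples)) {
    if (numSamples > n) {
      warning("numSamples is greater than the number of samples")
      return(seq_len(n))                   # all samples
    }
    return(unique(round(seq(1, n, length.out = numSamples))))
  }
  seq_len(n)
}

selectIndices(10, thin = 3)                    # 1 4 7 10
selectIndices(10, thin = 1, numSamples = 4)    # 1 4 7 10
selectIndices(10, thin = 3, numSamples = 4)    # thin wins: 1 4 7 10
```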
12 changes: 6 additions & 6 deletions BayesianTools/R/classBayesianSetup.R
@@ -9,20 +9,20 @@
#' @param best vector with best prior values
#' @param names optional vector with parameter names
#' @param parallel parallelization option. Default is F. Other options include T, or "external". See details.
#' @param parallelOptions list containing three lists. First "packages" determines the R packages necessary to run the likelihood function. Second "variables" the objects in the global environment needed to run the likelihood function and third "dlls" the DLLs needed to run the likelihood function (see Details and Examples).
#' @param catchDuplicates Logical, determines whether unique parameter combinations should only be evaluated once. Only used when the likelihood accepts a matrix with parameter as columns.
#' @param parallelOptions list containing three lists.\itemize{ \item First, "packages" determines the R packages necessary to run the likelihood function. \item Second, "variables" determines the objects in the global environment needed to run the likelihood function. \item Third, "dlls" determines the DLLs needed to run the likelihood function (see Details and Examples).}
#' @param catchDuplicates logical, determines whether unique parameter combinations should only be evaluated once. Only used when the likelihood accepts a matrix with parameters as columns.
#' @param plotLower vector with lower limits for plotting
#' @param plotUpper vector with upper limits for plotting
#' @param plotBest vector with best values for plotting
#' @details If prior is of class prior (e.g. create with \code{\link{createPrior}}), priorSampler, lower, upper and best will be ignored.\cr If prior is a function (log prior density), priorSampler (custom sampler), or lower/upper (uniform sampler) is required.\cr If prior is NULL, and lower and upper are passed, a uniform prior (see \code{\link{createUniformPrior}}) will be created with boundaries lower and upper.
#'
#' For parallelization, Bayesiantools requies that the likelihood can evaluate several parameter vectors (supplied as a matrix) in parallel.
#' For parallelization, BayesianTools requires that the likelihood can evaluate multiple parameter vectors (supplied as a matrix) in parallel.
#'
#' * parallel = T means that an automatic parallelization of the likelihood via a standard R socket cluster is attempted, using the function \code{\link{generateParallelExecuter}}. By default, of the N cores detected on the computer, N-1 cores are requested. Alternatively, you can provide a integer number to parallel, specifying the cores reserved for the cluster. When the cluster is cluster is created, a copy of your workspace, including DLLs and objects are exported to the cluster workers. Because this can be very inefficient, you can explicitly specify the packages, objects and DLLs that are to be exported via parallelOptions. Using parallel = T requires that the function to be parallelized is well encapsulate, i.e. can run on a shared memory / shared hard disk machine in parallel without interfering with each other.
#' * parallel = T attempts to parallelize the likelihood via a standard R socket cluster using the \code{\link{generateParallelExecuter}} function. By default, of the N cores detected on the computer, N-1 cores are requested. Alternatively, you can provide an integer number to parallel, specifying the cores reserved for the cluster. When the cluster is created, a copy of your workspace, including DLLs and objects, is exported to the cluster workers. As this approach can be highly inefficient, it is recommended to explicitly specify the packages, objects and DLLs to export using parallelOptions. Using parallel = T requires that the function to be parallelized is well encapsulated, i.e. that parallel calls can run on a shared memory / shared hard disk machine without interfering with each other.
#'
#' If automatic parallelization cannot be done (e.g. because dlls are not thread-safe or write to shared disk), and only in this case, you should specify parallel = "external". In this case, it is assumed that the likelihood is programmed such that it accepts a matrix with parameters as columns and the different model runs as rows. It is then up to the user if and how to parallelize this function. This option gives most flexibility to the user, in particular for complicated parallel architecture or shared memory problems.
#' If automatic parallelization is not possible (e.g., because dlls are not thread-safe or write to shared disk), and only in this case, you should specify parallel = "external". In this case, it is assumed that the likelihood is programmed to accept a matrix with parameters as columns and the different model runs as rows. The user can then choose whether and how to parallelize this function. This option provides optimal flexibility for the user, especially regarding complicated parallel architectures or shared memory issues.
#'
#' For more details on parallelization, make sure to read both vignettes, in particular the section on the likelihood in the main vignette, and the section on parallelization in the vignette on interfacing models.
#' For more details on parallelization, make sure to read both vignettes, especially the section on likelihood in the main vignette and the section on parallelization in the vignette on interfacing models.
#'
#' @export
#' @seealso \code{\link{checkBayesianSetup}} \cr
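The contract for parallel = "external" (a likelihood that accepts a matrix with parameters as columns and model runs as rows) can be sketched as follows. The data, the normal model, and the names `llSingle` and `llMatrix` are assumptions of this example. The row-wise evaluation is a plain loop here, and how it is actually parallelized is left to the user.

```r
# Sketch of a likelihood that accepts either one parameter vector or a
# matrix with parameters as columns and model runs as rows, as the
# parallel = "external" option expects. Data and model are toy assumptions.
set.seed(123)
obs <- rnorm(50, mean = 2, sd = 1.5)

# log-likelihood of one parameter vector: par = c(mean, sd)
llSingle <- function(par) {
  if (par[2] <= 0) return(-Inf)
  sum(dnorm(obs, mean = par[1], sd = par[2], log = TRUE))
}

# matrix-aware wrapper; the user decides how the rows are evaluated
llMatrix <- function(pars) {
  if (is.vector(pars)) return(llSingle(pars))
  # plain loop here; could be replaced by e.g. parallel::parApply
  apply(pars, 1, llSingle)
}

parMat <- rbind(c(2, 1.5), c(0, 1), c(2, -1))
llValues <- llMatrix(parMat)   # one log-likelihood value per row
```

Because the wrapper returns one value per row, a sampler that proposes several parameter vectors at once can hand them over in a single call.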
11 changes: 5 additions & 6 deletions BayesianTools/R/classLikelihood.R
@@ -1,10 +1,10 @@
#' Creates a standardized likelihood class#'
#' @author Florian Hartig
#' @param likelihood Log likelihood density
#' @param names Parameter names (optional)
#' @param parallel parallelization , either i) no parallelization --> F, ii) native R parallelization --> T / "auto" will select n-1 of your available cores, or provide a number for how many cores to use, or iii) external parallelization --> "external". External means that the likelihood is already able to execute parallel runs in form of a matrix with
#' @param catchDuplicates Logical, determines whether unique parameter combinations should only be evaluated once. Only used when the likelihood accepts a matrix with parameter as columns.
#' @param parallelOptions list containing two lists. First "packages" determines the R packages necessary to run the likelihood function. Second "objects" the objects in the global envirnment needed to run the likelihood function (for details see \code{\link{createBayesianSetup}}).
#' @param likelihood log likelihood density
#' @param names parameter names (optional)
#' @param parallel parallelization, either i) no parallelization --> F, ii) native R parallelization --> T / "auto" will select n-1 of your available cores, or provide a number for how many cores to use, or iii) external parallelization --> "external". External means that the likelihood is already able to execute parallel runs in the form of a matrix.
#' @param catchDuplicates logical, determines whether unique parameter combinations should only be evaluated once. This is only applicable when the likelihood accepts a matrix with parameters as columns.
#' @param parallelOptions a list containing two lists. First, "packages" specifies the R packages necessary to run the likelihood function. Second, "objects" contains the objects in the global environment needed to run the likelihood function (for details see \code{\link{createBayesianSetup}}).
#' @param sampler sampler
#' @seealso \code{\link{likelihoodIidNormal}} \cr
#' \code{\link{likelihoodAR1}} \cr
@@ -105,7 +105,6 @@ createLikelihood <- function(likelihood, names = NULL, parallel = F, catchDuplic

#library(mvtnorm)
#library(sparseMVN)

#' Normal / Gaussian Likelihood function
#' @author Florian Hartig
#' @param predicted vector of predicted values
1 change: 1 addition & 0 deletions BayesianTools/R/classMcmcSampler.R
@@ -113,6 +113,7 @@ getSample.mcmcSampler <- function(sampler, parametersOnly = T, coda = F, start =




#' @method summary mcmcSampler
#' @author Stefan Paul
#' @export
5 changes: 3 additions & 2 deletions BayesianTools/R/classMcmcSamplerList.R
@@ -1,7 +1,8 @@

#' Convenience function to create an object of class mcmcSamplerList from a list of mcmc samplers
#' @author Florian Hartig
#' @param mcmcList a list with each object being an mcmcSampler
#' @return Object of class "mcmcSamplerList"
#' @param mcmcList list of objects, each of which is an mcmcSampler
#' @return object of class "mcmcSamplerList"
#' @export
createMcmcSamplerList <- function(mcmcList){
# mcmcList <- list(mcmcList) -> This line didn't make any sense at all. Better would be to allow the user to simply provide several inputs without a list, but I guess the list option should be maintained, as this is convenient when scripting.