bayesflow package
Subpackages
- bayesflow.benchmarks package
- Submodules
- bayesflow.benchmarks.bernoulli_glm module
- bayesflow.benchmarks.bernoulli_glm_raw module
- bayesflow.benchmarks.gaussian_linear module
- bayesflow.benchmarks.gaussian_linear_uniform module
- bayesflow.benchmarks.gaussian_mixture module
- bayesflow.benchmarks.lotka_volterra module
- bayesflow.benchmarks.sir module
- bayesflow.benchmarks.slcp module
- bayesflow.benchmarks.slcp_distractors module
- bayesflow.benchmarks.two_moons module
- Module contents
Submodules
bayesflow.amortizers module
- class bayesflow.amortizers.AmortizedLikelihood(*args, **kwargs)[source]
Bases: Model, AmortizedTarget
An interface for a surrogate model of a simulator, or an implicit likelihood
p(data | parameters, context).
- call(input_dict, **kwargs)[source]
Performs a forward pass through the summary and inference network.
- Parameters
input_dict (dict) – Input dictionary containing the following mandatory keys: observables - the observables over which a conditional density is learned (i.e., the data); conditions - the conditioning variables that are directly passed to the inference network
- Returns
the outputs of
surrogate_net(theta, summary_net(x, c_s), c_d)
, usually a latent variable and log(det(Jacobian)), that is a tuple (z, log_det_J).
- Return type
net_out
- call_loop(input_list, **kwargs)[source]
Performs a forward pass through the surrogate network given a list of dicts with the appropriate entries (i.e., as used for the standard call method).
This method is useful when GPU memory is limited or data sets have a different (non-Tensor) structure.
- Parameters
input_list (list of dicts, where each dict contains the following mandatory keys, if DEFAULT keys unchanged:) – observables - the observables over which a conditional density is learned (i.e., the data); conditions - the conditioning variables that are directly passed to the inference network
- Returns
net_out or (net_out, summary_out) – the outputs of
inference_net(theta, summary_net(x, c_s), c_d)
, usually a latent variable and log(det(Jacobian)), that is a tuple (z, log_det_J).
- Return type
tuple of tf.Tensor
- compute_loss(input_dict, **kwargs)[source]
Computes the loss of the amortizer given input data provided in input_dict.
- Parameters
input_dict (dict) – Input dictionary containing the following mandatory keys: data - the observables over which a conditional density is learned (i.e., the observables); conditions - the conditioning variables that are directly passed to the inference network
- Returns
loss
- Return type
tf.Tensor of shape (1,) - the total computed loss given input variables
- log_likelihood(input_dict, to_numpy=True, **kwargs)[source]
Calculates the approximate log-likelihood of targets given conditional variables via the change-of-variable formula for a conditional normalizing flow.
- Parameters
input_dict (dict) – Input dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged: observables - the variables over which a conditional density is learned (i.e., the observables); conditions - the conditioning variables that are directly passed to the inference network
to_numpy (bool, optional, default: True) – Boolean flag indicating whether to return the log-lik values as a np.array or a tf.Tensor
- Returns
log_lik – the approximate log-likelihood of each data point in each data set
- Return type
tf.Tensor of shape (batch_size, n_obs)
- log_prob(input_dict, to_numpy=True, **kwargs)[source]
Identical to log_likelihood(input_dict, to_numpy, **kwargs).
- sample(input_dict, n_samples, to_numpy=True, **kwargs)[source]
Generates n_samples random draws from the surrogate likelihood given input conditions.
- Parameters
input_dict (dict) – Input dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged: conditions - the conditioning variables that are directly passed to the inference network
n_samples (int) – The number of samples to obtain from the surrogate likelihood
to_numpy (bool, optional, default: True) – Flag indicating whether to return the samples as a np.array or a tf.Tensor
- Returns
lik_samples – Simulated batch of observables from the surrogate likelihood.
- Return type
tf.Tensor or np.ndarray of shape (n_datasets, n_samples, None)
- sample_loop(input_list, n_samples, to_numpy=True, **kwargs)[source]
Generates random draws from the surrogate network given a list of dicts with conditional variables. Useful when GPU memory is limited or data sets have a different (non-Tensor) structure.
- Parameters
input_list (list of dictionaries, each dictionary having the following mandatory keys, if DEFAULT KEYS unchanged:) – conditions - the conditioning variables that are directly passed to the inference network
n_samples (int) – The number of draws (samples) to obtain from the surrogate likelihood
to_numpy (bool, optional, default: True) – Flag indicating whether to return the samples as a np.array or a tf.Tensor
**kwargs (dict, optional) – Additional keyword arguments passed to the networks
- Returns
lik_samples – the simulated observables per data set
- Return type
tf.Tensor or np.ndarray of shape (n_data_sets, n_samples, data_dim)
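A minimal usage sketch (the trained amortizer instance, variable names, and array shapes are assumptions for illustration; the dictionary keys follow the defaults documented above):
>>> import numpy as np
>>> # Hypothetical: 8 data sets, 50 observations of dimension 3, 2 parameters each
>>> x = np.random.randn(8, 50, 3).astype(np.float32)
>>> params = np.random.randn(8, 2).astype(np.float32)
>>> # Approximate log-likelihood of each observation in each data set, shape (8, 50)
>>> log_lik = amortizer.log_likelihood({"observables": x, "conditions": params})
>>> # 100 synthetic observables per condition set from the surrogate likelihood
>>> sim_data = amortizer.sample({"conditions": params}, n_samples=100)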
- class bayesflow.amortizers.AmortizedModelComparison(*args, **kwargs)[source]
Bases:
Model
An interface to connect an evidential network for Bayesian model comparison with an optional summary network, as described in the original paper on evidential neural networks for model comparison:
[1] Radev, S. T., D’Alessandro, M., Mertens, U. K., Voss, A., Köthe, U., & Bürkner, P. C. (2021). Amortized Bayesian model comparison with evidential deep learning. IEEE Transactions on Neural Networks and Learning Systems.
Note: the original paper does not distinguish between the summary and the evidential networks, but treats them as a whole, with the appropriate architecture dictated by the model application. For the sake of consistency, the BayesFlow library distinguishes the two modules.
- compute_loss(input_dict, **kwargs)[source]
Computes the loss of the amortized model comparison instance.
- Parameters
input_dict (dict) –
- Input dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged:
summary_conditions - the conditioning variables that are first passed through a summary network; direct_conditions - the conditioning variables that are directly passed to the evidential network; model_indices - the ground-truth, one-hot encoded model indices sampled from the model prior
- Returns
total_loss
- Return type
tf.Tensor of shape (1,) - the total computed loss given input variables
- evidence(input_dict, to_numpy=True, **kwargs)[source]
Computes the evidence for the competing models given the data sets contained in input_dict.
- sample(input_dict, to_numpy=True, **kwargs)[source]
Samples posterior model probabilities from the higher order Dirichlet density.
- Parameters
input_dict (dict) – Input dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged: summary_conditions - the conditioning variables that are first passed through a summary network; direct_conditions - the conditioning variables that are directly passed to the evidential network; model_indices - the ground-truth, one-hot encoded model indices sampled from the model prior
n_samples (int) – Number of samples to obtain from the approximate posterior
to_numpy (bool, default: True) – Flag indicating whether to return the samples as a np.array or a tf.Tensor
- Returns
pm_samples – The posterior draws from the Dirichlet distribution, shape (num_samples, num_batch, num_models)
- Return type
tf.Tensor or np.array
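A usage sketch (the trained comparator instance and the shapes are assumptions; the key follows the defaults documented above):
>>> import numpy as np
>>> # Hypothetical: 16 data sets, 100 observations of dimension 2 each
>>> data = np.random.randn(16, 100, 2).astype(np.float32)
>>> # Evidences (posterior model probabilities) per data set, shape (16, num_models)
>>> evidences = comparator.evidence({"summary_conditions": data}, to_numpy=True)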
- class bayesflow.amortizers.AmortizedPosterior(*args, **kwargs)[source]
Bases: Model, AmortizedTarget
A wrapper to connect an inference network for parameter estimation with an optional summary network as in the original BayesFlow set-up described in the paper:
[1] Radev, S. T., Mertens, U. K., Voss, A., Ardizzone, L., & Köthe, U. (2020). BayesFlow: Learning complex stochastic models with invertible neural networks. IEEE Transactions on Neural Networks and Learning Systems.
But also allowing for augmented functionality, such as model misspecification detection in summary space:
[2] Schmitt, M., Bürkner, P. C., Köthe, U., & Radev, S. T. (2022). Detecting Model Misspecification in Amortized Bayesian Inference with Neural Networks. arXiv preprint arXiv:2112.08866.
And learning of fat-tailed posteriors with a Student-t latent pushforward density:
[3] Jaini, P., Kobyzev, I., Yu, Y., & Brubaker, M. (2020, November). Tails of Lipschitz triangular flows. In International Conference on Machine Learning (pp. 4673-4681). PMLR.
Serves as an interface for learning
p(parameters | data, context).
- call(input_dict, return_summary=False, **kwargs)[source]
Performs a forward pass through the summary and inference network.
- Parameters
input_dict (dict) – Input dictionary containing the following mandatory keys, if DEFAULT keys unchanged: parameters - the latent model parameters over which a conditional density is learned; summary_conditions - the conditioning variables (including data) that are first passed through a summary network; direct_conditions - the conditioning variables that are directly passed to the inference network
return_summary (bool, optional, default: False) – A flag which determines whether the learnable data summaries (representations) are returned or not.
- Returns
net_out or (net_out, summary_out) – the outputs of
inference_net(theta, summary_net(x, c_s), c_d)
, usually a latent variable and log(det(Jacobian)), that is a tuple (z, log_det_J) or (sum_outputs, (z, log_det_J)) if return_summary is set to True and a summary network is defined.
- Return type
tuple of tf.Tensor
- call_loop(input_list, return_summary=False, **kwargs)[source]
Performs a forward pass through the summary and inference network given a list of dicts with the appropriate entries (i.e., as used for the standard call method).
This method is useful when GPU memory is limited or data sets have a different (non-Tensor) structure.
- Parameters
input_list (list of dicts, where each dict contains the following mandatory keys, if DEFAULT keys unchanged:) – parameters - the latent model parameters over which a conditional density is learned; summary_conditions - the conditioning variables (including data) that are first passed through a summary network; direct_conditions - the conditioning variables that are directly passed to the inference network
return_summary (bool, optional, default: False) – A flag which determines whether the learnable data summaries (representations) are returned or not.
- Returns
net_out or (net_out, summary_out) – the outputs of
inference_net(theta, summary_net(x, c_s), c_d)
, usually a latent variable and log(det(Jacobian)), that is a tuple (z, log_det_J) or (sum_outputs, (z, log_det_J)) if return_summary is set to True and a summary network is defined.
- Return type
tuple of tf.Tensor
- compute_loss(input_dict, **kwargs)[source]
Computes the loss of the posterior amortizer given an input dictionary.
- Parameters
input_dict (dict) – Input dictionary containing the following mandatory keys: parameters - the latent model parameters over which a conditional density is learned; summary_conditions - the conditioning variables that are first passed through a summary network; direct_conditions - the conditioning variables that are directly passed to the inference network
- Returns
total_loss
- Return type
tf.Tensor of shape (1,) - the total computed loss given input variables
- log_posterior(input_dict, to_numpy=True, **kwargs)[source]
Calculates the approximate log-posterior of targets given conditional variables via the change-of-variable formula for a conditional normalizing flow.
- Parameters
input_dict (dict) – Input dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged: parameters : the latent model parameters over which a conditional density (i.e., a posterior) is learned; summary_conditions : the conditioning variables (including data) that are first passed through a summary network; direct_conditions : the conditioning variables that are directly passed to the inference network
to_numpy (bool, optional, default: True) – Flag indicating whether to return the lpdf values as a np.array or a tf.Tensor
- Returns
log_post – the approximate log-posterior density of each parameter
- Return type
tf.Tensor of shape (batch_size, n_obs)
- log_prob(input_dict, to_numpy=True, **kwargs)[source]
Identical to log_posterior(input_dict, to_numpy, **kwargs).
- sample(input_dict, n_samples, to_numpy=True, **kwargs)[source]
Generates random draws from the approximate posterior given a dictionary with conditional variables.
- Parameters
input_dict (dict) – Input dictionary containing the following mandatory keys, if DEFAULT KEYS unchanged: summary_conditions : the conditioning variables (including data) that are first passed through a summary network; direct_conditions : the conditioning variables that are directly passed to the inference network
n_samples (int) – The number of posterior draws (samples) to obtain from the approximate posterior
to_numpy (bool, optional, default: True) – Flag indicating whether to return the samples as a np.array or a tf.Tensor.
**kwargs (dict, optional) – Additional keyword arguments passed to the networks
- Returns
post_samples – the sampled parameters per data set
- Return type
tf.Tensor or np.ndarray of shape (n_data_sets, n_samples, n_params)
- sample_loop(input_list, n_samples, to_numpy=True, **kwargs)[source]
Generates random draws from the approximate posterior given a list of dicts with conditional variables. Useful when GPU memory is limited or data sets have a different (non-Tensor) structure.
- Parameters
input_list (list of dictionaries, each dictionary having the following mandatory keys, if DEFAULT KEYS unchanged:) – summary_conditions : the conditioning variables (including data) that are first passed through a summary network; direct_conditions : the conditioning variables that are directly passed to the inference network
n_samples (int) – The number of posterior draws (samples) to obtain from the approximate posterior
to_numpy (bool, optional, default: True) – Flag indicating whether to return the samples as a np.ndarray or a tf.Tensor
**kwargs (dict, optional) – Additional keyword arguments passed to the networks
- Returns
post_samples – the sampled parameters per data set
- Return type
tf.Tensor or np.ndarray of shape (n_datasets, n_samples, n_params)
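A minimal usage sketch (the trained amortizer instance, shapes, and parameter dimension are assumptions; keys follow the defaults documented above):
>>> import numpy as np
>>> # Hypothetical: 4 observed data sets, 100 observations of dimension 3, 5 parameters
>>> data = np.random.randn(4, 100, 3).astype(np.float32)
>>> input_dict = {"summary_conditions": data, "direct_conditions": None}
>>> # 2000 posterior draws per data set, shape (4, 2000, 5)
>>> post_samples = amortizer.sample(input_dict, n_samples=2000, to_numpy=True)
>>> # Log-posterior of candidate parameters additionally requires a 'parameters' entry
>>> input_dict["parameters"] = np.random.randn(4, 5).astype(np.float32)
>>> log_post = amortizer.log_posterior(input_dict, to_numpy=True)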
- class bayesflow.amortizers.AmortizedPosteriorLikelihood(*args, **kwargs)[source]
Bases: Model, AmortizedTarget
An interface for jointly learning a surrogate model of the simulator and an approximate posterior given a generative model.
- call(input_dict, **kwargs)[source]
Performs a forward pass through both amortizers.
- Parameters
input_dict (dict) – Input dictionary containing the following mandatory keys: posterior_inputs - The input dictionary for the amortized posterior likelihood_inputs - The input dictionary for the amortized likelihood
- Returns
(post_out, lik_out) – The outputs of the posterior and likelihood networks given input variables.
- Return type
tuple
- compute_loss(input_dict, **kwargs)[source]
Computes the loss of the joint amortizer by summing the corresponding amortized posterior and likelihood losses.
- Parameters
input_dict (dict) –
- Nested input dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged:
posterior_inputs - The input dictionary for the amortized posterior likelihood_inputs - The input dictionary for the amortized likelihood
- Returns
total_losses – A dictionary with keys Post.Loss and Lik.Loss containing the individual losses for the two amortizers.
- Return type
dict
- log_likelihood(input_dict, to_numpy=True, **kwargs)[source]
Calculates the approximate log-likelihood of data given conditional variables via the change-of-variable formula for conditional normalizing flows.
- Parameters
input_dict (dict) –
Input dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged:
observables - the variables over which a conditional density is learned (i.e., the observables); conditions - the conditioning variables that are directly passed to the inference network
OR a nested dictionary with key likelihood_inputs containing the above input dictionary
to_numpy (bool, optional, default: True) – Flag indicating whether to return the samples as a np.array or a tf.Tensor
- Returns
log_lik – the approximate log-likelihood of each data point in each data set
- Return type
tf.Tensor of shape (batch_size, n_obs)
- log_posterior(input_dict, to_numpy=True, **kwargs)[source]
Calculates the approximate log-posterior of targets given conditional variables via the change-of-variable formula for conditional normalizing flows.
- Parameters
input_dict (dict) –
Input dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged:
parameters - the latent generative model parameters over which a conditional density is learned; summary_conditions - the conditioning variables that are first passed through a summary network; direct_conditions - the conditioning variables that are directly passed to the inference network
OR a nested dictionary with key posterior_inputs containing the above input dictionary
- Returns
log_post – the approximate log-posterior density of each parameter
- Return type
tf.Tensor of shape (batch_size, n_obs)
- log_prob(input_dict, to_numpy=True, **kwargs)[source]
Identical to calling separate log_likelihood() and log_posterior().
- Returns
out_dict – dict with keys log_posterior and log_likelihood corresponding to the computed log_pdfs of the approximate posterior and likelihood.
- sample(input_dict, n_post_samples, n_lik_samples, to_numpy=True, **kwargs)[source]
Identical to calling sample_parameters() and sample_data() separately.
- Returns
out_dict – dict with keys posterior_samples and likelihood_samples corresponding to the n_samples from the approximate posterior and likelihood, respectively
- sample_data(input_dict, n_samples, to_numpy=True, **kwargs)[source]
Generates n_samples random draws from the surrogate likelihood given input conditions.
- Parameters
input_dict (dict) –
Input dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged:
conditions - the conditioning variables that are directly passed to the inference network
OR a nested dictionary with key likelihood_inputs containing the above input dictionary
n_samples (int) – The number of samples to obtain from the surrogate likelihood
to_numpy (bool, optional, default: True) – Flag indicating whether to return the samples as a np.array or a tf.Tensor
- Returns
lik_samples – Simulated observables from the surrogate likelihood.
- Return type
tf.Tensor or np.ndarray of shape (n_datasets, n_samples, None)
- sample_parameters(input_dict, n_samples, to_numpy=True, **kwargs)[source]
Generates random draws from the approximate posterior given conditional variables.
- Parameters
input_dict (dict) –
Input dictionary containing the following mandatory keys, if DEFAULT KEYS unchanged:
summary_conditions : the conditioning variables (including data) that are first passed through a summary network; direct_conditions : the conditioning variables that are directly passed to the inference network
OR a nested dictionary with key posterior_inputs containing the above input dictionary
n_samples (int) – The number of posterior samples to obtain from the approximate posterior
to_numpy (bool, optional, default: True) – Boolean flag indicating whether to return the samples as a np.array or a tf.Tensor
- Returns
post_samples – the sampled parameters per data set
- Return type
tf.Tensor or np.ndarray of shape (n_datasets, n_samples, n_params)
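A joint usage sketch (the trained joint amortizer instance and all shapes are assumptions; keys follow the defaults documented above):
>>> import numpy as np
>>> data = np.random.randn(4, 100, 3).astype(np.float32)     # hypothetical data sets
>>> params = np.random.randn(4, 5).astype(np.float32)        # hypothetical parameters
>>> # Posterior draws per data set (a nested dict with key 'posterior_inputs' also works)
>>> post_draws = joint_amortizer.sample_parameters({"summary_conditions": data, "direct_conditions": None}, n_samples=1000)
>>> # Synthetic observables from the surrogate likelihood, conditioned on the parameters
>>> lik_draws = joint_amortizer.sample_data({"conditions": params}, n_samples=100)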
- class bayesflow.amortizers.AmortizedTarget(*args, **kwargs)[source]
Bases:
ABC
An abstract interface for an amortized learned distribution. Children should implement the following public methods:
compute_loss(self, input_dict, **kwargs)
sample(input_dict, **kwargs)
log_prob(input_dict, **kwargs)
- class bayesflow.amortizers.SingleModelAmortizer(*args, **kwargs)[source]
Bases:
AmortizedPosterior
Deprecated class for amortized posterior estimation.
bayesflow.computational_utilities module
- bayesflow.computational_utilities.expected_calibration_error(m_true, m_pred, n_bins=15)[source]
Estimates the calibration error of a model comparison network.
Important
Make sure that m_true are one-hot encoded classes!
- Parameters
m_true (np.array or list) – True model indices
m_pred (np.array or list) – Predicted model indices
n_bins (int, default: 15) – Number of bins for plot
- Return type
#TODO
- bayesflow.computational_utilities.gaussian_kernel_matrix(x, y, sigmas=None)[source]
Computes a Gaussian radial basis function (RBF) kernel between the samples of x and y.
We create a sum of multiple Gaussian kernels each having a width \(\sigma_i\).
- Parameters
x (tf.Tensor of shape (num_draws_x, num_features)) – Comprises num_draws_x random draws from the “source” distribution P.
y (tf.Tensor of shape (num_draws_y, num_features)) – Comprises num_draws_y random draws from the “target” distribution Q.
sigmas (list(float), optional, default: None) – List which denotes the widths of each of the Gaussians in the kernel. If sigmas is None, a default range will be used, contained in bayesflow.default_settings.MMD_BANDWIDTH_LIST
- Returns
kernel – The kernel matrix between pairs from x and y.
- Return type
tf.Tensor of shape (num_draws_x, num_draws_y)
- bayesflow.computational_utilities.get_coverage_probs(z, u)[source]
Vectorized function to compute the minimal coverage probability for uniform ECDFs given evaluation points z and a matrix of simulated samples u.
- Parameters
z (np.ndarray of shape (num_points, )) – The vector of evaluation points.
u (np.ndarray of shape (num_simulations, num_samples)) – The matrix of simulated draws (samples) from U(0, 1)
- bayesflow.computational_utilities.inverse_multiquadratic_kernel_matrix(x, y, sigmas=None)[source]
Computes an inverse multiquadratic RBF between the samples of x and y.
We create a sum of multiple IM-RBF kernels each having a width \(\sigma_i\).
- Parameters
x (tf.Tensor of shape (num_draws_x, num_features)) – Comprises num_draws_x random draws from the “source” distribution P.
y (tf.Tensor of shape (num_draws_y, num_features)) – Comprises num_draws_y random draws from the “target” distribution Q.
sigmas (list(float), optional, default: None) – List which denotes the widths of each of the kernels. If sigmas is None, a default range will be used, contained in bayesflow.default_settings.MMD_BANDWIDTH_LIST
- Returns
kernel – The kernel matrix between pairs from x and y.
- Return type
tf.Tensor of shape (num_draws_x, num_draws_y)
- bayesflow.computational_utilities.maximum_mean_discrepancy(source_samples, target_samples, kernel='gaussian', mmd_weight=1.0, minimum=0.0)[source]
Computes the MMD given a particular choice of kernel.
For details, consult Gretton et al. (2012): https://www.jmlr.org/papers/volume13/gretton12a/gretton12a.pdf
- Parameters
source_samples (tf.Tensor of shape (N, num_features)) – An array of N random draws from the “source” distribution.
target_samples (tf.Tensor of shape (M, num_features)) – An array of M random draws from the “target” distribution.
kernel (str in ('gaussian', 'inverse_multiquadratic'), optional, default: 'gaussian') – The kernel to use for computing the distance between pairs of random draws.
mmd_weight (float, optional, default: 1.0) – The weight of the MMD value.
minimum (float, optional, default: 0.0) – The lower bound of the MMD value.
- Returns
loss_value – A scalar Maximum Mean Discrepancy, shape (,)
- Return type
tf.Tensor
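A short usage sketch with synthetic samples:
>>> import numpy as np
>>> import tensorflow as tf
>>> from bayesflow.computational_utilities import maximum_mean_discrepancy
>>> x = tf.convert_to_tensor(np.random.randn(500, 4), dtype=tf.float32)        # "source" P
>>> y = tf.convert_to_tensor(np.random.randn(500, 4) + 0.5, dtype=tf.float32)  # shifted "target" Q
>>> mmd = maximum_mean_discrepancy(x, y, kernel='gaussian')  # scalar tf.Tensor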
- bayesflow.computational_utilities.mmd_kernel(x, y, kernel)[source]
Computes the estimator of the Maximum Mean Discrepancy (MMD) between two samples: x and y.
Maximum Mean Discrepancy (MMD) is a distance-measure between random draws from the distributions x ~ P and y ~ Q.
- Parameters
x (tf.Tensor of shape (N, num_features)) – An array of N random draws from the “source” distribution x ~ P.
y (tf.Tensor of shape (M, num_features)) – An array of M random draws from the “target” distribution y ~ Q.
kernel (callable) – A function which computes the distance between pairs of samples.
- Returns
loss – The statistically biased squared maximum mean discrepancy (MMD) value.
- Return type
tf.Tensor of shape (,)
- bayesflow.computational_utilities.mmd_kernel_unbiased(x, y, kernel)[source]
Computes the unbiased estimator of the Maximum Mean Discrepancy (MMD) between two samples: x and y. Maximum Mean Discrepancy (MMD) is a distance-measure between the samples of the distributions x ~ P and y ~ Q.
- Parameters
x (tf.Tensor of shape (N, num_features)) – An array of N random draws from the “source” distribution x ~ P.
y (tf.Tensor of shape (M, num_features)) – An array of M random draws from the “target” distribution y ~ Q.
kernel (callable) – A function which computes the distance between pairs of random draws from x and y.
- Returns
loss – The statistically unbiased squared maximum mean discrepancy (MMD) value.
- Return type
tf.Tensor of shape (,)
- bayesflow.computational_utilities.simultaneous_ecdf_bands(num_samples, num_points=None, num_simulations=1000, confidence=0.95, eps=1e-05, max_num_points=1000)[source]
Computes the simultaneous ECDF bands through simulation according to the algorithm described in Section 2.2:
https://link.springer.com/content/pdf/10.1007/s11222-022-10090-6.pdf
Depends on the vectorized utility function get_coverage_probs(z, u).
- Parameters
num_samples (int) – The sample size used for computing the ECDF. Will equal the number of posterior samples when used for calibration. Corresponds to N in the paper above.
num_points (int, optional, default: None) – The number of evaluation points on the interval (0, 1). Defaults to num_points = num_samples if not explicitly specified. Corresponds to K in the paper above.
num_simulations (int, optional, default: 1000) – The number of samples of size n_samples to simulate for determining the simultaneous CIs.
confidence (float in (0, 1), optional, default: 0.95) – The confidence level, confidence = 1 - alpha specifies the width of the confidence interval.
eps (float, optional, default: 1e-5) – Small number to add to the lower and subtract from the upper bound of the interval [0, 1] to avoid edge artefacts. No need to touch this.
max_num_points (int, optional, default: 1000) – Upper bound on num_points. Saves computation time when num_samples is large.
- Returns
(alpha, z, L, U) – a tuple of a scalar and three arrays of size (num_samples,) containing the confidence level as well as the evaluation points, the lower, and the upper confidence bands, respectively.
- Return type
tuple
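For example, bands for an ECDF based on 250 posterior draws per data set can be obtained as follows (a sketch; the argument values are illustrative):
>>> from bayesflow.computational_utilities import simultaneous_ecdf_bands
>>> # 95% simultaneous confidence bands; z holds the evaluation points on (0, 1)
>>> alpha, z, L, U = simultaneous_ecdf_bands(num_samples=250, confidence=0.95)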
bayesflow.configuration module
- class bayesflow.configuration.DefaultJointConfigurator(default_float_type=<class 'numpy.float32'>)[source]
Bases:
object
Fallback class for a generic configurator for joint posterior and likelihood approximation.
- class bayesflow.configuration.DefaultLikelihoodConfigurator(default_float_type=<class 'numpy.float32'>)[source]
Bases:
object
Fallback class for a generic configurator for amortized likelihood approximation.
bayesflow.coupling_networks module
- class bayesflow.coupling_networks.AffineCouplingLayer(*args, **kwargs)[source]
Bases:
Model
Implements a conditional version of the INN coupling layer.
- call(target_or_z, condition, inverse=False, **kwargs)[source]
Performs one pass through the affine coupling layer (either inverse or forward).
- Parameters
target_or_z (tf.Tensor) – The estimation quantities of interest or latent representations z ~ p(z), shape (batch_size, …)
condition (tf.Tensor or None) – The conditioning data of interest, for instance, x = summary_fun(x), shape (batch_size, …). If condition is None, then the layer reduces to an unconditional ACL.
inverse (bool, optional, default: False) – Flag indicating whether to run the block forward or backward.
- Returns
(z, log_det_J) (tuple(tf.Tensor, tf.Tensor)) – If inverse=False: The transformed input and the corresponding Jacobian of the transformation, z shape: (batch_size, inp_dim), log_det_J shape: (batch_size, )
target (tf.Tensor) – If inverse=True: The back-transformed z, shape (batch_size, inp_dim)
Important
If inverse=False, the return is (z, log_det_J). If inverse=True, the return is target.
- forward(target, condition, **kwargs)[source]
Performs a forward pass through a coupling layer with optional Permutation and ActNorm layers.
- Parameters
target (tf.Tensor) – The estimation quantities of interest, for instance, parameter vector of shape (batch_size, theta_dim)
condition (tf.Tensor or None) – The conditioning vector of interest, for instance, x = summary(x), shape (batch_size, summary_dim) If None, transformation amounts to unconditional estimation.
- Returns
(z, log_det_J) – The transformed input and the corresponding Jacobian of the transformation.
- Return type
tuple(tf.Tensor, tf.Tensor)
- inverse(z, condition, **kwargs)[source]
Performs an inverse pass through a coupling layer with optional Permutation and ActNorm layers.
- Parameters
z (tf.Tensor) – latent variables z ~ p(z), shape (batch_size, theta_dim)
condition (tf.Tensor or None) – The conditioning vector of interest, for instance, x = summary(x), shape (batch_size, summary_dim). If None, transformation amounts to unconditional estimation.
- Returns
target – The back-transformed latent variable z.
- Return type
tf.Tensor
bayesflow.default_settings module
bayesflow.diagnostics module
- bayesflow.diagnostics.plot_calibration_curves(m_true, m_pred, model_names=None, n_bins=10, font_size=12, fig_size=(12, 4))[source]
Plots the calibration curves and the ECE for a model comparison problem. Depends on the expected_calibration_error function for computing the ECE.
- Parameters
TODO –
- bayesflow.diagnostics.plot_latent_space_2d(z_samples, height=2.5, color='#8f2727', **kwargs)[source]
Creates pairplots for the latent space learned by the inference network. Enables visual inspection of the latent space and whether its structure corresponds to the one enforced by the optimization criterion.
- Parameters
z_samples (np.ndarray or tf.Tensor of shape (n_sim, n_params)) – The latent samples computed through a forward pass of the inference network.
height (float, optional, default: 2.5) – The height of the pair plot.
color (str, optional, default: '#8f2727') – The color of the plot
**kwargs (dict, optional) – Additional keyword arguments passed to the sns.PairGrid constructor
- Returns
f
- Return type
plt.Figure - the figure instance for optional saving
- bayesflow.diagnostics.plot_losses(history, fig_size=None, color='#8f2727', label_fontsize=14, title_fontsize=16)[source]
A generic helper function to plot the losses of a series of training epochs and runs.
- Parameters
history (pd.DataFrame or bayesflow.LossHistory object) – The (plottable) history as returned by a train_[…] method of a Trainer instance.
- Returns
f
- Return type
plt.Figure - the figure instance for optional saving
- bayesflow.diagnostics.plot_posterior_2d(posterior_draws, prior=None, prior_draws=None, param_names=None, height=3, legend_fontsize=14, post_color='#8f2727', prior_color='gray', post_alpha=0.9, prior_alpha=0.7)[source]
Generates a bivariate pairplot given posterior draws and optional prior or prior draws.
- Parameters
posterior_draws (np.ndarray of shape (n_post_draws, n_params)) – The posterior draws obtained for a SINGLE observed data set.
prior (bayesflow.forward_inference.Prior instance or None, optional, default: None) – The optional prior object having an input-output signature as given by bayesflow.forward_inference.Prior
prior_draws (np.ndarray of shape (n_prior_draws, n_params) or None, optional, default: None) – The optional prior draws obtained from the prior. If both prior and prior_draws are provided, prior_draws will be used.
param_names (list or None, optional, default: None) – The parameter names for nice plot titles. Inferred if None
height (float, optional, default: 3) – The height of the pairplot.
legend_fontsize (int, optional, default: 14) – The font size of the legend text.
post_color (str, optional, default: '#8f2727') – The color for the posterior histograms and KDEs.
prior_color (str, optional, default: 'gray') – The color for the optional prior histograms and KDEs.
post_alpha (float in [0, 1], optional, default: 0.9) – The opacity of the posterior plots.
prior_alpha (float in [0, 1], optional, default: 0.7) – The opacity of the prior plots.
- Returns
f
- Return type
plt.Figure - the figure instance for optional saving
- Raises
AssertionError – If the shape of posterior_draws is not 2-dimensional.
- bayesflow.diagnostics.plot_prior2d(prior, param_names=None, n_samples=2000, height=2.5, color='#8f2727', **kwargs)[source]
Creates pairplots for a given joint prior.
- Parameters
prior (callable) – The prior object which takes a single integer argument and generates random draws.
param_names (list of str or None, optional, default: None) – An optional list of strings with the parameter names for nice plot titles. Inferred if None
n_samples (int, optional, default: 2000) – The number of random draws from the joint prior
height (float, optional, default: 2.5) – The height of the pair plot
color (str, optional, default: '#8f2727') – The color of the plot
**kwargs (dict, optional) – Additional keyword arguments passed to the sns.PairGrid constructor
- Returns
f
- Return type
plt.Figure - the figure instance for optional saving
- bayesflow.diagnostics.plot_recovery(post_samples, prior_samples, point_agg=<function mean>, uncertainty_agg=<function std>, param_names=None, fig_size=None, label_fontsize=14, title_fontsize=16, metric_fontsize=16, add_corr=True, add_r2=True, color='#8f2727', n_col=None, n_row=None)[source]
Creates and plots publication-ready recovery plot with true vs. point estimate + uncertainty. The point estimate can be controlled with the point_agg argument, and the uncertainty estimate can be controlled with the uncertainty_agg argument.
This plot yields the same information as the “posterior z-score”:
https://betanalpha.github.io/assets/case_studies/principled_bayesian_workflow.html
Important: Posterior aggregates play no special role in Bayesian inference and should only be used heuristically. For instance, in the case of multi-modal posteriors, common point estimates, such as the mean, (geometric) median, or maximum a posteriori (MAP), are not meaningful.
- Parameters
post_samples (np.ndarray of shape (n_data_sets, n_post_draws, n_params)) – The posterior draws obtained from n_data_sets
prior_samples (np.ndarray of shape (n_data_sets, n_params)) – The prior draws (true parameters) obtained for generating the n_data_sets
point_agg (callable, optional, default: np.mean) – The function to apply to the posterior draws to get a point estimate for each marginal.
uncertainty_agg (callable or None, optional, default: np.std) – The function to apply to the posterior draws to get an uncertainty estimate. If None provided, a simple scatter will be plotted.
param_names (list or None, optional, default: None) – The parameter names for nice plot titles. Inferred if None
fig_size (tuple or None, optional, default : None) – The figure size passed to the matplotlib constructor. Inferred if None.
label_fontsize (int, optional, default: 14) – The font size of the y-label text
title_fontsize (int, optional, default: 16) – The font size of the title text
metric_fontsize (int, optional, default: 16) – The font size of the goodness-of-fit metric (if provided)
add_corr (boolean, optional, default: True) – A flag for adding correlation between true and estimates to the plot.
add_r2 (boolean, optional, default: True) – A flag for adding R^2 between true and estimates to the plot.
color (str, optional, default: '#8f2727') – The color for the true vs. estimated scatter points and errorbars.
- Returns
f
- Return type
plt.Figure - the figure instance for optional saving
- Raises
ShapeError – If there is a deviation from the expected shapes of post_samples and prior_samples.
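A usage sketch with synthetic validation output (all arrays are placeholders):
>>> import numpy as np
>>> from bayesflow.diagnostics import plot_recovery
>>> prior_samples = np.random.randn(100, 3)   # 100 data sets, 3 parameters
>>> post_samples = prior_samples[:, None, :] + 0.1 * np.random.randn(100, 500, 3)
>>> fig = plot_recovery(post_samples, prior_samples, param_names=['a', 'b', 'c'])
>>> fig.savefig('recovery.png')  # the returned figure instance can be saved as usual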
- bayesflow.diagnostics.plot_sbc_ecdf(post_samples, prior_samples, difference=False, stacked=False, fig_size=None, param_names=None, label_fontsize=14, legend_fontsize=14, title_fontsize=16, rank_ecdf_color='#a34f4f', fill_color='grey', **kwargs)[source]
Creates the empirical CDFs for each marginal rank distribution and plots it against a uniform ECDF. ECDF simultaneous bands are drawn using simulations from the uniform. Inspired by:
[1] Säilynoja, T., Bürkner, P. C., & Vehtari, A. (2022). Graphical test for discrete uniformity and its applications in goodness-of-fit evaluation and multiple sample comparison. Statistics and Computing, 32(2), 1-21. https://arxiv.org/abs/2103.10522
For models with many parameters, use stacked=True to obtain an idea of the overall calibration of a posterior approximator.
- Parameters
post_samples (np.ndarray of shape (n_data_sets, n_post_draws, n_params)) – The posterior draws obtained from n_data_sets
prior_samples (np.ndarray of shape (n_data_sets, n_params)) – The prior draws obtained for generating n_data_sets
difference (boolean, optional, default: False) – If True, plots the ECDF difference. Enables a more dynamic visualization range.
stacked (boolean, optional, default: False) – If True, all ECDFs will be plotted on the same plot. If False, each ECDF will have its own subplot, similar to the behavior of plot_sbc_histograms.
param_names (list or None, optional, default: None) – The parameter names for nice plot titles. Inferred if None. Only relevant if stacked=False.
fig_size (tuple or None, optional, default: None) – The figure size passed to the matplotlib constructor. Inferred if None.
label_fontsize (int, optional, default: 14) – The font size of the x-label and y-label texts
legend_fontsize (int, optional, default: 14) – The font size of the legend text
title_fontsize (int, optional, default: 16) – The font size of the title text. Only relevant if stacked=False
rank_ecdf_color (str, optional, default: '#a34f4f') – The color to use for the rank ECDFs
fill_color (str, optional, default: 'grey') – The color of the fill arguments.
**kwargs (dict, optional, default: {}) – Keyword arguments can be passed to control the behavior of ECDF simultaneous band computation through the ecdf_bands_kwargs dictionary. See simultaneous_ecdf_bands for keyword arguments
- Returns
f
- Return type
plt.Figure - the figure instance for optional saving
- Raises
ShapeError – If there is a deviation from the expected shapes of post_samples and prior_samples.
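A usage sketch (synthetic arrays standing in for actual validation output):
>>> import numpy as np
>>> from bayesflow.diagnostics import plot_sbc_ecdf
>>> prior_samples = np.random.randn(200, 2)   # 200 data sets, 2 parameters
>>> post_samples = prior_samples[:, None, :] + np.random.randn(200, 250, 2)
>>> # The stacked ECDF-difference variant gives a quick overall calibration check
>>> fig = plot_sbc_ecdf(post_samples, prior_samples, difference=True, stacked=True)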
- bayesflow.diagnostics.plot_sbc_histograms(post_samples, prior_samples, param_names=None, fig_size=None, num_bins=None, binomial_interval=0.99, label_fontsize=14, title_fontsize=16, hist_color='#a34f4f')[source]
Creates and plots publication-ready histograms of rank statistics for simulation-based calibration (SBC) checks according to:
[1] Talts, S., Betancourt, M., Simpson, D., Vehtari, A., & Gelman, A. (2018). Validating Bayesian inference algorithms with simulation-based calibration. arXiv preprint arXiv:1804.06788.
Any deviation from uniformity indicates miscalibration and thus poor convergence of the networks or poor combination between generative model / networks.
- Parameters
post_samples (np.ndarray of shape (n_data_sets, n_post_draws, n_params)) – The posterior draws obtained from n_data_sets
prior_samples (np.ndarray of shape (n_data_sets, n_params)) – The prior draws obtained for generating n_data_sets
param_names (list or None, optional, default: None) – The parameter names for nice plot titles. Inferred if None
fig_size (tuple or None, optional, default : None) – The figure size passed to the matplotlib constructor. Inferred if None.
num_bins (int, optional, default: 10) – The number of bins to use for each marginal histogram
binomial_interval (float in (0, 1), optional, default: 0.95) – The width of the confidence interval for the binomial distribution
label_fontsize (int, optional, default: 14) – The font size of the y-label text
title_fontsize (int, optional, default: 16) – The font size of the title text
hist_color (str, optional, default '#a34f4f') – The color to use for the histogram body
- Returns
f
- Return type
plt.Figure - the figure instance for optional saving
- Raises
ShapeError – If there is a deviation from the expected shapes of post_samples and prior_samples.
bayesflow.exceptions module
- exception bayesflow.exceptions.ConfigurationError[source]
Bases:
Exception
Class for error in model configuration, e.g. in meta dict
- exception bayesflow.exceptions.InferenceError[source]
Bases:
Exception
Class for errors in the forward/inverse pass of a neural component.
- exception bayesflow.exceptions.LossError[source]
Bases:
Exception
Class for error in applying loss.
- exception bayesflow.exceptions.OperationNotSupportedError[source]
Bases:
Exception
Class for error that occurs when an operation is demanded but not supported, e.g. when a trainer is initialized without generative model but the user demands it to simulate data.
- exception bayesflow.exceptions.ShapeError[source]
Bases:
Exception
Class for errors in expected shapes.
bayesflow.helper_classes module
- class bayesflow.helper_classes.LossHistory[source]
Bases:
object
Helper class to keep track of losses during training.
- add_entry(epoch, current_loss)[source]
Adds loss entry for current epoch into internal memory data structure.
- file_name = 'history'
- get_running_losses(epoch)[source]
Compute and return running means of the losses for current epoch.
- save_to_file(file_path, max_to_keep)[source]
Saves a LossHistory object to a pickled dictionary in file_path. If max_to_keep saved loss history files are found in file_path, the oldest is deleted before a new one is saved.
- property total_loss
- class bayesflow.helper_classes.MemoryReplayBuffer(capacity_in_batches=500)[source]
Bases:
object
Implements a memory replay buffer for simulation-based inference.
- class bayesflow.helper_classes.RegressionLRAdjuster(optimizer, period=1000, wait_between_fits=10, patience=10, tolerance=-0.05, reduction_factor=0.25, cooldown_factor=2, num_resets=3, **kwargs)[source]
Bases:
object
This class will compute the slope of the loss trajectory and inform learning rate decay.
- file_name = 'lr_adjuster'
- class bayesflow.helper_classes.SimulationDataset(forward_dict, batch_size, buffer_size=1024)[source]
Bases:
object
Helper class to create a tensorflow.data.Dataset which parses simulation dictionaries and returns simulation dictionaries as expected by BayesFlow amortizers.
- class bayesflow.helper_classes.SimulationMemory(stores_raw=True, capacity_in_batches=50)[source]
Bases:
object
Helper class to keep track of a pre-determined number of simulations during training.
- file_name = 'memory'
bayesflow.helper_functions module
- bayesflow.helper_functions.backprop_step(input_dict, amortizer, optimizer, **kwargs)[source]
Computes the loss of the provided amortizer given an input dictionary and applies gradients.
- Parameters
input_dict (dict) – The configured output of the generative model
amortizer (tf.keras.Model) – The custom amortizer. Needs to implement a compute_loss method.
optimizer (tf.keras.optimizers.Optimizer) – The optimizer used to update the amortizer’s parameters.
**kwargs (dict) – Optional keyword arguments passed to the network’s compute_loss method
- Returns
loss – The outputs of the compute_loss() method of the amortizer comprising all loss components, such as divergences or regularization.
- Return type
dict
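A minimal manual training-loop sketch (the amortizer instance and the iterable of configured batches are assumptions; a Trainer normally handles this loop):
>>> import tensorflow as tf
>>> from bayesflow.helper_functions import backprop_step
>>> optimizer = tf.keras.optimizers.Adam(learning_rate=5e-4)
>>> for input_dict in configured_batches:   # assumed iterable of configured input dicts
...     losses = backprop_step(input_dict, amortizer, optimizer)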
- bayesflow.helper_functions.build_meta_dict(user_dict: dict, default_setting: MetaDictSetting) → dict[source]
Integrates a user-defined dictionary into a default dictionary.
Takes a user-defined dictionary and a default dictionary.
Scans the user_dict for violations by unspecified mandatory fields.
Merges user_dict entries into the default_dict. Considers nested dict structure.
- Parameters
user_dict (dict) – The user’s dictionary
default_setting (MetaDictSetting) –
The specified default setting with attributes:
meta_dict: dictionary with default values.
mandatory_fields: list(str) keys that need to be specified by the user_dict
- Returns
merged_dict – Merged dictionary.
- Return type
dict
- bayesflow.helper_functions.check_posterior_prior_shapes(post_samples, prior_samples)[source]
Checks requirements for the shapes of posterior and prior draws as necessitated by most diagnostic functions.
- Parameters
post_samples (np.ndarray of shape (n_data_sets, n_post_draws, n_params)) – The posterior draws obtained from n_data_sets
prior_samples (np.ndarray of shape (n_data_sets, n_params)) – The prior draws obtained for generating n_data_sets
- Raises
ShapeError – If there is a deviation from the expected shapes of post_samples and prior_samples.
- bayesflow.helper_functions.extract_current_lr(optimizer)[source]
Extracts current learning rate from optimizer.
- Parameters
optimizer (instance of subclass of tf.keras.optimizers.Optimizer) – Optimizer to extract the learning rate from
- Returns
current_lr – Current learning rate, or None if it can’t be determined
- Return type
np.float or NoneType
bayesflow.helper_networks module
- class bayesflow.helper_networks.ActNorm(*args, **kwargs)[source]
Bases:
Model
Implements an Activation Normalization (ActNorm) Layer.
- call(target, inverse=False)[source]
Performs one pass through the actnorm layer (either inverse or forward) and normalizes the last axis of target.
- Parameters
target (tf.Tensor of shape (batch_size, ...)) – the target variables of interest, i.e., parameters for posterior estimation
inverse (bool, optional, default: False) – Flag indicating whether to run the block forward or backwards
- Returns
(z, log_det_J) (tuple(tf.Tensor, tf.Tensor)) – If inverse=False: The transformed input and the corresponding Jacobian of the transformation, v shape: (batch_size, inp_dim), log_det_J shape: (,)
target (tf.Tensor) – If inverse=True: The inversly transformed targets, shape == target.shape
Important
If inverse=False, the return is (z, log_det_J). If inverse=True, the return is target.
- class bayesflow.helper_networks.DenseCouplingNet(*args, **kwargs)[source]
Bases:
Model
Implements a conditional version of a standard fully connected (FC) network. Would also work as an unconditional estimator.
- call(target, condition, **kwargs)[source]
Concatenates target and condition and performs a forward pass through the coupling net.
- Parameters
target (tf.Tensor) – The split estimation quantities, for instance, parameters \(\theta \sim p(\theta)\) of interest, shape (batch_size, …)
condition (tf.Tensor or None) – The conditioning vector of interest, for instance x = summary(x), shape (batch_size, summary_dim)
- class bayesflow.helper_networks.EquivariantModule(*args, **kwargs)[source]
Bases:
Model
Implements an equivariant module performing an equivariant transform.
For details and justification, see:
- class bayesflow.helper_networks.InvariantModule(*args, **kwargs)[source]
Bases:
Model
Implements an invariant module performing a permutation-invariant transform.
For details and rationale, see:
- class bayesflow.helper_networks.MultiConv1D(*args, **kwargs)[source]
Bases:
Model
Implements an inception-inspired 1D convolutional layer using different kernel sizes.
- class bayesflow.helper_networks.Permutation(*args, **kwargs)[source]
Bases:
Model
Implements a layer to permute the inputs entering a (conditional) coupling layer. Uses fixed permutations, as these perform equally well compared to learned permutations.
- call(target, inverse=False)[source]
Permutes a batch of target vectors over the last axis.
- Parameters
target (tf.Tensor of shape (batch_size, ...)) – The target vector to be permuted over its last axis.
inverse (bool, optional, default: False) – Controls if the current pass is forward (inverse=False) or inverse (inverse=True).
- Returns
out – The (un-)permuted target vector.
- Return type
tf.Tensor of the same shape as target.
bayesflow.inference_networks module
- class bayesflow.inference_networks.EvidentialNetwork(*args, **kwargs)[source]
Bases:
Model
Implements a network whose outputs are the concentration parameters of a Dirichlet density.
Follows ideas from:
[1] Radev, S. T., D’Alessandro, M., Mertens, U. K., Voss, A., Köthe, U., & Bürkner, P. C. (2021). Amortized Bayesian model comparison with evidential deep learning. IEEE Transactions on Neural Networks and Learning Systems.
[2] Sensoy, M., Kaplan, L., & Kandemir, M. (2018). Evidential deep learning to quantify classification uncertainty. Advances in neural information processing systems, 31.
- call(condition, **kwargs)[source]
Computes evidences for model comparison given a batch of data and optional concatenated context, typically passed through a summary network.
- Parameters
condition (tf.Tensor of shape (batch_size, ...)) – The input variables used for determining p(model | condition)
- Returns
evidence
- Return type
tf.Tensor of shape (batch_size, num_models) – the learned model evidences
- classmethod create_config(**kwargs)[source]
Used to create the settings dictionary for the internal networks of the evidential network. Will fill in missing settings with default values.
- sample(condition, n_samples, **kwargs)[source]
Samples posterior model probabilities from the higher-order Dirichlet density.
- Parameters
condition (tf.Tensor) – The summary of the observed (or simulated) data, shape (n_data_sets, …)
n_samples (int) – Number of samples to obtain from the approximate posterior
- Returns
pm_samples – The posterior draws from the Dirichlet distribution, shape (num_samples, num_batch, num_models)
- Return type
tf.Tensor or np.array
- class bayesflow.inference_networks.InvertibleNetwork(*args, **kwargs)[source]
Bases:
Model
Implements a chain of conditional invertible coupling layers for conditional density estimation.
- call(targets, condition, inverse=False, **kwargs)[source]
Performs one pass through an invertible chain (either inverse or forward).
- Parameters
targets (tf.Tensor) – The estimation quantities of interest, shape (batch_size, …)
condition (tf.Tensor) – The conditional data x, shape (batch_size, summary_dim)
inverse (bool, default: False) – Flag indicating whether to run the chain forward or backwards
- Returns
(z, log_det_J) (tuple(tf.Tensor, tf.Tensor)) – If inverse=False: The transformed input and the corresponding Jacobian of the transformation, v shape: (batch_size, …), log_det_J shape: (batch_size, …)
target (tf.Tensor) – If inverse=True: The transformed out, shape (batch_size, …)
Important
If inverse=False, the return is (z, log_det_J). If inverse=True, the return is target.
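A forward/inverse sketch (the constructor argument num_params is an assumption and may differ across versions; shapes are illustrative):
>>> import numpy as np
>>> import tensorflow as tf
>>> from bayesflow.inference_networks import InvertibleNetwork
>>> inn = InvertibleNetwork(num_params=4)   # constructor arguments are an assumption
>>> targets = tf.convert_to_tensor(np.random.randn(32, 4), dtype=tf.float32)
>>> condition = tf.convert_to_tensor(np.random.randn(32, 16), dtype=tf.float32)
>>> z, log_det_J = inn(targets, condition)      # forward: parameters -> latent space
>>> params = inn(z, condition, inverse=True)    # inverse: latent draws -> parameters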
bayesflow.losses module
- bayesflow.losses.kl_dirichlet(model_indices, alpha)[source]
Computes the KL divergence between a Dirichlet distribution with parameter vector alpha and a uniform Dirichlet.
- Parameters
model_indices (tf.Tensor of shape (batch_size, n_models)) – one-hot-encoded true model indices
alpha (tf.Tensor of shape (batch_size, n_models)) – positive network outputs in [1, +inf]
- Returns
kl – A single scalar representing \(D_{KL}(\mathrm{Dir}(\alpha) | \mathrm{Dir}(1,1,\ldots,1) )\), shape (,)
- Return type
tf.Tensor
- bayesflow.losses.kl_latent_space_gaussian(z, log_det_J)[source]
Computes the Kullback-Leibler divergence between true and approximate posterior assuming a Gaussian latent space as a source distribution.
- Parameters
z (tf.Tensor of shape (batch_size, ...)) – The (latent transformed) target variables
log_det_J (tf.Tensor of shape (batch_size, ...)) – The logarithm of the Jacobian determinant of the transformation.
- Returns
loss – A single scalar value representing the KL loss, shape (,)
- Return type
tf.Tensor
Examples
Parameter estimation
>>> kl_latent_space_gaussian(z, log_det_J)
- bayesflow.losses.kl_latent_space_student(v, z, log_det_J)[source]
Computes the Kullback-Leibler divergence between true and approximate posterior assuming a latent Student t-distribution as a source distribution.
- Parameters
v (tf.Tensor of shape (batch_size, ...)) – The degrees of freedom of the latent Student t-distribution
z (tf.Tensor of shape (batch_size, ...)) – The (latent transformed) target variables
log_det_J (tf.Tensor of shape (batch_size, ...)) – The logarithm of the Jacobian determinant of the transformation.
- Returns
loss – A single scalar value representing the KL loss, shape (,)
- Return type
tf.Tensor
- bayesflow.losses.log_loss(model_indices, alpha)[source]
Computes the log-loss given output probabilities alpha and true, one-hot-encoded model indices model_indices.
- Parameters
model_indices (tf.Tensor of shape (batch_size, n_models)) – one-hot-encoded true model indices
alpha (tf.Tensor of shape (batch_size, n_models)) – positive network outputs in [1, +inf]
- Returns
loss – A single scalar Monte-Carlo approximation of the log-loss, shape (,)
- Return type
tf.Tensor
- bayesflow.losses.mmd_summary_space(summary_outputs, z_dist=<function random_normal>, kernel='gaussian')[source]
Computes the MMD(p(summary_outputs) | z_dist) to re-shape the summary network outputs in an information-preserving manner.
- Parameters
summary_outputs (tf.Tensor of shape (batch_size, ...)) – The outputs of the summary network.
z_dist (callable, default tf.random.normal) – The latent data distribution towards which the summary outputs are optimized.
kernel (str in ('gaussian', 'inverse_multiquadratic'), default 'gaussian') – The kernel function to use for MMD computation.
bayesflow.mcmc module
- class bayesflow.mcmc.MCMCSurrogateLikelihood(amortized_likelihood, configurator=None, likelihood_postprocessor=None, grad_postprocessor=None)[source]
Bases:
object
An interface to provide likelihood evaluation and gradient estimation of a pre-trained AmortizedLikelihood instance, which can be used in tandem with (HMC)-MCMC, as implemented, for instance, in PyMC3.
- log_likelihood(*args, **kwargs)[source]
Calculates the approximate log-likelihood of targets given conditional variables.
- Parameters
The parameters as expected by configurator. For the default configurator, the first parameter has to be a dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged: observables - the variables over which a conditional density is learned (i.e., the observables); conditions - the conditioning variables that are directly passed to the inference network
- Returns
out – The output as returned by likelihood_postprocessor. For the default postprocessor, this is the total log-likelihood given by the sum of all log-likelihood values.
- Return type
np.ndarray
- log_likelihood_grad(*args, **kwargs)[source]
Calculates the gradient of the surrogate likelihood with respect to every parameter in conditions.
- Parameters
The parameters as expected by configurator. For the default configurator, the first parameter has to be a dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged: observables - the variables over which a conditional density is learned (i.e., the observables); conditions - the conditioning variables that are directly passed to the inference network
- Returns
out – The output as returned by grad_postprocessor. For the default postprocessor, this is an array containing the derivative with respect to each value in conditions as returned by configurator.
- Return type
np.ndarray
- class bayesflow.mcmc.PyMCSurrogateLikelihood(amortized_likelihood, observables, configurator=None, likelihood_postprocessor=None, grad_postprocessor=None, default_pymc_type=<class 'numpy.float64'>, default_tf_type=<class 'numpy.float32'>)[source]
Bases: Op, MCMCSurrogateLikelihood
- grad(inputs, output_grads)[source]
Aggregates gradients with respect to inputs (typically the parameter vector).
- Parameters
inputs (The input variables.) –
output_grads (The gradients of the output variables.) –
- Returns
grads
- Return type
The gradients with respect to each Variable in inputs.
- itypes: Optional[Sequence[Type]] = [TensorType(float64, (None,))]
- otypes: Optional[Sequence[Type]] = [TensorType(float64, ())]
- perform(node, inputs, outputs)[source]
Computes the log-likelihood of inputs (typically the parameter vector of a model).
- Parameters
node (The symbolic aesara.graph.basic.Apply node that represents this computation.) –
inputs (Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.) –
outputs (List of mutable single-element lists (do not change the length of these lists).) – Each sub-list corresponds to the value of each Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
bayesflow.networks module
Meta-module for easy access of different neural network architecture interfaces
bayesflow.simulation module
- class bayesflow.simulation.ContextGenerator(batchable_context_fun: Optional[callable] = None, non_batchable_context_fun: Optional[callable] = None, use_non_batchable_for_batchable: bool = False)[source]
Bases:
object
Basic interface for a simulation module responsible for generating variables over which we want to amortize during simulation-based training, but do not want to perform inference on. Both priors and simulators in a generative framework can have their own context generators, depending on the particular modeling goals.
The interface distinguishes between two types of context: batchable and non-batchable.
Batchable context variables differ for each simulation in each training batch
Non-batchable context variables stay the same for each simulation in a batch, but differ across batches
Examples for batchable context variables include experimental design variables, design matrices, etc. Examples for non-batchable context variables include the number of observations in an experiment, positional encodings, time indices, etc.
While the latter can also be considered batchable in principle, batching them would require non-Tensor (i.e., non-rectangular) data structures, which usually means inefficient computations.
Example for a simulation context which will generate a random number of observations between 1 and 100 for each training batch:
>>> gen = ContextGenerator(non_batchable_context_fun=lambda : np.random.randint(1, 101))
- batchable_context(batch_size, *args, **kwargs)[source]
Generates ‘batch_size’ context variables given optional arguments. Return type is a list of context variables.
- generate_context(batch_size, *args, **kwargs)[source]
Creates a dictionary with batchable and non-batchable context.
- Parameters
batch_size (int) – The batch_size argument used for batchable context.
- Returns
context_dict (dictionary) – A dictionary with context variables with the following keys, if default keys not changed:
batchable_context : value
non_batchable_context : value
Note that the values of the context variables will be None if the corresponding context-generating functions have not been provided when initializing this object.
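Example (illustrative): combining both context types and inspecting the generated dictionary; the concrete functions and values below are hypothetical:
>>> import numpy as np
>>> from bayesflow.simulation import ContextGenerator
>>> gen = ContextGenerator(
...     batchable_context_fun=lambda: np.random.uniform(0.0, 1.0),    # e.g., a design variable
...     non_batchable_context_fun=lambda: np.random.randint(1, 101),  # e.g., number of observations
... )
>>> context = gen.generate_context(batch_size=4)
>>> # context is a dictionary of the form (with default keys):
>>> # {'batchable_context': [0.12, 0.87, 0.33, 0.51], 'non_batchable_context': 42}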
- class bayesflow.simulation.GenerativeModel(prior: callable, simulator: callable, skip_test: bool = False, prior_is_batched: bool = False, simulator_is_batched: bool = False, name: str = 'anonymous')[source]
Bases:
object
Basic interface for a generative model in a simulation-based context. Generally, a generative model consists of two mandatory components:
Prior : A randomized function returning random parameter draws from a prior distribution;
Simulator : A function which transforms the parameters into observables in a non-deterministic manner.
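Example (a minimal sketch): a toy generative model with an illustrative Gaussian prior and simulator (neither function is part of the library):
>>> import numpy as np
>>> from bayesflow.simulation import GenerativeModel
>>> def prior_fun():
...     # Draw a 2-dimensional parameter vector from a standard normal prior
...     return np.random.normal(size=2)
>>> def simulator_fun(theta, n_obs=50):
...     # Generate n_obs noisy observations centered on the parameter values
...     return np.random.normal(loc=theta, scale=1.0, size=(n_obs, theta.shape[0]))
>>> model = GenerativeModel(prior=prior_fun, simulator=simulator_fun, name="toy_gaussian")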
- plot_pushforward(parameter_draws=None, funcs_list=None, funcs_labels=None, batch_size=1000, show_raw_sims=True)[source]
Creates simulations from parameter_draws (generated from self.prior if they are not passed as an argument) and plots visualizations for them.
- Parameters
parameter_draws (numpy ndarray of the shape (batch_size, parameter_values)) – A sample of parameters. May be drawn from either the prior (which is also the default behavior if no input is specified) or from the posterior to do a prior/posterior pushforward.
funcs_list (list of callables) – A list of functions that can be used to aggregate simulation data (map a single simulation to a single real value). The default behavior without user input is to use numpy’s mean and standard deviation functions.
funcs_labels (list of strings) – A list of labels for the functions in funcs_list. The default behavior without user input is to call the functions “Aggregator function 1, Aggregator function 2, etc.”
batch_size (integer) – The number of prior draws to generate (and then create and visualize simulations from)
show_raw_sims (boolean) – Flag determining whether or not a plot of 49 raw (i.e. unaggregated) simulations is generated. Useful for very general data exploration.
- Returns
parameters_draws (numpy ndarray) – The parameters provided by the user or generated internally.
simulations (numpy ndarray) – The simulations generated from parameter_draws (or prior draws generated on the fly)
aggregated_data (list of numpy 1d arrays) – Arrays generated from the simulations with the functions in funcs_list
- presimulate_and_save(batch_size, folder_path, total_iterations=None, memory_limit=None, iterations_per_epoch=None, epochs=None, extend_from=0, parallel=True)[source]
Simulates a dataset for single-pass offline training (called via the train_from_presimulation method of the Trainer class in the trainers.py script).
One of the following pairs of parameters has to be provided:
(iterations_per_epoch, epochs)
(total_iterations, iterations_per_epoch)
(total_iterations, epochs)
Providing all three of the parameters in these pairs leads to a consistency check, since incompatible combinations are possible. memory_limit is an upper bound on the size of individual files; this can be useful to avoid running out of RAM during training.
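Example (a sketch with placeholder values), assuming the toy model from above and a writable output folder; providing the pair (iterations_per_epoch, epochs) implies total_iterations = 100 * 20 = 2000 batches:
>>> model.presimulate_and_save(
...     batch_size=32,
...     folder_path="presimulations/toy_gaussian",   # hypothetical output folder
...     iterations_per_epoch=100,
...     epochs=20,
... )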
- class bayesflow.simulation.MultiGenerativeModel(generative_models: list, model_probs='equal')[source]
Bases:
object
Basic interface for multiple generative models in a simulation-based context. A MultiGenerativeModel instance consists of a list of GenerativeModel instances and a prior distribution over candidate models defined by a list of probabilities.
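Example (a sketch), assuming two previously constructed GenerativeModel instances model_1 and model_2 (placeholders):
>>> from bayesflow.simulation import MultiGenerativeModel
>>> multi_model = MultiGenerativeModel([model_1, model_2])                          # equal model probabilities
>>> multi_model = MultiGenerativeModel([model_1, model_2], model_probs=[0.7, 0.3])  # explicit model prior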
- class bayesflow.simulation.Prior(batch_prior_fun: Optional[callable] = None, prior_fun: Optional[callable] = None, context_generator: Optional[callable] = None, param_names: Optional[list] = None)[source]
Bases:
object
Basic interface for a simulation module responsible for generating random draws from a prior distribution.
The prior functions should return a np.array of simulation parameters which will be internally used by the GenerativeModel interface for simulations.
An optional context generator (i.e., an instance of ContextGenerator) or a user-defined callable object implementing the following two methods can be provided: - context_generator.batchable_context(batch_size) - context_generator.non_batchable_context()
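Example (a minimal sketch with an illustrative two-parameter prior):
>>> import numpy as np
>>> from bayesflow.simulation import Prior
>>> def prior_fun():
...     # One draw of two parameters, e.g., a rate and a shift
...     return np.array([np.random.gamma(2.0, 1.0), np.random.normal(0.0, 1.0)])
>>> prior = Prior(prior_fun=prior_fun, param_names=["rate", "shift"])
>>> fig = prior.plot_prior2d()   # visualize the bivariate prior (see plot_prior2d below)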
- estimate_means_and_stds(n_draws=1000, *args, **kwargs)[source]
Estimates prior means and stds given n_draws from the prior, useful for z-standardization of the prior draws.
- Parameters
n_draws (int, optional (default = 1000)) – The number of random draws to obtain from the joint prior.
*args (tuple) – Optional positional arguments passed to the generator functions.
**kwargs (dict) – Optional keyword arguments passed to the generator functions.
- Returns
The estimated means and stds of the joint prior.
- Return type
(prior_means, prior_stds) - tuple of np.ndarrays
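For instance, the estimated moments can be used to z-standardize raw prior draws before passing them to a network (draws below is a placeholder array of shape (n_draws, n_params)):
>>> prior_means, prior_stds = prior.estimate_means_and_stds(n_draws=2000)
>>> z_draws = (draws - prior_means) / prior_stds   # broadcasts over the draw dimension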
- plot_prior2d(**kwargs)[source]
Generates a 2D plot representing bivariate prior distributions. Uses the function bayesflow.diagnostics.plot_prior2d() internally for generating the plot.
- Parameters
**kwargs (dict) – Optional keyword arguments passed to the plot_prior2d function.
- Returns
f
- Return type
plt.Figure - the figure instance for optional saving
- class bayesflow.simulation.Simulator(batch_simulator_fun=None, simulator_fun=None, context_generator=None)[source]
Bases:
object
Basic interface for a simulation module responsible for generating randomized simulations given a prior parameter distribution and optional context variables, given a user-provided simulation function.
The user-provided simulator functions should return a np.array of synthetic data which will be used internally by the GenerativeModel interface for simulations.
An optional context generator (i.e., an instance of ContextGenerator) or a user-defined callable object implementing the following two methods can be provided: - context_generator.batchable_context(batch_size) - context_generator.non_batchable_context()
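Example (a minimal sketch reusing the illustrative toy simulator from above):
>>> import numpy as np
>>> from bayesflow.simulation import Simulator
>>> def simulator_fun(theta):
...     # Generate 50 noisy observations centered on the parameter values (illustrative)
...     return np.random.normal(loc=theta, scale=1.0, size=(50, theta.shape[0]))
>>> simulator = Simulator(simulator_fun=simulator_fun)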
bayesflow.summary_networks module
- class bayesflow.summary_networks.InvariantNetwork(*args, **kwargs)[source]
Bases:
Model
Implements a deep permutation-invariant network according to [1] and [2].
[1] Zaheer, M., Kottur, S., Ravanbakhsh, S., Poczos, B., Salakhutdinov, R. R., & Smola, A. J. (2017). Deep sets. Advances in neural information processing systems, 30.
[2] Bloem-Reddy, B., & Teh, Y. W. (2020). Probabilistic Symmetries and Invariant Neural Networks. J. Mach. Learn. Res., 21, 90-1.
- call(x)[source]
Performs the forward pass of a learnable deep invariant transformation consisting of a sequence of equivariant transforms followed by an invariant transform.
- Parameters
x (tf.Tensor) – Input of shape (batch_size, n_obs, data_dim)
- Returns
out – Output of shape (batch_size, out_dim)
- Return type
tf.Tensor
- class bayesflow.summary_networks.SequentialNetwork(*args, **kwargs)[source]
Bases:
Model
Implements a sequence of MultiConv1D layers followed by an LSTM network.
For details and rationale, see [1]:
[1] Radev, S. T., Graw, F., Chen, S., Mutters, N. T., Eichel, V. M., Bärnighausen, T., & Köthe, U. (2021). OutbreakFlow: Model-based Bayesian inference of disease outbreak dynamics with invertible neural networks and its application to the COVID-19 pandemics in Germany. PLoS computational biology, 17(10), e1009472.
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1009472
- call(x, **kwargs)[source]
Performs a forward pass through the network by first passing x through the sequence of multi-convolutional layers and then applying the LSTM network.
- Parameters
x (tf.Tensor) – Input of shape (batch_size, n_time_steps, n_time_series)
- Returns
out – Output of shape (batch_size, summary_dim)
- Return type
tf.Tensor
bayesflow.trainers module
- class bayesflow.trainers.Trainer(amortizer, generative_model=None, configurator=None, checkpoint_path=None, max_to_keep=3, default_lr=0.001, skip_checks=False, memory=True, **kwargs)[source]
Bases:
object
This class connects a generative model (or, already simulated data from a model) with a configurator and a neural inference architecture for amortized inference (amortizer). A Trainer instance is responsible for optimizing the amortizer via various forms of simulation-based training.
At the very minimum, the trainer must be initialized with an amortizer instance, which is capable of processing the (configured) outputs of a generative model. A configurator will then process the outputs of the generative model and convert them into suitable inputs for the amortizer. Users can choose from a palette of default configurators or create their own, essentially building a modularized pipeline GenerativeModel -> Configurator -> Amortizer. Most complex models will require custom configurators.
Currently, the trainer supports the following simulation-based training regimes, based on efficiency considerations:
- Online training
Usage: >>> trainer.train_online(epochs, iterations_per_epoch, batch_size, **kwargs)
This training regime is optimal for fast generative models which can efficiently simulate data on the fly. In order for this training regime to be efficient, on-the-fly batch simulations should not take longer than 2-3 seconds.
- Experience replay training
Usage: >>> trainer.train_experience_replay(epochs, iterations_per_epoch, batch_size, **kwargs)
This training regime is also suited for fast generative models capable of efficiently simulating data on the fly. Compared to pure online training, this regime keeps an experience replay buffer from which simulations are randomly sampled, so the networks will likely see some simulations multiple times.
- Round-based training
Usage: >>> trainer.train_rounds(rounds, sim_per_round, epochs, batch_size, **kwargs)
This training regime is optimal for slow, but still reasonably performant generative models. In order for this training regime to be efficient, on-the-fly batch simulations should not take longer than 2-3 minutes.
Important: overfitting presents a danger when using small numbers of simulated data sets, so it is recommended to use some amount of regularization for the neural amortizer(s).
- Offline training
Usage: >>> trainer.train_offline(simulations_dict, epochs, batch_size, **kwargs)
This training regime is optimal for very slow, external simulators, which take several minutes for a single simulation. It assumes that all training data has been already simulated and stored on disk.
Important: overfitting presents a danger when using a small simulated data set, so it is recommended to use some amount of regularization for the neural amortizer(s).
Note: For extremely slow simulators (i.e., more than an hour for a single simulation), the BayesFlow framework might not be the ideal choice and should probably be considered in combination with a black-box surrogate optimization method, such as Bayesian optimization.
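Example (a sketch of the full pipeline for online training): model is a GenerativeModel as in the toy example above, amortizer is a placeholder for a pre-constructed amortizer (e.g., an AmortizedPosterior), and the default configurator is assumed:
>>> from bayesflow.trainers import Trainer
>>> trainer = Trainer(
...     amortizer=amortizer,                          # placeholder for a pre-built amortizer
...     generative_model=model,
...     checkpoint_path="checkpoints/toy_gaussian",   # hypothetical; enables checkpointing
... )
>>> losses = trainer.train_online(epochs=10, iterations_per_epoch=500, batch_size=32)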
- diagnose_latent2d(inputs=None, **kwargs)[source]
Performs visual pre-inference diagnostics of the latent space on either provided validation data (new simulations) or internal simulation memory. If inputs is not None, then diagnostics will be performed on the inputs, regardless of whether the simulation_memory of the trainer is empty or not. If inputs is None, then the trainer will try to access its memory or raise a ConfigurationError.
- Parameters
inputs (None, list or dict, optional (default - None)) – The optional inputs to use
**kwargs (dict, optional) – Optional keyword arguments, which could be:
conf_args - optional keyword arguments passed to the configurator
net_args - optional keyword arguments passed to the amortizer
plot_args - optional keyword arguments passed to plot_latent_space_2d
- Returns
losses – A dictionary storing the losses across epochs and iterations
- Return type
dict(ep_num : list(losses))
- diagnose_sbc_histograms(inputs=None, n_samples=None, **kwargs)[source]
Performs visual pre-inference diagnostics via simulation-based calibration (SBC) on either provided validation data (new simulations) or internal simulation memory. If inputs is not None, then diagnostics will be performed on the inputs, regardless of whether the simulation_memory of the trainer is empty or not. If inputs is None, then the trainer will try to access its memory or raise a ConfigurationError.
- Parameters
inputs (None, list or dict, optional (default - None)) – The optional inputs to use
n_samples (int, optional (default - None)) – The number of posterior samples to draw for each simulated data set. If None, the number will be heuristically determined so n_sim / n_draws ~= 20
**kwargs (dict, optional) – Optional keyword arguments, which could be:
conf_args - optional keyword arguments passed to the configurator
net_args - optional keyword arguments passed to the amortizer
plot_args - optional keyword arguments passed to plot_sbc
- Returns
losses – A dictionary storing the losses across epochs and iterations
- Return type
dict(ep_num : list(losses))
- load_pretrained_network()[source]
Attempts to load a pre-trained network if checkpoint path is provided and a checkpoint manager exists.
- train_experience_replay(epochs, iterations_per_epoch, batch_size, save_checkpoint=True, optimizer=None, reuse_optimizer=False, buffer_capacity=1000, optional_stopping=True, use_autograph=True, **kwargs)[source]
Trains the network(s) via experience replay using a memory replay buffer, as utilized in reinforcement learning. Additional keyword arguments are passed to the generative model, configurator, and amortizer. See the parameter list below for details.
- Parameters
epochs (int) – Number of epochs (and number of times a checkpoint is stored)
iterations_per_epoch (int) – Number of batch simulations to perform per epoch
batch_size (int) – Number of simulations to perform at each backpropagation step.
save_checkpoint (bool, optional, default: True) – A flag to decide whether to save checkpoints after each epoch, if a checkpoint_path was provided during initialization, otherwise ignored.
optimizer (tf.keras.optimizer.Optimizer or None) – Optimizer for the neural network. None will result in tf.keras.optimizers.Adam using a learning rate of 5e-4 and a cosine decay from 5e-4 to 0. A custom optimizer will override default learning rate and schedule settings.
reuse_optimizer (bool, optional, default: False) – A flag indicating whether the optimizer instance should be treated as persistent or not. If False, the optimizer and its states are not stored after training has finished. Otherwise, the optimizer will be stored as self.optimizer and re-used in further training runs.
buffer_capacity (int, optional, default: 1000) – Max number of batches to store in the buffer. For instance, if batch_size=32 and buffer_capacity=1000, then the buffer will hold a maximum of 32 * 1000 = 32000 simulations. Be careful with memory!
optional_stopping (bool, optional, default: True) – Whether to use optional stopping or not during training. Could speed up training.
use_autograph (bool, optional, default: True) – Whether to use autograph for the backprop step. Could lead to enormous speed-ups but could also be harder to debug. Important! Argument will be ignored if the buffer has previously been initialized!
**kwargs (dict, optional, default: {}) – Optional keyword arguments, which can be:
model_args - optional keyword arguments passed to the generative model
conf_args - optional keyword arguments passed to the configurator
net_args - optional keyword arguments passed to the amortizer
- Returns
losses – A dictionary or a data frame storing the losses across epochs and iterations.
- Return type
dict or pandas.DataFrame
- train_from_presimulation(presimulation_path, optimizer, save_checkpoint=True, max_epochs=None, reuse_optimizer=False, custom_loader=None, optional_stopping=True, use_autograph=True, **kwargs)[source]
Trains an amortizer via a modified form of offline training.
Like regular offline training, it assumes that parameters, data and optional context have already been simulated (i.e., forward inference has been performed).
Also like regular offline training, it is faster than online training in scenarios where simulations are slow. Unlike regular offline training, it uses each batch from the presimulated dataset only once during training. A larger presimulated dataset is therefore required than for offline training, and the increase in speed gained by loading simulations instead of generating them on the fly comes at a cost: a large presimulated dataset takes up a large amount of hard drive space.
- Parameters
presimulation_path (str) –
File path to the folder containing the files from the precomputed simulation. Ideally generated using a GenerativeModel’s presimulate_and_save method, otherwise must match the structure produced by that method:
Each file contains the data for one epoch (i.e. a number of batches), and must be compatible with the custom_loader provided. The custom_loader must read each file into a collection (either a dictionary or a list) of simulation_dict objects. This is easily achieved with the pickle library: if the files were generated from collections of simulation_dict objects using pickle.dump, the _default_loader (default for custom_loader) will load them using pickle.load. Training parameters like number of iterations and batch size are inferred from the files during training.
optimizer (tf.keras.optimizer.Optimizer) – Optimizer for the neural network training. Since for this training, it is impossible to guess the number of iterations beforehand, an optimizer must be provided.
save_checkpoint (bool, optional, default: True) – Determines whether to save checkpoints after each epoch, if a checkpoint_path was provided during initialization, otherwise ignored.
max_epochs (int or None, optional, default: None) – An optional parameter to limit the number of epochs.
reuse_optimizer (bool, optional, default: False) – A flag indicating whether the optimizer instance should be treated as persistent or not. If False, the optimizer and its states are not stored after training has finished. Otherwise, the optimizer will be stored as self.optimizer and re-used in further training runs.
custom_loader (callable, optional, default: self._default_loader) – Must take a string file_path as input and output a collection (dictionary or list) of simulation_dict objects. A simulation_dict has the keys prior_non_batchable_context, prior_batchable_context, prior_draws, sim_non_batchable_context, sim_batchable_context, and sim_data. prior_draws and sim_data must have actual data as values, the rest are optional.
optional_stopping (bool, optional, default: False) – Whether to use optional stopping or not during training. Could speed up training.
use_autograph (bool, optional, default: True) – Whether to use autograph for the backprop step. Could lead to enormous speed-ups but could also be harder to debug.
**kwargs (dict, optional) – Optional keyword arguments, which can be:
conf_args - optional keyword arguments passed to the configurator
net_args - optional keyword arguments passed to the amortizer
- Returns
losses – A dictionary or a data frame storing the losses across epochs and iterations
- Return type
dict or pandas.DataFrame
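Example (a sketch), assuming the presimulation folder was created with presimulate_and_save as shown earlier; since the number of iterations cannot be inferred in advance, an optimizer must be supplied explicitly:
>>> import tensorflow as tf
>>> optimizer = tf.keras.optimizers.Adam(learning_rate=5e-4)
>>> losses = trainer.train_from_presimulation(
...     presimulation_path="presimulations/toy_gaussian",   # hypothetical folder
...     optimizer=optimizer,
...     max_epochs=20,
... )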
- train_offline(simulations_dict, epochs, batch_size, save_checkpoint=True, optimizer=None, reuse_optimizer=False, optional_stopping=True, use_autograph=True, **kwargs)[source]
Trains an amortizer via offline learning. Assumes that parameters, data, and optional context have already been simulated (i.e., forward inference has been performed).
- Parameters
simulations_dict (dict) – A dictionary containing the simulated data / context. If using the default keys, the method expects at least the mandatory keys sim_data and prior_draws to be present.
epochs (int) – Number of epochs (and number of times a checkpoint is stored)
batch_size (int) – Number of simulations to perform at each backpropagation step
save_checkpoint (bool (default - True)) – Determines whether to save checkpoints after each epoch, if a checkpoint_path was provided during initialization, otherwise ignored.
optimizer (tf.keras.optimizer.Optimizer or None) – Optimizer for the neural network.
None will result in tf.keras.optimizers.Adam using a learning rate of 5e-4 and a cosine decay from 5e-4 to 0. A custom optimizer will override default learning rate and schedule settings.
reuse_optimizer (bool, optional, default: False) – A flag indicating whether the optimizer instance should be treated as persistent or not. If False, the optimizer and its states are not stored after training has finished. Otherwise, the optimizer will be stored as self.optimizer and re-used in further training runs.
optional_stopping (bool, optional, default: False) – Whether to use optional stopping or not during training. Could speed up training.
use_autograph (bool, optional, default: True) – Whether to use autograph for the backprop step. Could lead to enormous speed-ups but could also be harder to debug.
**kwargs (dict, optional) – Optional keyword arguments, which can be:
model_args - optional keyword arguments passed to the generative model
conf_args - optional keyword arguments passed to the configurator
net_args - optional keyword arguments passed to the amortizer
- Returns
losses – A dictionary or a data frame storing the losses across epochs and iterations
- Return type
dict or pandas.DataFrame
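Example (a sketch): the arrays below are random placeholders that merely illustrate the mandatory sim_data and prior_draws keys and their expected alignment along the first axis:
>>> import numpy as np
>>> simulations_dict = {
...     "prior_draws": np.random.randn(5000, 4),    # 5000 draws of a 4-dimensional parameter vector
...     "sim_data": np.random.randn(5000, 100, 2),  # 5000 data sets of 100 bivariate observations
... }
>>> losses = trainer.train_offline(simulations_dict, epochs=30, batch_size=64)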
- train_online(epochs, iterations_per_epoch, batch_size, save_checkpoint=True, optimizer=None, reuse_optimizer=False, optional_stopping=True, use_autograph=True, **kwargs)[source]
Trains an amortizer via online learning. Additional keyword arguments are passed to the generative model, configurator, and amortizer.
- Parameters
epochs (int) – Number of epochs (and number of times a checkpoint is stored)
iterations_per_epoch (int) – Number of batch simulations to perform per epoch
batch_size (int) – Number of simulations to perform at each backprop step
save_checkpoint (bool (default - True)) – A flag to decide whether to save checkpoints after each epoch, if a checkpoint_path was provided during initialization, otherwise ignored.
optimizer (tf.keras.optimizer.Optimizer or None) – Optimizer for the neural network.
None will result in tf.keras.optimizers.Adam using a learning rate of 5e-4 and a cosine decay from 5e-4 to 0. A custom optimizer will override default learning rate and schedule settings.
reuse_optimizer (bool, optional, default: False) – A flag indicating whether the optimizer instance should be treated as persistent or not. If False, the optimizer and its states are not stored after training has finished. Otherwise, the optimizer will be stored as self.optimizer and re-used in further training runs.
optional_stopping (bool, optional, default: False) – Whether to use optional stopping or not during training. Could speed up training.
use_autograph (bool, optional, default: True) – Whether to use autograph for the backprop step. Could lead to enormous speed-ups but could also be harder to debug.
**kwargs (dict, optional) – Optional keyword arguments, which can be:
model_args - optional keyword arguments passed to the generative model
conf_args - optional keyword arguments passed to the configurator
net_args - optional keyword arguments passed to the amortizer
- Returns
losses – A dictionary storing the losses across epochs and iterations
- Return type
dict or pandas.DataFrame
- train_rounds(rounds, sim_per_round, epochs, batch_size, save_checkpoint=True, optimizer=None, reuse_optimizer=False, optional_stopping=True, use_autograph=True, **kwargs)[source]
Trains an amortizer via round-based learning. In each round, sim_per_round data sets are simulated from the generative model and added to the data sets simulated in previous rounds. Then, the networks are trained for epochs on the augmented set of data sets.
Important: Training time will increase from round to round, since the number of simulations increases correspondingly. The final round will then train the networks on rounds * sim_per_round data sets, so make sure this number does not eat up all available memory.
- Parameters
rounds (int) – Number of rounds to perform (outer loop)
sim_per_round (int) – Number of simulations per round
epochs (int) – Number of epochs (and number of times a checkpoint is stored, inner loop) within a round.
batch_size (int) – Number of simulations to use at each backpropagation step
save_checkpoint (bool, optional, (default - True)) – A flag to decide whether to save checkpoints after each epoch, if a checkpoint_path was provided during initialization, otherwise ignored.
optimizer (tf.keras.optimizer.Optimizer or None) – Optimizer for the neural network training.
None will result in tf.keras.optimizers.Adam using a learning rate of 5e-4 and a cosine decay from 5e-4 to 0. A custom optimizer will override default learning rate and schedule settings.
reuse_optimizer (bool, optional, default: False) – A flag indicating whether the optimizer instance should be treated as persistent or not. If False, the optimizer and its states are not stored after training has finished. Otherwise, the optimizer will be stored as self.optimizer and re-used in further training runs.
optional_stopping (bool, optional, default: False) – Whether to use optional stopping or not during training. Could speed up training.
use_autograph (bool, optional, default: True) – Whether to use autograph for the backprop step. Could lead to enormous speed-ups but could also be harder to debug.
**kwargs (dict, optional) – Optional keyword arguments, which can be:
model_args - optional keyword arguments passed to the generative model
conf_args - optional keyword arguments passed to the configurator
net_args - optional keyword arguments passed to the amortizer
- Returns
losses – A dictionary or a data frame storing the losses across epochs and iterations
- Return type
dict or pandas.DataFrame
bayesflow.version module
bayesflow.wrappers module
- class bayesflow.wrappers.SpectralNormalization(*args, **kwargs)[source]
Bases:
Wrapper
Performs spectral normalization on neural network weights. Adapted from:
https://www.tensorflow.org/addons/api_docs/python/tfa/layers/SpectralNormalization
This wrapper controls the Lipschitz constant of a layer by constraining its spectral norm, which can stabilize the training of generative networks.
See [Spectral Normalization for Generative Adversarial Networks](https://arxiv.org/abs/1802.05957).
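Example (a sketch) of wrapping a standard Keras layer, mirroring the tfa.layers.SpectralNormalization usage pattern this class is adapted from:
>>> import tensorflow as tf
>>> from bayesflow.wrappers import SpectralNormalization
>>> sn_dense = SpectralNormalization(tf.keras.layers.Dense(64, activation="relu"))
>>> x = tf.random.normal((32, 16))     # batch of 32 inputs with 16 features
>>> y = sn_dense(x, training=True)     # spectral norm estimate is updated during training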
- call(inputs, training=False)[source]
Call Layer
- Parameters
inputs (tf.Tensor of shape (None,...,condition_dim + target_dim)) – The inputs to the corresponding layer.
- get_config()[source]
Returns the config of the layer.
A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.
The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).
Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.
- Returns
Python dictionary.