bayesflow package

Subpackages

Submodules

bayesflow.amortizers module

class bayesflow.amortizers.AmortizedLikelihood(*args, **kwargs)[source]

Bases: Model, AmortizedTarget

An interface for a surrogate model of a simulator, or an implicit likelihood p(data | parameters, context).

call(input_dict, **kwargs)[source]

Performs a forward pass through the optional summary network and the surrogate network.

Parameters

input_dict (dict) – Input dictionary containing the following mandatory keys: observables - the observables over which a conditional density is learned (i.e., the data); conditions - the conditioning variables that are directly passed to the inference network

Returns

the outputs of surrogate_net(theta, summary_net(x, c_s), c_d), usually a latent variable and log(det(Jacobian)), that is a tuple (z, log_det_J).

Return type

net_out

call_loop(input_list, **kwargs)[source]

Performs a forward pass through the surrogate network given a list of dicts with the appropriate entries (i.e., as used for the standard call method).

This method is useful when GPU memory is limited or data sets have a different (non-Tensor) structure.

Parameters

input_list (list of dicts, where each dict contains the following mandatory keys, if DEFAULT_KEYS unchanged:) – observables - the observables over which a conditional density is learned (i.e., the data); conditions - the conditioning variables that are directly passed to the inference network

Returns

net_out – the outputs of surrogate_net(theta, summary_net(x, c_s), c_d), usually a latent variable and log(det(Jacobian)), that is a tuple (z, log_det_J).

Return type

tuple of tf.Tensor

compute_loss(input_dict, **kwargs)[source]

Computes the loss of the amortized likelihood given input data provided in input_dict.

Parameters

input_dict (dict) – Input dictionary containing the following mandatory keys: data - the observables over which a conditional density is learned (i.e., the observables); conditions - the conditioning variables that are directly passed to the inference network

Returns

loss

Return type

tf.Tensor of shape (1,) - the total computed loss given input variables

log_likelihood(input_dict, to_numpy=True, **kwargs)[source]

Calculates the approximate log-likelihood of targets given conditional variables via the change-of-variable formula for a conditional normalizing flow.

Parameters
  • input_dict (dict) – Input dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged: observables - the variables over which a conditional density is learned (i.e., the observables); conditions - the conditioning variables that are directly passed to the inference network

  • to_numpy (bool, optional, default: True) – Boolean flag indicating whether to return the log-likelihood values as a np.array or a tf.Tensor

Returns

log_lik – the approximate log-likelihood of each data point in each data set

Return type

tf.Tensor of shape (batch_size, n_obs)

log_prob(input_dict, to_numpy=True, **kwargs)[source]

Identical to log_likelihood(input_dict, to_numpy, **kwargs).

sample(input_dict, n_samples, to_numpy=True, **kwargs)[source]

Generates n_samples random draws from the surrogate likelihood given input conditions.

Parameters
  • input_dict (dict) – Input dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged: conditions - the conditioning variables that are directly passed to the inference network

  • n_samples (int) – The number of samples to obtain from the surrogate likelihood

  • to_numpy (bool, optional, default: True) – Flag indicating whether to return the samples as a np.array or a tf.Tensor

Returns

lik_samples – Simulated batch of observables from the surrogate likelihood.

Return type

tf.Tensor or np.ndarray of shape (n_data_sets, n_samples, None)

sample_loop(input_list, n_samples, to_numpy=True, **kwargs)[source]

Generates random draws from the surrogate network given a list of dicts with conditional variables. Useful when GPU memory is limited or data sets have a different (non-Tensor) structure.

Parameters
  • input_list (list of dictionaries, each dictionary having the following mandatory keys, if DEFAULT_KEYS unchanged:) – conditions - the conditioning variables that are directly passed to the inference network

  • n_samples (int) – The number of draws (samples) to obtain from the surrogate likelihood

  • to_numpy (bool, optional, default: True) – Flag indicating whether to return the samples as a np.array or a tf.Tensor

  • **kwargs (dict, optional) – Additional keyword arguments passed to the networks

Returns

lik_samples – the simulated observables per data set

Return type

tf.Tensor or np.ndarray of shape (n_data_sets, n_samples, data_dim)
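
Example

A minimal usage sketch, assuming a trained surrogate network surrogate_net (e.g., an InvertibleNetwork) and arrays x (observables) and theta (conditions); the constructor argument is illustrative, not prescribed by this page:

>>> amortizer = AmortizedLikelihood(surrogate_net)
>>> log_lik = amortizer.log_likelihood({'observables': x, 'conditions': theta})
>>> x_sim = amortizer.sample({'conditions': theta}, n_samples=500)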

class bayesflow.amortizers.AmortizedModelComparison(*args, **kwargs)[source]

Bases: Model

An interface to connect an evidential network for Bayesian model comparison with an optional summary network, as described in the original paper on evidential neural networks for model comparison:

[1] Radev, S. T., D’Alessandro, M., Mertens, U. K., Voss, A., Köthe, U., & Bürkner, P. C. (2021). Amortized Bayesian model comparison with evidential deep learning. IEEE Transactions on Neural Networks and Learning Systems.

Note: the original paper does not distinguish between the summary and the evidential networks, but treats them as a whole, with the appropriate architecture dictated by the model application. For the sake of consistency, the BayesFlow library distinguishes the two modules.

compute_loss(input_dict, **kwargs)[source]

Computes the loss of the amortized model comparison instance.

Parameters

input_dict (dict) –

Input dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged:

summary_conditions - the conditioning variables that are first passed through a summary network; direct_conditions - the conditioning variables that are directly passed to the evidential network; model_indices - the ground-truth, one-hot encoded model indices sampled from the model prior

Returns

total_loss

Return type

tf.Tensor of shape (1,) - the total computed loss given input variables

evidence(input_dict, to_numpy=True, **kwargs)[source]

Computes the evidence for the competing models given the data sets contained in input_dict.

sample(input_dict, to_numpy=True, **kwargs)[source]

Samples posterior model probabilities from the higher order Dirichlet density.

Parameters
  • input_dict (dict) – Input dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged: summary_conditions - the conditioning variables that are first passed through a summary network; direct_conditions - the conditioning variables that are directly passed to the evidential network; model_indices - the ground-truth, one-hot encoded model indices sampled from the model prior

  • n_samples (int) – Number of samples to obtain from the approximate posterior

  • to_numpy (bool, default: True) – Flag indicating whether to return the samples as a np.array or a tf.Tensor

Returns

pm_samples – The posterior draws from the Dirichlet distribution, shape (num_samples, num_batch, num_models)

Return type

tf.Tensor or np.array

uncertainty_score(input_dict, to_numpy=True, **kwargs)[source]

Computes the uncertainty score according to sum(alphas) / num_models.
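
Example

A hedged usage sketch, assuming a trained evidential_net and summary_net; the constructor arguments and the sim_data variable are illustrative:

>>> amortizer = AmortizedModelComparison(evidential_net, summary_net)
>>> evidences = amortizer.evidence({'summary_conditions': sim_data})
>>> pm_samples = amortizer.sample({'summary_conditions': sim_data})
>>> scores = amortizer.uncertainty_score({'summary_conditions': sim_data})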

class bayesflow.amortizers.AmortizedPosterior(*args, **kwargs)[source]

Bases: Model, AmortizedTarget

A wrapper to connect an inference network for parameter estimation with an optional summary network as in the original BayesFlow set-up described in the paper:

[1] Radev, S. T., Mertens, U. K., Voss, A., Ardizzone, L., & Köthe, U. (2020). BayesFlow: Learning complex stochastic models with invertible neural networks. IEEE Transactions on Neural Networks and Learning Systems.

But also allowing for augmented functionality, such as model misspecification detection in summary space:

[2] Schmitt, M., Bürkner, P. C., Köthe, U., & Radev, S. T. (2022). Detecting model misspecification in amortized Bayesian inference with neural networks. arXiv preprint arXiv:2112.08866.

And learning of fat-tailed posteriors with a Student-t latent pushforward density:

[3] Jaini, P., Kobyzev, I., Yu, Y., & Brubaker, M. (2020, November). Tails of Lipschitz triangular flows. In International Conference on Machine Learning (pp. 4673-4681). PMLR.

Serves as an interface for learning p(parameters | data, context).

call(input_dict, return_summary=False, **kwargs)[source]

Performs a forward pass through the summary and inference network.

Parameters
  • input_dict (dict) – Input dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged: parameters - the latent model parameters over which a conditional density is learned; summary_conditions - the conditioning variables (including data) that are first passed through a summary network; direct_conditions - the conditioning variables that are directly passed to the inference network

  • return_summary (bool, optional, default: False) – A flag which determines whether the learnable data summaries (representations) are returned or not.

Returns

net_out or (net_out, summary_out) – the outputs of inference_net(theta, summary_net(x, c_s), c_d), usually a latent variable and log(det(Jacobian)), that is a tuple (z, log_det_J) or (sum_outputs, (z, log_det_J)) if return_summary is set to True and a summary network is defined.

Return type

tuple of tf.Tensor

call_loop(input_list, return_summary=False, **kwargs)[source]

Performs a forward pass through the summary and inference network given a list of dicts with the appropriate entries (i.e., as used for the standard call method).

This method is useful when GPU memory is limited or data sets have a different (non-Tensor) structure.

Parameters
  • input_list (list of dicts, where each dict contains the following mandatory keys, if DEFAULT_KEYS unchanged:) – parameters - the latent model parameters over which a conditional density is learned; summary_conditions - the conditioning variables (including data) that are first passed through a summary network; direct_conditions - the conditioning variables that are directly passed to the inference network

  • return_summary (bool, optional, default: False) – A flag which determines whether the learnable data summaries (representations) are returned or not.

Returns

net_out or (net_out, summary_out) – the outputs of inference_net(theta, summary_net(x, c_s), c_d), usually a latent variable and log(det(Jacobian)), that is a tuple (z, log_det_J) or (sum_outputs, (z, log_det_J)) if return_summary is set to True and a summary network is defined.

Return type

tuple of tf.Tensor

compute_loss(input_dict, **kwargs)[source]

Computes the loss of the posterior amortizer given an input dictionary.

Parameters

input_dict (dict) – Input dictionary containing the following mandatory keys: parameters - the latent model parameters over which a conditional density is learned; summary_conditions - the conditioning variables that are first passed through a summary network; direct_conditions - the conditioning variables that are directly passed to the inference network

Returns

total_loss

Return type

tf.Tensor of shape (1,) - the total computed loss given input variables

log_posterior(input_dict, to_numpy=True, **kwargs)[source]

Calculates the approximate log-posterior of targets given conditional variables via the change-of-variable formula for a conditional normalizing flow.

Parameters
  • input_dict (dict) – Input dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged: parameters - the latent model parameters over which a conditional density (i.e., a posterior) is learned; summary_conditions - the conditioning variables (including data) that are first passed through a summary network; direct_conditions - the conditioning variables that are directly passed to the inference network

  • to_numpy (bool, optional, default: True) – Flag indicating whether to return the lpdf values as a np.array or a tf.Tensor

Returns

log_post – the approximate log-posterior density of each parameter vector

Return type

tf.Tensor of shape (batch_size, n_obs)

log_prob(input_dict, to_numpy=True, **kwargs)[source]

Identical to log_posterior(input_dict, to_numpy, **kwargs).

sample(input_dict, n_samples, to_numpy=True, **kwargs)[source]

Generates random draws from the approximate posterior given a dictionary with conditional variables.

Parameters
  • input_dict (dict) – Input dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged: summary_conditions - the conditioning variables (including data) that are first passed through a summary network; direct_conditions - the conditioning variables that are directly passed to the inference network

  • n_samples (int) – The number of posterior draws (samples) to obtain from the approximate posterior

  • to_numpy (bool, optional, default: True) – Flag indicating whether to return the samples as a np.array or a tf.Tensor.

  • **kwargs (dict, optional) – Additional keyword arguments passed to the networks

Returns

post_samples – the sampled parameters per data set

Return type

tf.Tensor or np.ndarray of shape (n_data_sets, n_samples, n_params)

sample_loop(input_list, n_samples, to_numpy=True, **kwargs)[source]

Generates random draws from the approximate posterior given a list of dicts with conditional variables. Useful when GPU memory is limited or data sets have a different (non-Tensor) structure.

Parameters
  • input_list (list of dictionaries, each dictionary having the following mandatory keys, if DEFAULT_KEYS unchanged:) – summary_conditions - the conditioning variables (including data) that are first passed through a summary network; direct_conditions - the conditioning variables that are directly passed to the inference network

  • n_samples (int) – The number of posterior draws (samples) to obtain from the approximate posterior

  • to_numpy (bool, optional, default: True) – Flag indicating whether to return the samples as a np.ndarray or a tf.Tensor

  • **kwargs (dict, optional) – Additional keyword arguments passed to the networks

Returns

post_samples – the sampled parameters per data set

Return type

tf.Tensor or np.ndarray of shape (n_data_sets, n_samples, n_params)
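
Example

A minimal usage sketch, assuming a trained inference_net and summary_net as well as arrays theta (parameters), sim_data, and obs_data; the constructor arguments are illustrative, and further keys (e.g., direct_conditions) may be required by your configuration:

>>> amortizer = AmortizedPosterior(inference_net, summary_net)
>>> z, log_det_J = amortizer({'parameters': theta, 'summary_conditions': sim_data})
>>> post_draws = amortizer.sample({'summary_conditions': obs_data}, n_samples=1000)
>>> log_post = amortizer.log_posterior({'parameters': theta, 'summary_conditions': obs_data})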

class bayesflow.amortizers.AmortizedPosteriorLikelihood(*args, **kwargs)[source]

Bases: Model, AmortizedTarget

An interface for jointly learning a surrogate model of the simulator and an approximate posterior given a generative model.

call(input_dict, **kwargs)[source]

Performs a forward pass through both amortizers.

Parameters

input_dict (dict) – Input dictionary containing the following mandatory keys: posterior_inputs - The input dictionary for the amortized posterior; likelihood_inputs - The input dictionary for the amortized likelihood

Returns

(post_out, lik_out) – The outputs of the posterior and likelihood networks given input variables.

Return type

tuple

compute_loss(input_dict, **kwargs)[source]

Computes the loss of the joint amortizer by summing the corresponding amortized posterior and likelihood losses.

Parameters

input_dict (dict) –

Nested input dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged:

posterior_inputs - The input dictionary for the amortized posterior; likelihood_inputs - The input dictionary for the amortized likelihood

Returns

total_losses – A dictionary with keys Post.Loss and Lik.Loss containing the individual losses for the two amortizers.

Return type

dict

log_likelihood(input_dict, to_numpy=True, **kwargs)[source]

Calculates the approximate log-likelihood of data given conditional variables via the change-of-variable formula for conditional normalizing flows.

Parameters
  • input_dict (dict) –

    Input dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged:

    observables - the variables over which a conditional density is learned (i.e., the observables); conditions - the conditioning variables that are directly passed to the inference network

    OR a nested dictionary with key likelihood_inputs containing the above input dictionary

  • to_numpy (bool, optional, default: True) – Flag indicating whether to return the samples as a np.array or a tf.Tensor

Returns

log_lik – the approximate log-likelihood of each data point in each data set

Return type

tf.Tensor of shape (batch_size, n_obs)

log_posterior(input_dict, to_numpy=True, **kwargs)[source]

Calculates the approximate log-posterior of targets given conditional variables via the change-of-variable formula for conditional normalizing flows.

Parameters

input_dict (dict) –

Input dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged:

parameters - the latent generative model parameters over which a conditional density is learned; summary_conditions - the conditioning variables that are first passed through a summary network; direct_conditions - the conditioning variables that are directly passed to the inference network

OR a nested dictionary with key posterior_inputs containing the above input dictionary

Returns

log_post – the approximate log-posterior of each parameter vector in each data set

Return type

tf.Tensor of shape (batch_size, n_obs)

log_prob(input_dict, to_numpy=True, **kwargs)[source]

Identical to calling separate log_likelihood() and log_posterior().

Returns

out_dict – dict with keys log_posterior and log_likelihood corresponding to the computed log_pdfs of the approximate posterior and likelihood.

sample(input_dict, n_post_samples, n_lik_samples, to_numpy=True, **kwargs)[source]

Identical to calling sample_parameters() and sample_data() separately.

Returns

out_dict – dict with keys posterior_samples and likelihood_samples corresponding to the n_samples from the approximate posterior and likelihood, respectively.

sample_data(input_dict, n_samples, to_numpy=True, **kwargs)[source]

Generates n_samples random draws from the surrogate likelihood given input conditions.

Parameters
  • input_dict (dict) –

    Input dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged:

    conditions - the conditioning variables that are directly passed to the inference network

    OR a nested dictionary with key likelihood_inputs containing the above input dictionary

  • n_samples (int) – The number of samples to obtain from the surrogate likelihood

  • to_numpy (bool, optional, default: True) – Flag indicating whether to return the samples as a np.array or a tf.Tensor

Returns

lik_samples – Simulated observables from the surrogate likelihood.

Return type

tf.Tensor or np.ndarray of shape (n_data_sets, n_samples, None)

sample_parameters(input_dict, n_samples, to_numpy=True, **kwargs)[source]

Generates random draws from the approximate posterior given conditional variables.

Parameters
  • input_dict (dict) –

    Input dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged:

    summary_conditions - the conditioning variables (including data) that are first passed through a summary network; direct_conditions - the conditioning variables that are directly passed to the inference network

    OR a nested dictionary with key posterior_inputs containing the above input dictionary

  • n_samples (int) – The number of posterior samples to obtain from the approximate posterior

  • to_numpy (bool, optional, default: True) – Boolean flag indicating whether to return the samples as a np.array or a tf.Tensor

Returns

post_samples – the sampled parameters per data set

Return type

tf.Tensor or np.ndarray of shape (n_data_sets, n_samples, n_params)
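
Example

A hedged sketch of the joint amortizer, assuming pre-built posterior and likelihood amortizers and pre-configured nested input dictionaries; all variable names are illustrative:

>>> joint = AmortizedPosteriorLikelihood(posterior_amortizer, likelihood_amortizer)
>>> input_dict = {'posterior_inputs': post_inputs, 'likelihood_inputs': lik_inputs}
>>> losses = joint.compute_loss(input_dict)
>>> out = joint.sample(input_dict, n_post_samples=1000, n_lik_samples=50)
>>> theta_draws, x_draws = out['posterior_samples'], out['likelihood_samples']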

class bayesflow.amortizers.AmortizedTarget(*args, **kwargs)[source]

Bases: ABC

An abstract interface for an amortized learned distribution. Children should implement the following public methods:

  1. compute_loss(self, input_dict, **kwargs)

  2. sample(input_dict, **kwargs)

  3. log_prob(input_dict, **kwargs)

abstract compute_loss(input_dict, **kwargs)[source]
abstract log_prob(**kwargs)[source]
abstract sample(**kwargs)[source]
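
Example

A minimal sketch of a conforming child class; the method bodies are placeholders and only the required interface is shown:

>>> class MyAmortizer(AmortizedTarget):
...     def compute_loss(self, input_dict, **kwargs):
...         ...  # return a scalar loss tensor
...     def sample(self, input_dict, **kwargs):
...         ...  # return draws from the learned distribution
...     def log_prob(self, input_dict, **kwargs):
...         ...  # return log-density values
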
class bayesflow.amortizers.SingleModelAmortizer(*args, **kwargs)[source]

Bases: AmortizedPosterior

Deprecated class for amortized posterior estimation.

bayesflow.computational_utilities module

bayesflow.computational_utilities.expected_calibration_error(m_true, m_pred, n_bins=15)[source]

Estimates the calibration error of a model comparison network.

Important

Make sure that m_true are one-hot encoded classes!

Parameters
  • m_true (np.array or list) – True model indices

  • m_pred (np.array or list) – Predicted model probabilities

  • n_bins (int, default: 15) – Number of bins for the calibration estimate

Return type

#TODO

bayesflow.computational_utilities.gaussian_kernel_matrix(x, y, sigmas=None)[source]

Computes a Gaussian radial basis function (RBF) kernel matrix between the samples of x and y.

We create a sum of multiple Gaussian kernels each having a width \(\sigma_i\).

Parameters
  • x (tf.Tensor of shape (num_draws_x, num_features)) – Comprises num_draws_x random draws from the “source” distribution P.

  • y (tf.Tensor of shape (num_draws_y, num_features)) – Comprises num_draws_y random draws from the “target” distribution Q.

  • sigmas (list(float), optional, default: None) – List which denotes the widths of each of the Gaussians in the kernel. If sigmas is None, a default range will be used, contained in bayesflow.default_settings.MMD_BANDWIDTH_LIST

Returns

kernel – The kernel matrix between pairs from x and y.

Return type

tf.Tensor of shape (num_draws_x, num_draws_y)

bayesflow.computational_utilities.get_coverage_probs(z, u)[source]

Vectorized function to compute the minimal coverage probability for uniform ECDFs given evaluation points z and a matrix of simulated samples u.

Parameters
  • z (np.ndarray of shape (num_points, )) – The vector of evaluation points.

  • u (np.ndarray of shape (num_simulations, num_samples)) – The matrix of simulated draws (samples) from U(0, 1)

bayesflow.computational_utilities.inverse_multiquadratic_kernel_matrix(x, y, sigmas=None)[source]

Computes an inverse multiquadratic RBF between the samples of x and y.

We create a sum of multiple IM-RBF kernels each having a width \(\sigma_i\).

Parameters
  • x (tf.Tensor of shape (num_draws_x, num_features)) – Comprises num_draws_x random draws from the “source” distribution P.

  • y (tf.Tensor of shape (num_draws_y, num_features)) – Comprises num_draws_y random draws from the “target” distribution Q.

  • sigmas (list(float), optional, default: None) – List which denotes the widths of each of the kernels. If sigmas is None, a default range will be used, contained in bayesflow.default_settings.MMD_BANDWIDTH_LIST

Returns

kernel – The kernel matrix between pairs from x and y.

Return type

tf.Tensor of shape (num_draws_x, num_draws_y)

bayesflow.computational_utilities.maximum_mean_discrepancy(source_samples, target_samples, kernel='gaussian', mmd_weight=1.0, minimum=0.0)[source]

Computes the MMD given a particular choice of kernel.

For details, consult Gretton et al. (2012): https://www.jmlr.org/papers/volume13/gretton12a/gretton12a.pdf

Parameters
  • source_samples (tf.Tensor of shape (N, num_features)) – An array of N random draws from the “source” distribution.

  • target_samples (tf.Tensor of shape (M, num_features)) – An array of M random draws from the “target” distribution.

  • kernel (str in ('gaussian', 'inverse_multiquadratic'), optional, default: 'gaussian') – The kernel to use for computing the distance between pairs of random draws.

  • mmd_weight (float, optional, default: 1.0) – The weight of the MMD value.

  • minimum (float, optional, default: 0.0) – The lower bound of the MMD value.

Returns

loss_value – A scalar Maximum Mean Discrepancy, shape (,)

Return type

tf.Tensor
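
Example

An illustrative check that the MMD grows with the discrepancy between two sets of draws (random data for demonstration only):

>>> import tensorflow as tf
>>> source = tf.random.normal((500, 2))
>>> target = tf.random.normal((500, 2), mean=0.5)
>>> mmd = maximum_mean_discrepancy(source, target, kernel='gaussian')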

bayesflow.computational_utilities.mmd_kernel(x, y, kernel)[source]

Computes the estimator of the Maximum Mean Discrepancy (MMD) between two samples: x and y.

Maximum Mean Discrepancy (MMD) is a distance measure between random draws from the distributions x ~ P and y ~ Q.

Parameters
  • x (tf.Tensor of shape (N, num_features)) – An array of N random draws from the “source” distribution x ~ P.

  • y (tf.Tensor of shape (M, num_features)) – An array of M random draws from the “target” distribution y ~ Q.

  • kernel (callable) – A function which computes the distance between pairs of samples.

Returns

loss – The statistically biased squared maximum mean discrepancy (MMD) value.

Return type

tf.Tensor of shape (,)

bayesflow.computational_utilities.mmd_kernel_unbiased(x, y, kernel)[source]

Computes the unbiased estimator of the Maximum Mean Discrepancy (MMD) between two samples: x and y. Maximum Mean Discrepancy (MMD) is a distance measure between the samples of the distributions x ~ P and y ~ Q.

Parameters
  • x (tf.Tensor of shape (N, num_features)) – An array of N random draws from the “source” distribution x ~ P.

  • y (tf.Tensor of shape (M, num_features)) – An array of M random draws from the “target” distribution y ~ Q.

  • kernel (callable) – A function which computes the distance between pairs of random draws from x and y.

Returns

loss – The statistically unbiased squared maximum mean discrepancy (MMD) value.

Return type

tf.Tensor of shape (,)

bayesflow.computational_utilities.simultaneous_ecdf_bands(num_samples, num_points=None, num_simulations=1000, confidence=0.95, eps=1e-05, max_num_points=1000)[source]

Computes the simultaneous ECDF bands through simulation according to the algorithm described in Section 2.2:

https://link.springer.com/content/pdf/10.1007/s11222-022-10090-6.pdf

Depends on the vectorized utility function get_coverage_probs(z, u).

Parameters
  • num_samples (int) – The sample size used for computing the ECDF. Will equal the number of posterior samples when used for calibration. Corresponds to N in the paper above.

  • num_points (int, optional, default: None) – The number of evaluation points on the interval (0, 1). Defaults to num_points = num_samples if not explicitly specified. Corresponds to K in the paper above.

  • num_simulations (int, optional, default: 1000) – The number of samples of size num_samples to simulate for determining the simultaneous CIs.

  • confidence (float in (0, 1), optional, default: 0.95) – The confidence level, confidence = 1 - alpha specifies the width of the confidence interval.

  • eps (float, optional, default: 1e-5) – Small number to add to the lower and subtract from the upper bound of the interval [0, 1] to avoid edge artefacts. No need to touch this.

  • max_num_points (int, optional, default: 1000) – Upper bound on num_points. Saves computation time when num_samples is large.

Returns

(alpha, z, L, U) – The confidence level as well as the evaluation points, the lower, and the upper confidence bands, respectively.

Return type

tuple of a scalar and three arrays of size (num_samples,)
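
Example

A hedged sketch, assuming 250 posterior draws per data set; the returned bands can then be plotted against the empirical rank ECDFs, as done internally by plot_sbc_ecdf:

>>> alpha, z, L, U = simultaneous_ecdf_bands(num_samples=250, confidence=0.95)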

bayesflow.configuration module

class bayesflow.configuration.DefaultJointConfigurator(default_float_type=<class 'numpy.float32'>)[source]

Bases: object

Fallback class for a generic configurator for joint posterior and likelihood approximation.

class bayesflow.configuration.DefaultLikelihoodConfigurator(default_float_type=<class 'numpy.float32'>)[source]

Bases: object

Fallback class for a generic configurator for amortized likelihood approximation.

class bayesflow.configuration.DefaultModelComparisonConfigurator(n_models, config=None, default_float_type=<class 'numpy.float32'>)[source]

Bases: object

Fallback class for a default configurator for amortized model comparison.

class bayesflow.configuration.DefaultPosteriorConfigurator(default_float_type=<class 'numpy.float32'>)[source]

Bases: object

Fallback class for a generic configurator for amortized posterior approximation.

bayesflow.coupling_networks module

class bayesflow.coupling_networks.AffineCouplingLayer(*args, **kwargs)[source]

Bases: Model

Implements a conditional version of the INN coupling layer.

call(target_or_z, condition, inverse=False, **kwargs)[source]

Performs one pass through the affine coupling layer (either inverse or forward).

Parameters
  • target_or_z (tf.Tensor) – The estimation quantities of interest or latent representations z ~ p(z), shape (batch_size, …)

  • condition (tf.Tensor or None) – The conditioning data of interest, for instance, x = summary_fun(x), shape (batch_size, …). If condition is None, then the layer reduces to an unconditional ACL.

  • inverse (bool, optional, default: False) – Flag indicating whether to run the block forward or backward.

Returns

  • (z, log_det_J) (tuple(tf.Tensor, tf.Tensor)) – If inverse=False: The transformed input and the corresponding Jacobian of the transformation, z shape: (batch_size, inp_dim), log_det_J shape: (batch_size, )

  • target (tf.Tensor) – If inverse=True: The back-transformed z, shape (batch_size, inp_dim)

Important

If inverse=False, the return is (z, log_det_J).

If inverse=True, the return is target

forward(target, condition, **kwargs)[source]

Performs a forward pass through a coupling layer with optional Permutation and ActNorm layers.

Parameters
  • target (tf.Tensor) – The estimation quantities of interest, for instance, parameter vector of shape (batch_size, theta_dim)

  • condition (tf.Tensor or None) – The conditioning vector of interest, for instance, x = summary(x), shape (batch_size, summary_dim) If None, transformation amounts to unconditional estimation.

Returns

(z, log_det_J) – The transformed input and the corresponding Jacobian of the transformation.

Return type

tuple(tf.Tensor, tf.Tensor)

inverse(z, condition, **kwargs)[source]

Performs an inverse pass through a coupling layer with optional Permutation and ActNorm layers.

Parameters
  • z (tf.Tensor) – latent variables z ~ p(z), shape (batch_size, theta_dim)

  • condition (tf.Tensor or None) – The conditioning vector of interest, for instance, x = summary(x), shape (batch_size, summary_dim). If None, transformation amounts to unconditional estimation.

Returns

target – The back-transformed latent variable z.

Return type

tf.Tensor

bayesflow.default_settings module

class bayesflow.default_settings.MetaDictSetting(meta_dict: dict, mandatory_fields: list = [])[source]

Bases: Setting

Implements an interface for a default meta_dict with optional mandatory fields.

class bayesflow.default_settings.Setting[source]

Bases: ABC

Abstract base class for settings. It is here to potentially extend the setting functionality in the future.

bayesflow.diagnostics module

bayesflow.diagnostics.plot_calibration_curves(m_true, m_pred, model_names=None, n_bins=10, font_size=12, fig_size=(12, 4))[source]

Plots the calibration curves and the ECE for a model comparison problem. Depends on the expected_calibration_error function for computing the ECE.

Parameters

TODO

bayesflow.diagnostics.plot_latent_space_2d(z_samples, height=2.5, color='#8f2727', **kwargs)[source]

Creates pairplots for the latent space learned by the inference network. Enables visual inspection of the latent space and whether its structure corresponds to the one enforced by the optimization criterion.

Parameters
  • z_samples (np.ndarray or tf.Tensor of shape (n_sim, n_params)) – The latent samples computed through a forward pass of the inference network.

  • height (float, optional, default: 2.5) – The height of the pair plot.

  • color (str, optional, default: '#8f2727') – The color of the plot

  • **kwargs (dict, optional) – Additional keyword arguments passed to the sns.PairGrid constructor

Returns

f

Return type

plt.Figure - the figure instance for optional saving

bayesflow.diagnostics.plot_losses(history, fig_size=None, color='#8f2727', label_fontsize=14, title_fontsize=16)[source]

A generic helper function to plot the losses of a series of training epochs and runs.

Parameters

history (pd.DataFrame or bayesflow.LossHistory object) – The (plottable) history as returned by a train_[…] method of a Trainer instance.

Returns

f

Return type

plt.Figure - the figure instance for optional saving

bayesflow.diagnostics.plot_posterior_2d(posterior_draws, prior=None, prior_draws=None, param_names=None, height=3, legend_fontsize=14, post_color='#8f2727', prior_color='gray', post_alpha=0.9, prior_alpha=0.7)[source]

Generates a bivariate pairplot given posterior draws and optional prior or prior draws.

Parameters
  • posterior_draws (np.ndarray of shape (n_post_draws, n_params)) – The posterior draws obtained for a SINGLE observed data set.

  • prior (bayesflow.forward_inference.Prior instance or None, optional, default: None) – The optional prior object having an input-output signature as given by bayesflow.forward_inference.Prior

  • prior_draws (np.ndarray of shape (n_prior_draws, n_params) or None, optional, default: None) – The optional prior draws obtained from the prior. If both prior and prior_draws are provided, prior_draws will be used.

  • param_names (list or None, optional, default: None) – The parameter names for nice plot titles. Inferred if None

  • height (float, optional, default: 3) – The height of the pairplot.

  • legend_fontsize (int, optional, default: 14) – The font size of the legend text.

  • post_color (str, optional, default: '#8f2727') – The color for the posterior histograms and KDEs.

  • prior_color (str, optional, default: gray) – The color for the optional prior histograms and KDEs.

  • post_alpha (float in [0, 1], optional, default: 0.9) – The opacity of the posterior plots.

  • prior_alpha (float in [0, 1], optional, default: 0.7) – The opacity of the prior plots.

Returns

f

Return type

plt.Figure - the figure instance for optional saving

Raises

AssertionError – If the shape of posterior_draws is not 2-dimensional.

bayesflow.diagnostics.plot_prior2d(prior, param_names=None, n_samples=2000, height=2.5, color='#8f2727', **kwargs)[source]

Creates pairplots for a given joint prior.

Parameters
  • prior (callable) – The prior object which takes a single integer argument and generates random draws.

  • param_names (list of str or None, optional, default: None) – The parameter names for nice plot titles. Inferred if None

  • n_samples (int, optional, default: 2000) – The number of random draws from the joint prior

  • height (float, optional, default: 2.5) – The height of the pair plot

  • color (str, optional, default: '#8f2727') – The color of the plot

  • **kwargs (dict, optional) – Additional keyword arguments passed to the sns.PairGrid constructor

Returns

f

Return type

plt.Figure - the figure instance for optional saving

bayesflow.diagnostics.plot_recovery(post_samples, prior_samples, point_agg=<function mean>, uncertainty_agg=<function std>, param_names=None, fig_size=None, label_fontsize=14, title_fontsize=16, metric_fontsize=16, add_corr=True, add_r2=True, color='#8f2727', n_col=None, n_row=None)[source]

Creates and plots a publication-ready recovery plot of true vs. point estimate + uncertainty. The point estimate can be controlled with the point_agg argument, and the uncertainty estimate can be controlled with the uncertainty_agg argument.

This plot yields the same information as the “posterior z-score”:

https://betanalpha.github.io/assets/case_studies/principled_bayesian_workflow.html

Important: Posterior aggregates play no special role in Bayesian inference and should only be used heuristically. For instance, in the case of multi-modal posteriors, common point estimates, such as the mean, (geometric) median, or maximum a posteriori (MAP), are not meaningful.

Parameters
  • post_samples (np.ndarray of shape (n_data_sets, n_post_draws, n_params)) – The posterior draws obtained from n_data_sets

  • prior_samples (np.ndarray of shape (n_data_sets, n_params)) – The prior draws (true parameters) obtained for generating the n_data_sets

  • point_agg (callable, optional, default: np.mean) – The function to apply to the posterior draws to get a point estimate for each marginal.

  • uncertainty_agg (callable or None, optional, default: np.std) – The function to apply to the posterior draws to get an uncertainty estimate. If None provided, a simple scatter will be plotted.

  • param_names (list or None, optional, default: None) – The parameter names for nice plot titles. Inferred if None

  • fig_size (tuple or None, optional, default : None) – The figure size passed to the matplotlib constructor. Inferred if None.

  • label_fontsize (int, optional, default: 14) – The font size of the y-label text

  • title_fontsize (int, optional, default: 16) – The font size of the title text

  • metric_fontsize (int, optional, default: 16) – The font size of the goodness-of-fit metric (if provided)

  • add_corr (boolean, optional, default: True) – A flag for adding correlation between true and estimates to the plot.

  • add_r2 (boolean, optional, default: True) – A flag for adding R^2 between true and estimates to the plot.

  • color (str, optional, default: '#8f2727') – The color for the true vs. estimated scatter points and error bars.

Returns

f

Return type

plt.Figure - the figure instance for optional saving

Raises

ShapeError – If there is a deviation from the expected shapes of post_samples and prior_samples.
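
Example

A hedged sketch with synthetic draws of the documented shapes; the random arrays stand in for real posterior and prior draws:

>>> import numpy as np
>>> post_samples = np.random.normal(size=(100, 2000, 4))
>>> prior_samples = np.random.normal(size=(100, 4))
>>> f = plot_recovery(post_samples, prior_samples)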

bayesflow.diagnostics.plot_sbc_ecdf(post_samples, prior_samples, difference=False, stacked=False, fig_size=None, param_names=None, label_fontsize=14, legend_fontsize=14, title_fontsize=16, rank_ecdf_color='#a34f4f', fill_color='grey', **kwargs)[source]

Creates the empirical CDFs for each marginal rank distribution and plots them against a uniform ECDF. ECDF simultaneous bands are drawn using simulations from the uniform. Inspired by:

[1] Säilynoja, T., Bürkner, P. C., & Vehtari, A. (2022). Graphical test for discrete uniformity and its applications in goodness-of-fit evaluation and multiple sample comparison. Statistics and Computing, 32(2), 1-21. https://arxiv.org/abs/2103.10522

For models with many parameters, use stacked=True to obtain an idea of the overall calibration of a posterior approximator.

Parameters
  • post_samples (np.ndarray of shape (n_data_sets, n_post_draws, n_params)) – The posterior draws obtained from n_data_sets

  • prior_samples (np.ndarray of shape (n_data_sets, n_params)) – The prior draws obtained for generating n_data_sets

  • difference (boolean, optional, default: False) – If True, plots the ECDF difference. Enables a more dynamic visualization range.

  • stacked (boolean, optional, default: False) – If True, all ECDFs will be plotted on the same plot. If False, each ECDF will have its own subplot, similar to the behavior of plot_sbc_histograms.

  • param_names (list or None, optional, default: None) – The parameter names for nice plot titles. Inferred if None. Only relevant if stacked=False.

  • fig_size (tuple or None, optional, default: None) – The figure size passed to the matplotlib constructor. Inferred if None.

  • label_fontsize (int, optional, default: 14) – The font size of the x-label and y-label texts

  • legend_fontsize (int, optional, default: 14) – The font size of the legend text

  • title_fontsize (int, optional, default: 16) – The font size of the title text. Only relevant if stacked=False

  • rank_ecdf_color (str, optional, default: '#a34f4f') – The color to use for the rank ECDFs

  • fill_color (str, optional, default: 'grey') – The color of the fill arguments.

  • **kwargs (dict, optional, default: {}) – Keyword arguments can be passed to control the behavior of ECDF simultaneous band computation through the ecdf_bands_kwargs dictionary. See simultaneous_ecdf_bands for keyword arguments

Returns

f

Return type

plt.Figure - the figure instance for optional saving

Raises

ShapeError – If there is a deviation from the expected shapes of post_samples and prior_samples.

bayesflow.diagnostics.plot_sbc_histograms(post_samples, prior_samples, param_names=None, fig_size=None, num_bins=None, binomial_interval=0.99, label_fontsize=14, title_fontsize=16, hist_color='#a34f4f')[source]

Creates and plots publication-ready histograms of rank statistics for simulation-based calibration (SBC) checks according to:

[1] Talts, S., Betancourt, M., Simpson, D., Vehtari, A., & Gelman, A. (2018). Validating Bayesian inference algorithms with simulation-based calibration. arXiv preprint arXiv:1804.06788.

Any deviation from uniformity indicates miscalibration and thus poor convergence of the networks or a poor combination of generative model and networks.

Parameters
  • post_samples (np.ndarray of shape (n_data_sets, n_post_draws, n_params)) – The posterior draws obtained from n_data_sets

  • prior_samples (np.ndarray of shape (n_data_sets, n_params)) – The prior draws obtained for generating n_data_sets

  • param_names (list or None, optional, default: None) – The parameter names for nice plot titles. Inferred if None

  • fig_size (tuple or None, optional, default : None) – The figure size passed to the matplotlib constructor. Inferred if None.

  • num_bins (int or None, optional, default: None) – The number of bins to use for each marginal histogram. Inferred if None

  • binomial_interval (float in (0, 1), optional, default: 0.99) – The width of the confidence interval for the binomial distribution

  • label_fontsize (int, optional, default: 14) – The font size of the y-label text

  • title_fontsize (int, optional, default: 16) – The font size of the title text

  • hist_color (str, optional, default '#a34f4f') – The color to use for the histogram body

Returns

f

Return type

plt.Figure - the figure instance for optional saving

Raises

ShapeError – If there is a deviation from the expected shapes of post_samples and prior_samples.
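
Example

A hedged SBC sketch reusing random arrays of the documented shapes (illustrative only; real draws would come from a trained amortizer):

>>> import numpy as np
>>> post_samples = np.random.normal(size=(100, 250, 4))
>>> prior_samples = np.random.normal(size=(100, 4))
>>> f_hist = plot_sbc_histograms(post_samples, prior_samples)
>>> f_ecdf = plot_sbc_ecdf(post_samples, prior_samples, difference=True)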

bayesflow.exceptions module

exception bayesflow.exceptions.ConfigurationError[source]

Bases: Exception

Class for error in model configuration, e.g., in a meta dict.

exception bayesflow.exceptions.InferenceError[source]

Bases: Exception

Class for error in the forward/inverse pass of a neural component.

exception bayesflow.exceptions.LossError[source]

Bases: Exception

Class for error in applying loss.

exception bayesflow.exceptions.OperationNotSupportedError[source]

Bases: Exception

Class for error that occurs when an operation is demanded but not supported, e.g., when a trainer is initialized without a generative model but the user asks it to simulate data.

exception bayesflow.exceptions.ShapeError[source]

Bases: Exception

Class for error in expected shapes.

exception bayesflow.exceptions.SimulationError[source]

Bases: Exception

Class for an error in simulation.

exception bayesflow.exceptions.SummaryStatsError[source]

Bases: Exception

Class for error in summary statistics.

bayesflow.helper_classes module

class bayesflow.helper_classes.LossHistory[source]

Bases: object

Helper class to keep track of losses during training.

add_entry(epoch, current_loss)[source]

Adds loss entry for current epoch into internal memory data structure.

file_name = 'history'
flush()[source]

Returns current history and removes all existing loss history.

get_copy()[source]
get_plottable()[source]

Returns the losses as a nicely formatted pandas DataFrame.

get_running_losses(epoch)[source]

Compute and return running means of the losses for the current epoch.

load_from_file(file_path)[source]

Loads the most recent saved LossHistory object from file_path.

save_to_file(file_path, max_to_keep)[source]

Saves a LossHistory object to a pickled dictionary in file_path. If max_to_keep saved loss history files are found in file_path, the oldest is deleted before a new one is saved.

start_new_run()[source]
property total_loss
class bayesflow.helper_classes.MemoryReplayBuffer(capacity_in_batches=500)[source]

Bases: object

Implements a memory replay buffer for simulation-based inference.

sample()[source]

Samples batch_size parameter vectors and simulations from the buffer.

Returns

forward_dict – The (raw or configured) outputs of the forward model.

Return type

dict

store(forward_dict)[source]

Stores simulation outputs, if the internal buffer is not full.

Parameters

forward_dict (dict) – The configured outputs of the forward model.

class bayesflow.helper_classes.RegressionLRAdjuster(optimizer, period=1000, wait_between_fits=10, patience=10, tolerance=-0.05, reduction_factor=0.25, cooldown_factor=2, num_resets=3, **kwargs)[source]

Bases: object

This class will compute the slope of the loss trajectory and inform learning rate decay.

file_name = 'lr_adjuster'
get_slope(losses)[source]

Fits a Huber regression on the provided loss trajectory or returns None if not enough data points are present.

load_from_file(file_path)[source]

Loads the saved LRAdjuster object from file_path.

reset()[source]

Resets all stateful variables in preparation for a new start.

save_to_file(file_path)[source]

Saves the state parameters of a RegressionLRAdjuster object to a pickled dictionary in file_path.

class bayesflow.helper_classes.SimulationDataset(forward_dict, batch_size, buffer_size=1024)[source]

Bases: object

Helper class to create a tensorflow.data.Dataset which parses simulation dictionaries and returns simulation dictionaries as expected by BayesFlow amortizers.

class bayesflow.helper_classes.SimulationMemory(stores_raw=True, capacity_in_batches=50)[source]

Bases: object

Helper class to keep track of a pre-determined number of simulations during training.

file_name = 'memory'
get_memory()[source]
is_full()[source]

Returns True if the buffer is full, otherwise False.

load_from_file(file_path)[source]

Loads the saved SimulationMemory object from file_path.

save_to_file(file_path)[source]

Saves a SimulationMemory object to a pickled dictionary in file_path.

store(forward_dict)[source]

Stores the simulation outputs in forward_dict, if the internal buffer is not full.

Parameters

forward_dict (dict) – The configured outputs of the forward model.

bayesflow.helper_functions module

bayesflow.helper_functions.backprop_step(input_dict, amortizer, optimizer, **kwargs)[source]

Computes the loss of the provided amortizer given an input dictionary and applies gradients.

Parameters
  • input_dict (dict) – The configured output of the generative model

  • amortizer (tf.keras.Model) – The custom amortizer. Needs to implement a compute_loss method.

  • optimizer (tf.keras.optimizers.Optimizer) – The optimizer used to update the amortizer’s parameters.

  • **kwargs (dict) – Optional keyword arguments passed to the network’s compute_loss method

Returns

loss – The outputs of the compute_loss() method of the amortizer comprising all loss components, such as divergences or regularization.

Return type

dict

bayesflow.helper_functions.build_meta_dict(user_dict: dict, default_setting: MetaDictSetting) → dict[source]

Integrates a user-defined dictionary into a default dictionary.

Takes a user-defined dictionary and a default dictionary.

  1. Scan the user_dict for violations by unspecified mandatory fields.

  2. Merge user_dict entries into the default_dict. Considers nested dict structure.

Parameters
  • user_dict (dict) – The user’s dictionary

  • default_setting (MetaDictSetting) –

    The specified default setting with attributes:

    • meta_dict: dictionary with default values.

    • mandatory_fields: list(str) keys that need to be specified by the user_dict

Returns

merged_dict – Merged dictionary.

Return type

dict
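
Example

A small sketch of the merge behavior; the meta_dict keys are hypothetical and chosen only for illustration:

>>> default = MetaDictSetting(meta_dict={'lr': 0.001, 'n_layers': 4}, mandatory_fields=[])
>>> merged = build_meta_dict(user_dict={'n_layers': 6}, default_setting=default)
>>> merged['n_layers']
6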

bayesflow.helper_functions.check_posterior_prior_shapes(post_samples, prior_samples)[source]

Checks requirements for the shapes of posterior and prior draws as necessitated by most diagnostic functions.

Parameters
  • post_samples (np.ndarray of shape (n_data_sets, n_post_draws, n_params)) – The posterior draws obtained from n_data_sets

  • prior_samples (np.ndarray of shape (n_data_sets, n_params)) – The prior draws obtained for generating n_data_sets

Raises

ShapeError – If there is a deviation from the expected shapes of post_samples and prior_samples.

bayesflow.helper_functions.extract_current_lr(optimizer)[source]

Extracts current learning rate from optimizer.

Parameters

optimizer (instance of subclass of tf.keras.optimizers.Optimizer) – Optimizer to extract the learning rate from

Returns

current_lr – Current learning rate, or None if it can’t be determined

Return type

np.float or NoneType

bayesflow.helper_functions.format_loss_string(ep, it, loss, avg_dict, slope=None, lr=None, ep_str='Epoch', it_str='Iter', scalar_loss_str='Loss')[source]

Prepare loss string for displaying on progress bar.

bayesflow.helper_functions.merge_left_into_right(left_dict, right_dict)[source]

Function to merge nested dict left_dict into nested dict right_dict.

bayesflow.helper_networks module

class bayesflow.helper_networks.ActNorm(*args, **kwargs)[source]

Bases: Model

Implements an Activation Normalization (ActNorm) Layer.

call(target, inverse=False)[source]

Performs one pass through the actnorm layer (either inverse or forward) and normalizes the last axis of target.

Parameters
  • target (tf.Tensor of shape (batch_size, ...)) – the target variables of interest, i.e., parameters for posterior estimation

  • inverse (bool, optional, default: False) – Flag indicating whether to run the block forward or backwards

Returns

  • (z, log_det_J) (tuple(tf.Tensor, tf.Tensor)) – If inverse=False: The transformed input and the corresponding Jacobian of the transformation, z shape: (batch_size, inp_dim), log_det_J shape: (,)

  • target (tf.Tensor) – If inverse=True: The inversely transformed targets, shape == target.shape

Important

If inverse=False, the return is (z, log_det_J).

If inverse=True, the return is target.

class bayesflow.helper_networks.DenseCouplingNet(*args, **kwargs)[source]

Bases: Model

Implements a conditional version of a standard fully connected (FC) network. Would also work as an unconditional estimator.

call(target, condition, **kwargs)[source]

Concatenates target and condition and performs a forward pass through the coupling net.

Parameters
  • target (tf.Tensor) – The split estimation quantities, for instance, parameters \(\theta \sim p(\theta)\) of interest, shape (batch_size, …)

  • condition (tf.Tensor or None) – the conditioning vector of interest, for instance x = summary(x), shape (batch_size, summary_dim)

class bayesflow.helper_networks.EquivariantModule(*args, **kwargs)[source]

Bases: Model

Implements an equivariant module performing an equivariant transform.

For details and justification, see:

https://www.jmlr.org/papers/volume21/19-322/19-322.pdf

call(x)[source]

Performs the forward pass of a learnable equivariant transform.

Parameters

x (tf.Tensor) – Input of shape (batch_size, N, x_dim)

Returns

out – Output of shape (batch_size, N, equiv_dim)

Return type

tf.Tensor

class bayesflow.helper_networks.InvariantModule(*args, **kwargs)[source]

Bases: Model

Implements an invariant module performing a permutation-invariant transform.

For details and rationale, see:

https://www.jmlr.org/papers/volume21/19-322/19-322.pdf

call(x)[source]

Performs the forward pass of a learnable invariant transform.

Parameters

x (tf.Tensor) – Input of shape (batch_size, N, x_dim)

Returns

out – Output of shape (batch_size, out_dim)

Return type

tf.Tensor

class bayesflow.helper_networks.MultiConv1D(*args, **kwargs)[source]

Bases: Model

Implements an inception-inspired 1D convolutional layer using different kernel sizes.

call(x, **kwargs)[source]

Performs a forward pass through the layer.

Parameters

x (tf.Tensor) – Input of shape (batch_size, n_time_steps, n_time_series)

Returns

out – Output of shape (batch_size, n_time_steps, n_filters)

Return type

tf.Tensor

class bayesflow.helper_networks.Permutation(*args, **kwargs)[source]

Bases: Model

Implements a layer to permute the inputs entering a (conditional) coupling layer. Uses fixed permutations, as these perform equally well compared to learned permutations.

call(target, inverse=False)[source]

Permutes a batch of target vectors over the last axis.

Parameters
  • target (tf.Tensor of shape (batch_size, ...)) – The target vector to be permuted over its last axis.

  • inverse (bool, optional, default: False) – Controls if the current pass is forward (inverse=False) or inverse (inverse=True).

Returns

out – The (un-)permuted target vector.

Return type

tf.Tensor of the same shape as target.

bayesflow.inference_networks module

class bayesflow.inference_networks.EvidentialNetwork(*args, **kwargs)[source]

Bases: Model

Implements a network whose outputs are the concentration parameters of a Dirichlet density.

Follows ideas from:

[1] Radev, S. T., D’Alessandro, M., Mertens, U. K., Voss, A., Köthe, U., & Bürkner, P. C. (2021). Amortized Bayesian model comparison with evidential deep learning. IEEE Transactions on Neural Networks and Learning Systems.

[2] Sensoy, M., Kaplan, L., & Kandemir, M. (2018). Evidential deep learning to quantify classification uncertainty. Advances in neural information processing systems, 31.

call(condition, **kwargs)[source]

Computes evidences for model comparison given a batch of data and optional concatenated context, typically passed through a summary network.

Parameters

condition (tf.Tensor of shape (batch_size, ...)) – The input variables used for determining p(model | condition)

Returns

evidence

Return type

tf.Tensor of shape (batch_size, num_models) – the learned model evidences

classmethod create_config(**kwargs)[source]

Used to create the settings dictionary for the internal networks of the evidential network. Will fill in missing settings with defaults.

evidence(condition, **kwargs)[source]
sample(condition, n_samples, **kwargs)[source]

Samples posterior model probabilities from the higher-order Dirichlet density.

Parameters
  • condition (tf.Tensor) – The summary of the observed (or simulated) data, shape (n_data_sets, …)

  • n_samples (int) – Number of samples to obtain from the approximate posterior

Returns

pm_samples – The posterior draws from the Dirichlet distribution, shape (num_samples, num_batch, num_models)

Return type

tf.Tensor or np.array

class bayesflow.inference_networks.InvertibleNetwork(*args, **kwargs)[source]

Bases: Model

Implements a chain of conditional invertible coupling layers for conditional density estimation.

call(targets, condition, inverse=False, **kwargs)[source]

Performs one pass through an invertible chain (either inverse or forward).

Parameters
  • targets (tf.Tensor) – The estimation quantities of interest, shape (batch_size, …)

  • condition (tf.Tensor) – The conditional data x, shape (batch_size, summary_dim)

  • inverse (bool, default: False) – Flag indicating whether to run the chain forward or backwards

Returns

  • (z, log_det_J) (tuple(tf.Tensor, tf.Tensor)) – If inverse=False: The transformed input and the corresponding Jacobian of the transformation, z shape: (batch_size, …), log_det_J shape: (batch_size, …)

  • target (tf.Tensor) – If inverse=True: The back-transformed output, shape (batch_size, …)

Important

If inverse=False, the return is (z, log_det_J).

If inverse=True, the return is target.

classmethod create_config(**kwargs)[source]

Used to create the settings dictionary for the internal networks of the invertible network. Will fill in missing settings with defaults.

forward(targets, condition, **kwargs)[source]

Performs a forward pass through the chain.

inverse(z, condition, **kwargs)[source]

Performs a reverse pass through the chain. Assumes that it is only used in inference mode, so **kwargs contains training=False.
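
Example

A round-trip sketch; the constructor settings (here num_params, the dimensionality of the targets) are an assumption about the expected configuration, see create_config:

>>> import tensorflow as tf
>>> net = InvertibleNetwork(num_params=4)
>>> targets = tf.random.normal((32, 4))
>>> condition = tf.random.normal((32, 8))
>>> z, log_det_J = net(targets, condition)
>>> targets_rec = net(z, condition, inverse=True)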

bayesflow.losses module

bayesflow.losses.kl_dirichlet(model_indices, alpha)[source]

Computes the KL divergence between a Dirichlet distribution with parameter vector alpha and a uniform Dirichlet.

Parameters
  • model_indices (tf.Tensor of shape (batch_size, n_models)) – one-hot-encoded true model indices

  • alpha (tf.Tensor of shape (batch_size, n_models)) – positive network outputs in [1, +inf)

Returns

kl – A single scalar representing \(D_{KL}(\mathrm{Dir}(\alpha) | \mathrm{Dir}(1,1,\ldots,1) )\), shape (,)

Return type

tf.Tensor

bayesflow.losses.kl_latent_space_gaussian(z, log_det_J)[source]

Computes the Kullback-Leibler divergence between true and approximate posterior assuming a Gaussian latent space as a source distribution.

Parameters
  • z (tf.Tensor of shape (batch_size, ...)) – The (latent transformed) target variables

  • log_det_J (tf.Tensor of shape (batch_size, ...)) – The logarithm of the Jacobian determinant of the transformation.

Returns

loss – A single scalar value representing the KL loss, shape (,)

Return type

tf.Tensor

Examples

Parameter estimation

>>> kl_latent_space_gaussian(z, log_det_J)
bayesflow.losses.kl_latent_space_student(v, z, log_det_J)[source]

Computes the Kullback-Leibler divergence between true and approximate posterior assuming a latent Student-t distribution as the source distribution.

Parameters
  • v (tf.Tensor of shape (batch_size, ...)) – The degrees of freedom of the latent Student-t distribution

  • z (tf.Tensor of shape (batch_size, ...)) – The (latent transformed) target variables

  • log_det_J (tf.Tensor of shape (batch_size, ...)) – The logarithm of the Jacobian determinant of the transformation.

Returns

loss – A single scalar value representing the KL loss, shape (,)

Return type

tf.Tensor
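
For orientation, the corresponding negative log-likelihood under a standard multivariate Student-t base density with identity scale can be sketched as follows (a simplified stand-in, not necessarily identical to the library implementation):

    import math
    import tensorflow as tf

    def student_t_flow_loss(v, z, log_det_J):
        d = tf.cast(tf.shape(z)[-1], z.dtype)  # latent dimensionality
        log_p = (
            tf.math.lgamma(0.5 * (v + d))
            - tf.math.lgamma(0.5 * v)
            - 0.5 * d * tf.math.log(v * math.pi)
            - 0.5 * (v + d) * tf.math.log1p(tf.reduce_sum(tf.square(z), axis=-1) / v)
        )
        return -tf.reduce_mean(log_p + log_det_J)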

bayesflow.losses.log_loss(model_indices, alpha)[source]

Computes the log-loss given the positive network outputs alpha and the one-hot-encoded true model indices model_indices.

Parameters
  • model_indices (tf.Tensor of shape (batch_size, n_models)) – one-hot-encoded true model indices

  • alpha (tf.Tensor of shape (batch_size, n_models)) – positive network outputs in [1, +inf]

Returns

loss – A single scalar Monte-Carlo approximation of the log-loss, shape (,)

Return type

tf.Tensor

bayesflow.losses.mmd_summary_space(summary_outputs, z_dist=<function random_normal>, kernel='gaussian')[source]

Computes the MMD(p(summary_outputs) | z_dist) to re-shape the summary network outputs in an information-preserving manner.

Parameters
  • summary_outputs (tf Tensor of shape (batch_size, ...)) – The outputs of the summary network.

  • z_dist (callable, default tf.random.normal) – The latent data distribution towards which the summary outputs are optimized.

  • kernel (str in ('gaussian', 'inverse_multiquadratic'), default 'gaussian') – The kernel function to use for MMD computation.
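
A standalone sketch of the underlying quantity, a biased quadratic-time MMD estimate with a single-bandwidth Gaussian kernel (the library version may use a kernel mixture and different bandwidths):

    import tensorflow as tf

    def mmd_gaussian(x, y, scale=1.0):
        """Biased MMD^2 estimate between samples x and y, each of shape (n, dim)."""
        def kernel(a, b):
            sq_dists = tf.reduce_sum(tf.square(a[:, None, :] - b[None, :, :]), axis=-1)
            return tf.exp(-sq_dists / (2.0 * scale ** 2))
        return (
            tf.reduce_mean(kernel(x, x))
            + tf.reduce_mean(kernel(y, y))
            - 2.0 * tf.reduce_mean(kernel(x, y))
        )

    # summary_outputs = summary_net(data)              # (batch, summary_dim)
    # z = tf.random.normal(tf.shape(summary_outputs))  # draws from the target dist
    # loss = mmd_gaussian(summary_outputs, z)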

bayesflow.mcmc module

class bayesflow.mcmc.MCMCSurrogateLikelihood(amortized_likelihood, configurator=None, likelihood_postprocessor=None, grad_postprocessor=None)[source]

Bases: object

An interface to provide likelihood evaluation and gradient estimation of a pre-trained AmortizedLikelihood instance, which can be used in tandem with (HMC)-MCMC, as implemented, for instance, in PyMC3.

log_likelihood(*args, **kwargs)[source]

Calculates the approximate log-likelihood of targets given conditional variables.

Parameters

*args – The parameters as expected by configurator. For the default configurator, the first argument has to be a dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged: observables - the variables over which a conditional density is learned (i.e., the observables) conditions - the conditioning variables that are directly passed to the inference network

Returns

out – The output as returned by likelihood_postprocessor. For the default postprocessor, this is the total log-likelihood given by the sum of all log-likelihood values.

Return type

np.ndarray

log_likelihood_grad(*args, **kwargs)[source]

Calculates the gradient of the surrogate likelihood with respect to every parameter in conditions.

Parameters

*args – The parameters as expected by configurator. For the default configurator, the first argument has to be a dictionary containing the following mandatory keys, if DEFAULT_KEYS unchanged: observables - the variables over which a conditional density is learned (i.e., the observables) conditions - the conditioning variables that are directly passed to the inference network

Returns

out – The output as returned by grad_postprocessor. For the default postprocessor, this is an array containing the derivative with respect to each value in conditions as returned by configurator.

Return type

np.ndarray

class bayesflow.mcmc.PyMCSurrogateLikelihood(amortized_likelihood, observables, configurator=None, likelihood_postprocessor=None, grad_postprocessor=None, default_pymc_type=<class 'numpy.float64'>, default_tf_type=<class 'numpy.float32'>)[source]

Bases: Op, MCMCSurrogateLikelihood

grad(inputs, output_grads)[source]

Aggregates gradients with respect to inputs (typically the parameter vector).

Parameters
  • inputs – The input variables.

  • output_grads – The gradients of the output variables.

Returns

grads

Return type

The gradients with respect to each Variable in inputs.

itypes: Optional[Sequence[Type]] = [TensorType(float64, (None,))]

otypes: Optional[Sequence[Type]] = [TensorType(float64, ())]

perform(node, inputs, outputs)[source]

Computes the log-likelihood of inputs (typically the parameter vector of a model).

Parameters
  • node – The symbolic aesara.graph.basic.Apply node that represents this computation.

  • inputs – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.

  • outputs – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of a Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
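
A hypothetical end-to-end usage sketch: amortized_lik and observed_data are placeholders for a pre-trained AmortizedLikelihood instance and the observed data, and the exact PyMC idioms may vary across versions:

    import pymc as pm  # or pymc3, depending on the installation

    surrogate = PyMCSurrogateLikelihood(amortized_lik, observables=observed_data)

    with pm.Model():
        theta = pm.Normal("theta", mu=0.0, sigma=1.0, shape=2)  # parameter vector
        pm.Potential("llik", surrogate(theta))                  # surrogate log-likelihood term
        idata = pm.sample(1000)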

bayesflow.networks module

Meta-module for easy access to the different neural network architecture interfaces

bayesflow.simulation module

class bayesflow.simulation.ContextGenerator(batchable_context_fun: Optional[callable] = None, non_batchable_context_fun: Optional[callable] = None, use_non_batchable_for_batchable: bool = False)[source]

Bases: object

Basic interface for a simulation module responsible for generating variables over which we want to amortize during simulation-based training, but do not want to perform inference on. Both priors and simulators in a generative framework can have their own context generators, depending on the particular modeling goals.

The interface distinguishes between two types of context: batchable and non-batchable.

  • Batchable context variables differ for each simulation in each training batch

  • Non-batchable context variables stay the same for each simulation in a batch, but differ across batches

Examples for batchable context variables include experimental design variables, design matrices, etc. Examples for non-batchable context variables include the number of observations in an experiment, positional encodings, time indices, etc.

While the latter can also be considered batchable in principle, batching them would require non-Tensor (i.e., non-rectangular) data structures, which usually means inefficient computations.

Example for a simulation context which will generate a random number of observations between 1 and 100 for each training batch:

>>> gen = ContextGenerator(non_batchable_context_fun=lambda : np.random.randint(1, 101))
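
A slightly fuller sketch combining both context types (the design-vector function is an illustrative assumption):

    import numpy as np

    gen = ContextGenerator(
        batchable_context_fun=lambda: np.random.uniform(-1, 1, size=3),  # e.g., a design vector
        non_batchable_context_fun=lambda: np.random.randint(1, 101),     # e.g., number of observations
    )
    context = gen.generate_context(batch_size=32)
    # context["batchable_context"]     -> list of 32 design vectors (one per simulation)
    # context["non_batchable_context"] -> one integer shared by the whole batch
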
batchable_context(batch_size, *args, **kwargs)[source]

Generates ‘batch_size’ context variables given optional arguments. Return type is a list of context variables.

generate_context(batch_size, *args, **kwargs)[source]

Creates a dictionary with batchable and non-batchable context.

Parameters

batch_size (int) – The batch_size argument used for batchable context.

Returns

context_dict (dictionary) – A dictionary of context variables with the following keys, if default keys not changed: batchable_context : value non_batchable_context : value

Note that the values of the context variables will be None if the corresponding context-generating functions have not been provided when initializing this object.

non_batchable_context(*args, **kwargs)[source]

Generates a context variable shared across simulations in a given batch, given optional arguments.

class bayesflow.simulation.GenerativeModel(prior: callable, simulator: callable, skip_test: bool = False, prior_is_batched: bool = False, simulator_is_batched: bool = False, name: str = 'anonymous')[source]

Bases: object

Basic interface for a generative model in a simulation-based context. Generally, a generative model consists of two mandatory components:

  • Prior : A randomized function returning random parameter draws from a prior distribution;

  • Simulator : A function which transforms the parameters into observables in a non-deterministic manner.
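
A minimal toy construction, assuming the default non-batched callables (the prior returns a single draw per call, the simulator maps one draw to one data set):

    import numpy as np

    def prior_fun():
        return np.random.normal(0.0, 1.0, size=2)            # two model parameters

    def simulator_fun(theta):
        return theta + 0.1 * np.random.normal(size=(50, 2))  # 50 noisy observations

    model = GenerativeModel(prior=prior_fun, simulator=simulator_fun, name="toy_model")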

plot_pushforward(parameter_draws=None, funcs_list=None, funcs_labels=None, batch_size=1000, show_raw_sims=True)[source]

Creates simulations from parameter_draws (generated from self.prior if they are not passed as an argument) and plots visualizations for them.

Parameters
  • parameter_draws (numpy ndarray of the shape (batch_size, parameter_values)) – A sample of parameters. May be drawn from either the prior (which is also the default behavior if no input is specified) or from the posterior to do a prior/posterior pushforward.

  • funcs_list (list of callables) – A list of functions that can be used to aggregate simulation data (map a single simulation to a single real value). The default behavior without user input is to use numpy’s mean and standard deviation functions.

  • funcs_labels (list of strings) – A list of labels for the functions in funcs_list. The default behavior without user input is to label the functions “Aggregator function 1”, “Aggregator function 2”, etc.

  • batch_size (integer) – The number of prior draws to generate (and then create and visualize simulations from)

  • show_raw_sims (boolean) – Flag determining whether or not a plot of 49 raw (i.e. unaggregated) simulations is generated. Useful for very general data exploration.

Returns

  • parameter_draws (numpy ndarray) – The parameters provided by the user or generated internally.

  • simulations (numpy ndarray) – The simulations generated from parameter_draws (or prior draws generated on the fly)

  • aggregated_data (list of numpy 1d arrays) – Arrays generated from the simulations with the functions in funcs_list

presimulate_and_save(batch_size, folder_path, total_iterations=None, memory_limit=None, iterations_per_epoch=None, epochs=None, extend_from=0, parallel=True)[source]

Simulates a dataset for single-pass offline training (called via the train_from_presimulation method of the Trainer class in the trainers.py script).

One of the following pairs of parameters has to be provided:

  • (iterations_per_epoch, epochs),

  • (total_iterations, iterations_per_epoch)

  • (total_iterations, epochs)

Providing all three of the parameters in these pairs leads to a consistency check, since incompatible combinations are possible. memory_limit is an upper bound on the size of individual files; this can be useful to avoid running out of RAM during training.
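
For instance, using the (iterations_per_epoch, epochs) pair (folder path and sizes are illustrative; model is a GenerativeModel instance):

    model.presimulate_and_save(
        batch_size=32,
        folder_path="./presimulations",
        iterations_per_epoch=100,
        epochs=10,
    )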

class bayesflow.simulation.MultiGenerativeModel(generative_models: list, model_probs='equal')[source]

Bases: object

Basic interface for multiple generative models in a simulation-based context. A MultiGenerativeModel instance consists of a list of GenerativeModel instances and a prior distribution over candidate models defined by a list of probabilities.

class bayesflow.simulation.Prior(batch_prior_fun: Optional[callable] = None, prior_fun: Optional[callable] = None, context_generator: Optional[callable] = None, param_names: Optional[list] = None)[source]

Bases: object

Basic interface for a simulation module responsible for generating random draws from a prior distribution.

The prior functions should return a np.array of simulation parameters which will be internally used by the GenerativeModel interface for simulations.

An optional context generator (i.e., an instance of ContextGenerator) or a user-defined callable object implementing the following two methods can be provided:

  • context_generator.batchable_context(batch_size)

  • context_generator.non_batchable_context()

estimate_means_and_stds(n_draws=1000, *args, **kwargs)[source]

Estimates prior means and stds given n_draws from the prior, useful for z-standardization of the prior draws.

Parameters
  • n_draws (int, optional (default = 1000)) – The number of random draws to obtain from the joint prior.

  • *args (tuple) – Optional positional arguments passed to the generator functions.

  • **kwargs (dict) – Optional keyword arguments passed to the generator functions.

Returns

The estimated means and stds of the joint prior.

Return type

(prior_means, prior_stds) - tuple of np.ndarrays
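
A short usage sketch (the prior function and parameter names are illustrative):

    import numpy as np

    prior = Prior(
        prior_fun=lambda: np.random.normal(0.0, 1.0, size=2),
        param_names=["theta_1", "theta_2"],
    )
    prior_means, prior_stds = prior.estimate_means_and_stds(n_draws=2000)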

logpdf(prior_draws)[source]

plot_prior2d(**kwargs)[source]

Generates a 2D plot representing bivariate prior distributions. Uses the function bayesflow.diagnostics.plot_prior2d() internally for generating the plot.

Parameters

**kwargs (dict) – Optional keyword arguments passed to the plot_prior2d function.

Returns

f

Return type

plt.Figure - the figure instance for optional saving

class bayesflow.simulation.Simulator(batch_simulator_fun=None, simulator_fun=None, context_generator=None)[source]

Bases: object

Basic interface for a simulation module responsible for generating randomized simulations given a prior parameter distribution and optional context variables, given a user-provided simulation function.

The user-provided simulator functions should return a np.array of synthetic data which will be used internally by the GenerativeModel interface for simulations.

An optional context generator (i.e., an instance of ContextGenerator) or a user-defined callable object implementing the following two methods can be provided:

  • context_generator.batchable_context(batch_size)

  • context_generator.non_batchable_context()

bayesflow.summary_networks module

class bayesflow.summary_networks.InvariantNetwork(*args, **kwargs)[source]

Bases: Model

Implements a deep permutation-invariant network according to [1] and [2].

[1] Zaheer, M., Kottur, S., Ravanbakhsh, S., Poczos, B., Salakhutdinov, R. R., & Smola, A. J. (2017). Deep sets. Advances in neural information processing systems, 30.

[2] Bloem-Reddy, B., & Teh, Y. W. (2020). Probabilistic Symmetries and Invariant Neural Networks. J. Mach. Learn. Res., 21, 90-1.

call(x)[source]

Performs the forward pass of a learnable deep invariant transformation consisting of a sequence of equivariant transforms followed by an invariant transform.

Parameters

x (tf.Tensor) – Input of shape (batch_size, n_obs, data_dim)

Returns

out – Output of shape (batch_size, out_dim)

Return type

tf.Tensor
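
A shape-oriented usage sketch (constructor left at its defaults; all sizes are illustrative):

    import tensorflow as tf

    summary_net = InvariantNetwork()
    x = tf.random.normal((16, 50, 2))  # 16 data sets, 50 exchangeable observations, 2 dims
    summary = summary_net(x)           # (16, out_dim); invariant to permuting the 50 observations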

class bayesflow.summary_networks.SequentialNetwork(*args, **kwargs)[source]

Bases: Model

Implements a sequence of MultiConv1D layers followed by an LSTM network.

For details and rationale, see [1]:

[1] Radev, S. T., Graw, F., Chen, S., Mutters, N. T., Eichel, V. M., Bärnighausen, T., & Köthe, U. (2021). OutbreakFlow: Model-based Bayesian inference of disease outbreak dynamics with invertible neural networks and its application to the COVID-19 pandemics in Germany. PLoS computational biology, 17(10), e1009472.

https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1009472

call(x, **kwargs)[source]

Performs a forward pass through the network by first passing x through the sequence of multi-convolutional layers and then applying the LSTM network.

Parameters

x (tf.Tensor) – Input of shape (batch_size, n_time_steps, n_time_series)

Returns

out – Output of shape (batch_size, summary_dim)

Return type

tf.Tensor

class bayesflow.summary_networks.SplitNetwork(*args, **kwargs)[source]

Bases: Model

Implements a vertical stack of networks and concatenates their individual outputs. Allows for splitting of data to provide an individual network for each split of the data.

call(x)[source]

Performs a forward pass through the subnetworks and concatenates their output.

Parameters

x (tf.Tensor) – Input of shape (batch_size, n_obs, data_dim)

Returns

out – Output of shape (batch_size, out_dim)

Return type

tf.Tensor

bayesflow.trainers module

class bayesflow.trainers.Trainer(amortizer, generative_model=None, configurator=None, checkpoint_path=None, max_to_keep=3, default_lr=0.001, skip_checks=False, memory=True, **kwargs)[source]

Bases: object

This class connects a generative model (or, already simulated data from a model) with a configurator and a neural inference architecture for amortized inference (amortizer). A Trainer instance is responsible for optimizing the amortizer via various forms of simulation-based training.

At the very minimum, the trainer must be initialized with an amortizer instance, which is capable of processing the (configured) outputs of a generative model. A configurator will then process the outputs of the generative model and convert them into suitable inputs for the amortizer. Users can choose from a palette of default configurators or create their own configurators, essentially building a modularized pipeline GenerativeModel -> Configurator -> Amortizer. Most complex models will require custom configurators.

Currently, the trainer supports the following simulation-based training regimes, based on efficiency considerations:

  • Online training

    Usage: >>> trainer.train_online(epochs, iterations_per_epoch, batch_size, **kwargs)

    This training regime is optimal for fast generative models which can efficiently simulate data on the fly. In order for this training regime to be efficient, on-the-fly batch simulations should not take longer than 2-3 seconds.

  • Experience replay training

    Usage: >>> trainer.train_experience_replay(epochs, iterations_per_epoch, batch_size, **kwargs)

    This training regime is also good for fast generative models capable of efficiently simulating data on the fly. Compared to pure online training, this regime will keep an experience replay buffer from which simulations are randomly sampled, so the networks will likely see some simulations multiple times.

  • Round-based training

    Usage: >>> trainer.train_rounds(rounds, sim_per_round, epochs, batch_size, **kwargs)

    This training regime is optimal for slow, but still reasonably performant generative models. In order for this training regime to be efficient, on-the-fly batch simulations should not take longer than 2-3 minutes.

    Important: overfitting presents a danger when using small numbers of simulated data sets, so it is recommended to use some amount of regularization for the neural amortizer(s).

  • Offline training

    Usage: >>> trainer.train_offline(simulations_dict, epochs, batch_size, **kwargs)

    This training regime is optimal for very slow, external simulators, which take several minutes for a single simulation. It assumes that all training data has been already simulated and stored on disk.

    Important: overfitting presents a danger when using a small simulated data set, so it is recommended to use some amount of regularization for the neural amortizer(s).

Note: For extremely slow simulators (i.e., more than an hour for a single simulation), the BayesFlow framework might not be the ideal choice and should probably be considered in combination with a black-box surrogate optimization method, such as Bayesian optimization.
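
A minimal wiring sketch, assuming amortizer is a compatible amortizer instance and model a GenerativeModel as described in the simulation module:

    trainer = Trainer(amortizer=amortizer, generative_model=model)
    losses = trainer.train_online(epochs=10, iterations_per_epoch=100, batch_size=32)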

diagnose_latent2d(inputs=None, **kwargs)[source]

Performs visual pre-inference diagnostics of latent space on either provided validation data (new simulations) or internal simulation memory. If inputs is not None, then diagnostics will be performed on the inputs, regardless of whether the simulation_memory of the trainer is empty or not. If inputs is None, then the trainer will try to access its memory or raise a ConfigurationError.

Parameters
  • inputs (None, list or dict, optional (default - None)) – The optional inputs to use

  • **kwargs (dict, optional) – Optional keyword arguments, which could be: conf_args - optional keyword arguments passed to the configurator net_args - optional keyword arguments passed to the amortizer plot_args - optional keyword arguments passed to plot_latent_space_2d

Returns

losses – A dictionary storing the losses across epochs and iterations

Return type

dict(ep_num : list(losses))

diagnose_sbc_histograms(inputs=None, n_samples=None, **kwargs)[source]

Performs visual pre-inference diagnostics via simulation-based calibration (SBC) on either provided validation data (new simulations) or internal simulation memory. If inputs is not None, then diagnostics will be performed on the inputs, regardless of whether the simulation_memory of the trainer is empty or not. If inputs is None, then the trainer will try to access its memory or raise a ConfigurationError.

Parameters
  • inputs (None, list or dict, optional (default - None)) – The optional inputs to use

  • n_samples (int, optional (default - None)) – The number of posterior samples to draw for each simulated data set. If None, the number will be heuristically determined so n_sim / n_draws ~= 20

  • **kwargs (dict, optional) – Optional keyword arguments, which could be: conf_args - optional keyword arguments passed to the configurator net_args - optional keyword arguments passed to the amortizer plot_args - optional keyword arguments passed to plot_sbc

Returns

losses – A dictionary storing the losses across epochs and iterations

Return type

dict(ep_num : list(losses))

load_pretrained_network()[source]

Attempts to load a pre-trained network if checkpoint path is provided and a checkpoint manager exists.

train_experience_replay(epochs, iterations_per_epoch, batch_size, save_checkpoint=True, optimizer=None, reuse_optimizer=False, buffer_capacity=1000, optional_stopping=True, use_autograph=True, **kwargs)[source]

Trains the network(s) via experience replay using a memory replay buffer, as utilized in reinforcement learning. Additional keyword arguments are passed to the generative model, configurator, and amortizer. Read below for the signature.

Parameters
  • epochs (int) – Number of epochs (and number of times a checkpoint is stored)

  • iterations_per_epoch (int) – Number of batch simulations to perform per epoch

  • batch_size (int) – Number of simulations to perform at each backpropagation step.

  • save_checkpoint (bool, optional, default: True) – A flag to decide whether to save checkpoints after each epoch, if a checkpoint_path provided during initialization, otherwise ignored.

  • optimizer (tf.keras.optimizer.Optimizer or None) – Optimizer for the neural network. None will result in tf.keras.optimizers.Adam using a learning rate of 5e-4 and a cosine decay from 5e-4 to 0. A custom optimizer will override default learning rate and schedule settings.

  • reuse_optimizer (bool, optional, default: False) – A flag indicating whether the optimizer instance should be treated as persistent or not. If False, the optimizer and its states are not stored after training has finished. Otherwise, the optimizer will be stored as self.optimizer and re-used in further training runs.

  • buffer_capacity (int, optional, default: 1000) – Max number of batches to store in buffer. For instance, if batch_size=32 and buffer_capacity=1000, then the buffer will hold a maximum of 32 * 1000 = 32000 simulations. Be careful with memory!

  • optional_stopping (bool, optional, default: True) – Whether to use optional stopping or not during training. Could speed up training.

  • use_autograph (bool, optional, default: True) – Whether to use autograph for the backprop step. Could lead to enormous speed-ups but could also be harder to debug. Important! Argument will be ignored if buffer has previously been initialized!

  • **kwargs (dict, optional, default: {}) – Optional keyword arguments, which can be: model_args - optional keyword arguments passed to the generative model conf_args - optional keyword arguments passed to the configurator net_args - optional keyword arguments passed to the amortizer

Returns

losses – A dictionary or a data frame storing the losses across epochs and iterations.

Return type

dict or pandas.DataFrame

train_from_presimulation(presimulation_path, optimizer, save_checkpoint=True, max_epochs=None, reuse_optimizer=False, custom_loader=None, optional_stopping=True, use_autograph=True, **kwargs)[source]

Trains an amortizer via a modified form of offline training.

Like regular offline training, it assumes that parameters, data and optional context have already been simulated (i.e., forward inference has been performed).

Also like regular offline training, it is faster than online training in scenarios where simulations are slow. Unlike regular offline training, it uses each batch from the presimulated dataset only once during training. A larger presimulated dataset is therefore required than for offline training, and the increase in speed gained by loading simulations instead of generating them on the fly comes at a cost: a large presimulated dataset takes up a large amount of hard drive space.

Parameters
  • presimulation_path (str) –

    File path to the folder containing the files from the precomputed simulation. Ideally generated using a GenerativeModel’s presimulate_and_save method, otherwise must match the structure produced by that method:

    Each file contains the data for one epoch (i.e. a number of batches), and must be compatible with the custom_loader provided. The custom_loader must read each file into a collection (either a dictionary or a list) of simulation_dict objects. This is easily achieved with the pickle library: if the files were generated from collections of simulation_dict objects using pickle.dump, the _default_loader (default for custom_load) will load them using pickle.load. Training parameters like number of iterations and batch size are inferred from the files during training.

  • optimizer (tf.keras.optimizer.Optimizer) – Optimizer for the neural network training. Since it is impossible to guess the number of iterations beforehand for this type of training, an optimizer must be provided.

  • save_checkpoint (bool, optional, default : True) – Determines whether to save checkpoints after each epoch, if a checkpoint_path provided during initialization, otherwise ignored.

  • max_epochs (int or None, optional, default: None) – An optional parameter to limit the number of epochs.

  • reuse_optimizer (bool, optional, default: False) – A flag indicating whether the optimizer instance should be treated as persistent or not. If False, the optimizer and its states are not stored after training has finished. Otherwise, the optimizer will be stored as self.optimizer and re-used in further training runs.

  • custom_loader (callable, optional, default: self._default_loader) – Must take a string file_path as an input and output a collection (dictionary or list) of simulation_dict objects. A simulation_dict has the keys - prior_non_batchable_context, - prior_batchable_context, - prior_draws, - sim_non_batchable_context, - sim_batchable_context, - sim_data. prior_draws and sim_data must have actual data as values, the rest are optional.

  • optional_stopping (bool, optional, default: True) – Whether to use optional stopping or not during training. Could speed up training.

  • use_autograph (bool, optional, default: True) – Whether to use autograph for the backprop step. Could lead to enormous speed-ups but could also be harder to debug.

  • **kwargs (dict, optional) – Optional keyword arguments, which can be: conf_args - optional keyword arguments passed to the configurator net_args - optional keyword arguments passed to the amortizer

Returns

losses – A dictionary or a data frame storing the losses across epochs and iterations

Return type

dict or pandas.DataFrame
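
A short sketch, assuming the presimulated files were produced by GenerativeModel.presimulate_and_save (the path is illustrative); an optimizer is mandatory here:

    import tensorflow as tf

    optimizer = tf.keras.optimizers.Adam(5e-4)
    losses = trainer.train_from_presimulation("./presimulations", optimizer=optimizer)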

train_offline(simulations_dict, epochs, batch_size, save_checkpoint=True, optimizer=None, reuse_optimizer=False, optional_stopping=True, use_autograph=True, **kwargs)[source]

Trains an amortizer via offline learning. Assumes that parameters, data and optional context have already been simulated (i.e., forward inference has been performed).

Parameters
  • simulations_dict (dict) – A dictionary containing the simulated data / context, if using the default keys, the method expects at least the mandatory keys sim_data and prior_draws to be present

  • epochs (int) – Number of epochs (and number of times a checkpoint is stored)

  • batch_size (int) – Number of simulations to perform at each backpropagation step

  • save_checkpoint (bool (default - True)) – Determines whether to save checkpoints after each epoch, if a checkpoint_path provided during initialization, otherwise ignored.

  • optimizer (tf.keras.optimizer.Optimizer or None) – Optimizer for the neural network. None will result in tf.keras.optimizers.Adam using a learning rate of 5e-4 and a cosine decay from 5e-4 to 0. A custom optimizer will override default learning rate and schedule settings.

  • reuse_optimizer (bool, optional, default: False) – A flag indicating whether the optimizer instance should be treated as persistent or not. If False, the optimizer and its states are not stored after training has finished. Otherwise, the optimizer will be stored as self.optimizer and re-used in further training runs.

  • optional_stopping (bool, optional, default: True) – Whether to use optional stopping or not during training. Could speed up training.

  • use_autograph (bool, optional, default: True) – Whether to use autograph for the backprop step. Could lead to enormous speed-ups but could also be harder to debug.

  • **kwargs (dict, optional) – Optional keyword arguments, which can be: model_args - optional keyword arguments passed to the generative model conf_args - optional keyword arguments passed to the configurator net_args - optional keyword arguments passed to the amortizer

Returns

losses – A dictionary or a data frame storing the losses across epochs and iterations

Return type

dict or pandas.DataFrame
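
A minimal sketch of the expected input with the two mandatory default keys (prior_samples and data are placeholder arrays):

    simulations_dict = {
        "prior_draws": prior_samples,  # e.g., np.ndarray of shape (n_sim, n_params)
        "sim_data": data,              # e.g., np.ndarray of shape (n_sim, n_obs, data_dim)
    }
    losses = trainer.train_offline(simulations_dict, epochs=30, batch_size=64)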

train_online(epochs, iterations_per_epoch, batch_size, save_checkpoint=True, optimizer=None, reuse_optimizer=False, optional_stopping=True, use_autograph=True, **kwargs)[source]

Trains an amortizer via online learning. Additional keyword arguments are passed to the generative model, configurator, and amortizer.

Parameters
  • epochs (int) – Number of epochs (and number of times a checkpoint is stored)

  • iterations_per_epoch (int) – Number of batch simulations to perform per epoch

  • batch_size (int) – Number of simulations to perform at each backprop step

  • save_checkpoint (bool (default - True)) – A flag to decide whether to save checkpoints after each epoch, if a checkpoint_path provided during initialization, otherwise ignored.

  • optimizer (tf.keras.optimizer.Optimizer or None) – Optimizer for the neural network. None will result in tf.keras.optimizers.Adam using a learning rate of 5e-4 and a cosine decay from 5e-4 to 0. A custom optimizer will override default learning rate and schedule settings.

  • reuse_optimizer (bool, optional, default: False) – A flag indicating whether the optimizer instance should be treated as persistent or not. If False, the optimizer and its states are not stored after training has finished. Otherwise, the optimizer will be stored as self.optimizer and re-used in further training runs.

  • optional_stopping (bool, optional, default: True) – Whether to use optional stopping or not during training. Could speed up training.

  • use_autograph (bool, optional, default: True) – Whether to use autograph for the backprop step. Could lead to enormous speed-ups but could also be harder to debug.

  • **kwargs (dict, optional) – Optional keyword arguments, which can be: model_args - optional keyword arguments passed to the generative model conf_args - optional keyword arguments passed to the configurator net_args - optional keyword arguments passed to the amortizer

Returns

losses – A dictionary storing the losses across epochs and iterations

Return type

dict or pandas.DataFrame

train_rounds(rounds, sim_per_round, epochs, batch_size, save_checkpoint=True, optimizer=None, reuse_optimizer=False, optional_stopping=True, use_autograph=True, **kwargs)[source]

Trains an amortizer via round-based learning. In each round, sim_per_round data sets are simulated from the generative model and added to the data sets simulated in previous rounds. Then, the networks are trained for epochs on the augmented set of data sets.

Important: Training time will increase from round to round, since the number of simulations increases correspondingly. The final round will then train the networks on rounds * sim_per_round data sets, so make sure this number does not eat up all available memory.

Parameters
  • rounds (int) – Number of rounds to perform (outer loop)

  • sim_per_round (int) – Number of simulations per round

  • epochs (int) – Number of epochs (and number of times a checkpoint is stored, inner loop) within a round.

  • batch_size (int) – Number of simulations to use at each backpropagation step

  • save_checkpoint (bool, optional, (default - True)) – A flag to decide whether to save checkpoints after each epoch, if a checkpoint_path provided during initialization, otherwise ignored.

  • optimizer (tf.keras.optimizer.Optimizer or None) – Optimizer for the neural network training. None will result in tf.keras.optimizers.Adam using a learning rate of 5e-4 and a cosine decay from 5e-4 to 0. A custom optimizer will override default learning rate and schedule settings.

  • reuse_optimizer (bool, optional, default: False) – A flag indicating whether the optimizer instance should be treated as persistent or not. If False, the optimizer and its states are not stored after training has finished. Otherwise, the optimizer will be stored as self.optimizer and re-used in further training runs.

  • optional_stopping (bool, optional, default: True) – Whether to use optional stopping or not during training. Could speed up training.

  • use_autograph (bool, optional, default: True) – Whether to use autograph for the backprop step. Could lead to enormous speed-ups but could also be harder to debug.

  • **kwargs (dict, optional) – Optional keyword arguments, which can be: model_args - optional keyword arguments passed to the generative model conf_args - optional keyword arguments passed to the configurator net_args - optional keyword arguments passed to the amortizer

Returns

losses – A dictionary or a data frame storing the losses across epochs and iterations

Return type

dict or pandas.DataFrame

bayesflow.version module

bayesflow.wrappers module

class bayesflow.wrappers.SpectralNormalization(*args, **kwargs)[source]

Bases: Wrapper

Performs spectral normalization on neural network weights. Adapted from:

https://www.tensorflow.org/addons/api_docs/python/tfa/layers/SpectralNormalization

This wrapper controls the Lipschitz constant of a layer by constraining its spectral norm, which can stabilize the training of generative networks.

See Spectral Normalization for Generative Adversarial Networks: https://arxiv.org/abs/1802.05957
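
A usage sketch wrapping a Keras layer (the layer choice and shapes are illustrative):

    import tensorflow as tf

    dense_sn = SpectralNormalization(tf.keras.layers.Dense(64, activation="relu"))
    out = dense_sn(tf.random.normal((8, 16)), training=True)  # training=True re-normalizes the weights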

build(input_shape)[source]

Build Layer

call(inputs, training=False)[source]

Call Layer

Parameters

inputs (tf.Tensor of shape (None,...,condition_dim + target_dim)) – The inputs to the corresponding layer.

get_config()[source]

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns

Python dictionary.

normalize_weights()[source]

Generate spectral normalized weights.

This method will update the value of self.w with the spectral normalized value, so that the layer is ready for call().

Module contents