Title: | Interface to 'TensorFlow Probability' |
Version: | 0.15.1 |
Description: | Interface to 'TensorFlow Probability', a 'Python' library built on 'TensorFlow' that makes it easy to combine probabilistic models and deep learning on modern hardware ('TPU', 'GPU'). 'TensorFlow Probability' includes a wide selection of probability distributions and bijectors, probabilistic layers, variational inference, Markov chain Monte Carlo, and optimizers such as Nelder-Mead, BFGS, and SGLD. |
License: | Apache License (≥ 2.0) |
URL: | https://github.com/rstudio/tfprobability |
BugReports: | https://github.com/rstudio/tfprobability/issues |
SystemRequirements: | TensorFlow Probability (https://www.tensorflow.org/probability) |
Encoding: | UTF-8 |
RoxygenNote: | 7.2.1 |
Imports: | tensorflow (≥ 2.4.0), reticulate, keras, magrittr |
Suggests: | tfdatasets, testthat (≥ 2.1.0), knitr, rmarkdown |
VignetteBuilder: | knitr |
NeedsCompilation: | no |
Packaged: | 2022-08-31 17:00:12 UTC; tomasz |
Author: | Tomasz Kalinowski [ctb, cre],
Sigrid Keydana [aut],
Daniel Falbel [ctb],
Kevin Kuo |
Maintainer: | Tomasz Kalinowski <tomasz.kalinowski@rstudio.com> |
Repository: | CRAN |
Date/Publication: | 2022-09-01 09:10:05 UTC |
GLM families
Description
A list of models that can be used as the model argument in glm_fit():
Details
- Bernoulli: Bernoulli(probs=mean), where mean = sigmoid(matmul(X, weights))
- BernoulliNormalCDF: Bernoulli(probs=mean), where mean = Normal(0, 1).cdf(matmul(X, weights))
- GammaExp: Gamma(concentration=1, rate=1 / mean), where mean = exp(matmul(X, weights))
- GammaSoftplus: Gamma(concentration=1, rate=1 / mean), where mean = softplus(matmul(X, weights))
- LogNormal: LogNormal(loc=log(mean) - log(2) / 2, scale=sqrt(log(2))), where mean = exp(matmul(X, weights))
- LogNormalSoftplus: LogNormal(loc=log(mean) - log(2) / 2, scale=sqrt(log(2))), where mean = softplus(matmul(X, weights))
- Normal: Normal(loc=mean, scale=1), where mean = matmul(X, weights)
- NormalReciprocal: Normal(loc=mean, scale=1), where mean = 1 / matmul(X, weights)
- Poisson: Poisson(rate=mean), where mean = exp(matmul(X, weights))
- PoissonSoftplus: Poisson(rate=mean), where mean = softplus(matmul(X, weights))
Value
A list of models that can be used as the model argument in glm_fit().
See Also
Other glm_fit: glm_fit.tensorflow.tensor(), glm_fit_one_step.tensorflow.tensor()
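Examples
A minimal sketch (not part of the shipped manual), assuming a working
TensorFlow Probability installation: simulate features and Poisson counts,
then fit them with the "Poisson" family listed above.

library(tfprobability)
library(tensorflow)

x <- tf$random$normal(shape(100L, 3L))
true_weights <- tf$constant(c(0.5, -0.2, 0.1), dtype = tf$float32)
# mean = exp(matmul(X, weights)), matching the "Poisson" family
lam <- tf$exp(tf$linalg$matvec(x, true_weights))
response <- tf$random$poisson(shape = list(), lam = lam)
fit <- glm_fit(x, response = response, model = "Poisson")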
Runs multiple Fisher scoring steps
Description
Runs multiple Fisher scoring steps
Usage
glm_fit(x, ...)
Arguments
x |
float-like, matrix-shaped Tensor where each row represents a sample's features. |
... |
other arguments passed to specific methods. |
Value
A glm_fit
object with parameter estimates, number of iterations,
etc.
See Also
glm_fit.tensorflow.tensor()
Runs multiple Fisher scoring steps
Description
Runs multiple Fisher scoring steps
Usage
## S3 method for class 'tensorflow.tensor'
glm_fit(
x,
response,
model,
model_coefficients_start = NULL,
predicted_linear_response_start = NULL,
l2_regularizer = NULL,
dispersion = NULL,
offset = NULL,
convergence_criteria_fn = NULL,
learning_rate = NULL,
fast_unsafe_numerics = TRUE,
maximum_iterations = NULL,
name = NULL,
...
)
Arguments
x |
float-like, matrix-shaped Tensor where each row represents a sample's features. |
response |
vector-shaped Tensor where each element represents a sample's
observed response (to the corresponding row of features). Must have same dtype as x. |
model |
a string naming the model (see glm_families) or a tfp$glm$ExponentialFamily-like instance, which implicitly characterizes a negative log-likelihood loss. |
model_coefficients_start |
Optional (batch of) vector-shaped Tensor representing
the initial model coefficients, one for each column in x. |
predicted_linear_response_start |
Optional Tensor with shape and dtype matching response, representing an offset-shifted initial linear prediction based on model_coefficients_start. |
l2_regularizer |
Optional scalar Tensor representing an L2 regularization penalty.
Default: NULL (i.e., no L2 regularization). |
dispersion |
Optional (batch of) Tensor representing response dispersion. |
offset |
Optional Tensor representing a constant shift applied to the predicted linear response. |
convergence_criteria_fn |
callable determining whether Fisher scoring has converged. Default: NULL (i.e., a default convergence criterion is used). |
learning_rate |
Optional (batch of) scalar Tensor used to dampen iterative progress.
Typically only needed if optimization diverges; should be no larger than 1 and typically
very close to 1. Default value: NULL (i.e., 1). |
fast_unsafe_numerics |
Optional Python bool indicating if faster, less numerically accurate methods can be employed for computing the weighted least-squares solution. Default value: TRUE (i.e., "fast but possibly diminished accuracy"). |
maximum_iterations |
Optional maximum number of iterations of Fisher scoring to run;
"and-ed" with the result of convergence_criteria_fn. |
name |
used as name prefix to ops created by this function. Default value: "fit". |
... |
other arguments passed to specific methods. |
Value
A glm_fit object with parameter estimates and the number of required steps.
See Also
Other glm_fit: glm_families, glm_fit_one_step.tensorflow.tensor()
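Examples
A hedged sketch (not from the shipped manual): logistic regression with the
"Bernoulli" family, an L2 penalty, and a cap on the number of Fisher scoring
iterations.

library(tfprobability)
library(tensorflow)

x <- tf$random$normal(shape(200L, 2L))
w <- tf$constant(c(1, -1), dtype = tf$float32)
# deterministic 0/1 responses from the sign of the linear predictor
response <- tf$cast(tf$linalg$matvec(x, w) > 0, tf$float32)
fit <- glm_fit(x,
               response = response,
               model = "Bernoulli",
               l2_regularizer = 0.1,
               maximum_iterations = 10L)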
Runs one Fisher scoring step
Description
Runs one Fisher scoring step
Usage
glm_fit_one_step(x, ...)
Arguments
x |
float-like, matrix-shaped Tensor where each row represents a sample's features. |
... |
other arguments passed to specific methods. |
Value
A glm_fit
object with parameter estimates, number of iterations,
etc.
See Also
glm_fit_one_step.tensorflow.tensor()
Runs one Fisher Scoring step
Description
Runs one Fisher Scoring step
Usage
## S3 method for class 'tensorflow.tensor'
glm_fit_one_step(
x,
response,
model,
model_coefficients_start = NULL,
predicted_linear_response_start = NULL,
l2_regularizer = NULL,
dispersion = NULL,
offset = NULL,
learning_rate = NULL,
fast_unsafe_numerics = TRUE,
name = NULL,
...
)
Arguments
x |
float-like, matrix-shaped Tensor where each row represents a sample's features. |
response |
vector-shaped Tensor where each element represents a sample's
observed response (to the corresponding row of features). Must have same dtype as x. |
model |
a string naming the model (see glm_families) or a tfp$glm$ExponentialFamily-like instance, which implicitly characterizes a negative log-likelihood loss. |
model_coefficients_start |
Optional (batch of) vector-shaped Tensor representing
the initial model coefficients, one for each column in x. |
predicted_linear_response_start |
Optional Tensor with shape and dtype matching response, representing an offset-shifted initial linear prediction based on model_coefficients_start. |
l2_regularizer |
Optional scalar Tensor representing an L2 regularization penalty.
Default: NULL (i.e., no L2 regularization). |
dispersion |
Optional (batch of) Tensor representing response dispersion. |
offset |
Optional Tensor representing a constant shift applied to the predicted linear response. |
learning_rate |
Optional (batch of) scalar Tensor used to dampen iterative progress.
Typically only needed if optimization diverges; should be no larger than 1 and typically
very close to 1. Default value: NULL (i.e., 1). |
fast_unsafe_numerics |
Optional Python bool indicating if faster, less numerically accurate methods can be employed for computing the weighted least-squares solution. Default value: TRUE (i.e., "fast but possibly diminished accuracy"). |
name |
used as name prefix to ops created by this function. Default value: "fit". |
... |
other arguments passed to specific methods. |
Value
A glm_fit object with parameter estimates and the number of required steps.
See Also
Other glm_fit: glm_families, glm_fit.tensorflow.tensor()
Blockwise Initializer
Description
Initializer which concatenates other initializers
Usage
initializer_blockwise(initializers, sizes, validate_args = FALSE)
Arguments
initializers |
list of Keras initializers, e.g. created via the keras initializer_* functions. |
sizes |
list of integer scalars representing the number of elements associated
with each initializer in initializers. |
validate_args |
bool indicating whether we should do (possibly expensive) graph-time assertions, if necessary. |
Value
Initializer which concatenates other initializers
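Examples
A minimal sketch, assuming a variable with 10 elements: the first 5 elements
are initialized to zero and the remaining 5 with glorot-uniform draws. The
sizes must sum to the number of elements being initialized.

library(keras)
library(tfprobability)

init <- initializer_blockwise(
  initializers = list(initializer_zeros(), initializer_glorot_uniform()),
  sizes = list(5L, 5L)
)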
Installs TensorFlow Probability
Description
Installs TensorFlow Probability
Usage
install_tfprobability(
method = c("auto", "virtualenv", "conda"),
conda = "auto",
version = "default",
tensorflow = "default",
extra_packages = NULL,
...,
pip_ignore_installed = TRUE
)
Arguments
method |
Installation method. By default, "auto" automatically finds a method that will work in the local environment. Change the default to force a specific installation method. Note that the "virtualenv" method is not available on Windows. |
conda |
The path to a conda executable. Use "auto" to allow reticulate to automatically find an appropriate conda binary. |
version |
TensorFlow version to install. Valid values include:
|
tensorflow |
Synonym for version. Maintained for backwards compatibility. |
extra_packages |
Additional Python packages to install along with TensorFlow. |
... |
other arguments passed to tensorflow::install_tensorflow(). |
pip_ignore_installed |
Whether pip should ignore previously installed Python packages and
reinstall them. This defaults to TRUE. |
Value
invisible
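Examples
A hypothetical invocation (not from the shipped manual): install TensorFlow
Probability alongside the default TensorFlow build, letting reticulate pick
the installation method.

## Not run: 
library(tfprobability)
install_tfprobability(method = "auto", version = "default")
## End(Not run)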
Masked Autoencoder for Distribution Estimation
Description
layer_autoregressive takes as input a Tensor of shape [..., event_size] and
returns a Tensor of shape [..., event_size, params].
The output satisfies the autoregressive property. That is, the layer is
configured with some permutation ord of {0, ..., event_size-1} (i.e., an
ordering of the input dimensions), and the output output[batch_idx, i, ...]
for input dimension i depends only on inputs x[batch_idx, j] where
ord(j) < ord(i).
Usage
layer_autoregressive(
object,
params,
event_shape = NULL,
hidden_units = NULL,
input_order = "left-to-right",
hidden_degrees = "equal",
activation = NULL,
use_bias = TRUE,
kernel_initializer = "glorot_uniform",
validate_args = FALSE,
...
)
Arguments
object |
What to compose the new
|
params |
integer specifying the number of parameters to output per input. |
event_shape |
list-like of positive integers (or a single integer), specifying the shape of the
input to this layer, which is also the event_shape of the distribution this layer
parameterizes. Currently only rank-1 shapes are supported. If not specified, the
event shape is inferred when this layer is first called or built. |
hidden_units |
list-like of non-negative integers, specifying the number of units in each hidden layer. |
input_order |
Order of degrees to the input units: 'random',
'left-to-right', 'right-to-left', or an array of an explicit order. For
example, 'left-to-right' builds an autoregressive model:
p(x) = p(x1) p(x2 | x1) ... p(xd | x<d). |
hidden_degrees |
Method for assigning degrees to the hidden units: 'equal', 'random'. If 'equal', hidden units in each layer are allocated equally (up to a remainder term) to each degree. Default: 'equal'. |
activation |
An activation function. See |
use_bias |
Whether or not the dense layers constructed in this layer
should have a bias term. See |
kernel_initializer |
Initializer for the kernel weights matrix. Default: 'glorot_uniform'. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
... |
Additional keyword arguments passed to the |
Details
The autoregressive property allows us to use output[batch_idx, i] to
parameterize conditional distributions
p(x[batch_idx, i] | x[batch_idx, j] for ord(j) < ord(i))
which give us a tractable distribution over the input x[batch_idx]:
p(x[batch_idx]) = prod_i p(x[batch_idx, ord(i)] | x[batch_idx, ord(0:i)])
For example, when params is 2, the output of the layer can parameterize
the location and log-scale of an autoregressive Gaussian distribution.
Value
a Keras layer
See Also
Other layers: layer_conv_1d_flipout(), layer_conv_1d_reparameterization(), layer_conv_2d_flipout(), layer_conv_2d_reparameterization(), layer_conv_3d_flipout(), layer_conv_3d_reparameterization(), layer_dense_flipout(), layer_dense_local_reparameterization(), layer_dense_reparameterization(), layer_dense_variational(), layer_variable()
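Examples
A minimal sketch, assuming 3-dimensional events: the MADE network maps a
Tensor of shape (batch, 3) to shape (batch, 3, 2), i.e. params = 2 outputs
per input dimension.

library(tensorflow)
library(tfprobability)

made <- layer_autoregressive(params = 2L, event_shape = 3L,
                             hidden_units = list(16L, 16L))
x <- tf$random$normal(shape(5L, 3L))
out <- made(x)  # shape: (5, 3, 2)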
An autoregressive normalizing flow layer, given a layer_autoregressive.
Description
Following Papamakarios et al. (2017), given an autoregressive model p(x) with
conditional distributions in the location-scale family, we can construct a
normalizing flow for p(x).
Usage
layer_autoregressive_transform(object, made, ...)
Arguments
object |
What to compose the new
|
made |
A layer_autoregressive() instance (a Masked Autoencoder for Distribution Estimation) that computes the flow's shift and log-scale parameters. |
... |
Additional parameters passed to Keras Layer. |
Details
Specifically, suppose made is a layer_autoregressive() – a layer implementing
a Masked Autoencoder for Distribution Estimation (MADE) – that computes location
and log-scale parameters made(x)[i] for each input x[i]. Then we can represent
the autoregressive model p(x) as x = f(u) where u is drawn from some base
distribution and where f is an invertible and differentiable function
(i.e., a Bijector) whose inverse f^{-1}(x) is defined by:

library(tensorflow)
library(zeallot)

f_inverse <- function(x) {
  c(shift, log_scale) %<-% tf$unstack(made(x), 2, axis = -1L)
  (x - shift) * tf$math$exp(-log_scale)
}

Given a layer_autoregressive() made, a layer_autoregressive_transform()
transforms an input tfd_* p(u) to an output tfd_* p(x) where x = f(u).
Value
a Keras layer
References
Papamakarios, G., Pavlakou, T., & Murray, I. (2017). Masked Autoregressive Flow for Density Estimation. In Advances in Neural Information Processing Systems.
See Also
tfb_masked_autoregressive_flow() and layer_autoregressive()
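Examples
A hedged sketch (shapes and unit counts are illustrative): a MADE network
supplying shift and log-scale parameters, wrapped into an autoregressive
flow transform layer.

library(tfprobability)

made <- layer_autoregressive(params = 2L, event_shape = 2L,
                             hidden_units = list(10L, 10L))
flow <- layer_autoregressive_transform(made = made)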
A OneHotCategorical mixture Keras layer from k * (1 + d) params.
Description
k (i.e., num_components) represents the number of component OneHotCategorical
distributions and d (i.e., event_size) represents the number of categories
within each OneHotCategorical distribution.
Usage
layer_categorical_mixture_of_one_hot_categorical(
object,
event_size,
num_components,
convert_to_tensor_fn = tfp$distributions$Distribution$sample,
sample_dtype = NULL,
validate_args = FALSE,
...
)
Arguments
object |
What to compose the new
|
event_size |
Scalar integer representing the size of a single draw from this distribution. |
num_components |
Scalar integer representing the number of mixture components. |
convert_to_tensor_fn |
A callable that takes a tfd$Distribution instance and returns a
tf$Tensor-like object. Default value: tfp$distributions$Distribution$sample. |
sample_dtype |
dtype of samples produced by this distribution. Default value: NULL (i.e., previous layer's dtype). |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
... |
Additional arguments passed to |
Details
Typical choices for convert_to_tensor_fn include:
- tfp$distributions$Distribution$sample
- tfp$distributions$Distribution$mean
- tfp$distributions$Distribution$mode
Value
a Keras layer
See Also
For an example of how to use this layer in a Keras model, see layer_independent_normal().
Other distribution_layers: layer_distribution_lambda(), layer_independent_bernoulli(), layer_independent_logistic(), layer_independent_normal(), layer_independent_poisson(), layer_kl_divergence_add_loss(), layer_kl_divergence_regularizer(), layer_mixture_logistic(), layer_mixture_normal(), layer_mixture_same_family(), layer_multivariate_normal_tri_l(), layer_one_hot_categorical()
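Examples
A minimal sketch, assuming 10-dimensional inputs, k = 3 components and d = 5
categories: the preceding dense layer must emit k * (1 + d) = 18 parameters,
computed here with the matching params_size_* helper.

library(keras)
library(tfprobability)

event_size <- 5L
num_components <- 3L
model <- keras_model_sequential(list(
  layer_dense(units = params_size_categorical_mixture_of_one_hot_categorical(
                event_size, num_components),
              input_shape = 10L),
  layer_categorical_mixture_of_one_hot_categorical(
    event_size = event_size, num_components = num_components)
))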
1D convolution layer (e.g. temporal convolution) with Flipout
Description
This layer creates a convolution kernel that is convolved
(actually cross-correlated) with the layer input to produce a tensor of
outputs. It may also include a bias addition and activation function
on the outputs. It assumes the kernel
and/or bias
are drawn from distributions.
Usage
layer_conv_1d_flipout(
object,
filters,
kernel_size,
strides = 1,
padding = "valid",
data_format = "channels_last",
dilation_rate = 1,
activation = NULL,
activity_regularizer = NULL,
trainable = TRUE,
kernel_posterior_fn = tfp$layers$util$default_mean_field_normal_fn(),
kernel_posterior_tensor_fn = function(d) d %>% tfd_sample(),
kernel_prior_fn = tfp$layers$util$default_multivariate_normal_fn,
kernel_divergence_fn = function(q, p, ignore) tfd_kl_divergence(q, p),
bias_posterior_fn = tfp$layers$util$default_mean_field_normal_fn(is_singular = TRUE),
bias_posterior_tensor_fn = function(d) d %>% tfd_sample(),
bias_prior_fn = NULL,
bias_divergence_fn = function(q, p, ignore) tfd_kl_divergence(q, p),
...
)
Arguments
object |
What to compose the new
|
filters |
Integer, the dimensionality of the output space (i.e. the number of filters in the convolution). |
kernel_size |
An integer or list of a single integer, specifying the length of the 1D convolution window. |
strides |
An integer or list of a single integer,
specifying the stride length of the convolution.
Specifying any stride value != 1 is incompatible with specifying
any |
padding |
One of |
data_format |
A string, one of |
dilation_rate |
An integer or tuple/list of a single integer, specifying
the dilation rate to use for dilated convolution.
Currently, specifying any |
activation |
Activation function. Set it to None to maintain a linear activation. |
activity_regularizer |
Regularizer function for the output. |
trainable |
Whether the layer weights will be updated during training. |
kernel_posterior_fn |
Function which creates |
kernel_posterior_tensor_fn |
Function which takes a |
kernel_prior_fn |
Function which creates |
kernel_divergence_fn |
Function which takes the surrogate posterior distribution, prior distribution and random variate
sample(s) from the surrogate posterior and computes or approximates the KL divergence. The
distributions are |
bias_posterior_fn |
Function which creates a |
bias_posterior_tensor_fn |
Function which takes a |
bias_prior_fn |
Function which creates |
bias_divergence_fn |
Function which takes the surrogate posterior distribution, prior distribution and random variate sample(s)
from the surrogate posterior and computes or approximates the KL divergence. The
distributions are |
... |
Additional keyword arguments passed to the |
Details
This layer implements the Bayesian variational inference analogue to
a dense layer by assuming the kernel
and/or the bias
are drawn
from distributions.
By default, the layer implements a stochastic forward pass via sampling from the kernel and bias posteriors,
outputs = f(inputs; kernel, bias), kernel, bias ~ posterior
where f denotes the layer's calculation. It uses the Flipout
estimator (Wen et al., 2018), which performs a Monte Carlo approximation
of the distribution integrating over the kernel
and bias
. Flipout uses
roughly twice as many floating point operations as the reparameterization
estimator but has the advantage of significantly lower variance.
The arguments permit separate specification of the surrogate posterior
(q(W|x)
), prior (p(W)
), and divergence for both the kernel
and bias
distributions.
Upon being built, this layer adds losses (accessible via the losses
property) representing the divergences of kernel
and/or bias
surrogate
posteriors and their respective priors. When doing minibatch stochastic
optimization, make sure to scale this loss such that it is applied just once
per epoch (e.g. if kl
is the sum of losses
for each element of the batch,
you should pass kl / num_examples_per_epoch
to your optimizer).
You can access the kernel
and/or bias
posterior and prior distributions
after the layer is built via the kernel_posterior
, kernel_prior
,
bias_posterior
and bias_prior
properties.
Value
a Keras layer
References
Wen, Y., Vicol, P., Ba, J., Tran, D., & Grosse, R. (2018). Flipout: Efficient Pseudo-Independent Weight Perturbations on Mini-Batches. In International Conference on Learning Representations.
See Also
Other layers: layer_autoregressive(), layer_conv_1d_reparameterization(), layer_conv_2d_flipout(), layer_conv_2d_reparameterization(), layer_conv_3d_flipout(), layer_conv_3d_reparameterization(), layer_dense_flipout(), layer_dense_local_reparameterization(), layer_dense_reparameterization(), layer_dense_variational(), layer_variable()
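Examples
A hedged sketch, assuming n = 1000 training examples: the divergence function
is scaled by 1/n so that the KL penalty is applied once per epoch, as
recommended in Details.

library(keras)
library(tfprobability)

n <- 1000
scaled_kl <- function(q, p, ignore) tfd_kl_divergence(q, p) / n
model <- keras_model_sequential() %>%
  layer_conv_1d_flipout(filters = 16, kernel_size = 5, activation = "relu",
                        kernel_divergence_fn = scaled_kl,
                        input_shape = c(128, 1)) %>%
  layer_flatten() %>%
  layer_dense(units = 10, activation = "softmax")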
1D convolution layer (e.g. temporal convolution).
Description
This layer creates a convolution kernel that is convolved
(actually cross-correlated) with the layer input to produce a tensor of
outputs. It may also include a bias addition and activation function
on the outputs. It assumes the kernel
and/or bias
are drawn from distributions.
Usage
layer_conv_1d_reparameterization(
object,
filters,
kernel_size,
strides = 1,
padding = "valid",
data_format = "channels_last",
dilation_rate = 1,
activation = NULL,
activity_regularizer = NULL,
trainable = TRUE,
kernel_posterior_fn = tfp$layers$util$default_mean_field_normal_fn(),
kernel_posterior_tensor_fn = function(d) d %>% tfd_sample(),
kernel_prior_fn = tfp$layers$util$default_multivariate_normal_fn,
kernel_divergence_fn = function(q, p, ignore) tfd_kl_divergence(q, p),
bias_posterior_fn = tfp$layers$util$default_mean_field_normal_fn(is_singular = TRUE),
bias_posterior_tensor_fn = function(d) d %>% tfd_sample(),
bias_prior_fn = NULL,
bias_divergence_fn = function(q, p, ignore) tfd_kl_divergence(q, p),
...
)
Arguments
object |
What to compose the new
|
filters |
Integer, the dimensionality of the output space (i.e. the number of filters in the convolution). |
kernel_size |
An integer or list of a single integer, specifying the length of the 1D convolution window. |
strides |
An integer or list of a single integer,
specifying the stride length of the convolution.
Specifying any stride value != 1 is incompatible with specifying
any |
padding |
One of |
data_format |
A string, one of |
dilation_rate |
An integer or tuple/list of a single integer, specifying
the dilation rate to use for dilated convolution.
Currently, specifying any |
activation |
Activation function. Set it to None to maintain a linear activation. |
activity_regularizer |
Regularizer function for the output. |
trainable |
Whether the layer weights will be updated during training. |
kernel_posterior_fn |
Function which creates |
kernel_posterior_tensor_fn |
Function which takes a |
kernel_prior_fn |
Function which creates |
kernel_divergence_fn |
Function which takes the surrogate posterior distribution, prior distribution and random variate
sample(s) from the surrogate posterior and computes or approximates the KL divergence. The
distributions are |
bias_posterior_fn |
Function which creates a |
bias_posterior_tensor_fn |
Function which takes a |
bias_prior_fn |
Function which creates |
bias_divergence_fn |
Function which takes the surrogate posterior distribution, prior distribution and random variate sample(s)
from the surrogate posterior and computes or approximates the KL divergence. The
distributions are |
... |
Additional keyword arguments passed to the |
Details
This layer implements the Bayesian variational inference analogue to
a dense layer by assuming the kernel
and/or the bias
are drawn
from distributions.
By default, the layer implements a stochastic forward pass via sampling from the kernel and bias posteriors,
outputs = f(inputs; kernel, bias), kernel, bias ~ posterior
where f denotes the layer's calculation. It uses the reparameterization
estimator (Kingma and Welling, 2014), which performs a Monte Carlo
approximation of the distribution integrating over the kernel
and bias
.
The arguments permit separate specification of the surrogate posterior
(q(W|x)
), prior (p(W)
), and divergence for both the kernel
and bias
distributions.
Upon being built, this layer adds losses (accessible via the losses
property) representing the divergences of kernel
and/or bias
surrogate
posteriors and their respective priors. When doing minibatch stochastic
optimization, make sure to scale this loss such that it is applied just once
per epoch (e.g. if kl
is the sum of losses
for each element of the batch,
you should pass kl / num_examples_per_epoch
to your optimizer).
You can access the kernel
and/or bias
posterior and prior distributions
after the layer is built via the kernel_posterior
, kernel_prior
,
bias_posterior
and bias_prior
properties.
Value
a Keras layer
References
Kingma, D. P., & Welling, M. (2014). Auto-Encoding Variational Bayes. In International Conference on Learning Representations.
See Also
Other layers: layer_autoregressive(), layer_conv_1d_flipout(), layer_conv_2d_flipout(), layer_conv_2d_reparameterization(), layer_conv_3d_flipout(), layer_conv_3d_reparameterization(), layer_dense_flipout(), layer_dense_local_reparameterization(), layer_dense_reparameterization(), layer_dense_variational(), layer_variable()
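Examples
A minimal sketch (shapes are illustrative): once built, each variational
layer contributes one KL divergence term to the model's losses property.

library(keras)
library(tfprobability)

model <- keras_model_sequential() %>%
  layer_conv_1d_reparameterization(filters = 8, kernel_size = 3,
                                   activation = "relu",
                                   input_shape = c(64, 1)) %>%
  layer_flatten() %>%
  layer_dense(units = 1)
length(model$losses)  # one KL term for the variational layer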
2D convolution layer (e.g. spatial convolution over images) with Flipout
Description
This layer creates a convolution kernel that is convolved
(actually cross-correlated) with the layer input to produce a tensor of
outputs. It may also include a bias addition and activation function
on the outputs. It assumes the kernel
and/or bias
are drawn from distributions.
Usage
layer_conv_2d_flipout(
object,
filters,
kernel_size,
strides = 1,
padding = "valid",
data_format = "channels_last",
dilation_rate = 1,
activation = NULL,
activity_regularizer = NULL,
trainable = TRUE,
kernel_posterior_fn = tfp$layers$util$default_mean_field_normal_fn(),
kernel_posterior_tensor_fn = function(d) d %>% tfd_sample(),
kernel_prior_fn = tfp$layers$util$default_multivariate_normal_fn,
kernel_divergence_fn = function(q, p, ignore) tfd_kl_divergence(q, p),
bias_posterior_fn = tfp$layers$util$default_mean_field_normal_fn(is_singular = TRUE),
bias_posterior_tensor_fn = function(d) d %>% tfd_sample(),
bias_prior_fn = NULL,
bias_divergence_fn = function(q, p, ignore) tfd_kl_divergence(q, p),
...
)
Arguments
object |
What to compose the new
|
filters |
Integer, the dimensionality of the output space (i.e. the number of filters in the convolution). |
kernel_size |
An integer or list of 2 integers, specifying the height and width of the 2D convolution window. |
strides |
An integer or list of a single integer,
specifying the stride length of the convolution.
Specifying any stride value != 1 is incompatible with specifying
any |
padding |
One of |
data_format |
A string, one of |
dilation_rate |
An integer or tuple/list of a single integer, specifying
the dilation rate to use for dilated convolution.
Currently, specifying any |
activation |
Activation function. Set it to None to maintain a linear activation. |
activity_regularizer |
Regularizer function for the output. |
trainable |
Whether the layer weights will be updated during training. |
kernel_posterior_fn |
Function which creates |
kernel_posterior_tensor_fn |
Function which takes a |
kernel_prior_fn |
Function which creates |
kernel_divergence_fn |
Function which takes the surrogate posterior distribution, prior distribution and random variate
sample(s) from the surrogate posterior and computes or approximates the KL divergence. The
distributions are |
bias_posterior_fn |
Function which creates a |
bias_posterior_tensor_fn |
Function which takes a |
bias_prior_fn |
Function which creates |
bias_divergence_fn |
Function which takes the surrogate posterior distribution, prior distribution and random variate sample(s)
from the surrogate posterior and computes or approximates the KL divergence. The
distributions are |
... |
Additional keyword arguments passed to the |
Details
This layer implements the Bayesian variational inference analogue to
a dense layer by assuming the kernel
and/or the bias
are drawn
from distributions.
By default, the layer implements a stochastic forward pass via sampling from the kernel and bias posteriors,
outputs = f(inputs; kernel, bias), kernel, bias ~ posterior
where f denotes the layer's calculation. It uses the Flipout
estimator (Wen et al., 2018), which performs a Monte Carlo approximation
of the distribution integrating over the kernel
and bias
. Flipout uses
roughly twice as many floating point operations as the reparameterization
estimator but has the advantage of significantly lower variance.
The arguments permit separate specification of the surrogate posterior
(q(W|x)
), prior (p(W)
), and divergence for both the kernel
and bias
distributions.
Upon being built, this layer adds losses (accessible via the losses
property) representing the divergences of kernel
and/or bias
surrogate
posteriors and their respective priors. When doing minibatch stochastic
optimization, make sure to scale this loss such that it is applied just once
per epoch (e.g. if kl
is the sum of losses
for each element of the batch,
you should pass kl / num_examples_per_epoch
to your optimizer).
You can access the kernel
and/or bias
posterior and prior distributions
after the layer is built via the kernel_posterior
, kernel_prior
,
bias_posterior
and bias_prior
properties.
Value
a Keras layer
References
Wen, Y., Vicol, P., Ba, J., Tran, D., & Grosse, R. (2018). Flipout: Efficient Pseudo-Independent Weight Perturbations on Mini-Batches. In International Conference on Learning Representations.
See Also
Other layers: layer_autoregressive(), layer_conv_1d_flipout(), layer_conv_1d_reparameterization(), layer_conv_2d_reparameterization(), layer_conv_3d_flipout(), layer_conv_3d_reparameterization(), layer_dense_flipout(), layer_dense_local_reparameterization(), layer_dense_reparameterization(), layer_dense_variational(), layer_variable()
2D convolution layer (e.g. spatial convolution over images)
Description
This layer creates a convolution kernel that is convolved
(actually cross-correlated) with the layer input to produce a tensor of
outputs. It may also include a bias addition and activation function
on the outputs. It assumes the kernel
and/or bias
are drawn from distributions.
Usage
layer_conv_2d_reparameterization(
object,
filters,
kernel_size,
strides = 1,
padding = "valid",
data_format = "channels_last",
dilation_rate = 1,
activation = NULL,
activity_regularizer = NULL,
trainable = TRUE,
kernel_posterior_fn = tfp$layers$util$default_mean_field_normal_fn(),
kernel_posterior_tensor_fn = function(d) d %>% tfd_sample(),
kernel_prior_fn = tfp$layers$util$default_multivariate_normal_fn,
kernel_divergence_fn = function(q, p, ignore) tfd_kl_divergence(q, p),
bias_posterior_fn = tfp$layers$util$default_mean_field_normal_fn(is_singular = TRUE),
bias_posterior_tensor_fn = function(d) d %>% tfd_sample(),
bias_prior_fn = NULL,
bias_divergence_fn = function(q, p, ignore) tfd_kl_divergence(q, p),
...
)
Arguments
object |
What to compose the new
|
filters |
Integer, the dimensionality of the output space (i.e. the number of filters in the convolution). |
kernel_size |
An integer or list of 2 integers, specifying the height and width of the 2D convolution window. |
strides |
An integer or list of a single integer,
specifying the stride length of the convolution.
Specifying any stride value != 1 is incompatible with specifying
any |
padding |
One of |
data_format |
A string, one of |
dilation_rate |
An integer or tuple/list of a single integer, specifying
the dilation rate to use for dilated convolution.
Currently, specifying any |
activation |
Activation function. Set it to None to maintain a linear activation. |
activity_regularizer |
Regularizer function for the output. |
trainable |
Whether the layer weights will be updated during training. |
kernel_posterior_fn |
Function which creates |
kernel_posterior_tensor_fn |
Function which takes a |
kernel_prior_fn |
Function which creates |
kernel_divergence_fn |
Function which takes the surrogate posterior distribution, prior distribution and random variate
sample(s) from the surrogate posterior and computes or approximates the KL divergence. The
distributions are |
bias_posterior_fn |
Function which creates a |
bias_posterior_tensor_fn |
Function which takes a |
bias_prior_fn |
Function which creates |
bias_divergence_fn |
Function which takes the surrogate posterior distribution, prior distribution and random variate sample(s)
from the surrogate posterior and computes or approximates the KL divergence. The
distributions are |
... |
Additional keyword arguments passed to the |
Details
This layer implements the Bayesian variational inference analogue to
a dense layer by assuming the kernel
and/or the bias
are drawn
from distributions.
By default, the layer implements a stochastic forward pass via sampling from the kernel and bias posteriors,
outputs = f(inputs; kernel, bias), kernel, bias ~ posterior
where f denotes the layer's calculation. It uses the reparameterization
estimator (Kingma and Welling, 2014), which performs a Monte Carlo
approximation of the distribution integrating over the kernel
and bias
.
The arguments permit separate specification of the surrogate posterior
(q(W|x)
), prior (p(W)
), and divergence for both the kernel
and bias
distributions.
Upon being built, this layer adds losses (accessible via the losses
property) representing the divergences of kernel
and/or bias
surrogate
posteriors and their respective priors. When doing minibatch stochastic
optimization, make sure to scale this loss such that it is applied just once
per epoch (e.g. if kl
is the sum of losses
for each element of the batch,
you should pass kl / num_examples_per_epoch
to your optimizer).
You can access the kernel
and/or bias
posterior and prior distributions
after the layer is built via the kernel_posterior
, kernel_prior
,
bias_posterior
and bias_prior
properties.
Value
a Keras layer
References
Kingma, D. P., & Welling, M. (2014). Auto-Encoding Variational Bayes. In International Conference on Learning Representations.
See Also
Other layers: layer_autoregressive(), layer_conv_1d_flipout(), layer_conv_1d_reparameterization(), layer_conv_2d_flipout(), layer_conv_3d_flipout(), layer_conv_3d_reparameterization(), layer_dense_flipout(), layer_dense_local_reparameterization(), layer_dense_reparameterization(), layer_dense_variational(), layer_variable()
3D convolution layer (e.g. spatial convolution over volumes) with Flipout
Description
This layer creates a convolution kernel that is convolved
(actually cross-correlated) with the layer input to produce a tensor of
outputs. It may also include a bias addition and activation function
on the outputs. It assumes the kernel
and/or bias
are drawn from distributions.
Usage
layer_conv_3d_flipout(
object,
filters,
kernel_size,
strides = 1,
padding = "valid",
data_format = "channels_last",
dilation_rate = 1,
activation = NULL,
activity_regularizer = NULL,
trainable = TRUE,
kernel_posterior_fn = tfp$layers$util$default_mean_field_normal_fn(),
kernel_posterior_tensor_fn = function(d) d %>% tfd_sample(),
kernel_prior_fn = tfp$layers$util$default_multivariate_normal_fn,
kernel_divergence_fn = function(q, p, ignore) tfd_kl_divergence(q, p),
bias_posterior_fn = tfp$layers$util$default_mean_field_normal_fn(is_singular = TRUE),
bias_posterior_tensor_fn = function(d) d %>% tfd_sample(),
bias_prior_fn = NULL,
bias_divergence_fn = function(q, p, ignore) tfd_kl_divergence(q, p),
...
)
Arguments
object |
What to compose the new
|
filters |
Integer, the dimensionality of the output space (i.e. the number of filters in the convolution). |
kernel_size |
An integer or list of 3 integers, specifying the depth, height and width of the 3D convolution window. |
strides |
An integer or list of a single integer,
specifying the stride length of the convolution.
Specifying any stride value != 1 is incompatible with specifying
any |
padding |
One of |
data_format |
A string, one of |
dilation_rate |
An integer or tuple/list of a single integer, specifying
the dilation rate to use for dilated convolution.
Currently, specifying any |
activation |
Activation function. Set it to None to maintain a linear activation. |
activity_regularizer |
Regularizer function for the output. |
trainable |
Whether the layer weights will be updated during training. |
kernel_posterior_fn |
Function which creates |
kernel_posterior_tensor_fn |
Function which takes a |
kernel_prior_fn |
Function which creates |
kernel_divergence_fn |
Function which takes the surrogate posterior distribution, prior distribution and random variate
sample(s) from the surrogate posterior and computes or approximates the KL divergence. The
distributions are |
bias_posterior_fn |
Function which creates a |
bias_posterior_tensor_fn |
Function which takes a |
bias_prior_fn |
Function which creates |
bias_divergence_fn |
Function which takes the surrogate posterior distribution, prior distribution and random variate sample(s)
from the surrogate posterior and computes or approximates the KL divergence. The
distributions are |
... |
Additional keyword arguments passed to the |
Details
This layer implements the Bayesian variational inference analogue to
a dense layer by assuming the kernel
and/or the bias
are drawn
from distributions.
By default, the layer implements a stochastic forward pass via sampling from the kernel and bias posteriors,
outputs = f(inputs; kernel, bias), kernel, bias ~ posterior
where f denotes the layer's calculation. It uses the Flipout
estimator (Wen et al., 2018), which performs a Monte Carlo approximation
of the distribution integrating over the kernel
and bias
. Flipout uses
roughly twice as many floating point operations as the reparameterization
estimator but has the advantage of significantly lower variance.
The arguments permit separate specification of the surrogate posterior
(q(W|x)
), prior (p(W)
), and divergence for both the kernel
and bias
distributions.
Upon being built, this layer adds losses (accessible via the losses
property) representing the divergences of kernel
and/or bias
surrogate
posteriors and their respective priors. When doing minibatch stochastic
optimization, make sure to scale this loss such that it is applied just once
per epoch (e.g. if kl
is the sum of losses
for each element of the batch,
you should pass kl / num_examples_per_epoch
to your optimizer).
You can access the kernel
and/or bias
posterior and prior distributions
after the layer is built via the kernel_posterior
, kernel_prior
,
bias_posterior
and bias_prior
properties.
Value
a Keras layer
References
Wen, Y., Vicol, P., Ba, J., Tran, D., & Grosse, R. (2018). Flipout: Efficient Pseudo-Independent Weight Perturbations on Mini-Batches. In International Conference on Learning Representations.
See Also
Other layers: layer_autoregressive(), layer_conv_1d_flipout(), layer_conv_1d_reparameterization(), layer_conv_2d_flipout(), layer_conv_2d_reparameterization(), layer_conv_3d_reparameterization(), layer_dense_flipout(), layer_dense_local_reparameterization(), layer_dense_reparameterization(), layer_dense_variational(), layer_variable()
3D convolution layer (e.g. spatial convolution over volumes)
Description
This layer creates a convolution kernel that is convolved
(actually cross-correlated) with the layer input to produce a tensor of
outputs. It may also include a bias addition and activation function
on the outputs. It assumes the kernel
and/or bias
are drawn from distributions.
Usage
layer_conv_3d_reparameterization(
object,
filters,
kernel_size,
strides = 1,
padding = "valid",
data_format = "channels_last",
dilation_rate = 1,
activation = NULL,
activity_regularizer = NULL,
trainable = TRUE,
kernel_posterior_fn = tfp$layers$util$default_mean_field_normal_fn(),
kernel_posterior_tensor_fn = function(d) d %>% tfd_sample(),
kernel_prior_fn = tfp$layers$util$default_multivariate_normal_fn,
kernel_divergence_fn = function(q, p, ignore) tfd_kl_divergence(q, p),
bias_posterior_fn = tfp$layers$util$default_mean_field_normal_fn(is_singular = TRUE),
bias_posterior_tensor_fn = function(d) d %>% tfd_sample(),
bias_prior_fn = NULL,
bias_divergence_fn = function(q, p, ignore) tfd_kl_divergence(q, p),
...
)
Arguments
object |
What to compose the new
|
filters |
Integer, the dimensionality of the output space (i.e. the number of filters in the convolution). |
kernel_size |
An integer or list of 3 integers, specifying the depth, height and width of the 3D convolution window. |
strides |
An integer or list of a single integer,
specifying the stride length of the convolution.
Specifying any stride value != 1 is incompatible with specifying
any |
padding |
One of |
data_format |
A string, one of |
dilation_rate |
An integer or tuple/list of a single integer, specifying
the dilation rate to use for dilated convolution.
Currently, specifying any |
activation |
Activation function. Set it to None to maintain a linear activation. |
activity_regularizer |
Regularizer function for the output. |
trainable |
Whether the layer weights will be updated during training. |
kernel_posterior_fn |
Function which creates |
kernel_posterior_tensor_fn |
Function which takes a |
kernel_prior_fn |
Function which creates |
kernel_divergence_fn |
Function which takes the surrogate posterior distribution, prior distribution and random variate
sample(s) from the surrogate posterior and computes or approximates the KL divergence. The
distributions are |
bias_posterior_fn |
Function which creates a |
bias_posterior_tensor_fn |
Function which takes a |
bias_prior_fn |
Function which creates |
bias_divergence_fn |
Function which takes the surrogate posterior distribution, prior distribution and random variate sample(s)
from the surrogate posterior and computes or approximates the KL divergence. The
distributions are |
... |
Additional keyword arguments passed to the |
Details
This layer implements the Bayesian variational inference analogue to
a dense layer by assuming the kernel
and/or the bias
are drawn
from distributions.
By default, the layer implements a stochastic forward pass via sampling from the kernel and bias posteriors,
outputs = f(inputs; kernel, bias), kernel, bias ~ posterior
where f denotes the layer's calculation. It uses the reparameterization
estimator (Kingma and Welling, 2014), which performs a Monte Carlo
approximation of the distribution integrating over the kernel
and bias
.
The arguments permit separate specification of the surrogate posterior
(q(W|x)
), prior (p(W)
), and divergence for both the kernel
and bias
distributions.
Upon being built, this layer adds losses (accessible via the losses
property) representing the divergences of kernel
and/or bias
surrogate
posteriors and their respective priors. When doing minibatch stochastic
optimization, make sure to scale this loss such that it is applied just once
per epoch (e.g. if kl
is the sum of losses
for each element of the batch,
you should pass kl / num_examples_per_epoch
to your optimizer).
You can access the kernel
and/or bias
posterior and prior distributions
after the layer is built via the kernel_posterior
, kernel_prior
,
bias_posterior
and bias_prior
properties.
Value
a Keras layer
References
Kingma, D. P., & Welling, M. (2014). Auto-Encoding Variational Bayes. In International Conference on Learning Representations.
See Also
Other layers: layer_autoregressive(), layer_conv_1d_flipout(), layer_conv_1d_reparameterization(), layer_conv_2d_flipout(), layer_conv_2d_reparameterization(), layer_conv_3d_flipout(), layer_dense_flipout(), layer_dense_local_reparameterization(), layer_dense_reparameterization(), layer_dense_variational(), layer_variable()
Densely-connected layer class with Flipout estimator.
Description
This layer implements the Bayesian variational inference analogue to
a dense layer by assuming the kernel
and/or the bias
are drawn
from distributions.
Usage
layer_dense_flipout(
object,
units,
activation = NULL,
activity_regularizer = NULL,
trainable = TRUE,
kernel_posterior_fn = tfp$layers$util$default_mean_field_normal_fn(),
kernel_posterior_tensor_fn = function(d) d %>% tfd_sample(),
kernel_prior_fn = tfp$layers$util$default_multivariate_normal_fn,
kernel_divergence_fn = function(q, p, ignore) tfd_kl_divergence(q, p),
bias_posterior_fn = tfp$layers$util$default_mean_field_normal_fn(is_singular = TRUE),
bias_posterior_tensor_fn = function(d) d %>% tfd_sample(),
bias_prior_fn = NULL,
bias_divergence_fn = function(q, p, ignore) tfd_kl_divergence(q, p),
seed = NULL,
...
)
Arguments
object |
What to compose the new
|
units |
integer dimensionality of the output space |
activation |
Activation function. Set it to None to maintain a linear activation. |
activity_regularizer |
Regularizer function for the output. |
trainable |
Whether the layer weights will be updated during training. |
kernel_posterior_fn |
Function which creates |
kernel_posterior_tensor_fn |
Function which takes a |
kernel_prior_fn |
Function which creates |
kernel_divergence_fn |
Function which takes the surrogate posterior distribution, prior distribution and random variate
sample(s) from the surrogate posterior and computes or approximates the KL divergence. The
distributions are |
bias_posterior_fn |
Function which creates a |
bias_posterior_tensor_fn |
Function which takes a |
bias_prior_fn |
Function which creates |
bias_divergence_fn |
Function which takes the surrogate posterior distribution, prior distribution and random variate sample(s)
from the surrogate posterior and computes or approximates the KL divergence. The
distributions are |
seed |
scalar |
... |
Additional keyword arguments passed to the |
Details
By default, the layer implements a stochastic forward pass via sampling from the kernel and bias posteriors,
kernel, bias ~ posterior outputs = activation(matmul(inputs, kernel) + bias)
It uses the Flipout estimator (Wen et al., 2018), which performs a Monte
Carlo approximation of the distribution integrating over the kernel
and
bias
. Flipout uses roughly twice as many floating point operations as the
reparameterization estimator but has the advantage of significantly lower
variance.
The arguments permit separate specification of the surrogate posterior
(q(W|x)
), prior (p(W)
), and divergence for both the kernel
and bias
distributions.
Upon being built, this layer adds losses (accessible via the losses
property) representing the divergences of kernel
and/or bias
surrogate
posteriors and their respective priors. When doing minibatch stochastic
optimization, make sure to scale this loss such that it is applied just once
per epoch (e.g. if kl
is the sum of losses
for each element of the batch,
you should pass kl / num_examples_per_epoch
to your optimizer).
Value
a Keras layer
References
Wen, Y., Vicol, P., Ba, J., Tran, D., & Grosse, R. (2018). Flipout: Efficient Pseudo-Independent Weight Perturbations on Mini-Batches. In International Conference on Learning Representations.
See Also
Other layers: layer_autoregressive(), layer_conv_1d_flipout(), layer_conv_1d_reparameterization(), layer_conv_2d_flipout(), layer_conv_2d_reparameterization(), layer_conv_3d_flipout(), layer_conv_3d_reparameterization(), layer_dense_local_reparameterization(), layer_dense_reparameterization(), layer_dense_variational(), layer_variable()
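Examples
A hedged sketch of a small Bayesian classifier, assuming n = 60000 training
examples for the KL scaling discussed in Details.

library(keras)
library(tfprobability)

n <- 60000
scaled_kl <- function(q, p, ignore) tfd_kl_divergence(q, p) / n
model <- keras_model_sequential() %>%
  layer_dense_flipout(units = 64, activation = "relu",
                      kernel_divergence_fn = scaled_kl,
                      input_shape = 784) %>%
  layer_dense_flipout(units = 10, activation = "softmax",
                      kernel_divergence_fn = scaled_kl)
model %>% compile(optimizer = "adam",
                  loss = "sparse_categorical_crossentropy")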
Densely-connected layer class with local reparameterization estimator.
Description
This layer implements the Bayesian variational inference analogue to
a dense layer by assuming the kernel
and/or the bias
are drawn
from distributions.
Usage
layer_dense_local_reparameterization(
object,
units,
activation = NULL,
activity_regularizer = NULL,
trainable = TRUE,
kernel_posterior_fn = tfp$layers$util$default_mean_field_normal_fn(),
kernel_posterior_tensor_fn = function(d) d %>% tfd_sample(),
kernel_prior_fn = tfp$layers$util$default_multivariate_normal_fn,
kernel_divergence_fn = function(q, p, ignore) tfd_kl_divergence(q, p),
bias_posterior_fn = tfp$layers$util$default_mean_field_normal_fn(is_singular = TRUE),
bias_posterior_tensor_fn = function(d) d %>% tfd_sample(),
bias_prior_fn = NULL,
bias_divergence_fn = function(q, p, ignore) tfd_kl_divergence(q, p),
...
)
Arguments
object |
What to compose the new
|
units |
integer dimensionality of the output space |
activation |
Activation function. Set it to None to maintain a linear activation. |
activity_regularizer |
Regularizer function for the output. |
trainable |
Whether the layer weights will be updated during training. |
kernel_posterior_fn |
Function which creates |
kernel_posterior_tensor_fn |
Function which takes a |
kernel_prior_fn |
Function which creates |
kernel_divergence_fn |
Function which takes the surrogate posterior distribution, prior distribution and random variate
sample(s) from the surrogate posterior and computes or approximates the KL divergence. The
distributions are |
bias_posterior_fn |
Function which creates a |
bias_posterior_tensor_fn |
Function which takes a |
bias_prior_fn |
Function which creates |
bias_divergence_fn |
Function which takes the surrogate posterior distribution, prior distribution and random variate sample(s)
from the surrogate posterior and computes or approximates the KL divergence. The
distributions are |
... |
Additional keyword arguments passed to the |
Details
By default, the layer implements a stochastic forward pass via sampling from the kernel and bias posteriors,
kernel, bias ~ posterior outputs = activation(matmul(inputs, kernel) + bias)
It uses the local reparameterization estimator (Kingma et al., 2015),
which performs a Monte Carlo approximation of the distribution on the hidden
units induced by the kernel
and bias
. The default kernel_posterior_fn
is a normal distribution which factorizes across all elements of the weight
matrix and bias vector. Unlike that paper's multiplicative parameterization, this
distribution has trainable location and scale parameters, which is known as
an additive noise parameterization (Molchanov et al., 2017).
The arguments permit separate specification of the surrogate posterior
(q(W|x)
), prior (p(W)
), and divergence for both the kernel
and bias
distributions.
Upon being built, this layer adds losses (accessible via the losses
property) representing the divergences of kernel
and/or bias
surrogate
posteriors and their respective priors. When doing minibatch stochastic
optimization, make sure to scale this loss such that it is applied just once
per epoch (e.g. if kl
is the sum of losses
for each element of the batch,
you should pass kl / num_examples_per_epoch
to your optimizer).
You can access the kernel
and/or bias
posterior and prior distributions
after the layer is built via the kernel_posterior
, kernel_prior
,
bias_posterior
and bias_prior
properties.
Value
a Keras layer
References
Kingma, D. P., Salimans, T., & Welling, M. (2015). Variational Dropout and the Local Reparameterization Trick. In Advances in Neural Information Processing Systems.
Molchanov, D., Ashukha, A., & Vetrov, D. (2017). Variational Dropout Sparsifies Deep Neural Networks. In International Conference on Machine Learning.
See Also
Other layers: layer_autoregressive(), layer_conv_1d_flipout(), layer_conv_1d_reparameterization(), layer_conv_2d_flipout(), layer_conv_2d_reparameterization(), layer_conv_3d_flipout(), layer_conv_3d_reparameterization(), layer_dense_flipout(), layer_dense_reparameterization(), layer_dense_variational(), layer_variable()
Densely-connected layer class with reparameterization estimator.
Description
This layer implements the Bayesian variational inference analogue to
a dense layer by assuming the kernel
and/or the bias
are drawn
from distributions.
Usage
layer_dense_reparameterization(
object,
units,
activation = NULL,
activity_regularizer = NULL,
trainable = TRUE,
kernel_posterior_fn = tfp$layers$util$default_mean_field_normal_fn(),
kernel_posterior_tensor_fn = function(d) d %>% tfd_sample(),
kernel_prior_fn = tfp$layers$util$default_multivariate_normal_fn,
kernel_divergence_fn = function(q, p, ignore) tfd_kl_divergence(q, p),
bias_posterior_fn = tfp$layers$util$default_mean_field_normal_fn(is_singular = TRUE),
bias_posterior_tensor_fn = function(d) d %>% tfd_sample(),
bias_prior_fn = NULL,
bias_divergence_fn = function(q, p, ignore) tfd_kl_divergence(q, p),
...
)
Arguments
object |
What to compose the new
|
units |
integer dimensionality of the output space |
activation |
Activation function. Set it to None to maintain a linear activation. |
activity_regularizer |
Regularizer function for the output. |
trainable |
Whether the layer weights will be updated during training. |
kernel_posterior_fn |
Function which creates |
kernel_posterior_tensor_fn |
Function which takes a |
kernel_prior_fn |
Function which creates |
kernel_divergence_fn |
Function which takes the surrogate posterior distribution, prior distribution and random variate
sample(s) from the surrogate posterior and computes or approximates the KL divergence. The
distributions are |
bias_posterior_fn |
Function which creates a |
bias_posterior_tensor_fn |
Function which takes a |
bias_prior_fn |
Function which creates |
bias_divergence_fn |
Function which takes the surrogate posterior distribution, prior distribution and random variate sample(s)
from the surrogate posterior and computes or approximates the KL divergence. The
distributions are |
... |
Additional keyword arguments passed to the |
Details
By default, the layer implements a stochastic forward pass via sampling from the kernel and bias posteriors,
kernel, bias ~ posterior outputs = activation(matmul(inputs, kernel) + bias)
It uses the reparameterization estimator (Kingma and Welling, 2014)
which performs a Monte Carlo approximation of the distribution integrating
over the kernel
and bias
.
The arguments permit separate specification of the surrogate posterior
(q(W|x)
), prior (p(W)
), and divergence for both the kernel
and bias
distributions.
Upon being built, this layer adds losses (accessible via the losses
property) representing the divergences of kernel
and/or bias
surrogate
posteriors and their respective priors. When doing minibatch stochastic
optimization, make sure to scale this loss such that it is applied just once
per epoch (e.g. if kl
is the sum of losses
for each element of the batch,
you should pass kl / num_examples_per_epoch
to your optimizer).
You can access the kernel
and/or bias
posterior and prior distributions
after the layer is built via the kernel_posterior
, kernel_prior
,
bias_posterior
and bias_prior
properties.
Value
a Keras layer
References
Kingma, D. P., & Welling, M. (2014). Auto-Encoding Variational Bayes. In International Conference on Learning Representations.
See Also
Other layers: layer_autoregressive(), layer_conv_1d_flipout(), layer_conv_1d_reparameterization(), layer_conv_2d_flipout(), layer_conv_2d_reparameterization(), layer_conv_3d_flipout(), layer_conv_3d_reparameterization(), layer_dense_flipout(), layer_dense_local_reparameterization(), layer_dense_variational(), layer_variable()
Dense Variational Layer
Description
This layer uses variational inference to fit a "surrogate" posterior to the
distribution over both the kernel matrix and the bias terms, which are
otherwise used in a manner similar to layer_dense(). This layer fits the
"weights posterior" according to the following generative process:

[K, b] ~ Prior()
M = matmul(X, K) + b
Y ~ Likelihood(M)
Usage
layer_dense_variational(
object,
units,
make_posterior_fn,
make_prior_fn,
kl_weight = NULL,
kl_use_exact = FALSE,
activation = NULL,
use_bias = TRUE,
...
)
Arguments
object |
What to compose the new
|
units |
Positive integer, dimensionality of the output space. |
make_posterior_fn |
function taking tf$size(kernel), tf$size(bias), dtype and returning another callable which takes an input and produces a tfd$Distribution instance (the surrogate posterior). |
make_prior_fn |
function taking tf$size(kernel), tf$size(bias), dtype and returning another callable which takes an input and produces a tfd$Distribution instance (the prior). |
kl_weight |
Amount by which to scale the KL divergence loss between prior and posterior. |
kl_use_exact |
Logical indicating that the analytical KL divergence should be used rather than a Monte Carlo approximation. |
activation |
An activation function. See |
use_bias |
Whether or not the dense layers constructed in this layer
should have a bias term. See |
... |
Additional keyword arguments passed to the |
Value
a Keras layer
See Also
Other layers: layer_autoregressive(), layer_conv_1d_flipout(), layer_conv_1d_reparameterization(), layer_conv_2d_flipout(), layer_conv_2d_reparameterization(), layer_conv_3d_flipout(), layer_conv_3d_reparameterization(), layer_dense_flipout(), layer_dense_local_reparameterization(), layer_dense_reparameterization(), layer_variable()
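Examples
A hedged sketch of the usual pairing (a mean-field surrogate posterior and a
trainable normal prior), adapted from the TensorFlow Probability
documentation; the helpers posterior_mean_field and prior_trainable are
illustrative, not part of the package.

library(keras)
library(tensorflow)
library(tfprobability)

posterior_mean_field <- function(kernel_size, bias_size = 0L, dtype = NULL) {
  m <- kernel_size + bias_size
  cst <- log(expm1(1))  # softplus inverse of 1
  keras_model_sequential(list(
    layer_variable(shape = 2 * m, dtype = dtype),
    layer_distribution_lambda(make_distribution_fn = function(t)
      tfd_independent(
        tfd_normal(loc = t[1:m],
                   scale = 1e-5 + tf$nn$softplus(cst + t[(m + 1):(2 * m)])),
        reinterpreted_batch_ndims = 1L))
  ))
}

prior_trainable <- function(kernel_size, bias_size = 0L, dtype = NULL) {
  m <- kernel_size + bias_size
  keras_model_sequential(list(
    layer_variable(shape = m, dtype = dtype),
    layer_distribution_lambda(make_distribution_fn = function(t)
      tfd_independent(tfd_normal(loc = t, scale = 1),
                      reinterpreted_batch_ndims = 1L))
  ))
}

n <- 1000  # assumed number of training examples
layer <- layer_dense_variational(units = 1,
                                 make_posterior_fn = posterior_mean_field,
                                 make_prior_fn = prior_trainable,
                                 kl_weight = 1 / n)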
Keras layer enabling plumbing TFP distributions through Keras models
Description
Keras layer enabling plumbing TFP distributions through Keras models
Usage
layer_distribution_lambda(
object,
make_distribution_fn,
convert_to_tensor_fn = tfp$distributions$Distribution$sample,
...
)
Arguments
object |
What to compose the new
|
make_distribution_fn |
A callable that takes previous layer outputs and returns a tfd$Distribution instance. |
convert_to_tensor_fn |
A callable that takes a tfd$Distribution instance and returns a
tf$Tensor-like object. Default value: tfp$distributions$Distribution$sample. |
... |
Additional arguments passed to |
Value
a Keras layer
See Also
For an example of how to use this layer in a Keras model, see layer_independent_normal().
Other distribution_layers: layer_categorical_mixture_of_one_hot_categorical(), layer_independent_bernoulli(), layer_independent_logistic(), layer_independent_normal(), layer_independent_poisson(), layer_kl_divergence_add_loss(), layer_kl_divergence_regularizer(), layer_mixture_logistic(), layer_mixture_normal(), layer_mixture_same_family(), layer_multivariate_normal_tri_l(), layer_one_hot_categorical()
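Examples
A hedged sketch of the common pattern: end the model with a distribution and
train by maximizing its log-probability (here, a Poisson regression).

library(keras)
library(tfprobability)

model <- keras_model_sequential() %>%
  layer_dense(units = 1, input_shape = 2L) %>%
  layer_distribution_lambda(make_distribution_fn = function(t)
    tfd_poisson(log_rate = t))

negloglik <- function(y, rv_y) -(rv_y %>% tfd_log_prob(y))
model %>% compile(optimizer = "adam", loss = negloglik)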
An Independent-Bernoulli Keras layer from prod(event_shape) params
Description
An Independent-Bernoulli Keras layer from prod(event_shape) params
Usage
layer_independent_bernoulli(
object,
event_shape,
convert_to_tensor_fn = tfp$distributions$Distribution$sample,
sample_dtype = NULL,
validate_args = FALSE,
...
)
Arguments
object |
What to compose the new
|
event_shape |
Scalar integer representing the size of single draw from this distribution. |
convert_to_tensor_fn |
A callable that takes a tfd$Distribution instance and returns a
tf$Tensor-like object. Default value: tfp$distributions$Distribution$sample. |
sample_dtype |
dtype of samples produced by this distribution. Default value: NULL (i.e., previous layer's dtype). |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked
for validity despite possibly degrading runtime performance. When FALSE invalid inputs may
silently render incorrect outputs. Default value: FALSE. |
... |
Additional arguments passed to |
Value
a Keras layer
See Also
For an example of how to use this layer in a Keras model, see layer_independent_normal().
Other distribution_layers: layer_categorical_mixture_of_one_hot_categorical(), layer_distribution_lambda(), layer_independent_logistic(), layer_independent_normal(), layer_independent_poisson(), layer_kl_divergence_add_loss(), layer_kl_divergence_regularizer(), layer_mixture_logistic(), layer_mixture_normal(), layer_mixture_same_family(), layer_multivariate_normal_tri_l(), layer_one_hot_categorical()
An independent Logistic Keras layer.
Description
An independent Logistic Keras layer.
Usage
layer_independent_logistic(
object,
event_shape,
convert_to_tensor_fn = tfp$distributions$Distribution$sample,
validate_args = FALSE,
...
)
Arguments
object |
What to compose the new
|
event_shape |
Scalar integer representing the size of single draw from this distribution. |
convert_to_tensor_fn |
A callable that takes a tfd$Distribution instance and returns a
tf$Tensor-like object. Default value: tfp$distributions$Distribution$sample. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked
for validity despite possibly degrading runtime performance. When FALSE invalid inputs may
silently render incorrect outputs. Default value: FALSE. |
... |
Additional arguments passed to |
Value
a Keras layer
See Also
For an example of how to use this layer in a Keras model, see layer_independent_normal().
Other distribution_layers: layer_categorical_mixture_of_one_hot_categorical(), layer_distribution_lambda(), layer_independent_bernoulli(), layer_independent_normal(), layer_independent_poisson(), layer_kl_divergence_add_loss(), layer_kl_divergence_regularizer(), layer_mixture_logistic(), layer_mixture_normal(), layer_mixture_same_family(), layer_multivariate_normal_tri_l(), layer_one_hot_categorical()
An independent Normal Keras layer.
Description
An independent Normal Keras layer.
Usage
layer_independent_normal(
object,
event_shape,
convert_to_tensor_fn = tfp$distributions$Distribution$sample,
validate_args = FALSE,
...
)
Arguments
object |
What to compose the new
|
event_shape |
Scalar integer representing the size of a single draw from this distribution. |
convert_to_tensor_fn |
A callable that takes a tfd$Distribution instance and returns a
tf$Tensor-like object. Default value: tfp$distributions$Distribution$sample. |
validate_args |
Logical, default FALSE. When TRUE, distribution parameters are checked
for validity despite possibly degrading runtime performance. When FALSE, invalid inputs may
silently render incorrect outputs. |
... |
Additional arguments passed to |
Value
a Keras layer
See Also
Other distribution_layers:
layer_categorical_mixture_of_one_hot_categorical()
,
layer_distribution_lambda()
,
layer_independent_bernoulli()
,
layer_independent_logistic()
,
layer_independent_poisson()
,
layer_kl_divergence_add_loss()
,
layer_kl_divergence_regularizer()
,
layer_mixture_logistic()
,
layer_mixture_normal()
,
layer_mixture_same_family()
,
layer_multivariate_normal_tri_l()
,
layer_one_hot_categorical()
Examples
library(keras)
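# flatten a (28, 28, 1) input, pass it through a small dense bottleneck,
# then parameterize an independent bivariate normal output distribution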
input_shape <- c(28, 28, 1)
encoded_shape <- 2
n <- 2
model <- keras_model_sequential(
list(
layer_input(shape = input_shape),
layer_flatten(),
layer_dense(units = n),
layer_dense(units = params_size_independent_normal(encoded_shape)),
layer_independent_normal(event_shape = encoded_shape)
)
)
An independent Poisson Keras layer.
Description
An independent Poisson Keras layer.
Usage
layer_independent_poisson(
object,
event_shape,
convert_to_tensor_fn = tfp$distributions$Distribution$sample,
validate_args = FALSE,
...
)
Arguments
object |
What to compose the new
|
event_shape |
Scalar integer representing the size of a single draw from this distribution. |
convert_to_tensor_fn |
A callable that takes a tfd$Distribution instance and returns a
tf$Tensor-like object. Default value: tfp$distributions$Distribution$sample. |
validate_args |
Logical, default FALSE. When TRUE, distribution parameters are checked
for validity despite possibly degrading runtime performance. When FALSE, invalid inputs may
silently render incorrect outputs. |
... |
Additional arguments passed to |
Value
a Keras layer
See Also
For an example of how to use it in a Keras model, see layer_independent_normal().
Other distribution_layers:
layer_categorical_mixture_of_one_hot_categorical()
,
layer_distribution_lambda()
,
layer_independent_bernoulli()
,
layer_independent_logistic()
,
layer_independent_normal()
,
layer_kl_divergence_add_loss()
,
layer_kl_divergence_regularizer()
,
layer_mixture_logistic()
,
layer_mixture_normal()
,
layer_mixture_same_family()
,
layer_multivariate_normal_tri_l()
,
layer_one_hot_categorical()
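A minimal sketch of a count-regression head (the input size, hidden width, and event shape are illustrative assumptions):
library(keras)
d <- 4
model <- keras_model_sequential(
  list(
    layer_dense(units = 16, activation = "relu", input_shape = 10),
    layer_dense(units = params_size_independent_poisson(d)),
    layer_independent_poisson(event_shape = d)
  )
)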
Pass-through layer that adds a KL divergence penalty to the model loss
Description
Pass-through layer that adds a KL divergence penalty to the model loss
Usage
layer_kl_divergence_add_loss(
object,
distribution_b,
use_exact_kl = FALSE,
test_points_reduce_axis = NULL,
test_points_fn = tf$convert_to_tensor,
weight = NULL,
...
)
Arguments
object |
What to compose the new
|
distribution_b |
Distribution instance corresponding to b as in KL(a, b); the previous layer's output is presumed to be the Distribution instance a. |
use_exact_kl |
Logical indicating if KL divergence should be
calculated exactly via |
test_points_reduce_axis |
Integer vector or scalar representing dimensions over which to reduce_mean while calculating the Monte Carlo approximation of the KL divergence. As with all tf$reduce_* ops, NULL means reduce over all dimensions; () means reduce over none of them. Default value: () (i.e., no reduction). |
test_points_fn |
A callable taking a |
weight |
Multiplier applied to the calculated KL divergence for each Keras batch member. Default value: NULL (i.e., do not weight each batch member). |
... |
Additional arguments passed to |
Value
a Keras layer
See Also
For an example of how to use it in a Keras model, see layer_independent_normal().
Other distribution_layers:
layer_categorical_mixture_of_one_hot_categorical()
,
layer_distribution_lambda()
,
layer_independent_bernoulli()
,
layer_independent_logistic()
,
layer_independent_normal()
,
layer_independent_poisson()
,
layer_kl_divergence_regularizer()
,
layer_mixture_logistic()
,
layer_mixture_normal()
,
layer_mixture_same_family()
,
layer_multivariate_normal_tri_l()
,
layer_one_hot_categorical()
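A minimal sketch of a VAE-style encoder (the input size, latent size, and weight are illustrative assumptions); the pass-through layer returns the posterior distribution unchanged while adding KL(posterior || prior) to the model loss:
library(keras)
encoded_size <- 2
# standard-normal prior over the latent code
prior <- tfd_independent(
  tfd_normal(loc = rep(0, encoded_size), scale = 1),
  reinterpreted_batch_ndims = 1
)
encoder <- keras_model_sequential(
  list(
    layer_dense(units = params_size_multivariate_normal_tri_l(encoded_size),
                input_shape = 10),
    layer_multivariate_normal_tri_l(event_size = encoded_size),
    layer_kl_divergence_add_loss(distribution_b = prior, weight = 1)
  )
)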
Regularizer that adds a KL divergence penalty to the model loss
Description
When using the Monte Carlo approximation (e.g., use_exact_kl = FALSE
), it is presumed that the input
distribution's concretization (i.e., tf$convert_to_tensor(distribution)
) corresponds to a random
sample. To override this behavior, set test_points_fn.
Usage
layer_kl_divergence_regularizer(
object,
distribution_b,
use_exact_kl = FALSE,
test_points_reduce_axis = NULL,
test_points_fn = tf$convert_to_tensor,
weight = NULL,
...
)
Arguments
object |
What to compose the new
|
distribution_b |
Distribution instance corresponding to b as in KL(a, b); the previous layer's output is presumed to be the Distribution instance a. |
use_exact_kl |
Logical indicating if KL divergence should be
calculated exactly via |
test_points_reduce_axis |
Integer vector or scalar representing dimensions over which to reduce_mean while calculating the Monte Carlo approximation of the KL divergence. As with all tf$reduce_* ops, NULL means reduce over all dimensions; () means reduce over none of them. Default value: () (i.e., no reduction). |
test_points_fn |
A callable taking a |
weight |
Multiplier applied to the calculated KL divergence for each Keras batch member. Default value: NULL (i.e., do not weight each batch member). |
... |
Additional arguments passed to |
Value
a Keras layer
See Also
For an example of how to use it in a Keras model, see layer_independent_normal().
Other distribution_layers:
layer_categorical_mixture_of_one_hot_categorical()
,
layer_distribution_lambda()
,
layer_independent_bernoulli()
,
layer_independent_logistic()
,
layer_independent_normal()
,
layer_independent_poisson()
,
layer_kl_divergence_add_loss()
,
layer_mixture_logistic()
,
layer_mixture_normal()
,
layer_mixture_same_family()
,
layer_multivariate_normal_tri_l()
,
layer_one_hot_categorical()
A mixture distribution Keras layer, with independent logistic components.
Description
A mixture distribution Keras layer, with independent logistic components.
Usage
layer_mixture_logistic(
object,
num_components,
event_shape = list(),
convert_to_tensor_fn = tfp$distributions$Distribution$sample,
validate_args = FALSE,
...
)
Arguments
object |
What to compose the new
|
num_components |
Number of component distributions in the mixture distribution. |
event_shape |
integer vector |
convert_to_tensor_fn |
A callable that takes a tfd$Distribution instance and returns a
tf$Tensor-like object. Default value: tfp$distributions$Distribution$sample. |
validate_args |
Logical, default FALSE. When TRUE, distribution parameters are checked
for validity despite possibly degrading runtime performance. When FALSE, invalid inputs may
silently render incorrect outputs. |
... |
Additional arguments passed to |
Value
a Keras layer
See Also
For an example of how to use it in a Keras model, see layer_independent_normal().
Other distribution_layers:
layer_categorical_mixture_of_one_hot_categorical()
,
layer_distribution_lambda()
,
layer_independent_bernoulli()
,
layer_independent_logistic()
,
layer_independent_normal()
,
layer_independent_poisson()
,
layer_kl_divergence_add_loss()
,
layer_kl_divergence_regularizer()
,
layer_mixture_normal()
,
layer_mixture_same_family()
,
layer_multivariate_normal_tri_l()
,
layer_one_hot_categorical()
A mixture distribution Keras layer, with independent normal components.
Description
A mixture distribution Keras layer, with independent normal components.
Usage
layer_mixture_normal(
object,
num_components,
event_shape = list(),
convert_to_tensor_fn = tfp$distributions$Distribution$sample,
validate_args = FALSE,
...
)
Arguments
object |
What to compose the new
|
num_components |
Number of component distributions in the mixture distribution. |
event_shape |
integer vector |
convert_to_tensor_fn |
A callable that takes a tfd$Distribution instance and returns a
tf$Tensor-like object. Default value: tfp$distributions$Distribution$sample. |
validate_args |
Logical, default FALSE. When TRUE, distribution parameters are checked
for validity despite possibly degrading runtime performance. When FALSE, invalid inputs may
silently render incorrect outputs. |
... |
Additional arguments passed to |
Value
a Keras layer
See Also
For an example of how to use it in a Keras model, see layer_independent_normal().
Other distribution_layers:
layer_categorical_mixture_of_one_hot_categorical()
,
layer_distribution_lambda()
,
layer_independent_bernoulli()
,
layer_independent_logistic()
,
layer_independent_normal()
,
layer_independent_poisson()
,
layer_kl_divergence_add_loss()
,
layer_kl_divergence_regularizer()
,
layer_mixture_logistic()
,
layer_mixture_same_family()
,
layer_multivariate_normal_tri_l()
,
layer_one_hot_categorical()
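A minimal sketch of a mixture density network head (the component count, event shape, and input size are illustrative assumptions):
library(keras)
num_components <- 3
event_shape <- 1
model <- keras_model_sequential(
  list(
    layer_dense(units = 32, activation = "relu", input_shape = 5),
    layer_dense(units = params_size_mixture_normal(num_components, event_shape)),
    layer_mixture_normal(num_components = num_components, event_shape = event_shape)
  )
)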
A mixture (same-family) Keras layer.
Description
A mixture (same-family) Keras layer.
Usage
layer_mixture_same_family(
object,
num_components,
component_layer,
convert_to_tensor_fn = tfp$distributions$Distribution$sample,
validate_args = FALSE,
...
)
Arguments
object |
What to compose the new
|
num_components |
Number of component distributions in the mixture distribution. |
component_layer |
Function that, given a tensor of shape
|
convert_to_tensor_fn |
A callable that takes a tfd$Distribution instance and returns a
tf$Tensor-like object. Default value: tfp$distributions$Distribution$sample. |
validate_args |
Logical, default FALSE. When TRUE, distribution parameters are checked
for validity despite possibly degrading runtime performance. When FALSE, invalid inputs may
silently render incorrect outputs. |
... |
Additional arguments passed to |
Value
a Keras layer
See Also
For an example of how to use it in a Keras model, see layer_independent_normal().
Other distribution_layers:
layer_categorical_mixture_of_one_hot_categorical()
,
layer_distribution_lambda()
,
layer_independent_bernoulli()
,
layer_independent_logistic()
,
layer_independent_normal()
,
layer_independent_poisson()
,
layer_kl_divergence_add_loss()
,
layer_kl_divergence_regularizer()
,
layer_mixture_logistic()
,
layer_mixture_normal()
,
layer_multivariate_normal_tri_l()
,
layer_one_hot_categorical()
A d-variate Multivariate Normal TriL Keras layer from d + d * (d + 1) / 2 params
Description
A d-variate Multivariate Normal TriL Keras layer from d + d * (d + 1) / 2 params
Usage
layer_multivariate_normal_tri_l(
object,
event_size,
convert_to_tensor_fn = tfp$distributions$Distribution$sample,
validate_args = FALSE,
...
)
Arguments
object |
What to compose the new
|
event_size |
Integer vector tensor representing the shape of a single draw from this distribution. |
convert_to_tensor_fn |
A callable that takes a tfd$Distribution instance and returns a
tf$Tensor-like object. Default value: tfp$distributions$Distribution$sample. |
validate_args |
Logical, default FALSE. When TRUE, distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE, invalid inputs may silently render incorrect outputs. |
... |
Additional arguments passed to |
Value
a Keras layer
See Also
For an example of how to use it in a Keras model, see layer_independent_normal().
Other distribution_layers:
layer_categorical_mixture_of_one_hot_categorical()
,
layer_distribution_lambda()
,
layer_independent_bernoulli()
,
layer_independent_logistic()
,
layer_independent_normal()
,
layer_independent_poisson()
,
layer_kl_divergence_add_loss()
,
layer_kl_divergence_regularizer()
,
layer_mixture_logistic()
,
layer_mixture_normal()
,
layer_mixture_same_family()
,
layer_one_hot_categorical()
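A minimal sketch (event size 3 and input size 8 are illustrative assumptions); the dense layer supplies the d + d * (d + 1) / 2 parameters the distribution layer consumes:
library(keras)
d <- 3
model <- keras_model_sequential(
  list(
    layer_dense(units = params_size_multivariate_normal_tri_l(d), input_shape = 8),
    layer_multivariate_normal_tri_l(event_size = d)
  )
)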
A d-variate OneHotCategorical Keras layer from d params.
Description
Typical choices for convert_to_tensor_fn include:
- tfp$distributions$Distribution$sample
- tfp$distributions$Distribution$mean
- tfp$distributions$Distribution$mode
- tfp$distributions$OneHotCategorical$logits
Usage
layer_one_hot_categorical(
object,
event_size,
convert_to_tensor_fn = tfp$distributions$Distribution$sample,
sample_dtype = NULL,
validate_args = FALSE,
...
)
Arguments
object |
What to compose the new
|
event_size |
Scalar |
convert_to_tensor_fn |
A callable that takes a tfd$Distribution instance and returns a
tf$Tensor-like object. Default value: tfp$distributions$Distribution$sample. |
sample_dtype |
|
validate_args |
Logical, default FALSE. When TRUE, distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE, invalid inputs may silently render incorrect outputs. |
... |
Additional arguments passed to |
Value
a Keras layer
See Also
For an example of how to use it in a Keras model, see layer_independent_normal().
Other distribution_layers:
layer_categorical_mixture_of_one_hot_categorical()
,
layer_distribution_lambda()
,
layer_independent_bernoulli()
,
layer_independent_logistic()
,
layer_independent_normal()
,
layer_independent_poisson()
,
layer_kl_divergence_add_loss()
,
layer_kl_divergence_regularizer()
,
layer_mixture_logistic()
,
layer_mixture_normal()
,
layer_mixture_same_family()
,
layer_multivariate_normal_tri_l()
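A minimal sketch of a 10-class classification head (the input size is an illustrative assumption):
library(keras)
k <- 10
model <- keras_model_sequential(
  list(
    layer_dense(units = params_size_one_hot_categorical(k), input_shape = 784),
    layer_one_hot_categorical(event_size = k)
  )
)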
Variable Layer
Description
Simply returns a (trainable) variable, regardless of input.
This layer implements the mathematical function f(x) = c, where c is a
constant, i.e., unchanged for all x. Like other Keras layers, the constant
is trainable. This layer can also be interpreted as the special case of
layer_dense() when the kernel is forced to be the zero matrix (tf$zeros).
Usage
layer_variable(
object,
shape,
dtype = NULL,
activation = NULL,
initializer = "zeros",
regularizer = NULL,
constraint = NULL,
...
)
Arguments
object |
What to compose the new
|
shape |
integer or integer vector specifying the shape of the output of this layer. |
dtype |
TensorFlow |
activation |
An activation function. See |
initializer |
Initializer for the |
regularizer |
Regularizer function applied to the |
constraint |
Constraint function applied to the |
... |
Additional keyword arguments passed to the |
Value
a Keras layer
See Also
Other layers:
layer_autoregressive()
,
layer_conv_1d_flipout()
,
layer_conv_1d_reparameterization()
,
layer_conv_2d_flipout()
,
layer_conv_2d_reparameterization()
,
layer_conv_3d_flipout()
,
layer_conv_3d_reparameterization()
,
layer_dense_flipout()
,
layer_dense_local_reparameterization()
,
layer_dense_reparameterization()
,
layer_dense_variational()
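A minimal sketch (the shape and initializer are illustrative assumptions): whatever input the model receives, its output is the same trainable length-2 vector:
library(keras)
model <- keras_model_sequential(
  list(
    layer_variable(shape = 2, initializer = "zeros")
  )
)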
A Variational Gaussian Process Layer.
Description
Create a Variational Gaussian Process distribution whose index_points are
the inputs to the layer. It is parameterized by the number of inducing points
and by a kernel_provider, which should be a tf.keras.Layer with an @property
that late-binds variable parameters to a
tfp.positive_semidefinite_kernel.PositiveSemidefiniteKernel instance (this
requirement has to do with the way that variables must be created in a Keras
model). The mean_fn is an optional argument which, if omitted, will be
automatically configured to be a constant function with trainable variable
output.
Usage
layer_variational_gaussian_process(
object,
num_inducing_points,
kernel_provider,
event_shape = 1,
inducing_index_points_initializer = NULL,
unconstrained_observation_noise_variance_initializer = NULL,
mean_fn = NULL,
jitter = 1e-06,
name = NULL
)
Arguments
object |
What to compose the new
|
num_inducing_points |
number of inducing points in the Variational Gaussian Process distribution. |
kernel_provider |
a |
event_shape |
the shape of the output of the layer. This translates to a
batch of underlying Variational Gaussian Process distributions. For example,
|
inducing_index_points_initializer |
a |
unconstrained_observation_noise_variance_initializer |
a |
mean_fn |
a callable that maps layer inputs to mean function values. Passed to the mean_fn parameter of Variational Gaussian Process distribution. If omitted, defaults to a constant function with trainable variable value. |
jitter |
a small term added to the diagonal of various kernel matrices for numerical stability. |
name |
name to give to this layer and the scope of ops and variables it contains. |
Value
a Keras layer
Adapts the inner kernel's step_size based on log_accept_prob.
Description
The dual averaging policy uses a noisy step size for exploration, while
averaging over tuning steps to provide a smoothed estimate of an optimal
value. It is based on section 3.2 of Hoffman and Gelman (2013), which
modifies the stochastic convex optimization scheme of Nesterov (2009).
The modified algorithm applies extra weight to recent iterations while
keeping the convergence guarantees of Robbins-Monro, and takes care not
to make the step size too small too quickly when maintaining a constant
trajectory length, to avoid expensive early iterations. A good target
acceptance probability depends on the inner kernel. If this kernel is
HamiltonianMonteCarlo
, then 0.6-0.9 is a good range to aim for. For
RandomWalkMetropolis
this should be closer to 0.25. See the individual
kernels' docstrings for guidance.
Usage
mcmc_dual_averaging_step_size_adaptation(
inner_kernel,
num_adaptation_steps,
target_accept_prob = 0.75,
exploration_shrinkage = 0.05,
step_count_smoothing = 10,
decay_rate = 0.75,
step_size_setter_fn = NULL,
step_size_getter_fn = NULL,
log_accept_prob_getter_fn = NULL,
validate_args = FALSE,
name = NULL
)
Arguments
inner_kernel |
|
num_adaptation_steps |
Scalar |
target_accept_prob |
A floating point |
exploration_shrinkage |
Floating point scalar |
step_count_smoothing |
Int32 scalar |
decay_rate |
Floating point scalar |
step_size_setter_fn |
A function with the signature
|
step_size_getter_fn |
A callable with the signature
|
log_accept_prob_getter_fn |
A callable with the signature
|
validate_args |
|
name |
name prefixed to Ops created by this function.
Default value: |
Details
In general, adaptation prevents the chain from reaching a stationary
distribution, so obtaining consistent samples requires num_adaptation_steps
be set to a value somewhat smaller than the number of burnin steps.
However, it may sometimes be helpful to set num_adaptation_steps
to a larger
value during development in order to inspect the behavior of the chain during
adaptation.
The step size is assumed to broadcast with the chain state, potentially having
leading dimensions corresponding to multiple chains. When there are fewer of
those leading dimensions than there are chain dimensions, the corresponding
dimensions in the log_accept_prob
are averaged (in the direct space, rather
than the log space) before being used to adjust the step size. This means that
this kernel can do both cross-chain adaptation, or per-chain step size
adaptation, depending on the shape of the step size.
For example, if your problem has a state with shape [S]
, your chain state
has shape [C0, C1, S]
(meaning that there are C0 * C1
total chains) and
log_accept_prob
has shape [C0, C1]
(one acceptance probability per chain),
then depending on the shape of the step size, the following will happen:
Step size has shape
[]
,[S]
or[1]
, thelog_accept_prob
will be averaged across itsC0
andC1
dimensions. This means that you will learn a shared step size based on the mean acceptance probability across all chains. This can be useful if you don't have a lot of steps to adapt and want to average away the noise.Step size has shape
[C1, 1]
or[C1, S]
, thelog_accept_prob
will be averaged across itsC0
dimension. This means that you will learn a shared step size based on the mean acceptance probability across chains that share the coordinate across theC1
dimension. This can be useful when theC1
dimension indexes different distributions, whileC0
indexes replicas of a single distribution, all sampled in parallel.Step size has shape
[C0, C1, 1]
or[C0, C1, S]
, then no averaging will happen. This means that each chain will learn its own step size. This can be useful when all chains are sampling from different distributions. Even when all chains are for the same distribution, this can help during the initial warmup period.Step size has shape
[C0, 1, 1]
or[C0, 1, S]
, thelog_accept_prob
will be averaged across itsC1
dimension. This means that you will learn a shared step size based on the mean acceptance probability across chains that share the coordinate across theC0
dimension. This can be useful when theC0
dimension indexes different distributions, whileC1
indexes replicas of a single distribution, all sampled in parallel.
Value
a Monte Carlo sampling kernel
See Also
For an example of how to use it, see mcmc_no_u_turn_sampler().
Other mcmc_kernels:
mcmc_hamiltonian_monte_carlo()
,
mcmc_metropolis_adjusted_langevin_algorithm()
,
mcmc_metropolis_hastings()
,
mcmc_no_u_turn_sampler()
,
mcmc_random_walk_metropolis()
,
mcmc_replica_exchange_mc()
,
mcmc_simple_step_size_adaptation()
,
mcmc_slice_sampler()
,
mcmc_transformed_transition_kernel()
,
mcmc_uncalibrated_hamiltonian_monte_carlo()
,
mcmc_uncalibrated_langevin()
,
mcmc_uncalibrated_random_walk()
Estimate a lower bound on effective sample size for each independent chain.
Description
Roughly speaking, "effective sample size" (ESS) is the size of an iid sample
with the same variance as state
.
Usage
mcmc_effective_sample_size(
states,
filter_threshold = 0,
filter_beyond_lag = NULL,
name = NULL
)
Arguments
states |
|
filter_threshold |
|
filter_beyond_lag |
|
name |
name to prepend to created ops. |
Details
More precisely, given a stationary sequence of possibly correlated random
variables X_1, X_2,...,X_N
, each identically distributed ESS is the number
such that
Variance{ N**-1 * Sum{X_i} } = ESS**-1 * Variance{ X_1 }.
If the sequence is uncorrelated, ESS = N
. In general, one should expect
ESS <= N
, with more highly correlated sequences having smaller ESS
.
Value
Tensor
or list of Tensor
objects. The effective sample size of
each component of states
. Shape will be states$shape[1:]
.
See Also
Other mcmc_functions:
mcmc_potential_scale_reduction()
,
mcmc_sample_annealed_importance_chain()
,
mcmc_sample_chain()
,
mcmc_sample_halton_sequence()
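A minimal sketch, reusing the states tensor from the mcmc_sample_chain() example in this manual; one estimate is returned per component of the state:
ess <- mcmc_effective_sample_size(states)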
Runs one step of Hamiltonian Monte Carlo.
Description
Hamiltonian Monte Carlo (HMC) is a Markov chain Monte Carlo (MCMC) algorithm
that takes a series of gradient-informed steps to produce a Metropolis
proposal. This class implements one random HMC step from a given
current_state
. Mathematical details and derivations can be found in
Neal (2011).
Usage
mcmc_hamiltonian_monte_carlo(
target_log_prob_fn,
step_size,
num_leapfrog_steps,
state_gradients_are_stopped = FALSE,
step_size_update_fn = NULL,
seed = NULL,
store_parameters_in_results = FALSE,
name = NULL
)
Arguments
target_log_prob_fn |
Function which takes an argument like
|
step_size |
|
num_leapfrog_steps |
Integer number of steps to run the leapfrog integrator
for. Total progress per HMC step is roughly proportional to
|
state_gradients_are_stopped |
|
step_size_update_fn |
Function taking current |
seed |
integer to seed the random number generator. |
store_parameters_in_results |
If |
name |
string prefixed to Ops created by this function.
Default value: |
Details
The one_step
function can update multiple chains in parallel. It assumes
that all leftmost dimensions of current_state
index independent chain states
(and are therefore updated independently). The output of
target_log_prob_fn(current_state)
should sum log-probabilities across all
event dimensions. Slices along the rightmost dimensions may have different
target distributions; for example, current_state[0, :]
could have a
different target distribution from current_state[1, :]
. These semantics are
governed by target_log_prob_fn(current_state)
. (The number of independent
chains is tf$size(target_log_prob_fn(current_state))
.)
Value
a Monte Carlo sampling kernel
See Also
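For an example of how to use it, see mcmc_sample_chain().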
Other mcmc_kernels:
mcmc_dual_averaging_step_size_adaptation()
,
mcmc_metropolis_adjusted_langevin_algorithm()
,
mcmc_metropolis_hastings()
,
mcmc_no_u_turn_sampler()
,
mcmc_random_walk_metropolis()
,
mcmc_replica_exchange_mc()
,
mcmc_simple_step_size_adaptation()
,
mcmc_slice_sampler()
,
mcmc_transformed_transition_kernel()
,
mcmc_uncalibrated_hamiltonian_monte_carlo()
,
mcmc_uncalibrated_langevin()
,
mcmc_uncalibrated_random_walk()
Runs one step of Metropolis-adjusted Langevin algorithm.
Description
Metropolis-adjusted Langevin algorithm (MALA) is a Markov chain Monte Carlo
(MCMC) algorithm that takes a step of a discretised Langevin diffusion as a
proposal. This class implements one step of MALA using Euler-Maruyama method
for a given current_state
and diagonal preconditioning volatility
matrix.
Usage
mcmc_metropolis_adjusted_langevin_algorithm(
target_log_prob_fn,
step_size,
volatility_fn = NULL,
seed = NULL,
parallel_iterations = 10,
name = NULL
)
Arguments
target_log_prob_fn |
Function which takes an argument like
|
step_size |
|
volatility_fn |
function which takes an argument like
|
seed |
integer to seed the random number generator. |
parallel_iterations |
the number of coordinates for which the gradients of
the volatility matrix |
name |
String prefixed to Ops created by this function.
Default value: |
Details
Mathematical details and derivations can be found in Roberts and Rosenthal (1998) and Xifara et al. (2013).
The one_step
function can update multiple chains in parallel. It assumes
that all leftmost dimensions of current_state
index independent chain states
(and are therefore updated independently). The output of
target_log_prob_fn(current_state)
should reduce log-probabilities across
all event dimensions. Slices along the rightmost dimensions may have different
target distributions; for example, current_state[0, :]
could have a
different target distribution from current_state[1, :]
. These semantics are
governed by target_log_prob_fn(current_state)
. (The number of independent
chains is tf$size(target_log_prob_fn(current_state))
.)
See Also
Other mcmc_kernels:
mcmc_dual_averaging_step_size_adaptation()
,
mcmc_hamiltonian_monte_carlo()
,
mcmc_metropolis_hastings()
,
mcmc_no_u_turn_sampler()
,
mcmc_random_walk_metropolis()
,
mcmc_replica_exchange_mc()
,
mcmc_simple_step_size_adaptation()
,
mcmc_slice_sampler()
,
mcmc_transformed_transition_kernel()
,
mcmc_uncalibrated_hamiltonian_monte_carlo()
,
mcmc_uncalibrated_langevin()
,
mcmc_uncalibrated_random_walk()
Runs one step of the Metropolis-Hastings algorithm.
Description
The Metropolis-Hastings algorithm is a Markov chain Monte Carlo (MCMC) technique which uses a proposal distribution to eventually sample from a target distribution.
Usage
mcmc_metropolis_hastings(inner_kernel, seed = NULL, name = NULL)
Arguments
inner_kernel |
|
seed |
integer to seed the random number generator. |
name |
string prefixed to Ops created by this function. Default value: |
Details
Note: inner_kernel$one_step
must return kernel_results
as a collections$namedtuple
which must:
have a
target_log_prob
field,optionally have a
log_acceptance_correction
field, and,have only fields which are
Tensor
-valued.
The Metropolis-Hastings log acceptance-probability is computed as:
log_accept_ratio = (current_kernel_results$target_log_prob - previous_kernel_results$target_log_prob + current_kernel_results$log_acceptance_correction)
If current_kernel_results$log_acceptance_correction
does not exist, it is
presumed 0
(i.e., that the proposal distribution is symmetric).
The most common use-case for log_acceptance_correction
is in the
Metropolis-Hastings algorithm, i.e.,
accept_prob(x' | x) = p(x') / p(x) * (g(x | x') / g(x' | x)), where p represents the target distribution, g represents the proposal (conditional) distribution, x' is the proposed state, and x is the current state
The log of the parenthetical term is the log_acceptance_correction
.
The log_acceptance_correction
may not necessarily correspond to the ratio of
proposal distributions, e.g, log_acceptance_correction
has a different
interpretation in Hamiltonian Monte Carlo.
Value
a Monte Carlo sampling kernel
See Also
Other mcmc_kernels:
mcmc_dual_averaging_step_size_adaptation()
,
mcmc_hamiltonian_monte_carlo()
,
mcmc_metropolis_adjusted_langevin_algorithm()
,
mcmc_no_u_turn_sampler()
,
mcmc_random_walk_metropolis()
,
mcmc_replica_exchange_mc()
,
mcmc_simple_step_size_adaptation()
,
mcmc_slice_sampler()
,
mcmc_transformed_transition_kernel()
,
mcmc_uncalibrated_hamiltonian_monte_carlo()
,
mcmc_uncalibrated_langevin()
,
mcmc_uncalibrated_random_walk()
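A minimal sketch (the standard-normal target is an illustrative assumption): wrapping an uncalibrated proposal kernel in Metropolis-Hastings yields a calibrated chain:
kernel <- mcmc_metropolis_hastings(
  inner_kernel = mcmc_uncalibrated_random_walk(
    target_log_prob_fn = tfd_normal(loc = 0, scale = 1)$log_prob
  )
)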
Runs one step of the No U-Turn Sampler
Description
The No U-Turn Sampler (NUTS) is an adaptive variant of the Hamiltonian Monte
Carlo (HMC) method for MCMC. NUTS adapts the distance traveled in response to
the curvature of the target density. Conceptually, one proposal consists of
reversibly evolving a trajectory through the sample space, continuing until
that trajectory turns back on itself (hence the name, 'No U-Turn').
This class implements one random NUTS step from a given
current_state
. Mathematical details and derivations can be found in
Hoffman & Gelman (2011).
Usage
mcmc_no_u_turn_sampler(
target_log_prob_fn,
step_size,
max_tree_depth = 10,
max_energy_diff = 1000,
unrolled_leapfrog_steps = 1,
seed = NULL,
name = NULL
)
Arguments
target_log_prob_fn |
function which takes an argument like
|
step_size |
|
max_tree_depth |
Maximum depth of the tree implicitly built by NUTS. The
maximum number of leapfrog steps is bounded by 2^max_tree_depth. |
max_energy_diff |
Scalar threshold of energy differences at each leapfrog step; divergent samples are defined as leapfrog steps that exceed this threshold. Default value: 1000. |
unrolled_leapfrog_steps |
The number of leapfrogs to unroll per tree expansion step. Applies a direct linear multiplier to the maximum trajectory length implied by max_tree_depth. Default value: 1. |
seed |
integer to seed the random number generator. |
name |
name prefixed to Ops created by this function.
Default value: |
Details
The one_step
function can update multiple chains in parallel. It assumes
that a prefix of leftmost dimensions of current_state
index independent
chain states (and are therefore updated independently). The output of
target_log_prob_fn(current_state)
should sum log-probabilities across all
event dimensions. Slices along the rightmost dimensions may have different
target distributions; for example, current_state[0][0, ...]
could have a
different target distribution from current_state[0][1, ...]
. These
semantics are governed by target_log_prob_fn(*current_state)
.
(The number of independent chains is tf$size(target_log_prob_fn(current_state))
.)
Value
a Monte Carlo sampling kernel
See Also
Other mcmc_kernels:
mcmc_dual_averaging_step_size_adaptation()
,
mcmc_hamiltonian_monte_carlo()
,
mcmc_metropolis_adjusted_langevin_algorithm()
,
mcmc_metropolis_hastings()
,
mcmc_random_walk_metropolis()
,
mcmc_replica_exchange_mc()
,
mcmc_simple_step_size_adaptation()
,
mcmc_slice_sampler()
,
mcmc_transformed_transition_kernel()
,
mcmc_uncalibrated_hamiltonian_monte_carlo()
,
mcmc_uncalibrated_langevin()
,
mcmc_uncalibrated_random_walk()
Examples
predictors <- tf$cast( c(201,244, 47,287,203,58,210,202,198,158,165,201,157,
131,166,160,186,125,218,146),tf$float32)
obs <- tf$cast(c(592,401,583,402,495,173,479,504,510,416,393,442,317,311,400,
337,423,334,533,344),tf$float32)
y_sigma <- tf$cast(c(61,25,38,15,21,15,27,14,30,16,14,25,52,16,34,31,42,26,
16,22),tf$float32)
# Robust linear regression model
robust_lm <- tfd_joint_distribution_sequential(
list(
tfd_normal(loc = 0, scale = 1, name = "b0"),
tfd_normal(loc = 0, scale = 1, name = "b1"),
tfd_half_normal(5, name = "df"),
function(df, b1, b0)
tfd_independent(
tfd_student_t(
# Likelihood
df = tf$expand_dims(df, axis = -1L),
loc = tf$expand_dims(b0, axis = -1L) +
tf$expand_dims(b1, axis = -1L) * predictors[tf$newaxis, ],
scale = y_sigma,
name = "st"
), name = "ind")), validate_args = TRUE)
log_prob <- function(b0, b1, df) {
  robust_lm %>% tfd_log_prob(list(b0, b1, df, obs))
}
step_size0 <- Map(function(x) tf$cast(x, tf$float32), c(1, .2, .5))
number_of_steps <- 10
burnin <- 5
nchain <- 50
run_chain <- function() {
# random initialization of the starting position of each chain
samples <- robust_lm %>% tfd_sample(nchain)
b0 <- samples[[1]]
b1 <- samples[[2]]
df <- samples[[3]]
# bijector to map constrained parameters to real
unconstraining_bijectors <- list(
tfb_identity(), tfb_identity(), tfb_exp())
trace_fn <- function(x, pkr) {
list(pkr$inner_results$inner_results$step_size,
pkr$inner_results$inner_results$log_accept_ratio)
}
nuts <- mcmc_no_u_turn_sampler(
target_log_prob_fn = log_prob,
step_size = step_size0
) %>%
mcmc_transformed_transition_kernel(bijector = unconstraining_bijectors) %>%
mcmc_dual_averaging_step_size_adaptation(
num_adaptation_steps = burnin,
step_size_setter_fn = function(pkr, new_step_size)
pkr$`_replace`(
inner_results = pkr$inner_results$`_replace`(step_size = new_step_size)),
step_size_getter_fn = function(pkr) pkr$inner_results$step_size,
log_accept_prob_getter_fn = function(pkr) pkr$inner_results$log_accept_ratio
)
nuts %>% mcmc_sample_chain(
num_results = number_of_steps,
num_burnin_steps = burnin,
current_state = list(b0, b1, df),
trace_fn = trace_fn)
}
run_chain <- tensorflow::tf_function(run_chain)
res <- run_chain()
Gelman and Rubin (1992)'s potential scale reduction for chain convergence.
Description
Given N > 1
states from each of C > 1
independent chains, the potential
scale reduction factor, commonly referred to as R-hat, measures convergence of
the chains (to the same target) by testing for equality of means.
Usage
mcmc_potential_scale_reduction(
chains_states,
independent_chain_ndims = 1,
name = NULL
)
Arguments
chains_states |
|
independent_chain_ndims |
Integer type |
name |
name to prepend to created tf. Default: |
Details
Specifically, R-hat measures the degree to which variance (of the means) between chains exceeds what one would expect if the chains were identically distributed. See Gelman and Rubin (1992), Brooks and Gelman (1998).
Some guidelines:
The initial state of the chains should be drawn from a distribution overdispersed with respect to the target.
If all chains converge to the target, then as
N --> infinity
, R-hat --> 1. Before that, R-hat > 1 (except in pathological cases, e.g. if the chain paths were identical).
C > 1
. IncreasingC
improves effectiveness of the diagnostic.Sometimes, R-hat < 1.2 is used to indicate approximate convergence, but of course this is problem dependent. See Brooks and Gelman (1998).
R-hat only measures non-convergence of the mean. If higher moments, or other statistics are desired, a different diagnostic should be used. See Brooks and Gelman (1998).
To see why R-hat is reasonable, let X
be a random variable drawn uniformly
from the combined states (combined over all chains). Then, in the limit
N, C --> infinity
, with E
, Var
denoting expectation and variance,
R-hat = ( E[Var[X | chain]] + Var[E[X | chain]] ) / E[Var[X | chain]].
Using the law of total variance, the numerator is the variance of the combined
states, and the denominator is the total variance minus the variance of
the individual chain means. If the chains are all drawing from the same
distribution, they will have the same mean, and thus the ratio should be one.
Value
Tensor
or list
of Tensor
s representing the R-hat statistic for
the state(s). Same dtype
as state
, and shape equal to
state$shape[1 + independent_chain_ndims:]
.
References
Stephen P. Brooks and Andrew Gelman. General Methods for Monitoring Convergence of Iterative Simulations. Journal of Computational and Graphical Statistics, 7(4), 1998.
Andrew Gelman and Donald B. Rubin. Inference from Iterative Simulation Using Multiple Sequences. Statistical Science, 7(4):457-472, 1992.
See Also
Other mcmc_functions:
mcmc_effective_sample_size()
,
mcmc_sample_annealed_importance_chain()
,
mcmc_sample_chain()
,
mcmc_sample_halton_sequence()
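A minimal sketch (the standard-normal target, chain count, and kernel settings are illustrative assumptions): sample ten chains in parallel, then check that R-hat is close to 1:
kernel <- mcmc_hamiltonian_monte_carlo(
  target_log_prob_fn = tfd_normal(loc = 0, scale = 1)$log_prob,
  step_size = 0.5,
  num_leapfrog_steps = 2
)
# the leftmost dimension of the state indexes the 10 chains
chains_states <- kernel %>% mcmc_sample_chain(
  num_results = 1000,
  num_burnin_steps = 500,
  current_state = tf$zeros(10L),
  trace_fn = NULL
)
rhat <- mcmc_potential_scale_reduction(chains_states, independent_chain_ndims = 1)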
Runs one step of the RWM algorithm with symmetric proposal.
Description
Random Walk Metropolis is a gradient-free Markov chain Monte Carlo
(MCMC) algorithm. The algorithm involves a proposal generating step
proposal_state = current_state + perturb
by a random
perturbation, followed by Metropolis-Hastings accept/reject step. For more
details see Section 2.1 of Roberts and Rosenthal (2004).
Usage
mcmc_random_walk_metropolis(
target_log_prob_fn,
new_state_fn = NULL,
seed = NULL,
name = NULL
)
Arguments
target_log_prob_fn |
Function which takes an argument like
|
new_state_fn |
Function which takes a list of state parts and a
seed; returns a same-type |
seed |
integer to seed the random number generator. |
name |
String name prefixed to Ops created by this function.
Default value: |
Details
The current class implements RWM for normal and uniform proposals. Alternatively,
the user can supply any custom proposal generating function.
The function one_step
can update multiple chains in parallel. It assumes
that all leftmost dimensions of current_state
index independent chain states
(and are therefore updated independently). The output of
target_log_prob_fn(current_state)
should sum log-probabilities across all
event dimensions. Slices along the rightmost dimensions may have different
target distributions; for example, current_state[0, :]
could have a
different target distribution from current_state[1, :]
. These semantics
are governed by target_log_prob_fn(current_state)
. (The number of
independent chains is tf$size(target_log_prob_fn(current_state))
.)
Value
a Monte Carlo sampling kernel
See Also
Other mcmc_kernels:
mcmc_dual_averaging_step_size_adaptation()
,
mcmc_hamiltonian_monte_carlo()
,
mcmc_metropolis_adjusted_langevin_algorithm()
,
mcmc_metropolis_hastings()
,
mcmc_no_u_turn_sampler()
,
mcmc_replica_exchange_mc()
,
mcmc_simple_step_size_adaptation()
,
mcmc_slice_sampler()
,
mcmc_transformed_transition_kernel()
,
mcmc_uncalibrated_hamiltonian_monte_carlo()
,
mcmc_uncalibrated_langevin()
,
mcmc_uncalibrated_random_walk()
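A minimal sketch (the standard-normal target is an illustrative assumption):
kernel <- mcmc_random_walk_metropolis(
  target_log_prob_fn = tfd_normal(loc = 0, scale = 1)$log_prob
)
states <- kernel %>% mcmc_sample_chain(
  num_results = 500,
  num_burnin_steps = 100,
  current_state = 0,
  trace_fn = NULL
)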
Runs one step of the Replica Exchange Monte Carlo
Description
Replica Exchange Monte Carlo
is a Markov chain Monte Carlo (MCMC) algorithm that is also known as Parallel Tempering.
This algorithm runs multiple chains at different temperatures in parallel
and exchanges states between them according to the Metropolis-Hastings criterion.
The K
replicas are parameterized in terms of inverse_temperature
's,
(beta[0], beta[1], ..., beta[K-1])
. If the target distribution has
probability density p(x)
, the kth
replica has density p(x)**beta_k
.
Usage
mcmc_replica_exchange_mc(
target_log_prob_fn,
inverse_temperatures,
make_kernel_fn,
swap_proposal_fn = tfp$mcmc$replica_exchange_mc$default_swap_proposal_fn(1),
state_includes_replicas = FALSE,
seed = NULL,
name = NULL
)
Arguments
target_log_prob_fn |
Function which takes an argument like
|
inverse_temperatures |
|
make_kernel_fn |
Function which takes target_log_prob_fn and seed args and returns a TransitionKernel instance. |
swap_proposal_fn |
function which take a number of replicas, and return combinations of replicas for exchange. |
state_includes_replicas |
Boolean indicating whether the leftmost dimension
of each state sample should index replicas. If |
seed |
integer to seed the random number generator. |
name |
string prefixed to Ops created by this function.
Default value: |
Details
Typically beta[0] = 1.0
, and 1.0 > beta[1] > beta[2] > ... > 0.0
.
-
beta[0] == 1
==> The first replica samples from the target density, p.
beta[k] < 1
, fork = 1, ..., K-1
==> Other replicas sample from "flattened" versions ofp
(peak is less high, valley less low). These distributions are somewhat closer to a uniform on the support ofp
. Samples from adjacent replicasi
,i + 1
are used as proposals for each other in a Metropolis step. This allows the lowerbeta
samples, which explore less dense areas ofp
, to occasionally be used to help thebeta == 1
chain explore new regions of the support. Samples from replica 0 are returned, and the others are discarded.
Value
list of
next_state
(Tensor or Python list of Tensor
s representing the state(s)
of the Markov chain(s) at each result step; has the same shape as
current_state
) and
kernel_results
(collections$namedtuple
of internal calculations used to
advance the chain).
See Also
Other mcmc_kernels:
mcmc_dual_averaging_step_size_adaptation()
,
mcmc_hamiltonian_monte_carlo()
,
mcmc_metropolis_adjusted_langevin_algorithm()
,
mcmc_metropolis_hastings()
,
mcmc_no_u_turn_sampler()
,
mcmc_random_walk_metropolis()
,
mcmc_simple_step_size_adaptation()
,
mcmc_slice_sampler()
,
mcmc_transformed_transition_kernel()
,
mcmc_uncalibrated_hamiltonian_monte_carlo()
,
mcmc_uncalibrated_langevin()
,
mcmc_uncalibrated_random_walk()
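A minimal sketch (the temperatures and HMC settings are illustrative assumptions); the resulting kernel is passed to mcmc_sample_chain() like any other:
kernel <- mcmc_replica_exchange_mc(
  target_log_prob_fn = tfd_normal(loc = 0, scale = 1)$log_prob,
  # geometrically decaying inverse temperatures: 1, 0.5, 0.25, 0.125
  inverse_temperatures = tf$constant(0.5^(0:3), dtype = tf$float32),
  make_kernel_fn = function(target_log_prob_fn, seed) {
    mcmc_hamiltonian_monte_carlo(
      target_log_prob_fn = target_log_prob_fn,
      seed = seed,
      step_size = 0.1,
      num_leapfrog_steps = 3
    )
  }
)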
Runs annealed importance sampling (AIS) to estimate normalizing constants.
Description
This function uses an MCMC transition operator (e.g., Hamiltonian Monte Carlo)
to sample from a series of distributions that slowly interpolates between
an initial "proposal" distribution:
exp(proposal_log_prob_fn(x) - proposal_log_normalizer)
and the target distribution:
exp(target_log_prob_fn(x) - target_log_normalizer)
,
accumulating importance weights along the way. The product of these
importance weights gives an unbiased estimate of the ratio of the
normalizing constants of the initial distribution and the target
distribution:
E[exp(ais_weights)] = exp(target_log_normalizer - proposal_log_normalizer)
.
Usage
mcmc_sample_annealed_importance_chain(
num_steps,
proposal_log_prob_fn,
target_log_prob_fn,
current_state,
make_kernel_fn,
parallel_iterations = 10,
name = NULL
)
Arguments
num_steps |
Integer number of Markov chain updates to run. More iterations means more expense, but smoother annealing between q and p, which in turn means exponentially lower variance for the normalizing constant estimator. |
proposal_log_prob_fn |
function that returns the log density of the initial distribution. |
target_log_prob_fn |
function which takes an argument like
|
current_state |
|
make_kernel_fn |
function which returns a |
parallel_iterations |
The number of iterations allowed to run in parallel.
It must be a positive integer. See |
name |
string prefixed to Ops created by this function.
Default value: |
Details
Note: When running in graph mode, proposal_log_prob_fn
and
target_log_prob_fn
are called exactly three times (although this may be
reduced to two times in the future).
Value
list of
next_state
(Tensor
or Python list of Tensor
s representing the
state(s) of the Markov chain(s) at the final iteration. Has same shape as
input current_state
),
ais_weights
(Tensor with the estimated weight(s). Has shape matching
target_log_prob_fn(current_state)
), and
kernel_results
(collections$namedtuple
of internal calculations used to
advance the chain).
See Also
For an example of how to use it, see mcmc_sample_chain().
Other mcmc_functions:
mcmc_effective_sample_size()
,
mcmc_potential_scale_reduction()
,
mcmc_sample_chain()
,
mcmc_sample_halton_sequence()
Implements Markov chain Monte Carlo via repeated TransitionKernel
steps.
Description
This function samples from a Markov chain at current_state
whose
stationary distribution is governed by the supplied TransitionKernel
instance (kernel
).
Usage
mcmc_sample_chain(
kernel = NULL,
num_results,
current_state,
previous_kernel_results = NULL,
num_burnin_steps = 0,
num_steps_between_results = 0,
trace_fn = NULL,
return_final_kernel_results = FALSE,
parallel_iterations = 10,
seed = NULL,
name = NULL
)
Arguments
kernel |
An instance of |
num_results |
Integer number of Markov chain draws. |
current_state |
|
previous_kernel_results |
A |
num_burnin_steps |
Integer number of chain steps to take before starting to collect results. Default value: 0 (i.e., no burn-in). |
num_steps_between_results |
Integer number of chain steps between collecting
a result. Only one out of every |
trace_fn |
A function that takes in the current chain state and the previous
kernel results and return a |
return_final_kernel_results |
If |
parallel_iterations |
The number of iterations allowed to run in parallel. It
must be a positive integer. See |
seed |
Optional, a seed for reproducible sampling. |
name |
string prefixed to Ops created by this function. Default value: |
Details
This function can sample from multiple chains, in parallel. (Whether or not
there are multiple chains is dictated by the kernel
.)
The current_state
can be represented as a single Tensor
or a list
of
Tensors
which collectively represent the current state.
Since MCMC states are correlated, it is sometimes desirable to produce
additional intermediate states, and then discard them, ending up with a set of
states with decreased autocorrelation. See Owen (2017). Such "thinning"
is made possible by setting num_steps_between_results > 0
. The chain then
takes num_steps_between_results
extra steps between the steps that make it
into the results. The extra steps are never materialized (in calls to
sess$run
), and thus do not increase memory requirements.
Warning: when setting a seed
in the kernel
, ensure that sample_chain
's
parallel_iterations=1
, otherwise results will not be reproducible.
In addition to returning the chain state, this function supports tracing of
auxiliary variables used by the kernel. The traced values are selected by
specifying trace_fn
. By default, all kernel results are traced but in the
future the default will be changed to no results being traced, so plan
accordingly. See below for some examples of this feature.
Value
list of:
checkpointable_states_and_trace: if
return_final_kernel_results
isTRUE
. The return value is an instance ofCheckpointableStatesAndTrace
.all_states: if
return_final_kernel_results
isFALSE
andtrace_fn
isNULL
. The return value is aTensor
or Python list ofTensor
s representing the state(s) of the Markov chain(s) at each result step. Has same shape as inputcurrent_state
but with a prependednum_results
-size dimension.states_and_trace: if
return_final_kernel_results
isFALSE
andtrace_fn
is notNULL
. The return value is an instance ofStatesAndTrace
.
See Also
Other mcmc_functions:
mcmc_effective_sample_size()
,
mcmc_potential_scale_reduction()
,
mcmc_sample_annealed_importance_chain()
,
mcmc_sample_halton_sequence()
Examples
dims <- 10
true_stddev <- sqrt(seq(1, 3, length.out = dims))
likelihood <- tfd_multivariate_normal_diag(scale_diag = true_stddev)
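# HMC kernel targeting the log density of the multivariate normal above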
kernel <- mcmc_hamiltonian_monte_carlo(
target_log_prob_fn = likelihood$log_prob,
step_size = 0.5,
num_leapfrog_steps = 2
)
states <- kernel %>% mcmc_sample_chain(
num_results = 1000,
num_burnin_steps = 500,
current_state = rep(0, dims),
trace_fn = NULL
)
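# estimate the distribution's moments from the chain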
sample_mean <- tf$reduce_mean(states, axis = 0L)
sample_stddev <- tf$sqrt(
tf$reduce_mean(tf$math$squared_difference(states, sample_mean), axis = 0L))
Returns a sample from the dim-dimensional Halton sequence.
Description
Warning: The sequence elements take values only between 0 and 1. Care must be taken to appropriately transform the domain of a function if it differs from the unit cube before evaluating integrals using Halton samples. It is also important to remember that quasi-random numbers without randomization are not a replacement for pseudo-random numbers in every context. Quasi random numbers are completely deterministic and typically have significant negative autocorrelation unless randomization is used.
Usage
mcmc_sample_halton_sequence(
dim,
num_results = NULL,
sequence_indices = NULL,
dtype = tf$float32,
randomized = TRUE,
seed = NULL,
name = NULL
)
Arguments
dim |
Positive |
num_results |
(Optional) Positive scalar |
sequence_indices |
(Optional) |
dtype |
(Optional) The dtype of the sample. One of: |
randomized |
(Optional) bool indicating whether to produce a randomized
Halton sequence. If TRUE, applies the randomization described in
Owen (2017). Default value: |
seed |
(Optional) integer to seed the random number generator. Only
used if |
name |
(Optional) string describing ops managed by this function. If not supplied the name of this function is used. Default value: "sample_halton_sequence". |
Details
Computes the members of the low discrepancy Halton sequence in dimension
dim
. The dim
-dimensional sequence takes values in the unit hypercube in
dim
dimensions. Currently, only dimensions up to 1000 are supported. The
prime base for the k-th axis is the k-th prime starting from 2. For example,
if dim
= 3, then the bases will be [2, 3, 5]
respectively and the first
element of the non-randomized sequence will be: [0.5, 0.333, 0.2].
If randomized
is true, this function produces a scrambled version of the
Halton sequence introduced by Owen (2017).
The number of samples produced is controlled by the num_results
and
sequence_indices
parameters. The user must supply either num_results
or
sequence_indices
but not both.
The former is the number of samples to produce starting from the first
element. If sequence_indices
is given instead, the specified elements of
the sequence are generated. For example, sequence_indices = tf$range(10) is
equivalent to specifying num_results = 10.
Value
halton_elements Elements of the Halton sequence. Tensor
of supplied dtype
and shape
[num_results, dim]
if num_results
was specified or shape
[s, dim]
where s is the size of sequence_indices
if sequence_indices
were specified.
See Also
For an example of how to use it, see mcmc_sample_chain().
Other mcmc_functions:
mcmc_effective_sample_size()
,
mcmc_potential_scale_reduction()
,
mcmc_sample_annealed_importance_chain()
,
mcmc_sample_chain()
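A minimal sketch (the integrand is an illustrative assumption): estimate the integral of f(x, y) = x * y over the unit square, whose true value is 1/4:
sample <- mcmc_sample_halton_sequence(dim = 2L, num_results = 1000L, seed = 127L)
# the mean of the coordinate products approximates the integral
estimate <- tf$reduce_mean(tf$reduce_prod(sample, axis = -1L))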
Adapts the inner kernel's step_size based on log_accept_prob.
Description
The simple policy multiplicatively increases or decreases the step_size
of
the inner kernel based on the value of log_accept_prob
. It is based on
equation 19 of Andrieu and Thoms (2008). Given enough steps and small
enough adaptation_rate
the median of the distribution of the acceptance
probability will converge to the target_accept_prob
. A good target
acceptance probability depends on the inner kernel. If this kernel is
HamiltonianMonteCarlo
, then 0.6-0.9 is a good range to aim for. For
RandomWalkMetropolis
this should be closer to 0.25. See the individual
kernels' docstrings for guidance.
Usage
mcmc_simple_step_size_adaptation(
inner_kernel,
num_adaptation_steps,
target_accept_prob = 0.75,
adaptation_rate = 0.01,
step_size_setter_fn = NULL,
step_size_getter_fn = NULL,
log_accept_prob_getter_fn = NULL,
validate_args = FALSE,
name = NULL
)
Arguments
inner_kernel |
|
num_adaptation_steps |
Scalar |
target_accept_prob |
A floating point |
adaptation_rate |
|
step_size_setter_fn |
A function with the signature
|
step_size_getter_fn |
A function with the signature
|
log_accept_prob_getter_fn |
A function with the signature
|
validate_args |
|
name |
string prefixed to Ops created by this class. Default: "simple_step_size_adaptation". |
Details
In general, adaptation prevents the chain from reaching a stationary
distribution, so obtaining consistent samples requires num_adaptation_steps
be set to a value somewhat smaller than the number of burnin steps.
However, it may sometimes be helpful to set num_adaptation_steps
to a larger
value during development in order to inspect the behavior of the chain during
adaptation.
The step size is assumed to broadcast with the chain state, potentially having
leading dimensions corresponding to multiple chains. When there are fewer of
those leading dimensions than there are chain dimensions, the corresponding
dimensions in the log_accept_prob
are averaged (in the direct space, rather
than the log space) before being used to adjust the step size. This means that
this kernel can do both cross-chain adaptation, or per-chain step size
adaptation, depending on the shape of the step size.
For example, if your problem has a state with shape [S]
, your chain state
has shape [C0, C1, Y]
(meaning that there are C0 * C1
total chains) and
log_accept_prob
has shape [C0, C1]
(one acceptance probability per chain),
then depending on the shape of the step size, the following will happen:
Step size has shape
[]
,[S]
or[1]
, thelog_accept_prob
will be averaged across itsC0
andC1
dimensions. This means that you will learn a shared step size based on the mean acceptance probability across all chains. This can be useful if you don't have a lot of steps to adapt and want to average away the noise.Step size has shape
[C1, 1]
or[C1, S]
, thelog_accept_prob
will be averaged across itsC0
dimension. This means that you will learn a shared step size based on the mean acceptance probability across chains that share the coordinate across theC1
dimension. This can be useful when theC1
dimension indexes different distributions, whileC0
indexes replicas of a single distribution, all sampled in parallel.Step size has shape
[C0, C1, 1]
or[C0, C1, S]
, then no averaging will happen. This means that each chain will learn its own step size. This can be useful when all chains are sampling from different distributions. Even when all chains are for the same distribution, this can help during the initial warmup period.Step size has shape
[C0, 1, 1]
or[C0, 1, S]
, thelog_accept_prob
will be averaged across itsC1
dimension. This means that you will learn a shared step size based on the mean acceptance probability across chains that share the coordinate across theC0
dimension. This can be useful when theC0
dimension indexes different distributions, whileC1
indexes replicas of a single distribution, all sampled in parallel.
Value
a Monte Carlo sampling kernel
See Also
Other mcmc_kernels:
mcmc_dual_averaging_step_size_adaptation()
,
mcmc_hamiltonian_monte_carlo()
,
mcmc_metropolis_adjusted_langevin_algorithm()
,
mcmc_metropolis_hastings()
,
mcmc_no_u_turn_sampler()
,
mcmc_random_walk_metropolis()
,
mcmc_replica_exchange_mc()
,
mcmc_slice_sampler()
,
mcmc_transformed_transition_kernel()
,
mcmc_uncalibrated_hamiltonian_monte_carlo()
,
mcmc_uncalibrated_langevin()
,
mcmc_uncalibrated_random_walk()
Examples
target_log_prob_fn <- tfd_normal(loc = 0, scale = 1)$log_prob
num_burnin_steps <- 500
num_results <- 500
num_chains <- 64L
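# one initial step size per chain, so each chain adapts its own step size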
step_size <- tf$fill(list(num_chains), 0.1)
kernel <- mcmc_hamiltonian_monte_carlo(
target_log_prob_fn = target_log_prob_fn,
num_leapfrog_steps = 2,
step_size = step_size
) %>%
mcmc_simple_step_size_adaptation(num_adaptation_steps = round(num_burnin_steps * 0.8))
res <- kernel %>% mcmc_sample_chain(
num_results = num_results,
num_burnin_steps = num_burnin_steps,
current_state = rep(0, num_chains),
trace_fn = function(x, pkr) {
list (
pkr$inner_results$accepted_results$step_size,
pkr$inner_results$log_accept_ratio
)
}
)
samples <- res$all_states
step_size <- res$trace[[1]]
log_accept_ratio <- res$trace[[2]]
Runs one step of the slice sampler using a hit-and-run approach
Description
Slice Sampling is a Markov Chain Monte Carlo (MCMC) algorithm based, as stated
by Neal (2003), on the observation that "...one can sample from a
distribution by sampling uniformly from the region under the plot of its
density function. A Markov chain that converges to this uniform distribution
can be constructed by alternately uniform sampling in the vertical direction
with uniform sampling from the horizontal slice
defined by the current
vertical position, or more generally, with some update that leaves the uniform
distribution over this slice invariant". Mathematical details and derivations
can be found in Neal (2003). The one dimensional slice sampler is
extended to n-dimensions through use of a hit-and-run approach: choose a
random direction in n-dimensional space and take a step, as determined by the
one-dimensional slice sampling algorithm, along that direction
(Belisle et al., 1993).
Usage
mcmc_slice_sampler(
target_log_prob_fn,
step_size,
max_doublings,
seed = NULL,
name = NULL
)
Arguments
target_log_prob_fn |
Function which takes an argument like
|
step_size |
|
max_doublings |
Scalar positive int32 |
seed |
integer to seed the random number generator. |
name |
string prefixed to Ops created by this function.
Default value: |
Details
The one_step
function can update multiple chains in parallel. It assumes
that all leftmost dimensions of current_state
index independent chain states
(and are therefore updated independently). The output of
target_log_prob_fn(*current_state)
should sum log-probabilities across all
event dimensions. Slices along the rightmost dimensions may have different
target distributions; for example, current_state[0, :]
could have a
different target distribution from current_state[1, :]
. These semantics are
governed by target_log_prob_fn(*current_state)
. (The number of independent
chains is tf$size(target_log_prob_fn(*current_state))
.)
Note that the sampler only supports states where all components have a common dtype.
Value
list of
next_state (Tensor or Python list of Tensors representing the state(s) of the Markov chain(s) at each result step; has the same shape as current_state) and
kernel_results (collections$namedtuple of internal calculations used to advance the chain).
References
-
Radford M. Neal. Slice Sampling. The Annals of Statistics, 2003, Vol. 31, No. 3, 705-767.
C.J.P. Belisle, H.E. Romeijn, R.L. Smith. Hit-and-run algorithms for generating multivariate distributions. Math. Oper. Res., 18 (1993), 225-266.
See Also
Other mcmc_kernels:
mcmc_dual_averaging_step_size_adaptation()
,
mcmc_hamiltonian_monte_carlo()
,
mcmc_metropolis_adjusted_langevin_algorithm()
,
mcmc_metropolis_hastings()
,
mcmc_no_u_turn_sampler()
,
mcmc_random_walk_metropolis()
,
mcmc_replica_exchange_mc()
,
mcmc_simple_step_size_adaptation()
,
mcmc_transformed_transition_kernel()
,
mcmc_uncalibrated_hamiltonian_monte_carlo()
,
mcmc_uncalibrated_langevin()
,
mcmc_uncalibrated_random_walk()
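Examples
A minimal sketch, not taken from the package manual; the standard normal target and all settings are illustrative only:
target_log_prob_fn <- tfd_normal(loc = 0, scale = 1)$log_prob
kernel <- mcmc_slice_sampler(
  target_log_prob_fn = target_log_prob_fn,
  step_size = 1.0,
  max_doublings = 5L
)
res <- kernel %>% mcmc_sample_chain(
  num_results = 100,
  num_burnin_steps = 50,
  current_state = 0
)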
Applies a bijector to the MCMC's state space
Description
The transformed transition kernel enables fitting
a bijector which serves to decorrelate the Markov chain Monte Carlo (MCMC)
event dimensions thus making the chain mix faster. This is
particularly useful when the geometry of the target distribution is
unfavorable. In such cases it may take many evaluations of the
target_log_prob_fn
for the chain to mix between faraway states.
Usage
mcmc_transformed_transition_kernel(inner_kernel, bijector, name = NULL)
Arguments
inner_kernel |
|
bijector |
bijector or list of bijectors. These bijectors use |
name |
string prefixed to Ops created by this function.
Default value: |
Details
The idea of training an affine function to decorrelate chain event dims was presented in Parno and Marzouk (2014). Used in conjunction with the Hamiltonian Monte Carlo transition kernel, the Parno and Marzouk (2014) idea is an instance of Riemannian manifold HMC (Girolami and Calderhead, 2011).
The transformed transition kernel enables arbitrary bijective transformations
of arbitrary transition kernels, e.g., one could use bijectors
tfb_affine
, tfb_real_nvp
, etc.
with transition kernels mcmc_hamiltonian_monte_carlo
, mcmc_random_walk_metropolis
, etc.
Value
a Monte Carlo sampling kernel
References
Matthew Parno, Youssef Marzouk. Transport map accelerated Markov chain Monte Carlo. arXiv preprint arXiv:1412.5492, 2014.
Mark Girolami, Ben Calderhead. Riemann manifold Langevin and Hamiltonian Monte Carlo methods. Journal of the Royal Statistical Society: Series B, 73(2):123-214, 2011.
See Also
Other mcmc_kernels:
mcmc_dual_averaging_step_size_adaptation()
,
mcmc_hamiltonian_monte_carlo()
,
mcmc_metropolis_adjusted_langevin_algorithm()
,
mcmc_metropolis_hastings()
,
mcmc_no_u_turn_sampler()
,
mcmc_random_walk_metropolis()
,
mcmc_replica_exchange_mc()
,
mcmc_simple_step_size_adaptation()
,
mcmc_slice_sampler()
,
mcmc_uncalibrated_hamiltonian_monte_carlo()
,
mcmc_uncalibrated_langevin()
,
mcmc_uncalibrated_random_walk()
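Examples
An illustrative sketch, not from the package manual: HMC on a positive-valued target, run in unconstrained space by composing with tfb_exp(); the gamma target and all settings are made up:
target <- tfd_gamma(concentration = 2, rate = 1)
kernel <- mcmc_hamiltonian_monte_carlo(
  target_log_prob_fn = target$log_prob,
  step_size = 0.1,
  num_leapfrog_steps = 3
) %>%
  mcmc_transformed_transition_kernel(bijector = tfb_exp())
res <- kernel %>% mcmc_sample_chain(
  num_results = 100,
  num_burnin_steps = 50,
  current_state = 1
)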
Runs one step of Uncalibrated Hamiltonian Monte Carlo
Description
Warning: this kernel will not result in a chain which converges to the
target_log_prob
. To get a convergent MCMC, use mcmc_hamiltonian_monte_carlo(...)
or mcmc_metropolis_hastings(mcmc_uncalibrated_hamiltonian_monte_carlo(...))
.
For more details on UncalibratedHamiltonianMonteCarlo
, see HamiltonianMonteCarlo
.
Usage
mcmc_uncalibrated_hamiltonian_monte_carlo(
target_log_prob_fn,
step_size,
num_leapfrog_steps,
state_gradients_are_stopped = FALSE,
seed = NULL,
store_parameters_in_results = FALSE,
name = NULL
)
Arguments
target_log_prob_fn |
Function which takes an argument like
|
step_size |
|
num_leapfrog_steps |
Integer number of steps to run the leapfrog integrator
for. Total progress per HMC step is roughly proportional to
|
state_gradients_are_stopped |
|
seed |
integer to seed the random number generator. |
store_parameters_in_results |
If |
name |
string prefixed to Ops created by this function.
Default value: |
Value
a Monte Carlo sampling kernel
See Also
Other mcmc_kernels:
mcmc_dual_averaging_step_size_adaptation()
,
mcmc_hamiltonian_monte_carlo()
,
mcmc_metropolis_adjusted_langevin_algorithm()
,
mcmc_metropolis_hastings()
,
mcmc_no_u_turn_sampler()
,
mcmc_random_walk_metropolis()
,
mcmc_replica_exchange_mc()
,
mcmc_simple_step_size_adaptation()
,
mcmc_slice_sampler()
,
mcmc_transformed_transition_kernel()
,
mcmc_uncalibrated_langevin()
,
mcmc_uncalibrated_random_walk()
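Examples
A minimal sketch of the Metropolis-Hastings wrapping described above (illustrative target and settings):
kernel <- mcmc_metropolis_hastings(
  mcmc_uncalibrated_hamiltonian_monte_carlo(
    target_log_prob_fn = tfd_normal(loc = 0, scale = 1)$log_prob,
    step_size = 0.1,
    num_leapfrog_steps = 3
  )
)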
Runs one step of Uncalibrated Langevin discretized diffusion.
Description
The class generates a Langevin proposal using the _euler_method
function and
also computes helper UncalibratedLangevinKernelResults
for the next
iteration.
Warning: this kernel will not result in a chain which converges to the
target_log_prob
. To get a convergent MCMC, use
mcmc_metropolis_adjusted_langevin_algorithm(...)
or mcmc_metropolis_hastings(mcmc_uncalibrated_langevin(...))
.
Usage
mcmc_uncalibrated_langevin(
target_log_prob_fn,
step_size,
volatility_fn = NULL,
parallel_iterations = 10,
compute_acceptance = TRUE,
seed = NULL,
name = NULL
)
Arguments
target_log_prob_fn |
Function which takes an argument like
|
step_size |
|
volatility_fn |
function which takes an argument like
|
parallel_iterations |
the number of coordinates for which the gradients of
the volatility matrix |
compute_acceptance |
logical indicating whether to compute the
Metropolis log-acceptance ratio used to construct |
seed |
integer to seed the random number generator. |
name |
String prefixed to Ops created by this function.
Default value: |
Value
list of
next_state (Tensor or Python list of Tensors representing the state(s) of the Markov chain(s) at each result step; has the same shape as current_state) and
kernel_results (collections$namedtuple of internal calculations used to advance the chain).
See Also
Other mcmc_kernels:
mcmc_dual_averaging_step_size_adaptation()
,
mcmc_hamiltonian_monte_carlo()
,
mcmc_metropolis_adjusted_langevin_algorithm()
,
mcmc_metropolis_hastings()
,
mcmc_no_u_turn_sampler()
,
mcmc_random_walk_metropolis()
,
mcmc_replica_exchange_mc()
,
mcmc_simple_step_size_adaptation()
,
mcmc_slice_sampler()
,
mcmc_transformed_transition_kernel()
,
mcmc_uncalibrated_hamiltonian_monte_carlo()
,
mcmc_uncalibrated_random_walk()
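Examples
A minimal sketch of the Metropolis-Hastings wrapping described above (illustrative target and step size; compute_acceptance is left at its default TRUE so the wrapper can use the acceptance ratio):
kernel <- mcmc_metropolis_hastings(
  mcmc_uncalibrated_langevin(
    target_log_prob_fn = tfd_normal(loc = 0, scale = 1)$log_prob,
    step_size = 0.1
  )
)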
Generate proposal for the Random Walk Metropolis algorithm.
Description
Warning: this kernel will not result in a chain which converges to the
target_log_prob
. To get a convergent MCMC, use
mcmc_random_walk_metropolis(...)
or
mcmc_metropolis_hastings(mcmc_uncalibrated_random_walk(...))
.
Usage
mcmc_uncalibrated_random_walk(
target_log_prob_fn,
new_state_fn = NULL,
seed = NULL,
name = NULL
)
Arguments
target_log_prob_fn |
Function which takes an argument like
|
new_state_fn |
Function which takes a list of state parts and a
seed; returns a same-type |
seed |
integer to seed the random number generator. |
name |
String name prefixed to Ops created by this function.
Default value: |
Value
a Monte Carlo sampling kernel
See Also
Other mcmc_kernels:
mcmc_dual_averaging_step_size_adaptation()
,
mcmc_hamiltonian_monte_carlo()
,
mcmc_metropolis_adjusted_langevin_algorithm()
,
mcmc_metropolis_hastings()
,
mcmc_no_u_turn_sampler()
,
mcmc_random_walk_metropolis()
,
mcmc_replica_exchange_mc()
,
mcmc_simple_step_size_adaptation()
,
mcmc_slice_sampler()
,
mcmc_transformed_transition_kernel()
,
mcmc_uncalibrated_hamiltonian_monte_carlo()
,
mcmc_uncalibrated_langevin()
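Examples
A minimal sketch of the Metropolis-Hastings wrapping described above (illustrative target; with the default new_state_fn this is the construction that mcmc_random_walk_metropolis() performs for you):
kernel <- mcmc_metropolis_hastings(
  mcmc_uncalibrated_random_walk(
    target_log_prob_fn = tfd_normal(loc = 0, scale = 1)$log_prob
  )
)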
number of params
needed to create a CategoricalMixtureOfOneHotCategorical distribution
Description
number of params
needed to create a CategoricalMixtureOfOneHotCategorical distribution
Usage
params_size_categorical_mixture_of_one_hot_categorical(
event_size,
num_components
)
Arguments
event_size |
event size of this distribution |
num_components |
number of components in the mixture |
Value
a scalar
number of params
needed to create an IndependentBernoulli distribution
Description
number of params
needed to create an IndependentBernoulli distribution
Usage
params_size_independent_bernoulli(event_size)
Arguments
event_size |
event size of this distribution |
Value
a scalar
number of params
needed to create an IndependentLogistic distribution
Description
number of params
needed to create an IndependentLogistic distribution
Usage
params_size_independent_logistic(event_size)
Arguments
event_size |
event size of this distribution |
Value
a scalar
number of params
needed to create an IndependentNormal distribution
Description
number of params
needed to create an IndependentNormal distribution
Usage
params_size_independent_normal(event_size)
Arguments
event_size |
event size of this distribution |
Value
a scalar
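Examples
These helpers are typically used to size the dense layer that feeds a probabilistic keras layer. A sketch with made-up sizes, assuming keras is attached and that the helper is paired with layer_independent_normal() and its event_shape argument:
library(keras)
event_size <- 3L
model <- keras_model_sequential() %>%
  layer_dense(units = params_size_independent_normal(event_size), input_shape = 10L) %>%
  layer_independent_normal(event_shape = event_size)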
number of params
needed to create an IndependentPoisson distribution
Description
number of params
needed to create an IndependentPoisson distribution
Usage
params_size_independent_poisson(event_size)
Arguments
event_size |
event size of this distribution |
Value
a scalar
number of params
needed to create a MixtureLogistic distribution
Description
number of params
needed to create a MixtureLogistic distribution
Usage
params_size_mixture_logistic(num_components, event_shape)
Arguments
num_components |
Number of component distributions in the mixture distribution. |
event_shape |
event shape of this distribution. |
Value
a scalar
number of params
needed to create a MixtureNormal distribution
Description
number of params
needed to create a MixtureNormal distribution
Usage
params_size_mixture_normal(num_components, event_shape)
Arguments
num_components |
Number of component distributions in the mixture distribution. |
event_shape |
event shape of this distribution. |
Value
a scalar
number of params
needed to create a MixtureSameFamily distribution
Description
number of params
needed to create a MixtureSameFamily distribution
Usage
params_size_mixture_same_family(num_components, component_params_size)
Arguments
num_components |
Number of component distributions in the mixture distribution. |
component_params_size |
Number of parameters needed to create a single component distribution. |
Value
a scalar
number of params
needed to create a MultivariateNormalTriL distribution
Description
number of params
needed to create a MultivariateNormalTriL distribution
Usage
params_size_multivariate_normal_tri_l(event_size)
Arguments
event_size |
event size of this distribution |
Value
a scalar
number of params
needed to create a OneHotCategorical distribution
Description
number of params
needed to create a OneHotCategorical distribution
Usage
params_size_one_hot_categorical(event_size)
Arguments
event_size |
event size of this distribution |
Value
a scalar
Objects exported from other packages
Description
These objects are imported from other packages. Follow the links below to see their documentation.
Value
an alias for tensorflow::tf
an alias for tensorflow::shape
an alias for tensorflow::tf_config
an alias for magrittr::%>%
A state space model representing a sum of component state space models.
Description
A state space model (SSM) posits a set of latent (unobserved) variables that
evolve over time with dynamics specified by a probabilistic transition model
p(z[t+1] | z[t])
. At each timestep, we observe a value sampled from an
observation model conditioned on the current state, p(x[t] | z[t])
. The
special case where both the transition and observation models are Gaussians
with mean specified as a linear function of the inputs, is known as a linear
Gaussian state space model and supports tractable exact probabilistic
calculations; see tfd_linear_gaussian_state_space_model
for details.
Usage
sts_additive_state_space_model(
component_ssms,
constant_offset = 0,
observation_noise_scale = NULL,
initial_state_prior = NULL,
initial_step = 0,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = NULL
)
Arguments
component_ssms |
|
constant_offset |
scalar |
observation_noise_scale |
Optional scalar |
initial_state_prior |
instance of |
initial_step |
Optional scalar |
validate_args |
|
allow_nan_stats |
|
name |
string prefixed to ops created by this class. Default value: "AdditiveStateSpaceModel". |
Details
The sts_additive_state_space_model
represents a sum of component state space
models. Each of the N
components describes a random process
generating a distribution on observed time series x1[t], x2[t], ..., xN[t]
.
The additive model represents the sum of these
processes, y[t] = x1[t] + x2[t] + ... + xN[t] + eps[t]
, where
eps[t] ~ N(0, observation_noise_scale)
is an observation noise term.
Mathematical Details
The additive model concatenates the latent states of its component models. The generative process runs each component's dynamics in its own subspace of latent space, and then observes the sum of the observation models from the components.
Formally, the transition model is linear Gaussian:
p(z[t+1] | z[t]) ~ Normal(loc = transition_matrix.matmul(z[t]), cov = transition_cov)
where each z[t]
is a latent state vector concatenating the component
state vectors, z[t] = [z1[t], z2[t], ..., zN[t]]
, so it has size
latent_size = sum([c.latent_size for c in components])
.
The transition matrix is the block-diagonal composition of transition matrices from the component processes:
transition_matrix = [[ c0.transition_matrix, 0.,                   ..., 0.                   ],
                     [ 0.,                   c1.transition_matrix, ..., 0.                   ],
                     [ ...,                  ...,                  ..., ...                  ],
                     [ 0.,                   0.,                   ..., cN.transition_matrix ]]
and the noise covariance is similarly the block-diagonal composition of component noise covariances:
transition_cov = [[ c0.transition_cov, 0.,                ..., 0.                ],
                  [ 0.,                c1.transition_cov, ..., 0.                ],
                  [ ...,               ...,               ..., ...               ],
                  [ 0.,                0.,                ..., cN.transition_cov ]]
The observation model is also linear Gaussian,
p(y[t] | z[t]) ~ Normal(loc = observation_matrix.matmul(z[t]), stddev = observation_noise_scale)
This implementation assumes scalar observations, so observation_matrix
has shape [1, latent_size]
.
The additive observation matrix simply concatenates the observation matrices from each component:
observation_matrix = concat([c0.obs_matrix, c1.obs_matrix, ..., cN.obs_matrix], axis=-1)
The effect is that each component observation matrix acts on the dimensions of latent state corresponding to that component, and the overall expected observation is the sum of the expected observations from each component.
If observation_noise_scale
is not explicitly specified, it is also computed
by summing the noise variances of the component processes:
observation_noise_scale = sqrt(sum([c.observation_noise_scale**2 for c in components]))
Value
an instance of LinearGaussianStateSpaceModel
.
See Also
Other sts:
sts_autoregressive_state_space_model()
,
sts_autoregressive()
,
sts_constrained_seasonal_state_space_model()
,
sts_dynamic_linear_regression_state_space_model()
,
sts_dynamic_linear_regression()
,
sts_linear_regression()
,
sts_local_level_state_space_model()
,
sts_local_level()
,
sts_local_linear_trend_state_space_model()
,
sts_local_linear_trend()
,
sts_seasonal_state_space_model()
,
sts_seasonal()
,
sts_semi_local_linear_trend_state_space_model()
,
sts_semi_local_linear_trend()
,
sts_smooth_seasonal_state_space_model()
,
sts_smooth_seasonal()
,
sts_sparse_linear_regression()
,
sts_sum()
Formal representation of an autoregressive model.
Description
An autoregressive (AR) model posits a latent level
whose value at each step
is a noisy linear combination of previous steps:
level[t+1] = (sum(coefficients * levels[t:t-order:-1]) + Normal(0., level_scale))
Usage
sts_autoregressive(
observed_time_series = NULL,
order,
coefficients_prior = NULL,
level_scale_prior = NULL,
initial_state_prior = NULL,
coefficient_constraining_bijector = NULL,
name = NULL
)
Arguments
observed_time_series |
optional |
order |
scalar positive |
coefficients_prior |
optional |
level_scale_prior |
optional |
initial_state_prior |
optional |
coefficient_constraining_bijector |
optional |
name |
the name of this model component. Default value: 'Autoregressive'. |
Details
The latent state is levels[t:t-order:-1]
. We observe a noisy realization of
the current level: f[t] = level[t] + Normal(0., observation_noise_scale)
at
each timestep.
If coefficients=[1.]
, the AR process is a simple random walk, equivalent to
a LocalLevel
model. However, a random walk's variance increases with time,
while many AR processes (in particular, any first-order process with
abs(coefficient) < 1
) are stationary, i.e., they maintain a constant
variance over time. This makes AR processes useful models of uncertainty.
Value
an instance of StructuralTimeSeries
.
See Also
For usage examples see sts_fit_with_hmc()
, sts_forecast()
, sts_decompose_by_component()
.
Other sts:
sts_additive_state_space_model()
,
sts_autoregressive_state_space_model()
,
sts_constrained_seasonal_state_space_model()
,
sts_dynamic_linear_regression_state_space_model()
,
sts_dynamic_linear_regression()
,
sts_linear_regression()
,
sts_local_level_state_space_model()
,
sts_local_level()
,
sts_local_linear_trend_state_space_model()
,
sts_local_linear_trend()
,
sts_seasonal_state_space_model()
,
sts_seasonal()
,
sts_semi_local_linear_trend_state_space_model()
,
sts_semi_local_linear_trend()
,
sts_smooth_seasonal_state_space_model()
,
sts_smooth_seasonal()
,
sts_sparse_linear_regression()
,
sts_sum()
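Examples
A brief sketch with made-up data, adding an AR(2) component:
observed_time_series <- array(rnorm(50), dim = c(50, 1))
ar_component <- observed_time_series %>% sts_autoregressive(order = 2)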
State space model for an autoregressive process.
Description
A state space model (SSM) posits a set of latent (unobserved) variables that
evolve over time with dynamics specified by a probabilistic transition model
p(z[t+1] | z[t])
. At each timestep, we observe a value sampled from an
observation model conditioned on the current state, p(x[t] | z[t])
. The
special case where both the transition and observation models are Gaussians
with mean specified as a linear function of the inputs, is known as a linear
Gaussian state space model and supports tractable exact probabilistic
calculations; see tfd_linear_gaussian_state_space_model
for
details.
Usage
sts_autoregressive_state_space_model(
num_timesteps,
coefficients,
level_scale,
initial_state_prior,
observation_noise_scale = 0,
initial_step = 0,
validate_args = FALSE,
name = NULL
)
Arguments
num_timesteps |
Scalar |
coefficients |
|
level_scale |
Scalar (any additional dimensions are treated as batch
dimensions) |
initial_state_prior |
instance of |
observation_noise_scale |
Scalar (any additional dimensions are
treated as batch dimensions) |
initial_step |
Optional scalar |
validate_args |
|
name |
name prefixed to ops created by this class. Default value: "AutoregressiveStateSpaceModel". |
Details
In an autoregressive process, the expected level at each timestep is a linear function of previous levels, with added Gaussian noise:
level[t+1] = (sum(coefficients * levels[t:t-order:-1]) + Normal(0., level_scale))
The process is characterized by a vector coefficients
whose size determines
the order of the process (how many previous values it looks at), and by
level_scale
, the standard deviation of the noise added at each step.
This is formulated as a state space model by letting the latent state encode
the most recent values; see 'Mathematical Details' below.
The parameters level_scale
and observation_noise_scale
are each (a batch
of) scalars, and coefficients
is a (batch) vector of size list(order)
. The
batch shape of this Distribution
is the broadcast batch
shape of these parameters and of the initial_state_prior
.
Mathematical Details
The autoregressive model implements a
tfd_linear_gaussian_state_space_model
with latent_size = order
and observation_size = 1
. The latent state vector encodes the recent history
of the process, with the current value in the topmost dimension. At each
timestep, the transition sums the previous values to produce the new expected
value, shifts all other values down by a dimension, and adds noise to the
current value. This is formally encoded by the transition model:
transition_matrix = [ coefs[0], coefs[1], ..., coefs[order]
                      1.,       0.,       ..., 0.
                      0.,       1.,       ..., 0.
                      ...
                      0.,       0.,       ..., 1., 0. ]
transition_noise ~ N(loc=0., scale=diag([level_scale, 0., 0., ..., 0.]))
The observation model simply extracts the current (topmost) value, and optionally adds independent noise at each step:
observation_matrix = [[1., 0., ..., 0.]]
observation_noise ~ N(loc = 0, scale = observation_noise_scale)
Models with observation_noise_scale = 0
are AR processes in the formal
sense. Setting observation_noise_scale
to a nonzero value corresponds to a
latent AR process observed under an iid noise model.
Value
an instance of LinearGaussianStateSpaceModel
.
See Also
Other sts:
sts_additive_state_space_model()
,
sts_autoregressive()
,
sts_constrained_seasonal_state_space_model()
,
sts_dynamic_linear_regression_state_space_model()
,
sts_dynamic_linear_regression()
,
sts_linear_regression()
,
sts_local_level_state_space_model()
,
sts_local_level()
,
sts_local_linear_trend_state_space_model()
,
sts_local_linear_trend()
,
sts_seasonal_state_space_model()
,
sts_seasonal()
,
sts_semi_local_linear_trend_state_space_model()
,
sts_semi_local_linear_trend()
,
sts_smooth_seasonal_state_space_model()
,
sts_smooth_seasonal()
,
sts_sparse_linear_regression()
,
sts_sum()
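Examples
An illustrative construction with made-up values; the order of the process (here 2) is implied by the length of coefficients, and the prior must be over a latent state of that size:
ar_ssm <- sts_autoregressive_state_space_model(
  num_timesteps = 50,
  coefficients = c(0.8, -0.1),
  level_scale = 0.5,
  initial_state_prior = tfd_multivariate_normal_diag(scale_diag = c(1, 1))
)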
Build a variational posterior that factors over model parameters.
Description
The surrogate posterior consists of independent Normal distributions for
each parameter with trainable loc
and scale
, transformed using the
parameter's bijector
to the appropriate support space for that parameter.
Usage
sts_build_factored_surrogate_posterior(
model,
batch_shape = list(),
seed = NULL,
name = NULL
)
Arguments
model |
An instance of |
batch_shape |
Batch shape ( |
seed |
integer to seed the random number generator. |
name |
string prefixed to ops created by this function.
Default value: |
Value
variational_posterior tfd_joint_distribution_named
defining a trainable
surrogate posterior over model parameters. Samples from this
distribution are named lists with character
parameter names as keys.
See Also
Other sts-functions:
sts_build_factored_variational_loss()
,
sts_decompose_by_component()
,
sts_decompose_forecast_by_component()
,
sts_fit_with_hmc()
,
sts_forecast()
,
sts_one_step_predictive()
,
sts_sample_uniform_initial_state()
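Examples
A minimal sketch with made-up data, building a surrogate posterior for a simple local level model:
observed_time_series <- array(rnorm(12), dim = c(12, 1))
model <- observed_time_series %>%
  sts_sum(components = list(sts_local_level(observed_time_series)))
surrogate_posterior <- model %>% sts_build_factored_surrogate_posterior()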
Build a loss function for variational inference in STS models.
Description
Variational inference searches for the distribution within some family of
approximate posteriors that minimizes a divergence between the approximate
posterior q(z)
and true posterior p(z|observed_time_series)
. By converting
inference to optimization, it's generally much faster than sampling-based
inference algorithms such as HMC. The tradeoff is that the approximating
family rarely contains the true posterior, so it may miss important aspects of
posterior structure (in particular, dependence between variables) and should
not be blindly trusted. Results may vary; it's generally wise to compare to
HMC to evaluate whether inference quality is sufficient for your task at hand.
Usage
sts_build_factored_variational_loss(
observed_time_series,
model,
init_batch_shape = list(),
seed = NULL,
name = NULL
)
Arguments
observed_time_series |
|
model |
An instance of |
init_batch_shape |
Batch shape ( |
seed |
integer to seed the random number generator. |
name |
name prefixed to ops created by this function. Default value: |
Details
This method constructs a loss function for variational inference using the
Kullback-Leibler divergence KL[q(z) || p(z|observed_time_series)]
, with an
approximating family given by independent Normal distributions transformed to
the appropriate parameter space for each parameter. Minimizing this loss (the
negative ELBO) maximizes a lower bound on the log model evidence
log p(observed_time_series)
. This is equivalent to the 'mean-field' method
implemented in Kucukelbir et al. (2017) and is a standard approach.
The resulting posterior approximations are unimodal; they will tend to underestimate posterior
uncertainty when the true posterior contains multiple modes
(the KL[q||p]
divergence encourages choosing a single mode) or dependence between variables.
Value
list of:
variational_loss:
float
Tensor
of shapetf$concat([init_batch_shape, model$batch_shape])
, encoding a stochastic estimate of an upper bound on the negative model evidence-log p(y)
. Minimizing this loss performs variational inference; the gap between the variational bound and the true (generally unknown) model evidence corresponds to the divergenceKL[q||p]
between the approximate and true posterior.variational_distributions: a named list giving the approximate posterior for each model parameter. The keys are
character
parameter names in order, corresponding to[param.name for param in model.parameters]
. The values aretfd$Distribution
instances with batch shapetf$concat([init_batch_shape, model$batch_shape])
; these will typically be of the formtfd$TransformedDistribution(tfd.Normal(...), bijector=param.bijector)
.
References
Alp Kucukelbir, Dustin Tran, Rajesh Ranganath, Andrew Gelman, David M. Blei. Automatic Differentiation Variational Inference. Journal of Machine Learning Research, 18(14):1-45, 2017.
See Also
Other sts-functions:
sts_build_factored_surrogate_posterior()
,
sts_decompose_by_component()
,
sts_decompose_forecast_by_component()
,
sts_fit_with_hmc()
,
sts_forecast()
,
sts_one_step_predictive()
,
sts_sample_uniform_initial_state()
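Examples
A minimal sketch with made-up data; only the loss construction is shown, not the optimization loop that would minimize it:
observed_time_series <- array(rnorm(12), dim = c(12, 1))
model <- observed_time_series %>%
  sts_sum(components = list(sts_local_level(observed_time_series)))
loss_and_dists <- observed_time_series %>%
  sts_build_factored_variational_loss(model = model)
variational_loss <- loss_and_dists[[1]]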
Seasonal state space model with effects constrained to sum to zero.
Description
Seasonal state space model with effects constrained to sum to zero.
Usage
sts_constrained_seasonal_state_space_model(
num_timesteps,
num_seasons,
drift_scale,
initial_state_prior,
observation_noise_scale = 1e-04,
num_steps_per_season = 1,
initial_step = 0,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = NULL
)
Arguments
num_timesteps |
Scalar |
num_seasons |
Scalar |
drift_scale |
Scalar (any additional dimensions are treated as batch
dimensions) |
initial_state_prior |
instance of |
observation_noise_scale |
Scalar (any additional dimensions are
treated as batch dimensions) |
num_steps_per_season |
|
initial_step |
Optional scalar |
validate_args |
|
allow_nan_stats |
|
name |
string prefixed to ops created by this class. Default value: "SeasonalStateSpaceModel". |
Value
an instance of LinearGaussianStateSpaceModel
.
See Also
sts_seasonal_state_space_model()
.
Mathematical details
The constrained model implements a reparameterization of the
naive SeasonalStateSpaceModel
. Instead of directly representing the
seasonal effects in the latent space, the latent space of the constrained
model represents the difference between each effect and the mean effect.
The following discussion assumes familiarity with the mathematical details
of SeasonalStateSpaceModel
.
Reparameterization and constraints: let the seasonal effects at a given
timestep be E = [e_1, ..., e_N]
. The difference between each effect e_i
and the mean effect is z_i = e_i - sum_i(e_i)/N
. By itself, this
transformation is not invertible because recovering the absolute effects
requires that we know the mean as well. To fix this, we'll define
z_N = sum_i(e_i)/N
as the mean effect. It's easy to see that this is
invertible: given the mean effect and the differences of the first N - 1
effects from the mean, it's easy to solve for all N
effects. Formally,
we've defined the invertible linear reparameterization Z = R E
, where
R = [ 1 - 1/N, -1/N,    ..., -1/N
      -1/N,    1 - 1/N, ..., -1/N,
      ...
      1/N,     1/N,     ...,  1/N ]
represents the change of basis from 'effect coordinates' E to
'residual coordinates' Z. The Z
s form the latent space of the
ConstrainedSeasonalStateSpaceModel
.
To constrain the mean effect z_N
to zero, we fix the prior to zero,
p(z_N) ~ N(0., 0)
, and after the transition at each timestep we project
z_N
back to zero. Note that this projection is linear: to set the Nth
dimension to zero, we simply multiply by the identity matrix with a missing
element in the bottom right, i.e., Z_constrained = P Z
,
where P = eye(N) - scatter((N-1, N-1), 1)
.
Model: concretely, suppose a naive seasonal effect model has initial state
prior N(m, S)
, transition matrix F
and noise covariance
Q
, and observation matrix H
. Then the corresponding constrained seasonal
effect model has initial state prior N(P R m, P R S R' P')
,
transition matrix P R F R^-1
and noise covariance P R Q R' P'
, and
observation matrix H R^-1
, where the change-of-basis matrix R
and
constraint projection matrix P
are as defined above. This follows
directly from applying the reparameterization Z = R E
, and then enforcing
the zero-sum constraint on the prior and transition noise covariances.
In practice, because the sum of effects z_N
is constrained to be zero, it
will never contribute a term to any linear operation on the latent space,
so we can drop that dimension from the model entirely.
ConstrainedSeasonalStateSpaceModel
does this, so that it implements the
N - 1
dimension latent space z_1, ..., z_[N-1]
.
Note that since we constrained the mean effect to be zero, the latent
z_i
's now recover their interpretation as the actual effects,
z_i = e_i
for i =
1, ..., N - 1, even though they were originally defined as residuals. The
Nth effect is represented only implicitly, as the nonzero mean of the first
N - 1 effects. Although the computational representation is not symmetric across all
N effects, we derived the
ConstrainedSeasonalStateSpaceModel by starting with a symmetric representation and imposing only a symmetric constraint (the zero-sum constraint), so the probability model remains symmetric over all
N
seasonal effects.
Other sts:
sts_additive_state_space_model()
,
sts_autoregressive_state_space_model()
,
sts_autoregressive()
,
sts_dynamic_linear_regression_state_space_model()
,
sts_dynamic_linear_regression()
,
sts_linear_regression()
,
sts_local_level_state_space_model()
,
sts_local_level()
,
sts_local_linear_trend_state_space_model()
,
sts_local_linear_trend()
,
sts_seasonal_state_space_model()
,
sts_seasonal()
,
sts_semi_local_linear_trend_state_space_model()
,
sts_semi_local_linear_trend()
,
sts_smooth_seasonal_state_space_model()
,
sts_smooth_seasonal()
,
sts_sparse_linear_regression()
,
sts_sum()
Decompose an observed time series into contributions from each component.
Description
This method decomposes a time series according to the posterior representation of a structural time series model. In particular, it:
- Computes the posterior marginal mean and covariances over the additive model's latent space.
- Decomposes the latent posterior into the marginal blocks for each model component.
- Maps the per-component latent posteriors back through each component's observation model, to generate the time series modeled by that component.
Usage
sts_decompose_by_component(observed_time_series, model, parameter_samples)
Arguments
observed_time_series |
|
model |
An instance of |
parameter_samples |
|
Value
component_dists A named list mapping
component StructuralTimeSeries instances (elements of model$components
)
to Distribution
instances representing the posterior marginal
distributions on the process modeled by each component. Each distribution
has batch shape matching that of posterior_means
/posterior_covs
, and
event shape of list(num_timesteps)
.
See Also
Other sts-functions:
sts_build_factored_surrogate_posterior()
,
sts_build_factored_variational_loss()
,
sts_decompose_forecast_by_component()
,
sts_fit_with_hmc()
,
sts_forecast()
,
sts_one_step_predictive()
,
sts_sample_uniform_initial_state()
Examples
observed_time_series <- array(rnorm(2 * 1 * 12), dim = c(2, 1, 12))
day_of_week <- observed_time_series %>% sts_seasonal(num_seasons = 7, name = "seasonal")
local_linear_trend <- observed_time_series %>% sts_local_linear_trend(name = "local_linear")
model <- observed_time_series %>%
sts_sum(components = list(day_of_week, local_linear_trend))
states_and_results <- observed_time_series %>%
sts_fit_with_hmc(
model,
num_results = 10,
num_warmup_steps = 5,
num_variational_steps = 15
)
samples <- states_and_results[[1]]
component_dists <- observed_time_series %>%
sts_decompose_by_component(model = model, parameter_samples = samples)
Decompose a forecast distribution into contributions from each component.
Description
Decompose a forecast distribution into contributions from each component.
Usage
sts_decompose_forecast_by_component(model, forecast_dist, parameter_samples)
Arguments
model |
An instance of |
forecast_dist |
A |
parameter_samples |
|
Value
component_dists A named list mapping
component StructuralTimeSeries instances (elements of model$components
)
to Distribution
instances representing the marginal forecast for each component.
Each distribution has batch shape matching forecast_dist
(specifically,
the event shape is [num_steps_forecast]
).
See Also
Other sts-functions:
sts_build_factored_surrogate_posterior()
,
sts_build_factored_variational_loss()
,
sts_decompose_by_component()
,
sts_fit_with_hmc()
,
sts_forecast()
,
sts_one_step_predictive()
,
sts_sample_uniform_initial_state()
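Examples
A sketch that reuses the objects from the sts_forecast() example (model, samples, and the forecast distribution preds are assumed to exist as defined there):
component_forecasts <- sts_decompose_forecast_by_component(
  model = model,
  forecast_dist = preds,
  parameter_samples = samples
)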
Formal representation of a dynamic linear regression model.
Description
The dynamic linear regression model is a special case of a linear Gaussian SSM
and a generalization of typical (static) linear regression. The model
represents regression weights
with a latent state which evolves via a
Gaussian random walk:
Usage
sts_dynamic_linear_regression(
observed_time_series = NULL,
design_matrix,
drift_scale_prior = NULL,
initial_weights_prior = NULL,
name = NULL
)
Arguments
observed_time_series |
optional |
design_matrix |
float |
drift_scale_prior |
instance of |
initial_weights_prior |
instance of |
name |
the name of this component. Default value: 'DynamicLinearRegression'. |
Details
weights[t] ~ Normal(weights[t-1], drift_scale)
The latent state has dimension num_features
, while the parameters
drift_scale
and observation_noise_scale
are each (a batch of) scalars. The
batch shape of this distribution is the broadcast batch shape of these
parameters, the initial_state_prior
, and the design_matrix
.
num_features
is determined from the last dimension of design_matrix
(equivalent to the
number of columns in the design matrix in linear regression).
Value
an instance of StructuralTimeSeries
.
See Also
For usage examples see sts_fit_with_hmc()
, sts_forecast()
, sts_decompose_by_component()
.
Other sts:
sts_additive_state_space_model()
,
sts_autoregressive_state_space_model()
,
sts_autoregressive()
,
sts_constrained_seasonal_state_space_model()
,
sts_dynamic_linear_regression_state_space_model()
,
sts_linear_regression()
,
sts_local_level_state_space_model()
,
sts_local_level()
,
sts_local_linear_trend_state_space_model()
,
sts_local_linear_trend()
,
sts_seasonal_state_space_model()
,
sts_seasonal()
,
sts_semi_local_linear_trend_state_space_model()
,
sts_semi_local_linear_trend()
,
sts_smooth_seasonal_state_space_model()
,
sts_smooth_seasonal()
,
sts_sparse_linear_regression()
,
sts_sum()
State space model for a dynamic linear regression from provided covariates.
Description
A state space model (SSM) posits a set of latent (unobserved) variables that
evolve over time with dynamics specified by a probabilistic transition model
p(z[t+1] | z[t])
. At each timestep, we observe a value sampled from an
observation model conditioned on the current state, p(x[t] | z[t])
. The
special case where both the transition and observation models are Gaussians
with mean specified as a linear function of the inputs, is known as a linear
Gaussian state space model and supports tractable exact probabilistic
calculations; see tfd_linear_gaussian_state_space_model
for details.
Usage
sts_dynamic_linear_regression_state_space_model(
num_timesteps,
design_matrix,
drift_scale,
initial_state_prior,
observation_noise_scale = 0,
initial_step = 0,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = NULL
)
Arguments
num_timesteps |
Scalar |
design_matrix |
float |
drift_scale |
Scalar (any additional dimensions are treated as batch
dimensions) |
initial_state_prior |
instance of |
observation_noise_scale |
Scalar (any additional dimensions are
treated as batch dimensions) |
initial_step |
scalar |
validate_args |
|
allow_nan_stats |
|
name |
name prefixed to ops created by this class. Default value: 'DynamicLinearRegressionStateSpaceModel'. |
Details
The dynamic linear regression model is a special case of a linear Gaussian SSM
and a generalization of typical (static) linear regression. The model
represents regression weights
with a latent state which evolves via a
Gaussian random walk:
weights[t] ~ Normal(weights[t-1], drift_scale)
The latent state (the weights) has dimension num_features
, while the
parameters drift_scale
and observation_noise_scale
are each (a batch of)
scalars. The batch shape of this Distribution
is the broadcast batch shape
of these parameters, the initial_state_prior
, and the
design_matrix
. num_features
is determined from the last dimension of
design_matrix
(equivalent to the number of columns in the design matrix in
linear regression).
Mathematical Details
The dynamic linear regression model implements a
tfd_linear_gaussian_state_space_model
with latent_size = num_features
and
observation_size = 1
following the transition model:
transition_matrix = eye(num_features)
transition_noise ~ Normal(0, diag([drift_scale]))
which implements the evolution of weights
described above. The observation
model is:
observation_matrix[t] = design_matrix[t]
observation_noise ~ Normal(0, observation_noise_scale)
Value
an instance of LinearGaussianStateSpaceModel
.
See Also
Other sts:
sts_additive_state_space_model()
,
sts_autoregressive_state_space_model()
,
sts_autoregressive()
,
sts_constrained_seasonal_state_space_model()
,
sts_dynamic_linear_regression()
,
sts_linear_regression()
,
sts_local_level_state_space_model()
,
sts_local_level()
,
sts_local_linear_trend_state_space_model()
,
sts_local_linear_trend()
,
sts_seasonal_state_space_model()
,
sts_seasonal()
,
sts_semi_local_linear_trend_state_space_model()
,
sts_semi_local_linear_trend()
,
sts_smooth_seasonal_state_space_model()
,
sts_smooth_seasonal()
,
sts_sparse_linear_regression()
,
sts_sum()
Draw posterior samples using Hamiltonian Monte Carlo (HMC)
Description
Markov chain Monte Carlo (MCMC) methods are considered the gold standard of Bayesian inference; under suitable conditions and in the limit of infinitely many draws they generate samples from the true posterior distribution. HMC (Neal, 2011) uses gradients of the model's log-density function to propose samples, allowing it to exploit posterior geometry. However, it is computationally more expensive than variational inference and relatively sensitive to tuning.
Usage
sts_fit_with_hmc(
observed_time_series,
model,
num_results = 100,
num_warmup_steps = 50,
num_leapfrog_steps = 15,
initial_state = NULL,
initial_step_size = NULL,
chain_batch_shape = list(),
num_variational_steps = 150,
variational_optimizer = NULL,
variational_sample_size = 5,
seed = NULL,
name = NULL
)
Arguments
observed_time_series |
|
model |
An instance of |
num_results |
Integer number of Markov chain draws. Default value: |
num_warmup_steps |
Integer number of steps to take before starting to
collect results. The warmup steps are also used to adapt the step size
towards a target acceptance rate of 0.75. Default value: |
num_leapfrog_steps |
Integer number of steps to run the leapfrog integrator
for. Total progress per HMC step is roughly proportional to |
initial_state |
Optional Python |
initial_step_size |
|
chain_batch_shape |
Batch shape ( |
num_variational_steps |
|
variational_optimizer |
Optional |
variational_sample_size |
integer number of Monte Carlo samples to use
in estimating the variational divergence. Larger values may stabilize
the optimization, but at higher cost per step in time and memory.
Default value: |
seed |
integer to seed the random number generator. |
name |
name prefixed to ops created by this function. Default value: |
Details
This method attempts to provide a sensible default approach for fitting StructuralTimeSeries models using HMC. It first runs variational inference as a fast posterior approximation, and initializes the HMC sampler from the variational posterior, using the posterior standard deviations to set per-variable step sizes (equivalently, a diagonal mass matrix). During the warmup phase, it adapts the step size to target an acceptance rate of 0.75, which is thought to be in the desirable range for optimal mixing (Betancourt et al., 2014).
Value
list of:
samples:
list
ofTensors
representing posterior samples of model parameters, with shapes[concat([[num_results], chain_batch_shape, param.prior.batch_shape, param.prior.event_shape]) for param in model.parameters]
.kernel_results: A (possibly nested)
list
ofTensor
s representing internal calculations made within the HMC sampler.
References
Radford M. Neal. MCMC Using Hamiltonian Dynamics. In Handbook of Markov Chain Monte Carlo, CRC Press, 2011.
M.J. Betancourt, Simon Byrne, Mark Girolami. Optimizing the Integrator Step Size for Hamiltonian Monte Carlo. arXiv preprint arXiv:1411.6669, 2014.
See Also
Other sts-functions:
sts_build_factored_surrogate_posterior()
,
sts_build_factored_variational_loss()
,
sts_decompose_by_component()
,
sts_decompose_forecast_by_component()
,
sts_forecast()
,
sts_one_step_predictive()
,
sts_sample_uniform_initial_state()
Examples
observed_time_series <-
rep(c(3.5, 4.1, 4.5, 3.9, 2.4, 2.1, 1.2), 5) +
rep(c(1.1, 1.5, 2.4, 3.1, 4.0), each = 7) %>%
tensorflow::tf$convert_to_tensor(dtype = tensorflow::tf$float64)
day_of_week <- observed_time_series %>% sts_seasonal(num_seasons = 7)
local_linear_trend <- observed_time_series %>% sts_local_linear_trend()
model <- observed_time_series %>%
sts_sum(components = list(day_of_week, local_linear_trend))
states_and_results <- observed_time_series %>%
sts_fit_with_hmc(
model,
num_results = 10,
num_warmup_steps = 5,
num_variational_steps = 15)
Construct predictive distribution over future observations
Description
Given samples from the posterior over parameters, return the predictive distribution over future observations for num_steps_forecast timesteps.
Usage
sts_forecast(
observed_time_series,
model,
parameter_samples,
num_steps_forecast
)
Arguments
observed_time_series |
|
model |
An instance of |
parameter_samples |
|
num_steps_forecast |
scalar |
Value
forecast_dist a tfd_mixture_same_family
instance with event shape
list(num_steps_forecast, 1)
and batch shape tf$concat(list(sample_shape, model$batch_shape))
, with
num_posterior_draws
mixture components.
See Also
Other sts-functions:
sts_build_factored_surrogate_posterior()
,
sts_build_factored_variational_loss()
,
sts_decompose_by_component()
,
sts_decompose_forecast_by_component()
,
sts_fit_with_hmc()
,
sts_one_step_predictive()
,
sts_sample_uniform_initial_state()
Examples
observed_time_series <-
rep(c(3.5, 4.1, 4.5, 3.9, 2.4, 2.1, 1.2), 5) +
rep(c(1.1, 1.5, 2.4, 3.1, 4.0), each = 7) %>%
tensorflow::tf$convert_to_tensor(dtype = tensorflow::tf$float64)
day_of_week <- observed_time_series %>% sts_seasonal(num_seasons = 7)
local_linear_trend <- observed_time_series %>% sts_local_linear_trend()
model <- observed_time_series %>%
sts_sum(components = list(day_of_week, local_linear_trend))
states_and_results <- observed_time_series %>%
sts_fit_with_hmc(
model,
num_results = 10,
num_warmup_steps = 5,
num_variational_steps = 15)
samples <- states_and_results[[1]]
preds <- observed_time_series %>%
sts_forecast(model,
parameter_samples = samples,
num_steps_forecast = 50)
predictions <- preds %>% tfd_sample(10)
Formal representation of a linear regression from provided covariates.
Description
This model defines a time series given by a linear combination of covariate time series provided in a design matrix:
observed_time_series <- tf$matmul(design_matrix, weights)
Usage
sts_linear_regression(design_matrix, weights_prior = NULL, name = NULL)
Arguments
design_matrix |
float |
weights_prior |
|
name |
the name of this model component. Default value: 'LinearRegression'. |
Details
The design matrix has shape list(num_timesteps, num_features)
.
The weights are treated as an unknown random variable of size list(num_features)
(both components also support batch shape), and are integrated over using the same
approximate inference tools as other model parameters, i.e., generally HMC or
variational inference.
This component does not itself include observation noise; it defines a
deterministic distribution with mass at the point
tf$matmul(design_matrix, weights)
. In practice, it should be combined with
observation noise from another component such as sts_sum
, as demonstrated below.
Value
an instance of StructuralTimeSeries
.
See Also
For usage examples see sts_fit_with_hmc()
, sts_forecast()
, sts_decompose_by_component()
.
Other sts:
sts_additive_state_space_model()
,
sts_autoregressive_state_space_model()
,
sts_autoregressive()
,
sts_constrained_seasonal_state_space_model()
,
sts_dynamic_linear_regression_state_space_model()
,
sts_dynamic_linear_regression()
,
sts_local_level_state_space_model()
,
sts_local_level()
,
sts_local_linear_trend_state_space_model()
,
sts_local_linear_trend()
,
sts_seasonal_state_space_model()
,
sts_seasonal()
,
sts_semi_local_linear_trend_state_space_model()
,
sts_semi_local_linear_trend()
,
sts_smooth_seasonal_state_space_model()
,
sts_smooth_seasonal()
,
sts_sparse_linear_regression()
,
sts_sum()
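Examples
A sketch with made-up data, combining the regression component with observation noise via sts_sum() as described above:
observed_time_series <- array(rnorm(30), dim = c(30, 1))
design_matrix <- matrix(rnorm(30 * 2), ncol = 2)
regression <- sts_linear_regression(design_matrix = design_matrix)
model <- observed_time_series %>% sts_sum(components = list(regression))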
Formal representation of a local level model
Description
The local level model posits a level
evolving via a Gaussian random walk:
level[t] = level[t-1] + Normal(0., level_scale)
Usage
sts_local_level(
observed_time_series = NULL,
level_scale_prior = NULL,
initial_level_prior = NULL,
name = NULL
)
Arguments
observed_time_series |
optional |
level_scale_prior |
optional |
initial_level_prior |
optional |
name |
the name of this model component. Default value: 'LocalLevel'. |
Details
The latent state is [level]
. We observe a noisy realization of the current
level: f[t] = level[t] + Normal(0., observation_noise_scale)
at each timestep.
Value
an instance of StructuralTimeSeries
.
See Also
For usage examples see sts_fit_with_hmc()
, sts_forecast()
, sts_decompose_by_component()
.
Other sts:
sts_additive_state_space_model()
,
sts_autoregressive_state_space_model()
,
sts_autoregressive()
,
sts_constrained_seasonal_state_space_model()
,
sts_dynamic_linear_regression_state_space_model()
,
sts_dynamic_linear_regression()
,
sts_linear_regression()
,
sts_local_level_state_space_model()
,
sts_local_linear_trend_state_space_model()
,
sts_local_linear_trend()
,
sts_seasonal_state_space_model()
,
sts_seasonal()
,
sts_semi_local_linear_trend_state_space_model()
,
sts_semi_local_linear_trend()
,
sts_smooth_seasonal_state_space_model()
,
sts_smooth_seasonal()
,
sts_sparse_linear_regression()
,
sts_sum()
State space model for a local level
Description
A state space model (SSM) posits a set of latent (unobserved) variables that
evolve over time with dynamics specified by a probabilistic transition model
p(z[t+1] | z[t])
. At each timestep, we observe a value sampled from an
observation model conditioned on the current state, p(x[t] | z[t])
. The
special case where both the transition and observation models are Gaussians
with mean specified as a linear function of the inputs, is known as a linear
Gaussian state space model and supports tractable exact probabilistic
calculations; see tfd_linear_gaussian_state_space_model
for
details.
The local level model is a special case of a linear Gaussian SSM, in which the
latent state posits a level
evolving via a Gaussian random walk:
level[t] = level[t-1] + Normal(0., level_scale)
Usage
sts_local_level_state_space_model(
num_timesteps,
level_scale,
initial_state_prior,
observation_noise_scale = 0,
initial_step = 0,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = NULL
)
Arguments
num_timesteps |
Scalar |
level_scale |
Scalar (any additional dimensions are treated as batch
dimensions) |
initial_state_prior |
instance of |
observation_noise_scale |
Scalar (any additional dimensions are
treated as batch dimensions) |
initial_step |
Optional scalar |
validate_args |
|
allow_nan_stats |
|
name |
string name prefixed to ops created by this class. Default value: "LocalLevelStateSpaceModel". |
Details
The latent state is [level]
and [level]
is observed (with noise) at each timestep.
The parameters level_scale
and observation_noise_scale
are each (a batch
of) scalars. The batch shape of this Distribution
is the broadcast batch
shape of these parameters and of the initial_state_prior
.
Mathematical Details
The local level model implements a tfp$distributions$LinearGaussianStateSpaceModel
with
latent_size = 1
and observation_size = 1
, following the transition model:
transition_matrix = [[1]]
transition_noise ~ N(loc = 0, scale = diag([level_scale]))
which implements the evolution of level
described above, and the observation model:
observation_matrix = [[1]]
observation_noise ~ N(loc = 0, scale = observation_noise_scale)
Value
an instance of LinearGaussianStateSpaceModel
.
See Also
Other sts:
sts_additive_state_space_model()
,
sts_autoregressive_state_space_model()
,
sts_autoregressive()
,
sts_constrained_seasonal_state_space_model()
,
sts_dynamic_linear_regression_state_space_model()
,
sts_dynamic_linear_regression()
,
sts_linear_regression()
,
sts_local_level()
,
sts_local_linear_trend_state_space_model()
,
sts_local_linear_trend()
,
sts_seasonal_state_space_model()
,
sts_seasonal()
,
sts_semi_local_linear_trend_state_space_model()
,
sts_semi_local_linear_trend()
,
sts_smooth_seasonal_state_space_model()
,
sts_smooth_seasonal()
,
sts_sparse_linear_regression()
,
sts_sum()
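Examples
An illustrative construction with made-up values; the latent state is one-dimensional, so the prior is over a length-1 vector (note list(1) rather than 1, so the event shape is [1]):
local_level_ssm <- sts_local_level_state_space_model(
  num_timesteps = 50,
  level_scale = 0.5,
  initial_state_prior = tfd_multivariate_normal_diag(scale_diag = list(1))
)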
Formal representation of a local linear trend model
Description
The local linear trend model posits a level
and slope
, each
evolving via a Gaussian random walk:
level[t] = level[t-1] + slope[t-1] + Normal(0., level_scale)
slope[t] = slope[t-1] + Normal(0., slope_scale)
Usage
sts_local_linear_trend(
observed_time_series = NULL,
level_scale_prior = NULL,
slope_scale_prior = NULL,
initial_level_prior = NULL,
initial_slope_prior = NULL,
name = NULL
)
Arguments
observed_time_series |
optional |
level_scale_prior |
optional |
slope_scale_prior |
optional |
initial_level_prior |
optional |
initial_slope_prior |
optional |
name |
the name of this model component. Default value: 'LocalLinearTrend'. |
Details
The latent state is the two-dimensional tuple [level, slope]
. At each
timestep we observe a noisy realization of the current level:
f[t] = level[t] + Normal(0., observation_noise_scale)
.
This model is appropriate for data where the trend direction and magnitude (latent
slope
) is consistent within short periods but may evolve over time.
Note that this model can produce very high uncertainty forecasts, as
uncertainty over the slope compounds quickly. If you expect your data to
have nonzero long-term trend, i.e. that slopes tend to revert to some mean,
then the SemiLocalLinearTrend
model may produce sharper forecasts.
Value
an instance of StructuralTimeSeries
.
See Also
For usage examples see sts_fit_with_hmc()
, sts_forecast()
, sts_decompose_by_component()
.
Other sts:
sts_additive_state_space_model()
,
sts_autoregressive_state_space_model()
,
sts_autoregressive()
,
sts_constrained_seasonal_state_space_model()
,
sts_dynamic_linear_regression_state_space_model()
,
sts_dynamic_linear_regression()
,
sts_linear_regression()
,
sts_local_level_state_space_model()
,
sts_local_level()
,
sts_local_linear_trend_state_space_model()
,
sts_seasonal_state_space_model()
,
sts_seasonal()
,
sts_semi_local_linear_trend_state_space_model()
,
sts_semi_local_linear_trend()
,
sts_smooth_seasonal_state_space_model()
,
sts_smooth_seasonal()
,
sts_sparse_linear_regression()
,
sts_sum()
State space model for a local linear trend
Description
A state space model (SSM) posits a set of latent (unobserved) variables that
evolve over time with dynamics specified by a probabilistic transition model
p(z[t+1] | z[t])
. At each timestep, we observe a value sampled from an
observation model conditioned on the current state, p(x[t] | z[t])
. The
special case where both the transition and observation models are Gaussians
with mean specified as a linear function of the inputs, is known as a linear
Gaussian state space model and supports tractable exact probabilistic
calculations; see tfd_linear_gaussian_state_space_model
for details.
Usage
sts_local_linear_trend_state_space_model(
num_timesteps,
level_scale,
slope_scale,
initial_state_prior,
observation_noise_scale = 0,
initial_step = 0,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = NULL
)
Arguments
num_timesteps |
Scalar |
level_scale |
Scalar (any additional dimensions are treated as batch
dimensions) |
slope_scale |
Scalar (any additional dimensions are treated as batch
dimensions) |
initial_state_prior |
instance of |
observation_noise_scale |
Scalar (any additional dimensions are
treated as batch dimensions) |
initial_step |
Optional scalar |
validate_args |
|
allow_nan_stats |
|
name |
string prefixed to ops created by this class. Default value: "LocalLinearTrendStateSpaceModel". |
Details
The local linear trend model is a special case of a linear Gaussian SSM, in
which the latent state posits a level
and slope
, each evolving via a
Gaussian random walk:
level[t] = level[t-1] + slope[t-1] + Normal(0., level_scale)
slope[t] = slope[t-1] + Normal(0., slope_scale)
The latent state is the two-dimensional tuple [level, slope]
. The
level
is observed at each timestep.
The parameters level_scale
, slope_scale
, and observation_noise_scale
are each (a batch of) scalars. The batch shape of this Distribution
is the
broadcast batch shape of these parameters and of the initial_state_prior
.
Mathematical Details
The linear trend model implements a tfd_linear_gaussian_state_space_model
with latent_size = 2
and observation_size = 1
, following the transition model:
transition_matrix = [[1., 1.]
                     [0., 1.]]
transition_noise ~ N(loc = 0, scale = diag([level_scale, slope_scale]))
which implements the evolution of [level, slope]
described above, and the observation model:
observation_matrix = [[1., 0.]]
observation_noise ~ N(loc = 0, scale = observation_noise_scale)
which picks out the first latent component, i.e., the level
, as the
observation at each timestep.
Value
an instance of LinearGaussianStateSpaceModel
.
See Also
Other sts:
sts_additive_state_space_model()
,
sts_autoregressive_state_space_model()
,
sts_autoregressive()
,
sts_constrained_seasonal_state_space_model()
,
sts_dynamic_linear_regression_state_space_model()
,
sts_dynamic_linear_regression()
,
sts_linear_regression()
,
sts_local_level_state_space_model()
,
sts_local_level()
,
sts_local_linear_trend()
,
sts_seasonal_state_space_model()
,
sts_seasonal()
,
sts_semi_local_linear_trend_state_space_model()
,
sts_semi_local_linear_trend()
,
sts_smooth_seasonal_state_space_model()
,
sts_smooth_seasonal()
,
sts_sparse_linear_regression()
,
sts_sum()
Compute one-step-ahead predictive distributions for all timesteps
Description
Given samples from the posterior over parameters, return the predictive
distribution over observations at each time T
, given observations up
through time T-1
.
Usage
sts_one_step_predictive(
observed_time_series,
model,
parameter_samples,
timesteps_are_event_shape = TRUE
)
Arguments
observed_time_series |
|
model |
An instance of |
parameter_samples |
|
timesteps_are_event_shape |
Deprecated, for backwards compatibility only. If FALSE, the predictive distribution will return per-timestep probabilities. Default value: TRUE. |
Value
forecast_dist a tfd_mixture_same_family
instance with event shape
list(num_timesteps)
and batch shape tf$concat(list(sample_shape, model$batch_shape))
, with
num_posterior_draws
mixture components. The t
th step represents the
forecast distribution p(observed_time_series[t] | observed_time_series[0:t-1], parameter_samples)
.
See Also
Other sts-functions:
sts_build_factored_surrogate_posterior()
,
sts_build_factored_variational_loss()
,
sts_decompose_by_component()
,
sts_decompose_forecast_by_component()
,
sts_fit_with_hmc()
,
sts_forecast()
,
sts_sample_uniform_initial_state()
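Examples
A sketch that reuses model and samples as defined in the sts_fit_with_hmc() example:
onestep_dist <- observed_time_series %>%
  sts_one_step_predictive(model, parameter_samples = samples)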
Initialize from a uniform [-2, 2]
distribution in unconstrained space.
Description
Initialize from a uniform [-2, 2]
distribution in unconstrained space.
Usage
sts_sample_uniform_initial_state(
parameter,
return_constrained = TRUE,
init_sample_shape = list(),
seed = NULL
)
Arguments
parameter |
|
return_constrained |
if |
init_sample_shape |
|
seed |
integer to seed the random number generator. |
Value
uniform_initializer Tensor
of shape
concat([init_sample_shape, parameter.prior.batch_shape, transformed_event_shape])
, where
transformed_event_shape
is parameter.prior.event_shape
, if
return_constrained=TRUE
, and otherwise it is
parameter$bijector$inverse_event_shape(parameter$prior$event_shape)
.
See Also
Other sts-functions:
sts_build_factored_surrogate_posterior()
,
sts_build_factored_variational_loss()
,
sts_decompose_by_component()
,
sts_decompose_forecast_by_component()
,
sts_fit_with_hmc()
,
sts_forecast()
,
sts_one_step_predictive()
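Examples
A sketch assuming model is a StructuralTimeSeries model as in the earlier examples; it draws five unconstrained-uniform initializations for the model's first parameter:
inits <- sts_sample_uniform_initial_state(
  parameter = model$parameters[[1]],
  init_sample_shape = list(5)
)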
Formal representation of a seasonal effect model.
Description
A seasonal effect model posits a fixed set of recurring, discrete 'seasons', each of which is active for a fixed number of timesteps and, while active, contributes a different effect to the time series. These are generally not meteorological seasons, but represent regular recurring patterns such as hour-of-day or day-of-week effects. Each season lasts for a fixed number of timesteps. The effect of each season drifts from one occurrence to the next following a Gaussian random walk:
Usage
sts_seasonal(
observed_time_series = NULL,
num_seasons,
num_steps_per_season = 1,
drift_scale_prior = NULL,
initial_effect_prior = NULL,
constrain_mean_effect_to_zero = TRUE,
name = NULL
)
Arguments
observed_time_series |
optional |
num_seasons |
Scalar |
num_steps_per_season |
|
drift_scale_prior |
optional |
initial_effect_prior |
optional |
constrain_mean_effect_to_zero |
if |
name |
the name of this model component. Default value: 'Seasonal'. |
Details
effects[season, occurrence[i]] = (effects[season, occurrence[i-1]] + Normal(loc=0., scale=drift_scale))
The drift_scale
parameter governs the standard deviation of the random walk;
for example, in a day-of-week model it governs the change in effect from this
Monday to next Monday.
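For instance, a day-of-week effect on daily data could be specified as follows (a sketch; observed_time_series is assumed to hold one observation per day):
day_of_week <- sts_seasonal(
  observed_time_series = observed_time_series,
  num_seasons = 7,
  name = "day_of_week_effect"
)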
Value
an instance of StructuralTimeSeries
.
See Also
For usage examples see sts_fit_with_hmc()
, sts_forecast()
, sts_decompose_by_component()
.
Other sts:
sts_additive_state_space_model()
,
sts_autoregressive_state_space_model()
,
sts_autoregressive()
,
sts_constrained_seasonal_state_space_model()
,
sts_dynamic_linear_regression_state_space_model()
,
sts_dynamic_linear_regression()
,
sts_linear_regression()
,
sts_local_level_state_space_model()
,
sts_local_level()
,
sts_local_linear_trend_state_space_model()
,
sts_local_linear_trend()
,
sts_seasonal_state_space_model()
,
sts_semi_local_linear_trend_state_space_model()
,
sts_semi_local_linear_trend()
,
sts_smooth_seasonal_state_space_model()
,
sts_smooth_seasonal()
,
sts_sparse_linear_regression()
,
sts_sum()
State space model for a seasonal effect.
Description
A state space model (SSM) posits a set of latent (unobserved) variables that
evolve over time with dynamics specified by a probabilistic transition model
p(z[t+1] | z[t])
. At each timestep, we observe a value sampled from an
observation model conditioned on the current state, p(x[t] | z[t])
. The
special case where both the transition and observation models are Gaussians
with mean specified as a linear function of the inputs, is known as a linear
Gaussian state space model and supports tractable exact probabilistic
calculations; see tfd_linear_gaussian_state_space_model
for
details.
Usage
sts_seasonal_state_space_model(
num_timesteps,
num_seasons,
drift_scale,
initial_state_prior,
observation_noise_scale = 0,
num_steps_per_season = 1,
initial_step = 0,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = NULL
)
Arguments
num_timesteps |
Scalar |
num_seasons |
Scalar |
drift_scale |
Scalar (any additional dimensions are treated as batch
dimensions) |
initial_state_prior |
instance of |
observation_noise_scale |
Scalar (any additional dimensions are
treated as batch dimensions) |
num_steps_per_season |
|
initial_step |
Optional scalar |
validate_args |
|
allow_nan_stats |
|
name |
string prefixed to ops created by this class. Default value: "SeasonalStateSpaceModel". |
Details
A seasonal effect model is a special case of a linear Gaussian SSM. The latent states represent an unknown effect from each of several 'seasons'; these are generally not meteorological seasons, but represent regular recurring patterns such as hour-of-day or day-of-week effects. The effect of each season drifts from one occurrence to the next, following a Gaussian random walk:
effects[season, occurrence[i]] = (effects[season, occurrence[i-1]] + Normal(loc=0., scale=drift_scale))
The latent state has dimension num_seasons
, containing one effect for each
seasonal component. The parameters drift_scale
and
observation_noise_scale
are each (a batch of) scalars. The batch shape of
this Distribution
is the broadcast batch shape of these parameters and of
the initial_state_prior
.
Note: there is no requirement that the effects sum to zero.
Mathematical Details
The seasonal effect model implements a tfd_linear_gaussian_state_space_model
with
latent_size = num_seasons
and observation_size = 1
. The latent state
is organized so that the current seasonal effect is always in the first
(zeroth) dimension. The transition model rotates the latent state to shift
to a new effect at the end of each season:
transition_matrix[t] = (permutation_matrix([1, 2, ..., num_seasons-1, 0])
                        if season_is_changing(t)
                        else eye(num_seasons))
transition_noise[t] ~ Normal(loc=0., scale_diag=(
    [drift_scale, 0, ..., 0] if season_is_changing(t)
    else [0, 0, ..., 0]))
where season_is_changing(t)
is True
if t `mod` sum(num_steps_per_season)
is in
the set of final days for each season, given by cumsum(num_steps_per_season) - 1
.
The observation model always picks out the effect for the current season, i.e.,
the first element of the latent state:
observation_matrix = [[1., 0., ..., 0.]]
observation_noise ~ Normal(loc=0, scale=observation_noise_scale)
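A sketch of constructing and sampling from this distribution directly (the weekly sizes and the diagonal-normal initial prior are illustrative assumptions):
ssm <- sts_seasonal_state_space_model(
  num_timesteps = 30,
  num_seasons = 7,
  drift_scale = 0.1,
  initial_state_prior = tfd_multivariate_normal_diag(scale_diag = rep(1, 7)),
  observation_noise_scale = 0.5
)
y <- ssm %>% tfd_sample()      # shape [30, 1]
lp <- ssm %>% tfd_log_prob(y)  # scalar log-density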
Value
an instance of LinearGaussianStateSpaceModel
.
See Also
Other sts:
sts_additive_state_space_model()
,
sts_autoregressive_state_space_model()
,
sts_autoregressive()
,
sts_constrained_seasonal_state_space_model()
,
sts_dynamic_linear_regression_state_space_model()
,
sts_dynamic_linear_regression()
,
sts_linear_regression()
,
sts_local_level_state_space_model()
,
sts_local_level()
,
sts_local_linear_trend_state_space_model()
,
sts_local_linear_trend()
,
sts_seasonal()
,
sts_semi_local_linear_trend_state_space_model()
,
sts_semi_local_linear_trend()
,
sts_smooth_seasonal_state_space_model()
,
sts_smooth_seasonal()
,
sts_sparse_linear_regression()
,
sts_sum()
Formal representation of a semi-local linear trend model.
Description
Like the sts_local_linear_trend
model, a semi-local linear trend posits a
latent level
and slope
, with the level component updated according to
the current slope plus a random walk:
Usage
sts_semi_local_linear_trend(
observed_time_series = NULL,
level_scale_prior = NULL,
slope_mean_prior = NULL,
slope_scale_prior = NULL,
autoregressive_coef_prior = NULL,
initial_level_prior = NULL,
initial_slope_prior = NULL,
constrain_ar_coef_stationary = TRUE,
constrain_ar_coef_positive = FALSE,
name = NULL
)
Arguments
observed_time_series |
optional |
level_scale_prior |
optional |
slope_mean_prior |
optional |
slope_scale_prior |
optional |
autoregressive_coef_prior |
optional |
initial_level_prior |
optional |
initial_slope_prior |
optional |
constrain_ar_coef_stationary |
if |
constrain_ar_coef_positive |
if |
name |
the name of this model component. Default value: 'SemiLocalLinearTrend'. |
Details
level[t] = level[t-1] + slope[t-1] + Normal(0., level_scale)
The slope component in a sts_semi_local_linear_trend
model evolves according to
a first-order autoregressive (AR1) process with potentially nonzero mean:
slope[t] = (slope_mean + autoregressive_coef * (slope[t-1] - slope_mean) + Normal(0., slope_scale))
Unlike the random walk used in LocalLinearTrend
, a stationary
AR1 process (coefficient in (-1, 1)
) maintains bounded variance over time,
so a SemiLocalLinearTrend
model will often produce more reasonable
uncertainties when forecasting over long timescales.
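A sketch of adding this component to a model, with priors left at their defaults (which are heuristically derived from the observed series):
trend <- sts_semi_local_linear_trend(
  observed_time_series = observed_time_series,
  constrain_ar_coef_positive = TRUE,  # optionally insist on a positive AR coefficient
  name = "trend"
)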
Value
an instance of StructuralTimeSeries
.
See Also
For usage examples see sts_fit_with_hmc()
, sts_forecast()
, sts_decompose_by_component()
.
Other sts:
sts_additive_state_space_model()
,
sts_autoregressive_state_space_model()
,
sts_autoregressive()
,
sts_constrained_seasonal_state_space_model()
,
sts_dynamic_linear_regression_state_space_model()
,
sts_dynamic_linear_regression()
,
sts_linear_regression()
,
sts_local_level_state_space_model()
,
sts_local_level()
,
sts_local_linear_trend_state_space_model()
,
sts_local_linear_trend()
,
sts_seasonal_state_space_model()
,
sts_seasonal()
,
sts_semi_local_linear_trend_state_space_model()
,
sts_smooth_seasonal_state_space_model()
,
sts_smooth_seasonal()
,
sts_sparse_linear_regression()
,
sts_sum()
State space model for a semi-local linear trend.
Description
A state space model (SSM) posits a set of latent (unobserved) variables that
evolve over time with dynamics specified by a probabilistic transition model
p(z[t+1] | z[t])
. At each timestep, we observe a value sampled from an
observation model conditioned on the current state, p(x[t] | z[t])
. The
special case where both the transition and observation models are Gaussians
with mean specified as a linear function of the inputs, is known as a linear
Gaussian state space model and supports tractable exact probabilistic
calculations; see tfd_linear_gaussian_state_space_model
for details.
Usage
sts_semi_local_linear_trend_state_space_model(
num_timesteps,
level_scale,
slope_mean,
slope_scale,
autoregressive_coef,
initial_state_prior,
observation_noise_scale = 0,
initial_step = 0,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = NULL
)
Arguments
num_timesteps |
Scalar |
level_scale |
Scalar (any additional dimensions are treated as batch
dimensions) |
slope_mean |
Scalar (any additional dimensions are treated as batch
dimensions) |
slope_scale |
Scalar (any additional dimensions are treated as batch
dimensions) |
autoregressive_coef |
Scalar (any additional dimensions are treated as
batch dimensions) |
initial_state_prior |
instance of |
observation_noise_scale |
Scalar (any additional dimensions are
treated as batch dimensions) |
initial_step |
Optional scalar |
validate_args |
|
allow_nan_stats |
|
name |
string prefixed to ops created by this class. Default value: "SemiLocalLinearTrendStateSpaceModel". |
Details
The semi-local linear trend model is a special case of a linear Gaussian
SSM, in which the latent state posits a level
and slope
. The level
evolves via a Gaussian random walk centered at the current slope
, while
the slope
follows a first-order autoregressive (AR1) process with
mean slope_mean
:
level[t] = level[t-1] + slope[t-1] + Normal(0., level_scale)
slope[t] = (slope_mean + autoregressive_coef * (slope[t-1] - slope_mean) +
            Normal(0., slope_scale))
The latent state is the two-dimensional tuple [level, slope]
. The
level
is observed at each timestep.
The parameters level_scale
, slope_mean
, slope_scale
,
autoregressive_coef
, and observation_noise_scale
are each (a batch of)
scalars. The batch shape of this Distribution
is the broadcast batch shape
of these parameters and of the initial_state_prior
.
Mathematical Details
The semi-local linear trend model implements a
tfp.distributions.LinearGaussianStateSpaceModel
with latent_size = 2
and observation_size = 1
, following the transition model:
transition_matrix = [[1., 1.],
                     [0., autoregressive_coef]]
transition_noise ~ N(loc=slope_mean - autoregressive_coef * slope_mean,
                     scale=diag([level_scale, slope_scale]))
which implements the evolution of [level, slope]
described above, and
the observation model:
observation_matrix = [[1., 0.]]
observation_noise ~ N(loc=0, scale=observation_noise_scale)
which picks out the first latent component, i.e., the level
, as the
observation at each timestep.
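A sketch of building the state space model directly (parameter values are illustrative):
ssm <- sts_semi_local_linear_trend_state_space_model(
  num_timesteps = 50,
  level_scale = 0.5,
  slope_mean = 0.2,
  slope_scale = 0.5,
  autoregressive_coef = 0.9,
  initial_state_prior = tfd_multivariate_normal_diag(scale_diag = c(1, 1))
)
y <- ssm %>% tfd_sample(5)  # five sampled trajectories, shape [5, 50, 1]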
Value
an instance of LinearGaussianStateSpaceModel
.
See Also
Other sts:
sts_additive_state_space_model()
,
sts_autoregressive_state_space_model()
,
sts_autoregressive()
,
sts_constrained_seasonal_state_space_model()
,
sts_dynamic_linear_regression_state_space_model()
,
sts_dynamic_linear_regression()
,
sts_linear_regression()
,
sts_local_level_state_space_model()
,
sts_local_level()
,
sts_local_linear_trend_state_space_model()
,
sts_local_linear_trend()
,
sts_seasonal_state_space_model()
,
sts_seasonal()
,
sts_semi_local_linear_trend()
,
sts_smooth_seasonal_state_space_model()
,
sts_smooth_seasonal()
,
sts_sparse_linear_regression()
,
sts_sum()
Formal representation of a smooth seasonal effect model
Description
The smooth seasonal model uses a set of trigonometric terms in order to
capture a recurring pattern whereby adjacent (in time) effects are
similar. The model uses frequencies
calculated via:
Usage
sts_smooth_seasonal(
period,
frequency_multipliers,
allow_drift = TRUE,
drift_scale_prior = NULL,
initial_state_prior = NULL,
observed_time_series = NULL,
name = NULL
)
Arguments
period |
positive scalar |
frequency_multipliers |
One-dimensional |
allow_drift |
optional |
drift_scale_prior |
optional |
initial_state_prior |
instance of |
observed_time_series |
optional |
name |
the name of this model component. Default value: 'SmoothSeasonal'. |
Details
frequencies[j] = 2. * pi * frequency_multipliers[j] / period
and then posits two latent states for each frequency
. The two latent states
associated with frequency j
drift over time via:
effect[t] = (effect[t-1] * cos(frequencies[j]) +
             auxiliary[t-1] * sin(frequencies[j]) +
             Normal(0., drift_scale))
auxiliary[t] = (-effect[t-1] * sin(frequencies[j]) +
                auxiliary[t-1] * cos(frequencies[j]) +
                Normal(0., drift_scale))
where effect
is the smooth seasonal effect and auxiliary
only appears as a
matter of construction. The interpretation of auxiliary
is thus not
particularly important.
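For example, an hour-of-day effect on hourly data might keep only a few low-order harmonics of a 24-step period (a sketch; the multipliers are illustrative):
hour_of_day <- sts_smooth_seasonal(
  period = 24,
  frequency_multipliers = c(1, 2, 4),  # fundamental plus two harmonics
  observed_time_series = observed_time_series
)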
Value
an instance of StructuralTimeSeries
.
See Also
For usage examples see sts_fit_with_hmc()
, sts_forecast()
, sts_decompose_by_component()
.
Other sts:
sts_additive_state_space_model()
,
sts_autoregressive_state_space_model()
,
sts_autoregressive()
,
sts_constrained_seasonal_state_space_model()
,
sts_dynamic_linear_regression_state_space_model()
,
sts_dynamic_linear_regression()
,
sts_linear_regression()
,
sts_local_level_state_space_model()
,
sts_local_level()
,
sts_local_linear_trend_state_space_model()
,
sts_local_linear_trend()
,
sts_seasonal_state_space_model()
,
sts_seasonal()
,
sts_semi_local_linear_trend_state_space_model()
,
sts_semi_local_linear_trend()
,
sts_smooth_seasonal_state_space_model()
,
sts_sparse_linear_regression()
,
sts_sum()
State space model for a smooth seasonal effect
Description
A state space model (SSM) posits a set of latent (unobserved) variables that
evolve over time with dynamics specified by a probabilistic transition model
p(z[t+1] | z[t])
. At each timestep, we observe a value sampled from an
observation model conditioned on the current state, p(x[t] | z[t])
. The
special case where both the transition and observation models are Gaussians
with mean specified as a linear function of the inputs, is known as a linear
Gaussian state space model and supports tractable exact probabilistic
calculations; see tfp$distributions$LinearGaussianStateSpaceModel
for
details.
A smooth seasonal effect model is a special case of a linear Gaussian SSM. It
is the sum of a set of "cyclic" components, with one component for each
frequency:
frequencies[j] = 2. * pi * frequency_multipliers[j] / period
Each cyclic component contains two latent states which we denote effect
and
auxiliary
. The two latent states for component j
drift over time via:
effect[t] = (effect[t-1] * cos(frequencies[j]) +
             auxiliary[t-1] * sin(frequencies[j]) +
             Normal(0., drift_scale))
auxiliary[t] = (-effect[t-1] * sin(frequencies[j]) +
                auxiliary[t-1] * cos(frequencies[j]) +
                Normal(0., drift_scale))
Usage
sts_smooth_seasonal_state_space_model(
num_timesteps,
period,
frequency_multipliers,
drift_scale,
initial_state_prior,
observation_noise_scale = 0,
initial_step = 0,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = NULL
)
Arguments
num_timesteps |
Scalar |
period |
positive scalar |
frequency_multipliers |
One-dimensional |
drift_scale |
Scalar (any additional dimensions are treated as batch
dimensions) |
initial_state_prior |
instance of |
observation_noise_scale |
Scalar (any additional dimensions are
treated as batch dimensions) |
initial_step |
scalar |
validate_args |
|
allow_nan_stats |
|
name |
string prefixed to ops created by this class. Default value: "SmoothSeasonalStateSpaceModel". |
Details
The auxiliary
latent state only appears as a matter of construction and thus
its interpretation is not particularly important. The total smooth seasonal
effect is the sum of the effect
values from each of the cyclic components.
The parameters drift_scale
and observation_noise_scale
are each (a batch
of) scalars. The batch shape of this Distribution
is the broadcast batch
shape of these parameters and of the initial_state_prior
.
Mathematical Details
The smooth seasonal effect model implements a
tfp$distributions$LinearGaussianStateSpaceModel
with
latent_size = 2 * len(frequency_multipliers)
and observation_size = 1
.
The latent state is the concatenation of the cyclic latent states which themselves
comprise an effect
and an auxiliary
state. The transition matrix is a block diagonal
matrix where block j
is:
transition_matrix[j] = [[ cos(frequencies[j]), sin(frequencies[j])],
                        [-sin(frequencies[j]), cos(frequencies[j])]]
The observation model picks out the cyclic effect
values from the latent state:
observation_matrix = [[1., 0., 1., 0., ..., 1., 0.]]
observation_noise ~ Normal(loc=0, scale=observation_noise_scale)
For further mathematical details please see Harvey (1990).
Value
an instance of LinearGaussianStateSpaceModel
.
References
Harvey, A. Forecasting, Structural Time Series Models and the Kalman Filter. Cambridge: Cambridge University Press, 1990.
See Also
Other sts:
sts_additive_state_space_model()
,
sts_autoregressive_state_space_model()
,
sts_autoregressive()
,
sts_constrained_seasonal_state_space_model()
,
sts_dynamic_linear_regression_state_space_model()
,
sts_dynamic_linear_regression()
,
sts_linear_regression()
,
sts_local_level_state_space_model()
,
sts_local_level()
,
sts_local_linear_trend_state_space_model()
,
sts_local_linear_trend()
,
sts_seasonal_state_space_model()
,
sts_seasonal()
,
sts_semi_local_linear_trend_state_space_model()
,
sts_semi_local_linear_trend()
,
sts_smooth_seasonal()
,
sts_sparse_linear_regression()
,
sts_sum()
Formal representation of a sparse linear regression.
Description
This model defines a time series given by a sparse linear combination of covariate time series provided in a design matrix:
Usage
sts_sparse_linear_regression(
design_matrix,
weights_prior_scale = 0.1,
weights_batch_shape = NULL,
name = NULL
)
Arguments
design_matrix |
float |
weights_prior_scale |
float |
weights_batch_shape |
if |
name |
the name of this model component. Default value: 'SparseLinearRegression'. |
Details
observed_time_series <- tf$matmul(design_matrix, weights)
This is identical to sts_linear_regression
, except that
sts_sparse_linear_regression
uses a parameterization of a Horseshoe
prior to encode the assumption that many of the weights
are zero,
i.e., many of the covariate time series are irrelevant. See the mathematical
details section below for further discussion. The prior parameterization used
by sts_sparse_linear_regression
is more suitable for inference than that
obtained by simply passing the equivalent tfd_horseshoe
prior to
sts_linear_regression
; when sparsity is desired, sts_sparse_linear_regression
will
likely yield better results.
This component does not itself include observation noise; it defines a
deterministic distribution with mass at the point
tf$matmul(design_matrix, weights)
. In practice, it should be combined with
observation noise from another component such as sts_sum
.
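A sketch of the combination described above (design_matrix is an assumed [num_timesteps, num_features] matrix of covariates):
regression <- sts_sparse_linear_regression(
  design_matrix = design_matrix,
  weights_prior_scale = 0.1
)
model <- sts_sum(
  observed_time_series = observed_time_series,
  components = list(regression, sts_local_level(observed_time_series))
)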
Mathematical Details
The basic horseshoe prior Carvalho et al. (2009) is defined as a Cauchy-normal scale mixture:
scales[i] ~ HalfCauchy(loc=0, scale=1)
weights[i] ~ Normal(loc=0., scale=scales[i] * global_scale)
The Cauchy scale parameters put substantial mass near zero, encouraging
weights to be sparse, but their heavy tails allow weights far from zero to be
estimated without excessive shrinkage. The horseshoe can be thought of as a
continuous relaxation of a traditional 'spike-and-slab' discrete sparsity
prior, in which the latent Cauchy scale mixes between 'spike'
(scales[i] ~= 0
) and 'slab' (scales[i] >> 0
) regimes.
Following the recommendations in Piironen et al. (2017), SparseLinearRegression implements a horseshoe with the following adaptations:
- The Cauchy prior on scales[i] is represented as an InverseGamma-Normal compound.
- The global_scale parameter is integrated out following a Cauchy(0., scale=weights_prior_scale) hyperprior, which is also represented as an InverseGamma-Normal compound.
- All compound distributions are implemented using a non-centered parameterization.
The compound, non-centered representation defines the same marginal prior as the original horseshoe (up to integrating out the global scale), but allows samplers to mix more efficiently through the heavy tails; for variational inference, the compound representation implicitly expands the representational power of the variational model.
Note that we do not yet implement the regularized ('Finnish') horseshoe, proposed in Piironen et al. (2017) for models with weak likelihoods, because the likelihood in STS models is typically Gaussian, where it's not clear that additional regularization is appropriate. If you need this functionality, please email tfprobability@tensorflow.org.
The full prior parameterization implemented in SparseLinearRegression
is
as follows:
# Sample global_scale from Cauchy(0, scale=weights_prior_scale).
global_scale_variance ~ InverseGamma(alpha=0.5, beta=0.5)
global_scale_noncentered ~ HalfNormal(loc=0, scale=1)
global_scale = (global_scale_noncentered * sqrt(global_scale_variance) * weights_prior_scale)
# Sample local_scales from Cauchy(0, 1).
local_scale_variances[i] ~ InverseGamma(alpha=0.5, beta=0.5)
local_scales_noncentered[i] ~ HalfNormal(loc=0, scale=1)
local_scales[i] = local_scales_noncentered[i] * sqrt(local_scale_variances[i])
weights[i] ~ Normal(loc=0., scale=local_scales[i] * global_scale)
Value
an instance of StructuralTimeSeries
.
See Also
For usage examples see sts_fit_with_hmc()
, sts_forecast()
, sts_decompose_by_component()
.
Other sts:
sts_additive_state_space_model()
,
sts_autoregressive_state_space_model()
,
sts_autoregressive()
,
sts_constrained_seasonal_state_space_model()
,
sts_dynamic_linear_regression_state_space_model()
,
sts_dynamic_linear_regression()
,
sts_linear_regression()
,
sts_local_level_state_space_model()
,
sts_local_level()
,
sts_local_linear_trend_state_space_model()
,
sts_local_linear_trend()
,
sts_seasonal_state_space_model()
,
sts_seasonal()
,
sts_semi_local_linear_trend_state_space_model()
,
sts_semi_local_linear_trend()
,
sts_smooth_seasonal_state_space_model()
,
sts_smooth_seasonal()
,
sts_sum()
Sum of structural time series components.
Description
This class enables compositional specification of a structural time series model from basic components. Given a list of component models, it represents an additive model, i.e., a model of time series that may be decomposed into a sum of terms corresponding to the component models.
Usage
sts_sum(
observed_time_series = NULL,
components,
constant_offset = NULL,
observation_noise_scale_prior = NULL,
name = NULL
)
Arguments
observed_time_series |
optional |
components |
|
constant_offset |
optional scalar |
observation_noise_scale_prior |
optional |
name |
string name of this model component; used as |
Details
Formally, the additive model represents a random process
g[t] = f1[t] + f2[t] + ... + fN[t] + eps[t]
, where the f
's are the
random processes represented by the components, and
eps[t] ~ Normal(loc=0, scale=observation_noise_scale)
is an observation
noise term. See the AdditiveStateSpaceModel
documentation for mathematical details.
This model inherits the parameters (with priors) of its components, and
adds an observation_noise_scale
parameter governing the level of noise in
the observed time series.
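A sketch of composing a model from a trend and a seasonal component:
model <- sts_sum(
  observed_time_series = observed_time_series,
  components = list(
    sts_local_linear_trend(observed_time_series, name = "trend"),
    sts_seasonal(observed_time_series, num_seasons = 7, name = "day_of_week")
  )
)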
Value
an instance of StructuralTimeSeries
.
See Also
For usage examples see sts_fit_with_hmc()
, sts_forecast()
, sts_decompose_by_component()
.
Other sts:
sts_additive_state_space_model()
,
sts_autoregressive_state_space_model()
,
sts_autoregressive()
,
sts_constrained_seasonal_state_space_model()
,
sts_dynamic_linear_regression_state_space_model()
,
sts_dynamic_linear_regression()
,
sts_linear_regression()
,
sts_local_level_state_space_model()
,
sts_local_level()
,
sts_local_linear_trend_state_space_model()
,
sts_local_linear_trend()
,
sts_seasonal_state_space_model()
,
sts_seasonal()
,
sts_semi_local_linear_trend_state_space_model()
,
sts_semi_local_linear_trend()
,
sts_smooth_seasonal_state_space_model()
,
sts_smooth_seasonal()
,
sts_sparse_linear_regression()
Computes Y = g(X) = Abs(X), element-wise
Description
This non-injective bijector allows for transformations of scalar distributions
with the absolute value function, which maps (-inf, inf)
to [0, inf)
.
- For y in (0, inf), tfb_absolute_value$inverse(y) returns the set inverse {x in (-inf, inf) : |x| = y} as a tuple, (-y, y).
- tfb_absolute_value$inverse(0) returns (0, 0), which is not the set inverse (the set inverse is the singleton {0}), but "works" in conjunction with TransformedDistribution to produce a left semi-continuous pdf.
- For y < 0, tfb_absolute_value$inverse(y) happily returns the wrong thing, (-y, y). This is done for efficiency. If validate_args == TRUE, y < 0 will raise an exception.
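A quick sketch (results noted in comments):
b <- tfb_absolute_value()
b %>% tfb_forward(c(-1, 2))  # 1 2
b %>% tfb_inverse(2)         # the pair (-2, 2)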
Usage
tfb_absolute_value(validate_args = FALSE, name = "absolute_value")
Arguments
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Affine bijector
Description
This Bijector is initialized with shift Tensor and scale arguments,
giving the forward operation: Y = g(X) = scale @ X + shift
where the scale term is logically equivalent to:
scale = (scale_identity_multiplier * tf.diag(tf.ones(d)) +
         tf.diag(scale_diag) +
         scale_tril +
         scale_perturb_factor @ diag(scale_perturb_diag) @
           tf.transpose([scale_perturb_factor]))
Usage
tfb_affine(
shift = NULL,
scale_identity_multiplier = NULL,
scale_diag = NULL,
scale_tril = NULL,
scale_perturb_factor = NULL,
scale_perturb_diag = NULL,
adjoint = FALSE,
validate_args = FALSE,
name = "affine",
dtype = NULL
)
Arguments
shift |
Floating-point Tensor. If this is set to NULL, no shift is applied. |
scale_identity_multiplier |
floating point rank 0 Tensor representing a scaling done
to the identity matrix. When |
scale_diag |
Floating-point Tensor representing the diagonal matrix.
|
scale_tril |
Floating-point Tensor representing the lower triangular matrix.
|
scale_perturb_factor |
Floating-point Tensor representing factor matrix with last
two dimensions of shape |
scale_perturb_diag |
Floating-point Tensor representing the diagonal matrix.
|
adjoint |
Logical indicating whether to use the scale matrix as specified or its adjoint. Default value: FALSE. |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
dtype |
|
Details
If none of scale_identity_multiplier
, scale_diag
, or scale_tril
are specified then
scale += IdentityMatrix
Otherwise specifying a scale argument has the semantics of
scale += Expand(arg)
, i.e., scale_diag != NULL
means scale += tf$diag(scale_diag)
.
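A sketch using just a shift and a diagonal scale:
b <- tfb_affine(shift = c(1, 2), scale_diag = c(2, 3))
b %>% tfb_forward(c(0, 1))  # diag(c(2, 3)) %*% c(0, 1) + c(1, 2) = (1, 5)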
Value
a bijector instance.
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Computes Y = g(X; shift, scale) = scale @ X + shift
Description
shift
is a numeric Tensor and scale is a LinearOperator.
If X
is a scalar then the forward transformation is: scale * X + shift
where *
denotes broadcasted elementwise product.
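A sketch with a diagonal LinearOperator from core TensorFlow (tf$linalg$LinearOperatorDiag):
library(tensorflow)
diag_op <- tf$linalg$LinearOperatorDiag(c(1, 2, 3))
b <- tfb_affine_linear_operator(shift = c(-1, 0, 1), scale = diag_op)
b %>% tfb_forward(c(1, 1, 1))  # (1 - 1, 2 + 0, 3 + 1) = (0, 2, 4)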
Usage
tfb_affine_linear_operator(
shift = NULL,
scale = NULL,
adjoint = FALSE,
validate_args = FALSE,
name = "affine_linear_operator"
)
Arguments
shift |
Floating-point Tensor. |
scale |
Subclass of LinearOperator. Represents the (batch) positive definite matrix |
adjoint |
Logical indicating whether to use the scale matrix as specified or its adjoint. Default value: FALSE. |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
AffineScalar bijector (Deprecated)
Description
This Bijector is initialized with shift Tensor and scale arguments, giving the forward operation:
Y = g(X) = scale * X + shift
If scale
is not specified, then the bijector has the semantics of scale = 1.
Similarly, if shift
is not specified, then the bijector has the semantics of shift = 0.
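A scalar sketch (note the deprecation; tfb_shift() and tfb_scale() are the suggested replacements):
b <- tfb_affine_scalar(shift = 1, scale = 2)
b %>% tfb_forward(3)  # 2 * 3 + 1 = 7
b %>% tfb_inverse(7)  # 3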
Usage
tfb_affine_scalar(
shift = NULL,
scale = NULL,
validate_args = FALSE,
name = "affine_scalar"
)
Arguments
shift |
Floating-point Tensor. If this is set to NULL, no shift is applied. |
scale |
Floating-point Tensor. If this is set to NULL, no scale is applied. |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Maps unconstrained R^n to R^n in ascending order.
Description
Both the domain and the codomain of the mapping is [-inf, inf]^n
, however,
the input of the inverse mapping must be strictly increasing.
On the last dimension of the tensor, the Ascending bijector performs:
y = tf$cumsum([x[0], tf$exp(x[1]), tf$exp(x[2]), ..., tf$exp(x[-1])])
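A quick sketch of the forward and inverse maps:
b <- tfb_ascending()
b %>% tfb_forward(c(2, 0, 0))  # cumsum(c(2, exp(0), exp(0))) = (2, 3, 4)
b %>% tfb_inverse(c(2, 3, 4))  # recovers (2, 0, 0)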
Usage
tfb_ascending(validate_args = FALSE, name = "ascending")
Arguments
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Computes Y = g(X) s.t. X = g^-1(Y) = (Y - mean(Y)) / std(Y)
Description
Applies Batch Normalization (Ioffe and Szegedy, 2015) to samples from a data distribution. This can be used to stabilize training of normalizing flows (Papamakarios et al., 2016; Dinh et al., 2017)
Usage
tfb_batch_normalization(
batchnorm_layer = NULL,
training = TRUE,
validate_args = FALSE,
name = "batch_normalization"
)
Arguments
batchnorm_layer |
|
training |
If TRUE, updates running-average statistics during call to inverse(). |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Details
When training Deep Neural Networks (DNNs), it is common practice to normalize or whiten features by shifting them to have zero mean and scaling them to have unit variance.
The inverse()
method of the BatchNormalization bijector, which is used in
the log-likelihood computation of data samples, implements the normalization
procedure (shift-and-scale) using the mean and standard deviation of the
current minibatch.
Conversely, the forward()
method of the bijector de-normalizes samples (e.g.
X*std(Y) + mean(Y))
with the running-average mean and standard deviation
computed at training time. De-normalization is useful for sampling.
During training time, BatchNormalization.inverse and BatchNormalization.forward are not
guaranteed to be inverses of each other because inverse(y)
uses statistics of the current minibatch,
while forward(x)
uses running-average statistics accumulated from training.
In other words, tfb_batch_normalization()$inverse(tfb_batch_normalization()$forward(...))
and
tfb_batch_normalization()$forward(tfb_batch_normalization()$inverse(...))
will be identical when
training=FALSE but may be different when training=TRUE.
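A sketch of the normalizing-flow pattern (the four-dimensional base distribution is an illustrative assumption):
flow <- tfd_transformed_distribution(
  distribution = tfd_multivariate_normal_diag(scale_diag = rep(1, 4)),
  bijector = tfb_batch_normalization()
)
# tfd_log_prob() calls the bijector's inverse(), i.e. normalizes its input
# lp <- flow %>% tfd_log_prob(y)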
Value
a bijector instance.
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Bijector which applies a list of bijectors to blocks of a Tensor
Description
More specifically, given [F_0, F_1, ... F_n]
which are scalar or vector
bijectors this bijector creates a transformation which operates on the vector
[x_0, ... x_n]
with the transformation [F_0(x_0), F_1(x_1) ..., F_n(x_n)]
where x_0, ..., x_n
are blocks (partitions) of the vector.
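A sketch splitting a length-3 vector into a two-element exp block and a one-element softplus block:
b <- tfb_blockwise(
  bijectors = list(tfb_exp(), tfb_softplus()),
  block_sizes = c(2, 1)
)
b %>% tfb_forward(c(0, 0, 0))  # (exp(0), exp(0), softplus(0)) = (1, 1, ~0.693)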
Usage
tfb_blockwise(
bijectors,
block_sizes = NULL,
validate_args = FALSE,
name = NULL
)
Arguments
bijectors |
A non-empty list of bijectors. |
block_sizes |
A 1-D integer Tensor with each element signifying the length of the block of the input vector to pass to the corresponding bijector. The length of block_sizes must be be equal to the length of bijectors. If left as NULL, a vector of 1's is used. |
validate_args |
Logical indicating whether arguments should be checked for correctness. |
name |
String, name given to ops managed by this object. Default:
E.g., |
Value
a bijector instance.
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Bijector which applies a sequence of bijectors
Description
Bijector which applies a sequence of bijectors
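A sketch; bijectors are applied right-to-left, so the chain below computes 2 * x + 1:
b <- tfb_chain(list(tfb_shift(1), tfb_scale(2)))
b %>% tfb_forward(3)  # scale first, then shift: 2 * 3 + 1 = 7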
Usage
tfb_chain(
bijectors = NULL,
validate_args = FALSE,
validate_event_size = TRUE,
parameters = NULL,
name = NULL
)
Arguments
bijectors |
list of bijector instances. An empty list makes this bijector equivalent to the Identity bijector. |
validate_args |
Logical indicating whether arguments should be checked for correctness. |
validate_event_size |
Checks that bijectors are not applied to inputs with
incomplete support (that is, inputs where one or more elements are a
deterministic transformation of the others). For example, the following
LDJ would be incorrect:
|
parameters |
Locals dict captured by subclass constructor, to be used for copy/slice re-instantiation operators. |
name |
String, name given to ops managed by this object. Default:
E.g., |
Value
a bijector instance.
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Computes g(X) = X @ X.T where X is a lower-triangular, positive-diagonal matrix
Description
Note: the upper-triangular part of X is ignored (whether or not it is zero).
Usage
tfb_cholesky_outer_product(
validate_args = FALSE,
name = "cholesky_outer_product"
)
Arguments
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Details
The surjectivity of g as a map from the set of n x n positive-diagonal
lower-triangular matrices to the set of SPD matrices follows immediately from
executing the Cholesky factorization algorithm on an SPD matrix A
to produce a
positive-diagonal lower-triangular matrix L
such that A = L @ L.T
.
To prove the injectivity of g, suppose that L_1
and L_2
are lower-triangular
with positive diagonals and satisfy A = L_1 @ L_1.T = L_2 @ L_2.T
. Then
inv(L_1) @ A @ inv(L_1).T = [inv(L_1) @ L_2] @ [inv(L_1) @ L_2].T = I
.
Setting L_3 := inv(L_1) @ L_2
, that L_3
is a positive-diagonal
lower-triangular matrix follows from inv(L_1)
being positive-diagonal
lower-triangular (which follows from the diagonal of a triangular matrix being
its spectrum), and that the product of two positive-diagonal lower-triangular
matrices is another positive-diagonal lower-triangular matrix.
A simple inductive argument (proceeding one column of L_3
at a time) shows
that, if I = L_3 @ L_3.T
, with L_3
being lower-triangular with positive diagonal, then L_3 = I
. Thus, L_1 = L_2
, proving injectivity of g.
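A numeric sketch of the forward map:
l <- matrix(c(1, 0,
              2, 1), nrow = 2, byrow = TRUE)  # lower-triangular, positive diagonal
b <- tfb_cholesky_outer_product()
b %>% tfb_forward(l)  # l %*% t(l) = [[1, 2], [2, 5]]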
Value
a bijector instance.
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Maps the Cholesky factor of M to the Cholesky factor of M^{-1}
Description
The forward and inverse calculations are conceptually identical to:
forward <- function(x) tf$cholesky(tf$linalg$inv(tf$matmul(x, x, adjoint_b=TRUE)))
inverse = forward
However, the actual calculations exploit the triangular structure of the matrices.
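A sketch checking the forward map against the conceptual definition above:
l <- matrix(c(2, 0,
              1, 1), nrow = 2, byrow = TRUE)
b <- tfb_cholesky_to_inv_cholesky()
b %>% tfb_forward(l)  # the Cholesky factor of solve(l %*% t(l))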
Usage
tfb_cholesky_to_inv_cholesky(
validate_args = FALSE,
name = "cholesky_to_inv_cholesky"
)
Arguments
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Maps unconstrained reals to Cholesky-space correlation matrices.
Description
This bijector is a mapping between R^{n}
and the n
-dimensional manifold of
Cholesky-space correlation matrices embedded in R^{m^2}
, where n
is the
(m - 1)
th triangular number; i.e. n = 1 + 2 + ... + (m - 1)
.
Usage
tfb_correlation_cholesky(validate_args = FALSE, name = "correlation_cholesky")
Arguments
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The image of unconstrained reals under the CorrelationCholesky
bijector is
the set of correlation matrices which are positive definite.
A correlation matrix
can be characterized as a symmetric positive semidefinite matrix with 1s on
the main diagonal. However, the correlation matrix is positive definite if no
component can be expressed as a linear combination of the other components.
For a lower triangular matrix L
to be a valid Cholesky-factor of a positive
definite correlation matrix, it is necessary and sufficient that each row of
L
have unit Euclidean norm. To see this, observe that if L_i
is the
i
th row of the Cholesky factor corresponding to the correlation matrix R
,
then the i
th diagonal entry of R
satisfies:
1 = R_i,i = L_i . L_i = ||L_i||^2
where '.' is the dot product of vectors and ||...||
denotes the Euclidean
norm. Furthermore, observe that R_i,j
lies in the interval [-1, 1]
. By the
Cauchy-Schwarz inequality:
|R_i,j| = |L_i . L_j| <= ||L_i|| ||L_j|| = 1
This is a consequence of the fact that R
is symmetric positive definite with
1s on the main diagonal.
The LKJ distribution with input_output_cholesky=TRUE
generates samples from
(and computes log-densities on) the set of Cholesky factors of positive
definite correlation matrices. The CorrelationCholesky
bijector provides
a bijective mapping from unconstrained reals to the support of the LKJ
distribution.
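A sketch mapping three unconstrained reals to the Cholesky factor of a 3 x 3 correlation matrix (n = 3 is the (m - 1)th triangular number for m = 3):
b <- tfb_correlation_cholesky()
l <- b %>% tfb_forward(c(0.5, -0.3, 0.2))
# each row of l has unit Euclidean norm, so l %*% t(l) is a correlation matrix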
Value
a bijector instance.
References
- Stan Manual. Section 24.2. Cholesky LKJ Correlation Distribution.
- Daniel Lewandowski, Dorota Kurowicka, and Harry Joe, "Generating random correlation matrices based on vines and extended onion method," Journal of Multivariate Analysis 100 (2009), pp. 1989-2001.
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Computes the cumulative sum of a tensor along a specified axis.
Description
Computes the cumulative sum of a tensor along a specified axis.
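A quick sketch:
b <- tfb_cumsum()
b %>% tfb_forward(c(1, 2, 3))  # 1 3 6
b %>% tfb_inverse(c(1, 3, 6))  # 1 2 3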
Usage
tfb_cumsum(axis = -1, validate_args = FALSE, name = "cumsum")
Arguments
axis |
|
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Computes Y = g(X) = DCT(X), where the DCT type is indicated by the dct_type arg
Description
The discrete cosine transform
efficiently applies a unitary DCT operator. This can be useful for mixing and decorrelating across
the innermost event dimension.
The inverse X = g^{-1}(Y) = IDCT(Y), where IDCT is DCT-III for dct_type == 2.
This bijector can be interleaved with Affine bijectors to build a cascade of
structured efficient linear layers as in Moczulski et al., 2016.
Note that the operator applied is orthonormal (i.e. norm='ortho').
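A round-trip sketch (the orthonormal DCT-II/DCT-III pair inverts exactly):
b <- tfb_discrete_cosine_transform()  # dct_type = 2
y <- b %>% tfb_forward(c(1, 2, 3, 4))
b %>% tfb_inverse(y)  # recovers 1 2 3 4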
Usage
tfb_discrete_cosine_transform(
validate_args = FALSE,
dct_type = 2,
name = "dct"
)
Arguments
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
dct_type |
integer, the DCT type performed by the forward transformation. Currently, only 2 and 3 are supported. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
References
Moczulski, M., Denil, M., Appleyard, J., & de Freitas, N. (2016). ACDC: A Structured Efficient Linear Layer. In International Conference on Learning Representations.
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
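As a usage sketch (not part of the package's documented examples), the forward pass applies the orthonormal DCT-II and the inverse pass undoes it:
b <- tfb_discrete_cosine_transform()
x <- tf$constant(c(1, 0, 0, 0))
y <- b %>% tfb_forward(x)  # orthonormal DCT-II of x
b %>% tfb_inverse(y)       # recovers x (up to numerical error)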
Computes Y = g(X) = exp(X)
Description
Computes Y = g(X) = exp(X)
Usage
tfb_exp(validate_args = FALSE, name = "exp")
Arguments
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
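As a usage sketch (not part of the package's documented examples):
b <- tfb_exp()
b %>% tfb_forward(1)       # exp(1), approximately 2.718
b %>% tfb_inverse(exp(1))  # approximately 1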
Computes Y = g(X) = exp(X) - 1
Description
This Bijector is no different from tfb_chain(list(tfb_affine_scalar(shift = -1), tfb_exp())). However, this makes use of the more numerically stable routines tf$math$expm1 and tf$log1p.
Usage
tfb_expm1(validate_args = FALSE, name = "expm1")
Arguments
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Details
Note: the expm1(.) is applied element-wise but the Jacobian is a reduction over the event space.
Value
a bijector instance.
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
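As a usage sketch (not part of the package's documented examples), forward computes exp(x) - 1 and inverse computes log1p(y):
b <- tfb_expm1()
b %>% tfb_forward(1)  # exp(1) - 1, approximately 1.718
b %>% tfb_inverse(1)  # log(2), approximately 0.693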
Implements a continuous normalizing flow X->Y defined via an ODE.
Description
This bijector implements a continuous dynamics transformation parameterized by a differential equation, where initial and terminal conditions correspond to the domain (X) and image (Y), i.e. the equations given in Details.
Usage
tfb_ffjord(
state_time_derivative_fn,
ode_solve_fn = NULL,
trace_augmentation_fn = tfp$bijectors$ffjord$trace_jacobian_hutchinson,
initial_time = 0,
final_time = 1,
validate_args = FALSE,
dtype = tf$float32,
name = "ffjord"
)
Arguments
state_time_derivative_fn |
Function with signature function(time, state) computing the time derivative of the state at a given time and state (see Details). |
ode_solve_fn |
Function used to integrate the differential equation (see Details). Default: NULL, which uses a Dormand-Prince solver from tfp$math$ode. |
trace_augmentation_fn |
Function used to generate the augmented state time derivative that also computes the (unreduced) Jacobian trace (see Details). Default: Hutchinson's trace estimator, tfp$bijectors$ffjord$trace_jacobian_hutchinson. |
initial_time |
Scalar float representing the time to which the X value of the bijector corresponds. Default: 0. |
final_time |
Scalar float representing the time to which the Y value of the bijector corresponds. Default: 1. |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
dtype |
dtype of the computation. Default: tf$float32. |
name |
name prefixed to Ops created by this class. |
Details
d/dt[state(t)] = state_time_derivative_fn(t, state(t))
state(initial_time) = X
state(final_time) = Y
For this transformation the value of log_det_jacobian follows another differential equation, reducing it to computation of the trace of the Jacobian along the trajectory:
state_time_derivative = state_time_derivative_fn(t, state(t))
d/dt[log_det_jac(t)] = Tr(jacobian(state_time_derivative, state(t)))
The FFJORD constructor takes two function arguments, ode_solve_fn and trace_augmentation_fn, that customize the integration of the differential equation and the trace estimation.
Differential equation integration is performed by a call to ode_solve_fn. A custom ode_solve_fn must accept the following arguments:
- ode_fn(time, state): Differential equation to be solved.
- initial_time: Scalar float or floating Tensor representing the initial time.
- initial_state: Floating Tensor representing the initial state.
- solution_times: 1D floating Tensor of solution times.
It must return a Tensor of shape [solution_times$shape, initial_state$shape] representing state values evaluated at solution_times. In addition, ode_solve_fn must support nested structures. For more details see the interface of tfp$math$ode$Solver$solve().
Trace estimation is computed simultaneously with state_time_derivative, using an augmented_state_time_derivative_fn that is generated by trace_augmentation_fn. trace_augmentation_fn takes state_time_derivative_fn, state.shape and state.dtype arguments and returns an augmented_state_time_derivative_fn callable that computes both state_time_derivative and the unreduced trace_estimation.
Custom ode_solve_fn and trace_augmentation_fn examples:
# custom_solver_fn: `function(f, t_initial, t_solutions, y_initial, ...)`
# ... : additional arguments to pass to custom_solver_fn.
ode_solve_fn <- function(ode_fn, initial_time, initial_state, solution_times) {
  custom_solver_fn(ode_fn, initial_time, solution_times, initial_state, ...)
}
ffjord <- tfb_ffjord(state_time_derivative_fn, ode_solve_fn = ode_solve_fn)

# state_time_derivative_fn: `function(time, state)`
# trace_jac_fn: `function(time, state)` unreduced jacobian trace function
trace_augmentation_fn <- function(ode_fn, state_shape, state_dtype) {
  augmented_ode_fn <- function(time, state) {
    list(ode_fn(time, state), trace_jac_fn(time, state))
  }
  augmented_ode_fn
}
ffjord <- tfb_ffjord(state_time_derivative_fn, trace_augmentation_fn = trace_augmentation_fn)
For more details on FFJORD and continuous normalizing flows see Chen et al. (2018) and Grathwohl et al. (2018).
Value
a bijector instance.
References
Chen, T. Q., Rubanova, Y., Bettencourt, J., & Duvenaud, D. K. (2018). Neural ordinary differential equations. In Advances in Neural Information Processing Systems (pp. 6571-6583).
Grathwohl, W., Chen, R. T. Q., Bettencourt, J., Sutskever, I., & Duvenaud, D. (2018). FFJORD: Free-form continuous dynamics for scalable reversible generative models. arXiv preprint arXiv:1810.01367.
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
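As a minimal sketch (with hypothetical linear dynamics, relying on the default solver and trace estimator), the forward pass integrates dx/dt = -x from initial_time = 0 to final_time = 1:
state_time_derivative_fn <- function(time, state) -state
ffjord <- tfb_ffjord(state_time_derivative_fn = state_time_derivative_fn)
x <- tf$constant(c(1, 2), dtype = tf$float32)
ffjord %>% tfb_forward(x)  # approximately x * exp(-1)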
Transforms unconstrained vectors to TriL matrices with positive diagonal
Description
This is implemented as a simple tfb_chain of tfb_fill_triangular followed by tfb_transform_diagonal, and provided mostly as a convenience. The default setup is somewhat opinionated, using a Softplus transformation followed by a small shift (1e-5) which attempts to avoid numerical issues from zeros on the diagonal.
Usage
tfb_fill_scale_tri_l(
diag_bijector = NULL,
diag_shift = 1e-05,
validate_args = FALSE,
name = "fill_scale_tril"
)
Arguments
diag_bijector |
Bijector instance, used to transform the output diagonal to be positive.
Default value: NULL (i.e., a softplus bijector followed by a shift of diag_shift; see Description). |
diag_shift |
Float value broadcastable and added to all diagonal entries after applying the diag_bijector. Setting a positive value forces the output diagonal entries to be positive, but prevents inverting the transformation for matrices with diagonal entries less than this value. Default value: 1e-5. |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
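As a usage sketch (not part of the package's documented examples), a length-6 vector of zeros fills a 3 x 3 lower-triangular matrix whose diagonal entries are softplus(0) + 1e-5, approximately 0.693:
b <- tfb_fill_scale_tri_l()
x <- tf$constant(rep(0, 6), dtype = tf$float32)
b %>% tfb_forward(x)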
Transforms vectors to triangular
Description
Triangular matrix elements are filled in a clockwise spiral.
Given input with shape batch_shape + [d], produces output with shape batch_shape + [n, n], where n = (-1 + sqrt(1 + 8 * d)) / 2. This follows by solving the quadratic equation d = 1 + 2 + ... + n = n * (n + 1) / 2.
Usage
tfb_fill_triangular(
upper = FALSE,
validate_args = FALSE,
name = "fill_triangular"
)
Arguments
upper |
Logical representing whether output matrix should be upper triangular (TRUE) or lower triangular (FALSE, default). |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
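As a usage sketch (not part of the package's documented examples), a vector with d = 6 elements fills a 3 x 3 matrix, since n = (-1 + sqrt(1 + 8 * 6)) / 2 = 3:
b <- tfb_fill_triangular()
x <- tf$constant(c(1, 2, 3, 4, 5, 6), dtype = tf$float32)
b %>% tfb_forward(x)  # a 3 x 3 lower-triangular matrix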
Returns the forward Bijector evaluation, i.e., Y = g(X).
Description
Returns the forward Bijector evaluation, i.e., Y = g(X).
Usage
tfb_forward(bijector, x, name = "forward")
Arguments
bijector |
The bijector to apply |
x |
Tensor. The input to the "forward" evaluation. |
name |
name of the operation |
Value
a tensor
See Also
Other bijector_methods: tfb_forward_log_det_jacobian(), tfb_inverse_log_det_jacobian(), tfb_inverse()
Examples
b <- tfb_affine_scalar(shift = 1, scale = 2)
x <- 10
b %>% tfb_forward(x)
Returns the result of the forward evaluation of the log determinant of the Jacobian
Description
Returns the result of the forward evaluation of the log determinant of the Jacobian
Usage
tfb_forward_log_det_jacobian(
bijector,
x,
event_ndims,
name = "forward_log_det_jacobian"
)
Arguments
bijector |
The bijector to apply |
x |
Tensor. The input to the "forward" Jacobian determinant evaluation. |
event_ndims |
Number of dimensions in the probabilistic events being transformed. Must be greater than or equal to bijector$forward_min_event_ndims. The result is summed over the final dimensions to produce a scalar Jacobian determinant for each event, i.e. it has shape x$shape$ndims - event_ndims dimensions. |
name |
name of the operation |
Value
a tensor
See Also
Other bijector_methods: tfb_forward(), tfb_inverse_log_det_jacobian(), tfb_inverse()
Examples
b <- tfb_affine_scalar(shift = 1, scale = 2)
x <- 10
b %>% tfb_forward_log_det_jacobian(x, event_ndims = 0)
Implements the Glow Bijector from Kingma & Dhariwal (2018).
Description
Overview: Glow
is a chain of bijectors which transforms a rank-1 tensor
(vector) into a rank-3 tensor (e.g. an RGB image). Glow
does this by
chaining together an alternating series of "Blocks," "Squeezes," and "Exits"
which are each themselves special chains of other bijectors. The intended use
of Glow
is as part of a tfd_transformed_distribution
, in
which the base distribution over the vector space is used to generate samples
in the image space. In the paper, an Independent Normal distribution is used
as the base distribution.
Usage
tfb_glow(
output_shape = c(32, 32, 3),
num_glow_blocks = 3,
num_steps_per_block = 32,
coupling_bijector_fn = NULL,
exit_bijector_fn = NULL,
grab_after_block = NULL,
use_actnorm = TRUE,
seed = NULL,
validate_args = FALSE,
name = "glow"
)
Arguments
output_shape |
A list of integers, specifying the event shape of the output of the bijector's forward pass (the image), specified as [H, W, C]. Default value: (32, 32, 3). |
num_glow_blocks |
An integer, specifying how many downsampling levels to include in the model. This must divide equally into both H and W, otherwise the bijector would not be invertible. Default Value: 3 |
num_steps_per_block |
An integer specifying how many Affine Coupling and 1x1 convolution layers to include at each level of the spatial hierarchy. Default Value: 32 (i.e. the value used in the original glow paper). |
coupling_bijector_fn |
A function which takes the argument input_shape and returns a callable neural network (e.g., a keras sequential model) used to parameterize the coupling in each step of the flow. |
exit_bijector_fn |
Similar to coupling_bijector_fn, exit_bijector_fn is a function which takes the arguments input_shape and output_chan and returns a callable neural network used for the coupling at each exit. |
grab_after_block |
A tuple of floats, specifying what fraction of the remaining channels to remove following each glow block. Glow will take the integer floor of this number multiplied by the remaining number of channels. The default is half at each spatial hierarchy. Default value: NULL (this will take out half of the channels after each block). |
use_actnorm |
A boolean deciding whether or not to use actnorm. Data-dependent
initialization is used to initialize this layer. Default value: TRUE. |
seed |
A seed to control randomness in the 1x1 convolution initialization.
Default value: NULL (i.e., non-reproducible sampling). |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Details
A "Block" (implemented as the GlowBlock
Bijector) performs much of the
transformations which allow glow to produce sophisticated and complex mappings
between the image space and the latent space and therefore achieve rich image
generation performance. A Block is composed of num_steps_per_block
steps,
which are each implemented as a Chain
containing an
ActivationNormalization
(ActNorm) bijector, followed by an (invertible)
OneByOneConv
bijector, and finally a coupling bijector. The coupling
bijector is an instance of a RealNVP
bijector, and uses the
coupling_bijector_fn
function to instantiate the coupling bijector function
which is given to the RealNVP
. This function returns a bijector which
defines the coupling (e.g. Shift(Scale)
for affine coupling or Shift
for
additive coupling).
A "Squeeze" converts spatial features into channel features. It is
implemented using the Expand
bijector. The difference in names is
due to the fact that the forward
function from glow is meant to ultimately
correspond to sampling from a tfp$util$TransformedDistribution
object,
which would use Expand
(Squeeze is just Invert(Expand)). The Expand
bijector takes a tensor with shape [H, W, C]
and returns a tensor with shape
[2H, 2W, C / 4]
, such that each 2x2x1 spatial tile in the output is composed
from a single 1x1x4 tile in the input tensor (forward pass: Expand; inverse pass: Squeeze).
This is implemented using a chain of Reshape
-> Transpose
-> Reshape
bijectors. Note that on an inverse pass through the bijector, each Squeeze
will cause the width/height of the image to decrease by a factor of 2.
Therefore, the input image must be evenly divisible by 2 at least
num_glow_blocks
times, since it will pass through a Squeeze step that many
times.
An "Exit" is simply a junction at which some of the tensor "exits" from the
glow bijector and therefore avoids any further alteration. Each exit is
implemented as a Blockwise
bijector, where some channels are given to the
rest of the glow model, and the rest are given to a bypass implemented using
the Identity
bijector. The fraction of channels to be removed at each exit
is determined by the grab_after_block
arg, which indicates the fraction of
remaining channels which join the identity bypass. The fraction is
converted to an integer number of channels by multiplying by the remaining
number of channels and rounding.
Additionally, at each exit, glow couples the tensor exiting the highway to
the tensor continuing onward. This makes small scale features in the image
dependent on larger scale features, since the larger scale features dictate
the mean and scale of the distribution over the smaller scale features.
This coupling is done similarly to the Coupling bijector in each step of the
flow (i.e. using a RealNVP bijector). However for the exit bijector, the
coupling is instantiated using exit_bijector_fn
rather than coupling
bijector fn, allowing for different behaviors between standard coupling and
exit coupling. Also note that because the exit utilizes a coupling bijector,
there are two special cases (all channels exiting and no channels exiting).
The full Glow bijector consists of num_glow_blocks
Blocks each of which
contains num_steps_per_block
steps. Each step implements a coupling using
coupling_bijector_fn
. Between blocks, glow converts between spatial pixels
and channels using the Expand Bijector, and splits channels out of the
bijector using the Exit Bijector. The channels which have exited continue
onward through Identity bijectors and those which have not exited are given
to the next block. After passing through all Blocks, the tensor is reshaped
to a rank-1 tensor with the same number of elements. This is where the
distribution will be defined.
A schematic diagram of Glow is shown below. The forward
function of the
bijector starts from the bottom and goes upward, while the inverse
function
starts from the top and proceeds downward.
Value
a bijector instance.
Glow schematic diagram (ASCII art omitted): the input image of shape [H, W, C] enters at the bottom; each Glow Block applies num_steps_per_block steps of ActNorm -> 1x1Conv -> Coupling; an Expand bijector follows each block; Exit bijectors remove channels, which are passed downward via Blockwise and Identity bijectors; the output distribution is over a vector of shape [H * W * C]. The forward pass runs upward, the inverse pass downward.
References
Kingma, D. P., & Dhariwal, P. (2018). Glow: Generative flow with invertible 1x1 convolutions. In Advances in Neural Information Processing Systems (pp. 10215-10224).
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
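As a minimal sketch of the intended use described above (using this package's tfd_* distribution wrappers), Glow transforms a vector-valued base distribution into a distribution over images:
glow <- tfb_glow(output_shape = c(32, 32, 3))
base <- tfd_independent(
  tfd_normal(loc = rep(0, 32 * 32 * 3), scale = 1),
  reinterpreted_batch_ndims = 1
)
dist <- tfd_transformed_distribution(distribution = base, bijector = glow)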
Compute Y = g(X) = 1 - exp(-c * (exp(rate * X) - 1)), the Gompertz CDF.
Description
This bijector maps inputs from [-inf, inf] to [0, 1]. The inverse of the
bijector applied to a uniform random variable X ~ U(0, 1)
gives back a
random variable with the
Gompertz distribution:
Y ~ GompertzCDF(concentration, rate)
pdf(y; c, r) = r * c * exp(r * y + c - c * exp(r * y))
Note: Because the Gompertz distribution concentrates its mass close to zero,
for larger rates or larger concentrations, bijector.forward
will quickly
saturate to 1.
Usage
tfb_gompertz_cdf(
concentration,
rate,
validate_args = FALSE,
name = "gompertz_cdf"
)
Arguments
concentration |
Positive Float-like |
rate |
Positive Float-like |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
Computes Y = g(X) = exp(-exp(-(X - loc) / scale))
Description
This bijector maps inputs from [-inf, inf]
to [0, 1]
. The inverse of the
bijector applied to a uniform random variable X ~ U(0, 1)
gives back a
random variable with the Gumbel distribution (see Details).
Usage
tfb_gumbel(loc = 0, scale = 1, validate_args = FALSE, name = "gumbel")
Arguments
loc |
Float-like Tensor that is the same dtype and is broadcastable with scale.
This is loc in Y = g(X) = exp(-exp(-(X - loc) / scale)). |
scale |
Positive Float-like Tensor that is the same dtype and is broadcastable with loc.
This is scale in Y = g(X) = exp(-exp(-(X - loc) / scale)). |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Details
Y ~ Gumbel(loc, scale)
pdf(y; loc, scale) = exp(-( (y - loc) / scale + exp(- (y - loc) / scale) ) ) / scale
Value
a bijector instance.
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
Compute Y = g(X) = exp(-exp(-(X - loc) / scale)), the Gumbel CDF.
Description
This bijector maps inputs from [-inf, inf]
to [0, 1]
. The inverse of the
bijector applied to a uniform random variable X ~ U(0, 1)
gives back a
random variable with the Gumbel distribution (see Details).
Usage
tfb_gumbel_cdf(loc = 0, scale = 1, validate_args = FALSE, name = "gumbel_cdf")
Arguments
loc |
Float-like Tensor that is the same dtype and is broadcastable with scale. This is loc in Y = g(X) = exp(-exp(-(X - loc) / scale)). |
scale |
Positive Float-like Tensor that is the same dtype and is broadcastable with loc. This is scale in Y = g(X) = exp(-exp(-(X - loc) / scale)). |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Details
Y ~ GumbelCDF(loc, scale)
pdf(y; loc, scale) = exp(-( (y - loc) / scale + exp(- (y - loc) / scale) ) ) / scale
Value
a bijector instance.
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
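As a usage sketch (not part of the package's documented examples):
b <- tfb_gumbel_cdf(loc = 0, scale = 1)
b %>% tfb_forward(0)        # exp(-exp(0)) = exp(-1), approximately 0.368
b %>% tfb_inverse(exp(-1))  # approximately 0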
Computes Y = g(X) = X
Description
Computes Y = g(X) = X
Usage
tfb_identity(validate_args = FALSE, name = "identity")
Arguments
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
Bijector constructed from custom functions
Description
Bijector constructed from custom functions
Usage
tfb_inline(
forward_fn = NULL,
inverse_fn = NULL,
inverse_log_det_jacobian_fn = NULL,
forward_log_det_jacobian_fn = NULL,
forward_event_shape_fn = NULL,
forward_event_shape_tensor_fn = NULL,
inverse_event_shape_fn = NULL,
inverse_event_shape_tensor_fn = NULL,
is_constant_jacobian = NULL,
validate_args = FALSE,
forward_min_event_ndims = NULL,
inverse_min_event_ndims = NULL,
name = "inline"
)
Arguments
forward_fn |
Function implementing the forward transformation. |
inverse_fn |
Function implementing the inverse transformation. |
inverse_log_det_jacobian_fn |
Function implementing the log_det_jacobian of the inverse transformation. |
forward_log_det_jacobian_fn |
Function implementing the log_det_jacobian of the forward transformation. |
forward_event_shape_fn |
Function implementing non-identical static event shape changes. Default: shape is assumed unchanged. |
forward_event_shape_tensor_fn |
Function implementing non-identical event shape changes. Default: shape is assumed unchanged. |
inverse_event_shape_fn |
Function implementing non-identical static event shape changes. Default: shape is assumed unchanged. |
inverse_event_shape_tensor_fn |
Function implementing non-identical event shape changes. Default: shape is assumed unchanged. |
is_constant_jacobian |
Logical indicating that the Jacobian is constant for all input arguments. |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
forward_min_event_ndims |
Integer indicating the minimal dimensionality this bijector acts on. |
inverse_min_event_ndims |
Integer indicating the minimal dimensionality this bijector acts on. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
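As a usage sketch (not part of the package's documented examples), an exp-like bijector can be recreated from custom functions:
b <- tfb_inline(
  forward_fn = tf$exp,
  inverse_fn = tf$math$log,
  inverse_log_det_jacobian_fn = function(y) -tf$math$log(y),
  forward_min_event_ndims = 0L
)
b %>% tfb_forward(1)  # exp(1), approximately 2.718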
Returns the inverse Bijector evaluation, i.e., X = g^{-1}(Y).
Description
Returns the inverse Bijector evaluation, i.e., X = g^{-1}(Y).
Usage
tfb_inverse(bijector, y, name = "inverse")
Arguments
bijector |
The bijector to apply |
y |
Tensor. The input to the "inverse" evaluation. |
name |
name of the operation |
Value
a tensor
See Also
Other bijector_methods: tfb_forward_log_det_jacobian(), tfb_forward(), tfb_inverse_log_det_jacobian()
Examples
b <- tfb_affine_scalar(shift = 1, scale = 2)
x <- 10
y <- b %>% tfb_forward(x)
b %>% tfb_inverse(y)
Returns the result of the inverse evaluation of the log determinant of the Jacobian
Description
Returns the result of the inverse evaluation of the log determinant of the Jacobian
Usage
tfb_inverse_log_det_jacobian(
bijector,
y,
event_ndims,
name = "inverse_log_det_jacobian"
)
Arguments
bijector |
The bijector to apply |
y |
Tensor. The input to the "inverse" Jacobian determinant evaluation. |
event_ndims |
Number of dimensions in the probabilistic events being transformed. Must be greater than or equal to bijector$inverse_min_event_ndims. The result is summed over the final dimensions to produce a scalar Jacobian determinant for each event, i.e. it has shape y$shape$ndims - event_ndims dimensions. |
name |
name of the operation |
Value
a tensor
See Also
Other bijector_methods: tfb_forward_log_det_jacobian(), tfb_forward(), tfb_inverse()
Examples
b <- tfb_affine_scalar(shift = 1, scale = 2)
x <- 10
y <- b %>% tfb_forward(x)
b %>% tfb_inverse_log_det_jacobian(y, event_ndims = 0)
Bijector which inverts another Bijector
Description
Creates a Bijector which swaps the meaning of inverse and forward.
Note: An inverted bijector's inverse_log_det_jacobian is often more
efficient if the base bijector implements _forward_log_det_jacobian. If
_forward_log_det_jacobian is not implemented then the following code is
used:
y = b$inverse(x)
-b$inverse_log_det_jacobian(y)
Usage
tfb_invert(bijector, validate_args = FALSE, name = NULL)
Arguments
bijector |
Bijector instance. |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
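As a usage sketch (not part of the package's documented examples), inverting tfb_exp() yields a bijector that behaves like a log transform:
b <- tfb_invert(tfb_exp())
b %>% tfb_forward(exp(1))  # approximately 1
b %>% tfb_inverse(1)       # exp(1), approximately 2.718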
Bijector which applies a Stick Breaking procedure.
Description
Bijector which applies a Stick Breaking procedure.
Usage
tfb_iterated_sigmoid_centered(validate_args = FALSE, name = "iterated_sigmoid")
Arguments
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
Computes Y = g(X) = (1 - (1 - X)**(1 / b))**(1 / a), with X in [0, 1]
Description
This bijector maps inputs from [0, 1]
to [0, 1]
. The inverse of the
bijector applied to a uniform random variable X ~ U(0, 1) gives back a
random variable with the Kumaraswamy distribution:
Y ~ Kumaraswamy(a, b)
pdf(y; a, b, 0 <= y <= 1) = a * b * y ** (a - 1) * (1 - y**a) ** (b - 1)
Usage
tfb_kumaraswamy(
concentration1 = NULL,
concentration0 = NULL,
validate_args = FALSE,
name = "kumaraswamy"
)
Arguments
concentration1 |
float scalar indicating the transform power, i.e., Y = g(X) = (1 - (1 - X)**(1 / b))**(1 / a) where a is concentration1. |
concentration0 |
float scalar indicating the transform power, i.e., Y = g(X) = (1 - (1 - X)**(1 / b))**(1 / a) where b is concentration0. |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
Computes Y = g(X) = (1 - (1 - X)**(1 / b))**(1 / a), with X in [0, 1]
Description
This bijector maps inputs from [0, 1]
to [0, 1]
. The inverse of the
bijector applied to a uniform random variable X ~ U(0, 1) gives back a
random variable with the Kumaraswamy distribution:
Y ~ Kumaraswamy(a, b)
pdf(y; a, b, 0 <= y <= 1) = a * b * y ** (a - 1) * (1 - y**a) ** (b - 1)
Usage
tfb_kumaraswamy_cdf(
concentration1 = 1,
concentration0 = 1,
validate_args = FALSE,
name = "kumaraswamy_cdf"
)
Arguments
concentration1 |
float scalar indicating the transform power, i.e., Y = g(X) = (1 - (1 - X)**(1 / b))**(1 / a) where a is concentration1. |
concentration0 |
float scalar indicating the transform power, i.e., Y = g(X) = (1 - (1 - X)**(1 / b))**(1 / a) where b is concentration0. |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
LambertWTail transformation for heavy-tail Lambert W x F random variables.
Description
A random variable Y has a Lambert W x F distribution if W_tau(Y) = X has distribution F, where tau = (shift, scale, tail) parameterizes the inverse transformation.
Usage
tfb_lambert_w_tail(
shift = NULL,
scale = NULL,
tailweight = NULL,
validate_args = FALSE,
name = "lambertw_tail"
)
Arguments
shift |
Floating point tensor; the shift for centering (uncentering) the input (output) random variable(s). |
scale |
Floating point tensor; the scaling (unscaling) of the input (output) random variable(s). Must contain only positive values. |
tailweight |
Floating point tensor; the tail behaviors of the output random variable(s). Must contain only non-negative values. |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Details
This bijector defines the transformation underlying Lambert W x F distributions that transform an input random variable to an output random variable with heavier tails. It is defined as
Y = (U * exp(0.5 * tail * U^2)) * scale + shift, tail >= 0
where U = (X - shift) / scale is a shifted/scaled input random variable, and tail >= 0 is the tail parameter.
Attributes:
shift: shift to center (uncenter) the input data.
scale: scale to normalize (de-normalize) the input data.
tailweight: Tail parameter delta
of heavy-tail transformation; must be >= 0.
Value
a bijector instance.
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
Masked Autoregressive Density Estimator
Description
This will be wrapped in a make_template to ensure the variables are only created once. It takes the input and returns the loc ("mu" in Germain et al. (2015)) and log_scale ("alpha" in Germain et al. (2015)) from the MADE network.
Usage
tfb_masked_autoregressive_default_template(
hidden_layers,
shift_only = FALSE,
activation = tf$nn$relu,
log_scale_min_clip = -5,
log_scale_max_clip = 3,
log_scale_clip_gradient = FALSE,
name = NULL,
...
)
Arguments
hidden_layers |
list-like of non-negative integers, scalars indicating the number of units in each hidden layer. Default: list(512, 512). |
shift_only |
logical indicating if only the shift term shall be computed. Default: FALSE. |
activation |
Activation function (callable). Explicitly setting to NULL implies a linear activation. |
log_scale_min_clip |
float-like scalar Tensor, or a Tensor with the same shape as log_scale. The minimum value to clip by. Default: -5. |
log_scale_max_clip |
float-like scalar Tensor, or a Tensor with the same shape as log_scale. The maximum value to clip by. Default: 3. |
log_scale_clip_gradient |
logical indicating that the gradient of tf$clip_by_value should be preserved. Default: FALSE. |
name |
A name for ops managed by this function. Default: "tfb_masked_autoregressive_default_template". |
... |
additional arguments passed on to the underlying dense layers. |
Details
Warning: This function uses masked_dense to create randomly initialized
tf$Variables
. It is presumed that these will be fit, just as you would any
other neural architecture which uses tf$layers$dense
.
About Hidden Layers
Each element of hidden_layers should be greater than the input_depth
(i.e., input_depth = tf$shape(input)[-1]
where input is the input to the
neural network). This is necessary to ensure the autoregressivity property.
About Clipping
This function also optionally clips the log_scale (but possibly not its
gradient). This is useful because if log_scale is too small/large it might
underflow/overflow making it impossible for the MaskedAutoregressiveFlow
bijector to implement a bijection. Additionally, the log_scale_clip_gradient
bool indicates whether the gradient should also be clipped. The default does
not clip the gradient; this is useful because it still provides gradient
information (for fitting) yet solves the numerical stability problem. I.e.,
log_scale_clip_gradient = FALSE means grad[exp(clip(x))] = grad[x] exp(clip(x))
rather than the usual grad[clip(x)] exp(clip(x))
.
Value
list of:
- shift: Float-like Tensor of shift terms
- log_scale: Float-like Tensor of log(scale) terms
References
Germain, M., Gregor, K., Murray, I., & Larochelle, H. (2015). MADE: Masked Autoencoder for Distribution Estimation. In International Conference on Machine Learning (pp. 881-889).
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
Affine MaskedAutoregressiveFlow bijector
Description
The affine autoregressive flow (Papamakarios et al., 2016) provides a relatively simple framework for user-specified (deep) architectures to learn a distribution over continuous events. Regarding terminology, see the quote in Details.
Usage
tfb_masked_autoregressive_flow(
shift_and_log_scale_fn,
is_constant_jacobian = FALSE,
unroll_loop = FALSE,
event_ndims = 1L,
validate_args = FALSE,
name = NULL
)
Arguments
shift_and_log_scale_fn |
Function which computes shift and log_scale from both the
forward domain (x) and the inverse domain (y).
Calculation must respect the "autoregressive property". Suggested default:
tfb_masked_autoregressive_default_template(hidden_layers=...).
Typically the function contains tf$Variables and is wrapped using tf$make_template. Returning NULL for either (both) shift, log_scale is equivalent to (but more efficient than) returning zero. |
is_constant_jacobian |
Logical, default: FALSE. When TRUE the implementation assumes log_scale does not depend on the forward domain (x) or inverse domain (y) values. (No validation is made; is_constant_jacobian=FALSE is always safe but possibly computationally inefficient.) |
unroll_loop |
Logical indicating whether the tf$while_loop in _forward should be replaced with a static for loop. Requires that the final dimension of x be known at graph construction time. Defaults to FALSE. |
event_ndims |
integer, the intrinsic dimensionality of this bijector.
1 corresponds to a simple vector autoregressive bijector as implemented by the tfb_masked_autoregressive_default_template, 2 might be useful for a 2D convolutional shift_and_log_scale_fn and so on. |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Details
"Autoregressive models decompose the joint density as a product of conditionals, and model each conditional in turn. Normalizing flows transform a base density (e.g. a standard Gaussian) into the target density by an invertible transformation with tractable Jacobian." (Papamakarios et al., 2016)
In other words, the "autoregressive property" is equivalent to the
decomposition, p(x) = prod{ p(x[perm[i]] | x[perm[0:i]]) : i=0, ..., d }
where perm is some permutation of {0, ..., d}
. In the simple case where
the permutation is identity this reduces to:
p(x) = prod{ p(x[i] | x[0:i]) : i=0, ..., d }
. The provided
shift_and_log_scale_fn, tfb_masked_autoregressive_default_template, achieves
this property by zeroing out weights in its masked_dense layers.
In TensorFlow Probability, "normalizing flows" are implemented as
tfp.bijectors.Bijectors. The forward "autoregression" is implemented
using a tf.while_loop and a deep neural network (DNN) with masked weights
such that the autoregressive property is automatically met in the inverse.
A TransformedDistribution using MaskedAutoregressiveFlow(...) uses the
(expensive) forward-mode calculation to draw samples and the (cheap)
reverse-mode calculation to compute log-probabilities. Conversely, a
TransformedDistribution using Invert(MaskedAutoregressiveFlow(...)) uses
the (expensive) forward-mode calculation to compute log-probabilities and the
(cheap) reverse-mode calculation to compute samples.
Given a shift_and_log_scale_fn, the forward and inverse transformations are (a sequence of) affine transformations. A "valid" shift_and_log_scale_fn must compute each shift (aka loc or "mu" in Germain et al. (2015)) and log(scale) (aka "alpha" in Germain et al. (2015)) such that each is broadcastable with the arguments to forward and inverse, i.e., such that the calculations in forward, inverse below are possible.
For convenience, tfb_masked_autoregressive_default_template is offered as a possible shift_and_log_scale_fn function. It implements the MADE architecture (Germain et al., 2015). MADE is a feed-forward network that computes a shift and log(scale) using masked_dense layers in a deep neural network. Weights are masked to ensure the autoregressive property. It is possible that this architecture is suboptimal for your task. To build alternative networks, either change the arguments to tfb_masked_autoregressive_default_template, use the masked_dense function to roll-out your own, or use some other architecture, e.g., using tf.layers. Warning: no attempt is made to validate that the shift_and_log_scale_fn enforces the "autoregressive property".
Assuming shift_and_log_scale_fn has valid shape and autoregressive semantics, the forward transformation is
def forward(x):
  y = zeros_like(x)
  event_size = x.shape[-event_dims:].num_elements()
  for _ in range(event_size):
    shift, log_scale = shift_and_log_scale_fn(y)
    y = x * tf.exp(log_scale) + shift
  return y
and the inverse transformation is
def inverse(y):
  shift, log_scale = shift_and_log_scale_fn(y)
  return (y - shift) / tf.exp(log_scale)
Notice that the inverse does not need a for-loop. This is because in the forward pass each calculation of shift and log_scale is based on the y calculated so far (not x). In the inverse, the y is fully known, thus is equivalent to the scaling used in forward after event_size passes, i.e., the "last" y used to compute shift, log_scale. (Roughly speaking, this also proves the transform is bijective.)
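As a hedged sketch (the base distribution and layer sizes are illustrative assumptions, and the template helper may require graph mode):

library(tfprobability)

maf <- tfb_masked_autoregressive_flow(
  shift_and_log_scale_fn = tfb_masked_autoregressive_default_template(
    hidden_layers = list(16L, 16L)
  )
)
# transform a 2-dimensional standard normal with the flow
d <- tfd_transformed_distribution(
  distribution = tfd_multivariate_normal_diag(loc = c(0, 0)),
  bijector = maf
)
x <- d %>% tfd_sample(3L)     # expensive: sequential forward passes
lp <- d %>% tfd_log_prob(x)   # cheap: a single inverse pass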
Value
a bijector instance.
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Autoregressively masked dense layer
Description
Analogous to tf$layers$dense
.
Usage
tfb_masked_dense(
inputs,
units,
num_blocks = NULL,
exclusive = FALSE,
kernel_initializer = NULL,
reuse = NULL,
name = NULL,
...
)
Arguments
inputs |
Tensor input. |
units |
integer scalar representing the dimensionality of the output space. |
num_blocks |
integer scalar representing the number of blocks for the MADE masks. |
exclusive |
logical scalar representing whether to zero the diagonal of the mask, used for the first layer of a MADE. |
kernel_initializer |
Initializer function for the weight matrix.
If NULL (default), weights are initialized using the tf$glorot_random_initializer. |
reuse |
logical scalar representing whether to reuse the weights of a previous layer by the same name. |
name |
string used to describe ops managed by this function. |
... |
tf$layers$dense arguments |
Details
See Germain et al. (2015) for a detailed explanation.
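A minimal sketch of a single masked layer follows (sizes are arbitrary; this assumes a TF1-style graph context, since masked_dense creates its variables through tf$layers machinery):

library(tensorflow)
library(tfprobability)

inputs <- tf$ones(c(1L, 4L))
h <- tfb_masked_dense(
  inputs = inputs,
  units = 4L,
  num_blocks = 4L,
  exclusive = TRUE  # zero the mask diagonal, as for the first layer of a MADE
)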
Value
tensor
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Computes g(L) = inv(L)
, where L is a lower-triangular matrix
Description
L must be nonsingular; equivalently, all diagonal entries of L must be nonzero.
The input must have rank >= 2. The input is treated as a batch of matrices
with batch shape input.shape[:-2]
, where each matrix has dimensions
input.shape[-2]
by input.shape[-1]
(hence input.shape[-2]
must equal input.shape[-1]
).
Usage
tfb_matrix_inverse_tri_l(validate_args = FALSE, name = "matrix_inverse_tril")
Arguments
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
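A minimal sketch (the matrix values are an arbitrary illustration): inverting a 2 x 2 lower-triangular matrix.

library(tfprobability)

b <- tfb_matrix_inverse_tri_l()
m <- matrix(c(0.5, 0, 0.25, 2), nrow = 2, byrow = TRUE)
b %>% tfb_forward(m)
# approximately:  2.00  0.0
#                -0.25  0.5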
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Matrix-vector multiply using LU decomposition
Description
This bijector is identical to the "Convolution1x1" used in Glow (Kingma and Dhariwal, 2018).
Usage
tfb_matvec_lu(lower_upper, permutation, validate_args = FALSE, name = NULL)
Arguments
lower_upper |
The LU factorization as returned by tf$linalg$lu. |
permutation |
The LU factorization permutation as returned by tf$linalg$lu. |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Details
Warning: this bijector never verifies that the scale matrix (as parameterized by the LU decomposition) is invertible. Ensuring this is the case is the caller's responsibility.
Value
a bijector instance.
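A minimal sketch (the matrix is an arbitrary example): factorize with tf$linalg$lu and pass the factors to the bijector.

library(tensorflow)
library(tfprobability)

m <- tf$constant(matrix(c(2, 1, 0, 3), nrow = 2, byrow = TRUE))
lu_and_p <- tf$linalg$lu(m)
b <- tfb_matvec_lu(
  lower_upper = lu_and_p[[1]],   # packed LU factors
  permutation = lu_and_p[[2]]    # permutation indices
)
b %>% tfb_forward(tf$constant(c(1, 1)))   # same as matvec(m, c(1, 1)), i.e. c(3, 3)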
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Computes Y = g(X) = NormalCDF(x)
Description
This bijector maps inputs from [-inf, inf]
to [0, 1]
. The inverse of the
bijector applied to a uniform random variable X ~ U(0, 1)
gives back a
random variable with the Normal distribution:
Usage
tfb_normal_cdf(validate_args = FALSE, name = "normal")
Arguments
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Details
Y ~ Normal(0, 1)
pdf(y; 0., 1.) = 1 / sqrt(2 * pi) * exp(-y ** 2 / 2)
Value
a bijector instance.
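A minimal sketch:

library(tfprobability)

b <- tfb_normal_cdf()
b %>% tfb_forward(c(-1.96, 0, 1.96))   # approximately 0.025, 0.5, 0.975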
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Bijector which maps a tensor x_k that has increasing elements in the last dimension to an unconstrained tensor y_k
Description
Both the domain and the codomain of the mapping are [-inf, inf]; however, the input of the forward mapping must be strictly increasing. The inverse of the bijector applied to a normal random vector y ~ N(0, 1) gives back a sorted random vector with the same distribution x ~ N(0, 1), where x = sort(y).
Usage
tfb_ordered(validate_args = FALSE, name = "ordered")
Arguments
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Details
On the last dimension of the tensor, Ordered bijector performs:
y[0] = x[0]
y[1:] = tf$log(x[1:] - x[:-1])
Value
a bijector instance.
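A minimal sketch of the transformation in both directions:

library(tfprobability)

b <- tfb_ordered()
b %>% tfb_forward(c(2, 3, 4))   # c(2, log(1), log(1)) = c(2, 0, 0)
b %>% tfb_inverse(c(2, 0, 0))   # back to c(2, 3, 4)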
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Pads a value to the event_shape
of a Tensor
.
Description
The semantics of bijector_pad
generally follow that of tf$pad()
except that bijector_pad
's paddings
argument applies to the rightmost
dimensions. Additionally, the new argument axis
enables overriding the
dimensions to which paddings
is applied. Like paddings
, the axis
argument is also relative to the rightmost dimension and must therefore be
negative.
The argument paddings is a vector of integer pairs, each representing the number of left and/or right constant_values to pad to the corresponding rightmost dimensions. That is, unless axis is specified, specifying k different paddings means the rightmost k dimensions will be "grown" by the sum of the respective paddings row. When axis is specified, it indicates the dimension to which the corresponding paddings element is applied. By default axis is NULL, which means it is logically equivalent to range(start = -length(paddings), limit = 0), i.e., the rightmost dimensions.
Usage
tfb_pad(
paddings = list(c(0, 1)),
mode = "CONSTANT",
constant_values = 0,
axis = NULL,
validate_args = FALSE,
name = NULL
)
Arguments
paddings |
A vector-shaped Tensor of integer pairs representing the number of elements to pad on the left and right, respectively, for each of the rightmost dimensions. Default value: list(c(0, 1)). |
mode |
One of "CONSTANT", "REFLECT", or "SYMMETRIC" (case-insensitive); see tf$pad for details. |
constant_values |
In "CONSTANT" mode, the scalar pad value to use. Must be
same type as the padded tensor. Default value: 0. |
axis |
The dimensions for which paddings are applied; like paddings, these are relative to the rightmost dimension and therefore negative. Default value: NULL (i.e., the rightmost dimensions, range(start = -length(paddings), limit = 0)). |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
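A minimal sketch: with the default paddings of list(c(0, 1)), one constant_values element is appended to the rightmost dimension.

library(tfprobability)

b <- tfb_pad()
b %>% tfb_forward(matrix(c(1, 2, 3, 4), nrow = 2))   # shape (2, 2) -> (2, 3)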
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Permutes the rightmost dimension of a Tensor
Description
Permutes the rightmost dimension of a Tensor
Usage
tfb_permute(permutation, axis = -1L, validate_args = FALSE, name = NULL)
Arguments
permutation |
An integer-like vector-shaped Tensor representing the permutation to apply to the axis dimension of the transformed Tensor. |
axis |
Scalar integer Tensor representing the dimension over which to tf$gather. axis must be relative to the end (reading left to right) thus must be negative. Default value: -1 (i.e., right-most). |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
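A minimal sketch (note the zero-based permutation indices):

library(tfprobability)

b <- tfb_permute(permutation = c(2L, 1L, 0L))
b %>% tfb_forward(c(-1, 0, 1))   # reversed: c(1, 0, -1)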
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Computes Y = g(X) = (1 + X * c)**(1 / c)
, where X >= -1 / c
Description
The power transform maps
inputs from [0, inf]
to [-1/c, inf]
; this is equivalent to the inverse of this bijector.
This bijector is equivalent to the Exp bijector when c=0.
Usage
tfb_power_transform(power, validate_args = FALSE, name = "power_transform")
Arguments
power |
float scalar indicating the transform power, i.e., Y = g(X) = (1 + X * c)**(1 / c) where c is the power. |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
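A minimal sketch with power = 0.5, so that Y = (1 + X / 2)^2:

library(tfprobability)

b <- tfb_power_transform(power = 0.5)
b %>% tfb_forward(c(0, 2))   # c(1, 4)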
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
A piecewise rational quadratic spline, as developed in Durkan et al. (2019).
Description
This transformation represents a monotonically increasing piecewise rational
quadratic function. Outside of the bounds of knot_x
/knot_y
, the transform
behaves as an identity function.
Usage
tfb_rational_quadratic_spline(
bin_widths,
bin_heights,
knot_slopes,
range_min = -1,
validate_args = FALSE,
name = NULL
)
Arguments
bin_widths |
The widths of the spans between subsequent knot x positions, a floating point Tensor. Must be positive and at least 1-D; the innermost axis must sum to the same value as that of bin_heights. |
bin_heights |
The heights of the spans between subsequent knot y positions, a floating point Tensor. Must be positive and at least 1-D; the innermost axis must sum to the same value as that of bin_widths. |
knot_slopes |
The slope of the spline at each knot, a floating point Tensor. Must be positive. Slopes of 1 are implicitly padded for the first and last implicit knots, corresponding to range_min and range_min + sum(bin_widths, axis = -1). The innermost axis size should be 1 less than that of bin_widths/bin_heights, or 1 for broadcasting. |
range_min |
The x/y position of the first knot, which has implicit slope 1. range_max is implicit, and can be computed as range_min + sum(bin_widths, axis = -1). Scalar floating point Tensor. |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Details
Typically this bijector will be used as part of a chain, with splines for
trailing x
dimensions conditioned on some of the earlier x
dimensions, and
with the inverse then solved first for unconditioned dimensions, then using
conditioning derived from those inverses, and so forth.
For each argument, the innermost axis indexes bins/knots and batch axes
index axes of x
/y
spaces. A RationalQuadraticSpline
with a separate
transform for each of three dimensions might have bin_widths
shaped
[3, 32]
. To use the same spline for each of x
's three dimensions we may
broadcast against x
and use a bin_widths
parameter shaped [32]
.
Parameters will be broadcast against each other and against the input
x
/y
s, so if we want fixed slopes, we can use kwarg knot_slopes=1
.
A typical recipe for acquiring compatible bin widths and heights would be:
nbins <- unconstrained_vector$shape[-1]
range_min <- -1
range_max <- 1
min_bin_size <- 1e-2
scale <- range_max - range_min - nbins * min_bin_size
bin_widths <- tf$math$softmax(unconstrained_vector) * scale + min_bin_size
Value
a bijector instance.
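A minimal sketch of a single fixed spline on [-1, 1] (all parameter values are arbitrary illustrations): the bin widths and heights must each sum to the range width, here 2, and there is one interior knot slope fewer than there are bins.

library(tfprobability)

b <- tfb_rational_quadratic_spline(
  bin_widths  = c(0.5, 1, 0.5),
  bin_heights = c(1, 0.5, 0.5),
  knot_slopes = c(1, 1),
  range_min   = -1
)
b %>% tfb_forward(c(-0.5, 0, 0.5))   # identity outside [-1, 1], spline inside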
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Compute Y = g(X) = 1 - exp( -(X/scale)**2 / 2 ), X >= 0
.
Description
This bijector maps inputs from [0, inf]
to [0, 1]
. The inverse of the
bijector applied to a uniform random variable X ~ U(0, 1)
gives back a
random variable with the
Rayleigh distribution:
Y ~ Rayleigh(scale)
pdf(y; scale, y >= 0) = (1 / scale) * (y / scale) * exp(-(y / scale)**2 / 2)
Usage
tfb_rayleigh_cdf(scale, validate_args = FALSE, name = "rayleigh_cdf")
Arguments
scale |
Positive floating-point tensor.
This is the scale parameter in Y = g(X) = 1 - exp( -(X/scale)**2 / 2 ). |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Details
Likewise, the forward of this bijector is the Rayleigh distribution CDF.
Value
a bijector instance.
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
RealNVP affine coupling layer for vector-valued events
Description
Real NVP models a normalizing flow on a D-dimensional distribution via a
single D-d-dimensional conditional distribution (Dinh et al., 2017):
y[d:D] = x[d:D] * tf.exp(log_scale_fn(x[0:d])) + shift_fn(x[0:d])
y[0:d] = x[0:d]
The last D-d units are scaled and shifted based on the first d units only,
while the first d units are 'masked' and left unchanged. Real NVP's
shift_and_log_scale_fn computes vector-valued quantities.
For scale-and-shift transforms that do not depend on any masked units, i.e.
d=0, use the tfb_affine bijector with learned parameters instead.
Masking is currently only supported for base distributions with
event_ndims=1. For more sophisticated masking schemes like checkerboard or
channel-wise masking (Papamakarios et al., 2016), use the tfb_permute
bijector to re-order desired masked units into the first d units. For base
distributions with event_ndims > 1, use the tfb_reshape bijector to
flatten the event shape.
Usage
tfb_real_nvp(
num_masked,
shift_and_log_scale_fn,
is_constant_jacobian = FALSE,
validate_args = FALSE,
name = NULL
)
Arguments
num_masked |
integer indicating that the first d units of the event should be masked. Must be in the closed interval [0, D-1], where D is the event size of the base distribution. |
shift_and_log_scale_fn |
Function which computes shift and log_scale from both the
forward domain (x) and the inverse domain (y).
Calculation must respect the "autoregressive property". Suggested default:
tfb_real_nvp_default_template(hidden_layers = ...). |
is_constant_jacobian |
Logical, default: FALSE. When TRUE the implementation assumes log_scale does not depend on the forward domain (x) or inverse domain (y) values. (No validation is made; is_constant_jacobian=FALSE is always safe but possibly computationally inefficient.) |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Details
Recall that the MAF bijector (Papamakarios et al., 2016) implements a normalizing flow via an autoregressive transformation. MAF and IAF have opposite computational tradeoffs: MAF can train all units in parallel but must sample units sequentially, while IAF must train units sequentially but can sample in parallel. In contrast, Real NVP can compute both forward and inverse computations in parallel. However, the lack of an autoregressive transformation makes it less expressive on a per-bijector basis.
A "valid" shift_and_log_scale_fn must compute each shift (aka loc or "mu" in Papamakarios et al. (2016) and log(scale) (aka "alpha" in Papamakarios et al. (2016)) such that each are broadcastable with the arguments to forward and inverse, i.e., such that the calculations in forward, inverse below are possible. For convenience, real_nvp_default_nvp is offered as a possible shift_and_log_scale_fn function.
NICE (Dinh et al., 2014) is a special case of the Real NVP bijector which discards the scale transformation, resulting in a constant-time inverse-log-determinant-Jacobian. To use a NICE bijector instead of Real NVP, shift_and_log_scale_fn should return (shift, NULL), and is_constant_jacobian should be set to TRUE in the RealNVP constructor. Calling tfb_real_nvp_default_template with shift_only=TRUE returns one such NICE-compatible shift_and_log_scale_fn.
Caching: the scalar input depth D of the base distribution is not known at
construction time. The first call to any of forward(x), inverse(x),
inverse_log_det_jacobian(x), or forward_log_det_jacobian(x) memoizes
D, which is re-used in subsequent calls. This shape must be known prior to
graph execution (which is the case if using tf$layers
).
Value
a bijector instance.
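A hedged sketch (layer sizes and the base distribution are illustrative assumptions, and the template helper may require graph mode):

library(tfprobability)

nvp <- tfb_real_nvp(
  num_masked = 1L,
  shift_and_log_scale_fn = tfb_real_nvp_default_template(
    hidden_layers = list(16L, 16L)
  )
)
d <- tfd_transformed_distribution(
  distribution = tfd_multivariate_normal_diag(loc = c(0, 0)),
  bijector = nvp
)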
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Build a scale-and-shift function using a multi-layer neural network
Description
This will be wrapped in a make_template to ensure the variables are only
created once. It takes the d-dimensional input x[0:d]
and returns the D-d
dimensional outputs loc ("mu") and log_scale ("alpha").
Usage
tfb_real_nvp_default_template(
hidden_layers,
shift_only = FALSE,
activation = tf$nn$relu,
name = NULL,
...
)
Arguments
hidden_layers |
list-like of non-negative integers, scalars indicating the number of units in each hidden layer. Default: list(512, 512). |
shift_only |
logical indicating if only the shift term shall be computed (i.e. NICE bijector). Default: FALSE. |
activation |
Activation function (callable). Explicitly setting to NULL implies a linear activation. |
name |
A name for ops managed by this function. Default: "tfb_real_nvp_default_template". |
... |
tf$layers$dense arguments |
Details
The default template does not support conditioning and will raise an exception if condition_kwargs are passed to it. To use conditioning with the Real NVP bijector, implement a conditioned shift/scale template that handles the condition_kwargs.
Value
list of:

- shift: Float-like Tensor of shift terms

- log_scale: Float-like Tensor of log(scale) terms
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
A Bijector that computes b(x) = 1. / x
Description
A Bijector that computes b(x) = 1. / x
Usage
tfb_reciprocal(validate_args = FALSE, name = "reciprocal")
Arguments
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Reshapes the event_shape of a Tensor
Description
The semantics generally follow that of tf$reshape()
, with a few differences:
- The user must provide both the input and output shape, so that the transformation can be inverted. If an input shape is not specified, the default assumes a vector-shaped input, i.e., event_shape_in = list(-1).

- The Reshape bijector automatically broadcasts over the leftmost dimensions of its input (sample_shape and batch_shape); only the rightmost event_ndims_in dimensions are reshaped. The number of dimensions to reshape is inferred from the provided event_shape_in (event_ndims_in = length(event_shape_in)).
Usage
tfb_reshape(
event_shape_out,
event_shape_in = c(-1),
validate_args = FALSE,
name = NULL
)
Arguments
event_shape_out |
An integer-like vector-shaped Tensor representing the event shape of the transformed output. |
event_shape_in |
An optional integer-like vector-shaped Tensor representing the event shape of the input. This is required in order to define inverse operations; the default of list(-1) assumes a vector-shaped input. |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
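A minimal sketch: treat a length-6 vector event as a 2 x 3 matrix.

library(tfprobability)

b <- tfb_reshape(event_shape_out = c(2L, 3L))
b %>% tfb_forward(c(1, 2, 3, 4, 5, 6))   # shape (6) -> shape (2, 3)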
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Compute Y = g(X; scale) = scale * X
.
Description
Examples:
Y <- 2 * X
b <- tfb_scale(scale = 2)
Usage
tfb_scale(
scale = NULL,
log_scale = NULL,
validate_args = FALSE,
name = "scale"
)
Arguments
scale |
Floating-point Tensor. If this is set to NULL, log_scale must be set. |
log_scale |
Floating-point Tensor. Logarithm of the scale. If this is set to NULL, scale must be set. |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Compute Y = g(X; scale) = scale @ X
Description
In TF parlance, the scale
term is logically equivalent to:
scale = tf$diag(scale_diag)
The scale
term is applied without materializing a full dense matrix.
Usage
tfb_scale_matvec_diag(
scale_diag,
adjoint = FALSE,
validate_args = FALSE,
name = "scale_matvec_diag",
dtype = NULL
)
Arguments
scale_diag |
Floating-point Tensor representing the diagonal matrix. scale_diag has shape [N1, N2, ..., k], which represents a k x k diagonal matrix. |
adjoint |
Logical indicating whether to use the scale matrix as specified or its adjoint. Default value: FALSE. |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
dtype |
dtype to prefer when converting args to Tensors; if NULL, a common dtype is inferred from the arguments. |
Value
a bijector instance.
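A minimal sketch:

library(tfprobability)

b <- tfb_scale_matvec_diag(scale_diag = c(1, 2))
b %>% tfb_forward(c(3, 3))   # c(3, 6)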
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Compute Y = g(X; scale) = scale @ X
.
Description
scale
is a LinearOperator
.
If X
is a scalar then the forward transformation is: scale * X
where *
denotes broadcasted elementwise product.
Usage
tfb_scale_matvec_linear_operator(
scale,
adjoint = FALSE,
validate_args = FALSE,
name = "scale_matvec_linear_operator"
)
Arguments
scale |
Subclass of LinearOperator, representing the (batch, non-singular) linear transformation. |
adjoint |
Logical indicating whether to use the scale matrix as specified or its adjoint. Default value: FALSE. |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
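A minimal sketch using a diagonal LinearOperator (the operator choice is an arbitrary illustration):

library(tensorflow)
library(tfprobability)

op <- tf$linalg$LinearOperatorDiag(diag = c(1, 2))
b <- tfb_scale_matvec_linear_operator(scale = op)
b %>% tfb_forward(c(3, 3))   # c(3, 6)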
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_lu()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Matrix-vector multiply using LU decomposition.
Description
This bijector is identical to the "Convolution1x1" used in Glow (Kingma and Dhariwal, 2018).
Usage
tfb_scale_matvec_lu(
lower_upper,
permutation,
validate_args = FALSE,
name = NULL
)
Arguments
lower_upper |
The LU factorization as returned by tf$linalg$lu. |
permutation |
The LU factorization permutation as returned by tf$linalg$lu. |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_tri_l()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Compute Y = g(X; scale) = scale @ X
.
Description
The scale
term is presumed lower-triangular and non-singular (i.e., no zeros
on the diagonal), which permits efficient determinant calculation (linear in
matrix dimension, instead of cubic).
Usage
tfb_scale_matvec_tri_l(
scale_tril,
adjoint = FALSE,
validate_args = FALSE,
name = "scale_matvec_tril",
dtype = NULL
)
Arguments
scale_tril |
Floating-point Tensor representing the lower triangular matrix. scale_tril has shape [N1, N2, ..., k, k], which represents a k x k lower triangular matrix. |
adjoint |
Logical indicating whether to use the scale matrix as specified or its adjoint. Default value: FALSE. |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
dtype |
dtype to prefer when converting args to Tensors; if NULL, a common dtype is inferred from the arguments. |
Value
a bijector instance.
See Also
For usage examples see tfb_forward()
, tfb_inverse()
, tfb_inverse_log_det_jacobian()
.
Other bijectors:
tfb_absolute_value()
,
tfb_affine_linear_operator()
,
tfb_affine_scalar()
,
tfb_affine()
,
tfb_ascending()
,
tfb_batch_normalization()
,
tfb_blockwise()
,
tfb_chain()
,
tfb_cholesky_outer_product()
,
tfb_cholesky_to_inv_cholesky()
,
tfb_correlation_cholesky()
,
tfb_cumsum()
,
tfb_discrete_cosine_transform()
,
tfb_expm1()
,
tfb_exp()
,
tfb_ffjord()
,
tfb_fill_scale_tri_l()
,
tfb_fill_triangular()
,
tfb_glow()
,
tfb_gompertz_cdf()
,
tfb_gumbel_cdf()
,
tfb_gumbel()
,
tfb_identity()
,
tfb_inline()
,
tfb_invert()
,
tfb_iterated_sigmoid_centered()
,
tfb_kumaraswamy_cdf()
,
tfb_kumaraswamy()
,
tfb_lambert_w_tail()
,
tfb_masked_autoregressive_default_template()
,
tfb_masked_autoregressive_flow()
,
tfb_masked_dense()
,
tfb_matrix_inverse_tri_l()
,
tfb_matvec_lu()
,
tfb_normal_cdf()
,
tfb_ordered()
,
tfb_pad()
,
tfb_permute()
,
tfb_power_transform()
,
tfb_rational_quadratic_spline()
,
tfb_rayleigh_cdf()
,
tfb_real_nvp_default_template()
,
tfb_real_nvp()
,
tfb_reciprocal()
,
tfb_reshape()
,
tfb_scale_matvec_diag()
,
tfb_scale_matvec_linear_operator()
,
tfb_scale_matvec_lu()
,
tfb_scale_tri_l()
,
tfb_scale()
,
tfb_shifted_gompertz_cdf()
,
tfb_shift()
,
tfb_sigmoid()
,
tfb_sinh_arcsinh()
,
tfb_sinh()
,
tfb_softmax_centered()
,
tfb_softplus()
,
tfb_softsign()
,
tfb_split()
,
tfb_square()
,
tfb_tanh()
,
tfb_transform_diagonal()
,
tfb_transpose()
,
tfb_weibull_cdf()
,
tfb_weibull()
Transforms unconstrained vectors to TriL matrices with positive diagonal
Description
This is implemented as a simple tfb_chain of tfb_fill_triangular followed by tfb_transform_diagonal, and provided mostly as a convenience. The default setup is somewhat opinionated, using a Softplus transformation followed by a small shift (1e-5) which attempts to avoid numerical issues from zeros on the diagonal.
Usage
tfb_scale_tri_l(
diag_bijector = NULL,
diag_shift = 1e-05,
validate_args = FALSE,
name = "scale_tril"
)
Arguments
diag_bijector |
Bijector instance, used to transform the output diagonal to be positive.
Default value: NULL (i.e., the default softplus-plus-shift setup described above). |
diag_shift |
Float value broadcastable and added to all diagonal entries after applying the diag_bijector. Setting a positive value forces the output diagonal entries to be positive, but prevents inverting the transformation for matrices with diagonal entries less than this value. Default value: 1e-5. |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
Compute Y = g(X; shift) = X + shift.
Description
where shift is a numeric Tensor.
Usage
tfb_shift(shift, validate_args = FALSE, name = "shift")
Arguments
shift |
floating-point tensor |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
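A quick illustrative sketch (values are arbitrary): the forward adds the shift, the inverse subtracts it.
library(tfprobability)
b <- tfb_shift(shift = c(1, 2, 3))
b %>% tfb_forward(c(0, 0, 0))   # => 1 2 3
b %>% tfb_inverse(c(1, 2, 3))   # => 0 0 0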
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
Compute Y = g(X) = (1 - exp(-rate * X)) * exp(-c * exp(-rate * X))
Description
This bijector maps inputs from [0, inf] to [0, 1]. The inverse of the bijector applied to a uniform random variable X ~ U(0, 1) gives back a random variable with the Shifted Gompertz distribution:
Y ~ ShiftedGompertzCDF(concentration, rate)
pdf(y; c, r) = r * exp(-r * y - exp(-r * y) / c) * (1 + (1 - exp(-r * y)) / c)
Usage
tfb_shifted_gompertz_cdf(
concentration,
rate,
validate_args = FALSE,
name = "shifted_gompertz_cdf"
)
Arguments
concentration |
Positive Float-like |
rate |
Positive Float-like |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Details
Note: Even though this is called ShiftedGompertzCDF
, when applied to the
Uniform
distribution, this is not the same as applying a GompertzCDF
with
a Shift
bijector (i.e. the Shifted Gompertz distribution is not the same as
a Gompertz distribution with a location parameter).
Note: Because the Shifted Gompertz distribution concentrates its mass close
to zero, for larger rates or larger concentrations, bijector$forward
will
quickly saturate to 1.
Value
a bijector instance.
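A hedged sketch with illustrative parameters; note how the forward saturates towards 1 for larger inputs, as described above.
library(tfprobability)
b <- tfb_shifted_gompertz_cdf(concentration = 0.5, rate = 1)
b %>% tfb_forward(c(0.1, 1, 10))   # values in (0, 1), close to 1 at x = 10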
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
Computes Y = g(X) = 1 / (1 + exp(-X))
Description
Computes Y = g(X) = 1 / (1 + exp(-X))
Usage
tfb_sigmoid(validate_args = FALSE, name = "sigmoid")
Arguments
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
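A minimal sketch (illustrative inputs): the forward maps the real line into (0, 1); the inverse is the logit.
library(tfprobability)
b <- tfb_sigmoid()
b %>% tfb_forward(c(-2, 0, 2))   # => approx 0.119, 0.5, 0.881
b %>% tfb_inverse(0.5)           # => 0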
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
Bijector that computes Y = sinh(X).
Description
Bijector that computes Y = sinh(X).
Usage
tfb_sinh(validate_args = FALSE, name = "sinh")
Arguments
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
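A minimal sketch (illustrative inputs); the inverse is asinh.
library(tfprobability)
b <- tfb_sinh()
b %>% tfb_forward(c(-1, 0, 1))   # => approx -1.175, 0, 1.175
b %>% tfb_inverse(1.1752)        # => approx 1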
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
Computes Y = g(X) = Sinh( (Arcsinh(X) + skewness) * tailweight )
Description
For skewness in (-inf, inf) and tailweight in (0, inf), this transformation is a diffeomorphism of the real line (-inf, inf). The inverse transform is X = g^{-1}(Y) = Sinh( ArcSinh(Y) / tailweight - skewness ). The SinhArcsinh transformation of the Normal is described in Sinh-arcsinh distributions.
Usage
tfb_sinh_arcsinh(
skewness = NULL,
tailweight = NULL,
validate_args = FALSE,
name = "SinhArcsinh"
)
Arguments
skewness |
Skewness parameter. Float-type Tensor. Default is 0 of type float32. |
tailweight |
Tailweight parameter. Positive Tensor of same dtype as skewness and broadcastable shape. Default is 1 of type float32. |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Details
This Bijector allows a similar transformation of any distribution supported on (-inf, inf)
.
Value
a bijector instance.
Meaning of the parameters
- If skewness = 0 and tailweight = 1, this transform is the identity.
- Positive (negative) skewness leads to positive (negative) skew. Positive skew means that, for unimodal X centered at zero, the mode of Y is "tilted" to the right, i.e., positive values of Y become more likely and negative values become less likely.
- Larger (smaller) tailweight leads to fatter (thinner) tails. Fatter tails mean larger values of |Y| become more likely.
- If X is a unit Normal, tailweight < 1 leads to a distribution that is "flat" around Y = 0, with a very steep drop-off in the tails.
- If X is a unit Normal, tailweight > 1 leads to a distribution more peaked at the mode with heavier tails. To see the argument about the tails, note that for |X| >> 1 and |X| >> (|skewness| * tailweight)**tailweight, we have Y approx 0.5 X**tailweight e**(sign(X) skewness * tailweight).
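A hedged sketch with illustrative parameter values; the default (skewness 0, tailweight 1) reduces to the identity, as noted above.
library(tfprobability)
b <- tfb_sinh_arcsinh(skewness = 0.5, tailweight = 2)
b %>% tfb_forward(c(-1, 0, 1))                    # skewed, heavier-tailed transform
tfb_sinh_arcsinh() %>% tfb_forward(c(-1, 0, 1))   # => -1 0 1 (identity)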
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
Computes Y = g(X) = exp([X 0]) / sum(exp([X 0]))
Description
To implement softmax as a bijection, the forward transformation appends a value to the input and the inverse removes this coordinate. The appended coordinate represents a pivot, e.g., softmax(x) = exp(x-c) / sum(exp(x-c)) where c is the implicit last coordinate.
Usage
tfb_softmax_centered(validate_args = FALSE, name = "softmax_centered")
Arguments
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Details
At first blush it may seem like the Invariance of domain theorem implies this implementation is not a bijection. However, the appended dimension makes the (forward) image non-open and the theorem does not directly apply.
Value
a bijector instance.
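A minimal sketch (illustrative input): a length-2 unconstrained vector maps to a length-3 probability vector, and the pivot coordinate is dropped again on inversion.
library(tfprobability)
b <- tfb_softmax_centered()
y <- b %>% tfb_forward(c(1, 2))   # length 3, non-negative, sums to 1
b %>% tfb_inverse(y)              # => approx 1 2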
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
Computes Y = g(X) = Log[1 + exp(X)]
Description
The softplus Bijector has the following two useful properties:
- The domain is the positive real numbers.
- softplus(x) approx x, for large x, so it does not overflow as easily as the Exp Bijector.
Usage
tfb_softplus(
hinge_softness = NULL,
low = NULL,
validate_args = FALSE,
name = "softplus"
)
Arguments
hinge_softness |
Nonzero floating point Tensor. Controls the softness of what would otherwise be a kink at the origin. Default is 1.0. |
low |
Nonzero floating point tensor, lower bound on output values.
Implicitly zero if |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Details
The optional nonzero hinge_softness parameter changes the transition at zero. With hinge_softness = c, the bijector is:
f_c(x) := c * g(x / c) = c * Log[1 + exp(x / c)]
For large x >> 1,
c * Log[1 + exp(x / c)] approx c * Log[exp(x / c)] = x
so the behavior for large x is the same as the standard softplus. As c > 0 approaches 0 from the right, f_c(x) becomes less and less soft, approaching max(0, x).
- c = 1 is the default.
- c > 0 but small means f(x) approx ReLu(x) = max(0, x).
- c < 0 flips sign and reflects around the y-axis: f_{-c}(x) = -f_c(-x).
- c = 0 results in a non-bijective transformation and triggers an exception.
Note: log(.) and exp(.) are applied element-wise but the Jacobian is a reduction over the event space.
Value
a bijector instance.
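A minimal sketch (illustrative inputs): outputs are positive, and large inputs pass through almost unchanged; a small hinge_softness approaches max(0, x).
library(tfprobability)
b <- tfb_softplus()
b %>% tfb_forward(c(-5, 0, 5))   # => approx 0.0067, 0.693, 5.0067
tfb_softplus(hinge_softness = 0.1) %>% tfb_forward(c(-1, 1))  # close to max(0, x)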
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
Computes Y = g(X) = X / (1 + |X|)
Description
The softsign Bijector has the following two useful properties:
- The domain is all real numbers.
- softsign(x) approx sgn(x), for large |x|.
Usage
tfb_softsign(validate_args = FALSE, name = "softsign")
Arguments
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
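A minimal sketch (illustrative inputs): outputs lie in (-1, 1) and approach the sign of the input for large |x|.
library(tfprobability)
b <- tfb_softsign()
b %>% tfb_forward(c(-10, 0, 10))   # => approx -0.909, 0, 0.909
b %>% tfb_inverse(0.5)             # => 1, since 1 / (1 + 1) = 0.5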
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
Split a Tensor event along an axis into a list of Tensors.
Description
The inverse of split concatenates a list of Tensors along axis.
Usage
tfb_split(num_or_size_splits, axis = -1, validate_args = FALSE, name = "split")
Arguments
num_or_size_splits |
Either an integer indicating the number of
splits along |
axis |
A negative integer or scalar |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
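A minimal sketch (illustrative sizes): a length-6 event is split into pieces of sizes 2, 3, and 1 along the last axis.
library(tfprobability)
b <- tfb_split(num_or_size_splits = c(2L, 3L, 1L))
parts <- b %>% tfb_forward(c(1, 2, 3, 4, 5, 6))   # list of three tensors
b %>% tfb_inverse(parts)                          # concatenates back to length 6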
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
Computes g(X) = X^2; X is a positive real number.
Description
g is a bijection between the non-negative real numbers (R_+) and the non-negative real numbers.
Usage
tfb_square(validate_args = FALSE, name = "square")
Arguments
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
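A minimal sketch (illustrative inputs); the inverse is the positive square root.
library(tfprobability)
b <- tfb_square()
b %>% tfb_forward(c(1, 2, 3))   # => 1 4 9
b %>% tfb_inverse(c(4, 9))      # => 2 3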
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
Computes Y = tanh(X)
Description
Y = tanh(X), therefore Y in (-1, 1).
Usage
tfb_tanh(validate_args = FALSE, name = "tanh")
Arguments
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Details
This can be achieved by an affine transform of the Sigmoid bijector, i.e., it is equivalent to
tfb_chain(list(tfb_affine(shift = -1, scale = 2),
tfb_sigmoid(),
tfb_affine(scale = 2)))
However, using the Tanh bijector directly is slightly faster and more numerically stable.
Value
a bijector instance.
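A hedged sketch of a typical use (not from the package docs): pushing a standard normal through tfb_tanh() yields a distribution supported on (-1, 1).
library(tfprobability)
d <- tfd_normal(loc = 0, scale = 1) %>%
  tfd_transformed_distribution(bijector = tfb_tanh())
d %>% tfd_sample(3)   # samples lie in (-1, 1)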
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
Applies a Bijector to the diagonal of a matrix
Description
Applies a Bijector to the diagonal of a matrix
Usage
tfb_transform_diagonal(
diag_bijector,
validate_args = FALSE,
name = "transform_diagonal"
)
Arguments
diag_bijector |
Bijector instance used to transform the diagonal. |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Value
a bijector instance.
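A minimal sketch (illustrative matrix): only the diagonal is transformed; off-diagonal entries pass through unchanged.
library(tfprobability)
b <- tfb_transform_diagonal(diag_bijector = tfb_exp())
x <- matrix(c(1, 0, 2, 3), nrow = 2)   # 2 x 2 input, diagonal entries 1 and 3
b %>% tfb_forward(x)                   # diagonal becomes exp(1), exp(3)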
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transpose(), tfb_weibull_cdf(), tfb_weibull()
Computes Y = g(X) = transpose_rightmost_dims(X, rightmost_perm)
Description
This bijector is semantically similar to tf$transpose except that it transposes only the rightmost "event" dimensions. That is, unlike tf$transpose, the perm argument is itself a permutation of tf$range(rightmost_transposed_ndims) rather than tf$range(tf$rank(x)), i.e., users specify the (rightmost) dimensions to permute, not all dimensions.
Usage
tfb_transpose(
perm = NULL,
rightmost_transposed_ndims = NULL,
validate_args = FALSE,
name = "transpose"
)
Arguments
perm |
Positive integer vector-shaped Tensor representing permutation of
rightmost dims (for forward transformation). Note that the 0th index
represents the first of the rightmost dims and the largest value must be
rightmost_transposed_ndims - 1 and corresponds to |
rightmost_transposed_ndims |
Positive integer scalar-shaped Tensor
representing the number of rightmost dimensions to permute.
Only one of perm and rightmost_transposed_ndims can (and must) be
specified. Default value: |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Details
The actual (forward) transformation is:
sample_batch_ndims <- tf$rank(x) - tf$size(perm)
perm <- tf$concat(list(tf$range(sample_batch_ndims), sample_batch_ndims + perm), axis = 0L)
tf$transpose(x, perm)
Value
a bijector instance.
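A minimal sketch (illustrative shapes): perm is 0-based over the rightmost dimensions, so c(1L, 0L) swaps the last two axes.
library(tfprobability)
b <- tfb_transpose(perm = c(1L, 0L))
x <- matrix(1:6, nrow = 2)   # shape (2, 3)
b %>% tfb_forward(x)         # shape (3, 2)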
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_weibull_cdf(), tfb_weibull()
Computes Y = g(X) = 1 - exp((-X / scale) ** concentration) where X >= 0
Description
This bijector maps inputs from [0, inf] to [0, 1]. The inverse of the bijector applied to a uniform random variable X ~ U(0, 1) gives back a random variable with the Weibull distribution:
Usage
tfb_weibull(
scale = 1,
concentration = 1,
validate_args = FALSE,
name = "weibull"
)
Arguments
scale |
Positive Float-type Tensor that is the same dtype and is
broadcastable with concentration.
This is l in |
concentration |
Positive Float-type Tensor that is the same dtype and is
broadcastable with scale.
This is k in |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Details
Y ~ Weibull(scale, concentration)
pdf(y; scale, concentration, y >= 0) = (concentration / scale) * (y / scale)**(concentration - 1) * exp(-(y / scale)**concentration)
Value
a bijector instance.
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull_cdf()
Compute Y = g(X) = 1 - exp((-X / scale) ** concentration), X >= 0.
Description
This bijector maps inputs from [0, inf] to [0, 1]. The inverse of the bijector applied to a uniform random variable X ~ U(0, 1) gives back a random variable with the Weibull distribution:
Y ~ Weibull(scale, concentration)
pdf(y; scale, concentration, y >= 0) = (concentration / scale) * (y / scale)**(concentration - 1) * exp(-(y / scale)**concentration)
Usage
tfb_weibull_cdf(
scale = 1,
concentration = 1,
validate_args = FALSE,
name = "weibull_cdf"
)
Arguments
scale |
Positive Float-type |
concentration |
Positive Float-type |
validate_args |
Logical, default FALSE. Whether to validate input with asserts. If validate_args is FALSE, and the inputs are invalid, correct behavior is not guaranteed. |
name |
name prefixed to Ops created by this class. |
Details
Likewise, the forward of this bijector is the Weibull distribution CDF.
Value
a bijector instance.
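A hedged sketch with illustrative parameters: the forward evaluates the Weibull CDF, so outputs lie in [0, 1].
library(tfprobability)
b <- tfb_weibull_cdf(scale = 2, concentration = 1.5)
b %>% tfb_forward(c(0.5, 1, 2))   # 1 - exp(-(x / 2)^1.5) at each point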
See Also
For usage examples see tfb_forward(), tfb_inverse(), tfb_inverse_log_det_jacobian().
Other bijectors: tfb_absolute_value(), tfb_affine_linear_operator(), tfb_affine_scalar(), tfb_affine(), tfb_ascending(), tfb_batch_normalization(), tfb_blockwise(), tfb_chain(), tfb_cholesky_outer_product(), tfb_cholesky_to_inv_cholesky(), tfb_correlation_cholesky(), tfb_cumsum(), tfb_discrete_cosine_transform(), tfb_expm1(), tfb_exp(), tfb_ffjord(), tfb_fill_scale_tri_l(), tfb_fill_triangular(), tfb_glow(), tfb_gompertz_cdf(), tfb_gumbel_cdf(), tfb_gumbel(), tfb_identity(), tfb_inline(), tfb_invert(), tfb_iterated_sigmoid_centered(), tfb_kumaraswamy_cdf(), tfb_kumaraswamy(), tfb_lambert_w_tail(), tfb_masked_autoregressive_default_template(), tfb_masked_autoregressive_flow(), tfb_masked_dense(), tfb_matrix_inverse_tri_l(), tfb_matvec_lu(), tfb_normal_cdf(), tfb_ordered(), tfb_pad(), tfb_permute(), tfb_power_transform(), tfb_rational_quadratic_spline(), tfb_rayleigh_cdf(), tfb_real_nvp_default_template(), tfb_real_nvp(), tfb_reciprocal(), tfb_reshape(), tfb_scale_matvec_diag(), tfb_scale_matvec_linear_operator(), tfb_scale_matvec_lu(), tfb_scale_matvec_tri_l(), tfb_scale_tri_l(), tfb_scale(), tfb_shifted_gompertz_cdf(), tfb_shift(), tfb_sigmoid(), tfb_sinh_arcsinh(), tfb_sinh(), tfb_softmax_centered(), tfb_softplus(), tfb_softsign(), tfb_split(), tfb_square(), tfb_tanh(), tfb_transform_diagonal(), tfb_transpose(), tfb_weibull()
Autoregressive distribution
Description
The Autoregressive distribution enables learning (often) richer multivariate distributions by repeatedly applying a diffeomorphic transformation (such as implemented by Bijectors).
Usage
tfd_autoregressive(
distribution_fn,
sample0 = NULL,
num_steps = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Autoregressive"
)
Arguments
distribution_fn |
Function which constructs a |
sample0 |
Initial input to |
num_steps |
Number of times |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Regarding terminology, "Autoregressive models decompose the joint density as a product of conditionals, and model each conditional in turn. Normalizing flows transform a base density (e.g. a standard Gaussian) into the target density by an invertible transformation with tractable Jacobian." (Papamakarios et al., 2016)
In other words, the "autoregressive property" is equivalent to the decomposition p(x) = prod{ p(x[i] | x[0:i]) : i = 0, ..., d }. The provided shift_and_log_scale_fn, tfb_masked_autoregressive_default_template, achieves this property by zeroing out weights in its masked_dense layers.
Practically speaking the autoregressive property means that there exists a permutation of the event coordinates such that each coordinate is a diffeomorphic function of only preceding coordinates (van den Oord et al., 2016).
Mathematical Details
The probability function is
prob(x; fn, n) = fn(x).prob(x)
And a sample is generated by
x = fn(...fn(fn(x0).sample()).sample()).sample()
where the ellipses (...) represent n-2 composed calls to fn, fn constructs a tfd$Distribution-like instance, and x0 is a fixed initializing Tensor.
Value
a distribution instance.
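A hedged sketch (the helper make_dist and all values are illustrative, not from the package docs): each call to distribution_fn conditions on the previous sample.
library(tensorflow)
library(tfprobability)
# distribution_fn: a vector normal whose location is the previous sample
make_dist <- function(x)
  tfd_independent(tfd_normal(loc = x, scale = 1),
                  reinterpreted_batch_ndims = 1)
d <- tfd_autoregressive(
  distribution_fn = make_dist,
  sample0 = tf$zeros(2L),   # fixed initializing tensor x0
  num_steps = 2L
)
d %>% tfd_sample()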
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
Batch-Reshaping distribution
Description
This "meta-distribution" reshapes the batch dimensions of another distribution.
Usage
tfd_batch_reshape(
distribution,
batch_shape,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = NULL
)
Arguments
distribution |
The base distribution instance to reshape. Typically an
instance of |
batch_shape |
Positive |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Value
a distribution instance.
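A minimal sketch (illustrative shapes): a batch of six scalar normals is reshaped into a 2 x 3 batch.
library(tfprobability)
base <- tfd_normal(loc = rep(0, 6), scale = 1)   # batch shape (6)
d <- tfd_batch_reshape(base, batch_shape = c(2L, 3L))
d %>% tfd_sample()   # a single draw has shape (2, 3)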
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
Bates distribution.
Description
The Bates distribution is the distribution of the average of total_count
independent samples from Uniform(low, high)
. It is parameterized by the
interval bounds low
and high
, and total_count
, the number of samples.
Although some care has been taken to avoid numerical issues, the pdf
, cdf
,
and log versions thereof may still exhibit numerical instability. They are
relatively stable near the tails; however near the mode they are unstable if
total_count
is greater than about 75
for tf$float64
, 25
for
tf$float32
, and 7
for tf$float16
. Beyond these limits a warning will be
shown if validate_args=FALSE
; otherwise an exception is thrown. For high
total_count
, consider using a Normal
approximation.
Usage
tfd_bates(
total_count,
low = 0,
high = 1,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Bates"
)
Arguments
total_count |
Non-negative integer-valued |
low |
Floating point |
high |
Floating point |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) is supported in the interval [low, high]. If [low, high] is the unit interval [0, 1], the pdf is,
pdf(x; n, 0, 1) = (n / (n-1)!) sum_{k=0}^j (-1)^k (n choose k) (nx - k)^{n-1}
where
- total_count = n,
- j = floor(nx),
- n! is the factorial of n,
- (n choose k) is the binomial coefficient n! / (k!(n - k)!).
For arbitrary intervals [low, high], the pdf is,
pdf(x; n, low, high) = pdf((x - low) / (high - low); n, 0, 1) / (high - low)
Value
a distribution instance.
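A minimal sketch (illustrative count): the average of five Uniform(0, 1) draws has mean 0.5.
library(tfprobability)
d <- tfd_bates(total_count = 5)
d %>% tfd_mean()     # => 0.5
d %>% tfd_sample(3)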
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
Bernoulli distribution
Description
The Bernoulli distribution with probs parameter, i.e., the probability of a 1 outcome (vs a 0 outcome).
Usage
tfd_bernoulli(
logits = NULL,
probs = NULL,
dtype = tf$int32,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Bernoulli"
)
Arguments
logits |
An N-D Tensor representing the log-odds of a 1 event. Each entry in the Tensor parametrizes an independent Bernoulli distribution where the probability of an event is sigmoid(logits). Only one of logits or probs should be passed in. |
probs |
An N-D Tensor representing the probability of a 1 event. Each entry in the Tensor parameterizes an independent Bernoulli distribution. Only one of logits or probs should be passed in. |
dtype |
The type of the event samples. Default: int32. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Value
a distribution instance.
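A minimal sketch (illustrative probability):
library(tfprobability)
d <- tfd_bernoulli(probs = 0.7)
d %>% tfd_sample(5)      # five 0/1 draws
d %>% tfd_mean()         # => 0.7
d %>% tfd_log_prob(1)    # => log(0.7)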
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
Beta distribution
Description
The Beta distribution is defined over the (0, 1) interval using parameters concentration1 (aka "alpha") and concentration0 (aka "beta").
Usage
tfd_beta(
concentration1 = NULL,
concentration0 = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Beta"
)
Arguments
concentration1 |
Positive floating-point |
concentration0 |
Positive floating-point |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) is,
pdf(x; alpha, beta) = x**(alpha - 1) (1 - x)**(beta - 1) / Z
Z = Gamma(alpha) Gamma(beta) / Gamma(alpha + beta)
where:
- concentration1 = alpha,
- concentration0 = beta,
- Z is the normalization constant, and
- Gamma is the gamma function.
The concentration parameters represent mean total counts of a 1 or a 0, i.e.,
concentration1 = alpha = mean * total_concentration
concentration0 = beta = (1. - mean) * total_concentration
where mean
in (0, 1)
and total_concentration
is a positive real number
representing a mean total_count = concentration1 + concentration0
.
Distribution parameters are automatically broadcast in all functions; see
examples for details.
Warning: The samples can be zero due to finite precision.
This happens more often when some of the concentrations are very small.
Make sure to round the samples to np$finfo(dtype)$tiny
before computing the density.
Samples of this distribution are reparameterized (pathwise differentiable).
The derivatives are computed using the approach described in the paper
Michael Figurnov, Shakir Mohamed, Andriy Mnih. Implicit Reparameterization Gradients, 2018
Value
a distribution instance.
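A minimal sketch (illustrative concentrations alpha = 2, beta = 3):
library(tfprobability)
d <- tfd_beta(concentration1 = 2, concentration0 = 3)
d %>% tfd_mean()     # => 2 / (2 + 3) = 0.4
d %>% tfd_sample(2)  # draws in (0, 1)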
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
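Examples
A minimal sketch of constructing the distribution and checking its mean against the parameterization above; values in comments assume a working TensorFlow Probability install:
d <- tfd_beta(concentration1 = 2, concentration0 = 3)
d %>% tfd_mean()     # alpha / (alpha + beta) = 0.4
d %>% tfd_sample(5)  # 5 draws in (0, 1)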
Beta-Binomial compound distribution
Description
The Beta-Binomial distribution is parameterized by (a batch of) total_count
parameters, the number of trials per draw from Binomial distributions where
the probabilities of success per trial are drawn from underlying Beta
distributions; the Beta distributions are parameterized by concentration1
(aka 'alpha') and concentration0
(aka 'beta').
Mathematically, it is (equivalent to) a special case of the
Dirichlet-Multinomial over two classes, although the computational
representation is slightly different: while the Beta-Binomial is a
distribution over the number of successes in total_count
trials, the
two-class Dirichlet-Multinomial is a distribution over the number of successes
and failures.
Usage
tfd_beta_binomial(
total_count,
concentration1,
concentration0,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "BetaBinomial"
)
Arguments
total_count |
Non-negative integer-valued tensor, whose dtype is the same
as |
concentration1 |
Positive floating-point |
concentration0 |
Positive floating-point |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The Beta-Binomial is a distribution over the number of successes in
total_count
independent Binomial trials, with each trial having the same
probability of success, the underlying probability being unknown but drawn
from a Beta distribution with known parameters.
The probability mass function (pmf) is,
pmf(k; n, a, b) = Beta(k + a, n - k + b) / Z
Z = (k! (n - k)! / n!) * Beta(a, b)
where:
- concentration1 = a > 0,
- concentration0 = b > 0,
- total_count = n, n a positive integer,
- n! is n factorial,
- Beta(x, y) = Gamma(x) Gamma(y) / Gamma(x + y) is the beta function, and
- Gamma is the gamma function.
The Beta-Binomial is a compound distribution, i.e., its samples are generated as follows.
Choose success probabilities:
probs ~ Beta(concentration1, concentration0)
Draw integers representing the number of successes:
counts ~ Binomial(total_count, probs)
Distribution parameters are automatically broadcast in all functions; see examples for details.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
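Examples
A brief sketch of the compound structure described above; the mean is total_count times the mean of the underlying Beta:
d <- tfd_beta_binomial(total_count = 10, concentration1 = 2, concentration0 = 3)
d %>% tfd_mean()     # 10 * 2 / (2 + 3) = 4
d %>% tfd_sample(5)  # integer-valued counts between 0 and 10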
Binomial distribution
Description
This distribution is parameterized by probs
, a (batch of) probabilities for
drawing a 1
and total_count
, the number of trials per draw from the
Binomial.
Usage
tfd_binomial(
total_count,
logits = NULL,
probs = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Beta"
)
Arguments
total_count |
Non-negative floating point tensor with shape broadcastable
to |
logits |
Floating point tensor representing the log-odds of a
positive event with shape broadcastable to |
probs |
Positive floating point tensor with shape broadcastable to
|
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The Binomial is a distribution over the number of 1
's in total_count
independent trials, with each trial having the same probability of 1
, i.e.,
probs
.
The probability mass function (pmf) is,
pmf(k; n, p) = p**k (1 - p)**(n - k) / Z
Z = k! (n - k)! / n!
where:
- total_count = n,
- probs = p,
- Z is the normalizing constant, and
- n! is the factorial of n.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
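Examples
A short sketch; either probs or logits fixes the per-trial success probability:
d <- tfd_binomial(total_count = 10, probs = 0.3)
d %>% tfd_mean()   # n * p = 3
d %>% tfd_prob(3)  # choose(10, 3) * 0.3^3 * 0.7^7, about 0.267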
Blockwise distribution
Description
Blockwise distribution
Usage
tfd_blockwise(
distributions,
dtype_override = NULL,
validate_args = FALSE,
allow_nan_stats = FALSE,
name = "Blockwise"
)
Arguments
distributions |
list of Distribution instances. All distribution instances must have the same batch_shape and all must have event_ndims == 1, i.e., be vector-variate distributions. |
dtype_override |
samples of distributions will be cast to this dtype. If
unspecified, all distributions must have the same dtype. Default value:
|
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
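Examples
A minimal sketch: concatenating two vector-variate distributions with event sizes 2 and 3 gives a single distribution over length-5 vectors.
d <- tfd_blockwise(
  list(
    tfd_multivariate_normal_diag(loc = c(0, 0)),
    tfd_dirichlet(concentration = c(1, 1, 1))
  )
)
d %>% tfd_sample()  # a length-5 vector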
Categorical distribution over integers
Description
The Categorical distribution is parameterized by either probabilities or
log-probabilities of a set of K
classes. It is defined over the integers
{0, 1, ..., K-1}
.
Usage
tfd_categorical(
logits = NULL,
probs = NULL,
dtype = tf$int32,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Categorical"
)
Arguments
logits |
An N-D |
probs |
An N-D |
dtype |
The type of the event samples (default: int32). |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
The Categorical distribution is closely related to the OneHotCategorical
and
Multinomial
distributions. The Categorical distribution can be intuited as generating samples according to argmax{ OneHotCategorical(probs) }, itself identical to argmax{ Multinomial(probs, total_count=1) }.
Mathematical Details
The probability mass function (pmf) is,
pmf(k; pi) = prod_j pi_j**[k == j]
Pitfalls
The number of classes, K, must not exceed:
- the largest integer representable by self$dtype, i.e., 2**(mantissa_bits+1) (IEEE 754),
- the maximum Tensor index, i.e., 2**31-1.
Note: This condition is validated only when validate_args = TRUE
.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
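Examples
A small sketch over K = 3 classes; samples take integer values in {0, 1, 2}:
d <- tfd_categorical(probs = c(0.1, 0.5, 0.4))
d %>% tfd_sample(5)
d %>% tfd_prob(2L)  # probability of class 2, i.e. 0.4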
Cauchy distribution with location loc
and scale scale
Description
Mathematical details
Usage
tfd_cauchy(
loc,
scale,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Cauchy"
)
Arguments
loc |
Floating point tensor; the modes of the distribution(s). |
scale |
Floating point tensor; the scales (half-widths at half-maximum) of the distribution(s). Must contain only positive values. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
The probability density function (pdf) is,
pdf(x; loc, scale) = 1 / (pi scale (1 + z**2))
z = (x - loc) / scale
where loc
is the location, and scale
is the scale.
The Cauchy distribution is a member of the location-scale family, i.e.
Y ~ Cauchy(loc, scale)
is equivalent to,
X ~ Cauchy(loc=0, scale=1)
Y = loc + scale * X
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
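Examples
A minimal sketch; the Cauchy has no finite mean, so the median (loc) and quantiles are the natural summaries:
d <- tfd_cauchy(loc = 0, scale = 1)
d %>% tfd_cdf(0)          # 0.5: loc is the median
d %>% tfd_quantile(0.75)  # 1, since tan(pi / 4) = 1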
Cumulative distribution function.
Given random variable X, the cumulative distribution function cdf is:
cdf(x) := P[X <= x]
Description
Cumulative distribution function.
Given random variable X, the cumulative distribution function cdf is:
cdf(x) := P[X <= x]
Usage
tfd_cdf(distribution, value, ...)
Arguments
distribution |
The distribution being used. |
value |
float or double Tensor. |
... |
Additional parameters passed to Python. |
Value
a Tensor of shape sample_shape(x) + self$batch_shape
with values of type self$dtype
.
See Also
Other distribution_methods:
tfd_covariance()
,
tfd_cross_entropy()
,
tfd_entropy()
,
tfd_kl_divergence()
,
tfd_log_cdf()
,
tfd_log_prob()
,
tfd_log_survival_function()
,
tfd_mean()
,
tfd_mode()
,
tfd_prob()
,
tfd_quantile()
,
tfd_sample()
,
tfd_stddev()
,
tfd_survival_function()
,
tfd_variance()
Examples
d <- tfd_normal(loc = c(1, 2), scale = c(1, 0.5))
x <- d %>% tfd_sample()
d %>% tfd_cdf(x)
Chi distribution
Description
The Chi distribution is defined over nonnegative real numbers and uses a degrees of freedom ("df") parameter.
Usage
tfd_chi(df, validate_args = FALSE, allow_nan_stats = TRUE, name = "Chi")
Arguments
df |
Floating point tensor, the degrees of freedom of the distribution(s).
|
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) is,
pdf(x; df, x >= 0) = x**(df - 1) exp(-0.5 x**2) / Z
Z = 2**(0.5 df - 1) Gamma(0.5 df)
where:
- df denotes the degrees of freedom,
- Z is the normalization constant, and
- Gamma is the gamma function.
The Chi distribution is a transformation of the Chi2 distribution; it is the distribution of the positive square root of a variable obeying a Chi2 distribution.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
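Examples
A short sketch; a Chi(3) variate is distributed like the Euclidean norm of a 3-dimensional standard normal vector:
d <- tfd_chi(df = 3)
d %>% tfd_sample(5)  # non-negative draws
d %>% tfd_mean()     # sqrt(2) * Gamma(2) / Gamma(1.5), about 1.6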
Chi Square distribution
Description
The Chi2 distribution is defined over positive real numbers using a degrees of freedom ("df") parameter.
Usage
tfd_chi2(df, validate_args = FALSE, allow_nan_stats = TRUE, name = "Chi2")
Arguments
df |
Floating point tensor, the degrees of freedom of the
distribution(s). |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) is,
pdf(x; df, x > 0) = x**(0.5 df - 1) exp(-0.5 x) / Z
Z = 2**(0.5 df) Gamma(0.5 df)
where:
- df denotes the degrees of freedom,
- Z is the normalization constant, and
- Gamma is the gamma function.
The Chi2 distribution is a special case of the Gamma distribution, i.e.,
Chi2(df) = Gamma(concentration=0.5 * df, rate=0.5)
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
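Examples
A brief check of the Gamma special case noted above: Chi2(df = 4) coincides with Gamma(concentration = 2, rate = 0.5).
d1 <- tfd_chi2(df = 4)
d2 <- tfd_gamma(concentration = 2, rate = 0.5)
d1 %>% tfd_log_prob(3)  # equals d2 %>% tfd_log_prob(3)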
The CholeskyLKJ distribution on cholesky factors of correlation matrices
Description
This is a one-parameter family of distributions on Cholesky factors of correlation matrices.
In other words, if X ~ CholeskyLKJ(c), then X @ X^T ~ LKJ(c).
For more details on the LKJ distribution, see tfd_lkj
.
Usage
tfd_cholesky_lkj(
dimension,
concentration,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "CholeskyLKJ"
)
Arguments
dimension |
|
concentration |
|
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
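Examples
A minimal sketch; each sample is a lower-triangular Cholesky factor, so multiplying it with its transpose (here via tf$matmul from the tensorflow package) recovers a correlation matrix:
d <- tfd_cholesky_lkj(dimension = 3, concentration = 1.5)
x <- d %>% tfd_sample()
tf$matmul(x, x, transpose_b = TRUE)  # a 3 x 3 correlation matrix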
Continuous Bernoulli distribution.
Description
This distribution is parameterized by probs
, a (batch of) parameters
taking values in (0, 1)
. Note that, unlike in the Bernoulli case, probs
does not correspond to a probability, but the same name is used due to the
similarity with the Bernoulli.
Usage
tfd_continuous_bernoulli(
logits = NULL,
probs = NULL,
dtype = tf$float32,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "ContinuousBernoulli"
)
Arguments
logits |
An N-D |
probs |
An N-D |
dtype |
The type of the event samples. Default: |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The continuous Bernoulli is a distribution over the interval [0, 1]
,
parameterized by probs
in (0, 1)
.
The probability density function (pdf) is,
pdf(x; probs) = probs**x * (1 - probs)**(1 - x) * C(probs)
C(probs) = (2 * atanh(1 - 2 * probs) / (1 - 2 * probs) if probs != 0.5 else 2.)
While the normalizing constant C(probs)
is a continuous function of probs
(even at probs = 0.5
), computing it at values close to 0.5 can result in
numerical instabilities due to 0/0 errors. A Taylor approximation of
C(probs)
is thus used for values of probs
in a small interval [lims[0], lims[1]]
around 0.5. For more details,
see Loaiza-Ganem and Cunningham (2019).
NOTE: Unlike the Bernoulli, numerical instabilities can occur for probs very close to 0 or 1. The current implementation allows any value in (0, 1), but this could be changed to (1e-6, 1 - 1e-6) to avoid these issues.
Value
a distribution instance.
References
Loaiza-Ganem G and Cunningham JP. The continuous Bernoulli: fixing a pervasive error in variational autoencoders. NeurIPS 2019. https://arxiv.org/abs/1907.06845
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
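Examples
A short sketch illustrating the note above that probs is not the mean of this distribution, unlike in the Bernoulli case:
d <- tfd_continuous_bernoulli(probs = 0.2)
d %>% tfd_sample(3)  # draws in [0, 1]
d %>% tfd_mean()     # about 0.39, not 0.2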
Covariance.
Description
Covariance is (possibly) defined only for non-scalar-event distributions.
For example, for a length-k, vector-valued distribution, it is calculated as,
Cov[i, j] = Covariance(X_i, X_j) = E[(X_i - E[X_i]) (X_j - E[X_j])]
where Cov is a (batch of) k x k matrix, 0 <= (i, j) < k, and E denotes expectation.
Usage
tfd_covariance(distribution, ...)
Arguments
distribution |
The distribution being used. |
... |
Additional parameters passed to Python. |
Details
Alternatively, for non-vector, multivariate distributions (e.g., matrix-valued, Wishart),
Covariance shall return a (batch of) matrices under some vectorization of the events, i.e.,
Cov[i, j] = Covariance(Vec(X)_i, Vec(X)_j) = [as above]
where Cov is a (batch of) k x k matrices, 0 <= (i, j) < k = reduce_prod(event_shape),
and Vec is some function mapping indices of this distribution's event dimensions to indices of a
length-k vector.
Value
Floating-point Tensor with shape [B1, ..., Bn, k, k]
where the first n dimensions
are batch coordinates and k = reduce_prod(self.event_shape)
.
See Also
Other distribution_methods:
tfd_cdf()
,
tfd_cross_entropy()
,
tfd_entropy()
,
tfd_kl_divergence()
,
tfd_log_cdf()
,
tfd_log_prob()
,
tfd_log_survival_function()
,
tfd_mean()
,
tfd_mode()
,
tfd_prob()
,
tfd_quantile()
,
tfd_sample()
,
tfd_stddev()
,
tfd_survival_function()
,
tfd_variance()
Examples
d <- tfd_normal(loc = c(1, 2), scale = c(1, 0.5))
d %>% tfd_variance()
Computes the (Shannon) cross entropy.
Description
Denote this distribution (self) by P and the other distribution by Q.
Assuming P, Q are absolutely continuous with respect to one another and permit densities
p(x) dr(x) and q(x) dr(x), (Shannon) cross entropy is defined as:
H[P, Q] = E_p[-log q(X)] = -int_F p(x) log q(x) dr(x)
where F denotes the support of the random variable X ~ P
.
Usage
tfd_cross_entropy(distribution, other, name = "cross_entropy")
Arguments
distribution |
The distribution being used. |
other |
|
name |
String prepended to names of ops created by this function. |
Value
cross_entropy: self.dtype Tensor with shape [B1, ..., Bn]
representing n different calculations of (Shannon) cross entropy.
See Also
Other distribution_methods:
tfd_cdf()
,
tfd_covariance()
,
tfd_entropy()
,
tfd_kl_divergence()
,
tfd_log_cdf()
,
tfd_log_prob()
,
tfd_log_survival_function()
,
tfd_mean()
,
tfd_mode()
,
tfd_prob()
,
tfd_quantile()
,
tfd_sample()
,
tfd_stddev()
,
tfd_survival_function()
,
tfd_variance()
Examples
d1 <- tfd_normal(loc = 1, scale = 1)
d2 <- tfd_normal(loc = 2, scale = 1)
d1 %>% tfd_cross_entropy(d2)
Scalar Deterministic
distribution on the real line
Description
The scalar Deterministic
distribution is parameterized by a (batch) point
loc
on the real line. The distribution is supported at this point only,
and corresponds to a random variable that is constant, equal to loc
.
See also: degenerate random variable.
Usage
tfd_deterministic(
loc,
atol = NULL,
rtol = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Deterministic"
)
Arguments
loc |
Numeric |
atol |
Non-negative |
rtol |
Non-negative |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability mass function (pmf) and cumulative distribution function (cdf) are
pmf(x; loc) = 1, if x == loc, else 0
cdf(x; loc) = 1, if x >= loc, else 0
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
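Examples
A minimal sketch; all mass sits at loc, so samples are constant and the pmf is an indicator:
d <- tfd_deterministic(loc = 1.5)
d %>% tfd_sample(3)  # always 1.5
d %>% tfd_prob(1.5)  # 1
d %>% tfd_prob(2)    # 0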
Dirichlet distribution
Description
The Dirichlet distribution is defined over the
(k-1)
-simplex using a positive,
length-k
vector concentration
(k > 1
). The Dirichlet is identically the
Beta distribution when k = 2
.
Usage
tfd_dirichlet(
concentration,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Dirichlet"
)
Arguments
concentration |
Positive floating-point |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The Dirichlet is a distribution over the open (k-1)
-simplex, i.e.,
S^{k-1} = { (x_0, ..., x_{k-1}) in R^k : sum_j x_j = 1 and all_j x_j > 0 }.
The probability density function (pdf) is,
pdf(x; alpha) = prod_j x_j**(alpha_j - 1) / Z
Z = prod_j Gamma(alpha_j) / Gamma(sum_j alpha_j)
where:
- x in S^{k-1}, i.e., the (k-1)-simplex,
- concentration = alpha = [alpha_0, ..., alpha_{k-1}], alpha_j > 0,
- Z is the normalization constant, aka the multivariate beta function, and
- Gamma is the gamma function.
The concentration
represents mean total counts of class occurrence, i.e.,
concentration = alpha = mean * total_concentration
where mean
in S^{k-1}
and total_concentration
is a positive real number
representing a mean total count.
Distribution parameters are automatically broadcast in all functions; see
examples for details.
Warning: Some components of the samples can be zero due to finite precision.
This happens more often when some of the concentrations are very small.
Make sure to round the samples up to np$finfo(dtype)$tiny
before computing the density.
Samples of this distribution are reparameterized (pathwise differentiable).
The derivatives are computed using the approach described in the paper
Michael Figurnov, Shakir Mohamed, Andriy Mnih. Implicit Reparameterization Gradients, 2018
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
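Examples
A small sketch; samples lie on the 2-simplex, and the mean is concentration normalized to sum to 1:
d <- tfd_dirichlet(concentration = c(1, 2, 3))
d %>% tfd_sample()  # positive components summing to 1
d %>% tfd_mean()    # c(1, 2, 3) / 6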
Dirichlet-Multinomial compound distribution
Description
The Dirichlet-Multinomial distribution is parameterized by a (batch of)
length-K
concentration
vectors (K > 1
) and a total_count
number of
trials, i.e., the number of trials per draw from the DirichletMultinomial. It
is defined over a (batch of) length-K
vector counts
such that
tf$reduce_sum(counts, -1) = total_count
. The Dirichlet-Multinomial is
identically the Beta-Binomial distribution when K = 2
.
Usage
tfd_dirichlet_multinomial(
total_count,
concentration,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "DirichletMultinomial"
)
Arguments
total_count |
Non-negative floating point tensor, whose dtype is the same
as |
concentration |
Positive floating point tensor, whose dtype is the
same as |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The Dirichlet-Multinomial is a distribution over K
-class counts, i.e., a
length-K
vector of non-negative integer counts = n = [n_0, ..., n_{K-1}]
.
The probability mass function (pmf) is,
pmf(n; alpha, N) = Beta(alpha + n) / (prod_j n_j!) / Z
Z = Beta(alpha) / N!
where:
- concentration = alpha = [alpha_0, ..., alpha_{K-1}], alpha_j > 0,
- total_count = N, N a positive integer,
- N! is N factorial,
- Beta(x) = prod_j Gamma(x_j) / Gamma(sum_j x_j) is the multivariate beta function, and
- Gamma is the gamma function.
Dirichlet-Multinomial is a compound distribution, i.e., its samples are generated as follows.
Choose class probabilities:
probs = [p_0,...,p_{K-1}] ~ Dir(concentration)
Draw integers:
counts = [n_0,...,n_{K-1}] ~ Multinomial(total_count, probs)
The last concentration
dimension parametrizes a single Dirichlet-Multinomial
distribution. When calling distribution functions (e.g., dist$prob(counts)
),
concentration
, total_count
and counts
are broadcast to the same shape.
The last dimension of counts
corresponds to a single Dirichlet-Multinomial distribution.
Distribution parameters are automatically broadcast in all functions; see examples for details.
Pitfalls
The number of classes, K, must not exceed:
- the largest integer representable by self$dtype, i.e., 2**(mantissa_bits+1) (IEEE 754),
- the maximum Tensor index, i.e., 2**31-1.
Note: This condition is validated only when validate_args = TRUE
.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
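Examples
A brief sketch of the compound structure described above; the mean is total_count times the mean of the underlying Dirichlet:
d <- tfd_dirichlet_multinomial(total_count = 10, concentration = c(1, 2, 3))
d %>% tfd_mean()    # 10 * c(1, 2, 3) / 6
d %>% tfd_sample()  # length-3 counts summing to 10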
Double-sided Maxwell distribution.
Description
This distribution is useful to compute measure valued derivatives for Gaussian distributions. See Mohamed et al. (2019) for more details.
Usage
tfd_doublesided_maxwell(
loc,
scale,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "doublesided_maxwell"
)
Arguments
loc |
Floating point tensor; location of the distribution |
scale |
Floating point tensor; the scales of the distribution. Must contain only positive values. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
string prefixed to Ops created by this class. Default value: 'doublesided_maxwell'. |
Details
Mathematical details
The double-sided Maxwell distribution generalizes the Maxwell distribution to the entire real line.
pdf(x; mu, sigma) = 1/(sigma*sqrt(2*pi)) * ((x-mu)/sigma)^2 * exp(-0.5 ((x-mu)/sigma)^2)
where loc = mu
and scale = sigma
.
The DoublesidedMaxwell distribution is a member of the
location-scale family,
i.e., it can be constructed as,
X ~ DoublesidedMaxwell(loc=0, scale=1)
Y = loc + scale * X
The double-sided Maxwell is a symmetric distribution that extends the one-sided Maxwell from R+ to the entire real line. Their densities are therefore the same up to a factor of 0.5.
There are several methods for generating random variates from this distribution. The version here uses 3 Gaussian variates and a uniform variate to generate the samples. The sampling path is:
mu + sigma * sgn(U - 0.5) * sqrt(X^2 + Y^2 + Z^2), where U ~ Unif(0, 1) and X, Y, Z ~ N(0, 1)
In the sampling process above, the random variates generated by sqrt(X^2 + Y^2 + Z^2) are samples from the one-sided Maxwell (or Maxwell-Boltzmann) distribution.
Value
a distribution instance.
References
- Mohamed et al., "Monte Carlo Gradient Estimation in Machine Learning", 2019.
- B. Heidergott et al., "Sensitivity estimation for Gaussian systems", 2008. European Journal of Operational Research, vol. 187, pp. 193-207.
- G. Pflug, "Optimization of Stochastic Models: The Interface Between Simulation and Optimization", 2002. Ch. 4.2, pg. 247.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
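Examples
A minimal sketch; the distribution is symmetric about loc, so its mean equals loc:
d <- tfd_doublesided_maxwell(loc = 0, scale = 1)
d %>% tfd_sample(5)
d %>% tfd_mean()  # 0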
Empirical distribution
Description
The Empirical distribution is parameterized by a (batch) multiset of samples. It describes the empirical measure (observations) of a variable. Note: some methods (log_prob, prob, cdf, mode, entropy) are not differentiable with regard to samples.
Usage
tfd_empirical(
samples,
event_ndims = 0,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Empirical"
)
Arguments
samples |
Numeric |
event_ndims |
|
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability mass function (pmf) and cumulative distribution function (cdf) are
pmf(k; s1, ..., sn) = sum_i I(k)^{k == si} / n
I(k)^{k == si} == 1, if k == si, else 0.
cdf(k; s1, ..., sn) = sum_i I(k)^{k >= si} / n
I(k)^{k >= si} == 1, if k >= si, else 0.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
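Examples
A short sketch; statistics reduce to sample statistics of the supplied multiset:
d <- tfd_empirical(samples = c(0, 1, 1, 2))
d %>% tfd_mean()  # 1, the sample mean
d %>% tfd_mode()  # 1, the most frequent value
d %>% tfd_cdf(1)  # 0.75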
Shannon entropy in nats.
Description
Shannon entropy in nats.
Usage
tfd_entropy(distribution, ...)
Arguments
distribution |
The distribution being used. |
... |
Additional parameters passed to Python. |
Value
a Tensor of shape sample_shape(x) + self$batch_shape
with values of type self$dtype
.
See Also
Other distribution_methods:
tfd_cdf()
,
tfd_covariance()
,
tfd_cross_entropy()
,
tfd_kl_divergence()
,
tfd_log_cdf()
,
tfd_log_prob()
,
tfd_log_survival_function()
,
tfd_mean()
,
tfd_mode()
,
tfd_prob()
,
tfd_quantile()
,
tfd_sample()
,
tfd_stddev()
,
tfd_survival_function()
,
tfd_variance()
Examples
d <- tfd_normal(loc = c(1, 2), scale = c(1, 0.5))
d %>% tfd_entropy()
ExpGamma distribution.
Description
The ExpGamma distribution is defined over the real line using
parameters concentration
(aka "alpha") and rate
(aka "beta").
This distribution is a transformation of the Gamma distribution such that
X ~ ExpGamma(..) => exp(X) ~ Gamma(..).
Usage
tfd_exp_gamma(
concentration,
rate = NULL,
log_rate = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "ExpGamma"
)
Arguments
concentration |
Floating point tensor, the concentration params of the distribution(s). Must contain only positive values. |
rate |
Floating point tensor, the inverse scale params of the
distribution(s). Must contain only positive values. Mutually exclusive
with log_rate. |
log_rate |
Floating point tensor, natural logarithm of the inverse scale
params of the distribution(s). Mutually exclusive with rate. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) can be derived from the change of
variables rule (since the distribution is logically equivalent to
tfb_log()(tfd_gamma(..))
):
pdf(x; alpha, beta > 0) = exp(x)**(alpha - 1) * exp(-exp(x) * beta) * exp(x) / Z
Z = Gamma(alpha) * beta**(-alpha)
where:
-
concentration = alpha
,alpha > 0
, -
rate = beta
,beta > 0
, -
Z
is the normalizing constant of the corresponding Gamma distribution, and -
Gamma
is the gamma function.
The cumulative distribution function (cdf) is,
cdf(x; alpha, beta) = GammaInc(alpha, beta * exp(x)) / Gamma(alpha)
where GammaInc
is the lower incomplete Gamma function.
Distribution parameters are automatically broadcast in all functions. Samples of this distribution are reparameterized (pathwise differentiable). The derivatives are computed using the approach described in Figurnov et al., 2018.
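As an illustrative sketch of the log-Gamma relationship above (arbitrary parameter values; tfd_transformed_distribution() with tfb_log() is the construction referred to in the Details):
library(tfprobability)
d <- tfd_exp_gamma(concentration = 2, rate = 3)
# The same law, built explicitly as a log transform of a Gamma
d2 <- tfd_transformed_distribution(
  distribution = tfd_gamma(concentration = 2, rate = 3),
  bijector = tfb_log())
x <- d %>% tfd_sample(5)
d %>% tfd_log_prob(x)  # agrees (up to numerics) with the line below
d2 %>% tfd_log_prob(x)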
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
ExpInverseGamma distribution.
Description
The ExpInverseGamma
distribution is defined over the real numbers such that
X ~ ExpInverseGamma(..) => exp(X) ~ InverseGamma(..).
The distribution is logically equivalent to tfb_log()(tfd_inverse_gamma(..))
,
but can be sampled with much better precision.
Usage
tfd_exp_inverse_gamma(
concentration,
scale = NULL,
log_scale = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "ExpGamma"
)
Arguments
concentration |
Floating point tensor, the concentration params of the distribution(s). Must contain only positive values. |
scale |
Floating point tensor, the scale params of the distribution(s).
Must contain only positive values. Mutually exclusive with log_scale. |
log_scale |
Floating point tensor, the natural logarithm of the scale
params of the distribution(s). Mutually exclusive with scale. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) is very similar to ExpGamma,
pdf(x; alpha, beta > 0) = exp(-x)**(alpha - 1) * exp(-exp(-x) * beta) * exp(-x) / Z
Z = Gamma(alpha) * beta**(-alpha)
where:
-
concentration = alpha
, -
scale = beta
, -
Z
is the normalizing constant, and, -
Gamma
is the gamma function.
The cumulative distribution function (cdf) is,
cdf(x; alpha, beta) = 1 - GammaInc(alpha, beta * exp(-x)) / Gamma(alpha)
where GammaInc
is the lower incomplete Gamma function.
Distribution parameters are automatically broadcast in all functions. Samples of this distribution are reparameterized (pathwise differentiable). The derivatives are computed using the approach described in Figurnov et al., 2018.
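As an illustrative sketch (arbitrary parameter values): if X ~ ExpInverseGamma(alpha, scale = beta), then -X ~ ExpGamma(alpha, rate = beta), which can be checked numerically:
library(tfprobability)
d <- tfd_exp_inverse_gamma(concentration = 2, scale = 3)
d_neg <- tfd_exp_gamma(concentration = 2, rate = 3)
x <- d %>% tfd_sample(5)
d %>% tfd_log_prob(x)       # agrees (up to numerics) with the line below,
d_neg %>% tfd_log_prob(-x)  # since negation has unit Jacobian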
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
ExpRelaxedOneHotCategorical distribution with temperature and logits.
Description
ExpRelaxedOneHotCategorical distribution with temperature and logits.
Usage
tfd_exp_relaxed_one_hot_categorical(
temperature,
logits = NULL,
probs = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "ExpRelaxedOneHotCategorical"
)
Arguments
temperature |
A 0-D Tensor, representing the temperature of a set of ExpRelaxedCategorical distributions. The temperature should be positive. |
logits |
An N-D Tensor, N >= 1, representing the log probabilities of a set of ExpRelaxedCategorical distributions. The first N - 1 dimensions index into a batch of independent distributions and the last dimension represents a vector of logits for each class. Only one of logits or probs should be passed in. |
probs |
An N-D Tensor, N >= 1, representing the probabilities of a set of ExpRelaxedCategorical distributions. The first N - 1 dimensions index into a batch of independent distributions and the last dimension represents a vector of probabilities for each class. Only one of logits or probs should be passed in. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Exponential distribution
Description
The Exponential distribution is parameterized by an event rate
parameter.
Usage
tfd_exponential(
rate,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Exponential"
)
Arguments
rate |
Floating point tensor, equivalent to 1 / mean. Must contain only positive values. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) is,
pdf(x; lambda, x > 0) = exp(-lambda * x) / Z
Z = 1 / lambda
where rate = lambda
and Z
is the normalizing constant.
The Exponential distribution is a special case of the Gamma distribution, i.e.,
Exponential(rate) = Gamma(concentration=1., rate)
The Exponential distribution uses a rate
parameter, or "inverse scale",
which can be intuited as,
X ~ Exponential(rate=1)
Y = X / rate
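A quick illustrative check of the Gamma special case (arbitrary rate):
library(tfprobability)
d1 <- tfd_exponential(rate = 2)
d2 <- tfd_gamma(concentration = 1, rate = 2)  # the special case noted above
x <- c(0.5, 1, 2)
d1 %>% tfd_log_prob(x)  # identical to the line below
d2 %>% tfd_log_prob(x)
d1 %>% tfd_mean()       # 1 / rate = 0.5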
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
The finite discrete distribution.
Description
The FiniteDiscrete distribution is parameterized by either probabilities or
log-probabilities of a set of K
possible outcomes, which is defined by
a strictly ascending list of K
values.
Usage
tfd_finite_discrete(
outcomes,
logits = NULL,
probs = NULL,
rtol = NULL,
atol = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "FiniteDiscrete"
)
Arguments
outcomes |
A 1-D floating or integer Tensor, representing a list of possible outcomes in strictly ascending order. |
logits |
A floating N-D Tensor, N >= 1, representing the log probabilities of a set of FiniteDiscrete distributions. The first N - 1 dimensions index into a batch of independent distributions and the last dimension represents a vector of log probabilities for each outcome. Only one of logits or probs should be passed in. |
probs |
A floating N-D Tensor, N >= 1, representing the probabilities of a set of FiniteDiscrete distributions. The first N - 1 dimensions index into a batch of independent distributions and the last dimension represents a vector of probabilities for each outcome. Only one of logits or probs should be passed in. |
rtol |
Tensor with same dtype as outcomes. The relative tolerance for floating-point comparison. |
atol |
Tensor with same dtype as outcomes. The absolute tolerance for floating-point comparison. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
string prefixed to Ops created by this class. |
Details
Note: log_prob, prob, cdf, mode, and entropy are differentiable with respect
to logits
or probs
but not with respect to outcomes
.
Mathematical Details
The probability mass function (pmf) is,
pmf(x; pi, qi) = prod_j pi_j**[x == qi_j]
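For instance (an illustrative sketch with arbitrary outcomes and probabilities):
library(tfprobability)
# Distribution over the outcomes 1, 2, 4 with probabilities 0.2, 0.3, 0.5
d <- tfd_finite_discrete(outcomes = c(1, 2, 4), probs = c(0.2, 0.3, 0.5))
d %>% tfd_prob(2)  # 0.3
d %>% tfd_cdf(2)   # 0.2 + 0.3 = 0.5
d %>% tfd_sample(5)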
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Gamma distribution
Description
The Gamma distribution is defined over positive real numbers using
parameters concentration
(aka "alpha") and rate
(aka "beta").
Usage
tfd_gamma(
concentration,
rate,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Gamma"
)
Arguments
concentration |
Floating point tensor, the concentration params of the distribution(s). Must contain only positive values. |
rate |
Floating point tensor, the inverse scale params of the distribution(s). Must contain only positive values. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) is,
pdf(x; alpha, beta, x > 0) = x**(alpha - 1) * exp(-x * beta) / Z
Z = Gamma(alpha) * beta**(-alpha)
where
-
concentration = alpha
,alpha > 0
, -
rate = beta
,beta > 0
, -
Z
is the normalizing constant, and, -
Gamma
is the gamma function.
The cumulative distribution function (cdf) is,
cdf(x; alpha, beta, x > 0) = GammaInc(alpha, beta * x) / Gamma(alpha)
where GammaInc
is the lower incomplete Gamma function.
The parameters can be intuited via their relationship to mean and stddev,
concentration = alpha = (mean / stddev)**2
rate = beta = mean / stddev**2 = concentration / mean
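For instance, to target a given mean and standard deviation (an illustrative sketch):
library(tfprobability)
m <- 2; s <- 0.5  # desired mean and standard deviation
d <- tfd_gamma(concentration = (m / s)^2, rate = m / s^2)
d %>% tfd_mean()    # ~2
d %>% tfd_stddev()  # ~0.5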
Distribution parameters are automatically broadcast in all functions; see examples for details.
Warning: The samples of this distribution are always non-negative. However,
the samples that are smaller than np$finfo(dtype)$tiny
are rounded
to this value, so it appears more often than it should.
This should only be noticeable when the concentration
is very small, or the
rate
is very large. See note in tf$random_gamma
docstring.
Samples of this distribution are reparameterized (pathwise differentiable).
The derivatives are computed using the approach described in the paper
Michael Figurnov, Shakir Mohamed, Andriy Mnih. Implicit Reparameterization Gradients, 2018
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
Gamma-Gamma distribution
Description
Gamma-Gamma is a compound distribution
defined over positive real numbers using parameters concentration
,
mixing_concentration
and mixing_rate
.
Usage
tfd_gamma_gamma(
concentration,
mixing_concentration,
mixing_rate,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "GammaGamma"
)
Arguments
concentration |
Floating point tensor, the concentration params of the distribution(s). Must contain only positive values. |
mixing_concentration |
Floating point tensor, the concentration params of the mixing Gamma distribution(s). Must contain only positive values. |
mixing_rate |
Floating point tensor, the rate params of the mixing Gamma distribution(s). Must contain only positive values. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
This distribution is also referred to as the beta of the second kind (B2), and can be useful for transaction value modeling, as in Fader and Hardie, 2013.
Mathematical Details
It is derived from the following Gamma-Gamma hierarchical model by integrating
out the random variable beta
.
beta ~ Gamma(alpha0, beta0)
X | beta ~ Gamma(alpha, beta)
where
-
concentration = alpha
-
mixing_concentration = alpha0
-
mixing_rate = beta0
The probability density function (pdf) is
pdf(x; alpha, alpha0, beta0) = x**(alpha - 1) / (Z * (x + beta0)**(alpha + alpha0))
where the normalizing constant Z = Beta(alpha, alpha0) * beta0**(-alpha0)
.
Samples of this distribution are reparameterized (as samples of the underlying
Gamma distributions), using the technique described in Figurnov et al., 2018.
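The hierarchy can be written out directly with tfd_joint_distribution_named(); an illustrative sketch (arbitrary parameter values), under which the marginal of x matches the compound distribution:
library(tfprobability)
d <- tfd_gamma_gamma(concentration = 3, mixing_concentration = 2, mixing_rate = 4)
d %>% tfd_sample(3)
# The same generative process, written as the hierarchical model above
jd <- tfd_joint_distribution_named(list(
  beta = tfd_gamma(concentration = 2, rate = 4),
  x = function(beta) tfd_gamma(concentration = 3, rate = beta)))
samps <- jd %>% tfd_sample(3)
samps$x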
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
Marginal distribution of a Gaussian process at finitely many points.
Description
A Gaussian process (GP) is an indexed collection of random variables, any finite collection of which are jointly Gaussian. While this definition applies to finite index sets, it is typically implicit that the index set is infinite; in applications, it is often some finite dimensional real or complex vector space. In such cases, the GP may be thought of as a distribution over (real- or complex-valued) functions defined over the index set.
Usage
tfd_gaussian_process(
kernel,
index_points,
mean_fn = NULL,
observation_noise_variance = 0,
jitter = 1e-06,
validate_args = FALSE,
allow_nan_stats = FALSE,
name = "GaussianProcess"
)
Arguments
kernel |
PositiveSemidefiniteKernel-like instance representing the GP's covariance function. |
index_points |
float Tensor representing finite (batch of) vector(s) of points in the index set over which the GP is defined. Shape has the form [b1, ..., bB, e, f1, ..., fF] where F is the number of feature dimensions and must equal kernel$feature_ndims, and e is the number (size) of index points in each batch. |
mean_fn |
function that acts on index points to produce a (batch
of) vector(s) of mean values at those index points. Takes a Tensor of shape [b1, ..., bB, f1, ..., fF] and returns a Tensor whose shape is broadcastable with [b1, ..., bB]. Default value: NULL implies a constant zero function. |
observation_noise_variance |
float Tensor representing the variance of the noise in the Normal likelihood distribution of the model. May be batched, in which case the batch shape must be broadcastable with the shapes of all other batched parameters (kernel.batch_shape, index_points, etc.). Default value: 0. |
jitter |
float scalar Tensor added to the diagonal of the covariance matrix to ensure positive definiteness of the covariance matrix. Default value: 1e-6. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Just as Gaussian distributions are fully specified by their first and second
moments, a Gaussian process can be completely specified by a mean and
covariance function.
Let S
denote the index set and K
the space in which
each indexed random variable takes its values (again, often R or C). The mean
function is then a map m: S -> K
, and the covariance function, or kernel, is
a positive-definite function k: (S x S) -> K
. The properties of functions
drawn from a GP are entirely dictated (up to translation) by the form of the
kernel function.
This Distribution
represents the marginal joint distribution over function
values at a given finite collection of points [x[1], ..., x[N]]
from the
index set S
. By definition, this marginal distribution is just a
multivariate normal distribution, whose mean is given by the vector
[ m(x[1]), ..., m(x[N]) ]
and whose covariance matrix is constructed from
pairwise applications of the kernel function to the given inputs:
| k(x[1], x[1])  k(x[1], x[2])  ...  k(x[1], x[N]) |
| k(x[2], x[1])  k(x[2], x[2])  ...  k(x[2], x[N]) |
|      ...            ...       ...       ...      |
| k(x[N], x[1])  k(x[N], x[2])  ...  k(x[N], x[N]) |
For this to be a valid covariance matrix, it must be symmetric and positive
definite; hence the requirement that k
be a positive definite function
(which, by definition, says that the above procedure will yield PD matrices).
We also support the inclusion of zero-mean Gaussian noise in the model, via
the observation_noise_variance
parameter. This augments the generative model
to
f ~ GP(m, k)
(y[i] | f, x[i]) ~ Normal(f(x[i]), s)
where
-
m
is the mean function -
k
is the covariance kernel function -
f
is the function drawn from the GP -
x[i]
are the index points at which the function is observed -
y[i]
are the observed values at the index points -
s
is the scale of the observation noise.
Note that this class represents an unconditional Gaussian process; it does not implement posterior inference conditional on observed function evaluations. This class is useful, for example, if one wishes to combine a GP prior with a non-conjugate likelihood using MCMC to sample from the posterior.
Mathematical Details
The probability density function (pdf) is a multivariate normal whose parameters are derived from the GP's properties:
pdf(x; index_points, mean_fn, kernel) = exp(-0.5 * y) / Z
K = kernel.matrix(index_points, index_points) + (observation_noise_variance + jitter) * eye(N)
y = (x - mean_fn(index_points))^T @ inv(K) @ (x - mean_fn(index_points))
Z = (2 * pi)**(0.5 * N) * |det(K)|**(0.5)
where:
-
index_points
are points in the index set over which the GP is defined, -
mean_fn
is a callable mapping the index set to the GP's mean values, -
kernel
isPositiveSemidefiniteKernel
-like and represents the covariance function of the GP, -
observation_noise_variance
represents (optional) observation noise. -
jitter
is added to the diagonal to ensure positive definiteness up to machine precision (otherwise Cholesky-decomposition is prone to failure), -
eye(N)
is an N-by-N identity matrix.
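A minimal sketch of a GP prior over five one-dimensional index points (the kernel is taken from tfp$math$psd_kernels, accessed through the exported tfp module handle; this is illustrative, not the only way to supply a kernel):
library(tfprobability)
kernel <- tfp$math$psd_kernels$ExponentiatedQuadratic()
index_points <- matrix(seq(-1, 1, length.out = 5), ncol = 1)  # shape [5, 1]
gp <- tfd_gaussian_process(kernel = kernel, index_points = index_points)
y <- gp %>% tfd_sample(2)  # two draws of function values, shape [2, 5]
gp %>% tfd_log_prob(y)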
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
Posterior predictive distribution in a conjugate GP regression model.
Description
Posterior predictive distribution in a conjugate GP regression model.
Usage
tfd_gaussian_process_regression_model(
kernel,
index_points = NULL,
observation_index_points = NULL,
observations = NULL,
observation_noise_variance = 0,
predictive_noise_variance = NULL,
mean_fn = NULL,
jitter = 1e-06,
validate_args = FALSE,
allow_nan_stats = FALSE,
name = "GaussianProcessRegressionModel"
)
Arguments
kernel |
PositiveSemidefiniteKernel-like instance representing the GP's covariance function. |
index_points |
float Tensor representing finite collection, or batch of collections, of points in the index set over which the GP is defined. Shape has the form [b1, ..., bB, e, f1, ..., fF] where F is the number of feature dimensions and must equal kernel$feature_ndims, and e is the number (size) of index points in each batch. |
observation_index_points |
Tensor representing finite collection, or batch
of collections, of points in the index set for which some data has been observed.
Shape has the form [b1, ..., bB, e, f1, ..., fF] where F is the number of
feature dimensions and must equal kernel$feature_ndims, and e is the number (size) of index points in each batch. |
observations |
Tensor representing collection, or batch of collections,
of observations corresponding to observation_index_points. Shape has the
form [b1, ..., bB, e], which must be broadcastable with the batch and example
shapes of observation_index_points. The batch shape [b1, ..., bB] must be
broadcastable with the shapes of all other batched parameters (kernel.batch_shape,
index_points, etc.). The default value is NULL, which corresponds to the empty
set of observations, and simply results in the prior predictive model (a GP
with noise of variance predictive_noise_variance). |
observation_noise_variance |
|
predictive_noise_variance |
Tensor representing the variance in the posterior predictive model. If None, we simply re-use observation_noise_variance for the posterior predictive noise. If set explicitly, however, we use this value. This allows us, for example, to omit predictive noise variance (by setting this to zero) to obtain noiseless posterior predictions of function values, conditioned on noisy observations. |
mean_fn |
callable that acts on index_points to produce a (batch of) vector(s) of mean values at index_points. Takes a Tensor of shape [b1, ..., bB, f1, ..., fF] and returns a Tensor whose shape is broadcastable with [b1, ..., bB]. Default value: NULL implies a constant zero function. |
jitter |
float scalar Tensor added to the diagonal of the covariance matrix to ensure positive definiteness of the covariance matrix. Default value: 1e-6. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
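A minimal sketch of posterior prediction (synthetic data; the kernel is again taken from tfp$math$psd_kernels for illustration):
library(tfprobability)
kernel <- tfp$math$psd_kernels$ExponentiatedQuadratic()
# Noisy observations of sin() at 20 points; predictions at 5 new points
x_obs <- matrix(seq(-2, 2, length.out = 20), ncol = 1)
y_obs <- sin(x_obs[, 1]) + rnorm(20, sd = 0.1)
x_new <- matrix(seq(-1, 1, length.out = 5), ncol = 1)
gprm <- tfd_gaussian_process_regression_model(
  kernel = kernel,
  index_points = x_new,
  observation_index_points = x_obs,
  observations = y_obs,
  observation_noise_variance = 0.01)
gprm %>% tfd_mean()    # posterior mean at the 5 new points
gprm %>% tfd_stddev()  # pointwise posterior uncertainty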
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
The Generalized Normal distribution.
Description
The Generalized Normal (or Generalized Gaussian) generalizes the Normal
distribution with an additional shape parameter. It is parameterized by
location loc
, scale scale
and shape power
.
Usage
tfd_generalized_normal(
loc,
scale,
power,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "GeneralizedNormal"
)
Arguments
loc |
Floating point tensor; the means of the distribution(s). |
scale |
Floating point tensor; the scale of the distribution(s). Must contain only positive values. |
power |
Floating point tensor; the shape parameter of the distribution(s).
Must contain only positive values. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) is,
pdf(x; loc, scale, power) = 1 / (2 * scale * Gamma(1 + 1 / power)) * exp(-(|x - loc| / scale) ^ power)
where loc
is the mean, scale
is the scale, and, power
is the shape
parameter. If the power is above two, the distribution becomes platykurtic.
A power equal to two results in a Normal distribution. A power smaller than
two produces a leptokurtic (heavy-tailed) distribution. Mean and scale behave
the same way as in the equivalent Normal distribution.
See https://en.wikipedia.org/w/index.php?title=Generalized_normal_distribution&oldid=954254464 for the definitions used here, including CDF, variance and entropy. See https://sccn.ucsd.edu/wiki/Generalized_Gaussian_Probability_Density_Function for the sampling method used here.
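Given the pdf above, power = 2 recovers a Normal with standard deviation scale / sqrt(2); an illustrative check:
library(tfprobability)
d <- tfd_generalized_normal(loc = 0, scale = sqrt(2), power = 2)
n <- tfd_normal(loc = 0, scale = 1)
x <- c(-1, 0, 1.5)
d %>% tfd_log_prob(x)  # agrees with the line below
n %>% tfd_log_prob(x)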
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
The Generalized Pareto distribution.
Description
The Generalized Pareto distributions are a family of continuous distributions
on the reals. Special cases include Exponential
(when loc = 0
,
concentration = 0
), Pareto
(when concentration > 0
,
loc = scale / concentration
), and Uniform
(when concentration = -1
).
Usage
tfd_generalized_pareto(
loc,
scale,
concentration,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = NULL
)
Arguments
loc |
The location / shift of the distribution. GeneralizedPareto is a
location-scale distribution. This parameter lower bounds the
distribution's support. Must broadcast with scale and concentration. |
scale |
The scale of the distribution. GeneralizedPareto is a
location-scale distribution, so doubling the scale doubles the distribution. Must contain only positive values. Must broadcast with loc and concentration. |
concentration |
The shape parameter of the distribution. The larger the
magnitude, the more the distribution concentrates near loc (for concentration >= 0) or near loc - scale / concentration (for concentration < 0). Must broadcast with loc and scale. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
This distribution is often used to model the tails of other distributions.
As a member of the location-scale family,
X ~ GeneralizedPareto(loc=loc, scale=scale, concentration=conc)
maps to
Y ~ GeneralizedPareto(loc=0, scale=1, concentration=conc)
via
Y = (X - loc) / scale
.
For positive concentrations, the distribution is equivalent to a hierarchical
Exponential-Gamma model with X|rate ~ Exponential(rate)
and
rate ~ Gamma(concentration=1 / concentration, scale=scale / concentration)
.
In the following, samps1
and samps2
are identically distributed:
genp <- tfd_generalized_pareto(loc = 0, scale = scale, concentration = conc)
samps1 <- genp %>% tfd_sample(1000)
jd <- tfd_joint_distribution_named(
  list(
    rate = tfd_gamma(1 / genp$concentration, genp$scale / genp$concentration),
    x = function(rate) tfd_exponential(rate)))
samps2 <- jd %>% tfd_sample(1000) %>% .$x
The support of the distribution is always lower bounded by loc
. When
concentration < 0
, the support is also upper bounded by
loc + scale / abs(concentration)
.
Mathematical Details
The probability density function (pdf) is,
pdf(x; mu, sigma, shp, x > mu) = (1 + shp * (x - mu) / sigma)**(-1 / shp - 1) / sigma
where:
-
concentration = shp
, any real value, -
scale = sigma
,sigma > 0
, -
loc = mu
.
The cumulative distribution function (cdf) is,
cdf(x; mu, sigma, shp, x > mu) = 1 - (1 + shp * (x - mu) / sigma)**(-1 / shp)
Distribution parameters are automatically broadcast in all functions; see examples for details. Samples of this distribution are reparameterized (pathwise differentiable).
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Geometric distribution
Description
The Geometric distribution is parameterized by p, the probability of a positive event. It represents the probability that in k + 1 Bernoulli trials, the first k trials fail before a success is seen. The pmf of this distribution is given in the Details.
Usage
tfd_geometric(
logits = NULL,
probs = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Geometric"
)
Arguments
logits |
Floating-point Tensor with shape [B1, ..., Bb] where b >= 0 indicates the number of batch dimensions. Each entry represents logits for the probability of success for independent Geometric distributions and must be in the range (-inf, inf]. Only one of logits or probs should be specified. |
probs |
Positive floating-point Tensor with shape [B1, ..., Bb] where b >= 0 indicates the number of batch dimensions. Each entry represents the probability of success for independent Geometric distributions and must be in the range (0, 1]. Only one of logits or probs should be specified. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
pmf(k; p) = (1 - p)**k * p
where:
-
p
is the success probability,0 < p <= 1
, and, -
k
is a non-negative integer.
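For instance (an illustrative sketch; note that k counts failures before the first success):
library(tfprobability)
d <- tfd_geometric(probs = 0.25)
d %>% tfd_prob(2)  # (1 - 0.25)^2 * 0.25 = 0.140625
d %>% tfd_mean()   # (1 - p) / p = 3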
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
Scalar Gumbel distribution with location loc
and scale
parameters
Description
The Gumbel distribution is a location-scale distribution over the reals, parameterized by loc and scale; see Details for the mathematical details.
Usage
tfd_gumbel(
loc,
scale,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Gumbel"
)
Arguments
loc |
Floating point tensor, the means of the distribution(s). |
scale |
Floating point tensor, the scales of the distribution(s). scale must contain only positive values. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
The probability density function (pdf) of this distribution is,
pdf(x; mu, sigma) = exp(-(x - mu) / sigma - exp(-(x - mu) / sigma)) / sigma
where loc = mu
and scale = sigma
.
The cumulative distribution function of this distribution is,
cdf(x; mu, sigma) = exp(-exp(-(x - mu) / sigma))
The Gumbel distribution is a member of the location-scale family, i.e., it can be constructed as,
X ~ Gumbel(loc=0, scale=1)
Y = loc + scale * X
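An illustrative check of the location-scale property (arbitrary parameters):
library(tfprobability)
d <- tfd_gumbel(loc = 1, scale = 2)
std <- tfd_gumbel(loc = 0, scale = 1)
x <- c(0, 1, 3)
d %>% tfd_cdf(x)             # identical to the line below,
std %>% tfd_cdf((x - 1) / 2) # i.e. the standard cdf at (x - loc) / scale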
Value
a distribution instance.
See Also
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
Half-Cauchy distribution
Description
The half-Cauchy distribution is parameterized by a loc
and a
scale
parameter. It represents the right half of the two symmetric halves in
a Cauchy distribution.
Usage
tfd_half_cauchy(
loc,
scale,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "HalfCauchy"
)
Arguments
loc |
Floating-point Tensor; the location(s) of the distribution(s). |
scale |
Floating-point Tensor; the scale(s) of the distribution(s). Must contain only positive values. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) for the half-Cauchy distribution is given by
pdf(x; loc, scale) = 2 / (pi * scale * (1 + z**2))
z = (x - loc) / scale
where loc
is a scalar in R
and scale
is a positive scalar in R
.
The support of the distribution is given by the interval [loc, infinity)
.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
Half-Normal distribution with scale scale
Description
The half-normal distribution is parameterized by a single scale parameter; see Details for the mathematical details.
Usage
tfd_half_normal(
scale,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "HalfNormal"
)
Arguments
scale |
Floating point tensor; the scales of the distribution(s). Must contain only positive values. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
The half normal is a transformation of a centered normal distribution.
If some random variable X has a normal distribution,
X ~ Normal(0.0, scale)
Y = |X|
then Y has a half-normal distribution. The probability density
function (pdf) is:
pdf(x; scale, x > 0) = sqrt(2) / (scale * sqrt(pi)) * exp(-(1 / 2) * (x / scale)**2)
where scale = sigma
is the standard deviation of the underlying normal
distribution.
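Since Y = |X| folds the two halves of the Normal together, the half-normal density at x > 0 is twice the Normal density; an illustrative check:
library(tfprobability)
d <- tfd_half_normal(scale = 1)
x <- c(0.5, 1, 2)
d %>% tfd_prob(x)                       # equals the line below
2 * (tfd_normal(0, 1) %>% tfd_prob(x))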
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
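Examples
A minimal usage sketch, in the spirit of the package's other examples (the scale value here is illustrative):
library(tfprobability)
d <- tfd_half_normal(scale = 3)
d %>% tfd_sample(5)   # five positive draws
d %>% tfd_log_prob(1) # log density at x = 1
d %>% tfd_mean()      # scale * sqrt(2 / pi)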
Hidden Markov model distribution
Description
The HiddenMarkovModel
distribution implements a (batch of) hidden
Markov models where the initial states, transition probabilities
and observed states are all given by user-provided distributions.
Usage
tfd_hidden_markov_model(
initial_distribution,
transition_distribution,
observation_distribution,
num_steps,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "HiddenMarkovModel"
)
Arguments
initial_distribution |
A Distribution instance; the distribution over the initial hidden state. The number of categories must match the rightmost batch dimension of transition_distribution and of observation_distribution. |
transition_distribution |
A Distribution instance; its rightmost batch dimension indexes the probability distribution of each hidden state conditioned on the previous hidden state. |
observation_distribution |
A Distribution instance; its rightmost batch dimension indexes the distribution of each observation conditioned on the corresponding hidden state. |
num_steps |
The number of steps taken in Markov chain. An integer. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
This model assumes that the transition matrices are fixed over time.
In this model, there is a sequence of integer-valued hidden states:
z[0], z[1], ..., z[num_steps - 1]
and a sequence of observed states:
x[0], ..., x[num_steps - 1]
.
The distribution of z[0]
is given by initial_distribution
.
The conditional probability of z[i + 1]
given z[i]
is described by
the batch of distributions in transition_distribution
.
For a batch of hidden Markov models, the coordinates before the rightmost one
of the transition_distribution
batch correspond to indices into the hidden
Markov model batch. The rightmost coordinate of the batch is used to select
which distribution z[i + 1]
is drawn from. The distribution corresponding
to the probability of z[i + 1]
conditional on z[i] == k
is given by the
elements of the batch whose rightmost coordinate is k
.
Similarly, the conditional distribution of x[i]
given z[i]
is given by
the batch of observation_distribution
.
When the rightmost coordinate of observation_distribution
is k
it
gives the conditional probabilities of x[i]
given z[i] == k
.
The probability distribution associated with the HiddenMarkovModel
distribution is the marginal distribution of x[0],...,x[num_steps - 1]
.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
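Examples
A sketch of the classic two-state weather model, adapted from the TensorFlow Probability documentation (all probabilities and temperatures below are illustrative):
library(tfprobability)
# Hidden states: cold (0) or hot (1); observations: noisy temperature readings.
d <- tfd_hidden_markov_model(
  initial_distribution = tfd_categorical(probs = c(0.8, 0.2)),
  transition_distribution = tfd_categorical(
    probs = matrix(c(0.7, 0.3, 0.2, 0.8), nrow = 2, byrow = TRUE)
  ),
  observation_distribution = tfd_normal(loc = c(0, 15), scale = c(5, 10)),
  num_steps = 7
)
d %>% tfd_mean()              # expected observation at each of the 7 steps
d %>% tfd_log_prob(rep(0, 7)) # log probability of observing 0 at every step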
Horseshoe distribution
Description
The so-called 'horseshoe' distribution is a Cauchy-Normal scale mixture,
proposed as a sparsity-inducing prior for Bayesian regression. It is
symmetric around zero, with heavy (Cauchy-like) tails, so that large
coefficients face relatively little shrinkage, and an infinitely tall spike at
0, which pushes small coefficients towards zero. It is parameterized by a
positive scalar scale
parameter: higher values yield a weaker
sparsity-inducing effect.
Usage
tfd_horseshoe(
scale,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Horseshoe"
)
Arguments
scale |
Floating point tensor; the scales of the distribution(s). Must contain only positive values. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical details
The Horseshoe distribution is centered at zero, with scale parameter lambda. It is defined by:
horseshoe(scale = lambda) ~ Normal(0, lambda * sigma)
where sigma ~ half_cauchy(0, 1)
Value
a distribution instance.
References
-
Carvalho, Polson, Scott. Handling Sparsity via the Horseshoe (2008).
-
Barry, Parlange, Li. Approximation for the exponential integral (2000).
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
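Examples
A minimal usage sketch (the scale value is illustrative):
library(tfprobability)
d <- tfd_horseshoe(scale = 1)
x <- d %>% tfd_sample(1000) # draws mostly near zero, with occasional large values
d %>% tfd_log_prob(0.5)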
Independent distribution from batch of distributions
Description
This distribution is useful for regarding a collection of independent,
non-identical distributions as a single random variable. For example, the
Independent
distribution composed of a collection of Bernoulli
distributions might define a distribution over an image (where each
Bernoulli
is a distribution over each pixel).
Usage
tfd_independent(
distribution,
reinterpreted_batch_ndims = NULL,
validate_args = FALSE,
name = paste0("Independent", distribution$name)
)
Arguments
distribution |
The base distribution instance to transform. Typically an instance of Distribution |
reinterpreted_batch_ndims |
Scalar, integer number of rightmost batch dims which will be regarded as event dims. When NULL all but the first batch axis (batch axis 0) will be transferred to event dimensions (analogous to tf$layers$flatten). |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
name |
The name for ops managed by the distribution. Default value: Independent + distribution.name. |
Details
More precisely, a collection of B
(independent) E
-variate random variables
(rv) {X_1, ..., X_B}
can be regarded as a [B, E]
-variate random variable
(X_1, ..., X_B)
with probability
p(x_1, ..., x_B) = p_1(x_1) * ... * p_B(x_B)
where p_b(X_b)
is the
probability of the b
-th rv. More generally B, E
can be arbitrary shapes.
Similarly, the Independent
distribution specifies a distribution over
[B, E]
-shaped events. It operates by reinterpreting the rightmost batch dims as
part of the event dimensions. The reinterpreted_batch_ndims
parameter
controls the number of batch dims which are absorbed as event dims;
reinterpreted_batch_ndims <= len(batch_shape)
. For example, the log_prob
function entails a reduce_sum
over the rightmost reinterpreted_batch_ndims
after calling the base distribution's log_prob
. In other words, since the
batch dimension(s) index independent distributions, the resultant multivariate
will have independent components.
Mathematical Details
The probability function is,
prob(x; reinterpreted_batch_ndims) = tf.reduce_prod(dist.prob(x), axis=-1-range(reinterpreted_batch_ndims))
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
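Examples
A sketch showing how a batch of two Normals becomes a single bivariate event (parameter values are illustrative):
library(tfprobability)
d <- tfd_independent(
  distribution = tfd_normal(loc = c(-1, 1), scale = c(0.1, 0.5)),
  reinterpreted_batch_ndims = 1
)
d %>% tfd_log_prob(c(-1, 1)) # a scalar: the two component log probs are summed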
InverseGamma distribution
Description
The InverseGamma
distribution is defined over positive real numbers using
parameters concentration
(aka "alpha") and scale
(aka "beta").
Usage
tfd_inverse_gamma(
concentration,
scale,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "InverseGamma"
)
Arguments
concentration |
Floating point tensor, the concentration params of the distribution(s). Must contain only positive values. |
scale |
Floating point tensor, the scale params of the distribution(s).
Must contain only positive values. This parameter was called |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) is,
pdf(x; alpha, beta, x > 0) = x**(-alpha - 1) exp(-beta / x) / Z
Z = Gamma(alpha) beta**(-alpha)
where:
-
concentration = alpha
, -
scale = beta
, -
Z
is the normalizing constant, and, -
Gamma
is the gamma function.
The cumulative density function (cdf) is,
cdf(x; alpha, beta, x > 0) = GammaInc(alpha, beta / x) / Gamma(alpha)
where GammaInc is the upper incomplete Gamma function (https://en.wikipedia.org/wiki/Incomplete_gamma_function). The parameters can be intuited via their relationship to mean and variance when these moments exist,
mean = beta / (alpha - 1) when alpha > 1
variance = beta**2 / (alpha - 1)**2 / (alpha - 2) when alpha > 2
i.e., under the same conditions:
alpha = mean**2 / variance + 2
beta = mean * (mean**2 / variance + 1)
Distribution parameters are automatically broadcast in all functions; see examples for details. Samples of this distribution are reparameterized (pathwise differentiable). The derivatives are computed using the approach described in Michael Figurnov, Shakir Mohamed, Andriy Mnih, "Implicit Reparameterization Gradients", 2018 (https://arxiv.org/abs/1805.08498).
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
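Examples
A minimal usage sketch (parameter values are illustrative):
library(tfprobability)
d <- tfd_inverse_gamma(concentration = 3, scale = 2)
d %>% tfd_sample(5)
d %>% tfd_mean() # scale / (concentration - 1) = 1; defined because concentration > 1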
Inverse Gaussian distribution
Description
The inverse Gaussian distribution
is parameterized by a loc
and a concentration
parameter. It's also known
as the Wald distribution. Some, e.g., the Python scipy package, refer to the
special case when loc
is 1 as the Wald distribution.
Usage
tfd_inverse_gaussian(
loc,
concentration,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "InverseGaussian"
)
Arguments
loc |
Floating-point |
concentration |
Floating-point |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
The "inverse" in the name does not refer to the distribution associated to the multiplicative inverse of a random variable. Rather, the cumulant generating function of this distribution is the inverse to that of a Gaussian random variable.
Mathematical Details
The probability density function (pdf) is,
pdf(x; mu, lambda) = [lambda / (2 pi x ** 3)] ** 0.5 exp{-lambda(x - mu) ** 2 / (2 mu ** 2 x)}
where
-
loc = mu
-
concentration = lambda
.
The support of the distribution is defined on (0, infinity)
.
Mapping to R and Python scipy's parameterization:
R: statmod::invgauss
mean = loc
shape = concentration
dispersion = 1 / concentration. Used only if shape is NULL.
Python: scipy.stats.invgauss
mu = loc / concentration
scale = concentration
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
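Examples
A minimal usage sketch (parameter values are illustrative):
library(tfprobability)
d <- tfd_inverse_gaussian(loc = 1, concentration = 2)
d %>% tfd_sample(5)
d %>% tfd_mean() # equals loc for this parameterization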
Johnson's SU-distribution.
Description
This distribution has parameters: shape parameters skewness
and
tailweight
, location loc
, and scale
.
Usage
tfd_johnson_s_u(
skewness,
tailweight,
loc,
scale,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = NULL
)
Arguments
skewness |
Floating-point |
tailweight |
Floating-point |
loc |
Floating-point |
scale |
Floating-point |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical details
The probability density function (pdf) is,
pdf(x; s, t, xi, sigma) = exp(-0.5 (s + t arcsinh(y))**2) / Z
where
s = skewness
t = tailweight
y = (x - xi) / sigma
Z = sigma sqrt(2 pi) sqrt(1 + y**2) / t
where:
-
loc = xi
, -
scale = sigma
, and, -
Z
is the normalization constant. The JohnsonSU distribution is a member of the location-scale family, i.e., it can be constructed as,
X ~ JohnsonSU(skewness, tailweight, loc=0, scale=1) Y = loc + scale * X
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
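Examples
A minimal usage sketch (parameter values are illustrative):
library(tfprobability)
d <- tfd_johnson_s_u(skewness = -2, tailweight = 2, loc = 1.1, scale = 1.5)
d %>% tfd_sample(5)
d %>% tfd_log_prob(1)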
Joint distribution parameterized by named distribution-making functions.
Description
This distribution enables both sampling and joint probability computation from
a single model specification.
A joint distribution is a collection of possibly interdependent distributions.
Like JointDistributionSequential
, JointDistributionNamed
is parameterized
by several distribution-making functions. Unlike JointDistributionSequential,
each distribution-making function must have its own key. Additionally, every
distribution-making function's arguments must refer only to specified keys.
Usage
tfd_joint_distribution_named(model, validate_args = FALSE, name = NULL)
Arguments
model |
named list of distribution-making functions each with required args corresponding only to other keys in the named list. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
name |
The name for ops managed by the distribution. Default value: |
Details
Mathematical Details
Internally JointDistributionNamed
implements the chain rule of probability.
That is, the probability function of a length-d
vector x
is,
p(x) = prod{ p(x[i] | x[:i]) : i = 0, ..., (d - 1) }
The JointDistributionNamed
is parameterized by a dict
(or namedtuple
)
composed of either:
- tfp$distributions$Distribution-like instances or,
- functions which return a tfp$distributions$Distribution-like instance.
The "conditioned on" elements are represented by the function's required arguments; every argument must correspond to a key in the named distribution-making functions. Distribution-makers which are directly a Distribution-like instance are allowed for convenience and are semantically identical to a zero-argument function. When the maker takes no arguments, it is preferable to provide the distribution instance directly.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
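Examples
A sketch of a small hierarchical model; each maker's arguments name the keys it conditions on (the model itself is illustrative):
library(tfprobability)
model <- tfd_joint_distribution_named(
  list(
    e = tfd_exponential(rate = 1),                    # e ~ Exponential(1)
    n = tfd_normal(loc = 0, scale = 2),               # n ~ Normal(0, 2)
    m = function(n, e) tfd_normal(loc = n, scale = e) # m | n, e
  )
)
x <- model %>% tfd_sample() # a named list with components e, n, m
model %>% tfd_log_prob(x)   # joint log density via the chain rule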
Joint distribution parameterized by named distribution-making functions.
Description
This class provides automatic vectorization and alternative semantics for
tfd_joint_distribution_named()
, which in many cases allows for
simplifications in the model specification.
Usage
tfd_joint_distribution_named_auto_batched(
model,
batch_ndims = 0,
use_vectorized_map = TRUE,
validate_args = FALSE,
name = NULL
)
Arguments
model |
A generator that yields a sequence of |
batch_ndims |
|
use_vectorized_map |
|
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
name |
name prefixed to Ops created by this class. |
Details
Automatic vectorization
Auto-vectorized variants of JointDistribution allow the user to avoid
explicitly annotating a model's vectorization semantics.
When using manually-vectorized joint distributions, each operation in the
model must account for the possibility of batch dimensions in Distributions
and their samples. By contrast, auto-vectorized models need only describe
a single sample from the joint distribution; any batch evaluation is
automated using tf$vectorized_map
as required. In many cases this
allows for significant simplifications. For example, the following
manually-vectorized tfd_joint_distribution_named()
model:
model <- tfd_joint_distribution_named(
  list(
    x = tfd_normal(loc = 0, scale = tf$ones(3L)),
    y = tfd_normal(loc = 0, scale = 1),
    z = function(y, x) {
      tfd_normal(loc = x[reticulate::py_ellipsis(), 1:2] + y[reticulate::py_ellipsis(), tf$newaxis],
                 scale = 1)
    }
  )
)
can be written in auto-vectorized form as
model <- tfd_joint_distribution_named_auto_batched(
  list(
    x = tfd_normal(loc = 0, scale = tf$ones(3L)),
    y = tfd_normal(loc = 0, scale = 1),
    z = function(y, x) { tfd_normal(loc = x[1:2] + y, scale = 1) }
  )
)
in which we were able to avoid explicitly accounting for batch dimensions
when indexing and slicing computed quantities in the third line.
Note: auto-vectorization is still experimental and some TensorFlow ops may
be unsupported. It can be disabled by setting use_vectorized_map=FALSE
.
Alternative batch semantics
This class also provides alternative semantics for specifying a batch of
independent (non-identical) joint distributions.
Instead of simply summing the log_prob
s of component distributions
(which may have different shapes), it first reduces the component log_prob
s
to ensure that jd$log_prob(jd$sample())
always returns a scalar, unless
batch_ndims
is explicitly set to a nonzero value (in which case the result
will have the corresponding tensor rank).
The essential changes are:
An event of JointDistributionNamedAutoBatched is the list of tensors produced by $sample(); thus, the event_shape is the list containing the shapes of the sampled tensors. These combine both the event and batch dimensions of the component distributions. By contrast, the event shape of a base JointDistribution does not include batch dimensions of component distributions.
The batch_shape is a global property of the entire model, rather than a per-component property as in base JointDistributions. The global batch shape must be a prefix of the batch shapes of each component; the length of this prefix is specified by an optional argument batch_ndims. If batch_ndims is not specified, the model has batch shape ().
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
Joint distribution parameterized by distribution-making functions
Description
This distribution enables both sampling and joint probability computation from a single model specification.
Usage
tfd_joint_distribution_sequential(model, validate_args = FALSE, name = NULL)
Arguments
model |
list of either |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
name |
name prefixed to Ops created by this class. |
Details
A joint distribution is a collection of possibly interdependent distributions.
Like tf$keras$Sequential
, the JointDistributionSequential
can be specified
via a list
of functions (each responsible for making a
tfp$distributions$Distribution
-like instance). Unlike
tf$keras$Sequential
, each function can depend on the output of all previous
elements rather than only the immediately previous.
Mathematical Details
The JointDistributionSequential
implements the chain rule of probability.
That is, the probability function of a length-d
vector x
is,
p(x) = prod{ p(x[i] | x[:i]) : i = 0, ..., (d - 1) }
The JointDistributionSequential
is parameterized by a list
comprised of
either:
- tfp$distributions$Distribution-like instances or,
- callables which return a tfp$distributions$Distribution-like instance.
Each list element implements the i-th full conditional distribution, p(x[i] | x[:i]). The "conditioned on" elements are represented by the callable's required arguments. Directly providing a Distribution-like instance is a convenience and is semantically identical to a zero-argument callable. Denote the i-th callable's non-default arguments as args[i]. Since the callable is the conditional manifest, 0 <= len(args[i]) <= i - 1. When len(args[i]) < i - 1, the callable only depends on a subset of the previous distributions, specifically those at indexes: range(i - 1, i - 1 - num_args[i], -1).
Name resolution: The names of JointDistributionSequential components are defined by explicit name arguments passed to distributions (tfd.Normal(0., 1., name='x')) and/or by the argument names in distribution-making functions (lambda x: tfd.Normal(x, 1.)). Both approaches may be used in the same distribution, as long as they are consistent; referring to a single component by multiple names will raise a ValueError. Unnamed components will be assigned a dummy name.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
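Examples
A sketch with one conditional component; the callable's arguments bind to the preceding distributions in reverse order (the model itself is illustrative):
library(tfprobability)
model <- tfd_joint_distribution_sequential(
  list(
    tfd_exponential(rate = 1),                    # e
    tfd_normal(loc = 0, scale = 2),               # n
    function(n, e) tfd_normal(loc = n, scale = e) # m | n, e
  )
)
x <- model %>% tfd_sample() # a list of three tensors
model %>% tfd_log_prob(x)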
Joint distribution parameterized by distribution-making functions.
Description
This class provides automatic vectorization and alternative semantics for
tfd_joint_distribution_sequential()
, which in many cases allows for
simplifications in the model specification.
Usage
tfd_joint_distribution_sequential_auto_batched(
model,
batch_ndims = 0,
use_vectorized_map = TRUE,
validate_args = FALSE,
name = NULL
)
Arguments
model |
A generator that yields a sequence of |
batch_ndims |
|
use_vectorized_map |
|
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
name |
name prefixed to Ops created by this class. |
Details
Automatic vectorization
Auto-vectorized variants of JointDistribution allow the user to avoid
explicitly annotating a model's vectorization semantics.
When using manually-vectorized joint distributions, each operation in the
model must account for the possibility of batch dimensions in Distributions
and their samples. By contrast, auto-vectorized models need only describe
a single sample from the joint distribution; any batch evaluation is
automated using tf$vectorized_map
as required. In many cases this
allows for significant simplifications. For example, the following
manually-vectorized tfd_joint_distribution_sequential()
model:
model <- tfd_joint_distribution_sequential(
  list(
    tfd_normal(loc = 0, scale = tf$ones(3L)),
    tfd_normal(loc = 0, scale = 1),
    function(y, x) {
      tfd_normal(loc = x[reticulate::py_ellipsis(), 1:2] + y[reticulate::py_ellipsis(), tf$newaxis],
                 scale = 1)
    }
  )
)
can be written in auto-vectorized form as
model <- tfd_joint_distribution_sequential_auto_batched(
  list(
    tfd_normal(loc = 0, scale = tf$ones(3L)),
    tfd_normal(loc = 0, scale = 1),
    function(y, x) { tfd_normal(loc = x[1:2] + y, scale = 1) }
  )
)
in which we were able to avoid explicitly accounting for batch dimensions
when indexing and slicing computed quantities in the third line.
Note: auto-vectorization is still experimental and some TensorFlow ops may
be unsupported. It can be disabled by setting use_vectorized_map=FALSE
.
Alternative batch semantics
This class also provides alternative semantics for specifying a batch of
independent (non-identical) joint distributions.
Instead of simply summing the log_prob
s of component distributions
(which may have different shapes), it first reduces the component log_prob
s
to ensure that jd$log_prob(jd$sample())
always returns a scalar, unless
batch_ndims
is explicitly set to a nonzero value (in which case the result
will have the corresponding tensor rank).
The essential changes are:
An event of JointDistributionSequentialAutoBatched is the list of tensors produced by $sample(); thus, the event_shape is the list containing the shapes of the sampled tensors. These combine both the event and batch dimensions of the component distributions. By contrast, the event shape of a base JointDistribution does not include batch dimensions of component distributions.
The batch_shape is a global property of the entire model, rather than a per-component property as in base JointDistributions. The global batch shape must be a prefix of the batch shapes of each component; the length of this prefix is specified by an optional argument batch_ndims. If batch_ndims is not specified, the model has batch shape ().
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
Computes the Kullback–Leibler divergence.
Description
Denote this distribution by p and the other distribution by q.
Assuming p, q are absolutely continuous with respect to reference measure r,
the KL divergence is defined as:
KL[p, q] = E_p[log(p(X)/q(X))] = -int_F p(x) log q(x) dr(x) + int_F p(x) log p(x) dr(x) = H[p, q] - H[p]
where F denotes the support of the random variable X ~ p
, H[., .]
denotes (Shannon) cross entropy, and H[.]
denotes (Shannon) entropy.
Usage
tfd_kl_divergence(distribution, other, name = "kl_divergence")
Arguments
distribution |
The distribution being used. |
other |
|
name |
String prepended to names of ops created by this function. |
Value
self$dtype Tensor with shape [B1, ..., Bn]
representing n different calculations
of the Kullback-Leibler divergence.
See Also
Other distribution_methods:
tfd_cdf()
,
tfd_covariance()
,
tfd_cross_entropy()
,
tfd_entropy()
,
tfd_log_cdf()
,
tfd_log_prob()
,
tfd_log_survival_function()
,
tfd_mean()
,
tfd_mode()
,
tfd_prob()
,
tfd_quantile()
,
tfd_sample()
,
tfd_stddev()
,
tfd_survival_function()
,
tfd_variance()
Examples
d1 <- tfd_normal(loc = c(1, 2), scale = c(1, 0.5))
d2 <- tfd_normal(loc = c(1.5, 2), scale = c(1, 0.5))
d1 %>% tfd_kl_divergence(d2)
Kumaraswamy distribution
Description
The Kumaraswamy distribution is defined over the (0, 1)
interval using
parameters concentration1
(aka "alpha") and concentration0
(aka "beta"). It has a
shape similar to the Beta distribution, but is easier to reparameterize.
Usage
tfd_kumaraswamy(
concentration1 = 1,
concentration0 = 1,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Kumaraswamy"
)
Arguments
concentration1 |
Positive floating-point |
concentration0 |
Positive floating-point |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) is,
pdf(x; alpha, beta) = alpha * beta * x**(alpha - 1) * (1 - x**alpha)**(beta - 1)
where:
-
concentration1 = alpha
, -
concentration0 = beta
, Distribution parameters are automatically broadcast in all functions.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
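Examples
A minimal usage sketch (parameter values are illustrative):
library(tfprobability)
d <- tfd_kumaraswamy(concentration1 = 2, concentration0 = 5)
d %>% tfd_sample(5) # draws in (0, 1)
d %>% tfd_prob(0.5) # density at x = 0.5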
Laplace distribution with location loc
and scale
parameters
Description
Mathematical details
Usage
tfd_laplace(
loc,
scale,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Laplace"
)
Arguments
loc |
Floating point tensor which characterizes the location (center) of the distribution. |
scale |
Positive floating point tensor which characterizes the spread of the distribution. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
The probability density function (pdf) of this distribution is,
pdf(x; mu, sigma) = exp(-|x - mu| / sigma) / Z
Z = 2 sigma
where loc = mu
, scale = sigma
, and Z
is the normalization constant.
Note that the Laplace distribution can be thought of as two exponential distributions spliced together back-to-back. The Laplace distribution is a member of the location-scale family, i.e., it can be constructed as,
X ~ Laplace(loc=0, scale=1) Y = loc + scale * X
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
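A minimal usage sketch (not part of the original manual; parameter values are illustrative):
d <- tfd_laplace(loc = 0, scale = 1)
x <- d %>% tfd_sample(3)
d %>% tfd_log_prob(x)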
Observation distribution from a linear Gaussian state space model
Description
The state space model, sometimes called a Kalman filter, posits a
latent state vector z_t
of dimension latent_size
that evolves
over time following linear Gaussian transitions,
z_{t+1} = F * z_t + N(b; Q)
for transition matrix F
, bias b
and covariance matrix
Q
. At each timestep, we observe a noisy projection of the
latent state x_t = H * z_t + N(c; R)
. The transition and
observation models may be fixed or may vary between timesteps.
Usage
tfd_linear_gaussian_state_space_model(
num_timesteps,
transition_matrix,
transition_noise,
observation_matrix,
observation_noise,
initial_state_prior,
initial_step = 0L,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "LinearGaussianStateSpaceModel"
)
Arguments
num_timesteps |
Integer |
transition_matrix |
A transition operator, represented by a Tensor or
LinearOperator of shape |
transition_noise |
An instance of
|
observation_matrix |
An observation operator, represented by a Tensor
or LinearOperator of shape |
observation_noise |
An instance of |
initial_state_prior |
An instance of |
initial_step |
optional |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
This Distribution represents the marginal distribution on
observations, p(x)
. The marginal log_prob
is computed by
Kalman filtering, and sample
by an efficient forward
recursion. Both operations require time linear in T
, the total
number of timesteps.
Shapes
The event shape is [num_timesteps, observation_size]
, where
observation_size
is the dimension of each observation x_t
.
The observation and transition models must return consistent
shapes.
This implementation supports vectorized computation over a batch of
models. All of the parameters (prior distribution, transition and
observation operators and noise models) must have a consistent
batch shape.
Time-varying processes
Any of the model-defining parameters (prior distribution, transition
and observation operators and noise models) may be specified as a
callable taking an integer timestep t
and returning a
time-dependent value. The dimensionality (latent_size
and
observation_size
) must be the same at all timesteps.
Importantly, the timestep is passed as a Tensor
, not a Python
integer, so any conditional behavior must occur inside the
TensorFlow graph. For example, suppose we want to use a different
transition model on even days than odd days. It does not work to
write
transition_matrix <- function(t) {
  if (t %% 2 == 0) even_day_matrix else odd_day_matrix
}
since the value of t
is not fixed at graph-construction
time. Instead we need to write
transition_matrix <- function(t) {
  tf$cond(tf$equal(tf$math$mod(t, 2L), 0L),
          function() even_day_matrix,
          function() odd_day_matrix)
}
so that TensorFlow can switch between operators appropriately at runtime.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
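A minimal sketch of a two-dimensional random-walk model (not from the original manual); it assumes the tensorflow package is attached so that tf$ is available, and uses tf$ones() so that all parameters share the float32 dtype:
library(tensorflow)
library(tfprobability)
ndims <- 2L
model <- tfd_linear_gaussian_state_space_model(
  num_timesteps = 10L,
  transition_matrix = tf$linalg$LinearOperatorIdentity(ndims),
  transition_noise = tfd_multivariate_normal_diag(scale_diag = tf$ones(ndims)),
  observation_matrix = tf$linalg$LinearOperatorIdentity(ndims),
  observation_noise = tfd_multivariate_normal_diag(scale_diag = 5 * tf$ones(ndims)),
  initial_state_prior = tfd_multivariate_normal_diag(scale_diag = tf$ones(ndims))
)
x <- model %>% tfd_sample()   # shape [10, 2]
model %>% tfd_log_prob(x)     # marginal log-likelihood via Kalman filtering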
LKJ distribution on correlation matrices
Description
This is a one-parameter family of distributions on correlation matrices. The
probability density is proportional to the determinant raised to the power of
the parameter: pdf(X; eta) = Z(eta) * det(X) ** (eta - 1)
, where Z(eta)
is
a normalization constant. The uniform distribution on correlation matrices is
the special case eta = 1
.
Usage
tfd_lkj(
dimension,
concentration,
input_output_cholesky = FALSE,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "LKJ"
)
Arguments
dimension |
|
concentration |
|
input_output_cholesky |
|
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
The distribution is named after Lewandowski, Kurowicka, and Joe, who gave a sampler for the distribution in Lewandowski, Kurowicka, Joe, 2009.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
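A brief sketch (illustrative values, not from the original manual):
d <- tfd_lkj(dimension = 3, concentration = 1.5)
d %>% tfd_sample()   # a random 3 x 3 correlation matrix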
Log cumulative distribution function.
Description
Given random variable X, the log cumulative distribution function is:
tfd_log_cdf(x) := Log[ P[X <= x] ]
Often, a numerical approximation can be used for tfd_log_cdf(x)
that yields
a more accurate answer than simply taking the logarithm of the cdf when x << -1.
Usage
tfd_log_cdf(distribution, value, ...)
Arguments
distribution |
The distribution being used. |
value |
float or double Tensor. |
... |
Additional parameters passed to Python. |
Value
a Tensor of shape sample_shape(x) + self$batch_shape
with values of type self$dtype
.
See Also
Other distribution_methods: tfd_cdf(), tfd_covariance(), tfd_cross_entropy(), tfd_entropy(), tfd_kl_divergence(), tfd_log_prob(), tfd_log_survival_function(), tfd_mean(), tfd_mode(), tfd_prob(), tfd_quantile(), tfd_sample(), tfd_stddev(), tfd_survival_function(), tfd_variance()
Examples
d <- tfd_normal(loc = c(1, 2), scale = c(1, 0.5))
x <- d %>% tfd_sample()
d %>% tfd_log_cdf(x)
The log-logistic distribution.
Description
The LogLogistic distribution models positive-valued random variables
whose logarithm is a logistic distribution with loc loc
and
scale scale
. It is constructed as the exponential
transformation of a Logistic distribution.
Usage
tfd_log_logistic(
loc,
scale,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "LogLogistic"
)
Arguments
loc |
Floating-point |
scale |
Floating-point |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
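A short usage sketch (illustrative, not from the original manual):
d <- tfd_log_logistic(loc = 0, scale = 1)
x <- d %>% tfd_sample(3)   # positive-valued samples
d %>% tfd_log_prob(x)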
Log-normal distribution
Description
The LogNormal distribution models positive-valued random variables
whose logarithm is normally distributed with mean loc
and
standard deviation scale
. It is constructed as the exponential
transformation of a Normal distribution.
Usage
tfd_log_normal(
loc,
scale,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "LogNormal"
)
Arguments
loc |
Floating-point |
scale |
Floating-point |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
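A short sketch (illustrative values, not from the original manual):
d <- tfd_log_normal(loc = 0, scale = 0.5)
d %>% tfd_mean()   # exp(loc + scale^2 / 2)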
Log probability density/mass function.
Description
Log probability density/mass function.
Usage
tfd_log_prob(distribution, value, ...)
Arguments
distribution |
The distribution being used. |
value |
float or double Tensor. |
... |
Additional parameters passed to Python. |
Value
a Tensor of shape sample_shape(x) + self$batch_shape
with values of type self$dtype
.
See Also
Other distribution_methods: tfd_cdf(), tfd_covariance(), tfd_cross_entropy(), tfd_entropy(), tfd_kl_divergence(), tfd_log_cdf(), tfd_log_survival_function(), tfd_mean(), tfd_mode(), tfd_prob(), tfd_quantile(), tfd_sample(), tfd_stddev(), tfd_survival_function(), tfd_variance()
Examples
d <- tfd_normal(loc = c(1, 2), scale = c(1, 0.5))
x <- d %>% tfd_sample()
d %>% tfd_log_prob(x)
Log survival function.
Description
Given random variable X, the log survival function is defined as:
tfd_log_survival_function(x) = Log[ P[X > x] ] = Log[ 1 - P[X <= x] ] = Log[ 1 - cdf(x) ]
Usage
tfd_log_survival_function(distribution, value, ...)
Arguments
distribution |
The distribution being used. |
value |
float or double Tensor. |
... |
Additional parameters passed to Python. |
Details
Typically, different numerical approximations can be used for the log survival function, which are more accurate than 1 - cdf(x) when x >> 1.
Value
a Tensor of shape sample_shape(x) + self$batch_shape
with values of type self$dtype
.
See Also
Other distribution_methods: tfd_cdf(), tfd_covariance(), tfd_cross_entropy(), tfd_entropy(), tfd_kl_divergence(), tfd_log_cdf(), tfd_log_prob(), tfd_mean(), tfd_mode(), tfd_prob(), tfd_quantile(), tfd_sample(), tfd_stddev(), tfd_survival_function(), tfd_variance()
Examples
d <- tfd_normal(loc = c(1, 2), scale = c(1, 0.5))
x <- d %>% tfd_sample()
d %>% tfd_log_survival_function(x)
Logistic distribution with location loc and scale parameters
Description
Mathematical details
Usage
tfd_logistic(
loc,
scale,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Logistic"
)
Arguments
loc |
Floating point tensor, the means of the distribution(s). |
scale |
Floating point tensor, the scales of the distribution(s). Must contain only positive values. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
The cumulative distribution function (cdf) of this distribution is:
cdf(x; mu, sigma) = 1 / (1 + exp(-(x - mu) / sigma))
where loc = mu
and scale = sigma
.
The Logistic distribution is a member of the location-scale family, i.e., it can be constructed as,
X ~ Logistic(loc=0, scale=1)
Y = loc + scale * X
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
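A short sketch (illustrative, not from the original manual):
d <- tfd_logistic(loc = 0, scale = 1)
d %>% tfd_cdf(0)   # 0.5: the cdf evaluated at the location parameter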
The Logit-Normal distribution
Description
The Logit-Normal distribution models positive-valued random variables whose
logit (i.e., sigmoid_inverse, i.e., log(p) - log1p(-p)
) is normally
distributed with mean loc
and standard deviation scale
. It is
constructed as the sigmoid transformation, (i.e., 1 / (1 + exp(-x))
) of a
Normal distribution.
Usage
tfd_logit_normal(
loc,
scale,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "LogitNormal"
)
Arguments
loc |
Floating point tensor; the means of the distribution(s). |
scale |
Floating point tensor; the stddevs of the distribution(s). Must contain only positive values. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
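A short sketch (illustrative, not from the original manual):
d <- tfd_logit_normal(loc = 0, scale = 1)
d %>% tfd_sample(3)   # samples lie in (0, 1)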
Mean.
Description
Mean.
Usage
tfd_mean(distribution, ...)
Arguments
distribution |
The distribution being used. |
... |
Additional parameters passed to Python. |
Value
a Tensor of shape sample_shape(x) + self$batch_shape
with values of type self$dtype
.
See Also
Other distribution_methods: tfd_cdf(), tfd_covariance(), tfd_cross_entropy(), tfd_entropy(), tfd_kl_divergence(), tfd_log_cdf(), tfd_log_prob(), tfd_log_survival_function(), tfd_mode(), tfd_prob(), tfd_quantile(), tfd_sample(), tfd_stddev(), tfd_survival_function(), tfd_variance()
Examples
d <- tfd_normal(loc = c(1, 2), scale = c(1, 0.5))
d %>% tfd_mean()
Mixture distribution
Description
The Mixture
object implements batched mixture distributions.
The mixture model is defined by a Categorical
distribution (the mixture)
and a list of Distribution
objects.
Usage
tfd_mixture(
cat,
components,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Mixture"
)
Arguments
cat |
A |
components |
A list or tuple of |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Methods supported include tfd_log_prob
, tfd_prob
, tfd_mean
, tfd_sample
,
and entropy_lower_bound
.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
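A two-component Gaussian mixture as a sketch (illustrative values, not from the original manual):
mix <- tfd_mixture(
  cat = tfd_categorical(probs = c(0.3, 0.7)),
  components = list(
    tfd_normal(loc = -1, scale = 0.1),
    tfd_normal(loc = 1, scale = 0.5)
  ))
mix %>% tfd_mean()   # 0.3 * (-1) + 0.7 * 1 = 0.4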
Mixture (same-family) distribution
Description
The MixtureSameFamily
distribution implements a (batch of) mixture
distribution where all components are from different parameterizations of the
same distribution type. It is parameterized by a Categorical
"selecting
distribution" (over k
components) and a components distribution, i.e., a
Distribution
with a rightmost batch shape (equal to [k]
) which indexes
each (batch of) component.
Usage
tfd_mixture_same_family(
mixture_distribution,
components_distribution,
reparameterize = FALSE,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "MixtureSameFamily"
)
Arguments
mixture_distribution |
|
components_distribution |
|
reparameterize |
Logical, default |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
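The same two-component mixture expressed with a single batched components distribution (a sketch, not from the original manual):
d <- tfd_mixture_same_family(
  mixture_distribution = tfd_categorical(probs = c(0.3, 0.7)),
  components_distribution = tfd_normal(loc = c(-1, 1), scale = c(0.1, 0.5))
)
d %>% tfd_mean()   # 0.4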
Mode.
Description
Mode.
Usage
tfd_mode(distribution, ...)
Arguments
distribution |
The distribution being used. |
... |
Additional parameters passed to Python. |
Value
a Tensor of shape sample_shape(x) + self$batch_shape
with values of type self$dtype
.
See Also
Other distribution_methods: tfd_cdf(), tfd_covariance(), tfd_cross_entropy(), tfd_entropy(), tfd_kl_divergence(), tfd_log_cdf(), tfd_log_prob(), tfd_log_survival_function(), tfd_mean(), tfd_prob(), tfd_quantile(), tfd_sample(), tfd_stddev(), tfd_survival_function(), tfd_variance()
Examples
d <- tfd_normal(loc = c(1, 2), scale = c(1, 0.5))
d %>% tfd_mode()
Multinomial distribution
Description
This Multinomial distribution is parameterized by probs, a (batch of) length-K probability vector (K > 1) such that tf$reduce_sum(probs, -1) = 1, and a total_count number of trials, i.e., the number of trials per draw from the Multinomial. It is defined over a (batch of) length-K vector counts such that tf$reduce_sum(counts, -1) = total_count. The Multinomial is identically the Binomial distribution when K = 2.
Usage
tfd_multinomial(
total_count,
logits = NULL,
probs = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Multinomial"
)
Arguments
total_count |
Non-negative floating point tensor with shape broadcastable
to |
logits |
Floating point tensor representing unnormalized log-probabilities
of a positive event with shape broadcastable to
|
probs |
Positive floating point tensor with shape broadcastable to
|
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The Multinomial is a distribution over K-class counts, i.e., a length-K vector of non-negative integer counts = n = [n_0, ..., n_{K-1}].
The probability mass function (pmf) is,
pmf(n; pi, N) = prod_j (pi_j)**n_j / Z
Z = (prod_j n_j!) / N!
where:
- probs = pi = [pi_0, ..., pi_{K-1}], pi_j > 0, sum_j pi_j = 1,
- total_count = N, N a positive integer,
- Z is the normalization constant, and
- N! denotes N factorial.
Distribution parameters are automatically broadcast in all functions; see examples for details.
Pitfalls
The number of classes, K, must not exceed:
- the largest integer representable by self$dtype, i.e., 2**(mantissa_bits+1) (IEEE 754),
- the maximum Tensor index, i.e., 2**31 - 1.
Note: This condition is validated only when validate_args = TRUE.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
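A short sketch (illustrative counts and probabilities, not from the original manual):
d <- tfd_multinomial(total_count = 4, probs = c(0.2, 0.3, 0.5))
d %>% tfd_prob(c(1, 2, 1))   # counts must sum to total_count
d %>% tfd_mean()             # total_count * probs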
Multivariate normal distribution on R^k
Description
The Multivariate Normal distribution is defined over R^k and parameterized by a (batch of) length-k loc vector (aka "mu") and a (batch of) k x k scale matrix; covariance = scale @ scale.T, where @ denotes matrix multiplication.
Usage
tfd_multivariate_normal_diag(
loc = NULL,
scale_diag = NULL,
scale_identity_multiplier = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "MultivariateNormalDiag"
)
Arguments
loc |
Floating-point Tensor. If this is set to NULL, loc is implicitly 0.
When specified, may have shape |
scale_diag |
Non-zero, floating-point Tensor representing a diagonal matrix added to scale.
May have shape |
scale_identity_multiplier |
Non-zero, floating-point Tensor representing a scaled-identity-matrix
added to scale. May have shape |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) is,
pdf(x; loc, scale) = exp(-0.5 ||y||**2) / Z
y = inv(scale) @ (x - loc)
Z = (2 pi)**(0.5 k) |det(scale)|
where:
- loc is a vector in R^k,
- scale is a linear operator in R^{k x k}, cov = scale @ scale.T,
- Z denotes the normalization constant, and
- ||y||**2 denotes the squared Euclidean norm of y.
A (non-batch) scale
matrix is:
scale = diag(scale_diag + scale_identity_multiplier * ones(k))
where:
- scale_diag.shape = [k], and
- scale_identity_multiplier.shape = [].
Additional leading dimensions (if any) will index batches.
If both scale_diag
and scale_identity_multiplier
are NULL
, then
scale
is the Identity matrix.
The MultivariateNormal distribution is a member of the
location-scale family, i.e., it can be
constructed as,
X ~ MultivariateNormal(loc=0, scale=1)  # Identity scale, zero shift.
Y = scale @ X + loc
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
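A two-dimensional sketch (illustrative values, not from the original manual):
d <- tfd_multivariate_normal_diag(loc = c(1, -1), scale_diag = c(1, 2))
d %>% tfd_mean()     # c(1, -1)
d %>% tfd_stddev()   # c(1, 2)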
Multivariate normal distribution on R^k
Description
The Multivariate Normal distribution is defined over R^k and parameterized by a (batch of) length-k loc vector (aka "mu") and a (batch of) k x k scale matrix; covariance = scale @ scale.T, where @ denotes matrix multiplication.
Usage
tfd_multivariate_normal_diag_plus_low_rank(
loc = NULL,
scale_diag = NULL,
scale_identity_multiplier = NULL,
scale_perturb_factor = NULL,
scale_perturb_diag = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "MultivariateNormalDiagPlusLowRank"
)
Arguments
loc |
Floating-point Tensor. If this is set to NULL, loc is implicitly 0.
When specified, may have shape |
scale_diag |
Non-zero, floating-point Tensor representing a diagonal matrix added to scale.
May have shape |
scale_identity_multiplier |
Non-zero, floating-point Tensor representing a scaled-identity-matrix
added to scale. May have shape |
scale_perturb_factor |
Floating-point |
scale_perturb_diag |
Floating-point |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) is,
pdf(x; loc, scale) = exp(-0.5 ||y||**2) / Z
y = inv(scale) @ (x - loc)
Z = (2 pi)**(0.5 k) |det(scale)|
where:
- loc is a vector in R^k,
- scale is a linear operator in R^{k x k}, cov = scale @ scale.T,
- Z denotes the normalization constant, and
- ||y||**2 denotes the squared Euclidean norm of y.
A (non-batch) scale
matrix is:
scale = diag(scale_diag + scale_identity_multiplier * ones(k)) + scale_perturb_factor @ diag(scale_perturb_diag) @ scale_perturb_factor.T
where:
- scale_diag.shape = [k],
- scale_identity_multiplier.shape = [],
- scale_perturb_factor.shape = [k, r], typically k >> r, and
- scale_perturb_diag.shape = [r].
Additional leading dimensions (if any) will index batches.
If both scale_diag
and scale_identity_multiplier
are NULL
, then
scale
is the Identity matrix.
The MultivariateNormal distribution is a member of the
location-scale family, i.e., it can be
constructed as,
X ~ MultivariateNormal(loc=0, scale=1)  # Identity scale, zero shift.
Y = scale @ X + loc
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
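A sketch with a rank-1 perturbation (illustrative values; leaving scale_perturb_diag unset is assumed to default to an identity perturbation):
d <- tfd_multivariate_normal_diag_plus_low_rank(
  loc = c(1, -1),
  scale_diag = c(1, 2),
  scale_perturb_factor = matrix(c(1, 0.5), ncol = 1)   # shape [k, r] with k = 2, r = 1
)
d %>% tfd_sample(2)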
Multivariate normal distribution on R^k
Description
The Multivariate Normal distribution is defined over R^k and parameterized by a (batch of) length-k loc vector (aka "mu") and a (batch of) k x k scale matrix; covariance = scale @ scale.T, where @ denotes matrix multiplication.
Usage
tfd_multivariate_normal_full_covariance(
loc = NULL,
covariance_matrix = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "MultivariateNormalFullCovariance"
)
Arguments
loc |
Floating-point |
covariance_matrix |
Floating-point, symmetric positive definite |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) is,
pdf(x; loc, scale) = exp(-0.5 ||y||**2) / Z
y = inv(scale) @ (x - loc)
Z = (2 pi)**(0.5 k) |det(scale)|
where:
- loc is a vector in R^k,
- scale is a linear operator in R^{k x k}, cov = scale @ scale.T,
- Z denotes the normalization constant, and
- ||y||**2 denotes the squared Euclidean norm of y.
The MultivariateNormal distribution is a member of the location-scale family, i.e., it can be constructed as,
X ~ MultivariateNormal(loc=0, scale=1)  # Identity scale, zero shift.
Y = scale @ X + loc
The batch_shape
is the broadcast shape between loc
and
covariance_matrix
arguments.
The event_shape
is given by last dimension of the matrix implied by
covariance_matrix
. The last dimension of loc
(if provided) must
broadcast with this.
A non-batch covariance_matrix
matrix is a k x k
symmetric positive
definite matrix. In other words it is (real) symmetric with all eigenvalues
strictly positive.
Additional leading dimensions (if any) will index batches.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
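A three-dimensional sketch (illustrative positive definite covariance, not from the original manual):
mu <- c(1, 2, 3)
cov <- matrix(c(0.36, 0.12, 0.06,
                0.12, 0.29, -0.13,
                0.06, -0.13, 0.26), nrow = 3, byrow = TRUE)
d <- tfd_multivariate_normal_full_covariance(loc = mu, covariance_matrix = cov)
d %>% tfd_mean()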
The multivariate normal distribution on R^k
Description
The Multivariate Normal distribution is defined over R^k and parameterized by a (batch of) length-k loc vector (aka "mu") and a (batch of) k x k scale matrix; covariance = scale @ scale.T, where @ denotes matrix multiplication.
Usage
tfd_multivariate_normal_linear_operator(
loc = NULL,
scale = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "MultivariateNormalLinearOperator"
)
Arguments
loc |
Floating-point |
scale |
Instance of |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) is,
pdf(x; loc, scale) = exp(-0.5 ||y||**2) / Z
y = inv(scale) @ (x - loc)
Z = (2 pi)**(0.5 k) |det(scale)|
where:
- loc is a vector in R^k,
- scale is a linear operator in R^{k x k}, cov = scale @ scale.T,
- Z denotes the normalization constant, and
- ||y||**2 denotes the squared Euclidean norm of y.
The MultivariateNormal distribution is a member of the location-scale family, i.e., it can be constructed as,
X ~ MultivariateNormal(loc=0, scale=1)  # Identity scale, zero shift.
Y = scale @ X + loc
The batch_shape
is the broadcast shape between loc
and scale
arguments.
The event_shape
is given by last dimension of the matrix implied by
scale
. The last dimension of loc
(if provided) must broadcast with this.
Recall that covariance = scale @ scale.T
.
Additional leading dimensions (if any) will index batches.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
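A sketch using a lower-triangular LinearOperator as scale; it assumes the tensorflow package is attached so that tf$linalg is available (illustrative values, not from the original manual):
scale <- tf$linalg$LinearOperatorLowerTriangular(
  matrix(c(1, 0, 0.5, 2), nrow = 2, byrow = TRUE))
d <- tfd_multivariate_normal_linear_operator(loc = c(1, -1), scale = scale)
d %>% tfd_covariance()   # scale %*% t(scale)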
The multivariate normal distribution on R^k
Description
The Multivariate Normal distribution is defined over R^k and parameterized by a (batch of) length-k loc vector (aka "mu") and a (batch of) k x k scale matrix; covariance = scale @ scale.T, where @ denotes matrix multiplication.
Usage
tfd_multivariate_normal_tri_l(
loc = NULL,
scale_tril = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "MultivariateNormalTriL"
)
Arguments
loc |
Floating-point |
scale_tril |
Floating-point, lower-triangular |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) is,
pdf(x; loc, scale) = exp(-0.5 ||y||**2) / Z
y = inv(scale) @ (x - loc)
Z = (2 pi)**(0.5 k) |det(scale)|
where:
- loc is a vector in R^k,
- scale is a linear operator in R^{k x k}, cov = scale @ scale.T,
- Z denotes the normalization constant, and,
- ||y||**2 denotes the squared Euclidean norm of y.
A (non-batch) scale
matrix is:
scale = scale_tril
where scale_tril
is a lower-triangular k x k
matrix with non-zero diagonal,
i.e., tf$diag_part(scale_tril) != 0
.
Additional leading dimensions (if any) will index batches.
The MultivariateNormal distribution is a member of the location-scale family, i.e., it can be constructed as,
X ~ MultivariateNormal(loc=0, scale=1)  # Identity scale, zero shift.
Y = scale @ X + loc
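A minimal sketch (the covariance values are illustrative): initialize a 3-variate Gaussian from the Cholesky factor of a known covariance matrix.
library(tensorflow)
library(tfprobability)
mu <- c(1, 2, 3)
cov <- matrix(c(0.36,  0.12,  0.06,
                0.12,  0.29, -0.13,
                0.06, -0.13,  0.26), nrow = 3, byrow = TRUE)
# scale_tril is lower-triangular, with scale_tril %*% t(scale_tril) == cov.
scale <- tf$linalg$cholesky(cov)
d <- tfd_multivariate_normal_tri_l(loc = mu, scale_tril = scale)
d %>% tfd_mean()
d %>% tfd_sample(2)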
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
Multivariate Student's t-distribution on R^k
Description
The Multivariate Student's t-distribution is defined over R^k, parameterized by a positive scalar degrees of freedom df, a (batch of) location vector loc, and a scale matrix in R^{k x k} supplied as a linear operator (with Sigma = scale @ scale.T).
Usage
tfd_multivariate_student_t_linear_operator(
df,
loc,
scale,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "MultivariateStudentTLinearOperator"
)
Arguments
df |
A positive floating-point |
loc |
Floating-point |
scale |
Instance of |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
The probability density function (pdf) is,
pdf(x; df, loc, Sigma) = (1 + ||y||**2 / df)**(-0.5 (df + k)) / Z
where,
y = inv(Sigma) (x - loc)
Z = abs(det(Sigma)) sqrt(df pi)**k Gamma(0.5 df) / Gamma(0.5 (df + k))
where:
- df is a positive scalar.
- loc is a vector in R^k,
- Sigma is a positive definite shape matrix in R^{k x k}, parameterized as scale @ scale.T in this class,
- Z denotes the normalization constant, and,
- ||y||**2 denotes the squared Euclidean norm of y.
The Multivariate Student's t-distribution distribution is a member of the location-scale family, i.e., it can be constructed as,
X ~ MultivariateT(loc=0, scale=1)  # Identity scale, zero shift.
Y = scale @ X + loc
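A minimal sketch (the 2-d scale below is an illustrative choice), passing the scale as a lower-triangular linear operator:
library(tensorflow)
library(tfprobability)
sigma <- matrix(c(2, 1,
                  1, 2), nrow = 2, byrow = TRUE)
# Sigma = scale %*% t(scale), with scale the Cholesky factor of sigma.
scale <- tf$linalg$LinearOperatorLowerTriangular(tf$linalg$cholesky(sigma))
d <- tfd_multivariate_student_t_linear_operator(df = 3, loc = c(1, 2), scale = scale)
d %>% tfd_sample(5)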
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
NegativeBinomial distribution
Description
The NegativeBinomial distribution is related to the experiment of performing
Bernoulli trials in sequence. Given a Bernoulli trial with probability p
of
success, the NegativeBinomial distribution represents the distribution over
the number of successes s
that occur until we observe f
failures.
Usage
tfd_negative_binomial(
total_count,
logits = NULL,
probs = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "NegativeBinomial"
)
Arguments
total_count |
Non-negative floating-point |
logits |
Floating-point |
probs |
Positive floating-point |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
The probability mass function (pmf) is,
pmf(s; f, p) = p**s (1 - p)**f / Z
Z = s! (f - 1)! / (s + f - 1)!
where:
- total_count = f,
- probs = p,
- Z is the normalizing constant, and,
- n! is the factorial of n.
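For instance (parameter values illustrative), the pmf and moments can be evaluated directly:
library(tfprobability)
# Successes before observing total_count = 5 failures, success probability 0.3.
d <- tfd_negative_binomial(total_count = 5, probs = 0.3)
d %>% tfd_prob(2)   # pmf at s = 2
d %>% tfd_mean()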
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
Normal distribution with loc and scale parameters
Description
The Normal (Gaussian) distribution is parameterized by loc (the mean) and scale (the standard deviation).
Usage
tfd_normal(
loc,
scale,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Normal"
)
Arguments
loc |
Floating point tensor; the means of the distribution(s). |
scale |
Floating point tensor; the stddevs of the distribution(s). Must contain only positive values. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
The probability density function (pdf) is,
pdf(x; mu, sigma) = exp(-0.5 (x - mu)**2 / sigma**2) / Z
Z = (2 pi sigma**2)**0.5
where loc = mu
is the mean, scale = sigma
is the std. deviation, and, Z
is the normalization constant.
The Normal distribution is a member of the location-scale family, i.e., it can be
constructed as,
X ~ Normal(loc=0, scale=1)
Y = loc + scale * X
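A short sketch of this location-scale relationship (values illustrative):
library(tfprobability)
x <- tfd_normal(loc = 0, scale = 1)   # standard normal X
y <- tfd_normal(loc = 1, scale = 2)   # distribution of Y = 1 + 2 * X
# P[Y <= 3] equals P[X <= (3 - 1) / 2]:
y %>% tfd_cdf(3)
x %>% tfd_cdf(1)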
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
OneHotCategorical distribution
Description
The categorical distribution is parameterized by the log-probabilities of a set of classes. The difference between OneHotCategorical and Categorical distributions is that OneHotCategorical is a discrete distribution over one-hot bit vectors whereas Categorical is a discrete distribution over positive integers. OneHotCategorical is equivalent to Categorical except Categorical has event_dim=() while OneHotCategorical has event_dim=K, where K is the number of classes.
Usage
tfd_one_hot_categorical(
logits = NULL,
probs = NULL,
dtype = tf$int32,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "OneHotCategorical"
)
Arguments
logits |
An N-D Tensor, N >= 1, representing the log probabilities of a set of Categorical distributions. The first N - 1 dimensions index into a batch of independent distributions and the last dimension represents a vector of logits for each class. Only one of logits or probs should be passed in. |
probs |
An N-D Tensor, N >= 1, representing the probabilities of a set of Categorical distributions. The first N - 1 dimensions index into a batch of independent distributions and the last dimension represents a vector of probabilities for each class. Only one of logits or probs should be passed in. |
dtype |
The type of the event samples (default: int32). |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
This class provides methods to create indexed batches of OneHotCategorical distributions. If the provided logits or probs is rank 2 or higher, for every fixed set of leading dimensions, the last dimension represents one single OneHotCategorical distribution. When calling distribution functions (e.g. dist.prob(x)), logits and x are broadcast to the same shape (if possible). In all cases, the last dimension of logits, x represents single OneHotCategorical distributions.
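A minimal sketch (logits illustrative); samples are one-hot vectors of length 3:
library(tfprobability)
d <- tfd_one_hot_categorical(logits = c(-2, 2, 0))
d %>% tfd_sample(3)           # three one-hot draws
d %>% tfd_prob(c(0, 1, 0))    # probability of the second class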
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
Pareto distribution
Description
The Pareto distribution is parameterized by a scale
and a
concentration
parameter.
Usage
tfd_pareto(
concentration,
scale = 1,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Pareto"
)
Arguments
concentration |
Floating point tensor. Must contain only positive values. |
scale |
Floating point tensor, equivalent to |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) is,
pdf(x; alpha, scale, x >= scale) = alpha * scale ** alpha / x ** (alpha + 1)
where concentration = alpha. Note that scale acts as a scaling parameter, since Pareto(c, scale).pdf(x) == Pareto(c, 1.).pdf(x / scale). The support of the distribution is defined on [scale, infinity).
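For example (parameter values illustrative):
library(tfprobability)
d <- tfd_pareto(concentration = 2, scale = 1)
d %>% tfd_prob(1.5)   # density inside the support [1, infinity)
d %>% tfd_cdf(3)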
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
Modified PERT distribution for modeling expert predictions.
Description
The PERT distribution is a loc-scale family of Beta distributions
fit onto a real interval between low
and high
values set by the user,
along with a peak
to indicate the expert's most frequent prediction,
and temperature
to control how sharp the peak is.
Usage
tfd_pert(
low,
peak,
high,
temperature = 4,
validate_args = FALSE,
allow_nan_stats = FALSE,
name = "Pert"
)
Arguments
low |
lower bound |
peak |
most frequent value |
high |
upper bound |
temperature |
controls the shape of the distribution |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
The distribution is similar to a Triangular distribution
(i.e. tfd.Triangular
) but with a smooth peak.
Mathematical Details
In terms of a Beta distribution, PERT can be expressed as
PERT ~ loc + scale * Beta(concentration1, concentration0)
where
loc = low
scale = high - low
concentration1 = 1 + temperature * (peak - low) / (high - low)
concentration0 = 1 + temperature * (high - peak) / (high - low)
temperature > 0
The support is [low, high]
. The peak
must fit in that interval:
low < peak < high
. The temperature
is a positive parameter that
controls the shape of the distribution. Higher values yield a sharper peak.
The standard PERT distribution is obtained when temperature = 4
.
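A small sketch (values illustrative): an expert predicts outcomes between 1 and 11, most often around 7.
library(tfprobability)
d <- tfd_pert(low = 1, peak = 7, high = 11, temperature = 4)
d %>% tfd_mean()
d %>% tfd_sample(5)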
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
The Pixel CNN++ distribution
Description
Pixel CNN++ (Salimans et al., 2017) models a distribution over image
data, parameterized by a neural network. It builds on Pixel CNN and
Conditional Pixel CNN, originally proposed by van den Oord et al. (2016).
The model expresses the joint distribution over pixels as
the product of conditional distributions:
p(x|h) = prod{ p(x[i] | x[0:i], h) : i=0, ..., d }
, in which
p(x[i] | x[0:i], h) : i=0, ..., d
is the
probability of the i
-th pixel conditional on the pixels that preceded it in
raster order (color channels in RGB order, then left to right, then top to
bottom). h
is optional additional data on which to condition the image
distribution, such as class labels or VAE embeddings. The Pixel CNN++
network enforces the dependency structure among pixels by applying a mask to
the kernels of the convolutional layers that ensures that the values for each
pixel depend only on other pixels up and to the left.
Pixel values are modeled with a mixture of quantized logistic distributions,
which can take on a set of distinct integer values (e.g. between 0 and 255
for an 8-bit image).
Color intensity v
of each pixel is modeled as:
v ~ sum{q[i] * quantized_logistic(loc[i], scale[i]) : i = 0, ..., k }
,
in which k
is the number of mixture components and the q[i]
are the
Categorical probabilities over the components.
Usage
tfd_pixel_cnn(
image_shape,
conditional_shape = NULL,
num_resnet = 5,
num_hierarchies = 3,
num_filters = 160,
num_logistic_mix = 10,
receptive_field_dims = c(3, 3),
dropout_p = 0.5,
resnet_activation = "concat_elu",
use_weight_norm = TRUE,
use_data_init = TRUE,
high = 255,
low = 0,
dtype = tf$float32,
name = "PixelCNN"
)
Arguments
image_shape |
3D |
conditional_shape |
|
num_resnet |
|
num_hierarchies |
|
num_filters |
|
num_logistic_mix |
|
receptive_field_dims |
|
dropout_p |
|
resnet_activation |
|
use_weight_norm |
|
use_data_init |
|
high |
|
low |
|
dtype |
Data type of the |
name |
|
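As a hedged sketch only (the tiny image shape and network sizes below are illustrative and far smaller than practical settings; training is omitted):
library(tensorflow)
library(tfprobability)
d <- tfd_pixel_cnn(
  image_shape = c(8L, 8L, 3L),   # tiny illustrative image
  num_resnet = 1L,
  num_hierarchies = 1L,
  num_filters = 32L,
  num_logistic_mix = 2L
)
d %>% tfd_sample()               # one 8 x 8 x 3 image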
Value
a distribution instance.
References
Salimans, T., Karpathy, A., Chen, X. and Kingma, D. P. PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications. ICLR 2017.
van den Oord, A., Kalchbrenner, N., Vinyals, O., Espeholt, L., Graves, A. and Kavukcuoglu, K. Conditional Image Generation with PixelCNN Decoders. NeurIPS 2016.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
Plackett-Luce distribution over permutations.
Description
The Plackett-Luce distribution is defined over permutations of
fixed length. It is parameterized by a positive score vector of the same length.
This class provides methods to create indexed batches of PlackettLuce
distributions. If the provided scores
is rank 2 or higher, for
every fixed set of leading dimensions, the last dimension represents one
single PlackettLuce distribution. When calling distribution
functions (e.g. dist.log_prob(x)
), scores
and x
are broadcast to the
same shape (if possible). In all cases, the last dimension of scores, x
represents single PlackettLuce distributions.
Usage
tfd_plackett_luce(
scores,
dtype = tf$int32,
validate_args = FALSE,
allow_nan_stats = FALSE,
name = "PlackettLuce"
)
Arguments
scores |
An N-D |
dtype |
The type of the event samples (default: int32). |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The Plackett-Luce is a distribution over permutation vectors p
of length k
where the permutation p
is an arbitrary ordering of k
indices
{0, 1, ..., k-1}
.
The probability mass function (pmf) is,
pmf(p; s) = prod_i s_{p_i} / (Z - Z_i)
Z = sum_{j=0}^{k-1} s_j
Z_i = sum_{j=0}^{i-1} s_{p_j} for i > 0, and 0 for i = 0
where scores = s = [s_0, ..., s_{k-1}]
, s_i >= 0
.
Samples from Plackett-Luce distribution are generated sequentially as follows.
Initialize normalization N_0 = Z
For i in {0, 1, ..., k-1}:
1. Sample the i-th element of the permutation: p_i ~ Categorical(probs=[s_0/N_i, ..., s_{k-1}/N_i])
2. Update the normalization: N_{i+1} = N_i - s_{p_i}
3. Mask out the sampled index for subsequent rounds: s_{p_i} = 0
Return p
Alternately, an equivalent way to sample from this distribution is to sort Gumbel-perturbed log-scores (Grover et al., 2019):
p = argsort(log s + g) ~ PlackettLuce(s)
g = [g_0, ..., g_{k-1}], g_i ~ Gumbel(0, 1)
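A minimal sketch (scores illustrative); each draw is a permutation of the indices 0, 1, 2:
library(tfprobability)
d <- tfd_plackett_luce(scores = c(0.1, 2, 5))
d %>% tfd_sample(2)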
Value
a distribution instance.
References
Aditya Grover, Eric Wang, Aaron Zweig, Stefano Ermon. Stochastic Optimization of Sorting Networks via Continuous Relaxations. ICLR 2019.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Poisson distribution
Description
The Poisson distribution is parameterized by an event rate
parameter.
Usage
tfd_poisson(
rate = NULL,
log_rate = NULL,
interpolate_nondiscrete = TRUE,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Poisson"
)
Arguments
rate |
Floating point tensor, the rate parameter. |
log_rate |
Floating point tensor, the log of the rate parameter.
Must specify exactly one of |
interpolate_nondiscrete |
Logical. When |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability mass function (pmf) is,
pmf(k; lambda, k >= 0) = (lambda^k / k!) / Z
Z = exp(lambda)
where rate = lambda
and Z
is the normalizing constant.
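For instance (rate illustrative):
library(tfprobability)
d <- tfd_poisson(rate = 3)
d %>% tfd_prob(2)    # pmf at k = 2, i.e. 3^2 exp(-3) / 2!
d %>% tfd_mean()     # equals the rate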
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
PoissonLogNormalQuadratureCompound
distribution
Description
The PoissonLogNormalQuadratureCompound
is an approximation to a
Poisson-LogNormal compound distribution, i.e.,
p(k|loc, scale) = int_{R_+} dl LogNormal(l | loc, scale) Poisson(k | l)
approx= sum{ prob[d] Poisson(k | lambda(grid[d])) : d=0, ..., deg-1 }
Usage
tfd_poisson_log_normal_quadrature_compound(
loc,
scale,
quadrature_size = 8,
quadrature_fn = tfp$distributions$quadrature_scheme_lognormal_quantiles,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "PoissonLogNormalQuadratureCompound"
)
Arguments
loc |
|
scale |
|
quadrature_size |
|
quadrature_fn |
Function taking |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
By default, the grid
is chosen as quantiles of the LogNormal
distribution
parameterized by loc
, scale
and the prob
vector is
[1. / quadrature_size]*quadrature_size
.
In the non-approximation case, a draw from the LogNormal prior represents the
Poisson rate parameter. Unfortunately, the non-approximate distribution lacks
an analytical probability density function (pdf). Therefore the
PoissonLogNormalQuadratureCompound
class implements an approximation based
on quadrature.
Note: although the PoissonLogNormalQuadratureCompound
is approximately the
Poisson-LogNormal compound distribution, it is itself a valid distribution.
Viz., it possesses a sample
, log_prob
, mean
, variance
, etc. which are
all mutually consistent.
Mathematical Details
The PoissonLogNormalQuadratureCompound
approximates a Poisson-LogNormal
compound distribution.
Using variable-substitution and numerical quadrature (default:
based on LogNormal
quantiles) we can redefine the distribution to be a
parameter-less convex combination of deg
different Poisson samples.
That is, defined over positive integers, this distribution is parameterized
by a (batch of) loc
and scale
scalars.
The probability density function (pdf) is,
pdf(k | loc, scale, deg) = sum{ prob[d] Poisson(k | lambda=exp(grid[d])) : d=0, ..., deg-1 }
Note: probs
returned by (optional) quadrature_fn
are presumed to be
either a length-quadrature_size
vector or a batch of vectors in 1-to-1
correspondence with the returned grid
. (I.e., broadcasting is only partially supported.)
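A minimal sketch (values illustrative); loc and scale parameterize the LogNormal prior over the Poisson rate:
library(tfprobability)
d <- tfd_poisson_log_normal_quadrature_compound(
  loc = 0, scale = 1, quadrature_size = 10L
)
d %>% tfd_mean()
d %>% tfd_sample(3)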
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
The Power Spherical distribution over unit vectors on S^{n-1}
.
Description
The Power Spherical distribution is a distribution over vectors
on the unit hypersphere S^{n-1}
embedded in n
dimensions (R^n
).
It serves as an alternative to the von Mises-Fisher distribution with a
simpler (faster) log_prob
calculation, as well as a reparameterizable
sampler. Unlike the von Mises-Fisher distribution, which has non-zero density everywhere, the Power Spherical distribution has zero density at -mean_direction (and hence arbitrarily small density in a neighborhood around that point).
NOTE: mean_direction
is not in general the mean of the distribution. For
spherical distributions, the mean is generally not in the support of the
distribution.
Usage
tfd_power_spherical(
mean_direction,
concentration,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "PowerSpherical"
)
Arguments
mean_direction |
Floating-point |
concentration |
Floating-point |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical details
The probability density function (pdf) is,
pdf(x; mu, kappa) = C(kappa) (1 + mu^T x) ** kappa
where,
C(kappa) = 2**(a + b) pi**b Gamma(a) / Gamma(a + b)
a = (n - 1) / 2. + kappa
b = (n - 1) / 2.
where
- mean_direction = mu; a unit vector in R^n,
- concentration = kappa; scalar real >= 0, the concentration of samples around mean_direction, where 0 pertains to the uniform distribution on the hypersphere, and infinity indicates a delta function at mean_direction.
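A minimal sketch (values illustrative): a distribution on the unit circle S^1, concentrated around the direction (0, 1).
library(tfprobability)
d <- tfd_power_spherical(mean_direction = c(0, 1), concentration = 10)
d %>% tfd_sample(3)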
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_probit_bernoulli()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
Probability density/mass function.
Description
Probability density/mass function.
Usage
tfd_prob(distribution, value, ...)
Arguments
distribution |
The distribution being used. |
value |
float or double Tensor. |
... |
Additional parameters passed to Python. |
Value
a Tensor of shape sample_shape(x) + self$batch_shape
with values of type self$dtype
.
See Also
Other distribution_methods:
tfd_cdf()
,
tfd_covariance()
,
tfd_cross_entropy()
,
tfd_entropy()
,
tfd_kl_divergence()
,
tfd_log_cdf()
,
tfd_log_prob()
,
tfd_log_survival_function()
,
tfd_mean()
,
tfd_mode()
,
tfd_quantile()
,
tfd_sample()
,
tfd_stddev()
,
tfd_survival_function()
,
tfd_variance()
Examples
d <- tfd_normal(loc = c(1, 2), scale = c(1, 0.5))
x <- d %>% tfd_sample()
d %>% tfd_prob(x)
ProbitBernoulli distribution.
Description
The ProbitBernoulli distribution with probs
parameter, i.e., the probability
of a 1
outcome (vs a 0
outcome). Unlike a regular Bernoulli distribution,
which uses the logistic (aka 'sigmoid') function to go from the un-constrained
parameters to probabilities, this distribution uses the CDF of the
standard normal distribution:
p(x=1; probits) = 0.5 * (1 + erf(probits / sqrt(2)))
p(x=0; probits) = 1 - p(x=1; probits)
where erf is the error function.
A typical application of this distribution is in
probit regression.
Usage
tfd_probit_bernoulli(
probits = NULL,
probs = NULL,
dtype = tf$int32,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "ProbitBernoulli"
)
Arguments
probits |
An N-D |
probs |
An N-D |
dtype |
The type of the event samples. Default: |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
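For instance (probits illustrative), the mean is the standard normal CDF of the probits:
library(tfprobability)
d <- tfd_probit_bernoulli(probits = c(-1, 0, 1))
d %>% tfd_mean()    # Phi(-1), Phi(0) = 0.5, Phi(1)
d %>% tfd_sample(2)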
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_quantized()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
Quantile function. Aka "inverse cdf" or "percent point function".
Description
Given random variable X and p in [0, 1]
, the quantile is:
tfd_quantile(p) := x
such that P[X <= x] == p
Usage
tfd_quantile(distribution, value, ...)
Arguments
distribution |
The distribution being used. |
value |
float or double Tensor. |
... |
Additional parameters passed to Python. |
Value
a Tensor of shape sample_shape(x) + self$batch_shape
with values of type self$dtype
.
See Also
Other distribution_methods:
tfd_cdf()
,
tfd_covariance()
,
tfd_cross_entropy()
,
tfd_entropy()
,
tfd_kl_divergence()
,
tfd_log_cdf()
,
tfd_log_prob()
,
tfd_log_survival_function()
,
tfd_mean()
,
tfd_mode()
,
tfd_prob()
,
tfd_sample()
,
tfd_stddev()
,
tfd_survival_function()
,
tfd_variance()
Examples
d <- tfd_normal(loc = c(1, 2), scale = c(1, 0.5))
d %>% tfd_quantile(0.5)
Distribution representing the quantization Y = ceiling(X)
Description
Definition in Terms of Sampling
Usage
tfd_quantized(
distribution,
low = NULL,
high = NULL,
validate_args = FALSE,
name = "QuantizedDistribution"
)
Arguments
distribution |
The base distribution class to transform. Typically an
instance of |
low |
|
high |
|
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
name |
name prefixed to Ops created by this class. |
Details
1. Draw X
2. Set Y <-- ceiling(X)
3. If Y < low, reset Y <-- low
4. If Y > high, reset Y <-- high
5. Return Y
Definition in Terms of the Probability Mass Function
Given scalar random variable X
, we define a discrete random variable Y
supported on the integers as follows:
P[Y = j] := P[X <= low],          if j == low,
         := P[X > high - 1],      if j == high,
         := 0,                    if j < low or j > high,
         := P[j - 1 < X <= j],    all other j.
Conceptually, without cutoffs, the quantization process partitions the real
line R
into half open intervals, and identifies an integer j
with the
right endpoints:
R = ... (-2, -1](-1, 0](0, 1](1, 2](2, 3](3, 4] ...
j = ...    -1     0     1     2     3     4    ...
P[Y = j]
is the mass of X
within the jth
interval.
If low = 0
, and high = 2
, then the intervals are redrawn
and j
is re-assigned:
R = (-infty, 0](0, 1](1, infty)
j =      0      1       2
P[Y = j]
is still the mass of X
within the jth
interval.
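A minimal sketch (cutoffs illustrative): quantizing a standard normal.
library(tfprobability)
d <- tfd_quantized(tfd_normal(loc = 0, scale = 1), low = -3, high = 3)
d %>% tfd_prob(0)    # P[Y = 0] = P[-1 < X <= 0]
d %>% tfd_sample(5)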
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive()
,
tfd_batch_reshape()
,
tfd_bates()
,
tfd_bernoulli()
,
tfd_beta_binomial()
,
tfd_beta()
,
tfd_binomial()
,
tfd_categorical()
,
tfd_cauchy()
,
tfd_chi2()
,
tfd_chi()
,
tfd_cholesky_lkj()
,
tfd_continuous_bernoulli()
,
tfd_deterministic()
,
tfd_dirichlet_multinomial()
,
tfd_dirichlet()
,
tfd_empirical()
,
tfd_exp_gamma()
,
tfd_exp_inverse_gamma()
,
tfd_exponential()
,
tfd_gamma_gamma()
,
tfd_gamma()
,
tfd_gaussian_process_regression_model()
,
tfd_gaussian_process()
,
tfd_generalized_normal()
,
tfd_geometric()
,
tfd_gumbel()
,
tfd_half_cauchy()
,
tfd_half_normal()
,
tfd_hidden_markov_model()
,
tfd_horseshoe()
,
tfd_independent()
,
tfd_inverse_gamma()
,
tfd_inverse_gaussian()
,
tfd_johnson_s_u()
,
tfd_joint_distribution_named_auto_batched()
,
tfd_joint_distribution_named()
,
tfd_joint_distribution_sequential_auto_batched()
,
tfd_joint_distribution_sequential()
,
tfd_kumaraswamy()
,
tfd_laplace()
,
tfd_linear_gaussian_state_space_model()
,
tfd_lkj()
,
tfd_log_logistic()
,
tfd_log_normal()
,
tfd_logistic()
,
tfd_mixture_same_family()
,
tfd_mixture()
,
tfd_multinomial()
,
tfd_multivariate_normal_diag_plus_low_rank()
,
tfd_multivariate_normal_diag()
,
tfd_multivariate_normal_full_covariance()
,
tfd_multivariate_normal_linear_operator()
,
tfd_multivariate_normal_tri_l()
,
tfd_multivariate_student_t_linear_operator()
,
tfd_negative_binomial()
,
tfd_normal()
,
tfd_one_hot_categorical()
,
tfd_pareto()
,
tfd_pixel_cnn()
,
tfd_poisson_log_normal_quadrature_compound()
,
tfd_poisson()
,
tfd_power_spherical()
,
tfd_probit_bernoulli()
,
tfd_relaxed_bernoulli()
,
tfd_relaxed_one_hot_categorical()
,
tfd_sample_distribution()
,
tfd_sinh_arcsinh()
,
tfd_skellam()
,
tfd_spherical_uniform()
,
tfd_student_t_process()
,
tfd_student_t()
,
tfd_transformed_distribution()
,
tfd_triangular()
,
tfd_truncated_cauchy()
,
tfd_truncated_normal()
,
tfd_uniform()
,
tfd_variational_gaussian_process()
,
tfd_vector_diffeomixture()
,
tfd_vector_exponential_diag()
,
tfd_vector_exponential_linear_operator()
,
tfd_vector_laplace_diag()
,
tfd_vector_laplace_linear_operator()
,
tfd_vector_sinh_arcsinh_diag()
,
tfd_von_mises_fisher()
,
tfd_von_mises()
,
tfd_weibull()
,
tfd_wishart_linear_operator()
,
tfd_wishart_tri_l()
,
tfd_wishart()
,
tfd_zipf()
RelaxedBernoulli distribution with temperature and logits parameters
Description
The RelaxedBernoulli is a distribution over the unit interval (0,1), which continuously approximates a Bernoulli. The degree of approximation is controlled by a temperature: as the temperature goes to 0 the RelaxedBernoulli becomes discrete with a distribution described by the logits or probs parameters, as the temperature goes to infinity the RelaxedBernoulli becomes the constant distribution that is identically 0.5.
Usage
tfd_relaxed_bernoulli(
temperature,
logits = NULL,
probs = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "RelaxedBernoulli"
)
Arguments
temperature |
An 0-D Tensor, representing the temperature of a set of RelaxedBernoulli distributions. The temperature should be positive. |
logits |
An N-D Tensor representing the log-odds of a positive event. Each entry in the Tensor parametrizes an independent RelaxedBernoulli distribution where the probability of an event is sigmoid(logits). Only one of logits or probs should be passed in. |
probs |
An N-D Tensor representing the probability of a positive event. Each entry in the Tensor parameterizes an independent Bernoulli distribution. Only one of logits or probs should be passed in. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
The RelaxedBernoulli distribution is a reparameterized continuous distribution that is the binary special case of the RelaxedOneHotCategorical distribution (Maddison et al., 2016; Jang et al., 2016). For details on the binary special case see the appendix of Maddison et al. (2016) where it is referred to as BinConcrete. If you use this distribution, please cite both papers.
Some care needs to be taken for loss functions that depend on the
log-probability of RelaxedBernoullis, because computing log-probabilities of
the RelaxedBernoulli can suffer from underflow issues. In many cases, loss
functions such as these are invariant under invertible transformations of
the random variables. The KL divergence, found in the variational autoencoder
loss, is an example. Because RelaxedBernoullis are sampled by a Logistic
random variable followed by a tf$sigmoid
op, one solution is to treat
the Logistic as the random variable and tf$sigmoid
as downstream. The
KL divergences of two Logistics, which are always followed by a tf.sigmoid
op, is equivalent to evaluating KL divergences of RelaxedBernoulli samples.
See Maddison et al., 2016 for more details where this distribution is called
the BinConcrete.
An alternative approach is to evaluate Bernoulli log probability or KL
directly on relaxed samples, as done in Jang et al., 2016. In this case,
guarantees on the loss are usually violated. For instance, using a Bernoulli
KL in a relaxed ELBO is no longer a lower bound on the log marginal
probability of the observation. Thus care and early stopping are important.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
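Examples
A minimal sketch, added here (not part of the original manual) and assuming a working TensorFlow Probability installation; it illustrates how the temperature controls the degree of relaxation:
# low temperature: samples concentrate near 0 and 1
d_cold <- tfd_relaxed_bernoulli(temperature = 0.1, logits = c(-2, 2))
d_cold %>% tfd_sample(5)
# high temperature: samples cluster around 0.5
d_hot <- tfd_relaxed_bernoulli(temperature = 10, logits = c(-2, 2))
d_hot %>% tfd_sample(5)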
RelaxedOneHotCategorical distribution with temperature and logits
Description
The RelaxedOneHotCategorical is a distribution over random probability vectors, vectors of positive real values that sum to one, which continuously approximates a OneHotCategorical. The degree of approximation is controlled by a temperature: as the temperature goes to 0, the RelaxedOneHotCategorical becomes discrete with a distribution described by the logits or probs parameters; as the temperature goes to infinity, the RelaxedOneHotCategorical becomes the constant distribution that is identically the vector (1/event_size, ..., 1/event_size).
The RelaxedOneHotCategorical distribution was concurrently introduced as the Gumbel-Softmax (Jang et al., 2016) and Concrete (Maddison et al., 2016) distributions for use as a reparameterized continuous approximation to the Categorical one-hot distribution. If you use this distribution, please cite both papers.
Usage
tfd_relaxed_one_hot_categorical(
temperature,
logits = NULL,
probs = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "RelaxedOneHotCategorical"
)
Arguments
temperature |
A 0-D Tensor, representing the temperature of a set of RelaxedOneHotCategorical distributions. The temperature should be positive. |
logits |
An N-D Tensor, N >= 1, representing the log probabilities of a set of RelaxedOneHotCategorical distributions. The first N - 1 dimensions index into a batch of independent distributions and the last dimension represents a vector of logits for each class. Only one of logits or probs should be passed in. |
probs |
An N-D Tensor, N >= 1, representing the probabilities of a set of RelaxedOneHotCategorical distributions. The first N - 1 dimensions index into a batch of independent distributions and the last dimension represents a vector of probabilities for each class. Only one of logits or probs should be passed in. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Value
a distribution instance.
References
Eric Jang, Shixiang Gu, and Ben Poole. Categorical Reparameterization with Gumbel-Softmax. 2016.
Chris J. Maddison, Andriy Mnih, and Yee Whye Teh. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables. 2016.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
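Examples
An illustrative sketch, added here (not from the original manual), of a three-class relaxed one-hot distribution; each draw is a probability vector summing to one:
d <- tfd_relaxed_one_hot_categorical(temperature = 0.5, logits = c(-1, 0, 2))
d %>% tfd_sample(4)  # Tensor of shape (4, 3); each row sums to 1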
Generate samples of the specified shape.
Description
Note that a call to tfd_sample()
without arguments will generate a single sample.
Usage
tfd_sample(distribution, sample_shape = list(), ...)
Arguments
distribution |
The distribution being used. |
sample_shape |
0D or 1D int32 Tensor. Shape of the generated samples. |
... |
Additional parameters passed to Python. |
Value
a Tensor with prepended dimensions sample_shape.
See Also
Other distribution_methods: tfd_cdf(), tfd_covariance(), tfd_cross_entropy(), tfd_entropy(), tfd_kl_divergence(), tfd_log_cdf(), tfd_log_prob(), tfd_log_survival_function(), tfd_mean(), tfd_mode(), tfd_prob(), tfd_quantile(), tfd_stddev(), tfd_survival_function(), tfd_variance()
Examples
d <- tfd_normal(loc = c(1, 2), scale = c(1, 0.5))
d %>% tfd_sample()
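A further illustrative line, added here (not part of the original example): passing a sample_shape draws that many samples per batch member:
d %>% tfd_sample(3)  # Tensor of shape (3, 2): sample_shape is prepended to the batch shape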
Sample distribution via independent draws.
Description
This distribution is useful for reducing over a collection of independent, identical draws. It is otherwise identical to the input distribution.
Usage
tfd_sample_distribution(
distribution,
sample_shape = list(),
validate_args = FALSE,
name = NULL
)
Arguments
distribution |
The base distribution instance to transform. Typically an
instance of |
sample_shape |
|
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
name |
The name for ops managed by the distribution.
Default value: |
Details
Mathematical Details
The probability function is,
p(x) = prod{ p(x[i]) : i = 0, ..., (n - 1) }
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
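Examples
A minimal sketch, added here (not from the original manual): five i.i.d. draws from a unit normal treated as a single event:
d <- tfd_sample_distribution(tfd_normal(loc = 0, scale = 1), sample_shape = 5)
x <- d %>% tfd_sample()  # Tensor of shape (5)
d %>% tfd_log_prob(x)    # scalar: the sum of the five component log-probabilities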
The SinhArcsinh transformation of a distribution on (-inf, inf)
Description
This distribution models a random variable, making use of a SinhArcsinh transformation (which has adjustable tailweight and skew), a rescaling, and a shift. The SinhArcsinh transformation of the Normal is described in great depth in Sinh-arcsinh distributions. Here we use a slightly different parameterization, in terms of tailweight and skewness. Additionally we allow for distributions other than Normal, and control over scale as well as a "shift" parameter loc.
Usage
tfd_sinh_arcsinh(
loc,
scale,
skewness = NULL,
tailweight = NULL,
distribution = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "SinhArcsinh"
)
Arguments
loc |
Floating-point |
scale |
|
skewness |
Skewness parameter. Default is |
tailweight |
Tailweight parameter. Default is |
distribution |
|
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
Given random variable Z, we define the SinhArcsinh transformation of Z, Y, parameterized by (loc, scale, skewness, tailweight), via the relation:
Y := loc + scale * F(Z) * (2 / F_0(2))
F(Z) := Sinh( (Arcsinh(Z) + skewness) * tailweight )
F_0(Z) := Sinh( Arcsinh(Z) * tailweight )
This distribution is similar to the location-scale transformation L(Z) := loc + scale * Z in the following ways:
- If skewness = 0 and tailweight = 1 (the defaults), F(Z) = Z, and then Y = L(Z) exactly.
- loc is used in both to shift the result by a constant factor.
- The multiplication of scale by 2 / F_0(2) ensures that if skewness = 0, P[Y - loc <= 2 * scale] = P[L(Z) - loc <= 2 * scale]. Thus it can be said that the weights in the tails of Y and L(Z) beyond loc + 2 * scale are the same.
This distribution is different from loc + scale * Z due to the reshaping done by F:
- Positive (negative) skewness leads to positive (negative) skew.
  - Positive skew means the mode of F(Z) is "tilted" to the right.
  - Positive skew means positive values of F(Z) become more likely, and negative values become less likely.
- Larger (smaller) tailweight leads to fatter (thinner) tails.
  - Fatter tails mean larger values of |F(Z)| become more likely.
  - tailweight < 1 leads to a distribution that is "flat" around Y = loc, with a very steep drop-off in the tails.
  - tailweight > 1 leads to a distribution more peaked at the mode, with heavier tails.
To see the argument about the tails, note that for |Z| >> 1 and |Z| >> (|skewness| * tailweight)**tailweight, we have Y approx 0.5 Z**tailweight e**(sign(Z) skewness * tailweight).
To see the argument regarding multiplying scale by 2 / F_0(2):
P[(Y - loc) / scale <= 2] = P[F(Z) * (2 / F_0(2)) <= 2]
                          = P[F(Z) <= F_0(2)]
                          = P[Z <= 2]  (if F = F_0).
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
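Examples
A hedged sketch, added here (not from the original manual), of how skewness and tailweight reshape a standard normal base distribution:
d_sym  <- tfd_sinh_arcsinh(loc = 0, scale = 1)  # defaults: Y = loc + scale * Z
d_skew <- tfd_sinh_arcsinh(loc = 0, scale = 1, skewness = 1, tailweight = 1.5)
d_sym %>% tfd_sample(3)   # symmetric, normal-like draws
d_skew %>% tfd_sample(3)  # right-skewed, heavier-tailed draws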
Skellam distribution.
Description
The Skellam distribution is parameterized by two rate parameters, rate1 and rate2. Its samples are defined as:
x ~ Poisson(rate1)
y ~ Poisson(rate2)
z = x - y
z ~ Skellam(rate1, rate2)
where the samples x and y are assumed to be independent.
Usage
tfd_skellam(
rate1 = NULL,
rate2 = NULL,
log_rate1 = NULL,
log_rate2 = NULL,
force_probs_to_zero_outside_support = FALSE,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Skellam"
)
Arguments
rate1 |
Floating point tensor, the first rate parameter. |
rate2 |
Floating point tensor, the second rate parameter. |
log_rate1 |
Floating point tensor, the log of the first rate parameter.
Must specify exactly one of |
log_rate2 |
Floating point tensor, the log of the second rate parameter.
Must specify exactly one of |
force_probs_to_zero_outside_support |
logical. When |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability mass function (pmf) is,
pmf(k; l1, l2) = (l1 / l2) ** (k / 2) * I_k(2 * sqrt(l1 * l2)) / Z
Z = exp(l1 + l2)
where rate1 = l1, rate2 = l2, Z is the normalizing constant, and I_k is the modified Bessel function of the first kind.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
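Examples
A minimal sketch, added here (not from the original manual); by the construction above, the mean of Skellam(rate1, rate2) is rate1 - rate2:
d <- tfd_skellam(rate1 = 3, rate2 = 1)
d %>% tfd_mean()     # 2
d %>% tfd_sample(5)  # integer-valued draws, possibly negative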
The uniform distribution over unit vectors on S^{n-1}.
Description
The uniform distribution on the unit hypersphere S^{n-1} embedded in n dimensions (R^n).
Usage
tfd_spherical_uniform(
dimension,
batch_shape = list(),
dtype = tf$float32,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "SphericalUniform"
)
Arguments
dimension |
|
batch_shape |
Positive |
dtype |
dtype of the generated samples. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical details
The probability density function (pdf) is,
pdf(x; n) = 1 / A(n)
A(n) = 2 * pi^{n / 2} / Gamma(n / 2)
where n = dimension corresponds to S^{n-1} embedded in R^n, and Gamma is the Gamma function.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
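Examples
An illustrative sketch, added here (not from the original manual): draws are unit vectors in R^3:
d <- tfd_spherical_uniform(dimension = 3)
x <- d %>% tfd_sample(2)  # Tensor of shape (2, 3)
tf$norm(x, axis = -1L)    # each row has norm 1, up to numerical error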
Standard deviation.
Description
Standard deviation is defined as, stddev = E[(X - E[X])**2]**0.5, where X is the random variable associated with this distribution, E denotes expectation, and stddev$shape = batch_shape + event_shape.
Usage
tfd_stddev(distribution, ...)
Arguments
distribution |
The distribution being used. |
... |
Additional parameters passed to Python. |
Value
a Tensor of shape sample_shape(x) + self$batch_shape
with values of type self$dtype
.
See Also
Other distribution_methods: tfd_cdf(), tfd_covariance(), tfd_cross_entropy(), tfd_entropy(), tfd_kl_divergence(), tfd_log_cdf(), tfd_log_prob(), tfd_log_survival_function(), tfd_mean(), tfd_mode(), tfd_prob(), tfd_quantile(), tfd_sample(), tfd_survival_function(), tfd_variance()
Examples
d <- tfd_normal(loc = c(1, 2), scale = c(1, 0.5))
d %>% tfd_stddev()
Student's t-distribution
Description
This distribution has parameters: degrees of freedom df, location loc, and scale.
Usage
tfd_student_t(
df,
loc,
scale,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "StudentT"
)
Arguments
df |
Floating-point |
loc |
Floating-point |
scale |
Floating-point |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical details
The probability density function (pdf) is,
pdf(x; df, mu, sigma) = (1 + y**2 / df)**(-0.5 (df + 1)) / Z
where
y = (x - mu) / sigma
Z = abs(sigma) sqrt(df pi) Gamma(0.5 df) / Gamma(0.5 (df + 1))
and:
- loc = mu,
- scale = sigma,
- Z is the normalization constant, and
- Gamma is the gamma function.
The StudentT distribution is a member of the location-scale family, i.e., it can be constructed as,
X ~ StudentT(df, loc=0, scale=1)
Y = loc + scale * X
Notice that scale has semantics more similar to standard deviation than variance. However it is not actually the standard deviation; the Student's t-distribution standard deviation is scale sqrt(df / (df - 2)) when df > 2.
Samples of this distribution are reparameterized (pathwise differentiable). The derivatives are computed using the approach described in the paper Michael Figurnov, Shakir Mohamed, Andriy Mnih. Implicit Reparameterization Gradients, 2018
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
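Examples
A minimal sketch, added here (not from the original manual), relating scale to the actual standard deviation, scale * sqrt(df / (df - 2)) for df > 2:
d <- tfd_student_t(df = 5, loc = 0, scale = 2)
d %>% tfd_stddev()   # 2 * sqrt(5 / 3), not 2
d %>% tfd_sample(3)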
Marginal distribution of a Student's T process at finitely many points
Description
A Student's T process (TP) is an indexed collection of random variables, any finite collection of which are jointly Multivariate Student's T. While this definition applies to finite index sets, it is typically implicit that the index set is infinite; in applications, it is often some finite dimensional real or complex vector space. In such cases, the TP may be thought of as a distribution over (real- or complex-valued) functions defined over the index set.
Usage
tfd_student_t_process(
df,
kernel,
index_points,
mean_fn = NULL,
jitter = 1e-06,
validate_args = FALSE,
allow_nan_stats = FALSE,
name = "StudentTProcess"
)
Arguments
df |
Positive Floating-point |
kernel |
|
index_points |
|
mean_fn |
Function that acts on |
jitter |
|
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Just as Student's T distributions are fully specified by their degrees of freedom, location and scale, a Student's T process can be completely specified by a degrees of freedom parameter, mean function and covariance function.
Let S denote the index set and K the space in which each indexed random variable takes its values (again, often R or C). The mean function is then a map m: S -> K, and the covariance function, or kernel, is a positive-definite function k: (S x S) -> K. The properties of functions drawn from a TP are entirely dictated (up to translation) by the form of the kernel function.
This Distribution represents the marginal joint distribution over function values at a given finite collection of points [x[1], ..., x[N]] from the index set S. By definition, this marginal distribution is just a multivariate Student's T distribution, whose mean is given by the vector [ m(x[1]), ..., m(x[N]) ] and whose covariance matrix is constructed from pairwise applications of the kernel function to the given inputs:
| k(x[1], x[1])  k(x[1], x[2])  ...  k(x[1], x[N]) |
| k(x[2], x[1])  k(x[2], x[2])  ...  k(x[2], x[N]) |
|      ...            ...       ...       ...      |
| k(x[N], x[1])  k(x[N], x[2])  ...  k(x[N], x[N]) |
For this to be a valid covariance matrix, it must be symmetric and positive definite; hence the requirement that k be a positive definite function (which, by definition, says that the above procedure will yield PD matrices).
Note also we use a parameterization as suggested in Shah et al. (2014), which requires df to be greater than 2. This allows the covariance for any finite dimensional marginal of the TP (a multivariate Student's T distribution) to just be the PD matrix generated by the kernel.
Mathematical Details
The probability density function (pdf) is a multivariate Student's T whose parameters are derived from the TP's properties:
pdf(x; df, index_points, mean_fn, kernel) = MultivariateStudentT(df, loc, K)
K = (df - 2) / df * (kernel.matrix(index_points, index_points) + jitter * eye(N))
loc = mean_fn(index_points)
where:
- df is the degrees of freedom parameter for the TP,
- index_points are points in the index set over which the TP is defined,
- mean_fn is a callable mapping the index set to the TP's mean values,
- kernel is PositiveSemidefiniteKernel-like and represents the covariance function of the TP,
- jitter is added to the diagonal to ensure positive definiteness up to machine precision (otherwise Cholesky decomposition is prone to failure), and
- eye(N) is an N-by-N identity matrix.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
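Examples
An illustrative sketch, added here (not from the original manual; it assumes the psd_kernels module is reachable through the exported tfp object):
kernel <- tfp$math$psd_kernels$ExponentiatedQuadratic()
index_points <- matrix(seq(-1, 1, length.out = 10), ncol = 1)
tp <- tfd_student_t_process(df = 3, kernel = kernel, index_points = index_points)
tp %>% tfd_sample(2)  # two draws of function values at the 10 index points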
Survival function.
Description
Given random variable X, the survival function is defined:
tfd_survival_function(x) = P[X > x] = 1 - P[X <= x] = 1 - cdf(x).
Usage
tfd_survival_function(distribution, value, ...)
Arguments
distribution |
The distribution being used. |
value |
float or double Tensor. |
... |
Additional parameters passed to Python. |
Value
a Tensor of shape sample_shape(x) + self$batch_shape
with values of type self$dtype
.
See Also
Other distribution_methods: tfd_cdf(), tfd_covariance(), tfd_cross_entropy(), tfd_entropy(), tfd_kl_divergence(), tfd_log_cdf(), tfd_log_prob(), tfd_log_survival_function(), tfd_mean(), tfd_mode(), tfd_prob(), tfd_quantile(), tfd_sample(), tfd_stddev(), tfd_variance()
Examples
d <- tfd_normal(loc = c(1, 2), scale = c(1, 0.5))
x <- d %>% tfd_sample()
d %>% tfd_survival_function(x)
A Transformed Distribution
Description
A TransformedDistribution models p(y) given a base distribution p(x) and a deterministic, invertible, differentiable transform, Y = g(X). The transform is typically an instance of the Bijector class and the base distribution is typically an instance of the Distribution class.
Usage
tfd_transformed_distribution(
distribution,
bijector,
batch_shape = NULL,
event_shape = NULL,
kwargs_split_fn = NULL,
validate_args = FALSE,
parameters = NULL,
name = NULL
)
Arguments
distribution |
The base distribution instance to transform. Typically an instance of Distribution. |
bijector |
The object responsible for calculating the transformation. Typically an instance of Bijector. |
batch_shape |
integer vector Tensor which overrides distribution batch_shape; valid only if distribution.is_scalar_batch(). |
event_shape |
integer vector Tensor which overrides distribution event_shape; valid only if distribution.is_scalar_event(). |
kwargs_split_fn |
Python |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
parameters |
Locals dict captured by subclass constructor, to be used for copy/slice re-instantiation operations. |
name |
The name for ops managed by the distribution. Default value: bijector.name + distribution.name. |
Details
A Bijector is expected to implement the following functions:
- forward,
- inverse,
- inverse_log_det_jacobian.
The semantics of these functions are outlined in the Bijector documentation.
We now describe how a TransformedDistribution alters the input/outputs of a Distribution associated with a random variable (rv) X.
Write cdf(Y=y) for an absolutely continuous cumulative distribution function of random variable Y; write the probability density function pdf(Y=y) := d^k / (dy_1,...,dy_k) cdf(Y=y) for its derivative with respect to y. Assume that Y = g(X) where g is a deterministic diffeomorphism, i.e., a non-random, continuous, differentiable, and invertible function. Write the inverse of g as X = g^{-1}(Y) and (J o g)(x) for the Jacobian of g evaluated at x.
A TransformedDistribution implements the following operations:
- sample:
  Mathematically: Y = g(X)
  Programmatically: bijector.forward(distribution.sample(...))
- log_prob:
  Mathematically: (log o pdf)(Y=y) = (log o pdf o g^{-1})(y) + (log o abs o det o J o g^{-1})(y)
  Programmatically: distribution.log_prob(bijector.inverse(y)) + bijector.inverse_log_det_jacobian(y)
- log_cdf:
  Mathematically: (log o cdf)(Y=y) = (log o cdf o g^{-1})(y)
  Programmatically: distribution.log_cdf(bijector.inverse(y))
and similarly for: cdf, prob, log_survival_function, survival_function.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
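Examples
A minimal sketch, added here (not from the original manual): pushing a standard normal through the Exp bijector yields a log-normal:
d <- tfd_transformed_distribution(
  distribution = tfd_normal(loc = 0, scale = 1),
  bijector = tfb_exp()
)
d %>% tfd_sample(3)    # strictly positive draws
d %>% tfd_log_prob(1)  # agrees with tfd_log_normal(loc = 0, scale = 1) at x = 1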
Triangular distribution with low, high and peak parameters
Description
The parameters low, high and peak must be shaped in a way that supports broadcasting (e.g., high - low is a valid operation).
Usage
tfd_triangular(
low = 0,
high = 1,
peak = 0.5,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Triangular"
)
Arguments
low |
Floating point tensor, lower boundary of the output interval. Must
have |
high |
Floating point tensor, upper boundary of the output interval. Must
have |
peak |
Floating point tensor, mode of the output interval. Must have
|
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
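Examples
A minimal sketch, added here (not from the original manual): a triangular density on [0, 10] peaking at 8:
d <- tfd_triangular(low = 0, high = 10, peak = 8)
d %>% tfd_sample(3)  # draws fall in [0, 10]
d %>% tfd_prob(8)    # the density is highest at the peak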
The Truncated Cauchy distribution.
Description
The truncated Cauchy is a Cauchy distribution bounded between low and high (the pdf is 0 outside these bounds and renormalized). Samples from this distribution are differentiable with respect to loc and scale, but not with respect to the bounds low and high.
Usage
tfd_truncated_cauchy(
loc,
scale,
low,
high,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "TruncatedCauchy"
)
Arguments
loc |
Floating point tensor; the modes of the corresponding non-truncated Cauchy distribution(s). |
scale |
Floating point tensor; the scales of the distribution(s). Must contain only positive values. |
low |
|
high |
|
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) of this distribution is:
pdf(x; loc, scale, low, high) = { 1 / (pi * scale * (1 + z**2) * A)   for low <= x <= high
                                { 0                                   otherwise
where
z = (x - loc) / scale
A = CauchyCDF((high - loc) / scale) - CauchyCDF((low - loc) / scale)
and CauchyCDF is the cumulative distribution function of the Cauchy distribution with location 0 and scale 1.
This is a scalar distribution so the event shape is always scalar and the
dimensions of the parameters define the batch_shape.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
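Examples
A minimal sketch, added here (not from the original manual): a standard Cauchy truncated to [-1, 1]:
d <- tfd_truncated_cauchy(loc = 0, scale = 1, low = -1, high = 1)
d %>% tfd_sample(5)  # all draws fall in [-1, 1]
d %>% tfd_cdf(1)     # 1 at the upper bound, since the pdf is renormalized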
Truncated Normal distribution
Description
The truncated normal is a normal distribution bounded between low and high (the pdf is 0 outside these bounds and renormalized). Samples from this distribution are differentiable with respect to loc, scale as well as the bounds low and high, i.e., this implementation is fully reparameterizable.
Usage
tfd_truncated_normal(
loc,
scale,
low,
high,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "TruncatedNormal"
)
Arguments
loc |
Floating point tensor; the means of the distribution(s). |
scale |
Floating point tensor; the stddevs of the distribution(s). Must contain only positive values. |
low |
|
high |
|
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) of this distribution is:
pdf(x; loc, scale, low, high) = { (2 pi)**(-0.5) exp(-0.5 y**2) / (scale * z)   for low <= x <= high
                                { 0                                             otherwise
where
y = (x - loc) / scale
z = NormalCDF((high - loc) / scale) - NormalCDF((low - loc) / scale)
and NormalCDF is the cumulative distribution function of the Normal distribution with 0 mean and unit variance.
This is a scalar distribution so the event shape is always scalar and the dimensions of the parameters define the batch_shape.
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
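Examples
A minimal sketch, added here (not from the original manual): a unit normal truncated to [0, 2]; truncation shifts the mean above loc:
d <- tfd_truncated_normal(loc = 0, scale = 1, low = 0, high = 2)
d %>% tfd_sample(5)  # all draws fall in [0, 2]
d %>% tfd_mean()     # greater than 0, because mass below 0 is cut away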
Uniform distribution with low and high parameters
Description
Mathematical Details
Usage
tfd_uniform(
low = 0,
high = 1,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Uniform"
)
Arguments
low |
Floating point tensor, lower boundary of the output interval. Must
have |
high |
Floating point tensor, upper boundary of the output interval. Must
have |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
The probability density function (pdf) is,
pdf(x; a, b) = I[a <= x < b] / Z
Z = b - a
where:
- low = a,
- high = b,
- Z is the normalizing constant, and
- I[predicate] is the indicator function for predicate.
The parameters low and high must be shaped in a way that supports broadcasting (e.g., high - low is a valid operation).
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
Variance.
Description
Variance is defined as, Var = E[(X - E[X])**2]
where X is the random variable associated with this distribution, E denotes expectation,
and Var$shape = batch_shape + event_shape
.
Usage
tfd_variance(distribution, ...)
Arguments
distribution |
The distribution being used. |
... |
Additional parameters passed to Python. |
Value
a Tensor of shape sample_shape(x) + self$batch_shape
with values of type self$dtype
.
See Also
Other distribution_methods: tfd_cdf(), tfd_covariance(), tfd_cross_entropy(), tfd_entropy(), tfd_kl_divergence(), tfd_log_cdf(), tfd_log_prob(), tfd_log_survival_function(), tfd_mean(), tfd_mode(), tfd_prob(), tfd_quantile(), tfd_sample(), tfd_stddev(), tfd_survival_function()
Examples
d <- tfd_normal(loc = c(1, 2), scale = c(1, 0.5))
d %>% tfd_variance()
Posterior predictive of a variational Gaussian process
Description
This distribution implements the variational Gaussian process (VGP), as
described in Titsias (2009) and Hensman (2013). The VGP is an
inducing point-based approximation of an exact GP posterior.
Ultimately, this Distribution class represents a marginal distribution over function values at a
collection of index_points
. It is parameterized by
- a kernel function,
- a mean function,
- the (scalar) observation noise variance of the normal likelihood,
- a set of index points,
- a set of inducing index points, and
- the parameters of the (full-rank, Gaussian) variational posterior distribution over function values at the inducing points, conditional on some observations.
Usage
tfd_variational_gaussian_process(
kernel,
index_points,
inducing_index_points,
variational_inducing_observations_loc,
variational_inducing_observations_scale,
mean_fn = NULL,
observation_noise_variance = 0,
predictive_noise_variance = 0,
jitter = 1e-06,
validate_args = FALSE,
allow_nan_stats = FALSE,
name = "VariationalGaussianProcess"
)
Arguments
kernel |
|
index_points |
|
inducing_index_points |
|
variational_inducing_observations_loc |
|
variational_inducing_observations_scale |
|
mean_fn |
function that acts on index points to produce a (batch of) vector(s) of mean values at those index points. Takes a Tensor of shape [b1, ..., bB, f1, ..., fF] and returns a Tensor whose shape is broadcastable with [b1, ..., bB]. Default value: NULL implies the constant zero function. |
observation_noise_variance |
|
predictive_noise_variance |
|
jitter |
|
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
A VGP is "trained" by selecting any kernel parameters, the locations of the
inducing index points, and the variational parameters. Titsias (2009) and
Hensman (2013) describe a variational lower bound on the marginal log
likelihood of observed data, which this class offers through the
variational_loss
method (this is the negative lower bound, for convenience
when plugging into a TF Optimizer's minimize
function).
Training may be done in minibatches.
Titsias (2009) describes a closed form for the optimal variational
parameters, in the case of sufficiently small observational data (i.e.,
small enough to fit in memory but big enough to warrant approximating the GP
posterior). A method to compute these optimal parameters in terms of the full
observational data set is provided as a staticmethod,
optimal_variational_posterior
. It returns a
MultivariateNormalLinearOperator
instance with optimal location and scale parameters.
Mathematical Details
Notation: We will in general be concerned with three collections of index points, and it will be good to give them names:
- x[1], ..., x[N]: observation index points – locations of our observed data.
- z[1], ..., z[M]: inducing index points – locations of the "summarizing" inducing points.
- t[1], ..., t[P]: predictive index points – locations where we are making posterior predictions based on observations and the variational parameters.
To lighten notation, we'll use X, Z, T
to denote the above collections.
Similarly, we'll denote by f(X)
the collection of function values at each of
the x[i]
, and by Y
, the collection of (noisy) observed data at each x[i]
.
We'll denote kernel matrices generated from pairs of index points as K_tt
,
K_xt
, K_tz
, etc, e.g.,
K_tz = | k(t[1], z[1])  k(t[1], z[2])  ...  k(t[1], z[M]) |
       | k(t[2], z[1])  k(t[2], z[2])  ...  k(t[2], z[M]) |
       |      ...            ...       ...       ...      |
       | k(t[P], z[1])  k(t[P], z[2])  ...  k(t[P], z[M]) |
Preliminaries
A Gaussian process is an indexed collection of random variables, any finite
collection of which are jointly Gaussian. Typically, the index set is some
finite-dimensional, real vector space, and indeed we make this assumption in
what follows. The GP may then be thought of as a distribution over functions
on the index set. Samples from the GP are functions on the whole index set;
these can't be represented in finite compute memory, so one typically works
with the marginals at a finite collection of index points. The properties of
the GP are entirely determined by its mean function m
and covariance
function k
. The generative process, assuming a mean-zero normal likelihood
with stddev sigma
, is
f ~ GP(m, k)
Y | f(X) ~ Normal(f(X), sigma), i = 1, ..., N
In finite terms (i.e., marginalizing out all but a finite number of f(X), sigma), we can write
f(X) ~ MVN(loc=m(X), cov=K_xx)
Y | f(X) ~ Normal(f(X), sigma), i = 1, ..., N
Posterior inference is possible in analytical closed form but becomes intractable as data sizes get large. See Rasmussen (2006) for details.
The VGP
The VGP is an inducing point-based approximation of an exact GP posterior, where two approximating assumptions have been made:
- function values at non-inducing points are mutually independent conditioned on function values at the inducing points,
- the (expensive) posterior over function values at inducing points conditional on observations is replaced with an arbitrary (learnable) full-rank Gaussian distribution,
q(f(Z)) = MVN(loc=m, scale=S),
where m
and S
are parameters to be chosen by optimizing an evidence
lower bound (ELBO).
The posterior predictive distribution becomes
q(f(T)) = integral df(Z) p(f(T) | f(Z)) q(f(Z))
        = MVN(loc = A @ m, scale = B^(1/2))
where
A = K_tz @ K_zz^-1
B = K_tt - A @ (K_zz - S S^T) A^T
The approximate posterior predictive distribution q(f(T))
is what the
VariationalGaussianProcess
class represents.
Model selection in this framework entails choosing the kernel parameters, inducing point locations, and variational parameters. We do this by optimizing a variational lower bound on the marginal log likelihood of observed data. The lower bound takes the following form (see Titsias (2009) and Hensman (2013) for details on the derivation):
L(Z, m, S, Y) = MVN(loc = (K_zx @ K_zz^-1) @ m, scale_diag = sigma).log_prob(Y)
                - (Tr(K_xx - K_zx @ K_zz^-1 @ K_xz)
                   + Tr(S @ S^T @ K_zz^-1 @ K_zx @ K_xz @ K_zz^-1)) / (2 * sigma^2)
                - KL(q(f(Z)) || p(f(Z)))
where in the final KL term, p(f(Z))
is the GP prior on inducing point
function values. This variational lower bound can be computed on minibatches
of the full data set (X, Y)
. A method to compute the negative variational
lower bound is implemented as VariationalGaussianProcess$variational_loss
.
Optimal variational parameters
As described in Titsias (2009), a closed form optimum for the variational
location and scale parameters, m
and S
, can be computed when the
observational data are not prohibitively voluminous. The
optimal_variational_posterior
function computes the optimal variational
posterior distribution over inducing point function values in terms of the GP
parameters (mean and kernel functions), inducing point locations, observation
index points, and observations. Note that the inducing index point locations
must still be optimized even when these parameters are known functions of the
inducing index points. The optimal parameters are computed as follows:
C = sigma^-2 (K_zz + K_zx @ K_xz)^-1
optimal Gaussian covariance: K_zz @ C @ K_zz
optimal Gaussian location: sigma^-2 K_zz @ C @ K_zx @ Y
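As a minimal, hedged sketch (not taken from the package manual; the kernel is reached through the re-exported tfp module, and all shapes and values are illustrative):

kernel <- tfp$math$psd_kernels$ExponentiatedQuadratic()
inducing <- tf$constant(matrix(seq(-1, 1, length.out = 5)), dtype = tf$float32)  # 5 inducing index points in R^1
index_points <- tf$constant(matrix(c(-0.5, 0, 0.5)), dtype = tf$float32)         # 3 predictive index points
vgp <- tfd_variational_gaussian_process(
  kernel = kernel,
  index_points = index_points,
  inducing_index_points = inducing,
  variational_inducing_observations_loc = tf$zeros(5L),   # the variational m
  variational_inducing_observations_scale = tf$eye(5L),   # the variational S
  observation_noise_variance = 0.1)
vgp %>% tfd_sample(2)   # 2 draws of function values at the 3 index points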
Value
a distribution instance.
References
- Titsias, M. "Variational Model Selection for Sparse Gaussian Process Regression", 2009.
- Hensman, J., Lawrence, N. "Gaussian Processes for Big Data", 2013.
- Rasmussen, C., Williams, C. "Gaussian Processes for Machine Learning", 2006.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
Vector Deterministic Distribution
Description
The VectorDeterministic distribution is parameterized by a batch point loc in R^k. The distribution is supported at this point only, and corresponds to a random variable that is constant, equal to loc.
Usage
tfd_vector_deterministic(
loc,
atol = NULL,
rtol = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "VectorDeterministic"
)
Arguments
loc |
Numeric Tensor of shape [B1, ..., Bb, k], with b >= 0, k >= 0 The point (or batch of points) on which this distribution is supported. |
atol |
Non-negative Tensor of same dtype as loc and broadcastable shape. The absolute tolerance for comparing closeness to loc. Default is 0. |
rtol |
Non-negative Tensor of same dtype as loc and broadcastable shape. The relative tolerance for comparing closeness to loc. Default is 0. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
See Degenerate rv.
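A minimal sketch: every draw reproduces loc, and the probability mass sits entirely on that point.

d <- tfd_vector_deterministic(loc = c(0, 1, 2))
d %>% tfd_sample(2)          # each row equals loc
d %>% tfd_prob(c(0, 1, 2))   # 1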
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
VectorDiffeomixture distribution
Description
A vector diffeomixture (VDM) is a distribution parameterized by a convex
combination of K
component loc
vectors, loc[k], k = 0,...,K-1
, and K
scale
matrices scale[k], k = 0,..., K-1
. It approximates the following
compound distribution
p(x) = int p(x | z) p(z) dz, where z is in the K-simplex, and
p(x | z) := p(x | loc=sum_k z[k] loc[k], scale=sum_k z[k] scale[k])
Usage
tfd_vector_diffeomixture(
mix_loc,
temperature,
distribution,
loc = NULL,
scale = NULL,
quadrature_size = 8,
quadrature_fn = tfp$distributions$quadrature_scheme_softmaxnormal_quantiles,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "VectorDiffeomixture"
)
Arguments
mix_loc |
|
temperature |
|
distribution |
|
loc |
Length-K list of float-type Tensors. The k-th element represents the shift used for the k-th affine transformation. If the k-th item is NULL, loc is implicitly 0. |
scale |
Length-K list of LinearOperators. Each should be positive-definite; the k-th element represents the scale used for the k-th affine transformation. |
quadrature_size |
|
quadrature_fn |
Function taking normal_loc, normal_scale, quadrature_size, and validate_args, and returning tuple(grid, probs) representing the SoftmaxNormal grid and corresponding normalized weight. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
The integral int p(x | z) p(z) dz
is approximated with a quadrature scheme
adapted to the mixture density p(z)
. The N
quadrature points z_{N, n}
and weights w_{N, n}
(which are non-negative and sum to 1) are chosen such that
q_N(x) := sum_{n=1}^N w_{n, N} p(x | z_{N, n}) --> p(x)
as N --> infinity.
Since q_N(x)
is in fact a mixture (of N
points), we may sample from
q_N
exactly. It is important to note that the VDM is defined as q_N
above, and not p(x)
. Therefore, sampling and pdf may be implemented as
exact (up to floating point error) methods.
A common choice for the conditional p(x | z)
is a multivariate Normal.
The implemented marginal p(z)
is the SoftmaxNormal
, which is a
K-1
dimensional Normal transformed by a SoftmaxCentered
bijector, making
it a density on the K
-simplex. That is,
Z = SoftmaxCentered(X)
, X = Normal(mix_loc / temperature, 1 / temperature)
The default quadrature scheme chooses z_{N, n}
as N
midpoints of
the quantiles of p(z)
(generalized quantiles if K > 2).
See Dillon and Langmore (2018) for more details.
About Vector
distributions in TensorFlow.
The VectorDiffeomixture
is a non-standard distribution that has properties
particularly useful in variational Bayesian methods.
Conditioned on a draw from the SoftmaxNormal, X|z
is a vector whose
components are linear combinations of affine transformations, thus is itself
an affine transformation.
Note: The marginals X_1|v, ..., X_d|v
are not generally identical to some
parameterization of distribution
. This is due to the fact that the sum of draws from distribution is not generally itself the same distribution.
About Diffeomixture
s and reparameterization.
The VectorDiffeomixture
is designed to be reparameterized, i.e., its
parameters are only used to transform samples from a distribution which has no
trainable parameters. This property is important because backprop stops at
sources of stochasticity. That is, as long as the parameters are used after
the underlying source of stochasticity, the computed gradient is accurate.
Reparametrization means that we can use gradient-descent (via backprop) to
optimize Monte-Carlo objectives. Such objectives are a finite-sample
approximation of an expectation and arise throughout scientific computing.
WARNING: If you backprop through a VectorDiffeomixture sample and the "base"
distribution is both: not FULLY_REPARAMETERIZED
and a function of trainable
variables, then the gradient is not guaranteed correct!
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
The vectorization of the Exponential distribution on R^k
Description
The vector exponential distribution is defined over a subset of R^k
, and
parameterized by a (batch of) length-k
loc
vector and a (batch of) k x k
scale
matrix: covariance = scale @ scale.T
, where @
denotes
matrix-multiplication.
Usage
tfd_vector_exponential_diag(
loc = NULL,
scale_diag = NULL,
scale_identity_multiplier = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "VectorExponentialDiag"
)
Arguments
loc |
Floating-point Tensor. If this is set to NULL, loc is implicitly 0. When specified, may have shape [B1, ..., Bb, k] where b >= 0 and k is the event size. |
scale_diag |
Non-zero, floating-point Tensor representing a diagonal matrix added to scale. May have shape [B1, ..., Bb, k], b >= 0, and characterizes b-batches of k x k diagonal matrices added to scale. |
scale_identity_multiplier |
Non-zero, floating-point Tensor representing a scaled-identity-matrix added to scale. May have shape [B1, ..., Bb], b >= 0, and characterizes b-batches of scaled k x k identity matrices added to scale. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) is defined over the image of the
scale
matrix + loc
, applied to the positive half-space:
Supp = {loc + scale @ x : x in R^k, x_1 > 0, ..., x_k > 0}
. On this set,
pdf(y; loc, scale) = exp(-||x||_1) / Z, for y in Supp
x = inv(scale) @ (y - loc)
Z = |det(scale)|
where:
- loc is a vector in R^k,
- scale is a linear operator in R^{k x k}, cov = scale @ scale.T,
- Z denotes the normalization constant, and
- ||x||_1 denotes the l1 norm of x, sum_i |x_i|.
The VectorExponential distribution is a member of the location-scale family, i.e., it can be constructed as,
X = (X_1, ..., X_k), each X_i ~ Exponential(rate=1)
Y = (Y_1, ..., Y_k) = scale @ X + loc
About VectorExponential
and Vector
distributions in TensorFlow.
The VectorExponential
is a non-standard distribution that has useful
properties.
The marginals Y_1, ..., Y_k
are not Exponential random variables, due to
the fact that the sum of Exponential random variables is not Exponential.
Instead, Y
is a vector whose components are linear combinations of
Exponential random variables. Thus, Y
lives in the vector space generated
by vectors
of Exponential distributions. This allows the user to decide the
mean and covariance (by setting loc
and scale
), while preserving some
properties of the Exponential distribution. In particular, the tails of Y_i
will be (up to polynomial factors) exponentially decaying.
To see this last statement, note that the pdf of Y_i
is the convolution of
the pdf of k
independent Exponential random variables. One can then show by
induction that distributions with exponential (up to polynomial factors) tails
are closed under convolution.
The batch_shape is the broadcast shape between the loc and scale arguments.
The event_shape is given by the last dimension of the matrix implied by scale. The last dimension of loc (if provided) must broadcast with this.
Recall that covariance = scale @ scale.T.
Additional leading dimensions (if any) will index batches.
If both scale_diag and scale_identity_multiplier are NULL, then scale is the Identity matrix.
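A small sketch (values are illustrative): since each X_i ~ Exponential(rate = 1) has mean 1, the mean of Y = scale @ X + loc is loc + scale @ ones(k).

d <- tfd_vector_exponential_diag(loc = c(-1, 1), scale_diag = c(1, 2))
d %>% tfd_mean()     # c(-1, 1) + c(1, 2) = c(0, 3)
d %>% tfd_sample(3)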
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
The vectorization of the Exponential distribution on R^k
Description
The vector exponential distribution is defined over a subset of R^k
, and
parameterized by a (batch of) length-k
loc
vector and a (batch of) k x k
scale
matrix: covariance = scale @ scale.T
, where @
denotes
matrix-multiplication.
Usage
tfd_vector_exponential_linear_operator(
loc = NULL,
scale = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "VectorExponentialLinearOperator"
)
Arguments
loc |
Floating point tensor; the means of the distribution(s). |
scale |
Instance of LinearOperator with same dtype as loc and shape [B1, ..., Bb, k, k]. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) is
pdf(y; loc, scale) = exp(-||x||_1) / Z, for y in S(loc, scale)
x = inv(scale) @ (y - loc)
Z = |det(scale)|
where:
- loc is a vector in R^k,
- scale is a linear operator in R^{k x k}, cov = scale @ scale.T,
- S = {loc + scale @ x : x in R^k, x_1 > 0, ..., x_k > 0} is an image of the positive half-space,
- ||x||_1 denotes the l1 norm of x, sum_i |x_i|,
- Z denotes the normalization constant.
The VectorExponential distribution is a member of the location-scale family, i.e., it can be constructed as,
X = (X_1, ..., X_k), each X_i ~ Exponential(rate=1)
Y = (Y_1, ..., Y_k) = scale @ X + loc
About VectorExponential
and Vector
distributions in TensorFlow.
The VectorExponential
is a non-standard distribution that has useful
properties.
The marginals Y_1, ..., Y_k
are not Exponential random variables, due to
the fact that the sum of Exponential random variables is not Exponential.
Instead, Y
is a vector whose components are linear combinations of
Exponential random variables. Thus, Y
lives in the vector space generated
by vectors
of Exponential distributions. This allows the user to decide the
mean and covariance (by setting loc
and scale
), while preserving some
properties of the Exponential distribution. In particular, the tails of Y_i
will be (up to polynomial factors) exponentially decaying.
To see this last statement, note that the pdf of Y_i
is the convolution of
the pdf of k
independent Exponential random variables. One can then show by
induction that distributions with exponential (up to polynomial factors) tails
are closed under convolution.
The batch_shape is the broadcast shape between the loc and scale arguments.
The event_shape is given by the last dimension of the matrix implied by scale. The last dimension of loc (if provided) must broadcast with this.
Recall that covariance = scale @ scale.T.
Additional leading dimensions (if any) will index batches.
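An illustrative sketch with a lower-triangular scale supplied as a LinearOperator (the matrix values are arbitrary):

scale <- tf$linalg$LinearOperatorLowerTriangular(
  tf$constant(matrix(c(1, 0, 0.5, 2), nrow = 2, byrow = TRUE), dtype = tf$float32))
d <- tfd_vector_exponential_linear_operator(loc = c(0, 0), scale = scale)
d %>% tfd_mean()     # loc + scale @ ones(2) = c(1, 2.5)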
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
The vectorization of the Laplace distribution on R^k
Description
The vector laplace distribution is defined over R^k
, and parameterized by
a (batch of) length-k loc vector (the means) and a (batch of) k x k
scale matrix: covariance = 2 * scale @ scale.T
, where @ denotes
matrix-multiplication.
Usage
tfd_vector_laplace_diag(
loc = NULL,
scale_diag = NULL,
scale_identity_multiplier = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "VectorLaplaceDiag"
)
Arguments
loc |
Floating-point Tensor. If this is set to NULL, loc is implicitly 0. When specified, may have shape [B1, ..., Bb, k] where b >= 0 and k is the event size. |
scale_diag |
Non-zero, floating-point Tensor representing a diagonal matrix added to scale. May have shape [B1, ..., Bb, k], b >= 0, and characterizes b-batches of k x k diagonal matrices added to scale. |
scale_identity_multiplier |
Non-zero, floating-point Tensor representing a scaled-identity-matrix added to scale. May have shape [B1, ..., Bb], b >= 0, and characterizes b-batches of scaled k x k identity matrices added to scale. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) is,
pdf(x; loc, scale) = exp(-||y||_1) / Z
y = inv(scale) @ (x - loc)
Z = 2**k |det(scale)|
where:
- loc is a vector in R^k,
- scale is a linear operator in R^{k x k}, cov = scale @ scale.T,
- Z denotes the normalization constant, and
- ||y||_1 denotes the l1 norm of y, sum_i |y_i|.
A (non-batch) scale matrix is:
scale = diag(scale_diag + scale_identity_multiplier * ones(k))
where:
- scale_diag.shape = [k], and
- scale_identity_multiplier.shape = [].
Additional leading dimensions (if any) will index batches. If both scale_diag and scale_identity_multiplier are NULL, then scale is the Identity matrix.
About VectorLaplace and Vector distributions in TensorFlow
The VectorLaplace is a non-standard distribution that has useful properties. The marginals Y_1, ..., Y_k are not Laplace random variables, due to the fact that the sum of Laplace random variables is not Laplace. Instead, Y is a vector whose components are linear combinations of Laplace random variables. Thus, Y lives in the vector space generated by vectors of Laplace distributions. This allows the user to decide the mean and covariance (by setting loc and scale), while preserving some properties of the Laplace distribution. In particular, the tails of Y_i will be (up to polynomial factors) exponentially decaying. To see this last statement, note that the pdf of Y_i is the convolution of the pdf of k independent Laplace random variables. One can then show by induction that distributions with exponential (up to polynomial factors) tails are closed under convolution.
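A brief sketch: since Var(Laplace(0, 1)) = 2, the component standard deviations come out as sqrt(2) * scale_diag.

d <- tfd_vector_laplace_diag(loc = c(1, -1), scale_diag = c(1, 2))
d %>% tfd_stddev()   # sqrt(2) * c(1, 2)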
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
The vectorization of the Laplace distribution on R^k
Description
The vector laplace distribution is defined over R^k
, and parameterized by
a (batch of) length-k loc vector (the means) and a (batch of) k x k
scale matrix: covariance = 2 * scale @ scale.T
, where @
denotes
matrix-multiplication.
Usage
tfd_vector_laplace_linear_operator(
loc = NULL,
scale = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "VectorLaplaceLinearOperator"
)
Arguments
loc |
Floating-point Tensor. If this is set to NULL, loc is implicitly 0. When specified, may have shape [B1, ..., Bb, k] where b >= 0 and k is the event size. |
scale |
Instance of LinearOperator with same dtype as loc and shape [B1, ..., Bb, k, k]. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) is,
pdf(x; loc, scale) = exp(-||y||_1) / Z
y = inv(scale) @ (x - loc)
Z = 2**k |det(scale)|
where:
- loc is a vector in R^k,
- scale is a linear operator in R^{k x k}, cov = scale @ scale.T,
- Z denotes the normalization constant, and
- ||y||_1 denotes the l1 norm of y, sum_i |y_i|.
The VectorLaplace distribution is a member of the location-scale family, i.e., it can be constructed as,
X = (X_1, ..., X_k), each X_i ~ Laplace(loc=0, scale=1)
Y = (Y_1, ..., Y_k) = scale @ X + loc
About VectorLaplace and Vector distributions in TensorFlow
The VectorLaplace is a non-standard distribution that has useful properties. The marginals Y_1, ..., Y_k are not Laplace random variables, due to the fact that the sum of Laplace random variables is not Laplace. Instead, Y is a vector whose components are linear combinations of Laplace random variables. Thus, Y lives in the vector space generated by vectors of Laplace distributions. This allows the user to decide the mean and covariance (by setting loc and scale), while preserving some properties of the Laplace distribution. In particular, the tails of Y_i will be (up to polynomial factors) exponentially decaying. To see this last statement, note that the pdf of Y_i is the convolution of the pdf of k independent Laplace random variables. One can then show by induction that distributions with exponential (up to polynomial factors) tails are closed under convolution.
The batch_shape is the broadcast shape between the loc and scale arguments.
The event_shape is given by the last dimension of the matrix implied by scale. The last dimension of loc (if provided) must broadcast with this.
Recall that covariance = 2 * scale @ scale.T.
Additional leading dimensions (if any) will index batches.
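An illustrative sketch with a diagonal LinearOperator; the covariance works out to 2 * scale @ scale.T:

scale <- tf$linalg$LinearOperatorDiag(c(1, 2))
d <- tfd_vector_laplace_linear_operator(loc = c(0, 0), scale = scale)
d %>% tfd_covariance()   # diag(c(2, 8))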
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
The (diagonal) SinhArcsinh transformation of a distribution on R^k
Description
This distribution models a random vector Y = (Y1,...,Yk)
, making use of
a SinhArcsinh transformation (which has adjustable tailweight and skew),
a rescaling, and a shift.
The SinhArcsinh transformation of the Normal is described in great depth in
Sinh-arcsinh distributions.
Here we use a slightly different parameterization, in terms of tailweight
and skewness. Additionally we allow for distributions other than Normal,
and control over scale as well as a "shift" parameter loc.
Usage
tfd_vector_sinh_arcsinh_diag(
loc = NULL,
scale_diag = NULL,
scale_identity_multiplier = NULL,
skewness = NULL,
tailweight = NULL,
distribution = NULL,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "VectorSinhArcsinhDiag"
)
Arguments
loc |
Floating-point Tensor. If this is set to NULL, loc is implicitly 0. When specified, may have shape [B1, ..., Bb, k] where b >= 0 and k is the event size. |
scale_diag |
Non-zero, floating-point Tensor representing a diagonal matrix added to scale. May have shape [B1, ..., Bb, k], b >= 0, and characterizes b-batches of k x k diagonal matrices added to scale. |
scale_identity_multiplier |
Non-zero, floating-point Tensor representing a scaled-identity-matrix added to scale. May have shape [B1, ..., Bb], b >= 0, and characterizes b-batches of scaled k x k identity matrices added to scale. |
skewness |
Skewness parameter. floating-point Tensor with shape broadcastable with event_shape. |
tailweight |
Tailweight parameter. floating-point Tensor with shape broadcastable with event_shape. |
distribution |
|
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
Given iid random vector Z = (Z1,...,Zk)
, we define the VectorSinhArcsinhDiag
transformation of Z
, Y
, parameterized by
(loc, scale, skewness, tailweight)
, via the relation (with @
denoting matrix multiplication):
Y := loc + scale @ F(Z) * (2 / F_0(2))
F(Z) := Sinh( (Arcsinh(Z) + skewness) * tailweight )
F_0(Z) := Sinh( Arcsinh(Z) * tailweight )
This distribution is similar to the location-scale transformation
L(Z) := loc + scale @ Z
in the following ways:
- If skewness = 0 and tailweight = 1 (the defaults), F(Z) = Z, and then Y = L(Z) exactly.
- loc is used in both to shift the result by a constant factor.
- The multiplication of scale by 2 / F_0(2) ensures that if skewness = 0 then P[Y - loc <= 2 * scale] = P[L(Z) - loc <= 2 * scale]. Thus it can be said that the weights in the tails of Y and L(Z) beyond loc + 2 * scale are the same.
This distribution is different than loc + scale @ Z due to the reshaping done by F:
- Positive (negative) skewness leads to positive (negative) skew.
  - Positive skew means the mode of F(Z) is "tilted" to the right.
  - Positive skew means positive values of F(Z) become more likely, and negative values become less likely.
- Larger (smaller) tailweight leads to fatter (thinner) tails.
  - Fatter tails mean larger values of |F(Z)| become more likely.
  - tailweight < 1 leads to a distribution that is "flat" around Y = loc, and a very steep drop-off in the tails.
  - tailweight > 1 leads to a distribution more peaked at the mode with heavier tails.
To see the argument about the tails, note that for |Z| >> 1 and |Z| >> (|skewness| * tailweight)**tailweight, we have Y approx 0.5 Z**tailweight e**(sign(Z) skewness * tailweight).
To see the argument regarding multiplying scale by 2 / F_0(2):
P[(Y - loc) / scale <= 2] = P[F(Z) * (2 / F_0(2)) <= 2]
                          = P[F(Z) <= F_0(2)]
                          = P[Z <= 2]  (if F = F_0).
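A brief sketch (parameter values arbitrary): positive skewness tilts mass to the right, and tailweight > 1 fattens the tails.

d <- tfd_vector_sinh_arcsinh_diag(
  loc = c(0, 0), scale_diag = c(1, 1),
  skewness = 0.5, tailweight = 1.5)
d %>% tfd_sample(2)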
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
The von Mises distribution over angles
Description
The von Mises distribution is a univariate directional distribution.
Like the Normal distribution, it is a maximum entropy distribution.
The samples of this distribution are angles, measured in radians.
They are 2 pi-periodic: x = 0 and x = 2 pi are equivalent.
This means that the density is also 2 pi-periodic.
The generated samples, however, are guaranteed to be in the [-pi, pi) range.
When concentration = 0, this distribution becomes a Uniform distribution on the [-pi, pi) domain.
Usage
tfd_von_mises(
loc,
concentration,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "VonMises"
)
Arguments
loc |
Floating point tensor, the circular means of the distribution(s). |
concentration |
Floating point tensor, the level of concentration of the distribution(s) around loc. Must take non-negative values. concentration = 0 defines a Uniform distribution, while concentration = +inf indicates a Deterministic distribution at loc. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
The von Mises distribution is a special case of the von Mises-Fisher distribution for n = 2. However, TFP's VonMisesFisher implementation represents the samples and location as (x, y) points on a circle, while VonMises represents them as scalar angles.
Mathematical details
The probability density function (pdf) of this distribution is,
pdf(x; loc, concentration) = exp(concentration cos(x - loc)) / Z
Z = 2 * pi * I_0(concentration)
where:
- I_0(concentration) is the modified Bessel function of order zero;
- loc is the circular mean of the distribution, a scalar. It can take arbitrary values, but it is 2 pi-periodic: loc and loc + 2 pi result in the same distribution.
- concentration >= 0 is the concentration parameter. When concentration = 0, this distribution becomes a Uniform distribution on [-pi, pi).
The parameters loc and concentration must be shaped in a way that supports broadcasting (e.g. loc + concentration is a valid operation).
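For instance (a minimal sketch; loc and concentration broadcast against each other):

d <- tfd_von_mises(loc = c(0, pi / 2), concentration = c(1, 2))
d %>% tfd_sample(3)           # angles in [-pi, pi)
d %>% tfd_prob(c(0, pi / 2))  # density at each circular mean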
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
The von Mises-Fisher distribution over unit vectors on S^{n-1}
Description
The von Mises-Fisher distribution is a directional distribution over vectors
on the unit hypersphere S^{n-1}
embedded in n dimensions (R^n)
.
Usage
tfd_von_mises_fisher(
mean_direction,
concentration,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "VonMisesFisher"
)
Arguments
mean_direction |
Floating-point Tensor with shape [B1, ..., Bn, D]. A unit vector indicating the mode of the distribution, or the unit-normalized direction of the mean. |
concentration |
Floating-point Tensor having batch shape [B1, ..., Bn] broadcastable with mean_direction. The level of concentration of samples around the mean_direction. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical details
The probability density function (pdf) is,
pdf(x; mu, kappa) = C(kappa) exp(kappa * mu^T x)
C(kappa) = (2 pi)^{-n/2} kappa^{n/2-1} / I_{n/2-1}(kappa)
with I_v(z) the modified Bessel function of the first kind of order v,
where:
- mean_direction = mu; a unit vector in R^k,
- concentration = kappa; scalar real >= 0, the concentration of samples around mean_direction, where 0 pertains to the uniform distribution on the hypersphere, and inf indicates a delta function at mean_direction.
NOTE: Currently only n in {2, 3, 4, 5} is supported. For n = 5 some numerical instability can occur for low concentrations (< 0.01).
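A minimal sketch on the circle (n = 2); mean_direction must have unit norm:

d <- tfd_von_mises_fisher(mean_direction = c(0, 1), concentration = 1)
s <- d %>% tfd_sample(4)
tf$norm(s, axis = -1L)   # each sample lies on the unit circle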
Value
a distribution instance.
See Also
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
The Weibull distribution with 'concentration' and 'scale' parameters.
Description
The probability density function (pdf) of this distribution is,
pdf(x; lambda, k) = k / lambda * (x / lambda) ** (k - 1) * exp(-(x / lambda) ** k)
where concentration = k and scale = lambda.
The cumulative distribution function (cdf) of this distribution is,
cdf(x; lambda, k) = 1 - exp(-(x / lambda) ** k)
The Weibull distribution includes the Exponential and Rayleigh distributions as special cases:
Exponential(rate) = Weibull(concentration=1., scale=1. / rate)
Rayleigh(scale) = Weibull(concentration=2., scale=sqrt(2.) * scale)
Usage
tfd_weibull(
concentration,
scale,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Weibull"
)
Arguments
concentration |
Positive Float-type Tensor, the concentration parameter of the distribution; must contain only positive values. |
scale |
Positive Float-type Tensor, the scale parameter of the distribution; must contain only positive values. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Value
a distribution instance.
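Examples
A minimal sketch (not part of the upstream reference; it assumes a working 'TensorFlow Probability' installation):
library(tfprobability)
d <- tfd_weibull(concentration = 1.5, scale = 2)
d %>% tfd_sample(3L)
d %>% tfd_cdf(1)    # P(X <= 1)
d %>% tfd_mean()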
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
The matrix Wishart distribution on positive definite matrices
Description
This distribution is defined by a scalar number of degrees of freedom df and a symmetric positive definite scale matrix, supplied either directly (scale) or via its lower triangular Cholesky factor (scale_tril).
Usage
tfd_wishart(
df,
scale = NULL,
scale_tril = NULL,
input_output_cholesky = FALSE,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "Wishart"
)
Arguments
df |
float or double tensor, the degrees of freedom of the distribution(s). df must be greater than or equal to k, the dimension of the scale matrix. |
scale |
float or double Tensor. The symmetric positive definite scale matrix of the distribution. Exactly one of scale and scale_tril must be passed. |
scale_tril |
float or double Tensor. The Cholesky factorization of the symmetric positive definite scale matrix of the distribution. Exactly one of scale and scale_tril must be passed. |
input_output_cholesky |
Logical. If TRUE, functions whose input or output have the semantics of samples assume inputs are in Cholesky form and return outputs in Cholesky form. In particular, if this flag is TRUE, input to log_prob is presumed of Cholesky form and output from sample, mean, and mode are of Cholesky form. Setting this argument to TRUE is purely a computational optimization and does not change the underlying distribution; for instance, mean returns the Cholesky of the mean, not the mean of Cholesky factors. The variance and stddev methods are unaffected by this flag. Default value: FALSE (i.e., input/output does not have Cholesky semantics). |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) is,
pdf(X; df, scale) = det(X)**(0.5 (df-k-1)) exp(-0.5 tr[inv(scale) X]) / Z
Z = 2**(0.5 df k) |det(scale)|**(0.5 df) Gamma_k(0.5 df)
where:
- df >= k denotes the degrees of freedom,
- scale is a symmetric, positive definite, k x k matrix,
- Z is the normalizing constant, and
- Gamma_k is the multivariate Gamma function.
Value
a distribution instance.
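Examples
A minimal sketch (not part of the upstream reference; it assumes a working 'TensorFlow Probability' installation); diag(2) supplies a 2 x 2 identity scale matrix:
library(tfprobability)
d <- tfd_wishart(df = 3, scale = diag(2))
s <- d %>% tfd_sample()   # one random 2 x 2 positive definite matrix
d %>% tfd_log_prob(s)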
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_zipf()
The matrix Wishart distribution on positive definite matrices
Description
This distribution is defined by a scalar number of degrees of freedom df and an instance of LinearOperator, which provides matrix-free access to the symmetric positive definite operator that defines the scale matrix.
Usage
tfd_wishart_linear_operator(
df,
scale,
input_output_cholesky = FALSE,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "WishartLinearOperator"
)
Arguments
df |
float or double tensor, the degrees of freedom of the distribution(s). df must be greater than or equal to k, the dimension of the scale operator. |
scale |
float or double instance of LinearOperator, representing the symmetric positive definite scale. |
input_output_cholesky |
Logical. If TRUE, functions whose input or output have the semantics of samples assume inputs are in Cholesky form and return outputs in Cholesky form. In particular, if this flag is TRUE, input to log_prob is presumed of Cholesky form and output from sample, mean, and mode are of Cholesky form. Setting this argument to TRUE is purely a computational optimization and does not change the underlying distribution; for instance, mean returns the Cholesky of the mean, not the mean of Cholesky factors. The variance and stddev methods are unaffected by this flag. Default value: FALSE (i.e., input/output does not have Cholesky semantics). |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) is,
pdf(X; df, scale) = det(X)**(0.5 (df-k-1)) exp(-0.5 tr[inv(scale) X]) / Z
Z = 2**(0.5 df k) |det(scale)|**(0.5 df) Gamma_k(0.5 df)
where:
- df >= k denotes the degrees of freedom,
- scale is a symmetric, positive definite, k x k matrix,
- Z is the normalizing constant, and
- Gamma_k is the multivariate Gamma function.
Value
a distribution instance.
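Examples
A minimal sketch (not part of the upstream reference; it assumes a working 'TensorFlow Probability' installation); tf$linalg$LinearOperatorFullMatrix wraps an ordinary symmetric positive definite matrix as a LinearOperator:
library(tfprobability)
library(tensorflow)
scale_op <- tf$linalg$LinearOperatorFullMatrix(
  matrix(c(2, 0.5, 0.5, 1), nrow = 2),
  is_self_adjoint = TRUE,
  is_positive_definite = TRUE
)
d <- tfd_wishart_linear_operator(df = 3, scale = scale_op)
d %>% tfd_mean()   # equals df * scale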
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()
The matrix Wishart distribution parameterized with Cholesky factors.
Description
This distribution is defined by a scalar degrees of freedom df and a scale matrix, expressed as a lower triangular Cholesky factor.
Usage
tfd_wishart_tri_l(
df,
scale_tril,
input_output_cholesky = FALSE,
validate_args = FALSE,
allow_nan_stats = TRUE,
name = "WishartTriL"
)
Arguments
df |
float or double tensor, the degrees of freedom of the distribution(s). df must be greater than or equal to k, the dimension of the scale matrix. |
scale_tril |
float or double Tensor; the lower triangular Cholesky factor of the symmetric positive definite scale matrix of the distribution. |
input_output_cholesky |
Logical. If TRUE, functions whose input or output have the semantics of samples assume inputs are in Cholesky form and return outputs in Cholesky form. In particular, if this flag is TRUE, input to log_prob is presumed of Cholesky form and output from sample, mean, and mode are of Cholesky form. Setting this argument to TRUE is purely a computational optimization and does not change the underlying distribution; for instance, mean returns the Cholesky of the mean, not the mean of Cholesky factors. The variance and stddev methods are unaffected by this flag. Default value: FALSE (i.e., input/output does not have Cholesky semantics). |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability density function (pdf) is,
pdf(X; df, scale) = det(X)**(0.5 (df-k-1)) exp(-0.5 tr[inv(scale) X]) / Z
Z = 2**(0.5 df k) |det(scale)|**(0.5 df) Gamma_k(0.5 df)
where:
- df >= k denotes the degrees of freedom,
- scale is a symmetric, positive definite, k x k matrix,
- Z is the normalizing constant, and
- Gamma_k is the multivariate Gamma function.
Value
a distribution instance.
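Examples
A minimal sketch (not part of the upstream reference; it assumes a working 'TensorFlow Probability' installation). Note that R's chol() returns the upper triangular factor, so it is transposed to obtain the lower triangular scale_tril:
library(tfprobability)
sigma <- matrix(c(2, 0.5, 0.5, 1), nrow = 2)   # SPD scale matrix
d <- tfd_wishart_tri_l(df = 3, scale_tril = t(chol(sigma)))
d %>% tfd_sample(2L)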
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart(), tfd_zipf()
Zipf distribution
Description
The Zipf distribution is parameterized by a power parameter.
Usage
tfd_zipf(
power,
dtype = tf$int32,
interpolate_nondiscrete = TRUE,
sample_maximum_iterations = 100,
validate_args = FALSE,
allow_nan_stats = FALSE,
name = "Zipf"
)
Arguments
power |
Float like Tensor representing the power parameter. Must be strictly greater than 1. |
dtype |
The dtype of Tensor returned by sample. Default value: tf$int32. |
interpolate_nondiscrete |
Logical. When FALSE, log_prob returns -inf (and prob returns 0) for non-integer inputs. When TRUE, log_prob evaluates the continuous function k^(-power) / zeta(power), which matches the Zipf pmf at integer arguments k (note that this function is not itself a normalized probability log-density). Default value: TRUE. |
sample_maximum_iterations |
Maximum number of allowable iterations in sample. When validate_args=TRUE, samples which fail to reach convergence (subject to this cap) are masked out. Default value: 100. |
validate_args |
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE. |
allow_nan_stats |
Logical. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined. Default value: FALSE. |
name |
name prefixed to Ops created by this class. |
Details
Mathematical Details
The probability mass function (pmf) is,
pmf(k; alpha, k >= 0) = (k^(-alpha)) / Z
Z = zeta(alpha)
where power = alpha, Z is the normalization constant, and zeta is the Riemann zeta function.
Note that gradients with respect to the power parameter are not supported in the current implementation.
Value
a distribution instance.
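Examples
A minimal sketch (not part of the upstream reference; it assumes a working 'TensorFlow Probability' installation):
library(tfprobability)
d <- tfd_zipf(power = 2)
d %>% tfd_sample(5L)   # integer-valued draws
d %>% tfd_prob(1)      # P(X = 1) = 1 / zeta(2)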
See Also
For usage examples see e.g. tfd_sample()
, tfd_log_prob()
, tfd_mean()
.
Other distributions:
tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_vector_sinh_arcsinh_diag(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart()
Handle to the tensorflow_probability module
Description
Handle to the tensorflow_probability module
Usage
tfp
Format
An object of class python.builtin.module
(inherits from python.builtin.object
) of length 0.
Value
Module(tensorflow_probability)
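Examples
A small sketch (not part of the upstream reference): the handle gives direct access to the Python module via reticulate, for functionality not wrapped by this package:
library(tfprobability)
tfp$vi$kl_reverse                        # a Python function
d <- tfp$distributions$Normal(loc = 0, scale = 1)
d$sample(3L)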
TensorFlow Probability Version
Description
TensorFlow Probability Version
Usage
tfp_version()
Value
the Python TFP version
The Amari-alpha Csiszar-function in log-space
Description
A Csiszar-function is a member of F = { f:R_+ to R : f convex }
.
Usage
vi_amari_alpha(logu, alpha = 1, self_normalized = FALSE, name = NULL)
Arguments
logu |
|
alpha |
|
self_normalized |
|
name |
name prefixed to Ops created by this function. |
Details
When self_normalized = TRUE, the Amari-alpha Csiszar-function is:
f(u) = { -log(u) + (u - 1),                                     alpha = 0
       { u log(u) - (u - 1),                                    alpha = 1
       { ((u^alpha - 1) - alpha (u - 1)) / (alpha (alpha - 1)), otherwise
When self_normalized = FALSE the (u - 1) terms are omitted.
Warning: when alpha != 0 and/or self_normalized = TRUE this function makes non-log-space calculations and may therefore be numerically unstable for |logu| >> 0.
Value
amari_alpha_of_u float
-like Tensor
of the Csiszar-function evaluated
at u = exp(logu)
.
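Examples
A minimal sketch (not part of the upstream reference; it assumes a working 'TensorFlow Probability' installation), evaluating the Csiszar-function at a few points u = exp(logu):
library(tfprobability)
library(tensorflow)
logu <- tf$constant(c(-1, 0, 1))
vi_amari_alpha(logu, alpha = 0.5, self_normalized = TRUE)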
References
A. Cichocki and S. Amari. "Families of Alpha-Beta-and GammaDivergences: Flexible and Robust Measures of Similarities." Entropy, vol. 12, no. 6, pp. 1532-1568, 2010.
See Also
Other vi-functions:
vi_arithmetic_geometric(), vi_chi_square(), vi_csiszar_vimco(), vi_dual_csiszar_function(), vi_fit_surrogate_posterior(), vi_jeffreys(), vi_jensen_shannon(), vi_kl_forward(), vi_kl_reverse(), vi_log1p_abs(), vi_modified_gan(), vi_monte_carlo_variational_loss(), vi_pearson(), vi_squared_hellinger(), vi_symmetrized_csiszar_function()
The Arithmetic-Geometric Csiszar-function in log-space
Description
A Csiszar-function is a member of F = { f:R_+ to R : f convex }
.
Usage
vi_arithmetic_geometric(logu, self_normalized = FALSE, name = NULL)
Arguments
logu |
|
self_normalized |
|
name |
name prefixed to Ops created by this function. |
Details
When self_normalized = TRUE the Arithmetic-Geometric Csiszar-function is:
f(u) = (1 + u) log( (1 + u) / sqrt(u) ) - (1 + u) log(2)
When self_normalized = FALSE the (1 + u) log(2) term is omitted.
Observe that as an f-Divergence, this Csiszar-function implies:
D_f[p, q] = KL[m, p] + KL[m, q]
m(x) = 0.5 p(x) + 0.5 q(x)
In a sense, this divergence is the "reverse" of the Jensen-Shannon f-Divergence.
This Csiszar-function induces a symmetric f-Divergence, i.e., D_f[p, q] = D_f[q, p].
Warning: when self_normalized = TRUE this function makes non-log-space calculations and may therefore be numerically unstable for |logu| >> 0.
Value
arithmetic_geometric_of_u: float
-like Tensor
of the
Csiszar-function evaluated at u = exp(logu)
.
See Also
Other vi-functions:
vi_amari_alpha(), vi_chi_square(), vi_csiszar_vimco(), vi_dual_csiszar_function(), vi_fit_surrogate_posterior(), vi_jeffreys(), vi_jensen_shannon(), vi_kl_forward(), vi_kl_reverse(), vi_log1p_abs(), vi_modified_gan(), vi_monte_carlo_variational_loss(), vi_pearson(), vi_squared_hellinger(), vi_symmetrized_csiszar_function()
The chi-square Csiszar-function in log-space
Description
A Csiszar-function is a member of F = { f:R_+ to R : f convex }
.
Usage
vi_chi_square(logu, name = NULL)
Arguments
logu |
|
name |
name prefixed to Ops created by this function. |
Details
The Chi-square Csiszar-function is:
f(u) = u**2 - 1
Warning: this function makes non-log-space calculations and may
therefore be numerically unstable for |logu| >> 0
.
Value
chi_square_of_u: float
-like Tensor
of the Csiszar-function
evaluated at u = exp(logu)
.
See Also
Other vi-functions:
vi_amari_alpha(), vi_arithmetic_geometric(), vi_csiszar_vimco(), vi_dual_csiszar_function(), vi_fit_surrogate_posterior(), vi_jeffreys(), vi_jensen_shannon(), vi_kl_forward(), vi_kl_reverse(), vi_log1p_abs(), vi_modified_gan(), vi_monte_carlo_variational_loss(), vi_pearson(), vi_squared_hellinger(), vi_symmetrized_csiszar_function()
Use VIMCO to lower the variance of the gradient of csiszar_function(Avg(logu))
Description
This function generalizes VIMCO (Mnih and Rezende, 2016) to Csiszar f-Divergences.
Usage
vi_csiszar_vimco(
f,
p_log_prob,
q,
num_draws,
num_batch_draws = 1,
seed = NULL,
name = NULL
)
Arguments
f |
function representing a Csiszar-function in log-space. |
p_log_prob |
function representing the natural-log of the probability under distribution p. |
q |
a distribution instance from which the samples are drawn; it must implement sample and log_prob. |
num_draws |
Integer scalar number of draws used to approximate the f-Divergence expectation. |
num_batch_draws |
Integer scalar number of draws used to approximate the f-Divergence expectation. |
seed |
|
name |
String prefixed to Ops created by this function. |
Details
Note: if q.reparameterization_type = tfd.FULLY_REPARAMETERIZED, consider using monte_carlo_csiszar_f_divergence.
The VIMCO loss is:
vimco = f(Avg{logu[i] : i=0,...,m-1})
where,
logu[i] = log( p(x, h[i]) / q(h[i] | x) )
h[i] iid~ q(H | x)
Interestingly, the VIMCO gradient is not the naive gradient of vimco. Rather, it is characterized by:
grad[vimco] - variance_reducing_term
where,
variance_reducing_term = Sum{ grad[log q(h[i] | x)] * (vimco - f(log Avg{h[j;i] : j=0,...,m-1})) : i=0, ..., m-1 }
h[j;i] = u[j] for j != i, and GeometricAverage{ u[k] : k != i } for j == i
(We omitted stop_gradient for brevity. See implementation for more details.)
The Avg{h[j;i] : j} term is a kind of "swap-out average" where the i-th element has been replaced by the leave-i-out Geometric-average.
This implementation prefers numerical precision over efficiency, i.e., O(num_draws * num_batch_draws * prod(batch_shape) * prod(event_shape)). (The constant may be fairly large, perhaps around 12.)
Value
vimco The Csiszar f-Divergence generalized VIMCO objective
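Examples
A toy sketch (not part of the upstream reference; it assumes a working 'TensorFlow Probability' installation, and the target p and surrogate q are illustrative only):
library(tfprobability)
library(tensorflow)
p <- tfd_normal(loc = 1, scale = 0.5)              # toy target
q <- tfd_normal(loc = tf$Variable(0), scale = 1)   # surrogate with a trainable mean
loss <- vi_csiszar_vimco(
  f = vi_kl_reverse,
  p_log_prob = function(h) p %>% tfd_log_prob(h),
  q = q,
  num_draws = 10L
)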
References
Andriy Mnih and Danilo J. Rezende. Variational inference for Monte Carlo objectives. In International Conference on Machine Learning (ICML), 2016.
See Also
Other vi-functions:
vi_amari_alpha(), vi_arithmetic_geometric(), vi_chi_square(), vi_dual_csiszar_function(), vi_fit_surrogate_posterior(), vi_jeffreys(), vi_jensen_shannon(), vi_kl_forward(), vi_kl_reverse(), vi_log1p_abs(), vi_modified_gan(), vi_monte_carlo_variational_loss(), vi_pearson(), vi_squared_hellinger(), vi_symmetrized_csiszar_function()
Calculates the dual Csiszar-function in log-space
Description
A Csiszar-function is a member of F = { f:R_+ to R : f convex }
.
Usage
vi_dual_csiszar_function(logu, csiszar_function, name = NULL)
Arguments
logu |
|
csiszar_function |
function representing a Csiszar-function over log-domain. |
name |
name prefixed to Ops created by this function. |
Details
The Csiszar-dual is defined as:
f^*(u) = u f(1 / u)
where f is some other Csiszar-function.
For example, the dual of kl_reverse is kl_forward, i.e.,
f(u) = -log(u)
f^*(u) = u f(1 / u) = -u log(1 / u) = u log(u)
The dual of the dual is the original function:
f^**(u) = {u f(1/u)}^*(u) = u (1/u) f(1/(1/u)) = f(u)
Warning: this function makes non-log-space calculations and may therefore be numerically unstable for |logu| >> 0.
Value
dual_f_of_u float
-like Tensor
of the result of calculating the dual of
f
at u = exp(logu)
.
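Examples
A minimal sketch (not part of the upstream reference; it assumes a working 'TensorFlow Probability' installation), illustrating that the dual of the reverse KL Csiszar-function is the forward one:
library(tfprobability)
library(tensorflow)
logu <- tf$constant(c(-1, 0, 1))
vi_dual_csiszar_function(logu, vi_kl_reverse)
vi_kl_forward(logu)   # same values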
See Also
Other vi-functions:
vi_amari_alpha(), vi_arithmetic_geometric(), vi_chi_square(), vi_csiszar_vimco(), vi_fit_surrogate_posterior(), vi_jeffreys(), vi_jensen_shannon(), vi_kl_forward(), vi_kl_reverse(), vi_log1p_abs(), vi_modified_gan(), vi_monte_carlo_variational_loss(), vi_pearson(), vi_squared_hellinger(), vi_symmetrized_csiszar_function()
Fit a surrogate posterior to a target (unnormalized) log density
Description
The default behavior constructs and minimizes the negative variational evidence lower bound (ELBO), given by:
q_samples <- surrogate_posterior$sample(num_draws)
elbo_loss <- -tf$reduce_mean(target_log_prob_fn(q_samples) - surrogate_posterior$log_prob(q_samples))
Usage
vi_fit_surrogate_posterior(
target_log_prob_fn,
surrogate_posterior,
optimizer,
num_steps,
convergence_criterion = NULL,
trace_fn = tfp$vi$optimization$`_trace_loss`,
variational_loss_fn = NULL,
discrepancy_fn = tfp$vi$kl_reverse,
sample_size = 1,
importance_sample_size = 1,
trainable_variables = NULL,
jit_compile = NULL,
seed = NULL,
name = "fit_surrogate_posterior"
)
Arguments
target_log_prob_fn |
function that takes a set of |
surrogate_posterior |
A |
optimizer |
Optimizer instance to use. This may be a TF1-style
|
num_steps |
|
convergence_criterion |
Optional instance of
|
trace_fn |
function with signature |
variational_loss_fn |
function with signature |
discrepancy_fn |
A function of Python |
sample_size |
|
importance_sample_size |
An integer number of terms used to define
an importance-weighted divergence. If |
trainable_variables |
Optional list of |
jit_compile |
If |
seed |
integer to seed the random number generator. |
name |
name prefixed to ops created by this function. Default value: 'fit_surrogate_posterior'. |
Details
This corresponds to minimizing the 'reverse' Kullback-Leibler divergence (KL[q||p]) between the variational distribution and the unnormalized target_log_prob_fn, and defines a lower bound on the marginal log likelihood, log p(x) >= -elbo_loss.
More generally, this function supports fitting variational distributions that minimize any Csiszar f-divergence.
Value
results Tensor
or nested structure of Tensor
s, according to
the return type of result_fn
. Each Tensor
has an added leading
dimension of size num_steps
, packing the trajectory of the result
over the course of the optimization.
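Examples
A minimal sketch (not part of the upstream reference; it assumes a working 'TensorFlow Probability' installation). The target, surrogate family, and optimizer settings are illustrative only; tfp$util$TransformedVariable is used here to keep the scale positive:
library(tfprobability)
library(tensorflow)
# Unnormalized target: a Normal(2, 0.5) log-density.
target_log_prob_fn <- function(z) tfd_normal(2, 0.5) %>% tfd_log_prob(z)
# Trainable normal surrogate.
q_loc <- tf$Variable(0)
q_scale <- tfp$util$TransformedVariable(1, tfb_softplus())
surrogate <- tfd_normal(loc = q_loc, scale = q_scale)
losses <- vi_fit_surrogate_posterior(
  target_log_prob_fn,
  surrogate_posterior = surrogate,
  optimizer = tf$optimizers$Adam(learning_rate = 0.1),
  num_steps = 200L
)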
See Also
Other vi-functions:
vi_amari_alpha(), vi_arithmetic_geometric(), vi_chi_square(), vi_csiszar_vimco(), vi_dual_csiszar_function(), vi_jeffreys(), vi_jensen_shannon(), vi_kl_forward(), vi_kl_reverse(), vi_log1p_abs(), vi_modified_gan(), vi_monte_carlo_variational_loss(), vi_pearson(), vi_squared_hellinger(), vi_symmetrized_csiszar_function()
The Jeffreys Csiszar-function in log-space
Description
A Csiszar-function is a member of F = { f:R_+ to R : f convex }
.
Usage
vi_jeffreys(logu, name = NULL)
Arguments
logu |
|
name |
name prefixed to Ops created by this function. |
Details
The Jeffreys Csiszar-function is:
f(u) = 0.5 ( u log(u) - log(u) )
     = 0.5 kl_forward + 0.5 kl_reverse
     = symmetrized_csiszar_function(kl_reverse)
     = symmetrized_csiszar_function(kl_forward)
This Csiszar-function induces a symmetric f-Divergence, i.e., D_f[p, q] = D_f[q, p].
Warning: this function makes non-log-space calculations and may therefore be numerically unstable for |logu| >> 0.
Value
jeffreys_of_u: float
-like Tensor
of the Csiszar-function
evaluated at u = exp(logu)
.
See Also
Other vi-functions:
vi_amari_alpha(), vi_arithmetic_geometric(), vi_chi_square(), vi_csiszar_vimco(), vi_dual_csiszar_function(), vi_fit_surrogate_posterior(), vi_jensen_shannon(), vi_kl_forward(), vi_kl_reverse(), vi_log1p_abs(), vi_modified_gan(), vi_monte_carlo_variational_loss(), vi_pearson(), vi_squared_hellinger(), vi_symmetrized_csiszar_function()
The Jensen-Shannon Csiszar-function in log-space
Description
A Csiszar-function is a member of F = { f:R_+ to R : f convex }
.
Usage
vi_jensen_shannon(logu, self_normalized = FALSE, name = NULL)
Arguments
logu |
|
self_normalized |
|
name |
name prefixed to Ops created by this function. |
Details
When self_normalized = TRUE, the Jensen-Shannon Csiszar-function is:
f(u) = u log(u) - (1 + u) log(1 + u) + (u + 1) log(2)
When self_normalized = FALSE the (u + 1) log(2) term is omitted.
Observe that as an f-Divergence, this Csiszar-function implies:
D_f[p, q] = KL[p, m] + KL[q, m]
m(x) = 0.5 p(x) + 0.5 q(x)
In a sense, this divergence is the "reverse" of the Arithmetic-Geometric f-Divergence.
This Csiszar-function induces a symmetric f-Divergence, i.e., D_f[p, q] = D_f[q, p].
Warning: this function makes non-log-space calculations and may therefore be numerically unstable for |logu| >> 0.
Value
jensen_shannon_of_u, float
-like Tensor
of the Csiszar-function
evaluated at u = exp(logu)
.
References
Lin, J. "Divergence measures based on the Shannon entropy." IEEE Trans. Inf. Th., 37, 145-151, 1991.
See Also
Other vi-functions:
vi_amari_alpha(), vi_arithmetic_geometric(), vi_chi_square(), vi_csiszar_vimco(), vi_dual_csiszar_function(), vi_fit_surrogate_posterior(), vi_jeffreys(), vi_kl_forward(), vi_kl_reverse(), vi_log1p_abs(), vi_modified_gan(), vi_monte_carlo_variational_loss(), vi_pearson(), vi_squared_hellinger(), vi_symmetrized_csiszar_function()
The forward Kullback-Leibler Csiszar-function in log-space
Description
A Csiszar-function is a member of F = { f:R_+ to R : f convex }
.
Usage
vi_kl_forward(logu, self_normalized = FALSE, name = NULL)
Arguments
logu |
|
self_normalized |
|
name |
name prefixed to Ops created by this function. |
Details
When self_normalized = TRUE, the KL-forward Csiszar-function is f(u) = u log(u) - (u - 1).
When self_normalized = FALSE the (u - 1) term is omitted.
Observe that as an f-Divergence, this Csiszar-function implies: D_f[p, q] = KL[p, q]
The KL is "forward" because in maximum likelihood we think of minimizing q as in KL[p, q].
Warning: when self_normalized = TRUE this function makes non-log-space calculations and may therefore be numerically unstable for |logu| >> 0.
Value
kl_forward_of_u: float
-like Tensor
of the Csiszar-function evaluated at
u = exp(logu)
.
See Also
Other vi-functions:
vi_amari_alpha(), vi_arithmetic_geometric(), vi_chi_square(), vi_csiszar_vimco(), vi_dual_csiszar_function(), vi_fit_surrogate_posterior(), vi_jeffreys(), vi_jensen_shannon(), vi_kl_reverse(), vi_log1p_abs(), vi_modified_gan(), vi_monte_carlo_variational_loss(), vi_pearson(), vi_squared_hellinger(), vi_symmetrized_csiszar_function()
The reverse Kullback-Leibler Csiszar-function in log-space
Description
A Csiszar-function is a member of F = { f:R_+ to R : f convex }
.
Usage
vi_kl_reverse(logu, self_normalized = FALSE, name = NULL)
Arguments
logu |
|
self_normalized |
|
name |
name prefixed to Ops created by this function. |
Details
When self_normalized = TRUE, the KL-reverse Csiszar-function is f(u) = -log(u) + (u - 1).
When self_normalized = FALSE the (u - 1) term is omitted.
Observe that as an f-Divergence, this Csiszar-function implies: D_f[p, q] = KL[q, p]
The KL is "reverse" because in maximum likelihood we think of minimizing q as in KL[p, q].
Warning: when self_normalized = TRUE this function makes non-log-space calculations and may therefore be numerically unstable for |logu| >> 0.
Value
kl_reverse_of_u float
-like Tensor
of the Csiszar-function evaluated at
u = exp(logu)
.
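Examples
A minimal sketch (not part of the upstream reference; it assumes a working 'TensorFlow Probability' installation):
library(tfprobability)
library(tensorflow)
logu <- tf$constant(c(-1, 0, 1))
vi_kl_reverse(logu)                          # -logu
vi_kl_reverse(logu, self_normalized = TRUE)  # -logu + (exp(logu) - 1)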
See Also
Other vi-functions:
vi_amari_alpha(), vi_arithmetic_geometric(), vi_chi_square(), vi_csiszar_vimco(), vi_dual_csiszar_function(), vi_fit_surrogate_posterior(), vi_jeffreys(), vi_jensen_shannon(), vi_kl_forward(), vi_log1p_abs(), vi_modified_gan(), vi_monte_carlo_variational_loss(), vi_pearson(), vi_squared_hellinger(), vi_symmetrized_csiszar_function()
The log1p-abs Csiszar-function in log-space
Description
A Csiszar-function is a member of F = { f:R_+ to R : f convex }
.
Usage
vi_log1p_abs(logu, name = NULL)
Arguments
logu |
|
name |
name prefixed to Ops created by this function. |
Details
The Log1p-Abs Csiszar-function is:
f(u) = u**(sign(u-1)) - 1
This function is so-named because it was invented from the following recipe. Choose a convex function g such that g(0) = 0 and solve for f:
log(1 + f(u)) = g(log(u))
<=> f(u) = exp(g(log(u))) - 1
That is, the graph is identically g when the y-axis is log1p-domain and the x-axis is log-domain.
Warning: this function makes non-log-space calculations and may
therefore be numerically unstable for |logu| >> 0
.
Value
log1p_abs_of_u: float
-like Tensor
of the Csiszar-function
evaluated at u = exp(logu)
.
See Also
Other vi-functions:
vi_amari_alpha(), vi_arithmetic_geometric(), vi_chi_square(), vi_csiszar_vimco(), vi_dual_csiszar_function(), vi_fit_surrogate_posterior(), vi_jeffreys(), vi_jensen_shannon(), vi_kl_forward(), vi_kl_reverse(), vi_modified_gan(), vi_monte_carlo_variational_loss(), vi_pearson(), vi_squared_hellinger(), vi_symmetrized_csiszar_function()
The Modified-GAN Csiszar-function in log-space
Description
A Csiszar-function is a member of F = { f:R_+ to R : f convex }
.
Usage
vi_modified_gan(logu, self_normalized = FALSE, name = NULL)
Arguments
logu |
|
self_normalized |
|
name |
name prefixed to Ops created by this function. |
Details
When self_normalized = TRUE the modified-GAN (Generative Adversarial Network) Csiszar-function is:
f(u) = log(1 + u) - log(u) + 0.5 (u - 1)
When self_normalized = FALSE the 0.5 (u - 1) term is omitted.
The unmodified GAN Csiszar-function is identical to Jensen-Shannon (with self_normalized = FALSE).
Warning: this function makes non-log-space calculations and may therefore be numerically unstable for |logu| >> 0.
Value
modified_gan_of_u, float
-like Tensor
of the Csiszar-function
evaluated at u = exp(logu)
.
See Also
Other vi-functions:
vi_amari_alpha(), vi_arithmetic_geometric(), vi_chi_square(), vi_csiszar_vimco(), vi_dual_csiszar_function(), vi_fit_surrogate_posterior(), vi_jeffreys(), vi_jensen_shannon(), vi_kl_forward(), vi_kl_reverse(), vi_log1p_abs(), vi_monte_carlo_variational_loss(), vi_pearson(), vi_squared_hellinger(), vi_symmetrized_csiszar_function()
Monte-Carlo approximation of an f-Divergence variational loss
Description
Variational losses measure the divergence between an unnormalized target
distribution p
(provided via target_log_prob_fn
) and a surrogate
distribution q
(provided as surrogate_posterior
). When the
target distribution is an unnormalized posterior from conditioning a model on
data, minimizing the loss with respect to the parameters of
surrogate_posterior
performs approximate posterior inference.
Usage
vi_monte_carlo_variational_loss(
target_log_prob_fn,
surrogate_posterior,
sample_size = 1L,
importance_sample_size = 1L,
discrepancy_fn = vi_kl_reverse,
use_reparametrization = NULL,
seed = NULL,
name = NULL
)
Arguments
target_log_prob_fn |
function that takes a set of |
surrogate_posterior |
A |
sample_size |
|
importance_sample_size |
integer number of terms used to define an importance-weighted divergence. If importance_sample_size > 1, then the surrogate_posterior is optimized to function as an importance-sampling proposal distribution. In this case it often makes sense to use importance sampling to approximate posterior expectations (see tfp.vi.fit_surrogate_posterior for an example). Default value: 1. |
discrepancy_fn |
function representing a Csiszar |
use_reparametrization |
|
seed |
|
name |
name prefixed to Ops created by this function. |
Details
This function defines divergences of the form
E_q[discrepancy_fn(log p(z) - log q(z))]
, sometimes known as
f-divergences.
In the special case discrepancy_fn(logu) == -logu (the default vi_kl_reverse), this is the reverse Kullback-Leibler divergence KL[q||p], whose negation applied to an unnormalized p is the widely-used evidence lower bound (ELBO). Other cases of interest available under tfp$vi include the forward KL[p||q] (given by vi_kl_forward(logu) == exp(logu) * logu), total variation distance, Amari alpha-divergences, and more.
Csiszar f-divergences
A Csiszar function f is a convex function from R^+ (the positive reals) to R. The Csiszar f-Divergence is given by:
D_f[p(X), q(X)] := E_{q(X)}[ f( p(X) / q(X) ) ]
               ~= m**-1 sum_j^m f( p(x_j) / q(x_j) ), where x_j ~iid q(X)
For example, f = lambda u: -log(u)
recovers KL[q||p]
, while f = lambda u: u * log(u)
recovers the forward KL[p||q]
. These and other functions are available in tfp$vi
.
Tricks: Reparameterization and Score-Gradient
When q is "reparameterized", i.e., a diffeomorphic transformation of a
parameterless distribution (e.g., Normal(Y; m, s) <=> Y = sX + m, X ~ Normal(0,1)
),
we can swap gradient and expectation, i.e.,
grad[Avg{ s_i : i=1...n }] = Avg{ grad[s_i] : i=1...n }
where S_n=Avg{s_i}
and s_i = f(x_i), x_i ~iid q(X)
.
However, if q is not reparameterized, TensorFlow's gradient will be incorrect since the chain-rule stops at samples of unreparameterized distributions. In this circumstance using the Score-Gradient trick results in an unbiased gradient, i.e.,
grad[ E_q[f(X)] ] = grad[ int dx q(x) f(x) ]
                  = int dx grad[ q(x) f(x) ]
                  = int dx [ q'(x) f(x) + q(x) f'(x) ]
                  = int dx q(x) [ q'(x) / q(x) f(x) + f'(x) ]
                  = int dx q(x) grad[ f(x) q(x) / stop_grad[q(x)] ]
                  = E_q[ grad[ f(x) q(x) / stop_grad[q(x)] ] ]
When q is fully reparameterized (q.reparameterization_type == tfd.FULLY_REPARAMETERIZED), it is usually preferable to set use_reparametrization = TRUE.
Example Application: The Csiszar f-Divergence is a useful framework for variational inference. I.e., observe that,
f(p(x)) = f( E_{q(Z | x)}[ p(x, Z) / q(Z | x) ] )
       <= E_{q(Z | x)}[ f( p(x, Z) / q(Z | x) ) ]
       := D_f[p(x, Z), q(Z | x)]
The inequality follows from the fact that the "perspective" of f, i.e., (s, t) |-> t f(s / t), is convex in (s, t) when s/t in domain(f) and t is a real. Since the above framework includes the popular Evidence Lower Bound (ELBO) as a special case, i.e., f(u) = -log(u), we call this framework "Evidence Divergence Bound Optimization" (EDBO).
Value
monte_carlo_variational_loss float
-like Tensor
Monte Carlo
approximation of the Csiszar f-Divergence.
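Examples
A minimal sketch (not part of the upstream reference; it assumes a working 'TensorFlow Probability' installation, and the target and surrogate are illustrative only):
library(tfprobability)
library(tensorflow)
target <- tfd_normal(loc = 1, scale = 0.5)
surrogate <- tfd_normal(loc = tf$Variable(0), scale = 1)
loss <- vi_monte_carlo_variational_loss(
  target_log_prob_fn = function(z) target %>% tfd_log_prob(z),
  surrogate_posterior = surrogate,
  sample_size = 100L,
  discrepancy_fn = vi_kl_reverse
)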
References
Ali, Syed Mumtaz, and Samuel D. Silvey. "A general class of coefficients of divergence of one distribution from another." Journal of the Royal Statistical Society: Series B (Methodological) 28.1 (1966): 131-142.
See Also
Other vi-functions:
vi_amari_alpha(), vi_arithmetic_geometric(), vi_chi_square(), vi_csiszar_vimco(), vi_dual_csiszar_function(), vi_fit_surrogate_posterior(), vi_jeffreys(), vi_jensen_shannon(), vi_kl_forward(), vi_kl_reverse(), vi_log1p_abs(), vi_modified_gan(), vi_pearson(), vi_squared_hellinger(), vi_symmetrized_csiszar_function()
The Pearson Csiszar-function in log-space
Description
A Csiszar-function is a member of F = { f:R_+ to R : f convex }
.
Usage
vi_pearson(logu, name = NULL)
Arguments
logu |
|
name |
name prefixed to Ops created by this function. |
Details
The Pearson Csiszar-function is:
f(u) = (u - 1)**2
Warning: this function makes non-log-space calculations and may therefore be
numerically unstable for |logu| >> 0
.
Value
pearson_of_u: float
-like Tensor
of the Csiszar-function
evaluated at u = exp(logu)
.
See Also
Other vi-functions:
vi_amari_alpha(), vi_arithmetic_geometric(), vi_chi_square(), vi_csiszar_vimco(), vi_dual_csiszar_function(), vi_fit_surrogate_posterior(), vi_jeffreys(), vi_jensen_shannon(), vi_kl_forward(), vi_kl_reverse(), vi_log1p_abs(), vi_modified_gan(), vi_monte_carlo_variational_loss(), vi_squared_hellinger(), vi_symmetrized_csiszar_function()
The Squared-Hellinger Csiszar-function in log-space
Description
A Csiszar-function is a member of F = { f:R_+ to R : f convex }
.
Usage
vi_squared_hellinger(logu, name = NULL)
Arguments
logu |
|
name |
name prefixed to Ops created by this function. |
Details
The Squared-Hellinger Csiszar-function is:
f(u) = (sqrt(u) - 1)**2
This Csiszar-function induces a symmetric f-Divergence, i.e.,
D_f[p, q] = D_f[q, p]
.
Warning: this function makes non-log-space calculations and may
therefore be numerically unstable for |logu| >> 0
.
Value
Squared-Hellinger_of_u: float
-like Tensor
of the Csiszar-function
evaluated at u = exp(logu)
.
See Also
Other vi-functions:
vi_amari_alpha(), vi_arithmetic_geometric(), vi_chi_square(), vi_csiszar_vimco(), vi_dual_csiszar_function(), vi_fit_surrogate_posterior(), vi_jeffreys(), vi_jensen_shannon(), vi_kl_forward(), vi_kl_reverse(), vi_log1p_abs(), vi_modified_gan(), vi_monte_carlo_variational_loss(), vi_pearson(), vi_symmetrized_csiszar_function()
Symmetrizes a Csiszar-function in log-space
Description
A Csiszar-function is a member of F = { f:R_+ to R : f convex }
.
Usage
vi_symmetrized_csiszar_function(logu, csiszar_function, name = NULL)
Arguments
logu |
|
csiszar_function |
function representing a Csiszar-function over log-domain. |
name |
name prefixed to Ops created by this function. |
Details
The symmetrized Csiszar-function is defined as:
f_g(u) = 0.5 g(u) + 0.5 u g(1 / u)
where g is some other Csiszar-function.
We say the function is "symmetrized" because:
D_{f_g}[p, q] = D_{f_g}[q, p]
for all p << >> q (i.e., support(p) = support(q)).
There exist alternatives for symmetrizing a Csiszar-function. For example,
f_g(u) = max(f(u), f^*(u)),
where f^* is the dual Csiszar-function, also implies a symmetric f-Divergence.
Example: When either of the following functions is symmetrized, we obtain the Jensen-Shannon Csiszar-function, i.e.,
g(u) = -log(u) - (1 + u) log((1 + u) / 2) + u - 1
h(u) = log(4) + 2 u log(u / (1 + u))
implies,
f_g(u) = f_h(u) = u log(u) - (1 + u) log((1 + u) / 2) = jensen_shannon(log(u)).
Warning: this function makes non-log-space calculations and may therefore be numerically unstable for |logu| >> 0.
Value
symmetrized_g_of_u: float
-like Tensor
of the result of applying the
symmetrization of g
evaluated at u = exp(logu)
.
See Also
Other vi-functions:
vi_amari_alpha(), vi_arithmetic_geometric(), vi_chi_square(), vi_csiszar_vimco(), vi_dual_csiszar_function(), vi_fit_surrogate_posterior(), vi_jeffreys(), vi_jensen_shannon(), vi_kl_forward(), vi_kl_reverse(), vi_log1p_abs(), vi_modified_gan(), vi_monte_carlo_variational_loss(), vi_pearson(), vi_squared_hellinger()
The T-Power Csiszar-function in log-space
Description
A Csiszar-function is a member of F = { f:R_+ to R : f convex }
.
Usage
vi_t_power(logu, t, self_normalized = FALSE, name = NULL)
Arguments
logu |
|
t |
|
self_normalized |
|
name |
name prefixed to Ops created by this function. |
Details
When self_normalized = TRUE the T-Power Csiszar-function is:
f(u) = s [ u**t - 1 - t (u - 1) ]
s = { -1   0 < t < 1
    { +1   otherwise
When self_normalized = FALSE the - t (u - 1) term is omitted.
This is similar to the amari_alpha Csiszar-function, with the associated divergence being the same up to factors depending only on t.
Warning: when self_normalized = TRUE this function makes non-log-space calculations and may therefore be numerically unstable for |logu| >> 0.
Value
t_power_of_u: float
-like Tensor
of the Csiszar-function
evaluated at u = exp(logu)
.
See Also
Other vi-functions:
vi_total_variation(), vi_triangular()
The Total Variation Csiszar-function in log-space
Description
A Csiszar-function is a member of F = { f:R_+ to R : f convex }
.
Usage
vi_total_variation(logu, name = NULL)
Arguments
logu |
|
name |
name prefixed to Ops created by this function. |
Details
The Total-Variation Csiszar-function is:
f(u) = 0.5 |u - 1|
Warning: this function makes non-log-space calculations and may therefore be
numerically unstable for |logu| >> 0
.
Value
total_variation_of_u: float
-like Tensor
of the Csiszar-function
evaluated at u = exp(logu)
.
See Also
Other vi-functions:
vi_t_power(), vi_triangular()
The Triangular Csiszar-function in log-space
Description
The Triangular Csiszar-function is:
Usage
vi_triangular(logu, name = NULL)
Arguments
logu |
|
name |
name prefixed to Ops created by this function. |
Details
f(u) = (u - 1)**2 / (1 + u)
Warning: this function makes non-log-space calculations and may
therefore be numerically unstable for |logu| >> 0
.
Value
triangular_of_u: float
-like Tensor
of the Csiszar-function
evaluated at u = exp(logu)
.
See Also
Other vi-functions:
vi_t_power(), vi_total_variation()