Distributions
Distribution interfaces
class chainerrl.distribution.Distribution
Batch of distributions of data.
entropy
Entropy of distributions.
Returns: chainer.Variable
kl
Compute KL divergence D_KL(P||Q), where P is this distribution.
Parameters: distrib (Distribution) – Distribution Q.
Returns: chainer.Variable
most_probable
Most probable data points.
Returns: chainer.Variable
params
Learnable parameters of this distribution.
Returns: tuple of chainer.Variable
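To make the interface quantities concrete, here is a minimal sketch of what entropy, kl, and most_probable would compute for a single categorical distribution given as a plain list of probabilities. This is not ChainerRL code: the function names mirror the interface for readability only, inputs are Python lists rather than chainer.Variable batches, and kl assumes Q is nonzero wherever P is nonzero.

```python
import math


def entropy(probs):
    """Shannon entropy -sum(p * log p), skipping zero-probability entries."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def kl(p, q):
    """KL divergence D_KL(P||Q); assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def most_probable(probs):
    """Index of the entry with the highest probability."""
    return max(range(len(probs)), key=probs.__getitem__)


# A "batch" here is just a list of distributions (illustrative values).
batch = [[0.5, 0.5], [0.1, 0.7, 0.2]]
batch_entropy = [entropy(p) for p in batch]
batch_mode = [most_probable(p) for p in batch]
```

Note that D_KL(P||Q) is not symmetric, so swapping the receiver and the argument of kl generally changes the result.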
Distribution implementations
class chainerrl.distribution.SoftmaxDistribution(logits, beta=1.0, min_prob=0.0)
Softmax distribution.
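Parameters:
logits (ndarray or chainer.Variable) – Logits for the softmax distribution.
beta (float) – Inverse of the temperature parameter of the softmax distribution.
min_prob (float) – Minimum probability across all labels.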
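As a rough illustration of how the beta and min_prob arguments in the signature could shape the resulting probabilities, the sketch below computes a softmax over beta-scaled logits in plain Python. The helper name softmax_probs is hypothetical, and the min_prob rule shown (rescale, then add a constant floor) is an assumption about one reasonable way to enforce a minimum probability, not a statement of what the library does internally.

```python
import math


def softmax_probs(logits, beta=1.0, min_prob=0.0):
    """Softmax of beta * logits, optionally floored at min_prob per label."""
    n = len(logits)
    scaled = [beta * x for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    if min_prob > 0.0:
        # Assumed floor rule: shrink by (1 - n * min_prob), then add min_prob,
        # so every entry is >= min_prob and the entries still sum to 1.
        probs = [p * (1.0 - min_prob * n) + min_prob for p in probs]
    return probs


soft = softmax_probs([1.0, 2.0, 3.0], beta=0.5)    # flatter distribution
sharp = softmax_probs([1.0, 2.0, 3.0], beta=5.0)   # closer to argmax
floored = softmax_probs([0.0, 10.0], min_prob=0.1)
```

A larger beta (lower temperature) concentrates mass on the largest logit, while beta=0 yields a uniform distribution. The floor rule requires min_prob * n <= 1.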
class chainerrl.distribution.MellowmaxDistribution(values, omega=8.0)
Maximum entropy mellowmax distribution.
See: http://arxiv.org/abs/1612.05628
Parameters: values (ndarray or chainer.Variable) – Values to apply mellowmax.
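The idea behind a maximum entropy mellowmax distribution, as described in the linked paper, can be sketched in plain Python: compute the mellowmax of the values, then find the softmax inverse temperature beta for which the expected value under the softmax equals that mellowmax. The function names below are hypothetical, the inputs are plain lists rather than ndarray or chainer.Variable, and bisection is used here only as a simple stand-in root finder; this is not the library's implementation.

```python
import math


def mellowmax(values, omega=8.0):
    """Mellowmax of a list of floats: log(mean(exp(omega * x))) / omega."""
    m = max(values)  # shift by the max for numerical stability
    mean_exp = sum(math.exp(omega * (v - m)) for v in values) / len(values)
    return m + math.log(mean_exp) / omega


def maxent_mellowmax_probs(values, omega=8.0):
    """Softmax probabilities whose expected value equals mellowmax(values)."""
    n = len(values)
    if max(values) == min(values):
        # All values equal: every beta satisfies the condition, use uniform.
        return [1.0 / n] * n

    mm = mellowmax(values, omega)
    diffs = [v - mm for v in values]

    def f(beta):
        # Root condition: sum_a exp(beta * d_a) * d_a = 0, with d_a = x_a - mm.
        return sum(math.exp(beta * d) * d for d in diffs)

    lo, hi = 0.0, 1.0
    while f(hi) < 0.0:  # f is non-decreasing in beta; expand the bracket
        hi *= 2.0
    for _ in range(200):  # plain bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)

    m = max(values)
    weights = [math.exp(beta * (v - m)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


values = [1.0, 2.0, 3.0]
probs = maxent_mellowmax_probs(values, omega=8.0)
expected_value = sum(p * v for p, v in zip(probs, values))
```

For omega > 0 the mellowmax lies between the mean and the maximum of the values, so a larger omega pushes the resulting distribution toward the greedy choice.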