Distributions
Distribution interfaces
-
class chainerrl.distribution.Distribution
Batch of distributions of data.
-
entropy
Entropy of distributions.
Returns: chainer.Variable
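
As an illustration of what a batched entropy looks like, here is a minimal sketch for categorical distributions in plain Python. The function names are hypothetical and not part of chainerrl, and the library itself returns a chainer.Variable rather than a list.

```python
import math


def categorical_entropy(probs):
    """H(P) = -sum_i p_i * log(p_i), in nats.

    Zero-probability entries are skipped, following the convention
    that 0 * log(0) contributes 0.
    """
    return -sum(p * math.log(p) for p in probs if p > 0)


def batch_entropy(batch):
    """One entropy value per distribution (row) in the batch."""
    return [categorical_entropy(row) for row in batch]


batch = [
    [0.25, 0.25, 0.25, 0.25],  # uniform: maximum entropy for 4 outcomes
    [1.0, 0.0, 0.0, 0.0],      # deterministic: zero entropy
]
entropies = batch_entropy(batch)
```

The result has one entry per row, which mirrors the "batch of distributions" wording of the class description.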
-
kl
Compute the KL divergence D_KL(P || Q), where P is this distribution.
Parameters: distrib (Distribution) – Distribution Q.
Returns: chainer.Variable
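
For categorical distributions, D_KL(P || Q) = sum_i p_i * log(p_i / q_i). The sketch below computes it per batch row in plain Python. The helper names are hypothetical, and it assumes q_i > 0 wherever p_i > 0.

```python
import math


def categorical_kl(p, q):
    """D_KL(P || Q) in nats for one pair of categorical distributions.

    Terms with p_i == 0 are skipped (they contribute 0).
    Assumes q_i > 0 wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def batch_kl(p_batch, q_batch):
    """One KL value per pair of rows."""
    return [categorical_kl(p, q) for p, q in zip(p_batch, q_batch)]


p = [0.5, 0.5]
q = [0.9, 0.1]
kl_pq = categorical_kl(p, q)
kl_qp = categorical_kl(q, p)
```

KL divergence is not symmetric, so the argument order matters: kl_pq and kl_qp above differ.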
-
most_probable
Most probable data points.
Returns: chainer.Variable
-
params
Learnable parameters of this distribution.
Returns: tuple of chainer.Variable
-
Distribution implementations
-
class chainerrl.distribution.SoftmaxDistribution(logits, beta=1.0, min_prob=0.0)
Softmax distribution.
Parameters:
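
The parameter descriptions are not reproduced here, so the following plain-Python sketch rests on two assumptions: that beta multiplies the logits (an inverse temperature), and that min_prob is a per-outcome probability floor applied by shrinking the softmax output toward uniform. The function name is hypothetical, and this is not the library's implementation.

```python
import math


def softmax_probs(logits, beta=1.0, min_prob=0.0):
    """Softmax over beta * logits with an optional probability floor.

    Assumption: beta is an inverse temperature.
    Assumption: min_prob is enforced as p * (1 - n * min_prob) + min_prob,
    which keeps the probabilities summing to 1 (requires n * min_prob <= 1).
    """
    n = len(logits)
    scaled = [beta * x for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return [p * (1.0 - n * min_prob) + min_prob for p in probs]


logits = [1.0, 2.0, 3.0]
plain = softmax_probs(logits)
sharp = softmax_probs(logits, beta=5.0)          # larger beta: more peaked
floored = softmax_probs(logits, min_prob=0.1)    # every outcome >= 0.1
```

Under these assumptions, a larger beta concentrates mass on the largest logit, while a positive min_prob keeps every outcome reachable.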
-
class chainerrl.distribution.MellowmaxDistribution(values, omega=8.0)
Maximum entropy mellowmax distribution.
See: http://arxiv.org/abs/1612.05628
Parameters: values (ndarray or chainer.Variable) – Values to apply mellowmax.
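
To make the idea concrete, here is a minimal plain-Python sketch following the referenced paper: the mellowmax operator is log(mean(exp(omega * x))) / omega, and the maximum entropy mellowmax distribution is a softmax over beta * values, with beta chosen so that the expected value equals mellowmax. The function names are hypothetical, and plain bisection is used here for simplicity, which is not necessarily the root-finder the library uses.

```python
import math


def mellowmax(values, omega=8.0):
    """Mellowmax operator: log(mean(exp(omega * x))) / omega."""
    m = max(values)  # shift by the max for numerical stability
    mean_exp = sum(math.exp(omega * (v - m)) for v in values) / len(values)
    return m + math.log(mean_exp) / omega


def maxent_mellowmax_probs(values, omega=8.0, tol=1e-10):
    """Probabilities proportional to exp(beta * x), where beta >= 0 solves
    sum_i exp(beta * (x_i - mm)) * (x_i - mm) = 0 and mm = mellowmax(values).
    """
    n = len(values)
    mm = mellowmax(values, omega)
    diffs = [v - mm for v in values]
    d_max = max(diffs)
    if d_max <= tol:
        # All values are (nearly) equal: the distribution is uniform.
        return [1.0 / n] * n

    def f(beta):
        # Same sign as the root equation, rescaled to avoid overflow.
        return sum(math.exp(beta * (d - d_max)) * d for d in diffs)

    # f is non-decreasing, f(0) <= 0, and f > 0 for large beta,
    # so expand the bracket and bisect.
    lo, hi = 0.0, 1.0
    while f(hi) < 0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)

    weights = [math.exp(beta * (d - d_max)) for d in diffs]
    total = sum(weights)
    return [w / total for w in weights]


values = [1.0, 2.0, 3.0]
probs = maxent_mellowmax_probs(values, omega=8.0)
expected_value = sum(p * v for p, v in zip(probs, values))
```

A useful sanity check is that expected_value matches mellowmax(values, omega), which lies between the mean and the maximum of the values.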