torch.nn.init
-
torch.nn.init.calculate_gain(nonlinearity, param=None)
[source] -
Return the recommended gain value for the given nonlinearity function. The values are as follows:
nonlinearity
gain
Linear / Identity
Conv{1,2,3}D
Sigmoid
Tanh
ReLU
Leaky Relu
SELU
- Parameters
-
-
nonlinearity – the non-linear function (
nn.functional
name) - param – optional parameter for the non-linear function
-
nonlinearity – the non-linear function (
Examples
>>> gain = nn.init.calculate_gain('leaky_relu', 0.2) # leaky_relu with negative_slope=0.2
-
torch.nn.init.uniform_(tensor, a=0.0, b=1.0)
[source] -
Fills the input Tensor with values drawn from the uniform distribution .
- Parameters
-
-
tensor – an n-dimensional
torch.Tensor
- a – the lower bound of the uniform distribution
- b – the upper bound of the uniform distribution
-
tensor – an n-dimensional
Examples
>>> w = torch.empty(3, 5) >>> nn.init.uniform_(w)
-
torch.nn.init.normal_(tensor, mean=0.0, std=1.0)
[source] -
Fills the input Tensor with values drawn from the normal distribution .
- Parameters
-
-
tensor – an n-dimensional
torch.Tensor
- mean – the mean of the normal distribution
- std – the standard deviation of the normal distribution
-
tensor – an n-dimensional
Examples
>>> w = torch.empty(3, 5) >>> nn.init.normal_(w)
-
torch.nn.init.constant_(tensor, val)
[source] -
Fills the input Tensor with the value .
- Parameters
-
-
tensor – an n-dimensional
torch.Tensor
- val – the value to fill the tensor with
-
tensor – an n-dimensional
Examples
>>> w = torch.empty(3, 5) >>> nn.init.constant_(w, 0.3)
-
torch.nn.init.ones_(tensor)
[source] -
Fills the input Tensor with the scalar value
1
.- Parameters
-
tensor – an n-dimensional
torch.Tensor
Examples
>>> w = torch.empty(3, 5) >>> nn.init.ones_(w)
-
torch.nn.init.zeros_(tensor)
[source] -
Fills the input Tensor with the scalar value
0
.- Parameters
-
tensor – an n-dimensional
torch.Tensor
Examples
>>> w = torch.empty(3, 5) >>> nn.init.zeros_(w)
-
torch.nn.init.eye_(tensor)
[source] -
Fills the 2-dimensional input
Tensor
with the identity matrix. Preserves the identity of the inputs inLinear
layers, where as many inputs are preserved as possible.- Parameters
-
tensor – a 2-dimensional
torch.Tensor
Examples
>>> w = torch.empty(3, 5) >>> nn.init.eye_(w)
-
torch.nn.init.dirac_(tensor, groups=1)
[source] -
Fills the {3, 4, 5}-dimensional input
Tensor
with the Dirac delta function. Preserves the identity of the inputs inConvolutional
layers, where as many input channels are preserved as possible. In case of groups>1, each group of channels preserves identity- Parameters
-
-
tensor – a {3, 4, 5}-dimensional
torch.Tensor
- groups (optional) – number of groups in the conv layer (default: 1)
-
tensor – a {3, 4, 5}-dimensional
Examples
>>> w = torch.empty(3, 16, 5, 5) >>> nn.init.dirac_(w) >>> w = torch.empty(3, 24, 5, 5) >>> nn.init.dirac_(w, 3)
-
torch.nn.init.xavier_uniform_(tensor, gain=1.0)
[source] -
Fills the input
Tensor
with values according to the method described inUnderstanding the difficulty of training deep feedforward neural networks
- Glorot, X. & Bengio, Y. (2010), using a uniform distribution. The resulting tensor will have values sampled from whereAlso known as Glorot initialization.
- Parameters
-
-
tensor – an n-dimensional
torch.Tensor
- gain – an optional scaling factor
-
tensor – an n-dimensional
Examples
>>> w = torch.empty(3, 5) >>> nn.init.xavier_uniform_(w, gain=nn.init.calculate_gain('relu'))
-
torch.nn.init.xavier_normal_(tensor, gain=1.0)
[source] -
Fills the input
Tensor
with values according to the method described inUnderstanding the difficulty of training deep feedforward neural networks
- Glorot, X. & Bengio, Y. (2010), using a normal distribution. The resulting tensor will have values sampled from whereAlso known as Glorot initialization.
- Parameters
-
-
tensor – an n-dimensional
torch.Tensor
- gain – an optional scaling factor
-
tensor – an n-dimensional
Examples
>>> w = torch.empty(3, 5) >>> nn.init.xavier_normal_(w)
-
torch.nn.init.kaiming_uniform_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')
[source] -
Fills the input
Tensor
with values according to the method described inDelving deep into rectifiers: Surpassing human-level performance on ImageNet classification
- He, K. et al. (2015), using a uniform distribution. The resulting tensor will have values sampled from whereAlso known as He initialization.
- Parameters
-
-
tensor – an n-dimensional
torch.Tensor
-
a – the negative slope of the rectifier used after this layer (only used with
'leaky_relu'
) -
mode – either
'fan_in'
(default) or'fan_out'
. Choosing'fan_in'
preserves the magnitude of the variance of the weights in the forward pass. Choosing'fan_out'
preserves the magnitudes in the backwards pass. -
nonlinearity – the non-linear function (
nn.functional
name), recommended to use only with'relu'
or'leaky_relu'
(default).
-
tensor – an n-dimensional
Examples
>>> w = torch.empty(3, 5) >>> nn.init.kaiming_uniform_(w, mode='fan_in', nonlinearity='relu')
-
torch.nn.init.kaiming_normal_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')
[source] -
Fills the input
Tensor
with values according to the method described inDelving deep into rectifiers: Surpassing human-level performance on ImageNet classification
- He, K. et al. (2015), using a normal distribution. The resulting tensor will have values sampled from whereAlso known as He initialization.
- Parameters
-
-
tensor – an n-dimensional
torch.Tensor
-
a – the negative slope of the rectifier used after this layer (only used with
'leaky_relu'
) -
mode – either
'fan_in'
(default) or'fan_out'
. Choosing'fan_in'
preserves the magnitude of the variance of the weights in the forward pass. Choosing'fan_out'
preserves the magnitudes in the backwards pass. -
nonlinearity – the non-linear function (
nn.functional
name), recommended to use only with'relu'
or'leaky_relu'
(default).
-
tensor – an n-dimensional
Examples
>>> w = torch.empty(3, 5) >>> nn.init.kaiming_normal_(w, mode='fan_out', nonlinearity='relu')
-
torch.nn.init.orthogonal_(tensor, gain=1)
[source] -
Fills the input
Tensor
with a (semi) orthogonal matrix, as described inExact solutions to the nonlinear dynamics of learning in deep linear neural networks
- Saxe, A. et al. (2013). The input tensor must have at least 2 dimensions, and for tensors with more than 2 dimensions the trailing dimensions are flattened.- Parameters
-
-
tensor – an n-dimensional
torch.Tensor
, where - gain – optional scaling factor
-
tensor – an n-dimensional
Examples
>>> w = torch.empty(3, 5) >>> nn.init.orthogonal_(w)
-
torch.nn.init.sparse_(tensor, sparsity, std=0.01)
[source] -
Fills the 2D input
Tensor
as a sparse matrix, where the non-zero elements will be drawn from the normal distribution , as described inDeep learning via Hessian-free optimization
- Martens, J. (2010).- Parameters
-
-
tensor – an n-dimensional
torch.Tensor
- sparsity – The fraction of elements in each column to be set to zero
- std – the standard deviation of the normal distribution used to generate the non-zero values
-
tensor – an n-dimensional
Examples
>>> w = torch.empty(3, 5) >>> nn.init.sparse_(w, sparsity=0.1)
© 2019 Torch Contributors
Licensed under the 3-clause BSD License.
https://pytorch.org/docs/1.8.0/nn.init.html