*Memos:
- My post explains Layer Normalization.
- My post explains BatchNorm1d().
- My post explains BatchNorm2d().
- My post explains BatchNorm3d().
- My post explains requires_grad.
LayerNorm() can get the 1D or more D tensor of zero or more elements computed by Layer Normalization from the 1D or more D tensor of zero or more elements as shown below:
*Memos:
- The 1st argument for initialization is `normalized_shape` (Required-Type: `int`, `tuple` or `list` of `int` or `torch.Size`). *It must be `0 <= x`.
- The 2nd argument for initialization is `eps` (Optional-Default: `1e-05`-Type: `float`).
- The 3rd argument for initialization is `elementwise_affine` (Optional-Default: `True`-Type: `bool`).
- The 4th argument for initialization is `bias` (Optional-Default: `True`-Type: `bool`). *My post explains `bias` argument.
- The 5th argument for initialization is `device` (Optional-Default: `None`-Type: `str`, `int` or `device()`). *Memos:
  - If it's `None`, get_default_device() is used. *My post explains get_default_device() and set_default_device().
  - `device=` can be omitted.
  - My post explains `device` argument.
- The 6th argument for initialization is `dtype` (Optional-Default: `None`-Type: `dtype`). *Memos:
  - If it's `None`, get_default_dtype() is used. *My post explains get_default_dtype() and set_default_dtype().
  - `dtype=` can be omitted.
  - My post explains `dtype` argument.
- The 1st argument is `input` (Required-Type: `tensor` of `float`). *Memos:
  - It must be the 1D or more D tensor of zero or more elements.
  - The number of the elements of the deepest dimension(s) must be the same as `normalized_shape`.
  - Its `device` and `dtype` must be the same as `LayerNorm()`'s.
  - The tensor's `requires_grad` which is `False` by default is set to `True` by `LayerNorm()`.
- `layernorm1.device` and `layernorm1.dtype` don't work.
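Before the full demo, here is a minimal sketch of what Layer Normalization computes, assuming the standard formula y = (x - mean) / sqrt(var + eps) * weight + bias with the mean and the biased variance taken over the normalized dimensions; `x`, `mean` and `var` are just illustrative local names:

import torch

x = torch.tensor([8., -3., 0., 1., 5., -2.])
mean = x.mean()              # Mean over the normalized dimension.
var = x.var(unbiased=False)  # Biased variance (divides by N, not N-1).
(x - mean) / torch.sqrt(var + 1e-05)  # weight=1 and bias=0 right after initialization.
# tensor([ 1.6830, -1.1651, -0.3884, -0.1295,  0.9062, -0.9062])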
import torch
from torch import nn
tensor1 = torch.tensor([8., -3., 0., 1., 5., -2.])
tensor1.requires_grad
# False
layernorm1 = nn.LayerNorm(normalized_shape=6)
tensor2 = layernorm1(input=tensor1)
tensor2
# tensor([1.6830, -1.1651, -0.3884, -0.1295, 0.9062, -0.9062],
# grad_fn=<NativeLayerNormBackward0>)
tensor2.requires_grad
# True
layernorm1
# LayerNorm((6,), eps=1e-05, elementwise_affine=True)
layernorm1.normalized_shape
# (6,)
layernorm1.eps
# 1e-05
layernorm1.elementwise_affine
# True
layernorm1.bias
# Parameter containing:
# tensor([0., 0., 0., 0., 0., 0.], requires_grad=True)
layernorm1.weight
# Parameter containing:
# tensor([1., 1., 1., 1., 1., 1.], requires_grad=True)
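# A minimal sketch for the 3rd and 4th arguments: elementwise_affine=False
# removes both learnable parameters, while bias=False keeps weight but
# removes bias (assuming a PyTorch version that has the bias argument).
layernorm3 = nn.LayerNorm(normalized_shape=6, elementwise_affine=False)
print(layernorm3.weight, layernorm3.bias)
# None None
layernorm4 = nn.LayerNorm(normalized_shape=6, bias=False)
layernorm4.weight
# Parameter containing:
# tensor([1., 1., 1., 1., 1., 1.], requires_grad=True)
print(layernorm4.bias)
# None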
layernorm2 = nn.LayerNorm(normalized_shape=6)
layernorm2(input=tensor2)
# tensor([1.6830, -1.1651, -0.3884, -0.1295, 0.9062, -0.9062],
# grad_fn=<NativeLayerNormBackward0>)
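# tensor2 already has mean 0 and variance (almost) 1, so normalizing it
# again returns almost the same values.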
layernorm = nn.LayerNorm(normalized_shape=6, eps=1e-05,
elementwise_affine=True, bias=True,
device=None, dtype=None)
layernorm(input=tensor1)
# tensor([1.6830, -1.1651, -0.3884, -0.1295, 0.9062, -0.9062],
# grad_fn=<NativeLayerNormBackward0>)
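# A minimal sketch for the dtype argument (illustrative; assuming the CPU
# default device): the input's dtype must match LayerNorm()'s.
layernorm64 = nn.LayerNorm(normalized_shape=6, dtype=torch.float64)
layernorm64(input=tensor1.double())
# tensor([ 1.6830, -1.1651, -0.3884, -0.1295,  0.9062, -0.9062],
#        dtype=torch.float64, grad_fn=<NativeLayerNormBackward0>)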
my_tensor = torch.tensor([[8., -3., 0.],
[1., 5., -2.]])
layernorm = nn.LayerNorm(normalized_shape=3)
layernorm(input=my_tensor)
# tensor([[1.3641, -1.0051, -0.3590],
# [-0.1162, 1.2787, -1.1625]],
# grad_fn=<NativeLayerNormBackward0>)
layernorm = nn.LayerNorm(normalized_shape=(2, 3))
layernorm(input=my_tensor)
# tensor([[1.6830, -1.1651, -0.3884],
# [-0.1295, 0.9062, -0.9062]],
# grad_fn=<NativeLayerNormBackward0>)
layernorm = nn.LayerNorm(normalized_shape=my_tensor.size())
layernorm(input=my_tensor)
# tensor([[1.6830, -1.1651, -0.3884],
# [-0.1295, 0.9062, -0.9062]],
# grad_fn=<NativeLayerNormBackward0>)
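# normalized_shape=(2, 3) and normalized_shape=my_tensor.size() normalize
# over all 6 elements jointly, reproducing the values of the 1D example
# above, while normalized_shape=3 normalizes each row of 3 elements
# separately.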
my_tensor = torch.tensor([[8.], [-3.], [0.],
[1.], [5.], [-2.]])
layernorm = nn.LayerNorm(normalized_shape=1)
layernorm(input=my_tensor)
# tensor([[0.], [0.], [0.], [0.], [0.], [0.]],
# grad_fn=<NativeLayerNormBackward0>)
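# With normalized_shape=1, each group holds a single element, so x equals
# its own mean, the numerator (x - mean) is 0 and every output is 0; eps
# only keeps the division finite.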
layernorm = nn.LayerNorm(normalized_shape=(6, 1))
layernorm(input=my_tensor)
# tensor([[1.6830], [-1.1651], [-0.3884], [-0.1295], [0.9062], [-0.9062]],
# grad_fn=<NativeLayerNormBackward0>)
layernorm = nn.LayerNorm(normalized_shape=my_tensor.size())
layernorm(input=my_tensor)
# tensor([[1.6830], [-1.1651], [-0.3884], [-0.1295], [0.9062], [-0.9062]],
# grad_fn=<NativeLayerNormBackward0>)
my_tensor = torch.tensor([[[8., -3., 0.],
[1., 5., -2.]]])
layernorm = nn.LayerNorm(normalized_shape=3)
layernorm(input=my_tensor)
# tensor([[[1.3641, -1.0051, -0.3590],
# [-0.1162, 1.2787, -1.1625]]],
# grad_fn=<NativeLayerNormBackward0>)
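Finally, a quick negative check (a minimal sketch; the exact error text can vary by PyTorch version): the number of the elements of the deepest dimension must match `normalized_shape`, otherwise calling `LayerNorm()` fails.

import torch
from torch import nn

my_tensor = torch.tensor([8., -3., 0., 1., 5., -2.])
layernorm = nn.LayerNorm(normalized_shape=4)
# layernorm(input=my_tensor) raises a RuntimeError because the deepest
# dimension has 6 elements while normalized_shape expects 4.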