*Memos:
- My post explains BCE (Binary Cross Entropy) Loss. *My post explains BCELoss().
- My post explains Sigmoid. *My post explains Sigmoid().
- My post explains CrossEntropyLoss().
`BCEWithLogitsLoss()` can get the 0D or more D tensor of the zero or more values (`float`) computed by BCE Loss and Sigmoid from the 0D or more D tensor of zero or more elements as shown below:
*Memos:
- The 1st argument for initialization is `weight` (Optional-Default:`None`-Type:`tensor` of `int`, `float` or `bool`): *Memos:
  - If it's not given, it's treated as `1`.
  - It must be the 0D or more D tensor of zero or more elements.
- There is `reduction` argument for initialization (Optional-Default:`'mean'`-Type:`str`). *`'none'`, `'mean'` or `'sum'` can be selected.
- There is `pos_weight` argument for initialization (Optional-Default:`None`-Type:`tensor` of `int` or `float`): *Memos:
  - If it's not given, it's treated as `1`.
  - It must be the 0D or more D tensor of zero or more elements.
- There are `size_average` and `reduce` arguments for initialization but they are deprecated.
- The 1st argument is `input` (Required-Type:`tensor` of `float`). *It must be the 0D or more D tensor of zero or more elements.
- The 2nd argument is `target` (Required-Type:`tensor` of `float`). *It must be the 0D or more D tensor of zero or more elements.
- `input` and `target` must be the same size, otherwise there is an error.
- The empty 1D or more D `input` and `target` tensors with `reduction='mean'` return `nan`.
- The empty 1D or more D `input` and `target` tensors with `reduction='sum'` return `0.`.
- `BCEWithLogitsLoss()` is the combination of Sigmoid and BCE Loss.
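For reference, the same loss is also available in functional form via `torch.nn.functional.binary_cross_entropy_with_logits`, which takes the same `weight`, `pos_weight` and `reduction` arguments (a minimal sketch):

```python
import torch
import torch.nn.functional as F

tensor1 = torch.tensor([8., -3., 0., 1., 5., -2.])
tensor2 = torch.tensor([-3., 7., 4., -2., -9., 6.])

# Functional counterpart of nn.BCEWithLogitsLoss()
F.binary_cross_entropy_with_logits(input=tensor1, target=tensor2)
# tensor(19.8648)
```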
```python
import torch
from torch import nn

tensor1 = torch.tensor([8., -3., 0., 1., 5., -2.])
tensor2 = torch.tensor([-3., 7., 4., -2., -9., 6.])

# -w*(p*y*log(1/(1+exp(-x))) + (1-y)*log(1-1/(1+exp(-x))))
# -1*(1*(-3)*log(1/(1+exp(-8))) + (1-(-3))*log(1-1/(1+exp(-8))))
# ↓↓↓↓↓↓↓
# 32.0003 + 21.0486 + 0.6931 + 3.3133 + 50.0067 + 12.1269 = 119.1890
# 119.1890 / 6 = 19.8648

bcelogits = nn.BCEWithLogitsLoss()
bcelogits(input=tensor1, target=tensor2)
# tensor(19.8648)

bcelogits
# BCEWithLogitsLoss()

print(bcelogits.weight)
# None

bcelogits.reduction
# 'mean'

bcelogits = nn.BCEWithLogitsLoss(weight=None,
                                 reduction='mean',
                                 pos_weight=None)
bcelogits(input=tensor1, target=tensor2)
# tensor(19.8648)

bcelogits = nn.BCEWithLogitsLoss(reduction='sum')
bcelogits(input=tensor1, target=tensor2)
# tensor(119.1890)

bcelogits = nn.BCEWithLogitsLoss(reduction='none')
bcelogits(input=tensor1, target=tensor2)
# tensor([32.0003, 21.0486, 0.6931, 3.3133, 50.0067, 12.1269])

bcelogits = nn.BCEWithLogitsLoss(weight=torch.tensor([0., 1., 2., 3., 4., 5.]))
bcelogits(input=tensor1, target=tensor2)
# tensor(48.8394)

bcelogits = nn.BCEWithLogitsLoss(
    pos_weight=torch.tensor([0., 1., 2., 3., 4., 5.])
)
bcelogits(input=tensor1, target=tensor2)
# tensor(28.5957)

bcelogits = nn.BCEWithLogitsLoss(weight=torch.tensor(0.))
bcelogits(input=tensor1, target=tensor2)
# tensor(0.)

bcelogits = nn.BCEWithLogitsLoss(pos_weight=torch.tensor(0.))
bcelogits(input=tensor1, target=tensor2)
# tensor(13.8338)

bcelogits = nn.BCEWithLogitsLoss(weight=torch.tensor([0, 1, 2, 3, 4, 5]))
bcelogits(input=tensor1, target=tensor2)
# tensor(48.8394)

bcelogits = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([0, 1, 2, 3, 4, 5]))
bcelogits(input=tensor1, target=tensor2)
# tensor(28.5957)

bcelogits = nn.BCEWithLogitsLoss(weight=torch.tensor(0))
bcelogits(input=tensor1, target=tensor2)
# tensor(0.)

bcelogits = nn.BCEWithLogitsLoss(pos_weight=torch.tensor(0))
bcelogits(input=tensor1, target=tensor2)
# tensor(13.8338)

bcelogits = nn.BCEWithLogitsLoss(
    weight=torch.tensor([True, False, True, False, True, False])
)
bcelogits(input=tensor1, target=tensor2)
# tensor(13.7834)

bcelogits = nn.BCEWithLogitsLoss(weight=torch.tensor([False]))
bcelogits(input=tensor1, target=tensor2)
# tensor(0.)

tensor1 = torch.tensor([[8., -3., 0.], [1., 5., -2.]])
tensor2 = torch.tensor([[-3., 7., 4.], [-2., -9., 6.]])

bcelogits = nn.BCEWithLogitsLoss()
bcelogits(input=tensor1, target=tensor2)
# tensor(19.8648)

tensor1 = torch.tensor([[[8.], [-3.], [0.]], [[1.], [5.], [-2.]]])
tensor2 = torch.tensor([[[-3.], [7.], [4.]], [[-2.], [-9.], [6.]]])

bcelogits = nn.BCEWithLogitsLoss()
bcelogits(input=tensor1, target=tensor2)
# tensor(19.8648)

tensor1 = torch.tensor([])
tensor2 = torch.tensor([])

bcelogits = nn.BCEWithLogitsLoss(reduction='mean')
bcelogits(input=tensor1, target=tensor2)
# tensor(nan)

bcelogits = nn.BCEWithLogitsLoss(reduction='sum')
bcelogits(input=tensor1, target=tensor2)
# tensor(0.)
```
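The `weight` and `pos_weight` results above can be reproduced by hand. A minimal sketch of the underlying formula (variable names are mine): `weight` scales each whole element, while `pos_weight` scales only the positive term `y*log(sigmoid(x))`:

```python
import torch
from torch import nn

x = torch.tensor([8., -3., 0., 1., 5., -2.])  # logits (input)
y = torch.tensor([-3., 7., 4., -2., -9., 6.])  # target
w = torch.tensor([0., 1., 2., 3., 4., 5.])     # weight
p = torch.tensor([0., 1., 2., 3., 4., 5.])     # pos_weight

s = torch.sigmoid(x)

# Elementwise loss: -(y*log(sigmoid(x)) + (1-y)*log(1-sigmoid(x)))
elementwise = -(y * torch.log(s) + (1. - y) * torch.log(1. - s))
print(elementwise.mean())  # ≈ tensor(19.8648), matches reduction='mean'

# weight multiplies each whole element:
print((w * elementwise).mean())  # ≈ tensor(48.8394)

# pos_weight multiplies only the positive term:
pos_weighted = -(p * y * torch.log(s) + (1. - y) * torch.log(1. - s))
print(pos_weighted.mean())  # ≈ tensor(28.5957)
```

Note that PyTorch computes this with the numerically stable log-sum-exp trick internally, so the naive formula above only agrees to a few decimal places.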
`BCEWithLogitsLoss()` and `BCELoss()` with `Sigmoid()` or `torch.sigmoid()` for the `input` argument can get the same results as shown below. *`BCEWithLogitsLoss()` can accept values out of `0<=y<=1` for the `input` and `target` arguments while `BCELoss()` cannot:
```python
import torch
from torch import nn

tensor1 = torch.tensor([0.4, 0.8, 0.6, 0.3, 0.0, 0.5])
tensor2 = torch.tensor([0.2, 0.9, 0.4, 0.1, 0.8, 0.5])

bcelogits = nn.BCEWithLogitsLoss()
bcelogits(input=tensor1, target=tensor2)
# tensor(0.7205)

sigmoid = nn.Sigmoid()
bceloss = nn.BCELoss()
bceloss(input=sigmoid(input=tensor1), target=tensor2)
# tensor(0.7205)

bceloss = nn.BCELoss()
bceloss(input=torch.sigmoid(input=tensor1), target=tensor2)
# tensor(0.7205)
```
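The out-of-range point can also be checked directly; a minimal sketch (the exact error message may differ across PyTorch versions):

```python
import torch
from torch import nn

x = torch.tensor([8., -3., 0.])  # logits, outside [0, 1]
y = torch.tensor([0., 1., 1.])

print(nn.BCEWithLogitsLoss()(input=x, target=y))  # works: logits are unbounded

try:
    nn.BCELoss()(input=x, target=y)  # BCELoss() expects probabilities in [0, 1]
except RuntimeError:
    print('BCELoss() rejected the out-of-range input')
```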