Loss function: nn.CrossEntropyLoss
nn.CrossEntropyLoss() combines nn.LogSoftmax() and nn.NLLLoss() into one operation, so it can directly replace those two steps in a network; it is intended for multi-class classification. Its parameters:
weight (Tensor, optional): if given, must be a 1-D tensor of length C (the number of classes), holding one weight per class.
reduction (string, optional): specifies how the per-sample losses are reduced to the final output; the default is 'mean'.
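A minimal sketch of its use (the class weights, logits, and labels below are made-up illustrative values):

import torch
import torch.nn as nn

class_weights = torch.tensor([1.0, 2.0, 0.5])   # one weight per class, length C = 3
loss_function = nn.CrossEntropyLoss(weight=class_weights, reduction='mean')

logits = torch.randn(4, 3)                       # raw network outputs; no softmax needed
labels = torch.tensor([0, 2, 1, 0])              # ground-truth class indices
loss = loss_function(logits, labels)

# Equivalent to applying LogSoftmax and then NLLLoss:
log_probs = nn.LogSoftmax(dim=1)(logits)
same_loss = nn.NLLLoss(weight=class_weights, reduction='mean')(log_probs, labels)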
Four optimizers are commonly used in PyTorch: SGD, Momentum, RMSprop, and Adam.
opt_SGD = torch.optim.SGD(net_SGD.parameters(), lr=LR)                            # plain SGD
opt_Momentum = torch.optim.SGD(net_Momentum.parameters(), lr=LR, momentum=0.8)    # SGD with momentum
opt_RMSprop = torch.optim.RMSprop(net_RMSprop.parameters(), lr=LR, alpha=0.9)     # per-parameter adaptive LR
opt_Adam = torch.optim.Adam(net_Adam.parameters(), lr=LR, betas=(0.9, 0.99))      # momentum + adaptive LR
SGD is the plainest optimizer, with no acceleration tricks. Momentum is an improved SGD that adds a momentum term. RMSprop goes further by adapting the learning rate per parameter, and Adam essentially combines Momentum with RMSprop. That said, a more sophisticated optimizer does not always produce better results.
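Because all four optimizers share the same zero_grad/step interface, a common way to compare them is to train a separate copy of the same network with each one and record the losses. A minimal sketch, assuming net_SGD through net_Adam are independent copies of one model and that loss_func and train_loader are already defined:

nets = [net_SGD, net_Momentum, net_RMSprop, net_Adam]
optimizers = [opt_SGD, opt_Momentum, opt_RMSprop, opt_Adam]
loss_histories = [[], [], [], []]
for x, y in train_loader:
    for net, opt, history in zip(nets, optimizers, loss_histories):
        loss = loss_func(net(x), y)        # loss_func: any criterion, e.g. nn.CrossEntropyLoss()
        opt.zero_grad()
        loss.backward()
        opt.step()
        history.append(loss.item())        # record per-step loss for plotting later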
The optimizer used here is optim.Adam.
Each epoch runs one pass of training followed by one pass of validation. For training, first switch the network into training mode with net.train(), then initialize the running loss and record the start time:
net.train()                                   # training mode: enables dropout, etc.
running_loss = 0.0
t1 = time.perf_counter()
for step, data in enumerate(train_loader, start=0):
    images, labels = data
    optimizer.zero_grad()                     # clear gradients from the previous step
    outputs = net(images.to(device))
    loss = loss_function(outputs, labels.to(device))
    loss.backward()
    optimizer.step()

    # accumulate the loss and print a simple progress bar
    running_loss += loss.item()
    rate = (step + 1) / len(train_loader)
    a = "*" * int(rate * 50)
    b = "." * int((1 - rate) * 50)
    print("\rtrain loss: {:^3.0f}%[{}->{}]{:.3f}".format(int(rate * 100), a, b, loss), end="")
print()
print(time.perf_counter() - t1)               # time taken for this epoch
The complete training script assembles these pieces:

net = AlexNet(num_classes=5, init_weights=True)
net.to(device)
loss_function = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters(), lr=0.0002)
save_path = './AlexNet.pth'
best_acc = 0.0
for epoch in range(10):
    # --- training ---
    net.train()
    running_loss = 0.0
    t1 = time.perf_counter()
    for step, data in enumerate(train_loader, start=0):
        images, labels = data
        optimizer.zero_grad()
        outputs = net(images.to(device))
        loss = loss_function(outputs, labels.to(device))
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        rate = (step + 1) / len(train_loader)
        a = "*" * int(rate * 50)
        b = "." * int((1 - rate) * 50)
        print("\rtrain loss: {:^3.0f}%[{}->{}]{:.3f}".format(int(rate * 100), a, b, loss), end="")
    print()
    print(time.perf_counter() - t1)
    # --- validation ---
    net.eval()                                # eval mode: disables dropout
    acc = 0.0
    with torch.no_grad():                     # no gradients needed for validation
        for val_data in validate_loader:
            val_images, val_labels = val_data
            outputs = net(val_images.to(device))
            predict_y = torch.max(outputs, dim=1)[1]          # index of the highest logit
            acc += (predict_y == val_labels.to(device)).sum().item()
        val_accurate = acc / val_num          # val_num: size of the validation set
        if val_accurate > best_acc:           # keep only the best-performing weights
            best_acc = val_accurate
            torch.save(net.state_dict(), save_path)
    print('[epoch %d] train_loss: %.3f  test_accuracy: %.3f' % (epoch + 1, running_loss / (step + 1), val_accurate))
print('Finished Training')
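After training, the weights saved at save_path can be reloaded for inference. A minimal sketch, assuming the same AlexNet definition and device are available (image_tensor is a hypothetical preprocessed input batch):

net = AlexNet(num_classes=5)
net.load_state_dict(torch.load(save_path, map_location=device))
net.to(device)
net.eval()                                    # disable dropout for inference
with torch.no_grad():
    outputs = net(image_tensor.to(device))    # image_tensor: hypothetical input batch
    predict_y = torch.max(outputs, dim=1)[1]  # predicted class indices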
Original post: https://www.cnblogs.com/xmy-0904-lfx/p/14890681.html