memo: PyTorch | LR & Scheduler

CosAnelWrmRst

Docs: torch.optim.lr_scheduler.CosineAnnealingWarmRestarts

import torch
from torch import nn
from torch.utils.data import TensorDataset, DataLoader

from matplotlib import pyplot as plt
import numpy as np

# Dummy data: 100 samples with 5 features each; targets are just the inputs.
inps = torch.arange(50.).expand(10, -1).reshape(100, 5)
tgts = torch.arange(50.).expand(10, -1).reshape(100, 5)
dataset = TensorDataset(inps, tgts)

loader = DataLoader(dataset, batch_size=1, pin_memory=True)  # 100 batches

model = nn.Sequential(nn.Linear(5,5))

criterion = torch.nn.MSELoss()
optimizer = torch.optim.AdamW(model.parameters())
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(
          optimizer, T_0=6,  # First cycle is 6 epochs
          T_mult=2,  # next cycle will be 2x epochs (int)
          eta_min=1e-5,   # minimum lr
          last_epoch=-1, # set when resuming
          verbose=False) # print lr

eps = np.arange(100)
lrs = []

# scheduler.step() is called once per batch here, so each of the 100
# batches counts as one "epoch" on the plot below.
for idx, batch in enumerate(loader):
    x, y = batch
    y_pred = model(x)
    loss = criterion(y_pred, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    lrs.append(scheduler.get_last_lr())  # record the lr used for this step
    scheduler.step()

plt.plot(eps, lrs)
plt.title("CosineAnnealingWarmRestarts")
plt.show()
  • scheduler.step() also accepts an explicit epoch; scheduler.step(0) will set the lr back to its value at epoch 0 (the maximum).
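
  • The epoch argument can also be fractional, so the lr keeps annealing between epoch boundaries when stepping per batch. A minimal sketch in the spirit of the per-iteration example from the PyTorch docs (the dummy data and loop bounds are only illustrative):

import torch
from torch import nn
from torch.utils.data import TensorDataset, DataLoader
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts

model = nn.Linear(5, 5)
criterion = nn.MSELoss()
optimizer = torch.optim.AdamW(model.parameters())
scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=6, T_mult=2, eta_min=1e-5)

loader = DataLoader(TensorDataset(torch.randn(100, 5), torch.randn(100, 5)),
                    batch_size=10)
iters = len(loader)  # batches per epoch

for epoch in range(12):
    for i, (x, y) in enumerate(loader):
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        # fractional epoch: the lr is updated within the epoch as well
        scheduler.step(epoch + i / iters)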

WarmUp + CosAnelWrmRst

Ref: firstelfin/WarmUpLR

  • The original CosineAnnealingWarmRestarts doesn’t have warmup.
  1. WarmUpLR followed by CosineAnnealingWarmRestarts

    # WarmUpLR is the wrapper from firstelfin/WarmUpLR; **param stands for the
    # usual CosineAnnealingWarmRestarts kwargs (optimizer, T_0, T_mult, ...).
    cosine = CosineAnnealingWarmRestarts(**param)
    warm_up_lr = WarmUpLR(cosine)
    

    The first 9 epochs use WarmUpLR, and the remaining epochs use CosineAnnealingWarmRestarts (see the torch.optim-only sketch below).
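
    An alternative that stays inside torch.optim (my own sketch, not the repo's WarmUpLR implementation) chains LinearLR and CosineAnnealingWarmRestarts with SequentialLR; the warmup length and factors below are assumptions for illustration:

    import torch
    from torch import nn
    from torch.optim.lr_scheduler import (CosineAnnealingWarmRestarts, LinearLR,
                                          SequentialLR)

    model = nn.Linear(5, 5)
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

    # 9 warmup epochs ramping from 10% to 100% of the base lr ...
    warmup = LinearLR(optimizer, start_factor=0.1, end_factor=1.0, total_iters=9)
    # ... then cosine annealing with warm restarts takes over.
    cosine = CosineAnnealingWarmRestarts(optimizer, T_0=6, T_mult=2, eta_min=1e-5)
    scheduler = SequentialLR(optimizer, schedulers=[warmup, cosine], milestones=[9])

    for epoch in range(40):
        # ... one training epoch ...
        scheduler.step()

    SequentialLR hands control from the warmup scheduler to the cosine one at the milestone epoch, which is roughly what the WarmUpLR wrapper above does.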


Ref: A Visual Guide to Learning Rate Schedulers in PyTorch - Medium - Leonie Monigatti (2023-10-30)

Max_lr Decay

  1. qu-gg/pytorch-cosine-annealing-with-decay-and-initial-warmup

    Found by searching GitHub for “CosineAnnealingWarmRestart” (search results); it adds max_lr decay across restarts plus an initial warmup. A rough sketch of the decay idea follows below.
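
    A hand-rolled sketch of the decay idea using the stock scheduler (the decay factor and restart bookkeeping are my own assumptions, not the linked repo's code): each time a restart is about to happen, scale the scheduler's base_lrs so the next cosine cycle peaks lower.

    import torch
    from torch import nn
    from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts

    model = nn.Linear(5, 5)
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
    scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=6, T_mult=2, eta_min=1e-5)

    decay = 0.5                    # hypothetical: halve the peak lr every cycle
    next_restart = scheduler.T_0   # epoch index at which the next cycle begins

    for epoch in range(40):
        # ... one training epoch ...
        if epoch + 1 == next_restart:
            # the upcoming step() triggers a restart; lower the peak lr first
            scheduler.base_lrs = [lr * decay for lr in scheduler.base_lrs]
            next_restart += scheduler.T_i * scheduler.T_mult  # next cycle length
        scheduler.step()

    Plotted the same way as the first example, each successive peak is half the previous one.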
