07-01 Live Session


GAP (Global Average Pooling)

import torch
import torch.nn.functional as F

x = torch.randn(16, 14, 14)            # 16 feature maps of size 14x14

# GAP collapses each feature map to a single value: (N, C, H, W) -> (N, C, 1, 1)
gap = F.adaptive_avg_pool2d(x.unsqueeze(0), (1, 1))

# Global max pooling works the same way
out = F.adaptive_max_pool2d(x.unsqueeze(0), output_size=1)  # (1, 16, 1, 1)

# Calculate the result manually to compare: average each group of 4 pooled channels
out_manual = torch.stack([out[:, i:i+4].mean() for i in range(0, 16, 4)])

# Same grouping via reshape: (1, 16, 1, 1) -> (1, 4, 4), then mean over the last dim
out = out.view(out.size(0), out.size(1)//4, -1)
out = out.mean(2)

print(torch.allclose(out_manual, out))  # True
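
In fact, GAP is nothing more than a mean over the spatial dimensions, which is easy to verify against the gap tensor computed above:

# adaptive_avg_pool2d with output size 1 is just a mean over (H, W)
gap_manual = x.unsqueeze(0).mean(dim=(-2, -1), keepdim=True)
print(torch.allclose(gap, gap_manual))  # True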
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv = nn.Sequential(
            # in: 3 ch, 224x224 (or 128x128 for a 128 input)
            nn.Conv2d(3, 64, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 64, 3, padding=1), nn.LeakyReLU(0.2),
            nn.MaxPool2d(2, 2),
            # 64 ch, 112x112 (or 64x64)
            nn.Conv2d(64, 128, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 128, 3, padding=1), nn.LeakyReLU(0.2),
            nn.MaxPool2d(2, 2),
            # 128 ch, 56x56 (or 32x32)
            nn.Conv2d(128, 256, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(256, 256, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(256, 256, 3, padding=1), nn.LeakyReLU(0.2),
            nn.MaxPool2d(2, 2),
            # 256 ch, 28x28 (or 16x16)
            nn.Conv2d(256, 512, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(512, 512, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(512, 512, 3, padding=1), nn.LeakyReLU(0.2),
            nn.MaxPool2d(2, 2),
            # 512 ch, 14x14 (or 8x8)
            nn.Conv2d(512, 512, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(512, 512, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(512, 512, 3, padding=1), nn.LeakyReLU(0.2),
            nn.MaxPool2d(2, 2)
        )
        # 512 ch, 7x7 (or 4x4)

        self.avg_pool = nn.AvgPool2d(7)  # global average pooling over the 7x7 map
        # 512 ch, 1x1
        self.classifier = nn.Linear(512, 10)
        # The GAP head replaces the classic VGG-style FC stack:
        """
        self.fc1 = nn.Linear(512*2*2, 4096)
        self.fc2 = nn.Linear(4096, 4096)
        self.fc3 = nn.Linear(4096, 10)
        """

    def forward(self, x):
        features = self.conv(x)           # (N, 512, 7, 7) for a 224x224 input
        x = self.avg_pool(features)       # (N, 512, 1, 1)
        x = x.view(features.size(0), -1)  # flatten to (N, 512)
        x = self.classifier(x)            # (N, 10)
        return x, features                # features kept for CAM visualization
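
A quick shape check for the model above (a minimal sketch, assuming a 3x224x224 input as in the layer comments):

net = Net()
dummy = torch.randn(2, 3, 224, 224)
logits, features = net(dummy)
print(logits.shape)    # torch.Size([2, 10])
print(features.shape)  # torch.Size([2, 512, 7, 7])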

CNN History

AlexNet

VGG

Inception

ResNet

After ResNet, the race to build ever-deeper networks subsided.
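
The key idea behind that depth is the identity shortcut: each block learns a residual F(x) and adds the input back in, so gradients can flow through very deep stacks. A minimal sketch of such a block (batch norm omitted; illustrative, not the torchvision implementation):

class BasicBlock(nn.Module):
    # y = relu(F(x) + x), where F is two 3x3 convolutions
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.conv1(x))
        out = self.conv2(out)
        return self.relu(out + x)  # identity shortcut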

EfficientNet

https://hoya012.github.io/blog/EfficientNet-review/

GAN

VAE

RNN

LSTM

Introduced to overcome the limitations of plain RNNs, chiefly the vanishing gradients that make long-range dependencies hard to learn.
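
In code, the visible difference is the gated cell state that an LSTM carries alongside the hidden state; the gates are what mitigate the vanishing gradients of a plain RNN. A minimal comparison sketch (sizes are illustrative):

rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)
lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)

seq = torch.randn(2, 5, 8)            # (batch, time, features)
out_rnn, h_n = rnn(seq)               # plain recurrence: hidden state only
out_lstm, (h_n, c_n) = lstm(seq)      # gated recurrence: hidden + cell state
print(out_rnn.shape, out_lstm.shape)  # torch.Size([2, 5, 16]) for both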

Transfer Learning

Domain Adaptation

Generalization

MLP-Mixer

Q & A

  1. Is there a rule of thumb for deciding, from the data itself, whether transfer learning is worth applying?
    • Inductive transfer: when the domain is the same and only the task differs, it is easy to apply (see the sketch after this list).
    • When the domain itself changes, it is difficult.
  2. You said average pooling gives a one-to-one correspondence going into the FC stage; does that mean no linear layer is used there?
    • Only the final layer is fully connected; GAP extracts one value per channel, and those values map 1:1 onto the inputs of that layer.
  3. In the past a lot of effort went into reducing computation. Should we assume that, trusting modern hardware, the focus has shifted to accuracy rather than cutting computation, or is minimizing computation still a priority?
    • Both at once: models keep getting deeper and lighter.
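
For the easy inductive case in Q1, the usual recipe is to freeze a pretrained backbone and retrain only a new head for the target task. A minimal sketch (assumes torchvision >= 0.13 and its resnet18 weights purely as an example):

import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in model.parameters():
    p.requires_grad = False                     # freeze the pretrained backbone
model.fc = nn.Linear(model.fc.in_features, 10)  # fresh, trainable task head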

http://cs231n.stanford.edu/slides/2021/lecture_4.pdf
http://cs231n.stanford.edu/slides/2021/lecture_5.pdf
http://cs231n.stanford.edu/slides/2021/lecture_6.pdf

Reference

GAP: https://discuss.pytorch.org/t/tensor-global-max-pooling-and-average/38988
CAM with GAP: https://ctkim.tistory.com/117
CNN development history: https://hoya012.github.io/blog/deeplearning-classification-guidebook-1/, https://hoya012.github.io/blog/deeplearning-classification-guidebook-2/, https://hoya012.github.io/blog/deeplearning-classification-guidebook-3/, https://hoya012.github.io/blog/deeplearning-classification-guidebook-4/, http://cs231n.stanford.edu/slides/2021/lecture_5.pdf
