NN basics - ML Intro

4 minute read

To do

Gradient Boost

XGBoost

Sources

  • ๊ธฐ๊ณ„ํ•™์Šต (Machine Learning), ์˜ค์ผ์„ (Il-seok Oh)
  • Machine Learning: a Probabilistic Perspective by K. Murphy
  • Deep Learning by Goodfellow, Bengio and Courville
  • Stanford CS231n

Prerequisites

  • Linear Algebra
  • Probability and Statistics
  • Information Theory

๋ฐ์ดํ„ฐ ์˜์—ญ ๊ณต๊ฐ„์€ ๋†’์•„์ง€๋Š”๋ฐ ์–ด๋–ป๊ฒŒ ๋†’์€ ์„ฑ๋Šฅ์„ ๋‹ฌ์„ฑํ•˜๋Š”๊ฐ€?

๊ทธ ์•ˆ์˜ ๋ฐ์ดํ„ฐ๋Š” ์ ์–ด์ง์—๋„ ๋ถˆ๊ตฌํ•˜๊ณ .

๋ฐ์ดํ„ฐ ํฌ์†ŒํŠน์„ฑ ๊ฐ€์ •

Take MNIST as an example: data does not occur in every region of the 28-by-28 pixel space.
It will occur only in a sparse region.
Within the \(2^{784}\) space (counting binarized images), meaningful data arises only in a small, sparse part.
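As a back-of-the-envelope sketch of this sparsity (assuming binarized pixels and the standard 60,000-image MNIST training set):

```python
# Rough illustration of the sparsity assumption on MNIST-like data.
# The binary 28x28 image space has 2**784 points; a training set of
# 60,000 images occupies a vanishingly small fraction of it.
space_size = 2 ** 784          # all possible binary 28x28 images
n_samples = 60_000             # size of the MNIST training set

fraction = n_samples / space_size
print(f"fraction of the space covered: {fraction:.3e}")  # effectively 0
```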

Manifold Assumption

High-dimensional data such as photographs follows rules inherent to it, so much of it clusters into similar groups.
Even when such high-dimensional data is projected down to a lower dimension, those rules should be preserved.
Random noise, by contrast, has no smoothness, so its similarity structure is not preserved in lower dimensions; no matter how much (sparse) data of that kind we have, nothing can be learned from it.

By the curse of dimensionality alone, learning in the original space should not work well; yet the specific regularities within the data preserve similarity, which is why learning succeeds.

Although there is a formal mathematical meaning to the term โ€œmanifold,โ€ in machine learning it tends to be used more loosely to designate a connected set of points that can be approximated well by considering only a small number of degrees of freedom, or dimensions, embedded in a higher-dimensional space. Each dimension corresponds to a local direction of variation. (Deep Learning, By Ian Goodfellow, Yoshua Bengio and Aaron Courville)

Smoothness on a manifold

Smoothness:
In mathematical analysis, the smoothness of a function is a property measured by the number of continuous derivatives it has over some domain. At the very minimum, a function could be considered "smooth" if it is differentiable everywhere (hence continuous). At the other end, it might also possess derivatives of all orders in its domain, in which case it is said to be infinitely differentiable and referred to as a \(C^{\infty}\) function. (Wikipedia)

Manifold Learning (from the lecture ์˜คํ† ์ธ์ฝ”๋”์˜ ๋ชจ๋“  ๊ฒƒ, "Everything About Autoencoders")

Manifold learning starts from the assumption that, given high-dimensional data (e.g., an image of shape (256, 256, 3)), when the samples are scattered in the data space there exists a subspace that fits them well; learning then proceeds under this assumption. The manifold found this way can be used to reduce the dimensionality of the data.

Usage of Manifold Learning

    1. Data compression
      • Noisy image compression
    2. Data visualization
      • t-SNE
    3. Curse of dimensionality (manifold hypothesis)
      • As the dimensionality of the data increases, the size (volume) of the space grows exponentially, so the density of a fixed number of samples becomes sparse very quickly.
      • Consequently, the number of samples needed to analyze the data distribution or fit a model grows exponentially with the dimension. Manifold hypothesis:
      • Natural data in high dimensional spaces concentrates close to lower dimensional manifolds. (The density of high-dimensional data is low, but there exists a lower-dimensional manifold containing the set.)
      • Probability density decreases very rapidly when moving away from the supporting manifold. (The moment we leave this low-dimensional manifold, the density drops sharply.)
    4. Discovering the most important features (a reasonable distance metric; needs disentangling of the underlying explanatory factors)
      • The manifold follows naturally from continuous underlying factors (\(\approx\) intrinsic manifold coordinates)
      • Such continuous factors are part of a meaningful representation.
      • Reasonable distance metric:
      • Two samples that are semantically close are often far apart in the high-dimensional space.
      • Two samples that are close in the high-dimensional space can be semantically very different.
      • Because of the curse of dimensionality, it is hard to find a meaningful way to measure distance in the high-dimensional space.
        manifold_learning_figure4
        - Needs disentangling of the underlying explanatory factors.
      • In general, a learned manifold is entangled, i.e., encoded in the data space in a complicated manner. When a manifold is disentangled, it is more interpretable and easier to apply to downstream tasks.
        disentangled-manifold
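The curse-of-dimensionality point above can be made concrete with one line of arithmetic: the fraction of a unit hypercube's volume covered by a slightly smaller sub-cube collapses as the dimension grows, so a fixed-size sample becomes ever sparser.

```python
# Volume of a centered sub-cube of side 0.9 inside the unit hypercube,
# as a fraction of total volume, for growing dimension d.
for d in (1, 2, 10, 100, 784):
    print(d, 0.9 ** d)
# Even a cube covering 90% of each axis holds almost none of the
# volume once d is large: data density thins out exponentially.
```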

Types of Dimensionality Reduction

  • Linear
    • PCA
    • LDA
    • etc..
  • Non-Linear
    • Autoencoders(AE)
    • t-SNE
    • Isomap
    • LLE(Locally-linear embedding)
    • etc..
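As a minimal sketch of the linear case, PCA can be computed directly with an SVD. The data here is synthetic, generated so that it lies exactly on a 2-D subspace of a 10-D space, which is the idealized version of the manifold assumption:

```python
import numpy as np

# Hypothetical data: 200 points with 2 intrinsic degrees of freedom,
# linearly embedded into 10 dimensions.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))   # intrinsic coordinates
embed = rng.normal(size=(2, 10))     # fixed linear embedding into 10-D
X = latent @ embed                   # observed high-dimensional data

# PCA via SVD of the centered data matrix.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Only the first two singular values are (numerically) nonzero,
# confirming the data occupies a 2-D subspace of the 10-D space.
print(S[:4])
Z = Xc @ Vt[:2].T                    # 2-D coordinates on that subspace
print(Z.shape)                       # (200, 2)
```

Real data is at best *near* a (usually nonlinear) manifold, which is why the non-linear methods above exist; but the recipe "find the subspace, project onto it" is the same idea.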

What is a manifold?

By treating each pixel of an image as one dimension, we can map an image to a single point in a high-dimensional space. If we map the many images in our training dataset into that space, similar images will gather in particular regions. (See here for a good visualization of this idea.) There may then exist a subset of the full space, a subspace, that fits that set of points well; this is what we call a manifold.

A manifold has the following properties.

Natural data in high dimensional spaces concentrates close to lower dimensional manifolds.

Probability density decreases very rapidly when moving away from the supporting manifold.

Because a manifold is a subspace of the original high-dimensional space, its dimensionality is comparatively small, which is what makes dimensionality reduction possible. Conversely, reducing the dimension well means the manifold has been found well. It can also be interpreted as finding a space whose axes are new features that explain the dimensions of the original high-dimensional space well. Let's look at the figure below as an example.

manifold_learning_figure3

The famous MNIST dataset consists of 784-dimensional images. When it is reduced to two dimensions, one axis turns out to control stroke thickness and the other rotation. The two axes of the manifold space correspond to two features, and changing them produces the corresponding change in the image. The manifold space can therefore be interpreted as a space of semantic similarity. What further advantages does this bring?

One way to measure how similar mapped data points are is to measure distance: with Euclidean distance, the nearest points are taken to be the most similar. However, the observation that a point close to me in the high-dimensional space may not actually be similar to me can be explained with the manifold. Consider the figure below.

manifold_learning_figure4

In the high-dimensional space, the distance from B to A1 is smaller than the distance to A2, but in the manifold space A2 is closer to B. In terms of pixel distance A1 and B may be the closer pair, while in terms of semantic similarity A2 and B can be closer. It is an example of a nearby point that we assumed was similar to us but actually is not. What does this look like with real images?
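A toy numeric version of the same observation (the images here are hypothetical synthetic bars, not the figure's golf frames): an image and its slightly shifted twin are semantically the same, yet they are farther apart in pixel space than the image and a noisy, semantically degraded copy.

```python
import numpy as np

# A 28x28 image containing a vertical bar.
img = np.zeros((28, 28))
img[:, 10:13] = 1.0

# Same bar shifted 3 pixels right: semantically near-identical.
shifted = np.roll(img, 3, axis=1)

# Bar in the same place but corrupted with noise: semantically degraded.
rng = np.random.default_rng(0)
noisy = np.clip(img + rng.normal(0, 0.2, img.shape), 0, 1)

d_shift = np.linalg.norm(img - shifted)   # large: the bars do not overlap
d_noise = np.linalg.norm(img - noisy)     # small: pixels mostly agree
print(d_shift, d_noise)
```

Euclidean distance in pixel space ranks the noisy copy as "more similar" than the shifted twin, which is exactly the failure the manifold view explains.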

manifold_learning_figure5

์ž์„ธํžˆ๋ณด๋ฉด ๊ณ ์ฐจ์› ๊ณต๊ฐ„์—์„œ ์ด๋ฏธ์ง€๋Š” ํŒ”์ด 2๊ฐœ ๊ณจํ”„์ฑ„๊ฐ€ 2๊ฐœ๋กœ ์ขŒ์šฐ ์ด๋ฏธ์ง€์˜ ํ”ฝ์…€ ์ค‘๊ฐ„๋ชจ์Šต์„ ๋ณด์—ฌ์ค๋‹ˆ๋‹ค. ์ด๊ฒƒ์€ ์šฐ๋ฆฌ๊ฐ€ ์›ํ•˜๋Š” ์‚ฌ์ง„์ด ์•„๋‹™๋‹ˆ๋‹ค. ๋ฐ˜๋Œ€๋กœ ๋งค๋‹ˆํด๋“œ ๊ณต๊ฐ„์—์„œ ์ค‘๊ฐ„๊ฐ’์€ ๊ณต์„ ์น˜๋Š” ์ค‘๊ฐ„๊ณผ์ • ๋ชจ์Šต, ์˜๋ฏธ์ ์œผ๋กœ ์ค‘๊ฐ„์— ์žˆ๋Š” ๋ชจ์Šต์„ ๋ณด์—ฌ์ค๋‹ˆ๋‹ค. ์šฐ๋ฆฌ๊ฐ€ ์›ํ•˜๋Š” ๊ฒƒ๋„ ์‚ฌ์‹ค ์ด๊ฒƒ์ด๋ผ๊ณ  ํ•  ์ˆ˜ ์žˆ๊ฒ ์ง€์š”. ๋งค๋‹ˆํด๋“œ๋ฅผ ์ž˜ ์ฐพ์œผ๋ฉด ์˜๋ฏธ์ ์ธ ์œ ์‚ฌ์„ฑ์„ ์ž˜ ๋ณด์กดํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๋˜ํ•œ ์œ ์‚ฌํ•œ ๋ฐ์ดํ„ฐ๋ฅผ ํš๋“ํ•˜์—ฌ ํ•™์Šต ๋ฐ์ดํ„ฐ์— ์—†๋Š” ๋ฐ์ดํ„ฐ๋ฅผ ํš๋“ํ•  ๊ฐ€๋Šฅ์„ฑ๋„ ์—ด๋ฆฌ๊ฒŒ๋ฉ๋‹ˆ๋‹ค.

Bootstrap

Sampling with replacement.
Applied when the data distribution is imbalanced.

Mainly applied to tasks such as anomaly detection and security, where class imbalance is severe.
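A minimal sketch of the resampling itself, on hypothetical synthetic data: draw B resamples with replacement and look at the spread of a statistic (here, the mean) across them.

```python
import numpy as np

# Hypothetical sample of 100 observations.
rng = np.random.default_rng(42)
data = rng.normal(loc=5.0, scale=2.0, size=100)

# Bootstrap: B resamples of the same size, drawn with replacement.
B = 1000
boot_means = np.array([
    rng.choice(data, size=data.size, replace=True).mean()
    for _ in range(B)
])

print(boot_means.mean())   # close to the sample mean
print(boot_means.std())    # bootstrap estimate of the standard error
```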

Appendix

Reference

Manifold Learning: https://deepinsight.tistory.com/124
Manifold Learning slides: https://www.slideshare.net/NaverEngineering/ss-96581209
manifold: https://kh-mo.github.io/notation/2019/03/10/manifold_learning/
Gradient Boost: https://bkshin.tistory.com/entry/%EB%A8%B8%EC%8B%A0%EB%9F%AC%EB%8B%9D-15-Gradient-Boost
Bagging & Boosting: https://bkshin.tistory.com/entry/%EB%A8%B8%EC%8B%A0%EB%9F%AC%EB%8B%9D-11-%EC%95%99%EC%83%81%EB%B8%94-%ED%95%99%EC%8A%B5-Ensemble-Learning-%EB%B0%B0%EA%B9%85Bagging%EA%B3%BC-%EB%B6%80%EC%8A%A4%ED%8C%85Boosting
[Paper Review] XGBoost: A Scalable Tree Boosting System: https://youtu.be/VkaZXGknN3g
XGBoost Part 1 (of 4): Regression: https://youtu.be/OtD8wVaFm6E
Gradient Boost Part 1 (of 4): Regression Main Ideas: https://youtu.be/3CC4N4z3GJc
