Recent posts

NLP - Word Embedding

8 minute read

Word Embedding: How should the meaning of a word be represented? A good representation must capture the relationships between words well.
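As a rough illustration of "capturing relationships between words", here is a minimal sketch that compares word vectors by cosine similarity. The three-dimensional vectors are made up for this example (real embeddings are learned and much larger), and `cosine_similarity` is a helper name chosen here, not something from the post.

```python
import math

# Hypothetical 3-dimensional embeddings, hand-picked for illustration only.
embeddings = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Related words should score higher than unrelated ones.
sim_related = cosine_similarity(embeddings["king"], embeddings["queen"])
sim_unrelated = cosine_similarity(embeddings["king"], embeddings["apple"])
print(f"king~queen: {sim_related:.3f}, king~apple: {sim_unrelated:.3f}")
```

With these toy vectors, "king" and "queen" point in nearly the same direction while "apple" does not, which is the kind of relationship a good representation should expose.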

Download Gdrive file via wget or gdown

less than 1 minute read

Wget: Create a share link (public to anyone with the link), then download with wget using a cookie setup: wget --load-cookies ~/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget -...

NLP - Document Classification

6 minute read

Document classification (Text Classification): the task of taking text as input and determining which category it belongs to. Various document classification problems: document category and topic classification, email spam classification, sentiment classification, language classification ...

Docker Volume vs Bind Mount

less than 1 minute read

👉 Reference: https://www.daleseo.com/docker-volumes-bind-mounts/

Docker Build

less than 1 minute read

Build a Docker image, then run, start, stop, and exit a container using shell commands: $ bash docker/build.sh

NLP - Language Model

5 minute read

Language model: Which word comes next after the following sentence? Please turn your homework in or the ? Which of the following two sentences is more likely to occur? ...
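Both questions in this excerpt (what word comes next, and which sentence is more probable) can be sketched with a tiny bigram model. This is only an illustration under stated assumptions: the three-sentence corpus is invented, add-one smoothing is one choice among many, and the helper names (`bigram_prob`, `sentence_prob`, `most_likely_next`) are hypothetical rather than taken from the post.

```python
from collections import Counter

# Toy corpus, invented for illustration; a real model needs far more text.
corpus = [
    "please turn your homework in",
    "please turn in your homework",
    "turn your homework in today",
]

def tokenize(sentence):
    # <s> and </s> mark sentence boundaries.
    return ["<s>"] + sentence.lower().split() + ["</s>"]

unigrams = Counter()
bigrams = Counter()
for line in corpus:
    tokens = tokenize(line)
    unigrams.update(tokens[:-1])  # count each token as a bigram context
    bigrams.update(zip(tokens, tokens[1:]))

vocab = {tok for line in corpus for tok in tokenize(line)}

def bigram_prob(prev, word):
    """P(word | prev) with add-one (Laplace) smoothing."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(vocab))

def sentence_prob(sentence):
    """Product of smoothed bigram probabilities over the sentence."""
    tokens = tokenize(sentence)
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        prob *= bigram_prob(prev, word)
    return prob

def most_likely_next(prev):
    """Word with the highest smoothed probability after `prev`."""
    candidates = vocab - {"<s>"}
    return max(sorted(candidates), key=lambda w: bigram_prob(prev, w))

p_natural = sentence_prob("please turn your homework in")
p_scrambled = sentence_prob("homework your in turn please")
print(most_likely_next("your"), p_natural > p_scrambled)
```

The word order seen in the corpus gets a higher probability than the scrambled one because every bigram in the scrambled sentence is unseen and falls back to the smoothing floor.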

NLP - text preprocessing

5 minute read

Intro: Analyzing the meaning of natural language by computer so that it can be used for specific tasks. Applications: machine translation, sentiment analysis, document classification, question answering systems, chatbots, language generation, speech recognition, recommender systems ...

Docker install on ubuntu

5 minute read

Set up Docker via the repository: 1. Update the apt package index and install packages to allow apt to use a repository over HTTPS: $ sudo apt-get update $ sud...

Big data - Introduction to ML Pipeline and Tuning, with Hands-on Practice

4 minute read

Introduction to Spark MLlib model tuning. Spark MLlib model tuning: selecting the optimal hyperparameters. Finding the best model, or the best parameters for a model, is very important. Testing one at a time vs. testing many at once. Important in model selection ...
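The "many at once" approach is usually a grid search: score every combination of candidate hyperparameters and keep the best. The sketch below is plain Python, not Spark MLlib's API (in PySpark, `ParamGridBuilder` and `CrossValidator` from `pyspark.ml.tuning` play this role). The grid values and the `evaluate` scoring function are made up purely so the example runs without a cluster.

```python
from itertools import product

# Hypothetical hyperparameter grid; names and values are illustrative only.
param_grid = {
    "reg_param": [0.01, 0.1, 1.0],
    "max_iter": [10, 50],
}

def evaluate(params):
    """Stand-in for 'train a model and return a validation score'.

    A made-up function that peaks at reg_param=0.1, max_iter=50.
    """
    return -abs(params["reg_param"] - 0.1) - abs(params["max_iter"] - 50) / 100

def grid_search(grid, score_fn):
    """Score every combination in the grid and return the best one."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

best_params, best_score = grid_search(param_grid, evaluate)
print(best_params, best_score)
```

Testing one setting at a time would only walk along one axis of this grid; enumerating the product covers interactions between parameters at the cost of more training runs.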

07-22 Live Session

less than 1 minute read

What is Spark? Pandas on steroids + SQL + Scikit-learn + Streaming

Big data - Data Analysis with SparkSQL

10 minute read

Career talk: Don't compare yourself with others; look 20-30 years ahead. Once you decide to do something, dig into it for at least six months. Don't give up too quickly. When something isn't going well, slow down rather than rush. Rather than studying for the sake of studying, ...

Big data - Spark

6 minute read

What is big data technology? Introduction to Spark. Comparison with pandas. Spark hands-on practice.

Big data

6 minute read

The role of the data team. The data team's mission: creating added value based on trustworthy data.

ML Basics - todo

less than 1 minute read

Things to write up