📚 Study

[CS224N] 6, 7, 8. RNN, LSTM, Seq2seq, Attention & Transformers

After the semester ended, I have been taking the CS224N lectures that were newly updated for the 2023 version. The recent lectures clearly cover much more up-to-date material, and the quality seems better accordingly. Unlike before, when I just let the lectures flow past me, this time I am trying to actually understand the important points and write them up on this blog to check the concepts again. In this post I wrote about how the field moved from the introduction of RNNs to LSTMs and the Transformer, and about each of these models. Highly recommended for anyone who has only heard of these models but does not know them well. 1. RNN Simple RNN As I wrote in the previous post, the core idea of an RNN is that it gives feedback to itself by reusing the same weight matrix W at every step. The basic structure is shown below. Training RNN So how is an RNN with this structure ..
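
The core idea in this excerpt, reusing the same weight matrix at every time step so the network feeds its own state back to itself, can be sketched in a few lines. A minimal NumPy illustration, assuming a tanh activation and toy dimensions (the code and names are not from the post itself):

```python
import numpy as np

def rnn_step(x_t, h_prev, W_h, W_x, b):
    """One recurrent step: the same W_h, W_x are reused at every time step."""
    return np.tanh(W_h @ h_prev + W_x @ x_t + b)

hidden, embed = 4, 3                       # toy dimensions (assumption)
rng = np.random.default_rng(0)
W_h = rng.normal(size=(hidden, hidden))    # recurrent weights, shared across time
W_x = rng.normal(size=(hidden, embed))     # input weights, shared across time
b = np.zeros(hidden)

h = np.zeros(hidden)                       # initial hidden state
for x_t in rng.normal(size=(5, embed)):    # five dummy input vectors
    h = rnn_step(x_t, h, W_h, W_x, b)      # feedback: h_t depends on h_{t-1}
print(h)
```

The point of the sketch is that `W_h` and `W_x` are created once and applied at every step; that parameter sharing is the feedback loop the excerpt refers to.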

[CS224N] 5. Language Models and Recurrent Neural Networks

In the previous post, I briefly covered the history of dependency parsers, how to build a dependency parser with a neural net, and regularization of neural nets. In this post, I briefly go over Language Modeling and then explain the basics of RNNs. 1. Language Modeling Language Modeling is the task of predicting which word comes next. That is, given words as context, it predicts the next word. Put with a bit of math, it works like this: given words x1, x2, ..., xt, compute the probability distribution of x(t+1). Expressed as a probability, this is: Language Model..
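
The probability the excerpt sets up is presumably the standard language-modeling formulation; written out (the chain-rule form is assumed from the usual treatment of this topic, not quoted from the post):

```latex
% Next-word prediction: a distribution over the vocabulary for the (t+1)-th word
P\left(x^{(t+1)} \mid x^{(t)}, \ldots, x^{(1)}\right)

% Applied repeatedly, a language model assigns a probability to an entire text
P\left(x^{(1)}, \ldots, x^{(T)}\right)
  = \prod_{t=1}^{T} P\left(x^{(t)} \mid x^{(t-1)}, \ldots, x^{(1)}\right)
```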

[CS224N] 4. Syntactic Structure and Dependency Parsing

The fourth lecture deals with how to analyze sentences. In particular, it explains Dependency Parsing, introducing both the approaches used in the past and the modern neural dependency parsing approach. 1. Two views of linguistic structure There are two ways to describe the structure of a sentence: one is Constituency parsing, the other is Dependency parsing. Briefly, Constituency parsing analyzes sentence structure in terms of its constituents, while Dependency parsing analyzes structure through dependency relations between words. Let's dig in a little deeper. 1. Constituency Parsing: Context-Free-..
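
To make "dependency relations between words" concrete, here is a toy sketch of how a dependency parse is often represented, with a hand-made sentence and hand-assigned heads and labels (purely illustrative, not taken from the lecture or the post):

```python
# Toy dependency parse: each word points to its head with a relation label.
sentence = ["The", "dog", "chased", "the", "cat"]

# (dependent index, head index, relation); index 0 stands for ROOT, words are 1-indexed.
dependency_arcs = [
    (1, 2, "det"),    # The    <- dog
    (2, 3, "nsubj"),  # dog    <- chased
    (3, 0, "root"),   # chased <- ROOT
    (4, 5, "det"),    # the    <- cat
    (5, 3, "obj"),    # cat    <- chased
]

for dep, head, rel in dependency_arcs:
    head_word = "ROOT" if head == 0 else sentence[head - 1]
    print(f"{sentence[dep - 1]:>7} --{rel}--> {head_word}")
```

A dependency parser's job is to predict exactly this set of head/label arcs for an input sentence, rather than a constituency tree of nested phrases.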

[CS224N] 3. Natural Language Processing with Deep Learning

1. NER NER stands for Named Entity Recognition, the task of finding words, classifying them, and assigning them to categories. Take the following example. If you look up the word Paris in a dictionary you get Paris, France, but in the text it is used as a person's name. As this shows, doing NER accurately always requires considering the context. How can we do this with a Neural Network? Simple NER: Window classification using binary logistic classifier The idea is first to use word vectors to build a context window made up of word vectors, feed it into a neural network layer, and then a logistic ..
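
A minimal sketch of the window classifier the excerpt describes: concatenate the word vectors of a small context window, pass them through one neural-network layer, and score the center word with a binary logistic output. The dimensions, the ReLU choice, and the example window (something like "museums in Paris are amazing") are assumptions for illustration:

```python
import numpy as np

def window_classifier(window_vectors, W, b, u):
    """P(center word is an entity of interest), given the window's word vectors."""
    x = np.concatenate(window_vectors)    # e.g. 5 words x 4 dims -> 20-dim input
    h = np.maximum(0.0, W @ x + b)        # one hidden layer (ReLU assumed)
    score = u @ h                         # scalar score for the center word
    return 1.0 / (1.0 + np.exp(-score))   # binary logistic probability

d, window, hidden = 4, 5, 8               # toy sizes (assumption)
rng = np.random.default_rng(0)
W = rng.normal(size=(hidden, d * window))
b = np.zeros(hidden)
u = rng.normal(size=hidden)

# One window of word vectors, center word in the middle (e.g. "Paris").
window_vectors = [rng.normal(size=d) for _ in range(window)]
print(window_classifier(window_vectors, W, b, u))
```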

장영준
List of posts in the '📚 Study' category