Discovering New Intents with Deep Aligned Clustering

2023. 8. 16. 04:12
Contents
  1. Introduction
  2. Approach
     1. Intent Representation
     2. Transferring Knowledge from Known Intents
     3. Deep Aligned Clustering
  3. Experiments
     Dataset
     Baselines
     Evaluation Metrics
     Evaluation Settings
  4. Conclusion

After reading A Probabilistic Framework for Discovering New Intents last time, I read the DeepAligned paper it builds on in order to understand that work better.

 

Introduction

The goal of this paper is to discover new intents using data labeled with known intents.

Previous approaches faced two difficulties with this task:

1. It is hard to transfer prior knowledge from a limited set of known intents to new intents.

2. It is hard to construct high-quality supervised signals for learning clustering-friendly representations that group both unlabeled known intents and new intents.

 

To address this, the paper proposes DeepAligned, which uses prior knowledge of known intents to construct high-quality supervised signals for feature learning.

The overall architecture of DeepAligned consists of the following steps:

1. Extract intent features with BERT.

2. Pre-train the model on the small amount of labeled data and estimate the number of intents K.

3. Run the K-means algorithm to produce cluster centroids, and use the cluster assignments as pseudo-labels.

4. Align the cluster centroids of the current training epoch with those of the previous epoch so that they match as closely as possible, which yields a projection G.

5. Finally, apply G to the pseudo-labels to produce aligned labels for self-supervised learning.


Approach

This section describes the steps above in more detail.

1. Intent Representation

First, BERT is used to extract intent representations.

This involves the following steps:

1. Feed the input sentence s_i into BERT and take all token embeddings [CLS, T_1, ..., T_M] from the last hidden layer.

2. Apply mean-pooling to obtain the averaged feature representation z_i:

    z_i = \text{mean-pooling}([CLS, T_1, \dots, T_M]), \quad z_i \in \mathbb{R}^{H}

Here CLS is the vector used for text classification, M is the sentence length, and H is the hidden size.

3. To extract a better semantic representation, add a dense layer h to obtain the intent feature representation I_i = h(z_i).
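The three steps above can be sketched in a few lines. This is a minimal illustration in plain Python, not the paper's implementation: the token vectors are toy numbers standing in for BERT's last hidden layer, and `mean_pool`, `dense_tanh`, `W_h`, and `b_h` are hypothetical names. The tanh activation is my assumption about the dense layer h; the notes above only say that a dense layer is added.

```python
import math

def mean_pool(token_embeddings):
    """Average the token vectors ([CLS], T_1, ..., T_M) into one H-dim vector z_i."""
    n_tokens = len(token_embeddings)
    hidden = len(token_embeddings[0])
    return [sum(tok[d] for tok in token_embeddings) / n_tokens for d in range(hidden)]

def dense_tanh(z, weights, bias):
    """Dense layer h: one output per row of `weights`, followed by tanh (assumed)."""
    return [math.tanh(sum(w * x for w, x in zip(row, z)) + b)
            for row, b in zip(weights, bias)]

# Toy stand-in for BERT output: 3 tokens ([CLS] plus two words), hidden size H = 2.
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
z_i = mean_pool(tokens)

# Hypothetical dense-layer parameters mapping H = 2 to D = 2.
W_h = [[1.0, -1.0], [0.5, 0.5]]
b_h = [1.0, 0.0]
I_i = dense_tanh(z_i, W_h, b_h)
```

In a real setup the token embeddings would come from a BERT encoder and the dense layer would be trained together with it.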

 

2. Transferring Knowledge from Known Intents

Pre-training

Next, the information available from the known intents has to be transferred.

To transfer this knowledge well, the model is pre-trained on the limited labeled data,

and the resulting well-trained intent features can then be used to estimate the number of clusters.

Predict K

To predict K, the number of clusters, the procedure is:

1. Set an initial value K' (usually a multiple of the original number of intents).

2. Extract intent features with the pre-trained model.

3. Run K-means on the extracted features.

4. Discard clusters whose size falls below a threshold t, treating them as low confidence.

The resulting estimate of K can be written as:

    K = \sum_{i=1}^{K'} \delta(|S_i| \ge t)

|S_i| is the size of the i-th cluster produced, and δ is an indicator function that returns 1 if |S_i| is greater than or equal to t and 0 otherwise.
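As a small sketch of this counting rule: given K-means assignments obtained with K' clusters, count how many clusters are large enough to keep. The function name `estimate_k` is hypothetical, and the default threshold t = N / K' (the expected mean cluster size) is an assumption on my part; the notes above only say that some threshold is used.

```python
from collections import Counter

def estimate_k(assignments, k_prime, threshold=None):
    """Count the clusters whose size |S_i| is at least the threshold t."""
    sizes = Counter(assignments)
    if threshold is None:
        # Assumed default: expected mean cluster size N / K'.
        threshold = len(assignments) / k_prime
    return sum(1 for c in range(k_prime) if sizes.get(c, 0) >= threshold)

# Toy assignments: N = 12 samples clustered with K' = 4, so t = 3.
# Cluster sizes are 4, 3, 1, 4; only cluster 2 falls below t.
assignments = [0, 0, 0, 0, 1, 1, 1, 2, 3, 3, 3, 3]
estimated_k = estimate_k(assignments, k_prime=4)
```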

 

3. Deep Aligned Clustering

After transferring knowledge from the known intents, clustering is performed to find both the unlabeled known classes and the novel classes. Clustering first yields cluster assignments and centroids, and a self-supervised learning strategy is then applied on top of them.

Unsupervised Learning by Clustering

Since almost all of the data is unlabeled, the unlabeled samples are used to find new classes.

1. Extract intent features for the training data with the pre-trained model.

2. Use the K-means algorithm to learn the optimal cluster centroid matrix C and the cluster assignments. The objective is:

    \min_{C \in \mathbb{R}^{K \times D}} \frac{1}{N} \sum_{i=1}^{N} \min_{y_i \in \{1, \dots, K\}} \| I_i - C_{y_i} \|_2^2

N is the number of training samples, y_i is the cluster assignment of the i-th sample, D is the feature dimension, and ||·||^2_2 denotes the squared Euclidean distance.

The cluster assignments are then used as pseudo-labels for feature learning.
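To make the objective concrete, here is a minimal K-means sketch in plain Python. It is an illustration under simplifying assumptions, not the paper's code: the centroids are initialised from the first k points (a naive, deterministic choice), and `kmeans` and `sq_dist` are hypothetical helper names. It returns the centroid matrix C, the assignments used as pseudo-labels, and the value of the objective.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(features, centroids):
    """Assign each feature to the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: sq_dist(f, centroids[c]))
            for f in features]

def kmeans(features, k, n_iters=10):
    # Naive deterministic initialisation (assumption): the first k points.
    centroids = [list(f) for f in features[:k]]
    for _ in range(n_iters):
        labels = assign(features, centroids)
        for c in range(k):
            members = [f for f, y in zip(features, labels) if y == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    labels = assign(features, centroids)
    # Mean squared distance of each sample to its assigned centroid.
    objective = sum(sq_dist(f, centroids[y]) for f, y in zip(features, labels)) / len(features)
    return centroids, labels, objective

features = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
centroids, pseudo_labels, objective = kmeans(features, k=2)
```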

 

Self-supervised Learning with Aligned Pseudo-labels

The DeepCluster paper, which this paper builds on, alternates between clustering with K-means and updating the network parameters.

However, running K-means at every epoch causes the cluster indices to be permuted from one epoch to the next.

This means that the label of a given sample can change between epochs, so the model ends up being trained on data whose labels differ from epoch to epoch.

This is a serious drawback, because it makes consistent learning difficult.

The paper therefore introduces an alignment strategy to deal with this assignment inconsistency problem.

The problem above is that labels differ between epochs, in other words that what was learned in earlier epochs is not retained.

To solve this, the paper uses the cluster centroids. The step-by-step procedure is:

1. Preparation

Take the cluster centroid matrices from the previous and the current epoch.

2. Similarity matrix

Compute the similarity between the two centroid matrices to build a similarity matrix.

For example, entry (i,j) of the matrix is the similarity between the i-th centroid of C^l and the j-th centroid of C^c.

(C^c is the centroid matrix of the current epoch, and C^l is the centroid matrix of the last (previous) epoch.)

3. Hungarian algorithm

Apply the Hungarian algorithm to the similarity matrix to find the optimal mapping.

This maximizes the total similarity between the two centroid matrices:

the centroids of C^c are rearranged so that each highly similar (i,j) pair found above ends up at the same index. The paper writes this as:

    C^c = G(C^l)

Here G is the optimal mapping obtained with the Hungarian algorithm.

4. Aligned centroid matrix

The steps above produce the aligned centroid matrix C^c.

5. Aligned pseudo-labels

Map y^c, the pseudo-labels of the current epoch, onto the aligned centroid matrix to produce y^align:

    y^{align} = G^{-1}(y^c)

where G^{-1} is the inverse mapping of G.

6. Self-supervised learning

Self-supervised learning is then performed with the aligned pseudo-labels under the following softmax loss:

    L_s = -\frac{1}{N} \sum_{i=1}^{N} \log \frac{\exp(\phi(I_i)^{y_i^{align}})}{\sum_{j=1}^{K} \exp(\phi(I_i)^{j})}

φ(·) is the pseudo-classifier, and φ(·)^j is its output logit for the j-th class.
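The alignment steps can be sketched as follows. This is a toy illustration with several assumptions, not the authors' code. Instead of the Hungarian algorithm it searches all permutations, which gives the same optimal matching but is only feasible for a handful of clusters; for realistic K, `scipy.optimize.linear_sum_assignment` would be the usual choice. It also minimises squared Euclidean distance between centroids rather than maximising a similarity score. The names `align_centroids`, `align_labels`, and `softmax_cross_entropy` are hypothetical.

```python
import math
from itertools import permutations

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def align_centroids(last_centroids, current_centroids):
    """Map each current cluster index to the last-epoch index it matches best.

    Brute-force stand-in for the Hungarian algorithm: tries every one-to-one
    matching and keeps the one with the smallest total centroid distance.
    """
    k = len(last_centroids)
    best_perm = min(
        permutations(range(k)),
        key=lambda perm: sum(sq_dist(last_centroids[i], current_centroids[perm[i]])
                             for i in range(k)),
    )
    # best_perm[i] is the current index matched to last index i; invert it.
    return {cur: last for last, cur in enumerate(best_perm)}

def align_labels(current_labels, cur_to_last):
    """Rewrite current pseudo-labels in the index space of the last epoch."""
    return [cur_to_last[y] for y in current_labels]

def softmax_cross_entropy(logits, label):
    """Softmax loss for one sample, computed with the log-sum-exp trick."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum - logits[label]

# The same three clusters in both epochs, but K-means shuffled their indices.
last_c = [[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]]
cur_c = [[9.8, 0.1], [0.2, -0.1], [5.1, 4.9]]
y_cur = [0, 1, 2, 2, 0]

mapping = align_centroids(last_c, cur_c)
y_align = align_labels(y_cur, mapping)

# Loss of one sample whose (hypothetical) classifier logits are all equal.
loss = softmax_cross_entropy([0.0, 0.0, 0.0], y_align[0])
```

With aligned labels, a sample that stays in the same cluster keeps the same class index across epochs, so the classifier head does not have to be relearned after every K-means run.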

 

After this clustering procedure, a cluster validity index (CVI) is used to evaluate the quality of the clusters obtained at each training epoch. Specifically, the unsupervised metric Silhouette Coefficient (SC) is used:

    SC = \frac{1}{N} \sum_{i=1}^{N} \frac{b(I_i) - a(I_i)}{\max\{a(I_i), b(I_i)\}}

a(I_i) is the average distance between I_i and the other samples in the i-th cluster (it reflects intra-class compactness), and

b(I_i) is the smallest distance between I_i and the samples that are not in the i-th cluster (it reflects inter-class separation).

SC ranges from -1 to 1, and a higher score means a better clustering result.
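A minimal Silhouette Coefficient sketch in plain Python is shown below. Two points are my assumptions rather than something stated above: b is computed with the standard textbook definition (the smallest mean distance from the sample to any other cluster), and samples in singleton clusters are scored 0, which is a common convention. The function names are hypothetical, and the input must contain at least two clusters.

```python
import math

def euclid(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def silhouette_coefficient(features, labels):
    """Mean silhouette score over all samples (requires at least two clusters)."""
    clusters = {}
    for idx, y in enumerate(labels):
        clusters.setdefault(y, []).append(idx)
    scores = []
    for i, y in enumerate(labels):
        same = [j for j in clusters[y] if j != i]
        if not same:
            scores.append(0.0)  # singleton cluster: scored 0 by convention
            continue
        # a: mean distance to the other members of the sample's own cluster.
        a = sum(euclid(features[i], features[j]) for j in same) / len(same)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(euclid(features[i], features[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != y
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

features = [[0.0], [1.0], [10.0], [11.0]]
sc_good = silhouette_coefficient(features, [0, 0, 1, 1])  # tight, well-separated
sc_bad = silhouette_coefficient(features, [0, 1, 0, 1])   # clusters interleaved
```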


Experiments

Dataset

The datasets are CLINC (an intent classification dataset) and BANKING (a dataset from the banking and finance domain).

CLINC covers 10 domains and consists of 150 intents and 22,500 utterances,

while BANKING consists of 77 intents and 13,083 utterances.

Baselines

The baselines fall into two groups: unsupervised and semi-supervised methods.

Evaluation Metrics

The evaluation metrics are NMI, ARI, and ACC. To compute ACC,

the Hungarian algorithm is used to obtain the mapping between the predicted classes and the ground-truth classes.
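The ACC computation can be illustrated with a small sketch. As a simplification, it tries every one-to-one mapping from predicted cluster ids to ground-truth classes instead of running the Hungarian algorithm; the best mapping is the same, but the brute-force search only works for a few classes, and `scipy.optimize.linear_sum_assignment` would be the usual tool for realistic K. The function name `clustering_accuracy` is hypothetical.

```python
from itertools import permutations

def clustering_accuracy(y_true, y_pred, k):
    """Accuracy under the best one-to-one mapping of predicted ids to true classes."""
    best_correct = 0
    for perm in permutations(range(k)):
        # perm[p] is the true class assigned to predicted cluster p.
        correct = sum(1 for t, p in zip(y_true, y_pred) if perm[p] == t)
        best_correct = max(best_correct, correct)
    return best_correct / len(y_true)

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [2, 2, 0, 0, 1, 0]  # cluster ids are arbitrary; the last sample is wrong
acc = clustering_accuracy(y_true, y_pred, k=3)
```

Because cluster ids carry no meaning by themselves, comparing y_pred to y_true directly would report a much lower score than the clustering deserves; the mapping step removes that artifact.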

Evaluation Settings

For each dataset, 10% of the training data is randomly selected as labeled data, and 75% of the intents are randomly chosen as known intents, with the remaining 25% treated as unknown intents.

Each dataset is then split into training, validation, and test sets.

The number of intent categories (K) is set to the ground-truth value.

The evaluation procedure is:

1. Pre-train the model on the small amount of labeled data with known intents, and tune it on the validation set.

2. Use all of the training data for self-supervised learning, and evaluate the clusters with SC.

3. Evaluate performance on the test set and report the final averaged results.

[Table: evaluation results]


Conclusion

In summary, this work presents an effective method for discovering new intents.

The method successfully transfers prior knowledge from a limited set of known intents, and estimates the number of intents by removing low-confidence clusters.

It also provides a more stable supervised signal that guides the clustering process consistently.

DeepAligned outperforms the compared methods and, using only limited prior knowledge, obtains a more accurate estimate of the number of clusters.
