[PAPER REVIEW] KLUE: Korean Language Understanding Evaluation

by 2soupsoup 2023. 6. 26.
Paper | https://arxiv.org/pdf/2105.09680.pdf

Github | https://klue-benchmark.com/

Homepage | https://klue-benchmark.com/

 

Notion (per-task summary version) | awake-roast-a5b.notion.site

 

0. Abstract

  1. Eight Korean natural language understanding (NLU) tasks
    • Topic Classification
    • Semantic Textual Similarity
    • Natural Language Inference
    • Named Entity Recognition
    • Relation Extraction
    • Dependency Parsing
    • Machine Reading Comprehension
    • Dialogue State Tracking
  2. Model release: pretrained language models (PLMs) KLUE-BERT and KLUE-RoBERTa
  3. Key findings
    • KLUE-RoBERTa large: outperforms the other baselines, including multilingual PLMs and existing open-source Korean PLMs
    • Replacing PII in the pretraining corpus causes only minimal performance degradation, so privacy protection and NLU capability do not conflict
    • Morpheme pre-tokenization followed by BPE tokenization is effective for morpheme-level tagging, detection, and generation

1. Introduction

  1. The success of models such as BERT and GPT was made possible by well-suited benchmarks for evaluating NLU effectiveness
  2. Goal: build a benchmark dataset for evaluating Korean NLU

1.1. Summary

[Design Principles]

  1. Cover diverse tasks and corpora: language understanding should be tested from several angles, so the eight tasks need to span diverse domains
  2. Accessible to everyone without restriction
  3. Accurate and unambiguous annotations
  4. Mitigate AI ethics issues

[Diverse Task Selection]

Tasks were selected with the following two goals

  1. Cover diverse aspects of Korean NLU
  2. Minimize redundancy among tasks: Topic Classification, Semantic Textual Similarity, Natural Language Inference, etc.

[Source Corpora Collection]

  1. Uses 10 data sources with no copyright issues, i.e. sources that permit derivative works, reprocessing, and commercial use
    • Yonhap News Agency (YNA) headlines, Wikipedia, Wikinews, Wikitree, Policy News, ParaKQC, Airbnb Reviews, NAVER Sentiment Movie Corpus, The Korea Economy Daily News, Acrofan News
  2. Before annotation, noise, toxic content, socially biased content, and personally identifiable information (PII) are removed
    • Automated with predefined rules and ML

[Consideration in Annotation]

  1. The source corpora are annotated separately for each task
  2. Considerations during annotation
    • Reflect the linguistic characteristics of Korean more faithfully
    • Keep annotations accurate
    • Mitigate harmful social bias and remove PII

[Evaluation Metrics]

An appropriate evaluation metric has to be chosen for each of the diverse tasks in KLUE.

  1. KLUE-TC (Yonhap News Agency Topic Classification, YNAT)
    • Defined as a multi-class classification problem with 7 classes
    • 70k headlines annotated
    • macro F1 score
  2. KLUE-STS
    • Similarity rating for a sentence pair (0~5)
    • Pearson correlation coefficient between gold and predicted scores
    • Paraphrase detection : F1 score
  3. KLUE-NLI
    • Uses classification accuracy, as in NLI datasets such as SNLI and MNLI
      • SNLI (Stanford Natural Language Inference) : 570k sentence pairs labeled entailment, contradiction, or neutral
      • MNLI (Multi-Genre Natural Language Inference) : 433k sentence pairs annotated with entailment labels
    • The KLUE-NLI dev/test sets are built to have a balanced class distribution
  4. KLUE-NER
    • Outputs BIO tags
    • Entities are categorized into 6 types (person, location, organization, date, time, quantity)
    • entity-level & character-level : F1 score
  5. KLUE-RE
    • Sentence classification task
    • One sentence containing two entities -> one of 30 relation types
    • micro F1 over the meaningful types only (no_relation excluded) : evaluates an NLU system's ability to identify fine-grained relations for an entity pair
    • AUPRC : characterizes the overall quality of the relation extraction model
  6. KLUE-DP
    • Uses UAS & LAS, following standard practice in dependency parsing
      • Unlabeled Attachment Score (UAS)
      • Labeled Attachment Score (LAS)
    • Both formal (news) and informal (colloquial review) text is annotated, allowing fine-grained analysis across domains
  7. KLUE-MRC
    • A span prediction problem, similar to KLUE-NER
    • EM, for comparison with existing datasets
    • ROUGE-W : an LCCS-based F1 score
  8. KLUE-DST (Wizard of Seoul)
    • multiple-sentence slot-value prediction
    • Joint goal accuracy : evaluates whether all slots are predicted correctly
    • Average F1 score
    • Built over multiple domains to make fine-grained analysis easier
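As a rough illustration of the two KLUE-STS metrics listed above, the following sketch computes the Pearson correlation on the raw scores and an F1 score after binarizing at 3. The function names are hypothetical, and treating a score of exactly 3 as a paraphrase (`>=`) is an assumption not stated in this review.

```python
import math


def pearson(gold, pred):
    """Pearson correlation between gold and predicted similarity scores."""
    n = len(gold)
    mean_g, mean_p = sum(gold) / n, sum(pred) / n
    cov = sum((g - mean_g) * (p - mean_p) for g, p in zip(gold, pred))
    std_g = math.sqrt(sum((g - mean_g) ** 2 for g in gold))
    std_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    return cov / (std_g * std_p)


def paraphrase_f1(gold, pred, threshold=3.0):
    """Binary F1 after thresholding 0~5 scores (>= threshold is an assumption)."""
    g = [s >= threshold for s in gold]
    p = [s >= threshold for s in pred]
    tp = sum(a and b for a, b in zip(g, p))
    fp = sum((not a) and b for a, b in zip(g, p))
    fn = sum(a and (not b) for a, b in zip(g, p))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

A model can score well on one metric and poorly on the other: Pearson rewards getting the ordering and spacing of scores right, while the binarized F1 only cares about which side of the threshold each pair lands on.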

[Baselines]

Model | Task
KLUE-BERT | KLUE-TC (YNAT), KLUE-DST (WoS)
KLUE-RoBERTa | KLUE-RE, KLUE-MRC
KoELECTRA BASE | KLUE-STS, KLUE-NLI
KLUE-RoBERTa LARGE | KLUE-NER

  1. Removing PII has little impact on performance
  2. Morpheme-based subword tokenization is effective for morpheme-level tagging, detection, and generation tasks

2. Source Corpora

Built from scratch, without reusing existing datasets

 

2.1. Corpora Selection Criteria

  1. Accessibility : as few restrictions as possible; freely usable
  2. Quality and diversity : low-quality text is removed to keep a baseline level of quality, and formal (written) and informal (colloquial) text are balanced

[Accessibility]

  1. No usage restrictions : both commercial and non-commercial use allowed
  2. Derivatives : users can freely rework the data to fix shortcomings (ethical issues, annotation errors, etc.)
  3. Redistribution allowed

[Quality and Diversity]

  1. Out of 20 candidate corpora that meet the accessibility requirement, 10 were selected using the following criteria
    • Diversity : not specific to a narrow domain
    • Quality : written in contemporary Korean; no content that raises privacy or toxicity concerns
    • Must be annotatable for at least one of the eight tasks
    • The selected subset must cover both formal and colloquial styles
  2. Collected corpora
    • Bold : corpora in the final selection
    • Volume : Small (~1k) / Medium (1k~50k) / Large (50k~)

Collected corpora

 

2.2. Selected Corpora

Collection mechanism, period, domain, style, license, and background of each corpus

Dataset | Description | License | Collection period (as of)
News Headlines (YNA) | Yonhap News Agency headlines; used for single-sentence classification | - | 2016~2020
Wikipedia | Open encyclopedia written in a formal style | CC BY-SA 3.0 | 2020.12.01
Wikinews | Collaborative journalism; about 500 freely available news articles | CC BY 2.5 |
Wikitree | News articles provided by Wikitree, a Korean social-media-based news platform launched in 2010. Advertising and clickbait headlines sometimes express inappropriate bias, but it is included because it covers a wide range of topics (* with the additional measures described in 2.2.1) | CC BY-SA 2.0 | 2016~2020
Policy News | Documents issued by Korean government ministries and public institutions; statements, notices, and press references from government agencies | KOGL Type 1 | ~ end of 2020
ParaKQC | 10,000 utterances for smart home devices: 1,000 intents with 10 similar queries each, covering the various topics that come up when talking to a smart home device | CC BY-SA 4.0 | -
Airbnb Reviews | Reviews from the Airbnb website; uses an existing multilingual review set collected and preprocessed from Airbnb, with the Korean-language reviews identified by regular expressions | CC0 1.0 | -
NAVER Sentiment Movie Corpus (NSMC) | Movie reviews scraped from NAVER Movies; user-written reviews with the text and a binary sentiment label; 200k reviews in total, balanced between positive and negative | CC0 1.0 | -
Acrofan News (ACROFAN) | Press-release-like news articles introducing companies' new products and events; covers diverse categories in a similar style and format | CC BY-SA 4.0 for KLUE-MRC by Contract | 2020.12~2021.01
The Korea Economy Daily News | Collection of articles from The Korea Economy Daily; topics include economy, politics, culture, and IT | CC BY-SA 4.0 for KLUE-MRC by Contract | 2013.01~2015.12

 

2.2.1. Potential Concerns

Considerations regarding data quality and social/ethical issues

[Toxic Content]

  1. News articles can reflect the biases of their reporters and editors
    • In particular, because Wikitree is social-media based, it contains potentially problematic patterns more often than other news sources
      • Wikitree headlines are not used to build TC
      • Wikitree article bodies are not used for MRC
      • Its sentences are complete and well-formed, so it is used for the other tasks
  2. Problematic sentences are discarded during annotation
    1. Online reviews are likely to contain toxic content
      • Airbnb : almost no toxic reviews, thanks to its safety check system
      • NSMC : includes insulting remarks about films, cast, and directors
      • Toxic content is filtered with a detector trained on a Korean hate speech dataset
    2. After filtering, any remaining problematic sentences are discarded during the annotation process

[Personally Identifiable Information (PII)]

  1. Any information that can identify an individual who is not considered a public figure (e.g. name, social security number, phone number, bank account number)

2.3. Preprocessing

  1. Sentences are split with Korean Sentence Splitter (KSS) v2.2.0.2, then preprocessed
  2. Manual inspection and filtering are additionally carried out during the KLUE annotation stage

[Noise Filtering]

Remove noisy text and non-Korean text

  1. Remove hashtags, HTML tags, bad characters, empty parentheses, consecutive whitespace, etc.
  2. Filter out sentences containing 10 or more Chinese or Japanese characters
  3. News corpora : remove reporter, press, image, source, and copyright tag information
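The rules above can be sketched with a few regular expressions. This is only a minimal illustration, not the authors' pipeline: the patterns, the Unicode ranges used to detect Chinese and Japanese characters, and the function names are all assumptions.

```python
import re

# Hypothetical patterns approximating the noise-filtering rules.
HTML_TAG = re.compile(r"<[^>]+>")
HASHTAG = re.compile(r"#\w+")
EMPTY_BRACKETS = re.compile(r"\(\s*\)|\[\s*\]")
MULTI_SPACE = re.compile(r"\s+")
# Han ideographs plus hiragana/katakana (assumed ranges).
CJK_CHAR = re.compile(r"[\u4e00-\u9fff\u3040-\u30ff]")


def clean_sentence(text):
    """Strip HTML tags, hashtags, and empty brackets, then collapse whitespace."""
    text = HTML_TAG.sub(" ", text)
    text = HASHTAG.sub(" ", text)
    text = EMPTY_BRACKETS.sub(" ", text)
    return MULTI_SPACE.sub(" ", text).strip()


def keep_sentence(text, max_foreign=10):
    """Keep a sentence only if it has fewer than max_foreign Chinese/Japanese characters."""
    return len(CJK_CHAR.findall(text)) < max_foreign
```

HTML tags are removed first so that brackets left empty by tag removal are also caught by the empty-bracket rule.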

[Toxic Content Removal]

Goal: avoid unwanted content and bias, and remove inappropriate sentences

  1. Uses the Korean hate speech dataset
    • Gender-bias and hate-speech detectors are trained on it
      • Sentences predicted as gender-biased with a score of 0.5 or higher are discarded
      • Sentences predicted as hate speech with a score of 0.9 or higher are discarded
    • The thresholds were determined manually for each corpus
    • The dataset was built from online user comments
      • Suitable for online text
      • Not suitable for formal text -> not applied to the YNA / ACROFAN / The Korea Economy Daily datasets
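The thresholding step can be sketched as follows. The classifier itself is out of scope here: `score_fn` is a hypothetical stand-in that returns the two detector scores for a sentence, and the dictionary keys are assumed names.

```python
# Thresholds quoted in the notes above.
GENDER_BIAS_THRESHOLD = 0.5
HATE_SPEECH_THRESHOLD = 0.9


def is_toxic(scores):
    """scores: dict with 'gender_bias' and 'hate' probabilities (assumed keys)."""
    return (scores["gender_bias"] >= GENDER_BIAS_THRESHOLD
            or scores["hate"] >= HATE_SPEECH_THRESHOLD)


def filter_toxic(sentences, score_fn):
    """Drop sentences flagged by either detector; score_fn is a hypothetical scorer."""
    return [s for s in sentences if not is_toxic(score_fn(s))]
```

The stricter threshold for hate speech (0.9 versus 0.5) means that detector only removes sentences it is very confident about.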

[PII Removal]

  1. Sentences containing personal information are removed
  2. Sentences are detected with regular expressions matching email addresses, URLs, and user-mention keywords, and then removed
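A minimal sketch of regex-based PII sentence removal is shown below. The three patterns are simplified assumptions for illustration; the actual expressions used by the authors are not given in this review.

```python
import re

# Simplified, hypothetical patterns for the three PII categories mentioned above.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URL = re.compile(r"https?://\S+|www\.\S+")
MENTION = re.compile(r"(?<!\w)@\w+")  # "@username" not preceded by a word character


def contains_pii(sentence):
    """True if the sentence matches any of the PII patterns."""
    return any(p.search(sentence) for p in (EMAIL, URL, MENTION))


def drop_pii_sentences(sentences):
    """Remove whole sentences that contain PII, rather than masking the match."""
    return [s for s in sentences if not contains_pii(s)]
```

Note that at this stage the whole sentence is dropped; replacing PII in place (pseudonymization) is what is done later for the pretraining corpus.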

2.4. Task Assignment

The datasets above are used for the 7 tasks other than DST

* DST is built from simulated dialogues written by crowdworkers, so it does not need the datasets above

Task | Dataset | Reason for choosing the dataset
Topic Classification (TC) | YNA | Single-sentence topic classification
Semantic Textual Similarity (STS) | AIRBNB, POLICY, PARAKQC | Covers diverse semantic contexts (* the intent queries and topic information in PARAKQC are useful for generating semantically similar sentence pairs)
Natural Language Inference (NLI) | WIKITREE, POLICY, WIKINEWS, WIKIPEDIA, NSMC, AIRBNB | Multiple sources, as in MNLI
Named Entity Recognition (NER) | WIKITREE, NSMC | Named entities appear frequently; includes both formal and informal writing styles
Relation Extraction (RE) | WIKIPEDIA, WIKITREE, POLICY | Long, complete sentences that state relations between public figures and various organizations
Dependency Parsing (DP) | WIKITREE, AIRBNB | Well-written sentences with a balance of formal and colloquial styles (* AIRBNB sentences are better formed than NSMC sentences)
Machine Reading Comprehension (MRC) | WIKIPEDIA, ACROFAN, The Korea Economy Daily | Provides informative passages

3. KLUE Benchmark

  1. Goal of KLUE
    • Provide high-quality evaluation datasets and suitable automatic metrics for testing a system's ability to understand Korean
  2. How the benchmark is constructed
    • Rationale for the source corpus selection
    • Annotation protocol
    • Annotation process (* see the definition of potential ethical issues in 1.1. Summary)
    • Dataset split strategy
    • Metric design process

3.1. Topic Classification (TC) ~ 3.8. Dialogue State Tracking (DST)

* Summary of the key features not covered in the task overview above

KLUE-TC
  Key features
    • Train / Dev. / Test are split by publication date
      • Train set : published before 2020
      • Test set : published in 2020 or later
  Evaluation metrics
    • Macro F1 score
      • Mean of the topic-wise F1 scores, giving each topic equal importance

KLUE-STS
  Key features
    • To obtain sentences with the same meaning but different wording when pairing similar sentences, a round-trip translation (RTT) strategy with NAVER Papago is used
    • Greedy sentence matching algorithm
      • Pick one sentence at random and match it with the sentence that has the highest ROUGE score against it
      • Repeat on the corpus with already-paired sentences excluded
  Evaluation metrics
    • Pearson correlation coefficient, following the STS-b evaluation scheme
      • Measures the linear correlation between human-labeled similarity scores and model-predicted scores
    • F1 score
      • Scores are binarized into two classes (paraphrase or not) with 3 as the threshold, and F1 is measured on the binarized result

KLUE-NLI
  Key features
    • KLUE-RoBERTa is used when splitting Train / Dev. / Test
      • A model is trained on hypothesis sentences only, and the test set is chosen so that its predicted probabilities are equal (or close) across the three labels
  Evaluation metrics
    • -

KLUE-NER
  Key features
    • Character-level tagging, to suit the characteristics of Korean
  Evaluation metrics
    • Entity-level macro F1 score (Entity F1)
      • Compares predictions and gold labels at the entity level
      • Tokenization needs care to get good performance
    • Character-level macro F1 score (Char F1)
      • Measures partial overlap between predictions and gold labels
      • Mean of the class-wise F1 scores

KLUE-RE
  Key features
    • Derives entity-relation triples from a sentence (e_subj, r, e_obj)
    • Relations (30) : person relations (18), organization relations (11), no_relation (1)
      • Relation classes that are rarely used or do not fit the Korean context were removed or merged
  Evaluation metrics
    • micro F1 score with no_relation excluded
      • Every sample gets the same weight, so majority classes carry more weight
    • AUPRC over all relation classes
      • Useful for imbalanced data where the important positives are rare

KLUE-DP
  Key features
    • Annotated with a modified version of the existing TTA DP guidelines
      • The TTA DP guidelines cover written text only
      • The data used here also includes colloquial and web text, so the guidelines were revised to fit
  Evaluation metrics
    • Unlabeled Attachment Score (UAS)
      • Counts HEAD predictions only
      • Computes macro F1 score over HEAD predictions
    • Labeled Attachment Score (LAS)
      • Counts both HEAD and DEPREL
      • Computes macro F1 score over DEPREL for tokens whose HEAD prediction is correct
      • Because the DEPREL distribution is skewed, labels within the bottom 1% cumulative frequency are merged into a single label (OTHER) before computing F1

KLUE-MRC
  Key features
    • Uses news article data to extend coverage to more domains
  Evaluation metrics
    • EM (Exact Match)
    • Character-level ROUGE-W (= LCCS-based F1)

KLUE-DST (WoS)
  Key features
    • Multi-turn dialogue dataset covering 5 domains
      • Simulated dialogues between a tourist in Seoul and a travel agent
      • Domains : hotel, restaurant, attraction, taxi, metro
  Evaluation metrics
    • JGA (Joint Goal Accuracy)
      • Proportion of all dialogue turns in which the predicted slot-value pairs match the gold values
    • Slot micro F1 score
      • Measured between predicted slot-value pairs and ground-truth pairs
      • Slots whose gold value is None are ignored in the slot micro F1 score
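To make the two KLUE-DST metrics concrete, here is a small sketch. It assumes each turn's dialogue state is a set of "domain-slot-value" strings with none-valued slots simply left out, which is one way to realize the rule that gold None slots are ignored; the representation and function names are assumptions.

```python
def joint_goal_accuracy(gold_states, pred_states):
    """Fraction of turns whose predicted state matches the gold state exactly."""
    correct = sum(set(g) == set(p) for g, p in zip(gold_states, pred_states))
    return correct / len(gold_states)


def slot_micro_f1(gold_states, pred_states):
    """Micro-averaged F1 over slot-value pairs, pooled across all turns."""
    tp = fp = fn = 0
    for g, p in zip(gold_states, pred_states):
        g, p = set(g), set(p)
        tp += len(g & p)   # pairs predicted correctly
        fp += len(p - g)   # predicted pairs not in gold
        fn += len(g - p)   # gold pairs that were missed
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

JGA is the stricter of the two: a single wrong slot makes the whole turn count as incorrect, while slot micro F1 still gives credit for the slots that were right.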

4. Pretrained Language Models

  1. To make research with KLUE easier, baselines are provided for every benchmark task
  2. Language models, including BERT and RoBERTa variants, are pretrained and released

4.1. Language Models

Several Korean models are pretrained under different training configurations

  1. KLUE-BERT and KLUE-RoBERTa are trained
  2. The configuration varies in pretraining corpus, preprocessing procedure, tokenization strategy, etc.

[Pretraining Corpora]

Five Korean corpora are combined into a final pretraining corpus of about 62GB

  1. MODU
    • Collection of Korean corpora distributed by the National Institute of Korean Language
    • Includes both formal articles and colloquial text
  2. CC-100-Kor
    • Multilingual web-crawled corpus, of which only the Korean portion is used
    • Used to train XLM-R
  3. NAMUWIKI
    • Korean web-based encyclopedia
    • Similar to WIKIPEDIA but less formal
  4. NEWSCRAWL
    • News articles published 2011~2020, collected from a news aggregation platform
  5. PETITION
    • Collection of Blue House national petitions on social issues, posted 2017.08~2019.03

[Preprocessing]

  1. Noise is filtered with the methods described in 2.3.
  2. CC-100-Kor and NEWSCRAWL
    • As a heuristic for keeping well-formed text, only sentences of 200 characters or more are kept
    • Sentences that appear in the KLUE benchmark datasets are removed via a similarity check

[Ethical Considerations]

  1. Social bias and hate speech are not removed
    • Manual inspection of a large-scale pretraining corpus is infeasible
    • The data is kept so that socially biased content and hate speech can be detected automatically later on
  2. PII : following KISA guidelines, 16 data types are detected with regular expressions and then pseudonymized

[Tokenization]

  1. Uses a new tokenization method, morpheme-based subword tokenization
    1. Raw text is pre-tokenized into morphemes with a morphological analyzer (Mecab-ko)
    2. BPE (WordPiece) is then applied
  2. A vocabulary of size 32k is built
  3. Afterwards, only the BPE model is used, for better usability and speed

[Training Configurations]

  1. Token sequence : at most 512 tokens
  2. Pretraining uses static or dynamic masking (with whole word masking, WWM)
    • Static masking : the BERT approach, where mask tokens are applied randomly once during preprocessing
    • Dynamic masking : the RoBERTa approach, where mask tokens are applied when a sequence is fed to the model
  3. BERT additionally performs NSP
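The difference between static and dynamic masking can be sketched in a few lines. This is a deliberately simplified illustration: it masks each token independently with probability 0.15 and ignores whole word masking and the usual 80/10/10 replacement scheme. All names are hypothetical.

```python
import random

MASK = "[MASK]"


def mask_tokens(tokens, rng, prob=0.15):
    """Replace each token with [MASK] independently with probability prob."""
    return [MASK if rng.random() < prob else t for t in tokens]


def static_masking(dataset, epochs, seed=0):
    """Mask once at preprocessing time; every epoch sees the same masked copy."""
    rng = random.Random(seed)
    masked = [mask_tokens(seq, rng) for seq in dataset]
    return [masked for _ in range(epochs)]


def dynamic_masking(dataset, epochs, seed=0):
    """Re-sample the mask each time a sequence is fed to the model."""
    rng = random.Random(seed)
    return [[mask_tokens(seq, rng) for seq in dataset] for _ in range(epochs)]
```

With dynamic masking the model sees a different masked version of the same sentence in each epoch, which effectively increases the variety of training signals without more data.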

4.2. Existing Language Models

Existing language models used for the benchmark evaluation (2 multilingual + 2 Korean)

  1. mBERT : multilingual BERT trained with MLM and NSP on 104 languages, including Korean
  2. XLM-R : RoBERTa trained with MLM on a large multilingual corpus
  3. KR-BERT
    • Open-source, BERT-based character-level Korean language model
    • KLUE uses the KR-BERT character WordPiece variant (16,424 tokens)
  4. KoELECTRA
    1. Open-source Korean language model trained with MLM and RTD
    2. Training data : crawled news data, MODU corpus

5. Fine-tuning Language Models

5.1. Task-Specific Architectures

The eight KLUE tasks can be grouped into four fine-tuning strategies. Per-task details are given in 5.1.1 ~ 5.1.4.

Task-Specific Architecture | Tasks
Single Sentence Classification | KLUE-TC, KLUE-RE
Sentence Pair Classification / Regression | KLUE-STS, KLUE-NLI
Multiple-Sentence Slot-Value Prediction | KLUE-DST (WoS)
Sequence Tagging | KLUE-NER, KLUE-MRC, KLUE-DP

5.1.1. Single Sentence Classification

  1. Single sentence classification assigns the input to one of a predefined set of labels
    • The last hidden layer is linearly mapped to the number of labels
    • Trained to minimize cross-entropy
  2. KLUE-TC : a plain single-sentence classification task, so no extra input processing is needed
  3. KLUE-RE
    • Special tokens (with their own embeddings) are added at the start and end of the subject and object entities to mark them in the sentence
    • <subj> subject entity </subj>
    • <obj> object entity </obj>
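The entity-marker step for KLUE-RE can be sketched as a small string operation. The sketch assumes entities are given as non-overlapping character spans with exclusive end offsets; the function name and span format are hypothetical.

```python
def mark_entities(sentence, subj_span, obj_span):
    """Wrap the subject and object spans in <subj>/<obj> marker tokens.

    Spans are (start, end) character offsets, end exclusive, and are
    assumed not to overlap.
    """
    spans = [(subj_span, "subj"), (obj_span, "obj")]
    # Insert from the rightmost span first so earlier offsets stay valid.
    for (start, end), tag in sorted(spans, key=lambda x: x[0][0], reverse=True):
        sentence = (sentence[:start] + f"<{tag}>" + sentence[start:end]
                    + f"</{tag}>" + sentence[end:])
    return sentence
```

In an actual fine-tuning setup the four marker strings would also have to be registered as special tokens in the tokenizer so that each one maps to its own embedding.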

5.1.2. Sentence Pair Classification / Regression

  1. Determines the relation between two sentences
    • The two input sentences are joined with a separator token such as [SEP]
  2. KLUE-STS
    • Each sentence pair is labeled with a real-valued similarity in [0,5]
    • The hidden state of the [CLS] token is mapped to a real number, and the model is trained to minimize MSE
  3. KLUE-NLI
    • Each premise-hypothesis pair is labeled with one of 3 classes
    • The hidden state of the [CLS] token is mapped to a 3-dimensional vector, and the model is trained to minimize cross-entropy

5.1.3. Multiple-Sentence Slot-Value Prediction

  1. DST (WoS)
    • Slot-value prediction task for a given dialogue context
    • Predictions must be made over multiple turns (the context), not a single utterance
  2. Uses an encoder-decoder model made up of an utterance encoder, a state generator, and a slot gate
    • Utterance encoder : the GRU is replaced with a PLM
    • State generator : takes the [CLS] token as the first decoder input
    • Slot gate : WoS has more Boolean-type slots than MultiWOZ, so two slot gate labels (Y/N) are predicted for them
    • Trained to minimize the cross-entropy of the state generator and the slot gate

5.1.4. Sequence Tagging

  1. KLUE-NER
    • Token-level tagging : labels are assigned per character, so tokenization needs care
    • Each hidden state is mapped to a 12-dimensional vector (12 entity categories), and the model is trained to minimize cross-entropy
  2. KLUE-MRC
    • Given a question, tags the start and end tokens of the answer within the passage
    • Each token is mapped to a 2-dimensional vector (is it the start token, is it the end token), and the model is trained to minimize cross-entropy
      • If the question is unanswerable, [CLS] is treated as both the start and the end token
  3. KLUE-DP
    • Sequence tagging problem
      • Each token in the input sentence is tagged twice (HEAD, ARC)
      • Tokenization needs care
    • Subword representations are extracted with the PLM, then the first and last subword token representations of each word are concatenated to form the word vector
    • Models used
      • *biaffine attention for HEAD prediction
        • * For each encoded token, head and modifier representations are projected separately, and an attention score is computed for every head-modifier combination
      • bilinear attention for DEPREL prediction
      • Trained to minimize cross-entropy
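The word-vector construction described for KLUE-DP (concatenating the first and last subword representations of each word) can be sketched with plain lists. The `word_ids` alignment list, which gives the word index of each subword, is a hypothetical input; real implementations would operate on tensors instead.

```python
def word_vectors(subword_vecs, word_ids):
    """Concatenate the first and last subword vectors of each word.

    subword_vecs: list of vectors (lists of floats), one per subword.
    word_ids: word index for each subword, e.g. [0, 0, 1] (assumed format).
    """
    first, last = {}, {}
    for i, w in enumerate(word_ids):
        first.setdefault(w, i)  # remember the first subword of word w
        last[w] = i             # keep overwriting until the last one
    return [subword_vecs[first[w]] + subword_vecs[last[w]] for w in sorted(first)]
```

A word made of a single subword simply gets that subword's vector twice, so every word vector has the same dimensionality (twice the hidden size).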

5.2. Fine-Tuning Configurations

  1. Uses Huggingface Transformers and PyTorch-Lightning
  2. Hyperparameters
    • AdamW optimizer : learning rate {1e-5, 2e-5, 3e-5, 5e-5}
      • AdamW : Adam optimizer with weight decay, which limits how large the weights can grow
    • warm-up ratio : {0., 0.1, 0.2, 0.6}
      • warm-up ratio : to limit the effect that randomly initialized parameters have on early training, a small learning rate is applied at first and then raised to the configured learning rate once training has stabilized
    • weight decay coefficient : {0.0, 0.01}
    • batch size : {8, 16, 32}
    • epochs : {3, 4, 5, 10}
  3. The best hyperparameter values are chosen based on performance on the Dev. set
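The search space above and the warm-up idea can be sketched as follows. Enumerating the full Cartesian product is only for illustration, since the review does not say how the space was actually explored, and the schedule is a simplified linear warm-up that stays constant afterwards (real schedulers usually also decay the rate). All names are hypothetical.

```python
from itertools import product

# Search space copied from the list above.
GRID = {
    "learning_rate": [1e-5, 2e-5, 3e-5, 5e-5],
    "warmup_ratio": [0.0, 0.1, 0.2, 0.6],
    "weight_decay": [0.0, 0.01],
    "batch_size": [8, 16, 32],
    "epochs": [3, 4, 5, 10],
}


def iter_configs(grid):
    """Yield every hyperparameter combination as a dict."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


def warmup_lr(step, total_steps, base_lr, warmup_ratio):
    """Linear warm-up to base_lr, then constant (a simplification)."""
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr
```

The full grid has 4 x 4 x 2 x 3 x 4 = 384 combinations per task and model, which is why hyperparameter selection on the Dev. set is typically done over a subset in practice.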

5.3. Evaluation Results

Unlike other NLU benchmarks, no average score across all tasks is reported

Evaluation results of the KLUE PLMs and existing PLMs on the KLUE benchmark

  1. KLUE-BERT BASE : YNAT, WoS
  2. KLUE-RoBERTa BASE : KLUE-RE, KLUE-MRC
  3. KoELECTRA BASE : KLUE-STS, KLUE-NLI
  4. Notable points
    • KLUE-RoBERTa LARGE
      • The largest of the models tested
      • Performs well on KLUE-NER
      • Suggests that model size is related to performance, so further training is expected to improve results
    • Monolingual models outperform multilingual models of similar size

5.4. Analysis of Models

Effects of the pretraining corpus and of the data preprocessing methods

[Corpus Pseudonymization]

  1. The noise introduced by pseudonymization causes a slight drop in performance, but the drop is not large
  2. Minimal pseudonymization is a good way to balance task performance against the risk of leaking personal information

 

[Tokenization Strategy]

  1. Morpheme-based subword tokenization is compared with BPE
    • Subword fertility : the average number of subwords produced per word
    • Proportion of continued words
    • [UNK] ratio
  2. Comparison results
    • Fertility is higher than with BPE, but the proportion of continued words is lower, so the two move in opposite directions
    • >> Original words are kept intact where possible, and when a word does have to be split it tends to produce more subwords
    • >> With the vocabulary size fixed (32k), fewer [UNK] tokens are produced than with BPE, which affects performance
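The three tokenizer statistics can be computed from per-word tokenization output, as in the sketch below. The definitions of the continued-word proportion (share of words split into two or more subwords) and the [UNK] ratio (share of all subword tokens that are [UNK]) are the commonly used ones and are assumed here, since the review does not spell them out.

```python
def tokenization_stats(tokenized_words, unk="[UNK]"):
    """Compute fertility, continued-word proportion, and [UNK] ratio.

    tokenized_words: one list of subword strings per original word,
    e.g. [["한국", "##어"], ["이해"]] (hypothetical tokenizer output).
    """
    n_words = len(tokenized_words)
    n_subwords = sum(len(w) for w in tokenized_words)
    return {
        # average number of subwords per word
        "fertility": n_subwords / n_words,
        # share of words split into at least two subwords (assumed definition)
        "continued": sum(len(w) > 1 for w in tokenized_words) / n_words,
        # share of subword tokens that are [UNK] (assumed definition)
        "unk_ratio": sum(t == unk for w in tokenized_words for t in w) / n_subwords,
    }
```

High fertility together with a low continued-word proportion means most words stay whole while the few that are split break into many pieces, which matches the pattern described above.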


* [6. Ethical Considerations ~ 9. Conclusion] are omitted, since they largely restate the content above

 
