taemin6697 / Dacon_OCR_Competition

🏆 2023 Kyowon Group AI Challenge 🏆 Preliminary round: Public 8th / Private 8th. Final round: Public 9th / Private 8th

Dacon 2023 Kyowon Group AI Challenge <Preliminary Round>: Private 9th place

Team members

Hansung University_Kim Taemin (한성대학교_김태민), LastStar, 3o3

Model and GPU used

TrOCR-large-handwritten (https://arxiv.org/abs/2109.10282)

(https://huggingface.co/microsoft/trocr-large-handwritten)

RTX 4090

Python 3.7, Windows, Anaconda
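
For orientation, here is a minimal sketch of running TrOCR on a single image with the Hugging Face transformers API. It is not taken from the repository's scripts. It loads the base checkpoint named above; passing a local fine-tuned checkpoint directory as `model_name` is an assumption, and the function name `recognize_text` is made up for this example.

```python
# Base checkpoint named in this README. A local fine-tuned checkpoint
# directory could be passed instead (assumption, not confirmed by the README).
MODEL_NAME = "microsoft/trocr-large-handwritten"


def recognize_text(image_path, model_name=MODEL_NAME, device="cpu"):
    """Return the text TrOCR reads from a single image file."""
    # Heavy libraries are imported lazily so that importing this module
    # does not require them to be loaded up front.
    import torch
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    # The processor resizes/normalizes the image and decodes token ids.
    processor = TrOCRProcessor.from_pretrained(model_name)
    model = VisionEncoderDecoderModel.from_pretrained(model_name).to(device)
    model.eval()

    # TrOCR expects 3-channel input, so grayscale scans are converted to RGB.
    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    pixel_values = inputs.pixel_values.to(device)

    # Autoregressive decoding with the model's default generation settings.
    with torch.no_grad():
        generated_ids = model.generate(pixel_values)

    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

A call such as `recognize_text("./test/sample.png", device="cuda")` would return the predicted string; the image file name here is hypothetical.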

Main libraries to install

pip install transformers
pip install pandas
pip install Pillow
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
pip install scikit-learn
pip install datasets
pip install evaluate
pip install tqdm
pip install imgaug
pip install matplotlib
pip install imageio
pip install jiwer
pip install imagecorruptions
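
After installing, a quick check like the sketch below can confirm that every package is importable. It is an illustration rather than part of the repository. Note that two pip names differ from their import names (Pillow is imported as `PIL`, scikit-learn as `sklearn`). The helper name `missing_packages` is hypothetical.

```python
import importlib.util

# pip package name -> import name for the libraries listed above.
REQUIRED = {
    "transformers": "transformers",
    "pandas": "pandas",
    "Pillow": "PIL",
    "torch": "torch",
    "torchvision": "torchvision",
    "torchaudio": "torchaudio",
    "scikit-learn": "sklearn",
    "datasets": "datasets",
    "evaluate": "evaluate",
    "tqdm": "tqdm",
    "imgaug": "imgaug",
    "matplotlib": "matplotlib",
    "imageio": "imageio",
    "jiwer": "jiwer",
    "imagecorruptions": "imagecorruptions",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose import module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    print("All packages found." if not missing else f"Missing: {missing}")
```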

Dacon 2023 Kyowon Group AI Challenge <Preliminary Round>: How to use the code

- Training data path: ./train
- Test data path: ./test
- Training code: ./train1~4.py
- Test (inference) code: ./infer1~4.py
- Ensemble code: ./ESNB.py

Training from scratch

- Run the training scripts train1 through train4.
Running them creates the folders ./first, ./second, ./three, and ./four, each containing many checkpoint-xxxx folders.
Then run all of the inference scripts infer1 through infer4.
This produces six CSV files in total; then run ESNB.py to generate the final submission_esnb_sota_FINAL.csv.
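
The run order described above could be scripted roughly as follows. This is a sketch, not a script shipped with the repository. It assumes each script runs without command-line arguments from the repository root, and that "infer1 through infer4" includes infer1_2.py and infer3_2.py: the file structure lists six inference scripts, which presumably accounts for the six CSV files. The function names are hypothetical.

```python
import subprocess
import sys

# Script names as they appear in the file structure of this README.
TRAIN_SCRIPTS = ["train1.py", "train2.py", "train3.py", "train4.py"]
INFER_SCRIPTS = [
    "infer1.py", "infer1_2.py", "infer2.py",
    "infer3.py", "infer3_2.py", "infer4.py",
]
ENSEMBLE_SCRIPT = "ESNB.py"


def build_pipeline(from_scratch=True):
    """Return the commands to run, in order.

    from_scratch=True  -> train, infer, ensemble
    from_scratch=False -> infer, ensemble (weights already in place)
    """
    scripts = list(TRAIN_SCRIPTS) if from_scratch else []
    scripts += INFER_SCRIPTS + [ENSEMBLE_SCRIPT]
    return [[sys.executable, script] for script in scripts]


def run_pipeline(from_scratch=True, dry_run=False):
    """Run each step in order, stopping at the first failure."""
    for command in build_pipeline(from_scratch):
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError if a script exits non-zero.
            subprocess.run(command, check=True)
```

Calling `run_pipeline(dry_run=True)` only prints the commands, which is a cheap way to confirm the order before committing GPU time.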

Inference from the weight files

(The first zip archive is available through a Google Drive link.)
Unzip first to obtain the first, second, three, and four folders, and place them according to the file structure below.
Also place the train and test data according to the folder structure below.
Run all of the inference scripts infer1 through infer4; once the six CSV files are produced, run ESNB.py to generate the final submission_esnb_sota_FINAL.csv.
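
This README does not describe how ESNB.py combines the six CSV files. One common way to ensemble OCR predictions is a per-sample majority vote, sketched below using only the standard library. The column names `id` and `label`, the tie-breaking rule, and all function names are assumptions for illustration, not the repository's actual method.

```python
import csv
from collections import Counter


def read_predictions(path, id_col="id", label_col="label"):
    """Load one prediction CSV into an {id: label} dict."""
    with open(path, newline="", encoding="utf-8") as f:
        return {row[id_col]: row[label_col] for row in csv.DictReader(f)}


def majority_vote(prediction_sets):
    """Pick the most frequent label per sample.

    prediction_sets is a list of {id: label} dicts, ordered by priority.
    On a tie, the label seen first (from the earliest dict) wins, because
    Counter.most_common orders equal counts by first occurrence.
    """
    ensembled = {}
    for sample_id in prediction_sets[0]:
        votes = Counter(
            preds[sample_id] for preds in prediction_sets if sample_id in preds
        )
        ensembled[sample_id] = votes.most_common(1)[0][0]
    return ensembled


def write_submission(path, predictions, id_col="id", label_col="label"):
    """Write the ensembled {id: label} dict as a two-column CSV."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow([id_col, label_col])
        writer.writerows(predictions.items())
```

With this sketch, listing the strongest model's CSV first makes its prediction the tie-breaker, which is a reasonable default when the models disagree completely.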

File structure

├── git
│   ├── train1.py
│   ├── train2.py
│   ├── train3.py
│   ├── train4.py
│   ├── infer1.py
│   ├── infer1_2.py
│   ├── infer2.py
│   ├── infer3.py
│   ├── infer3_2.py
│   ├── infer4.py
│   ├── ESNB.py
│   ├── train
│   ├── test
│   ├── first
│   ├── second
│   ├── three
│   └── four
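
Before running inference from downloaded weights, the layout above can be verified with a short check like this sketch. It is an illustration, not part of the repository, and the helper name `missing_entries` is hypothetical. When training from scratch, the first, second, three, and four folders will be reported as missing until the training scripts have created them.

```python
from pathlib import Path

# Entries taken from the file structure shown above.
EXPECTED_FILES = [
    "train1.py", "train2.py", "train3.py", "train4.py",
    "infer1.py", "infer1_2.py", "infer2.py",
    "infer3.py", "infer3_2.py", "infer4.py",
    "ESNB.py",
]
EXPECTED_DIRS = ["train", "test", "first", "second", "three", "four"]


def missing_entries(root="."):
    """Return expected files and folders that are absent under root.

    Folders are reported with a trailing slash to tell them apart from files.
    """
    root = Path(root)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing


if __name__ == "__main__":
    problems = missing_entries(".")
    print("Layout OK." if not problems else f"Missing: {problems}")
```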
