boostcampaitech6 / level1-bookratingprediction-recsys-06

level1-bookratingprediction-recsys-06 created by GitHub Classroom

Book Rating Prediction

📌 Project Overview

project_info

Using book metadata, user information, and the ratings users actually gave, we predict the rating a user will give to a given book.

The competition is a personalized product recommendation task intended to help consumers decide which books to buy.

🥇 Project Results

Public

Public leader board

Private

Private leader board

📋 Project Procedure and Methods

EDA

  • Processing of all text data
  • Analysis of user age groups
  • Processing of book ISBNs
  • Processing of book authors
  • Processing of publication years
  • Processing of publishers
  • Processing of image URLs
  • Processing of book categories
  • Processing of book summaries

Modeling

  • All features were converted to categorical features before use.
  • We used CatBoost, a gradient boosting library that handles categorical features effectively.
  • Hyperparameter optimization (HPO) was done with Optuna.
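As an illustration of turning a numeric feature into a categorical one, the sketch below bins a user's age into a coarse label and gives missing values their own category. The function name `age_to_bucket`, the decade-wide bins, and the `"unknown"` label are assumptions made for this example, not the team's actual preprocessing.

```python
import math
from typing import Optional


def age_to_bucket(age: Optional[float]) -> str:
    """Map a raw age to a coarse categorical label (hypothetical binning)."""
    # Missing ages become their own category instead of being imputed.
    if age is None or (isinstance(age, float) and math.isnan(age)):
        return "unknown"
    decade = int(age) // 10 * 10
    if decade < 10:
        return "under 10"
    # Collapse everything from 60 upward into a single bucket.
    if decade >= 60:
        return "60+"
    return f"{decade}s"


ages = [23.0, 35.0, None, float("nan"), 71.0, 8.0]
buckets = [age_to_bucket(a) for a in ages]
print(buckets)
```

String labels like these can be handed to CatBoost as categorical columns through its `cat_features` argument.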

CatBoost Modeling

  • We used CatBoostPruningCallback to stop unpromising trials early during HPO. (No GPU support.)
  • Stratified K-Fold is normally not applicable to regression problems because the label is continuous, but we used it here because the ratings in this project are discrete.
  • In particular, the rating values are very unevenly distributed, so we judged Stratified K-Fold to be the better choice.
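The stratified split described above can be sketched with scikit-learn's `StratifiedKFold`, treating the discrete ratings as class labels. The toy data, the rating values and probabilities, the five folds, and the seeds are all assumptions for illustration; the source does not state the team's actual settings.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Toy, imbalanced integer ratings standing in for the real labels (assumed data).
rng = np.random.default_rng(0)
ratings = rng.choice([1, 5, 8, 10], size=200, p=[0.1, 0.2, 0.4, 0.3])
features = rng.normal(size=(200, 3))  # placeholder feature matrix

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Share of each rating value in the full data set.
overall = np.bincount(ratings, minlength=11) / len(ratings)

valid_sizes = []
max_gap = 0.0
for fold, (train_idx, valid_idx) in enumerate(skf.split(features, ratings)):
    # Each validation fold should mirror the overall rating distribution.
    fold_dist = np.bincount(ratings[valid_idx], minlength=11) / len(valid_idx)
    max_gap = max(max_gap, float(np.abs(fold_dist - overall).max()))
    valid_sizes.append(len(valid_idx))
    print(f"fold {fold}: train={len(train_idx)} valid={len(valid_idx)}")

print(f"largest per-rating share gap across folds: {max_gap:.3f}")
```

The labels are used only to decide fold membership, so a regression model can still be trained on the resulting splits.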

CNN_FM and DeepCoNN Modeling

  • To improve performance, we trained CNN_FM and DeepCoNN models, which make use of unstructured data (images and text).
  • These models were later ensembled with the CatBoost model.
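The source does not say how the ensemble was formed. One common approach is a weighted average of each model's predictions, sketched below. The prediction values, the weights, and the 1 to 10 clipping range are made-up assumptions, not results or settings from the project.

```python
import numpy as np


def weighted_ensemble(predictions, weights):
    """Blend per-model prediction arrays with normalized weights."""
    preds = np.asarray(predictions, dtype=float)  # shape: (n_models, n_samples)
    w = np.asarray(weights, dtype=float)
    if preds.shape[0] != w.shape[0]:
        raise ValueError("need exactly one weight per model")
    w = w / w.sum()  # make the weights sum to 1
    return w @ preds


def rmse(y_true, y_pred):
    """Root mean squared error between two equally sized arrays."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# Made-up numbers for illustration only.
y_true = np.array([8.0, 5.0, 10.0, 7.0])
catboost_pred = np.array([7.5, 5.5, 9.0, 7.0])
cnn_fm_pred = np.array([8.5, 4.0, 9.5, 6.0])
deepconn_pred = np.array([7.0, 6.0, 10.0, 8.0])

blend = weighted_ensemble(
    [catboost_pred, cnn_fm_pred, deepconn_pred],
    [0.6, 0.2, 0.2],
)
blend = np.clip(blend, 1.0, 10.0)  # keep predictions inside the assumed rating range
score = rmse(y_true, blend)
print(blend, score)
```

In practice the weights would be chosen on a validation set rather than fixed by hand.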

Model Evaluation and Improvement

  • We performed HPO with Optuna.
  • We improved model performance through a range of feature engineering steps.

Public Score Linechart

🤖 Team Members

노관옥 박경원 이석규 이진원 장성준

📚 Report & Presentation

Wrap-up Report (PDF)

The wrap-up report covers the project procedure, methods, results, final evaluation, and each team member's retrospective in more detail.

Presentation (PPT)

Slides from the project results presentation.

Languages

Jupyter Notebook 99.4%, Python 0.6%