boostcampaitech4recsys1 / level1_bookratingprediction_recsys-level1-recsys-03

📕 Boostcamp AI Tech 4th - Book Recommendation Contest

[RecSys] Book Rating Prediction

Predict users' ratings from users' book rating data

💻 Go to the Wrap-up Report

🗒️ Data source

Contents

Team Members

Project Introduction

Architecture

Score Record (RMSE)

Getting Started

Team Members

• 강수헌_T4003: soso6079@naver.com
• 박경준_T4076: rudwns708.14564@gmail.com
• 박용욱_T4088: oceanofglitta@gmail.com
• 오희정_T4129: ohhj1999@gmail.com
• 정소빈_4196: sobing98@gmail.com

Project Introduction

Project topic: Predict which books a user will prefer, based on users' book rating data.
Project overview: Building on the Boostcamp Level1-U stage lectures, each team designs and trains a model, and teams are ranked by the results their model produces at inference time.
Equipment and tools:
• Server: Tesla V100, 88GB RAM Server
• Development IDE: Jupyter Notebook, VS Code
• Collaboration tools: Notion, Slack, Zoom
Metric: RMSE score
Dataset:
• books.csv : metadata describing 149,570 books (items)
• users.csv : metadata describing 68,092 customers (users)
• train_ratings.csv : 306,795 ratings left by 59,803 users on 129,777 books (items)
Expected outcome: We develop a model that predicts users' book ratings, and expect it to serve as a good basis for recommending books to users.
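
Because every score in this README is an RMSE, a small sketch of the metric may be useful: RMSE is the square root of the mean squared difference between true and predicted ratings. The function name `rmse` and the toy ratings below are illustrative assumptions, not code or data from this repository.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted ratings."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Toy ratings, made up for illustration only.
actual = [8, 6, 10, 7]
predicted = [7, 6, 8, 9]

# Errors are 1, 0, 2, -2, so the mean squared error is 9 / 4 = 2.25.
print(rmse(actual, predicted))  # 1.5
```

Lower is better: a model that predicted every rating exactly would score 0.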

Project structure diagram

[Figure: project structure diagram]

Data structure diagram

[Figure: data structure diagram]

Architecture

Category / Description
Architecture: FactorizationMachineModel + FieldAwareFactorizationMachineModel + DeepCrossNetworkModel
LB score (rank 8 of 14):
• public : 2.1407
• private : 2.1409
Training features: user_id, isbn, age, publisher, language, location country, year of publication, book author, category
(every feature except book title, city, and state is used for training)
Data:
• user_id: unique identifier
• location: missing state and country values are filled in using city
• age: missing values are filled in by pseudo labeling
• publisher, language: missing values are filled in using isbn
Ensemble method: Under training method 1, FM+FFM+HOFM+DCN are combined with optimal_weighted; under training method 2, FM+FFM+DCN are combined with optimal_weighted; the final ensemble averages the two results as (1+2)/2.
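
The two-stage ensemble described above could be sketched roughly as follows. This README does not show how `optimal_weighted` chooses its weights, so the sketch substitutes least-squares weights fitted on held-out predictions; that choice, the function names `fit_weights`, `weighted_blend`, and `average_blends`, and the toy data are all assumptions for illustration.

```python
import numpy as np


def fit_weights(val_preds, val_true):
    """Least-squares blend weights (a stand-in for the repo's optimal_weighted)."""
    X = np.column_stack(val_preds)  # shape: (n_samples, n_models)
    y = np.asarray(val_true, dtype=float)
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights


def weighted_blend(preds, weights):
    """Combine per-model predictions with the given weights."""
    return np.column_stack(preds) @ np.asarray(weights, dtype=float)


def average_blends(blend_a, blend_b):
    """The (1+2)/2 step: plain average of the two weighted blends."""
    a = np.asarray(blend_a, dtype=float)
    b = np.asarray(blend_b, dtype=float)
    return (a + b) / 2.0


# Toy validation ratings and noisy stand-ins for model outputs (made up).
rng = np.random.default_rng(0)
val_true = np.array([8.0, 6.0, 10.0, 7.0, 5.0, 9.0])


def noisy_copy():
    return val_true + rng.normal(0.0, 0.5, size=val_true.shape)


method1 = [noisy_copy() for _ in range(4)]  # stands in for FM, FFM, HOFM, DCN
method2 = [noisy_copy() for _ in range(3)]  # stands in for FM, FFM, DCN

blend1 = weighted_blend(method1, fit_weights(method1, val_true))
blend2 = weighted_blend(method2, fit_weights(method2, val_true))
final = average_blends(blend1, blend2)
print(final.shape)
```

In a real run the weights would be fitted on validation predictions and then applied to each model's test predictions, rather than reused on the same rows as in this toy example.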

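The key-based missing-value handling listed in the Architecture table (state and country from city; publisher and language from isbn) might look like the sketch below. The README does not say how the lookup is built, so this assumes "fill with the most frequent known value that shares the same key". The helper `fill_from_key`, the `isbn_prefix` column, the 4-character prefix length, and the toy rows are all hypothetical.

```python
import pandas as pd


def fill_from_key(df, target, key):
    """Fill missing `target` values with the most frequent known value per `key`."""
    # Build the lookup only from rows where both columns are known,
    # so every group has at least one value to take the mode of.
    known = df.dropna(subset=[target, key])
    lookup = known.groupby(key)[target].agg(lambda s: s.mode().iloc[0])
    filled = df[target].fillna(df[key].map(lookup))
    return df.assign(**{target: filled})


# Toy user rows (made up): country is recovered from city where possible.
users = pd.DataFrame({
    "city": ["seoul", "seoul", "busan", "seoul", "daegu"],
    "country": ["south korea", None, "south korea", "south korea", None],
})
users = fill_from_key(users, target="country", key="city")

# Toy book rows (made up): publisher is recovered from an assumed ISBN prefix.
books = pd.DataFrame({
    "isbn": ["0440234743", "0440225701", "0060973129"],
    "publisher": ["Dell", None, "HarperPerennial"],
})
books["isbn_prefix"] = books["isbn"].str[:4]  # prefix length is an assumption
books = fill_from_key(books, target="publisher", key="isbn_prefix")

print(users)
print(books)
```

Rows whose key never appears with a known value (the `daegu` row here) stay missing and would need a separate fallback.
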
Score Record (RMSE)

private board evaluation

[Figure: private board evaluation]

public board evaluation

[Figure: public board evaluation]

Getting Started

  • requirements : install the dependencies
pip install -r requirements.txt
  • train & inference : main.py
python main.py --MODEL FM --DATA_PATH data

options

[Figure: options]
