꿈 많은 사람의 이야기

I plan to review a series of recommender system papers. This first post covers DeepFM.

The references I used are listed below.

DeepFM paper
- https://arxiv.org/pdf/1703.04247.pdf
FM paper
- https://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf

Post body

This DeepFM review proceeds in the following order.

1. DeepFM key summary

2. DeepFM paper review

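DeepFM key summary

Combines the strengths of Wide & Deep and FM

• Consists of an FM component and a deep component

• Overcomes the weaknesses of earlier models while combining their strengths

DeepFM

• A model that predicts CTR

• Captures both low- and high-order feature interactions

• Requires no feature engineering

• Requires no pre-training

That is DeepFM in brief.

Now let's go through the paper section by section.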

DeepFM paper review

Introduction

Why CTR (click-through rate) matters

• The goal of many recommender systems is to maximize the number of clicks

• The items returned to a user can be ranked by estimated CTR

Click behavior involves implicit feature interactions

• App category and time-stamp (order-2 interaction)

• Users tend to download food delivery apps at meal-time

• The interaction of these two features is a signal for CTR

• App category, user gender, and age (order-3 interaction)

• Male teenagers like RPGs and shooting games

These feature interactions need to be taken into account

• Both low- and high-order interactions should be considered

Key challenge

• Modeling feature interactions effectively

• Some feature interactions are easy to understand (like the examples above), so experts can design them by hand

• In many cases, however, feature interactions are hard to identify or model

• There are too many features to inspect the relationships among them by hand

• Such interactions are found in the data (through machine learning) rather than through expert observation

Existing approaches share two characteristics:

• They are biased toward either low- or high-order feature interactions

• They rely on feature engineering

So this paper:

• Proposes a model that can learn feature interactions of all orders

• Without feature engineering

• Proposes DeepFM

• Integrates an FM part and a deep part

• The FM part models low-order feature interactions

• The deep part models high-order feature interactions

• Both parts share the same input and the same embedding vectors

Approach

Dataset setup

• The training set consists of n instances (χ, y)

• χ: a record with m fields

• Categorical fields (gender, location, etc.)

• Represented as one-hot vectors

• Continuous fields (age, etc.)

• Represented by the value itself, or by a one-hot vector after discretization

• y: a label that is either 0 or 1

• 1 means the user clicked the item (0 otherwise)

Each instance can therefore be

• written as (x, y)

• x = [x_field1, x_field2, …, x_fieldm]

• Each x_fieldj is the vector representation of the j-th field of χ
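
The encoding described above can be sketched in a few lines of NumPy. The field names, vocabularies, and age buckets below are hypothetical and chosen only for illustration; the paper does not prescribe them.

```python
import numpy as np

# Hypothetical field vocabularies (assumptions, not from the paper).
FIELDS = {
    "gender": ["female", "male"],
    "location": ["seoul", "busan", "jeju"],
    "age_bucket": ["<20", "20-39", "40+"],  # continuous age after discretization
}


def bucketize_age(age):
    """Map a continuous age to one of the assumed buckets."""
    if age < 20:
        return "<20"
    if age < 40:
        return "20-39"
    return "40+"


def encode_record(record):
    """Concatenate one one-hot vector per field into a single sparse vector x."""
    parts = []
    for field, vocab in FIELDS.items():
        one_hot = np.zeros(len(vocab))
        one_hot[vocab.index(record[field])] = 1.0
        parts.append(one_hot)
    return np.concatenate(parts)


record = {"gender": "male", "location": "busan", "age_bucket": bucketize_age(17)}
x = encode_record(record)  # x = [x_field1, x_field2, x_field3]
y = 1                      # the user clicked the item
```

Each field contributes exactly one non-zero entry, so x stays sparse however large the vocabularies grow.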

Goal

• ŷ = CTR_model(x)

• Estimate the probability that the user clicks

Goal of DeepFM

• Learn both low- and high-order feature interactions

• It therefore consists of two components: an FM component and a deep component

• The two components share the same input

• For each feature i:

• A scalar w_i weighs its order-1 importance

• A latent vector V_i measures the impact of its interactions with other features

• V_i is fed into the FM component to model order-2 interactions, and into the deep component to model high-order interactions

• The prediction has the form ŷ = sigmoid(y_FM + y_DNN)

• ŷ ∈ (0, 1): the predicted CTR

• y_FM: output of the FM component

• y_DNN: output of the deep component
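
As a minimal sketch of this combination step, the two component outputs are added and squashed into a probability. The function names and the input values are made up for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def deepfm_predict(y_fm, y_dnn):
    """Combine the two component outputs into a click probability."""
    return sigmoid(y_fm + y_dnn)


# Made-up component outputs, only to show the combination step.
p = deepfm_predict(y_fm=0.3, y_dnn=-1.1)
```

Because the two outputs are summed before the sigmoid, a single loss on ŷ sends gradients to both components, which is what makes joint training possible.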

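FM Component

• The Factorization Machine proposed in [Rendle, 2010]

• Besides the linear (order-1) interactions among features, it models pairwise (order-2) feature interactions

• Each pairwise interaction is the inner product of the respective feature latent vectors

• Captures order-2 feature interactions effectively even when the dataset is sparse

• This works because the interaction of features i and j is measured by the inner product of their latent vectors V_i and V_j, so V_i can be trained whenever feature i appears in a record, even if i and j rarely co-occur

• The term <w, x> reflects the importance of order-1 features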
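
Written out, the FM output is y_FM = <w, x> + Σ_{i<j} <V_i, V_j> x_i x_j. The sketch below computes it with the linear-time reformulation from the FM paper and checks it against an explicit double loop. The sizes d and k and the random parameters are assumptions made for the example.

```python
import numpy as np


def fm_output(x, w, V):
    """y_FM = <w, x> + sum_{i<j} <V_i, V_j> x_i x_j, computed in O(k * d).

    x: (d,) input vector, w: (d,) order-1 weights, V: (d, k) latent vectors.
    """
    linear = w @ x
    xv = x @ V                      # (k,) sum_i x_i * V_i
    x2v2 = (x ** 2) @ (V ** 2)      # (k,) sum_i x_i^2 * V_i^2
    pairwise = 0.5 * np.sum(xv ** 2 - x2v2)
    return linear + pairwise


def fm_output_naive(x, w, V):
    """Same quantity with an explicit double loop, used only as a check."""
    d = len(x)
    total = w @ x
    for i in range(d):
        for j in range(i + 1, d):
            total += (V[i] @ V[j]) * x[i] * x[j]
    return total


rng = np.random.default_rng(0)
d, k = 8, 4                         # illustrative sizes
x = np.array([0, 1, 0, 1, 0, 1, 0, 0], dtype=float)
w = rng.normal(size=d)
V = rng.normal(size=(d, k))
y_fm = fm_output(x, w, V)
```

With a sparse one-hot input, only the latent vectors of the active features contribute, which is why the pairwise term stays cheap.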

Deep Component

• A feed-forward neural network

• An embedding layer compresses the input vector into a low-dimensional vector

• Learns high-order interactions

• The embedding layer has two interesting features

• Input field vectors may have different lengths, but their embeddings all have the same size

• The latent feature vectors (V) of FM serve as the network weights

• They are also used to compress the input field vectors into embedding vectors

• Difference from [Zhang, 2016]

• FM is not used for initialization; it is included as part of the overall learning architecture

• This removes the need for pre-training, and the overall network is trained jointly

• y_DNN = σ(W^(l) a^(l) + b^(l)), where a^(l) is the output of the last hidden layer

• By sharing the same feature embedding, the FM and deep components gain two benefits

• 1. Learn both low- and high-order feature interactions from raw data

• 2. No need for expert feature engineering of the input
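
The shared-embedding idea can be sketched as follows: the same matrix V that the FM component uses also turns each one-hot field into a fixed-size embedding, and the concatenated embeddings feed the network. The field layout, layer sizes, relu activation, and random weights are assumptions for illustration. The last layer is left linear here so that the sigmoid is applied only once, when the two component outputs are combined; that is a simplification of this sketch.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def embed_fields(x, V, field_slices):
    """Embed each field with the FM latent vectors V (shared weights).

    For a one-hot field, x[s] @ V[s] selects the row of V that belongs to
    the active feature, so every field yields a k-dimensional embedding.
    """
    return [x[s] @ V[s] for s in field_slices]


def deep_output(embeddings, hidden, out):
    """Feed the concatenated embeddings through a small feed-forward net."""
    a = np.concatenate(embeddings)        # a^(0) = [e_1, ..., e_m]
    for W, b in hidden:
        a = relu(W @ a + b)               # a^(l+1) = relu(W^(l) a^(l) + b^(l))
    W_out, b_out = out
    return float(W_out @ a + b_out)       # scalar y_DNN, before any sigmoid


rng = np.random.default_rng(0)
field_slices = [slice(0, 2), slice(2, 5), slice(5, 8)]  # assumed 3 fields
d, k, m = 8, 4, len(field_slices)
V = rng.normal(scale=0.1, size=(d, k))                  # shared with FM
hidden = [
    (rng.normal(scale=0.1, size=(16, m * k)), np.zeros(16)),
    (rng.normal(scale=0.1, size=(16, 16)), np.zeros(16)),
]
out = (rng.normal(scale=0.1, size=16), 0.0)

x = np.array([0, 1, 0, 1, 0, 1, 0, 0], dtype=float)
embeddings = embed_fields(x, V, field_slices)
y_dnn = deep_output(embeddings, hidden, out)
```

The fields have lengths 2, 3, and 3, yet each embedding has length k = 4. Since V is the only embedding table, gradients from both components update it, and no separate pre-training step is needed.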

Relationship with other neural networks

Experiment

Dataset

• Criteo dataset

• Company dataset

• Used to verify the performance of DeepFM in real industrial CTR prediction

• User click records from the game center of the Company App Store

Evaluation metrics

• AUC

• Logloss (cross entropy)
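
For reference, both metrics can be computed directly in NumPy. This is a small sketch for toy inputs, not the evaluation code used in the paper; the labels and predictions are invented.

```python
import numpy as np


def logloss(y_true, y_prob, eps=1e-15):
    """Mean binary cross entropy; probabilities are clipped to avoid log(0)."""
    y_true = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))


def auc(y_true, y_prob):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half. This O(P * N) form is meant for small examples only.
    """
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob, dtype=float)
    pos = y_prob[y_true == 1]
    neg = y_prob[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size)


# Toy labels and predictions, invented for illustration.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.3, 0.4, 0.6]
toy_auc = auc(y_true, y_prob)
toy_logloss = logloss(y_true, y_prob)
```

AUC depends only on how the predictions are ordered, while Logloss also penalizes poorly calibrated probabilities, so the two metrics complement each other. Higher AUC and lower Logloss are better.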

Model Comparison

• LR (does not consider feature interactions), FM, FNN, PNN, Wide & Deep, DeepFM

Parameter Setting

• Dropout 0.5, network structure: 400-400-400

• Optimizer: Adam

• Activation function: tanh, relu

• These and other settings are chosen per model

Efficiency Comparison

• Metric: training time of deep CTR model / training time of LR

• Results

• The pre-training step makes FNN less efficient

• IPNN and PNN speed up considerably on GPU, but they are still computationally expensive

• This is because of the inner product operations

• DeepFM proves efficient

Effectiveness Comparison

• Learning feature interactions improves CTR prediction models

• LR performs worse than the other models

• DeepFM outperforms the other models

• Learning low- and high-order feature interactions together improves CTR prediction

• Low-order feature interactions only: FM

• High-order feature interactions only: FNN, IPNN, OPNN, PNN

• DeepFM outperforms all of these

• Sharing the same feature embedding also helps CTR prediction

• Models that learn low- and high-order feature interactions with separate feature embeddings

• LR & DNN, FM & DNN

• DeepFM outperforms these as well

Activation Function

Dropout

Number of Neurons per Layer

Number of Hidden Layers

Network Shape

Conclusions

The paper proposes DeepFM

• A Factorization-Machine based Neural Network for CTR Prediction

• Jointly trains a deep component and an FM component

Advantages

• 1) Does not need any pre-training

• 2) Learns both high- and low-order feature interactions

• 3) Avoids feature engineering through a sharing strategy of feature embedding

Experiment

• Achieves state-of-the-art results on AUC and Logloss

• The efficiency of DeepFM was also confirmed

Wrapping up

This post reviewed DeepFM, one of the recommender system papers.

I will be back with more recommender system posts.

I hope this helps.

Thank you.

License: Attribution, ShareAlike

Post title: Recommender system (recsys) DeepFM paper review - A Factorization Machine based Neural Network for CTR Prediction
