Kaggle-Foursquare
Here is my solution for Kaggle Foursquare Location Matching.
This solution received a Silver Medal without any ensembling or complicated feature engineering.
The pipeline is simple:
- Train XLM-RoBERTa with ArcFace loss
- Use the cosine similarities from the XLM-RoBERTa embeddings plus coordinate distance to extract match candidates
- Add features for each candidate pair (cosine similarity, distance, LCS, TF-IDF, etc…)
- Train a LightGBM model (hyperparameters tuned with FLAML) to select the correct candidates, framed as a binary classification task
- Repeat steps 2-3 (candidate extraction and feature generation) on the test data and run inference with the trained LightGBM model
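
The candidate extraction and feature steps can be sketched roughly as follows. This is a minimal illustration, not the code from this repository: the function names, the record fields (`id`, `name`, `lat`, `lon`, `emb`), the `top_k` and `max_km` thresholds, and the toy embeddings are all hypothetical, and "lcs" is assumed here to mean longest common subsequence. In the real pipeline the embeddings would come from the trained XLM-RoBERTa model and the resulting rows would feed the LightGBM classifier.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(h))


def lcs_length(s, t):
    """Length of the longest common subsequence of two strings (assumed meaning of 'lcs')."""
    prev = [0] * (len(t) + 1)
    for ch in s:
        cur = [0]
        for j, ct in enumerate(t, 1):
            cur.append(prev[j - 1] + 1 if ch == ct else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def candidate_pairs(places, top_k=2, max_km=1.0):
    """For each place, keep the top_k most similar places within max_km
    and attach simple pair features. Thresholds are illustrative only."""
    rows = []
    for i, p in enumerate(places):
        scored = []
        for j, q in enumerate(places):
            if i == j:
                continue
            dist = haversine_km(p["lat"], p["lon"], q["lat"], q["lon"])
            if dist > max_km:
                continue
            scored.append((cosine_similarity(p["emb"], q["emb"]), j, dist))
        scored.sort(reverse=True)  # highest cosine similarity first
        for sim, j, dist in scored[:top_k]:
            q = places[j]
            rows.append({
                "id_1": p["id"],
                "id_2": q["id"],
                "cos_sim": sim,
                "dist_km": dist,
                "lcs": lcs_length(p["name"], q["name"]),
            })
    return rows


# Toy records with made-up 3-dimensional "embeddings".
places = [
    {"id": "a", "name": "Starbucks Coffee", "lat": 40.7580, "lon": -73.9855, "emb": [0.9, 0.1, 0.0]},
    {"id": "b", "name": "Starbucks", "lat": 40.7581, "lon": -73.9856, "emb": [0.8, 0.2, 0.1]},
    {"id": "c", "name": "Joe's Pizza", "lat": 40.7306, "lon": -74.0021, "emb": [0.0, 0.1, 0.9]},
]

rows = candidate_pairs(places)
for row in rows:
    print(row)
```

Each row is one candidate pair. With match labels attached on the training data, rows like these form the binary classification table for the gradient-boosted model. The brute-force double loop is only for readability; at competition scale a nearest-neighbour index would be the more realistic choice.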