Clothes similarity search returns ranked recommendations from a product database based on a text description of a clothing item. This repository contains the source code for:
- Data Scraper
- Sentence Encoder
- Similarity Function
The data scraper collects product data from the H&M website.
The scraped data is then feature-engineered and preprocessed to generate a wordSoup for representational encoding, which is stored in the column cleaned_text.
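The preprocessing step can be pictured roughly as follows. This is only a minimal sketch: the field names in TEXT_FIELDS and the helper build_word_soup are hypothetical, and the repository's actual feature engineering may combine different columns or apply further cleaning.

```python
import re

# Hypothetical field names; the repository's actual scraped columns may differ.
TEXT_FIELDS = ("name", "description", "material", "colour")


def build_word_soup(product: dict) -> str:
    """Join a product's text fields into one lowercase, punctuation-free string."""
    parts = [str(product.get(field, "")) for field in TEXT_FIELDS]
    text = " ".join(parts).lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # replace punctuation and symbols with spaces
    return re.sub(r"\s+", " ", text).strip()  # collapse repeated whitespace


# Made-up example record, not taken from the scraped data.
product = {
    "name": "Cotton Bucket Hat",
    "description": "Hat in soft, woven cotton.",
    "material": "Cotton 100%",
    "colour": "White",
}
product["cleaned_text"] = build_word_soup(product)
print(product["cleaned_text"])
```

Keeping all text in a single cleaned_text field means the encoder only has to deal with one string per product.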
The encoder uses TfidfVectorizer to generate a sentence embedding of size 1 x MAX_FEATURES (default = 1000) for every product description.
The similarity function uses scikit-learn's cosine similarity to score the query description against every product in the database and returns the top n ranked results to the user.
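To illustrate how TF-IDF encoding and cosine-similarity ranking fit together, here is a dependency-free sketch in plain Python. It is not the repository's code, which relies on scikit-learn; the function names, the tiny catalogue, and the simple whitespace tokenisation are all assumptions made for the example.

```python
import math
from collections import Counter

MAX_FEATURES = 1000  # vocabulary cap, mirroring the default mentioned above


def fit_idf(docs, max_features=MAX_FEATURES):
    """Keep the most frequent terms and compute a smoothed IDF weight for each."""
    term_counts = Counter(term for doc in docs for term in doc.split())
    vocab = [term for term, _ in term_counts.most_common(max_features)]
    doc_freq = Counter(term for doc in docs for term in set(doc.split()))
    n_docs = len(docs)
    return {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1 for t in vocab}


def encode(text, idf):
    """Return a sparse, L2-normalised TF-IDF vector as a term -> weight dict."""
    counts = Counter(t for t in text.split() if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    """For unit-length vectors the dot product equals the cosine similarity."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def rank(query, docs, limit=10):
    """Score every document against the query and return the top `limit`."""
    idf = fit_idf(docs)
    query_vec = encode(query, idf)
    scored = [(cosine(query_vec, encode(doc, idf)), i) for i, doc in enumerate(docs)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # highest score first
    return [(docs[i], score) for score, i in scored[:limit]]


# Made-up catalogue standing in for the cleaned_text column.
catalogue = [
    "white cotton bucket hat",
    "black leather ankle boots",
    "white cotton t shirt",
    "blue denim jacket",
]
results = rank("a white hat made of cotton", catalogue, limit=2)
for text, score in results:
    print(f"{score:.3f}  {text}")
```

Query words that are not in the fitted vocabulary (here "a", "made", "of") are simply ignored, so only the terms shared with the catalogue contribute to the score.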
The similarity function can be accessed by sending a POST request to the deployed endpoint:
Node.js
const axios = require('axios');

// Request body: the query description and the number of results to return.
const data = JSON.stringify({
  "description": "a white hat made of cotton",
  "limit": 5
});

const config = {
  method: 'post',
  maxBodyLength: Infinity,
  url: 'https://asia-south1-clothes-similarity.cloudfunctions.net/clothes-similarity-noauth',
  headers: {
    'Content-Type': 'application/json'
  },
  data: data
};

// Send the request and print the JSON response, or the error if it fails.
axios.request(config)
  .then((response) => {
    console.log(JSON.stringify(response.data));
  })
  .catch((error) => {
    console.log(error);
  });
The similarity function takes two parameters:
- description (required): description of the clothing article
- limit (optional): number of clothing articles to return, default = 10.
It returns a JSON object in the following format:
{
  "similarItems": ["list of product descriptions(in json)"]
}
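For Python users, the same call could look roughly like the sketch below, which uses only the standard library. The helper names build_request and find_similar are hypothetical, and the exact shape of each entry in similarItems is not specified here, so the code returns the list as-is.

```python
import json
import urllib.request

ENDPOINT = "https://asia-south1-clothes-similarity.cloudfunctions.net/clothes-similarity-noauth"


def build_request(description, limit=None):
    """Build the POST request; when `limit` is None it is left out of the body."""
    payload = {"description": description}
    if limit is not None:
        payload["limit"] = limit
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def find_similar(description, limit=None, timeout=30):
    """Send the request and return the `similarItems` list from the response."""
    request = build_request(description, limit)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body.get("similarItems", [])


# Inspect the request without sending it.
request = build_request("a white hat made of cotton", limit=5)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To call the live endpoint:
# items = find_similar("a white hat made of cotton", limit=5)
```

Leaving limit out of the request body lets the service apply its documented default of 10.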
To use the utility functions and scripts, such as the scraper and the encoders, clone this repository to your local environment:
git clone https://github.com/Tushar-K24/clothes-similarity-search.git
After creating a virtual environment in the cloned repository, install the dependencies with:
pip install -r requirements-script.txt
A simple interactive user interface for the same function can be accessed from this link.