vndee / visee

Just a typical search engine in this universe :fire::fire::fire:

VISEE is a system that combines full-text search with visual (image-based) search. It focuses on Vietnamese e-commerce products collected from Tiki, Lazada, Shopee, and Sendo. VISEE is fully dockerized.

Installation

Requirements

  • docker, docker-compose, nvidia-docker

Quickstart

To run all containers and services:

./dev.sh up

Stop all services:

./dev.sh down

Alternatively, you can use docker-compose commands directly. While a service is running, its code is mounted directly from the host machine into the Docker container, so you only need to edit the code and restart the container to see your changes.

Configurations

The following environment variables can be used to configure VISEE. All variables are defined in .env.

| Variable | Description | Default value |
|---|---|---|
| API_KEY | Authorization key for REST API | h$+wt&%3BtH*6rA^KfPzMKDm**GdH_wQaQebd&X9!h=nNVjrt+pn8GNB5%-_ug-U |
| API_HOST | REST API host binding (docker internal network) | 0.0.0.0 |
| API_PORT | REST API port binding (docker internal network) | 7070 |
| KAFKA_HOSTS | Kafka hosts | [visee_kafka:9092] |
| KAFKA_USER | Kafka user | None |
| KAFKA_PASSWORD | Kafka password | None |
| KAFKA_NUM_PARTITION | Kafka number of partitions | 10 |
| KAFKA_LINK_TOPIC | Kafka topic for links scraper | Link item |
| KAFKA_CONSUMER_GROUP | Kafka consumer group | default |
| REDIS_HOST | Redis host (docker internal network) | visee_redis |
| REDIS_PASSWORD | Redis password | None |
| REDIS_CATEGORIES_DB | Redis database for website categories | 0 |
| REDIS_LINK2SCRAPE_DB | Redis database for links to scrape | 1 |
| REDIS_DB_IDX_FIRST | Redis first database for DualRedisConnector | 2 |
| REDIS_DB_IDX_SECOND | Redis second database for DualRedisConnector | 3 |
| MILVUS_HOST | Milvus host (docker internal network) | visee_milvus |
| MILVUS_PORT | Milvus port | 19530 |
| MILVUS_TABLE_NAME | Milvus table name | visee |
| ELASTIC_HOSTS | Elasticsearch hosts (docker internal network) | [visee_elasticsearch] |
| ELASTIC_PORT | Elasticsearch port | 9200 |
| ELASTIC_USER | Elasticsearch username | elastic |
| ELASTIC_PASSWORD | Elasticsearch password | changeme |
| ELASTIC_INDEX | Elasticsearch index | visee |
| EFFNET_WEIGHT | EfficientNet weights path (in container) | /visee/static/eff_b7.pth |
| CHROME_DRIVER_PATH | Path to chrome driver (in container) | /visee/static/chromedriver |
| IMAGE_SIZE | Size of downloaded images | 1000 |
| DOWNLOAD_IMAGE | Whether to download images | True |
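
As an illustration of how a service might consume the variables listed above, here is a minimal Python sketch that reads a few of them from the environment and falls back to the documented defaults. How VISEE itself parses these values is not shown in this README, so the names DEFAULTS, parse_value, and load_config are hypothetical, and the parsing rules (treating "None", "True"/"False", bracketed lists, and digit strings specially) are assumptions made for this example.

```python
import os

# Subset of the documented defaults, kept as raw strings as they
# would appear in .env. This dictionary is illustrative only.
DEFAULTS = {
    "API_HOST": "0.0.0.0",
    "API_PORT": "7070",
    "KAFKA_HOSTS": "[visee_kafka:9092]",
    "KAFKA_USER": "None",
    "REDIS_HOST": "visee_redis",
    "MILVUS_PORT": "19530",
    "ELASTIC_HOSTS": "[visee_elasticsearch]",
    "ELASTIC_PORT": "9200",
    "IMAGE_SIZE": "1000",
    "DOWNLOAD_IMAGE": "True",
}


def parse_value(raw):
    """Convert a raw string to None, bool, list, int, or str (assumed rules)."""
    text = raw.strip()
    if text == "None":
        return None
    if text in ("True", "False"):
        return text == "True"
    if text.startswith("[") and text.endswith("]"):
        # "[a, b]" -> ["a", "b"]; empty items are dropped.
        return [item.strip() for item in text[1:-1].split(",") if item.strip()]
    if text.isdigit():
        return int(text)
    return text


def load_config(environ=None):
    """Build a config dict from the environment, using DEFAULTS as fallback."""
    environ = os.environ if environ is None else environ
    return {
        name: parse_value(environ.get(name, default))
        for name, default in DEFAULTS.items()
    }


if __name__ == "__main__":
    config = load_config()
    print(config["KAFKA_HOSTS"], config["API_PORT"])
```

Passing a plain dictionary to load_config instead of relying on os.environ makes the fallback behaviour easy to check without touching the real environment.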

Libraries and frameworks

  • Crawler: Selenium, BeautifulSoup, Apache Kafka, Redis.
  • Indexer: PyTorch, Apache Kafka, Redis.
  • Search Engine: Elasticsearch, Milvus.
  • RESTful Services: Flask, Nginx, Gunicorn.
  • User Interface: NodeJS, Nginx, HTML + CSS + JS.
  • Logging System: ELK+ Stack (Elasticsearch, Logstash, Kibana, Beats).

System Architecture and Technical Stack

Developers: Duy V. Huynh, Hoang N. Truong, Linh Q. Tran

Languages

C++ 55.5%, Python 20.4%, JavaScript 14.3%, HTML 3.5%, Shell 2.5%, CSS 2.1%, Dockerfile 1.2%, CMake 0.5%, Go 0.1%