dineshpiyasamara / VectorMass

VectorMass vector database

VectorMass

Vector databases store vector embeddings and support fast retrieval, similarity search, and the usual CRUD operations. Put simply, an embedding is a numerical array whose many dimensions encode features of the original data. A vector database lets us work with that numerical representation directly.
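To make the CRUD side concrete, here is a minimal sketch of an in-memory store that maps ids to embedding vectors. The `TinyVectorStore` class and its method names are hypothetical and exist only for illustration; this is not VectorMass's API or its internal implementation.

```python
import numpy as np


class TinyVectorStore:
    """Toy in-memory store mapping ids to embeddings (hypothetical, illustration only)."""

    def __init__(self, dim):
        self.dim = dim
        self._vectors = {}

    def upsert(self, item_id, vector):
        # Create or update: validate the dimensionality, then store a float array.
        vec = np.asarray(vector, dtype=float)
        if vec.shape != (self.dim,):
            raise ValueError(f"expected shape ({self.dim},), got {vec.shape}")
        self._vectors[item_id] = vec

    def get(self, item_id):
        # Read: returns None when the id is not stored.
        return self._vectors.get(item_id)

    def delete(self, item_id):
        # Delete: returns True only if an entry was actually removed.
        return self._vectors.pop(item_id, None) is not None

    def __len__(self):
        return len(self._vectors)


store = TinyVectorStore(dim=3)
store.upsert("id1", [0.1, 0.9, 0.0])   # create
store.upsert("id2", [0.8, 0.1, 0.1])   # create
store.upsert("id1", [0.2, 0.8, 0.0])   # update an existing id
store.delete("id2")                    # delete
print(len(store), store.get("id1"))
```

A real vector database adds persistence and indexing on top of this idea, but the basic contract (ids in, vectors out) is the same.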

In traditional databases like MySQL, PostgreSQL, and SQL Server, we usually query for rows whose values exactly match our input. In a vector database, we instead apply a similarity metric to find the stored vectors that are most similar to our query vector. There are many dedicated vector databases, such as VectorMass, Pinecone, Qdrant, and Chroma DB.
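The difference from exact matching can be sketched with a brute-force search using cosine similarity, one common similarity metric. The function name `top_k_cosine` and the sample vectors are made up for this example, and nothing here describes which metric or index VectorMass uses internally.

```python
import numpy as np


def top_k_cosine(query, vectors, k=2):
    """Return indices and scores of the k rows of `vectors` most similar to `query`.

    Assumes every vector (and the query) has a non-zero norm.
    """
    vectors = np.asarray(vectors, dtype=float)
    query = np.asarray(query, dtype=float)
    # Scale everything to unit length so the dot product equals cosine similarity.
    unit_vectors = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    unit_query = query / np.linalg.norm(query)
    scores = unit_vectors @ unit_query
    # Sort by descending similarity and keep the first k positions.
    order = np.argsort(-scores)[:k]
    return order, scores[order]


# Three stored "embeddings"; none of them equals the query exactly.
vectors = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
])
query = np.array([1.0, 0.05, 0.0])

indices, scores = top_k_cosine(query, vectors, k=2)
print(indices, scores)
```

An exact-match lookup would return nothing for this query, while the similarity search still ranks the first two vectors ahead of the third. Dedicated vector databases typically replace this linear scan with an index so that queries stay fast on large collections.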

So, let’s see how to use the VectorMass vector database.

# install the VectorMass library (run this in a shell, not in Python)
pip install VectorMass

import VectorMass
import numpy as np

from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-mpnet-base-v2')

# Create a VectorStore instance
vector_store = VectorMass.Client()

# Define your sentences
sentences = [
    "I eat mango",
    "mango is my favorite fruit",
    "mango, apple, oranges are fruits",
    "fruits are good for health",
]
ids = ['id1', 'id2', 'id3', 'id4']

# create a collection
collection = vector_store.create_or_get_collection("test_collection")

# add ids and documents to the collection, along with the model used to embed them
collection.add(
    ids=ids,
    documents=sentences,
    embedding_model=model
)

# retrieve all data from the collection
# result = collection.get_all()
# print(result)

# querying: encode the query texts with the same model, then search the collection
res = model.encode(['healthy foods', 'I eat mango'])
result = collection.query(query_embeddings=res)
print(result)

Embeddings

Embeddings, in the context of machine learning and natural language processing (NLP), are numerical representations of words, sentences, or documents in a high-dimensional space. VectorMass uses Sentence Transformer embeddings by default. So far, it supports only embedding models available through Sentence Transformer.

Examples

Notebook 01: Open In Colab

License

Apache 2.0

Languages

Language:Python 100.0%