[AGE-287] Add semantic similarity evaluator

Question

mmabrouk opened this issue 4 months ago

Issue from user:

I would like to add semantic similarity as an evaluator. Here is a candidate implementation:

    import logging

    import pandas as pd
    from sentence_transformers import SentenceTransformer
    from sentence_transformers.util import pytorch_cos_sim


    def semantic_similarity(row: pd.Series, expected: str, response_column_name: str = "response") -> float:
        # Check kept from the original: a length-1 `expected` is treated as suspicious input.
        if len(expected) == 1:
            logging.warning("Expected should be a list of strings. You may have passed in a single string")

        doc1 = expected
        doc2 = row[response_column_name]
        # The model is loaded on every call; cache it if this runs once per row.
        model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
        embedding_1 = model.encode(doc1, convert_to_tensor=True)
        embedding_2 = model.encode(doc2, convert_to_tensor=True)

        # Cosine similarity of the two embeddings, returned as a plain float.
        return pytorch_cos_sim(embedding_1, embedding_2).item()

Notes: We can implement this to use the Hugging Face API if an API key is provided, and to run the model locally if it is not.
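
One way the backend choice in the note could look is sketched below. This is only an illustration, not an agreed design: `select_embedder`, `cosine_similarity`, and the `Embedder` type are hypothetical names, and the two embedders are assumed to be callables that turn a string into a vector (one wrapping a hosted API, one wrapping a local model). The sketch uses only the standard library so the selection logic can be checked without downloading a model.

```python
import math
from typing import Callable, Optional, Sequence

# Hypothetical type: any callable mapping text to an embedding vector.
Embedder = Callable[[str], Sequence[float]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_embedder(
    api_embedder: Embedder,
    local_embedder: Embedder,
    api_key: Optional[str],
) -> Embedder:
    """Pick the hosted embedder when an API key is given, otherwise the local one."""
    # An empty string counts as "no key provided".
    return api_embedder if api_key else local_embedder


def semantic_similarity(expected: str, response: str, embed: Embedder) -> float:
    """Embed both texts with the chosen backend and compare them."""
    return cosine_similarity(embed(expected), embed(response))
```

In the evaluator, the local embedder could wrap the `SentenceTransformer` model from the candidate code, and the key could be read from configuration or an environment variable before calling `select_embedder`.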

From SyncLinear.com | AGE-287