jelmerk / hnswlib-spark

TypeError: 'JavaPackage' object is not callable

mdrijwan123 opened this issue

Hey, thank you for this Spark integration.
I am running into an issue when trying to run this on Databricks: I get the error above.
I am installing the package and using the module as follows:

%pip install pyspark-hnsw==1.1.0 findspark
hnsw = HnswSimilarity(identifierCol='product_id', queryIdentifierCol='user_id', featuresCol='Embedding_items',
                      distanceFunction='cosine', numPartitions=10, excludeSelf=True, k = 10)

Running the above gives the following error:
TypeError: 'JavaPackage' object is not callable

I am using Spark 3.4.1 on DBR 13.3 LTS.
The same code runs fine in Colab. Please let me know if you have any information on a fix. Thank you.

That error usually means the library's JVM classes are not on the cluster's classpath. Check which versions of Scala and Spark your Databricks runtime uses.

Then add the matching build of hnswlib-spark to your cluster libraries.

For Spark 3.4.x and Scala 2.12, you would use com.github.jelmerk:hnswlib-spark_3_4_2.12:1.1.1
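If it helps, here is a small sketch of how that coordinate is put together from the two versions. The naming pattern is only inferred from the single coordinate above, so treat it as an assumption and confirm that the artifact actually exists for your combination before relying on it. The helper name hnswlib_spark_coordinate and the Scala patch version 2.12.15 are made up for illustration.

```python
def hnswlib_spark_coordinate(spark_version: str, scala_version: str, library_version: str) -> str:
    """Build a Maven coordinate like com.github.jelmerk:hnswlib-spark_3_4_2.12:1.1.1.

    Assumption: the artifact name follows hnswlib-spark_<sparkMajor>_<sparkMinor>_<scalaBinary>,
    as inferred from the one coordinate quoted in this thread.
    """
    # Keep only major and minor of the Spark version: "3.4.1" -> "3", "4"
    spark_major, spark_minor = spark_version.split(".")[:2]
    # Keep only the Scala binary version: "2.12.15" -> "2.12"
    scala_binary = ".".join(scala_version.split(".")[:2])
    artifact = f"hnswlib-spark_{spark_major}_{spark_minor}_{scala_binary}"
    return f"com.github.jelmerk:{artifact}:{library_version}"


# Illustrative values: Spark 3.4.1 as reported above, a hypothetical Scala 2.12.15
print(hnswlib_spark_coordinate("3.4.1", "2.12.15", "1.1.1"))
# com.github.jelmerk:hnswlib-spark_3_4_2.12:1.1.1
```

In a notebook, spark.version returns the Spark version string you can pass in as the first argument. The Scala version is listed with the runtime details.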

You don't have to pip install the module on DBR.

After that, it should work.

Awesome, thank you so much.