big-data-europe / docker-spark

Apache Spark docker image

How to connect via ODBC to Spark?

felixbaron opened this issue

Expected Behavior
I want to connect to the docker-spark container via Windows ODBC.

Actual Behavior
I receive an SSL error.

Steps to Reproduce the Problem

  • Download and install SIMBA Spark ODBC driver, e.g. from Cloudera
  • Add new DSN
  • Choose Simba Spark
  • Edit connection
  • I see an error

Version:

  • I am using the image version with Spark 3.0.0 for Hadoop 3.2, OpenJDK 8, and Scala 2.12
  • To start docker-spark I ran:
git clone https://github.com/big-data-europe/docker-spark
cd docker-spark
docker-compose up

Platform:

  • Windows 10

@GezimSejdiu are you the Windows/ODBC master here ;-)?

I think you need to start the Thrift Server to connect with ODBC. This particular image does not do that by default. I was also trying to connect to Spark through JDBC from outside the container.

From within the master node:

cd /spark/bin && /spark/sbin/../bin/spark-class org.apache.spark.deploy.SparkSubmit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 spark-internal

After that, the connection might work.
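
Before editing the DSN again, it may help to confirm that the Thrift Server is reachable from the Windows host at all. Below is a minimal sketch in Python. It assumes the server listens on the default HiveServer2 Thrift port 10000 and that this port is published to the host (for example with a "10000:10000" port mapping on the master service); neither is stated in this thread, so adjust both to your setup. The helper name port_is_open is only illustrative.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumptions: default Thrift port 10000, published to the Docker host.
    host, port = "localhost", 10000
    state = "reachable" if port_is_open(host, port) else "not reachable"
    print(f"{host}:{port} is {state}")
```

If the port is reachable but the driver still reports an SSL error, the SSL option in the DSN is worth checking: as far as I know, a Thrift Server started this way does not use SSL unless it is configured to.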