big-data-europe / docker-spark

Apache Spark docker image

Build scala application from the template

arruw opened this issue

I'm trying to build a Docker image from the Scala template (bde2020/spark-scala-template). I created a Dockerfile that extends the base image, as shown in the following image:

(image: Dockerfile extending bde2020/spark-scala-template)

After running docker build . I got the following output (truncated):

$ docker build .
Sending build context to Docker daemon  24.16MB
Step 1/2 : FROM bde2020/spark-scala-template
# Executing 5 build triggers
 ---> Using cache
 ---> Using cache
 ---> Using cache
 ---> Running in 40bc8e95c9dd
[info] Loading settings for project app-build from plugins.sbt ...
[info] Loading project definition from /app/project
[info] Updating ProjectRef(uri("file:/app/project/"), "app-build")...
[info] Done updating.
[warn] There may be incompatibilities among your library dependencies; run 'evicted' to see detailed eviction warnings.
[info] Loading settings for project app from build.sbt ...
[info] Set current project to SparkScalaTest (in build file:/app/)
[success] Total time: 0 s, completed Nov 10, 2019 5:52:38 PM
[info] Updating ...
[info] Done updating.
[warn] There may be incompatibilities among your library dependencies; run 'evicted' to see detailed eviction warnings.
[info] Compiling 1 Scala source to /app/target/scala-2.11/classes ...
[info] Non-compiled module 'compiler-bridge_2.11' for Scala 2.11.12. Compiling...
[info]   Compilation completed in 41.73s.
[info] Done compiling.
[error] 112 errors were encountered during merge
[error] java.lang.RuntimeException: deduplicate: different file contents found in the following:
[error] /root/.ivy2/cache/org.apache.arrow/arrow-vector/jars/arrow-vector-0.10.0.jar:git.properties
[error] /root/.ivy2/cache/org.apache.arrow/arrow-format/jars/arrow-format-0.10.0.jar:git.properties
[error] /root/.ivy2/cache/org.apache.arrow/arrow-memory/jars/arrow-memory-0.10.0.jar:git.properties
[error] deduplicate: different file contents found in the following:
[error] /root/.ivy2/cache/javax.inject/javax.inject/jars/javax.inject-1.jar:javax/inject/Inject.class
[error] /root/.ivy2/cache/org.glassfish.hk2.external/javax.inject/jars/javax.inject-2.4.0-b34.jar:javax/inject/Inject.class

... see attached file for more ...

Whole output file: docker-template-build.log

FYI: I'm new to Scala/Spark.

I am running into a similar issue. I have only 2 merge errors, but I could not find either of them in your log file. Have you resolved your issue since, @matjazmav?
My output:

[info] Loading settings for project app-build from plugins.sbt ...
[info] Loading project definition from /app/project
[info] Loading settings for project app from build.sbt ...
[info] Set current project to spark-processing (in build file:/app/)
[success] Total time: 0 s, completed Apr 7, 2020 7:52:58 PM
[info] Compiling 1 Scala source to /app/target/scala-2.12/classes ...
[info] Non-compiled module 'compiler-bridge_2.12' for Scala 2.12.11. Compiling...
[info]   Compilation completed in 6.496s.
[info] Done compiling.
[error] 2 errors were encountered during merge
[error] java.lang.RuntimeException: deduplicate: different file contents found in the following:
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/io/netty/netty-handler/4.1.45.Final/netty-handler-4.1.45.Final.jar:META-INF/io.netty.versions.properties
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/io/netty/netty-transport-native-epoll/4.1.45.Final/netty-transport-native-epoll-4.1.45.Final.jar:META-INF/io.netty.versions.properties
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/io/netty/netty-common/4.1.45.Final/netty-common-4.1.45.Final.jar:META-INF/io.netty.versions.properties
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/io/netty/netty-buffer/4.1.45.Final/netty-buffer-4.1.45.Final.jar:META-INF/io.netty.versions.properties
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/io/netty/netty-transport/4.1.45.Final/netty-transport-4.1.45.Final.jar:META-INF/io.netty.versions.properties
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/io/netty/netty-codec/4.1.45.Final/netty-codec-4.1.45.Final.jar:META-INF/io.netty.versions.properties
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/io/netty/netty-transport-native-unix-common/4.1.45.Final/netty-transport-native-unix-common-4.1.45.Final.jar:META-INF/io.netty.versions.properties
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/io/netty/netty-resolver/4.1.45.Final/netty-resolver-4.1.45.Final.jar:META-INF/io.netty.versions.properties
[error] deduplicate: different file contents found in the following:
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/com/fasterxml/jackson/core/jackson-databind/2.10.0/jackson-databind-2.10.0.jar:module-info.class
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/com/fasterxml/jackson/dataformat/jackson-dataformat-csv/2.10.0/jackson-dataformat-csv-2.10.0.jar:module-info.class
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/com/fasterxml/jackson/datatype/jackson-datatype-jdk8/2.10.0/jackson-datatype-jdk8-2.10.0.jar:module-info.class
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/com/fasterxml/jackson/core/jackson-annotations/2.10.0/jackson-annotations-2.10.0.jar:module-info.class
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/com/fasterxml/jackson/core/jackson-core/2.10.0/jackson-core-2.10.0.jar:module-info.class
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/com/fasterxml/jackson/module/jackson-module-paranamer/2.10.0/jackson-module-paranamer-2.10.0.jar:module-info.class
[error]         at sbtassembly.Assembly$.applyStrategies(Assembly.scala:143)
[error]         at sbtassembly.Assembly$.x$1$lzycompute$1(Assembly.scala:25)
[error]         at sbtassembly.Assembly$.x$1$1(Assembly.scala:23)
[error]         at sbtassembly.Assembly$.stratMapping$lzycompute$1(Assembly.scala:23)
[error]         at sbtassembly.Assembly$.stratMapping$1(Assembly.scala:23)
[error]         at sbtassembly.Assembly$.inputs$lzycompute$1(Assembly.scala:68)
[error]         at sbtassembly.Assembly$.inputs$1(Assembly.scala:58)
[error]         at sbtassembly.Assembly$.apply(Assembly.scala:85)
[error]         at sbtassembly.Assembly$.$anonfun$assemblyTask$1(Assembly.scala:244)
[error]         at scala.Function1.$anonfun$compose$1(Function1.scala:49)
[error]         at sbt.internal.util.$tilde$greater.$anonfun$$u2219$1(TypeFunctions.scala:62)
[error]         at sbt.std.Transform$$anon$4.work(Transform.scala:67)
[error]         at sbt.Execute.$anonfun$submit$2(Execute.scala:281)
[error]         at sbt.internal.util.ErrorHandling$.wideConvert(ErrorHandling.scala:19)
[error]         at sbt.Execute.work(Execute.scala:290)
[error]         at sbt.Execute.$anonfun$submit$1(Execute.scala:281)
[error]         at sbt.ConcurrentRestrictions$$anon$4.$anonfun$submitValid$1(ConcurrentRestrictions.scala:178)
[error]         at sbt.CompletionService$$anon$2.call(CompletionService.scala:37)
[error]         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[error]         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
[error]         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[error]         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[error]         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[error]         at java.lang.Thread.run(Thread.java:748)
[error] (assembly) deduplicate: different file contents found in the following:
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/io/netty/netty-handler/4.1.45.Final/netty-handler-4.1.45.Final.jar:META-INF/io.netty.versions.properties
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/io/netty/netty-transport-native-epoll/4.1.45.Final/netty-transport-native-epoll-4.1.45.Final.jar:META-INF/io.netty.versions.properties
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/io/netty/netty-common/4.1.45.Final/netty-common-4.1.45.Final.jar:META-INF/io.netty.versions.properties
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/io/netty/netty-buffer/4.1.45.Final/netty-buffer-4.1.45.Final.jar:META-INF/io.netty.versions.properties
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/io/netty/netty-transport/4.1.45.Final/netty-transport-4.1.45.Final.jar:META-INF/io.netty.versions.properties
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/io/netty/netty-codec/4.1.45.Final/netty-codec-4.1.45.Final.jar:META-INF/io.netty.versions.properties
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/io/netty/netty-transport-native-unix-common/4.1.45.Final/netty-transport-native-unix-common-4.1.45.Final.jar:META-INF/io.netty.versions.properties
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/io/netty/netty-resolver/4.1.45.Final/netty-resolver-4.1.45.Final.jar:META-INF/io.netty.versions.properties
[error] deduplicate: different file contents found in the following:
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/com/fasterxml/jackson/core/jackson-databind/2.10.0/jackson-databind-2.10.0.jar:module-info.class
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/com/fasterxml/jackson/dataformat/jackson-dataformat-csv/2.10.0/jackson-dataformat-csv-2.10.0.jar:module-info.class
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/com/fasterxml/jackson/datatype/jackson-datatype-jdk8/2.10.0/jackson-datatype-jdk8-2.10.0.jar:module-info.class
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/com/fasterxml/jackson/core/jackson-annotations/2.10.0/jackson-annotations-2.10.0.jar:module-info.class
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/com/fasterxml/jackson/core/jackson-core/2.10.0/jackson-core-2.10.0.jar:module-info.class
[error] /root/.cache/coursier/v1/https/repo1.maven.org/maven2/com/fasterxml/jackson/module/jackson-module-paranamer/2.10.0/jackson-module-paranamer-2.10.0.jar:module-info.class

Never mind, I just had to dig into the sbt-assembly docs.
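
For anyone landing here with the same two conflicts as in the log above (META-INF/io.netty.versions.properties across the netty jars, and module-info.class across the jackson jars), this is the kind of case sbt-assembly's merge strategies cover. A minimal sketch for build.sbt, assuming sbt-assembly 0.15.x and its `in`-style syntax. The strategy chosen for each file is an assumption on my part, not something confirmed in this thread:

```scala
// build.sbt (sketch): resolve only the two conflicts reported in the log,
// and leave every other path to sbt-assembly's default strategy.
assemblyMergeStrategy in assembly := {
  // module-info.class sits at the root of each jackson jar.
  // Assumption: the fat jar runs on the classpath, so the descriptor can be dropped.
  case "module-info.class" => MergeStrategy.discard
  // Every netty jar ships its own copy of this version listing.
  // Assumption: keeping the first copy is acceptable for your application.
  case PathList("META-INF", "io.netty.versions.properties") => MergeStrategy.first
  // Anything else: delegate to the previously configured (default) strategy.
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
```

Delegating the fallback case to the default strategy, rather than ending in a catch-all, means any new conflict still fails the build instead of being silently resolved.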

You could try my Scala example, which worked:
https://github.com/peterkovgan/docker-spark-ex
Read the README too.

I added this to build.sbt:

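assemblyMergeStrategy in assembly := {
  // Merge service-loader registrations, keeping each distinct line once
  case PathList("META-INF", "services", xs @ _*) => MergeStrategy.filterDistinctLines
  // Drop every other META-INF entry
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  // Concatenate application.conf files from all jars
  case "application.conf" => MergeStrategy.concat
  // For any other duplicate, keep the first copy found
  case _ => MergeStrategy.first
}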

You also need to create project/plugins.sbt and add this to it:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.15.0")
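
On newer sbt releases the `key in scope` form is deprecated in favor of slash syntax. A sketch of the same build.sbt setting in that form, assuming sbt 1.1 or later (where slash syntax is available). The strategies are the same as in the build.sbt snippet above, only the scoping notation differs:

```scala
// build.sbt (sketch): the same merge rules, scoped with sbt's slash syntax.
assembly / assemblyMergeStrategy := {
  case PathList("META-INF", "services", xs @ _*) => MergeStrategy.filterDistinctLines
  case PathList("META-INF", xs @ _*)             => MergeStrategy.discard
  case "application.conf"                        => MergeStrategy.concat
  case _                                         => MergeStrategy.first
}
```

The order of the cases matters in both forms: the META-INF/services case has to come before the general META-INF case, otherwise the service registrations would be discarded along with everything else in META-INF.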