LogicalSpark / docker-tikaserver

Apache Tika Server as a Docker Image

Home Page: http://logicalspark.github.io/docker-tikaserver

Release 1.22 Tika server to dockerhub

chibenwa opened this issue

I saw several vulnerabilities affecting the currently shipped version of Tika, all of them resolved in version 1.22.

Namely:

  • [CVE-2019-10088] OOM from a crafted Zip File in Apache Tika's RecursiveParserWrapper
  • [CVE-2019-10093] Denial of Service in Apache Tika's 2003ml and 2006ml Parsers
  • [CVE-2019-10094] StackOverflow from Crafted Package/Compressed Files in Apache Tika's RecursiveParserWrapper

When do you think the docker image will be available on dockerhub?

Well, not really.

I don't want to rely on the latest image, so I would like the 1.22 tag to be explicit.

+1 on this. Any way I can help? Also tagging @dameikle

@dameikle would it help if we try to come up with a GitHub Action that could build the image and push it to the Docker hub whenever a release is tagged in this repo?

Hi Folks. Sorry I've been silent on this for a while, it's been a busy time offline. I agree with you @epugh, it would be good to do this from the official Apache Tika project and retire this altogether. Let me take a look at the PR @mpdude has kindly provided and see if we can incorporate some of this into an official Tika one. I can do a parallel inclusion here and there until we're ready to retire this.

As far as I am aware, a Docker container ships content that is not Apache V2 licensed, so delivering an already-built container as an Apache project release is controversial, and there were ongoing discussions on this topic at ApacheCon Berlin. (However, shipping a Dockerfile is fine.)

It's similar in concept to a convenience binary build, it needs to be called out explicitly and made clear that it is not the official distribution (that is always the source). You'll see a few projects using this approach now with the blessing of INFRA.

I've started a repo here:
https://github.com/apache/tika-docker

I've requested Travis-CI build and DockerHub access, so hopefully I can wire this up to make publishing an image part of the release ceremony.

Will port some of the improvements back here, and acknowledge those who provided PRs over there once I complete the README.

Let us know if we can help.

As mentioned elsewhere, I've started the work to move this over to the ASF (I'm a Tika PMC member).

It's starting to take shape, with historical images published here:
https://hub.docker.com/r/apache/tika

I've got some more tweaks to do to update the Git repos, and then I will start pointing users from this repo to there.