USCDataScience / sparkler

Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.

Home Page: http://irds.usc.edu/sparkler/

Debugging Elasticsearch Connection

Kefaun2601 opened this issue

Task Description

Most of the Elasticsearch implementation has already been written. There are still two major problems that need to be resolved:

  1. ElasticsearchResultIterator needs to implement deserialize(). The blocker is creating an instance of a generic type at runtime.
  2. Debug data persistence to confirm that the data in Elasticsearch is actually being updated.
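
For the first problem, one common workaround for type erasure is to pass a Class<T> token into the iterator and instantiate the result through reflection. The sketch below illustrates that idea only; it is not the Sparkler implementation. CrawlRecord, SourceDeserializer, and the field names are hypothetical, and the input is assumed to be a plain Map<String, Object> such as the source map of a search hit. In Scala code, an implicit ClassTag can play the same role as the Class<T> token.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.util.Map;

// Hypothetical document type; the real Sparkler resource class will differ.
class CrawlRecord {
    String id;
    String url;
    Integer fetchDepth;

    CrawlRecord() { } // no-arg constructor needed by the reflective factory below
}

// Hypothetical stand-in for the deserialize() step of a result iterator.
class SourceDeserializer<T> {
    // `new T()` is illegal because T is erased at runtime,
    // so the concrete class is carried explicitly as a token.
    private final Class<T> type;

    SourceDeserializer(Class<T> type) {
        this.type = type;
    }

    // Builds a T and copies map entries into same-named declared fields.
    T deserialize(Map<String, Object> source) {
        try {
            Constructor<T> ctor = type.getDeclaredConstructor();
            ctor.setAccessible(true);
            T instance = ctor.newInstance();
            for (Map.Entry<String, Object> entry : source.entrySet()) {
                Field field;
                try {
                    field = type.getDeclaredField(entry.getKey());
                } catch (NoSuchFieldException ignored) {
                    continue; // skip keys the target type does not declare
                }
                field.setAccessible(true);
                field.set(instance, entry.getValue());
            }
            return instance;
        } catch (ReflectiveOperationException ex) {
            throw new IllegalStateException(
                "Cannot deserialize into " + type.getName(), ex);
        }
    }
}
```

A caller would construct it as new SourceDeserializer<>(CrawlRecord.class) and call deserialize(sourceMap) once per hit. This sketch does no type conversion, so a map value whose type does not match the target field raises an IllegalArgumentException; a JSON mapping library would normally handle that step.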

Updates will be posted as progress is made.

Related PR
#225