yungtiec / poc-reghub

Config

Elasticsearch

Jupyter notebook

Getting Started

To start an Elasticsearch server on your local machine, cd to the unzipped Elasticsearch directory and run:

bin/elasticsearch

To use the Kibana interface, cd to the unzipped Kibana directory and run:

bin/kibana

Kibana is available at localhost:5601. Kibana needs indexed data to work with; if you have none yet, it offers sample data sets you can load.

Alternatively, you can index your own PDF document. To do so, install Anaconda and run the following command in the poc-reghub directory:

jupyter notebook

The Jupyter Notebook interface will be available at localhost:8888 by default.

Open tika.ipynb and run all of its cells to index a PDF document pulled from the RegHub Airtable database.
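The notebook's code is not reproduced in this README, so the following is only a minimal sketch of what the indexing step might look like, assuming the PDF text has already been extracted (the notebook's name suggests Apache Tika does that part). The index name poc-reghub and the field name text are taken from the queries later in this README; the host http://localhost:9200 (Elasticsearch's default port, with security disabled), the document id, and the helper names are assumptions for illustration.

```python
import json
import urllib.parse
import urllib.request

ES_HOST = "http://localhost:9200"  # assumption: local node on the default port, no auth
INDEX = "poc-reghub"               # index name used by the queries in this README


def build_index_request(doc_id, text, host=ES_HOST, index=INDEX):
    """Build (but do not send) a PUT request that stores `text` in the `text` field."""
    safe_id = urllib.parse.quote(str(doc_id), safe="")  # keep the id URL-safe
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{host}/{index}/_doc/{safe_id}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def index_document(doc_id, text):
    """Send the request and return the parsed response; needs a running server."""
    with urllib.request.urlopen(build_index_request(doc_id, text)) as resp:
        return json.loads(resp.read())
```

With Elasticsearch running, a call such as index_document("example-doc", extracted_text) would create or overwrite that document, where both the id and the extracted_text variable are hypothetical.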

You can check in Kibana whether the document was successfully added to Elasticsearch: the index should be listed under Index Management.

[Screenshot: Kibana Index Management]
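The same check can also be made without Kibana by asking Elasticsearch how many documents the index holds. This is a sketch under the assumption that the server listens on http://localhost:9200 without authentication and that the index is named poc-reghub, as in the queries below; the helper names are illustrative.

```python
import json
import urllib.request

ES_HOST = "http://localhost:9200"  # assumption: default local port, no auth
INDEX = "poc-reghub"               # index name used by the queries in this README


def count_url(host=ES_HOST, index=INDEX):
    """URL of the _count endpoint for the index."""
    return f"{host}/{index}/_count"


def parse_count(raw):
    """Extract the document count from a _count response body."""
    return json.loads(raw)["count"]


def count_documents():
    """Ask the running server for the number of indexed documents."""
    with urllib.request.urlopen(count_url()) as resp:
        return parse_count(resp.read())
```

A result of 1 or more from count_documents() suggests the notebook run succeeded. If the index does not exist, the server answers with a 404 and urlopen raises urllib.error.HTTPError.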

Writing queries

Kibana has a Dev Tools feature that lets you query your search index.

[Screenshot: Kibana Dev Tools]

If you've successfully indexed a PDF document in tika.ipynb, you can try the following queries:

GET /poc-reghub/_search
{
  "query": {
    "match": {
      "text": "another jurisdiction"
    }
  }
}

GET /poc-reghub/_search
{
  "query": {
    "match": {
      "text": "Japan VASP"
    }
  }
}
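The queries above can also be sent from Python, for example from a notebook cell. The sketch below assumes a local server at http://localhost:9200 without authentication; the helper names are illustrative. It writes the match query in its expanded form so the operator can be set explicitly: by default a match query combines the analyzed terms with OR, so "Japan VASP" matches documents that contain either term, while "and" requires both.

```python
import json
import urllib.request

ES_HOST = "http://localhost:9200"  # assumption: default local port, no auth
INDEX = "poc-reghub"


def match_query(text, operator="or"):
    """Query body for a full-text match on the `text` field."""
    return {"query": {"match": {"text": {"query": text, "operator": operator}}}}


def build_search_request(body, host=ES_HOST, index=INDEX):
    """Build (but do not send) a _search request carrying the query body."""
    return urllib.request.Request(
        url=f"{host}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize_hits(response):
    """Return (document id, relevance score) pairs from a search response."""
    return [(hit["_id"], hit["_score"]) for hit in response["hits"]["hits"]]


def search(text, operator="or"):
    """Run the query against the running server and summarize the hits."""
    request = build_search_request(match_query(text, operator))
    with urllib.request.urlopen(request) as resp:
        return summarize_hits(json.loads(resp.read()))
```

For instance, search("Japan VASP", operator="and") would return only documents whose text field contains both terms, ordered by relevance score.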
