californianseabass / webparse

Installation
pyenv global 3.4.5
python -m venv myenv
source myenv/bin/activate
pip install -r requirements.txt
Setting up a local elasticsearch instance

[Download and install the latest elasticsearch][elasticsearch-installation] and add elasticsearch to your path. To start your elasticsearch cluster, run:

elasticsearch

Create the pages index:

See [Creating a new index.][create index] Add the index that stores the URL pages:

curl -XPUT 'http://localhost:9200/pages?pretty' -H 'Content-Type: application/json' -d '{ "settings" : { "index" : { "number_of_shards" : 3, "number_of_replicas" : 2 } } }'
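
The same request can also be issued from Python. The following is a minimal sketch using only the standard library, not code from this repository. It assumes Elasticsearch is listening on the default `http://localhost:9200`; the names `ES_URL`, `build_create_index_request`, and `send` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.request

# Assumption: a local Elasticsearch node on the default port.
ES_URL = "http://localhost:9200"


def build_create_index_request(index="pages", shards=3, replicas=2):
    """Build (but do not send) the PUT request that creates the index."""
    body = {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        }
    }
    return urllib.request.Request(
        url="{}/{}?pretty".format(ES_URL, index),
        data=json.dumps(body).encode("utf-8"),
        # Recent Elasticsearch versions reject bodies without a JSON content type.
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def send(request):
    """Send a prepared request and return the response text.

    Requires a running cluster; raises urllib.error.URLError otherwise.
    """
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the cluster running, `print(send(build_create_index_request()))` should print the JSON acknowledgement returned by Elasticsearch.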

Connecting to postgres via psql

psql -d webparse -U webparse_user -h localhost

Checking Elasticsearch

You can inspect the indices from kibana. Start it with:

kibana

After starting, navigate to [web ui][web ui].

  • list indices:
GET /_cat/indices?v
  • delete the pages index:
DELETE /pages?pretty
  • read everything in the index (q=*:*):
GET pages/_search?pretty=true&q=*:*
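
If kibana is not available, the same three checks can be sent straight to the Elasticsearch HTTP API. This is a rough sketch with the Python standard library, assuming the node runs at `http://localhost:9200` and the index is named `pages` as created earlier; the function names are hypothetical and not part of this repository.

```python
import urllib.parse
import urllib.request

# Assumption: a local Elasticsearch node on the default port.
ES_URL = "http://localhost:9200"


def list_indices_request():
    """GET /_cat/indices?v : list all indices with column headers."""
    return urllib.request.Request(ES_URL + "/_cat/indices?v", method="GET")


def delete_index_request(index="pages"):
    """DELETE /<index>?pretty : remove the index and all of its documents."""
    return urllib.request.Request(
        "{}/{}?pretty".format(ES_URL, index), method="DELETE"
    )


def search_all_request(index="pages"):
    """GET /<index>/_search?pretty=true&q=*:* : match every document."""
    query = urllib.parse.urlencode({"pretty": "true", "q": "*:*"})
    return urllib.request.Request(
        "{}/{}/_search?{}".format(ES_URL, index, query), method="GET"
    )
```

Each function only builds the request. Passing the result to `urllib.request.urlopen` sends it, which requires the cluster to be running. Note that `urlencode` percent-encodes `*:*`; Elasticsearch decodes it back to the same query.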

