holderdeord / hdo-transcript-search

Visualize language use in the Norwegian parliament

hdo-transcript-search

Visualize language usage in the Norwegian parliament. See it in action at tale.holderdeord.no.

This project consists of two parts:

  • indexer/ - downloads Stortinget transcripts and indexes them in Elasticsearch
  • webapp/ - web frontend that presents and visualizes the data

Running with docker-compose

$ docker-compose up -d es webapp
$ docker-compose run --rm indexer

Requirements

  • Elasticsearch
  • Node.js
  • Ruby

indexer

Download and index transcripts (requires a locally running Elasticsearch instance):

$ cd indexer/
$ gem install bundler
$ bundle install
$ bundle exec ruby -Ilib bin/hdo-transcript-indexer
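
Because the indexer needs Elasticsearch to be running, a quick preflight check can save a failed run. The following is a minimal sketch, not part of this project: the helper name `elasticsearch_reachable?` is hypothetical, and the URL assumes Elasticsearch listens on its default HTTP port 9200 on localhost.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper (not part of hdo-transcript-indexer).
# Returns true if an HTTP server answers with a 2xx status at the given URL.
# Assumption: Elasticsearch is listening on its default port, 9200.
def elasticsearch_reachable?(url = 'http://localhost:9200/')
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port,
                             open_timeout: 2, read_timeout: 2) do |http|
    http.get(uri.request_uri)
  end
  response.is_a?(Net::HTTPSuccess)
rescue SystemCallError, SocketError, Timeout::Error
  # Connection refused, unknown host, or timeout: treat as not reachable.
  false
end

if elasticsearch_reachable?
  puts 'Elasticsearch is up; the indexer can be started.'
else
  puts 'Elasticsearch does not seem to be running on localhost:9200.'
end
```

If your instance runs elsewhere, pass its URL as the argument.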

Re-create the index (necessary whenever a mapping changes):

$ bundle exec ruby -Ilib bin/hdo-transcript-indexer --create-index

Convert a single XML transcript to indexable JSON:

$ bundle exec ruby -Ilib bin/hdo-transcript-converter transcript.xml

webapp

Start the webapp in dev mode:

$ cd webapp
$ npm install
$ npm run dev
# open your browser at http://localhost:7575/

Caveats

  • Because of deficiencies in the transcripts, we don't know the correct time for every speech. In these cases the "time" field is set to midnight.
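
Consumers of the indexed data may want to separate speeches with a real time from those carrying the midnight placeholder. The sketch below is an illustration only: it assumes the "time" field is an ISO 8601 timestamp and that "midnight" means 00:00:00 in the timestamp's own UTC offset. The helper name `placeholder_time?` and the sample records are hypothetical, not taken from the project.

```ruby
require 'time'

# Hypothetical helper: true when the time of day is exactly 00:00:00,
# i.e. the placeholder used when the real time is unknown.
# Assumption: timestamps are ISO 8601 strings.
def placeholder_time?(timestamp)
  t = Time.iso8601(timestamp)
  t.hour.zero? && t.min.zero? && t.sec.zero?
end

# Made-up sample records for illustration.
speeches = [
  { 'name' => 'Speaker A', 'time' => '2014-10-07T00:00:00+02:00' },
  { 'name' => 'Speaker B', 'time' => '2014-10-07T10:15:30+02:00' }
]

unknown, known = speeches.partition { |s| placeholder_time?(s['time']) }

puts "known time:   #{known.map { |s| s['name'] }.join(', ')}"
puts "unknown time: #{unknown.map { |s| s['name'] }.join(', ')}"
```

Note that a speech genuinely given at exactly midnight cannot be told apart from the placeholder with this check.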

License: BSD 3-Clause "New" or "Revised" License


Languages

  • JavaScript 65.2%
  • Ruby 26.1%
  • Less 3.7%
  • SCSS 2.4%
  • Handlebars 1.9%
  • Shell 0.3%
  • Python 0.2%
  • Dockerfile 0.2%