jumpt57 / allocine-parser

Program to retrieve, process and save data from Allociné (the French IMDB).

Allociné Parser

Allociné Parser is a project for experimenting with Java 8 language features (lambdas, streams, multi-threading) together with other libraries and frameworks (data mapping via Jsoup, persistence with MongoDB, etc.). It does so in the context of a program that retrieves data from Allociné (the French IMDB).

The program parses Allociné movie cards and transforms them into BSON objects, which are then saved in a NoSQL database (MongoDB as of today).
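
As a rough illustration of the transformation step, the sketch below turns fields that were already extracted from a movie card into a document-shaped map using Java 8 streams. Everything in it is an assumption made for illustration: the `MovieCard` class, the field names (`_id`, `title`, `genres`) and the comma-separated genre format are hypothetical and are not taken from this repository. It uses only the standard library; with the MongoDB Java driver on the classpath, a map like this could be wrapped in an `org.bson.Document`, and a Morphia-based project would more likely rely on annotated entities instead.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MovieCardMappingSketch {

    /** Hypothetical holder for fields already extracted from a movie card. */
    static final class MovieCard {
        final int id;
        final String title;
        final String rawGenres; // assumed format: comma-separated genre names

        MovieCard(int id, String title, String rawGenres) {
            this.id = id;
            this.title = title;
            this.rawGenres = rawGenres;
        }
    }

    /** Builds a document-shaped map (key order preserved) from a movie card. */
    static Map<String, Object> toDocument(MovieCard card) {
        // Split the raw genre string, trim each entry and drop empty ones.
        List<String> genres = Arrays.stream(card.rawGenres.split(","))
                .map(String::trim)
                .filter(genre -> !genre.isEmpty())
                .collect(Collectors.toList());

        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("_id", card.id);            // hypothetical key names
        doc.put("title", card.title.trim());
        doc.put("genres", genres);
        return doc;
    }

    public static void main(String[] args) {
        MovieCard card = new MovieCard(42, "  Example Title ", "Drama, Comedy");
        System.out.println(toDocument(card));
    }
}
```

A `LinkedHashMap` is used so that keys keep their insertion order, which makes the resulting document easier to read when printed or inspected.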

Because Allociné contains hundreds of thousands of films, retrieving them one by one is impractical. The goal is to retrieve many films in parallel using the Java multi-threading API.
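
One possible way to structure such a parallel retrieval is a fixed-size thread pool that runs one task per film id. The sketch below is a minimal example under stated assumptions, not the implementation used in this repository: `fetchAll`, the numeric ids and the stand-in fetcher function are all hypothetical, and the real program would plug in a function that downloads and parses a movie card.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ConcurrentFetchSketch {

    /**
     * Applies the fetcher to every id on a fixed-size thread pool and
     * returns the results in the same order as the ids.
     */
    static <T> List<T> fetchAll(List<Integer> ids, Function<Integer, T> fetcher, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // One task per id; the lambda defers the call until a worker picks it up.
            List<Callable<T>> tasks = ids.stream()
                    .map(id -> (Callable<T>) () -> fetcher.apply(id))
                    .collect(Collectors.toList());

            // invokeAll blocks until all tasks finish and keeps the task order.
            List<T> results = new ArrayList<>();
            for (Future<T> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while fetching", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A fetch task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> ids = IntStream.rangeClosed(1, 5).boxed().collect(Collectors.toList());
        // Stand-in fetcher: a real one would download and parse the card for this id.
        List<String> cards = fetchAll(ids, id -> "card-" + id, 4);
        System.out.println(cards); // prints [card-1, card-2, card-3, card-4, card-5]
    }
}
```

Bounding the pool size keeps the number of simultaneous requests under control, which matters when the work is network-bound and the remote site should not be flooded.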

Built With

  • Java 8 - Programming language
  • Apache Maven - Build automation tool
  • Jsoup - Java library for working with real-world HTML
  • MongoDB - Free and open-source cross-platform document-oriented database program
  • Morphia - Transparently map your Java entities to MongoDB documents and back.
  • Apache Commons Lang - Helper utilities for manipulating the core Java classes, filling gaps left by the standard Java libraries
  • IntelliJ IDEA - Java integrated development environment (IDE)
