lrowe / collective.blueprint.wikipedia

Importing Wikipedia articles into Plone (using collective.transmogrifier)

Introduction

The import blueprint is not finished yet, so help with importing Wikipedia into Plone is welcome.

  1. clone project:

    % git clone git://github.com/garbas/collective.blueprint.wikipedia.git
  2. run buildout:

    % cd collective.blueprint.wikipedia
    % python bootstrap.py
    % bin/buildout
  3. start Plone and create a Plone site with the id Plone
  4. download the Wikipedia articles dump and decompress it:

    % wget http://dumps.wikimedia.org/simplewiki/latest/simplewiki-latest-pages-articles.xml.bz2
    % bunzip2 simplewiki-latest-pages-articles.xml.bz2
  5. make sure the config points to the right XML file:

    % vim simplewiki.cfg
  6. run import:

    % bin/instance run import.py simplewiki.cfg Plone
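
Before running the import, it can help to check that the dump from step 4 is readable. The following is a minimal sketch and not part of this package: it streams page titles out of a MediaWiki XML export using only the Python standard library. The helper names (`open_dump`, `iter_titles`) are hypothetical, and the sketch assumes the export keeps each title in a `<title>` element inside a `<page>` element.

```python
import bz2
import xml.etree.ElementTree as ET


def open_dump(path):
    """Open a dump file in binary mode, compressed (.bz2) or plain XML."""
    if path.endswith(".bz2"):
        return bz2.BZ2File(path)
    return open(path, "rb")


def local_name(tag):
    """Strip the XML namespace from a tag: '{ns}title' -> 'title'."""
    return tag.rsplit("}", 1)[-1]


def iter_titles(source):
    """Yield page titles one at a time from a file name or file object."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            yield elem.text
        elif name == "page":
            # Drop the children of pages that were already processed,
            # so memory use stays low on large dumps.
            elem.clear()


def print_first_titles(path, limit=5):
    """Print the first few titles as a quick sanity check."""
    with open_dump(path) as dump:
        for index, title in enumerate(iter_titles(dump)):
            if index >= limit:
                break
            print(title)
```

Calling `print_first_titles("simplewiki-latest-pages-articles.xml")` prints a handful of titles if the file is a well-formed export, and raises a parse error otherwise.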

TODO

  • The import currently fails after around 20,000 items, when it tries to import the ".htaccess" article
  • recognize language wiki links (for now we are stripping them out)
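
One possible starting point for the second item is sketched below. It is not the blueprint's current code: the function name `split_language_links` and the short `LANGUAGE_CODES` set are illustrative assumptions. It relies on the MediaWiki convention that an interlanguage link is written as `[[code:Title]]` with no leading colon, and it collects such links instead of discarding them.

```python
import re

# Assumption: a tiny illustrative set. A real import would need the full
# list of Wikipedia language prefixes.
LANGUAGE_CODES = {"de", "en", "es", "fr", "it", "sl"}

# Matches [[prefix:Title]] where the prefix is lowercase and the link has
# no label part. [[:de:Title]] (leading colon) and [[Category:...]] do not
# match, because the prefix must start right after "[[" in lowercase.
LINK_RE = re.compile(r"\[\[([a-z-]+):([^\]|]+)\]\]")


def split_language_links(wikitext):
    """Return (text without language links, list of (language, title))."""
    found = []

    def collect(match):
        prefix = match.group(1)
        title = match.group(2).strip()
        if prefix in LANGUAGE_CODES:
            found.append((prefix, title))
            return ""  # remove the link from the article text
        return match.group(0)  # keep any other prefixed link untouched

    cleaned = LINK_RE.sub(collect, wikitext)
    return cleaned, found
```

The collected pairs could then be stored on the imported item, for example to link translations later, while the cleaned text goes on through the rest of the pipeline.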

Languages

Language:Python 100.0%