BioStar is a Python and Django based Q&A web software modeled after the StackOverflow Q&A engine.
Our primary goal is to create a simple, generic, flexible and extendeable Q&A framework.
The software requires only Python (version 2.7) to run. All other libraries are included in the distribution. The code will run with no changes on any operating system that supports Python. The software runs the BioStar Bioinformatics Q&A site at: http://www.biostars.org
Get the source from github
$ git clone https://github.com/ialbert/biostar-central.git
There are a few dependencies such as Django, Docutils and South that are also included with Biostar. These only need to be installed if you don't already have them on your system. Switch to the libs directory and unpack the depot.zip :
$ cd libs
$ unzip depot.zip
For faster loading performance may also want to unzip the entire libraries.zip
file located in the libs folder.
You will need to initialize a number of environment variables then run the server:
$ source conf/default.env
$ ./biostar.sh init import run
Visit the http://localhost:8080 to view your site. Enjoy!
Note The Windows version of the biostar.sh manager has not yet been written. The site will work just fine on Windows but for now users will need to manually invoke the commands present in the biostar.sh run manager.
In general you will first need to initialize environment variables
then start the server. There are a number of example environments in the conf
folder.
The default environment can be initialized via:
$ source conf/default.env
There is a run manager in the root directory:
$ ./biostar.sh
Execute it with no parameters for information on usage. This run manager can take one or more commands. For example to initialize the database then populate it with the test data and to run the server one would invoke it in the following way:
$ ./biostar.sh init
$ ./biostar.sh import
$ ./biostar.sh run
Alternatively one may run all these commands all at once:
$ ./biostar.sh init import run
Note: If database models change you must reset and reinitialize the database, note that this will remove all existing content! The database re-initialization is database specific, for the default sqlite deployment you can use:
$ ./biostar.sh delete init import run
The biostar.sh run manager to pulls in environment variables to allow you to customize locations/test fixtures, etc. Edit the biostar.sh script to override the various settings.
Search requires indexing that is disabled during a migration or import. To enable the search you will need to manually trigger the indexing via::
$ ./biostar.sh index
The default server will bind the all IP adapters (0.0.0.0) and port 8080. Visit http://localhost:8080 to see interact with your version of the test server.
There are commands to support the Postgresql database. The commands are:
pgdrop pgdump pgreset pgimport <your-sql-file>
Most operations are customized via environment variables. To show their current settings use::
$ ./biostar.sh env
Custom settings should import all settings as from main.settings import *
then overide
those values that need to change. For an example see the conf/demo.env
and conf/demo.py
files.
A number of helper methods can be found in the main/bin
directory, when the python import paths
are set properly these can be invoked as:
python -m main.bin.postadd
to import posts into the main sitepython -m main.bin.postmod
modifies a post in the main sitepython -m main.bin.useradd
to add userspython -m main.bin.usermod
to edit userspython -m main.bin.extract
extracts the value of a settings parameterpython -m main.bin.patch
is used to change varios values after migrationpython -m main.bin.sitemap
generates a sitemap into the export directory
The best practice is to establish separate environments and settings files.
The python -m main.bin.extract
command can be used to fill an environment variable from
a django settings file. See the conf/demo.env
file. Typically use is to source conf/default.env
then source a second smaller file that overrides just a few parameters.
To add a new user and a post by this user you can do a:
$ source conf/default.env
$ python -m main.bin.useradd -u newuser -e john@gmail.com -n 'John Doe'
*** creating user newuser
$ python -m main.bin.postadd -e john@gmail.com -t 1 import/forum/rnseq-tool.txt
*** adding 165, 104, john@gmail.com, RNA-SeqQC quality control
BioStar uses the common versioning nomenclature that consists of three numbers called major.minor.revision
See the NEWS.md file for release numbers and version migration information.
API Testing can be initiated via the biostar.sh
run manager:
./biostar.sh test
A reports
directory will be created in the root directory
that contains html reports on the code coverage by the tests. View the report/index.html
file.
Selenium tests can be run via:
./biostar.sh selenium
Please note that for this to work properly the python selenium library bindings must be installed moreover
the SELENIUM_TEST_LOGIN_TOKEN
variable must be set in your Django settings file. See the conf/selenium.env
file:
SELENIUM_TEST_LOGIN_TOKEN = "somepasswordgoeshere"
In addition both the test site and the command line above must make use of the same settings file.
Posts may be formatted in Markdown.
User reputation is a sum of all upvotes and accepted answers that a user accumulates. Note that multiple answers may be accepted on a question, in effect this provides the author of a question to reward twice the excellent answers.
In Biostar there are four types of users: anonymous users, registered users, moderators and administrators.
- Anonymous users: May browse all content of a site.
- Registered users: In addition to the privileges that anymous users have registered users may create new posts if their reputation exceeeds a limit (the default is zero), may vote and post answers and comments.
- Editors: In addition to the privileges that registered users have Editors may edit, close and delete posts, edit user information (other than email) and may also suspend and reinstate users. All the actions of the Editors may be followed via the Moderator Log page (see About BioStar page for a link)
- Administrators: In addition to the privileges that moderators have administrators may promote/demote users from having moderator roles. Administrators also have access to the django admin interface where they may perform more database actions than those offered via the BioStar interface..
There are some dependencies that need to be installed (notable the Unix gettext
utility) to run
the django makemessages command.
Template content needs to be tagged with the Django Translation framework. The new message compilation then will be run via:
./biostar.sh messages
The settings file needs to specify the language (see the fileas named conf/ch.env
and conf/ch.py
). For an example site
in either Chinese run the following:
source conf/default.env
source conf/ch.env
./biostar.sh run
To override what pages get loaded in add a new template directory in a location and override the template loader order.
For an example see the conf/ch.py
settings file. In this example a different widget will be loaded
from the corresponding templates in conf/custom-html/widgets/page.share.html
.
Content may be deleted (marked invisible to users) or destroyed (removed from the database).
A post submitted for deletion will be destroyed only if the author requests the deletion and the post does not have any followups (answers/comments) associated with it. Deleted top level posts are marked invisible to regular users.
The Python code, templates, static content (css, images, javascript) and default database are found in the main directory. There is partial datadump of the existing BioStar content in the import folder. The import command will load this data into the current database.
Biostar is built with open source libraries. The following software packages are used and if necessary included and distributed with BioStar:
- Bootstrap as a CSS framework
- JQuery for javascript programming
- Less used for syntactically awesome css
- markitup as rich text javascript editor.
- python-markdown2 python library to convert Markdown to HTML
- docutils is used to convert ReST_ to HTML
- django_openid_auth and python_openid for openid authentication
- whoosh provides fast full text searching
- coverage is used to measure code coverage during testing
- prettify is used for syntax highlighting
- html5lib provides html parsing
To load content from a StackExchange 1 XML datadump one needs to migrate the data
into the new schema. This is accomplished via the migrate
command::
$ ./biostar.sh migrate
This command in turn invokes the main/migrate.py
script. Run this script
(note that the Django settings need to be properly set beforehand)
with the -h flag to see the flags it can take.::
$ python -m main.migrate.py -h
.. note:: The migrate
command used via the biostar.sh
run manager makes use
of an in memory database as specified in the conf/memory.env
and conf/memory.py
files.
The result of a data migration is a compressed json data fixture file that, in turn, may be used via the import command::
$ ./biostar.sh init import index
There is an automatic account migration based on the email provided by the OpenID provider. Only the information from a subset of well known OpenID providers are trusted enough to allow automatic account merging. Accepted providers are: Google, Yahoo, Myopenid, LiveJournal, Blogspot, AOL, and Wordpress. For other users manual migration of accounts will be required. Users listed in the Django ADMINS settings will have full administration privileges.
There is a postgresql database management script in conf/pg-manager.sh
that is
used to facilitate data dumps and restoration.
Environment variables may be used to customize the behavior:
DJANGO_SETTINGS_MODULE
: the configuration module for DjangoPYTHON
: the python executable that is to be invokedFIXTURE
: output path to the (gzipped) file that will contain the data fixtureMIGRATE_PATH
: path to the directory that stores the StackExchange XML dumpMIGRATE_LIMIT
: the number of records to load from the XML dump
For a current Biostar run with about 4K users, 30K posts, 40K edits, 60K votes generates about 300K database entries of various kinds. Data migration into a fixture takes about 1 hour and 10Gb of RAM. This is an area that we could do a lot better job (possibly orders of magnitude better).
The resulting data fixture is database independent and can now be loaded into type database: sqlite, mysql, postgresql supported by Djano. For example when loading into postgresql it takes about 2 hours and 2Gb of RAM.
Note that the databases can be dumped and restored with far fewer resources. Exporting directly into/from postgresql for example takes less than a few minutes.