This package provides a Python interface for working with the Allen Human Brain Atlas (AHBA) microarray expression data.
In 2013, the Allen Institute for Brain Science released the Allen Human Brain Atlas, a dataset containing microarray expression data collected from six human brains. This dataset has offered an unprecedented opportunity to examine the genetic underpinnings of the human brain, and has already yielded novel insights into, for example, adolescent brain development and functional brain organization.
However, to be used effectively in most analyses, the AHBA microarray expression data often need to be (1) collapsed into regions of interest (e.g., parcels or networks), and (2) combined across donors. While this may seem trivial, there are numerous analytic choices in these steps that can dramatically influence the resulting data and any downstream analyses. Indeed, Arnatkevičiūte et al., 2018 ([1]) provided a thorough treatment of this in a recent manuscript, demonstrating how the techniques and code used to prepare the raw AHBA data have varied widely across published reports.
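To make the two steps above concrete, here is a minimal toy sketch in plain Python. It is not how abagen processes the data; the donor names, region labels, expression values, and helper functions are all hypothetical and exist only to show that the order of aggregation can change the result.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data for one gene: (donor, region, expression value).
samples = [
    ("donor1", "regionA", 2.0),
    ("donor1", "regionA", 4.0),
    ("donor1", "regionB", 6.0),
    ("donor2", "regionA", 10.0),
    ("donor2", "regionB", 8.0),
]


def collapse_within_donor(samples):
    """Step 1: average samples within each region, separately per donor."""
    grouped = defaultdict(list)
    for donor, region, value in samples:
        grouped[(donor, region)].append(value)
    return {key: mean(values) for key, values in grouped.items()}


def combine_across_donors(per_donor):
    """Step 2: average the donor-level regional values across donors."""
    grouped = defaultdict(list)
    for (_donor, region), value in per_donor.items():
        grouped[region].append(value)
    return {region: mean(values) for region, values in grouped.items()}


def pool_all_samples(samples):
    """Alternative choice: ignore donors and average all samples per region."""
    grouped = defaultdict(list)
    for _donor, region, value in samples:
        grouped[region].append(value)
    return {region: mean(values) for region, values in grouped.items()}


two_step = combine_across_donors(collapse_within_donor(samples))
pooled = pool_all_samples(samples)

print("two-step:", two_step)
print("pooled:  ", pooled)
```

In this toy case the two strategies agree for `regionB` but disagree for `regionA`, because `donor1` contributes more samples there. Real pipelines face many more such choices (probe selection, normalization, sample-to-region matching), which is the variability the paragraph above describes.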
The current Python package, abagen, aims to provide a reproducible pipeline
for processing and preparing the AHBA microarray expression data for analysis.
If you'd like more information about the package, including how to install it
and some example instructions on its use, check out our documentation!
This package has been largely developed in the spare time of a single graduate student (@rmarkello) with help from some incredible contributors. While it would be ✨ amazing ✨ if anyone else finds it helpful, given the time constraints of graduate school, the package is not currently accepting requests for new features.
However, if you're interested in getting involved in the project, we're thrilled to welcome new contributors! You should start by reading our contributing guidelines and code of conduct. Once you're done with that, take a look at our issues to see if there's anything you might like to work on. Alternatively, if you've found a bug, are experiencing a problem, or have a question, create a new issue with some information about it!
While this package was initially created in early 2018, many of the current
functions in the project were inspired by the workflow laid out in
Arnatkevičiūte et al., 2018. As such, if you use this code it would be good
to (1) provide a link back to the abagen repository with the version of the
code used, and (2) cite their paper:
[1] Arnatkeviciute, A., Fulcher, B. D., & Fornito, A. (2018). A practical guide to linking brain-wide gene expression and neuroimaging data. bioRxiv, 380089.
This codebase is licensed under the 3-clause BSD license. The full license can
be found in the LICENSE file in the abagen distribution.
Reannotated gene information located at abagen/data/reannotated.csv.gz is taken from [1] and is separately licensed under the CC BY 4.0 license; these data can also be found on figshare.
All trademarks referenced herein are property of their respective holders.