mozillascience / code-research-object

Project between GitHub, figshare and Mozilla Science Lab.

Home Page: https://mozillascience.github.io/code-research-object/

Want to help us test? Add your info here.

kaythaney opened this issue · comments

We're looking for publishers and researchers to help us test new ways of giving credit to code. Want to help? Add your info here.

Scott Edmunds @gigascience

Amye Kenall @OpenDataBMC

Varsha Khodiyar @F1000Research

Phil Bulsink @pbulsink

Rob Davidson @bobbledavidson

Hilmar Lapp @hlapp

Aron Ahmadia @ahmadia (And I've got some wild ideas about integration with HashDist and IPython Notebook).

I'd be interested. @zeckalpha

Morteza Milani @milani

Raniere Silva @r-gaia-cs

Fabian Schreiber @fabsta

Rémi Emonet @twitwi

Ed Borasky - @znmeb on Twitter

Count me in @drchriscole

Simon Li @manics

Ethan White @ethanwhite.

We have a bootstrapped version of the core idea for some of our code. I'd be happy to move over to the new system. See:
https://github.com/weecology/METE
http://figshare.com/articles/METE_Software_for_Analyzing_Harte_et_al_s_Maximum_Entropy_Theory_of_Ecology/815905

Thomas Arildsen @ThomasA.

Robert M Flight @rmflight

Haiyan Meng @hmeng-19

Cristóvão D. Sousa @cdsousa

Peter Ansell @ansell

Andrew Barr @wabarr

Thomas Henderson tom@mathpunk.net

B. Arman Aksoy @armish

Stuart Mumford @Cadair

Ben Fields @gearmonkey

August Muench @augustfly

Matthew Turk @matthewturk

Yoshiki Vazquez-Baeza @ElDeveloper.

hi all, sending out a mail today with a call for feedback. unable to find a few of your emails. @augustfly @Cadair @juandesant @ansell @hmeng-19 @drchriscole @manics @fabsta @bobbledavidson - could you send me your emails? (kaitlin at mozillafoundation dot org). thanks!

Hi -
My GitHub account name is hmeng-19.
My email is: hmeng@nd.edu

Thanks.
Haiyan

On Tue, Mar 4, 2014 at 2:27 PM, kaythaney notifications@github.com wrote:

hi all, sending out a mail today with a call for feedback. unable to find a few of your emails. @augustfly @Cadair @juandesant @ansell @hmeng-19 @drchriscole @manics @fabsta @bobbledavidson - could you send me your emails? (kaitlin at mozillafoundation dot org). thanks!

David Bowler @MillionAtomMan

James Howison @jameshowison

Perhaps of interest, some research papers in which we discuss (among other things) some of the downsides with the current practice of referring to code via "software publications." Not particularly part of the solution, but perhaps a useful way of shifting people's practices.

Howison, J., and Herbsleb, J. D. 2011. “Scientific software production: incentives and collaboration,” in Proceedings of the ACM Conference on Computer Supported Cooperative Work, CSCW ’11, Hangzhou, China, pp. 513–522. http://doi.acm.org/10.1145/1958824.1958904
http://james.howison.name/pubs/HowisonHerbsleb2011SciSoftIncentives.pdf

Howison, J., and Herbsleb, J. D. 2013. “Incentives and integration in scientific software production,” in Proceedings of the ACM Conference on Computer Supported Cooperative Work, San Antonio, TX, February 23, pp. 459–470. http://doi.acm.org/10.1145/2441776.2441828
http://james.howison.name/pubs/IncentivesAndIntegration-p459-howison.pdf

Susheel Varma @susheel

Chris Hartgerink @chartgerink

Daisie Huang @daisieh

Valentin Churavy @Wallnuss

Andrew Straw @astraw

I'd love to help!

@bookhling

-sung

Matt Shirley @mdshw5

Gerard Gorman @ggorman
Yes I want to play. Lots of software and data.

I'd love to help. At the COIN-OR Foundation (www.coin-or.org), we've been talking about how we can make code into a peer-reviewed, citable publication for years, but this is the most positive step I've seen to that end. We would love to be a partner in making this happen.

What is being overlooked is that it's not simply "get me the code + data" but the environment on top of that. How many researchers here have native packages running on their machines? How deep do the dependencies go? One HUGE advantage of cloud computing here is that we've got CLEAN boxes to ensure ALL dependencies are installed. "Just the data" and "just the code" is only half the headache of getting the desired outputs.

It seems like we're overlooking the substantial work that goes into provisioning the environments.
^^ THIS is what I want to help with.

I would expect anyone experienced in porting non-trivial codes to agree with you. I also think there is a lot of mileage in this idea, but it has not quite been universally well received. Here is an interesting short commentary on the subject:
http://www.recomputation.org/blog/2013/07/16/on-virtual-machines-considered-harmful-for-reproducibility/

@ggorman VMs are substantial in size and as stated in the article: black boxes. We don't want that. We want install processes to be determined in code and executed on instancing of the box.
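
For illustration only, the idea of an install process "determined in code" could look something like the minimal Python sketch below. It is not any existing tool: `EnvironmentSpec` and `render_provision_script` are hypothetical names, and the sketch assumes a Debian-style box where `apt-get` and `pip` are available. It turns a small declarative description of the environment into a shell script that could be run when a clean box is instantiated.

```python
from dataclasses import dataclass, field
from typing import Dict, List
import shlex


@dataclass
class EnvironmentSpec:
    """Hypothetical declarative description of what a clean box needs."""
    system_packages: List[str] = field(default_factory=list)
    # Python package name -> pinned version, so reruns install the same thing.
    python_packages: Dict[str, str] = field(default_factory=dict)


def render_provision_script(spec: EnvironmentSpec) -> str:
    """Render the spec as a shell script (assumes apt-get and pip exist)."""
    lines = ["#!/bin/sh", "set -e"]  # stop at the first failing step
    if spec.system_packages:
        pkgs = " ".join(shlex.quote(p) for p in sorted(spec.system_packages))
        lines.append(f"apt-get install -y {pkgs}")
    for name, version in sorted(spec.python_packages.items()):
        lines.append("pip install " + shlex.quote(f"{name}=={version}"))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values are made up for illustration.
    spec = EnvironmentSpec(
        system_packages=["gfortran", "libhdf5-dev"],
        python_packages={"numpy": "1.8.0"},
    )
    print(render_provision_script(spec))
```

Because the spec is plain text under version control, it stays small and readable next to the code and data, unlike a VM image. It does not address the harder problem of reconciling different package managers.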

There is often a gap between what we want in an ideal world and the pragmatic decisions we have to make in order to make progress. I agree with the position you are taking (although I would be interested to understand how this would be done, since there are many package managers and no universal agreement on the right way of doing this), but I also believe we should support other circumstances where codes with a long history would require massive (unavailable) resources to re-engineer.

Grabbing a piece of software and porting it to a system is not always easy. In fact, it can be a massive challenge in itself. I can list several large packages that are doing really good science that have many tens of dependencies on other substantial libraries. As systems and compilers are updated, any part of this can break in lots of interesting (i.e. infuriating) ways. I regularly run into problems porting code where something that used to be accepted by a compiler is no longer acceptable. Again, this is no big deal if you are working on one small project, but if you are talking about something substantial, with possibly millions of lines of code in many different languages from across the globe, it becomes a Herculean task, if not impossible, due to finite human resources.

I would be happy to help test.

@carlcrott and @ggorman #2 seems a good place to bring up virtualization.

I can list several large packages that are doing really good science that have many 10's of dependencies on other substantial libraries.

I think these sound like smart places to start. Also, is everyone here working for free? If they're big packages, there should be some financial support coming specifically from the primary owners of the code.

I would start with Chef, as their primary offering is an attempt to abstract away OS differences. As stated above, it's a hard task; however, they've got $$$, so it's a smart bet that they're not going to evaporate.

Great initiative! @mickeypash

I'd also like to test. @mbjones We're assembling a similar system for the KNB and an open API for DataONE for sharing these objects, and so would love to discuss interoperability with you.

James Manton @ajdm

J. Richard Snape Twitter:@snapey1979

I am currently running a course on Coursera called "Massive Teaching: New skills required". This course is a meta-MOOC, talking about the skills associated with MOOCs, and engaging instructors to think about the consequences of teaching on open/commercial MOOC platforms. In the upcoming week, I want to engage my 7000 students in the following topics: open science, open data, open source, open software, crowdsourcing, crowdstorming of datasets, and "code as research object". I guess I have finally found the right place on the web to tie all those ideas together, if anyone is interested. You could simply watch the videos, but the forums are much more fun, and you will probably want to participate.

https://class.coursera.org/massiveteaching-001

David Ketcheson @ketch

Richard Littauer.

Vinod Pahuja

Chris Bogart

Jose Barrios @BarriosJose (Twitter)