RDI2 Workshop: Reproducibility in Experimental and Computational Science Research - A Rutgers 250 Event

Date: October 10, 2016

Location: CoRE Auditorium, Busch Campus

Join us for an interactive, open-ended discussion on reproducibility in large-scale experimental and computational research as representatives of computational science and domain science communities present a wide range of perspectives and priorities. The agenda will focus on two themes: (1) what it means for computational science to be reproducible, along with best practices for reproducibility, and (2) why reproducibility is hard in the computational context, along with the available tools, technologies, and practices that can support reproducibility at Rutgers University and beyond.

·       Familiarize participants with the concept, importance, and best practices of reproducibility in experimental and computational science research

·       Present concrete guidance, tools, and learning for conducting reproducible research across broad scientific domains and disciplines

·       Provide a setting for participants to learn new skills and tools, and to improve their own research

·       Socialize RDI2’s mission to deliver a robust cyberinfrastructure that enables reproducible research at scale as part of its promotion of a culture of reproducibility in computational science

8:00am—9:00am:      REGISTRATION

9:00am—9:15am:      Opening Remarks | Manish Parashar, Rutgers

9:15am—10:00am:    Toward Really Reproducible Research: Policies and Practices | Victoria Stodden, Univ of Illinois

The reproducibility gap in computationally enabled research is a well-known problem. This talk will frame the issue, including interpretations of reproducibility, and outline steps toward resolving this collective action problem. Steps being taken by various stakeholders, including researchers, institutions and libraries, funding agencies and the federal government, journals and publishers, and scientific societies, will be highlighted. Finally, gaps, barriers, and successes will be summarized.

10:00am—10:30am:  Reproducibility of Structural Biology Research: The Role of the Protein Data Bank | Helen Berman, Rutgers

The Protein Data Bank (PDB) was the first open access digital repository in biology. This talk will present the role the PDB has played in helping to ensure reproducible research in structural biology.

10:30am—11:00am: Reproducibility in the Field Sciences: Liberating Data and Samples | Kerstin Lehnert, Columbia

Access to and preservation of data, samples, and other scholarly products in the field sciences are essential for transparency, reproducibility, and the use of unique observations in future science, but still face many cultural, financial, and technical barriers. This session will focus on approaches and initiatives through which funders, publishers, scientific societies, and others are working to change the data and sample ecosystem in the field sciences.

11:00am—11:15am:  BREAK

11:15am—11:45am:  Putting Reproducibility into Practice: Workflows and Case Studies | Ryan Womack, Rutgers

Ryan Womack, Data Librarian, will illustrate how reproducibility can be achieved through the use of specific coding and authoring practices, code and data sharing platforms, and data citation. Case studies will highlight these practices at various scales, from single-author research to big data and team science projects, and will explore the support tools available along the way.

11:45am—12:15pm:  Provenance for Computational Reproducibility and Beyond | Juliana Freire, NYU

The need to reproduce and verify experiments is not new in science. While result verification is crucial for science, improving these results helps science to move forward. This session will discuss the importance of maintaining detailed provenance for both data and computations, and present methods and systems for capturing, managing, and using provenance for reproducibility. We also explore benefits of provenance that go beyond reproducibility and present emerging applications that leverage provenance to support reflective reasoning, collaborative data exploration and visualization, and teaching.

12:15pm—1:15pm:    LUNCH

1:15pm—1:45pm:      Let’s Do It Again: Systems Support and Community Incentives for Replicability | Chaitanya Baru, NSF

Scientific results are expected to be reproducible.  With ubiquitous use of computing systems and digital data, there is a presumption that the results are also replicable—running the same computer program with the same data inputs ought to produce exactly the same results.  However, this notion is confounded by issues related to hardware-dependent implementations, complex software environments, parallel computing, use of probabilistic methods, and use of Big Data sets that may be changing and evolving. How can software environments be designed to be usable, effective, and debuggable for replicability? What does the community really want and need, in terms of replicability? Are incentives needed to produce replicable results? If so, what would these incentives be? This talk will attempt to set the stage and initiate a discussion on these topics.

1:45pm—2:00pm:      BREAK

2:00pm—3:30pm:      PANEL: Creating a Culture of Reproducibility: Best Practices & Pitfalls | Panelists: V. Stodden, J. Freire, C. Baru, R. Womack, K. Lehnert, M. Lesk, S. Burley | Moderated by: M. Parashar, H. Berman

3:30pm—5:00pm:      NETWORKING RECEPTION