Help for Scientists Who Code

More and more, research requires some level of computing to analyze data, whether retrieving a large dataset from a repository, using R and WinBUGS to perform occupancy modeling analyses, or writing Python scripts to run spatial analyses. However, most researchers and students have limited exposure to programming techniques. We often "hack" something together, leaving the potential for undetected errors in increasingly complex analyses. One popular reference is Stack Overflow, a robust, crowd-based resource for answering basic questions like "how to create a data frame from a matrix in R" or "how to create a loop". While this can be effective for solving a particular problem, it does not address best practices for programming, and the resulting code is often inefficient and difficult to share or incorporate into subsequent analyses. A small investment in honing programming skills can pay dividends in efficiency down the road.

Enter Software Carpentry, a volunteer network focused on teaching coding principles to scientists so that computers can fulfill their promise of enabling analyses too tedious to tackle otherwise. The most useful section of the website is the lesson plans, whose topics include:

  • Version Control with Git: keep track of changes and prevent data loss
  • Programming with Python
  • Programming with R

By improving the quality of scientific code and sharing it on sites like figshare, we can make greater strides in taking advantage of large datasets and computationally intensive statistical techniques while lowering barriers to powerful analyses within and across disciplines.