Data Structures and Analyses Using R

The goal of this Silent Spring Institute project is to design and implement an integrated data system.

Silent Spring Institute's multidisciplinary team draws on a variety of information, including survey, laboratory, open-source, and questionnaire data, as well as a range of metadata. Our innovative approach requires a customized system that maximizes the potential of this non-standardized data while keeping costs low and maintaining versatility. The most recent generation of data practices moves away from the previous Excel, Access, and SAS foundation to an R-based system that uses many of the available cross-application packages.

Selection of relevant R packages:

This system is designed to minimize error and manual maintenance by including automatically updating QA/QC files, built with Sweave (R with LaTeX), that keep pace with evolving decision rules and data refinement. Furthermore, the use of R and Sweave increases transparency and reproducibility, provides a strong historical reference, and encourages best practices. Additionally, R allows scientists to visualize data quickly and easily using the iPlots package while also permitting them to create highly customizable, high-quality figures.
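As a rough illustration of how decision rules can drive an automatically updating QA/QC report, the sketch below keeps the rules in a named list and applies them to a data frame. It is not the project's actual code: the rule names, the column names (`sample_id`, `concentration`), and the helper `run_qaqc` are all hypothetical. In a Sweave document, code like this would sit inside a code chunk in the `.Rnw` file and be re-run each time the report is compiled, so adding or changing a rule updates the report without manual edits.

```r
# Hypothetical decision rules (assumed for illustration). Each rule takes a
# data frame and returns a logical vector that is TRUE where a record fails.
qaqc_rules <- list(
  missing_id = function(d) is.na(d$sample_id),
  negative_concentration = function(d) {
    !is.na(d$concentration) & d$concentration < 0
  }
)

# Apply every rule and return one row per flagged record.
run_qaqc <- function(d, rules) {
  flagged <- lapply(names(rules), function(rule_name) {
    failed <- which(rules[[rule_name]](d))
    data.frame(
      row = failed,
      rule = rep(rule_name, length(failed)),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, flagged)
}

# Made-up example data, not project data.
samples <- data.frame(
  sample_id = c("A1", NA, "A3"),
  concentration = c(0.5, 1.2, -0.1),
  stringsAsFactors = FALSE
)

qaqc_report <- run_qaqc(samples, qaqc_rules)
print(qaqc_report)
```

Keeping the rules in one list means a revised decision rule is changed in a single place, and the history of those changes can be tracked alongside the report source.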

Data processes and systems are under constant revision in response to researchers' needs and available technology.

Role: Primary role in designing and implementing the R-based data management system. Lead in researching and seeking out new technologies to streamline data processes, including monitoring R mailing lists for useful packages and creating sample code to introduce relevant functions to other researchers. Lead in transferring data from the old data system to the new one and in updating the system as new types of data are acquired. Partner in designing data management protocols, selecting data for inclusion, and collecting data.