We are living in an age of “big data.” Insights from big data touch almost every part of our lives, from the way we navigate in our cars to the way we shop. Big data has also arrived in biodiversity research, driven by rapid changes in the types and volume of data that researchers can use to ask and answer their scientific questions. The Data Science Lab works with Smithsonian researchers to use big data techniques, such as deep machine learning, to generate insights from their data, whether those data are derived from genome sequencing, ecological sensors, or mass digitization of museum objects. These techniques require computational expertise in hardware and software, both to build new algorithms and to implement the emerging tools that are developed outside the Smithsonian.

The Data Science Lab is housed in Research Computing, part of the Smithsonian Office of the Chief Information Officer.


Rebecca Dikow, Research Data Scientist

Rebecca Dikow is a Research Data Scientist and co-lead of the Smithsonian Institution Data Science Lab. She has a B.S. in Biology from Cornell University and a Ph.D. in Evolutionary Biology from the University of Chicago. Her dissertation research focused on using whole-genome data to build evolutionary trees (phylogenies). After the completion of her Ph.D., she was the Biodiversity Genomics postdoctoral fellow at the Smithsonian.


Paul Frandsen, Research Data Scientist

Paul received his Ph.D. in Entomology from Rutgers University. He is interested in machine learning, phylogenetics, and the development of bioinformatics tools for genome analysis.