Hi, my name is Rob Crystal-Ornelas and I’m one of the interns for Project #2: Supporting Synthesis Science with DataONE. I’m looking forward to the experience of working on a systematic review with a team of researchers, including my co-intern this summer, Giancarlo Sadoti. This summer, we’ll be working on a systematic review of ecology and earth science articles. Giancarlo and I will identify the data used in the articles we find, and explore whether these data are available within the DataONE network.
This week we had three main goals: decide on search terms, identify studies for the systematic review, and conduct a literature review.
For a systematic review, it is important to decide ahead of time on the search terms (and search strategy) used to locate articles. This way, we avoid biasing the types of articles selected for the systematic review, and we ensure the search is repeatable by other researchers. The Project #2 team decided to take two approaches to finding articles. In the first approach, Giancarlo searched Web of Science using a variety of keywords related to ecology, earth science and data. I’m sure he’ll provide more detail in his blog post! For my search efforts, instead of searching based on keywords, I searched for any article that cited a widely used (12,000+ citations) dataset called WorldClim. These data have been used to conduct research in many ecosystems across the globe. The rationale behind this decision was that articles citing WorldClim may incorporate additional datasets into their analysis.
The search strategy that used WorldClim citations identified 2,000 studies from 2015 to 2017. I attempted to extract data from a handful of articles to determine the other data incorporated in each article’s analysis. I found that many of these articles brought in other datasets (ranging from one to five). Moreover, some of the datasets could be found within DataONE’s network. I’m looking forward to exploring these articles in greater depth.
Lastly, I conducted a literature review of articles related to data aggregation within environmental sciences/ecology. I found that since the early 2000s researchers have been grappling with the idea that more data is being produced than can be analyzed or appropriately archived (Bell et al., 2009). The production of so much data means that some data may be used once, stored away on a hard drive or in the cloud, and never accessed again. Even just a few years after publication, it becomes substantially harder to track down data due to broken email addresses or technological difficulties (Vines et al., 2014). Open data has advantages, including increased citation (Piwowar and Vision, 2013), and there is a sea change in that more researchers than ever are making data publicly available (Murillo, 2014). Despite these shifts, it is important to determine how often researchers use open access data. We will use the cohort of studies in earth science and ecology to compare the data aggregated for these articles to the data held within DataONE’s permanent and publicly accessible network.