Supporting Data Management Early in the Research Cycle: New Directions for the Arctic Data Center
By: Jasmine Lai, Projects Data Coordinator, National Center for Ecological Analysis and Synthesis (NCEAS); Amber E. Budden, Director of Learning and Outreach, NCEAS; Matthew B. Jones, Principal Investigator, NCEAS
Since 2016, the Arctic Data Center has been supporting Arctic researchers in the discovery, access, curation, and preservation of research data, in addition to providing support, guidance, and training on data management and reproducible research practices. In those first five years of operations, the Arctic Data Center has preserved the work of over 2900 researchers in over 6400 individual datasets totaling 56 terabytes, and has become one of only 27 repositories in the US to be CoreTrustSeal certified. Through engagement with researchers at national and international Arctic and domain-focused events, and through discussions at our reproducible research training events and Arctic conferences, we have developed an increased understanding of the data management and reuse needs of Arctic researchers that informs our focus for the next five years of operations.
As of 1 May 2021, the National Science Foundation (NSF) reaffirmed its commitment to data archiving and sharing in the Arctic by investing $6 million to continue funding the Arctic Data Center through 2026. This continued investment will allow the Center to increase capabilities in a number of critical areas, in addition to maintaining and enhancing the current services valued across the community. We will scale our repository to support preservation of much larger, terabyte-scale datasets, which are increasingly being produced by researchers using, for example, remotely operated aircraft and automated sensor networks. In addition, we will increase the features available within our customizable data portals. The data portals service, which was launched during the first award, enables researchers to create a custom, branded portal for their research topic or lab group that spans datasets in the Arctic Data Center.
These branded portals provide users with a convenient, readily customized way to communicate their research to the broader community. Since the service launched, over 20 portals have been created, and this new award will enable us to increase the capability for customization with the addition of user-created custom search queries, groups and filters to aid discovery, and the ability to embed interactive data visualizations such as Shiny apps. We will also introduce new group collaboration features that enable research teams to collaborate more effectively early in their project, long before data are ready to be archived. These early life cycle features will include team-focused data sharing, streamlined data management, and new data quality assessment services.
Data reuse will drive additional new functionality for building derived data products that are useful for multiple user communities. The Arctic Data Center preserves data from a myriad of disciplines with multitudes of formats, models, and protocols of varying compatibility. While the research community downloads and uses these datasets frequently, we also recognize the need to integrate these heterogeneous data into more uniform derived products that span spatial, temporal, and project boundaries. We plan a new derived data workflow service that will enable researchers from diverse disciplines to contribute new data products that assemble existing data in useful ways for various research, community, and management uses, among others. We will then be able to maintain and extend these derived products as new data of that type are archived over time.
Providing first-class support for the data management challenges that face the Arctic research community remains a focus, and future work will increase our emphasis on supporting data management for social science disciplines. Planned work includes continuing discussions with social science researchers initiated during the first award, designing targeted social science resources and learning opportunities, increasing representation at social science meetings, applying the custom portal infrastructure to specific collections of social science data, modifying our data submission and curation systems to better address requirements of social science data management, and recruiting a Fellow (to be announced shortly) focused on understanding and supporting the needs of social science researchers and data. The Arctic Data Center broadly engages the Arctic research community, and will contribute to the important dialogue surrounding Indigenous data sovereignty through ongoing collaborations with projects such as ELOKA and the NNA Community Office that work directly with Indigenous groups. In doing so, we seek to increase awareness and appropriate adoption of open research and data archiving across various social science disciplines, and maintain and promote the FAIR (Findable, Accessible, Interoperable, and Reusable) and CARE (Collective Benefit, Authority to Control, Responsibility, and Ethics) principles.
In support of open science and capacity for good data management practices in the Arctic community, we are excited to be able to double the number of training offerings across the next five years. Our investment in training Arctic researchers includes not only increased delivery of training events, but also an expanded curriculum, with new materials planned on managing sensitive data, qualitative data, Indigenous Knowledge, co-production, and CARE principles (among others). We welcome community input on these topics and will be reaching out repeatedly during the course of the award as we develop new curricula.
Community input is critical to the strategic direction of the Arctic Data Center and to ensuring that we continue to meet the needs of the community. Through engagement at conferences, feedback from researchers using our services, community workshops, and training activities, we stay connected with the challenges and opportunities encountered by Arctic researchers. In addition, the Arctic Data Center Science Advisory Board provides advice and leadership on goals and strategic priorities, and supports evaluation of the Center's deliverables and services. The current board brings together expertise in atmospheric, social, terrestrial, oceanographic, earth, and environmental sciences. As part of a planned rotation, we have an open call for nominations to the Science Advisory Board. We value your role in facilitating the success of the Arctic Data Center as a community resource and encourage you to nominate yourself or a colleague to serve on the advisory board.
Full details about the NSF Arctic Data Center partners, leadership, and Science Advisory Board can be found on the Arctic Data Center's website.
This material is based upon work supported by the National Science Foundation under Grant No. 2042102 for the Arctic Data Center and 1927720 for the Permafrost Discovery Gateway.
About the Authors
Jasmine Lai is a Projects Data Coordinator at the Arctic Data Center. She has a bachelor of science from the University of British Columbia. At the Arctic Data Center, Jasmine helps archive research data and contributes to open source software. Through her work, she hopes to make science open, inclusive and accessible to a wide audience.
Amber E. Budden is Director of Learning and Outreach at the National Center for Ecological Analysis and Synthesis, and co-PI of the Arctic Data Center. Amber leads data science training activities within the NCEAS Learning Hub and supports users of data infrastructure through community building, training, and user-focused design. Amber holds a PhD in Behavioral Ecology and a joint BSc in Psychology and Zoology.
Matthew B. Jones is Director of Research and Development at the National Center for Ecological Analysis and Synthesis, and PI of the Arctic Data Center. Matt’s work focuses both on supporting efficient synthesis through scientific computing and on building new advanced infrastructure to support data sharing, preservation, analysis, and modeling.