Alongside sharing and publishing datasets, there are a variety of ways to publish an accompanying journal article that provides a “data description” and either includes or refers to a specific dataset. Such an article offers narrative context beyond standard metadata, such as the motivation for compiling the dataset and the process used to do so. This type of publication can also give formal recognition to all team members involved in creating the dataset.
Here are several examples of journal articles about datasets:
- Yeager et al. 2017 Marine Socio-Environmental Covariates – This is an Ecology Data Paper and was an outcome of a SESYNC postdoc project, which included a Shiny app for querying and downloading subsets of the data. The paper comprises an abstract and a lengthy metadata document describing in detail how each variable was calculated. The metadata also links to a GitHub repository that contains all of the code used to process the data. This link describes Ecology data papers in more detail. The metadata is a Word document that covers many aspects of the data but is less standardized than what is typically expected for a domain-specific repository.
- Stanley et al. 2016 Ecology of methane in streams and rivers – This paper is in Ecological Monographs and describes (and cites) a dataset that is hosted separately on the LTER’s data portal. The paper includes many descriptive statistics along with interpretation and conceptual analysis. The dataset is a series of linked .csv files and is described formally with Ecological Metadata Language (EML).
- Salguero-Gomez et al. 2015 The COMPADRE Plant Matrix Database – This is a Journal of Ecology paper describing a version of the COMPADRE plant matrix database, which is hosted at https://www.compadre-db.org/. The supporting information for the paper has several appendices that include user guides and R scripts for accessing the data and performing common operations.
- Appling et al. 2018 The metabolic regimes of 356 rivers in the United States – This is an example from Scientific Data, an open access journal in the Nature family. The journal does not host the data described in its publications; however, its submission process is integrated with several data repositories such as FigShare and Dryad. Its website includes guidelines and recommendations for both domain-specific and generalist data repositories. You can see more examples from the Ecology subject area here.