Spend less time searching for articles with accessible datasets

Author: Iain Hrynaszkiewicz, Director, Open Research Solutions, PLOS

We are testing a new experimental open science feature intended to promote data sharing and reuse across the PLOS journal portfolio. A subset of PLOS articles linked to shared research data in a repository will display a prominent visual cue designed to help researchers find accessible data and encourage best practices in data sharing.

Explore incentives and increase data accessibility

In a project funded by the Wellcome Trust, we are experimenting with two solutions designed to increase the sharing and discovery of research data, and the second of these launches this week. Both solutions aim to promote the use of data repositories for sharing data and linking data to publications. Sharing the research data that underpin published articles is considered a best practice because it promotes data discovery and reuse and aligns with the FAIR (Findable, Accessible, Interoperable and Reusable) data principles.

The first solution we introduced was the PLOS Pathogens integration with the Dryad repository, which launched as a pilot in October 2021. It aims to increase use of the repository by introducing a small (optional) step in the manuscript submission process that enables seamless deposit of data in Dryad.

This second, experimental solution is an “Accessible Data” feature displayed on articles that link to research data in one of several repositories. The feature will appear on over 3,000 PLOS articles, starting this week. It will also appear on newly published articles that qualify for the feature during the experiment, which is expected to run through the end of 2022.

We’re not the first publisher to experiment with prominent visual links to research data, and some readers will see similarities between this feature and Center for Open Science badges. Since the introduction of our Data Availability Policy in 2014, every published PLOS research paper includes a Data Availability Statement (DAS) and an increasing proportion of these include links to data repositories in the DAS. This solution builds on the results of our long-standing data availability policy and aims to increase the visibility and reuse of publications that share data in repositories, while encouraging more researchers to adopt this approach.

An important difference with this experiment is in the questions we ask. Previous projects have explored the role of badges in encouraging more researchers to share data according to best practices, with encouraging results. But our solution is intended to promote both data sharing and access – by linking directly to data in the repository and offering a visual cue (or “reward”) on the published article. We suspect that making linked research data more visible on article pages will increase its usage, helping readers spend less time searching for articles with accessible datasets.

Another difference in our approach is automation. Articles eligible for the feature will automatically display the feature according to simple rules, without additional human intervention from authors or editors. This provides recognition to authors who have incorporated good research data practices into their publication workflow, without adding to the burden on authors, editors or peer reviewers.

Our experimentation with these solutions is informed by research involving PLOS authors, which suggests that many researchers have difficulty accessing research data that they can reuse – something we believe could be addressed in part by using a wider reference frame. The feature has gone through several rounds of testing with researchers and design experts to optimize its visual design and wording.

Scope of the experiment

The Accessible Data feature is limited in scope during this experimental phase*. The feature will appear on PLOS articles that:

  • Were published in 2016 or later, and
  • Include a single link to data in a repository in their Data Availability Statement, and
  • Have that link point to a single record in Dryad, Figshare or Open Science Framework (OSF)

See an example here.
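The eligibility rules listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation PLOS uses: the `Article` class, the function names and the identifier patterns (DOI prefixes and an osf.io URL form) are hypothetical choices made for this example. Treating "published since 2016" as including 2016, and treating "a single record" as "exactly one recognized repository link in the statement", are also assumptions.

```python
import re
from dataclasses import dataclass

# Illustrative identifier patterns (assumptions for this sketch,
# not the rules PLOS actually applies).
REPOSITORY_PATTERNS = {
    "Dryad": re.compile(r"10\.5061/dryad\.[\w.]*\w", re.IGNORECASE),
    "Figshare": re.compile(r"10\.6084/m9\.figshare\.[\w.]*\w", re.IGNORECASE),
    "OSF": re.compile(r"10\.17605/osf\.io/\w+|\bosf\.io/\w+", re.IGNORECASE),
}


@dataclass
class Article:
    """Hypothetical minimal article record used only in this example."""
    publication_year: int
    data_availability_statement: str


def find_repository_links(statement: str) -> list[tuple[str, str]]:
    """Return (repository, identifier) pairs found in a Data Availability Statement."""
    links = []
    for name, pattern in REPOSITORY_PATTERNS.items():
        for match in pattern.finditer(statement):
            links.append((name, match.group(0)))
    return links


def is_eligible(article: Article) -> bool:
    """Apply the three scope rules: year, one link, supported repository."""
    # Rule 1: published in 2016 or later (inclusive bound is an assumption).
    if article.publication_year < 2016:
        return False
    # Rules 2 and 3: exactly one link, and it matches a supported repository.
    links = find_repository_links(article.data_availability_statement)
    return len(links) == 1
```

Because the check relies only on the publication year and the text of the Data Availability Statement, it can run without any input from authors or editors, which is the point of the automated approach described earlier.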

We have limited the scope for several reasons. First, by keeping the scope no larger than necessary to test our hypotheses, we can experiment faster and more cheaply. Second, we must be able to measure whether we are succeeding in increasing access to data. Dryad, Figshare, and OSF provide readily available usage data on deposited datasets, which will help us monitor any correlation between reader engagement with the links and dataset usage (we chose usage of, and engagement with, datasets as a more readily available proxy – and a prerequisite – for data reuse). Third, these repositories are relatively well used by PLOS authors, providing a sufficient cohort of articles to experiment with.

There are many other repositories used by PLOS authors, which are equally valuable to the researchers, communities and institutions they support. If the solution achieves a measurable increase in data usage, we will expand it in the future to handle more complexity – more data repositories, more unique identifier types, and more complex data availability statements. There are also other potential future directions to explore, such as integrating data linking with data citations to increase credit for data sharing; with data discovery tools; and with an open infrastructure for sharing and reusing information about the links between data and articles.

Looking for the “what” and the “why”

In addition to monitoring usage of the feature and of the linked datasets, we are conducting research to help us understand why the solution is working, or not. Readers of PLOS articles who use the feature will be invited to participate in a survey, and a subset will be invited for interviews. Authors submitting manuscripts after the feature’s launch will be asked whether the feature had an impact on how they shared data. This research will complement the research we are also conducting with users of the PLOS Pathogens integration with Dryad. As experimenters, we are prepared to challenge our ideas and designs when analyzing the results, which will, of course, be shared with the community in the future.

*If readers notice any issues with the Accessible Data feature, please let us know by emailing [email protected]

Amanda J. Marsh