The second in this week’s series of five blog posts focuses on persistent identifiers (PIDs) for projects. Like the other posts, it is based on the key findings of a recent focus group that explored how project IDs could be integrated into the research ecosystem more effectively via the Jisc PID strategy.
Project IDs stood out among the five priority PIDs we explored: many of the stakeholders involved in the discussion had only a passing acquaintance with them. Consequently, much of the workshop and focus group work involved discussing and coming to some agreement on a working definition of “projects”. What emerged was an understanding that, as well as covering more “standard” research projects, which may involve grants, working groups, experiments, data and publications, the term “projects” in this case also includes practice-based research, exhibitions, performances, and building projects.
In all of these contexts, and for the purposes of research integrity, project management, career reporting, impact assessment, and other critical administrative and analytical functions, understanding the ‘envelope’ can be as important as understanding any of the individual entities contained within it. So, to identify a project, we need a PID that can capture multiple entities and the relationships between them as a ‘compound object’.
The Research Activity Identifier (RAiD)
The Research Activity Identifier (RAiD) is just such a compound object identifier. RAiD was established in 2017 to serve as an actionable research activity monitoring record. It was envisaged as an identifier (a Handle) with associated metadata, which would capture the individuals, organisations, funding grants, equipment, and other research entities linked to the creation, curation, and preservation of a dataset.
It quickly became clear that RAiD could operate very well as a project identifier, since capturing the entities, roles, and relationships around a uniquely identified activity profile can both reflect and complement traditional project management tools. In the context of open research, a RAiD also provides a record of the context for, and contributors to, each output, which is vital for research integrity and reproducibility.
RAiD is managed by the Australian Research Data Commons (ARDC), and is in active use in seven Australian organisations at the time of writing, with a further 24 having access to the system but no live integration as yet. So far, 5,366 live RAiDs have been minted. RAiD records hold associations with a range of other identifier systems, including ORCID iDs for people; ROR, ISNI, and GRID IDs for organisations; DOIs for datasets and articles; and Handles for instruments. The RAiD team are in discussions with partners in the USA and the Netherlands who are planning pilot RAiD projects.
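To make the idea of a ‘compound object’ more concrete, here is a minimal Python sketch of a project record that links to other identifiers. It is an illustration only and does not reflect the actual RAiD metadata schema or API: the class names, field names, role labels, and all identifier values are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class LinkedEntity:
    """One entity associated with a project, identified by its own PID."""
    pid_scheme: str  # e.g. "ORCID", "ROR", "DOI", "Handle"
    pid: str         # placeholder values only in this sketch
    role: str        # hypothetical role label, e.g. "contributor"


@dataclass
class ProjectRecord:
    """Hypothetical compound object: a project PID plus its linked entities."""
    project_pid: str
    title: str
    entities: List[LinkedEntity] = field(default_factory=list)

    def link(self, pid_scheme: str, pid: str, role: str) -> None:
        """Associate another identified entity with this project."""
        self.entities.append(LinkedEntity(pid_scheme, pid, role))

    def by_scheme(self, pid_scheme: str) -> List[LinkedEntity]:
        """Return all linked entities that use the given identifier scheme."""
        return [e for e in self.entities if e.pid_scheme == pid_scheme]


# All identifiers below are made-up placeholders, not real PIDs.
record = ProjectRecord(project_pid="example-handle/0001", title="Example project")
record.link("ORCID", "0000-0000-0000-0000", "contributor")
record.link("ROR", "example-ror-id", "lead organisation")
record.link("DOI", "10.0000/example-dataset", "output")
record.link("Handle", "example-handle/instrument-1", "instrument")
```

Calling `record.by_scheme("ORCID")` on this record returns the single linked person. The point of the sketch is that the project identifier acts as the ‘envelope’, while each person, organisation, output, and instrument keeps its own PID.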
RAiD is in the process of being certified with ISO as an official international standard; the standard should be published by May 2021.
The Project IDs focus group consisted of university librarians, PID experts, research managers, and infrastructure providers. As noted, very few felt they had a high level of knowledge about this type of PID, so some time was spent investigating and clarifying its rationale. The group also discussed issues of governance, sustainability, and research(er) evaluation.
As in the discussions about other PIDs, the group agreed that there is a need for clarity on which sorts of organisation would be required to govern project PIDs. In addition, given the multi-stakeholder nature of projects, specific concerns were raised over who controls the metadata associated with a given project. What would the boundaries be? How would control be passed on over time? There were also questions about how to scale.
It was generally agreed that mandating specific policies, such as the compulsory use of PIDs in certain circumstances, would probably be less effective than developing a global policy, concordat, or best practice that institutions, organisations, and even individuals could sign up to (something like the FAIR Data Commitment Statement was suggested). This approach would then frame emerging practices as more than ‘mere’ administration and present a better opportunity for academic and research manager buy-in.
One participant suggested that a funder mandate could be accompanied by a call for other stakeholders, such as tool-makers, to engage alongside funders on implementing project IDs. This point has been taken forward in the design of the use cases below, especially with respect to the practice-based research case.
More work needs to be done to clarify what project IDs represent, as well as their potential usefulness. Ultimately, they could cut down administrative burden and build trust in a greater range of research outputs by linking whole projects with their connected outputs, people, and grants. Even more powerfully, the full provenance would remain clear in the longer term, vastly increasing the ability to track impact and engagement over time and vastly reducing the risk of broken links and orphan outputs.
Next-step projects to build the use case are likely to include working with the Practice Research Advisory Group (PRAG-UK) and Westminster University (Haplo) on using RAiD to build portfolios for practice-based researchers; with the Science and Technology Facilities Council (STFC) on using RAiD to track the impact of projects across infrastructure; and with the UK Data Service (UKDS) on impact and engagement. A technical phase is then planned, which may involve Oxford University’s Bodleian Digital Libraries (Data Futures on Annotations), the University of Manchester (integration in RDM / DMP workflow), and Birkbeck University/CoSector (on repository integration). (1)
1. CoSector is a digital services provider within the University of London.