Institutions up and down the UK are thinking hard about how they can capture their academics’ research outputs in their repositories at the time of acceptance. They need to do this in order to comply with the policy on open access (OA) and the Research Excellence Framework (REF).
How can Jisc help? In this post Steve Byford looks at one initiative that is designed to address this.
Back in June, my colleague Neil Jacobs set out some of the steps we’d been taking. He highlighted the Jisc Publications Router (formerly Repository Junction Broker), a service that could at least partly automate this process. This post is an update on what has happened since then.
The policy places responsibility to deposit firmly with the author, and we mustn’t lose sight of that. Any service that we offer is intended to help authors and their institutions to fulfil their responsibility – not to change it.
To address the challenge of automated delivery to repositories, we need to know:
- What institutions need in order to make this work for them
- What publishers, for example, are willing and able to provide to help with this
- What is realistically possible, taking into account technical and other practical considerations
Answers to these questions are helping us work out what should be prioritised in order to provide the most useful support within a realisable timescale.
Capturing the content at the institution
To help populate their repositories, especially at acceptance stage, what information do institutions need? How would they like to receive it? What would be ideal or, failing that, useful?
These were some of the questions we asked in a survey during July of members of ARMA, UKCoRR, RLUK and SCONUL, with the kind collaboration of these organisations.
The respondents had a wide range of views on how best to receive information. We asked them about three options: having it pushed to their systems (using SWORD, for example), pulling it using an API, or having it sent to them as an email attachment. The first two were preferred, but there was no agreement about which of them would be best.
About a third of them were uncertain, perhaps because they lacked the technical expertise to judge. We may well ask those who were open to follow-up to consider further questions along these lines jointly with their technical colleagues.
The majority considered these options at least workable, so the Router should perhaps aim to offer a combination of approaches.
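To make the idea of a combination of approaches concrete, here is a rough sketch in Python. It is an illustration only, not the Router's design: the class, method and field names are all hypothetical, and real push delivery (via SWORD, say) is stood in for by a plain callback. Each institution registers the method it prefers, and records are then pushed, queued for pulling, or set aside for email accordingly.

```python
from collections import defaultdict


class DeliverySketch:
    """Toy model of per-institution delivery preferences (all names hypothetical)."""

    METHODS = ("push", "pull", "email")

    def __init__(self):
        self.preferences = {}                 # institution -> chosen method
        self.push_handlers = {}               # institution -> callable taking a record
        self.pull_queues = defaultdict(list)  # institution -> records awaiting pull
        self.email_outbox = []                # (institution, record) pairs to be emailed

    def register(self, institution, method, push_handler=None):
        """Record how an institution wants to receive its records."""
        if method not in self.METHODS:
            raise ValueError(f"unknown delivery method: {method}")
        if method == "push" and push_handler is None:
            raise ValueError("push delivery needs a handler")
        self.preferences[institution] = method
        if push_handler is not None:
            self.push_handlers[institution] = push_handler

    def deliver(self, institution, record):
        """Route one record by the institution's preference; return the method used."""
        method = self.preferences[institution]
        if method == "push":
            self.push_handlers[institution](record)
        elif method == "pull":
            self.pull_queues[institution].append(record)
        else:
            self.email_outbox.append((institution, record))
        return method

    def fetch(self, institution):
        """What a pull-style API call might return: the queued records, then cleared."""
        records = self.pull_queues[institution]
        self.pull_queues[institution] = []
        return records
```

An institution that prefers pushing would register a handler that hands the record to its repository, while one that prefers pulling would call `fetch` on its own schedule. The point of the sketch is only that the three options can sit behind one delivery step.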
Dealing with duplicates
We also explored these questions with repository managers, library colleagues and others who attended the Repository Fringe conference in Edinburgh at the end of July. We held a workshop session on the Router led by Muriel Mewissen of the project team at EDINA.
This highlighted the question of de-duplication. As the REF policy requires capture at acceptance stage, when the metadata will necessarily be incomplete, it follows that the Router will need to send a subsequent update upon official publication, if the full citation and licensing details are to be captured. Will university systems (repositories and CRISs) be able to match that update to the original record?
But this is not a new problem created by the Router – still less by the REF OA policy: many institutions capture metadata or content from a variety of overlapping sources already. That is likely to continue, and already presents a challenge for them.
As a result, we have concluded that de-duplication is best handled locally, at the institutional level. The Router’s role will be to try to present its records in a way that makes de-duplication as painless as possible. The team will be listening carefully to the institutions who have signed up to its services to make sure we learn what works best for them.
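As an illustration of what local de-duplication might involve, here is a minimal Python sketch of matching a publication-stage update to an acceptance-stage record. It rests on assumptions of ours: records are plain dictionaries, the field names (`doi`, `title` and so on) are hypothetical, and the matching rule (compare DOIs when both records have one, otherwise compare normalised titles) is a simple heuristic rather than anything the Router or a particular repository system prescribes.

```python
import re
import unicodedata


def normalise_title(title):
    """Lower-case, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", title or "")
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def same_article(a, b):
    """Guess whether two records describe the same article."""
    doi_a, doi_b = a.get("doi"), b.get("doi")
    if doi_a and doi_b:
        # Both records carry a DOI: trust it over the title.
        return doi_a.strip().lower() == doi_b.strip().lower()
    # Otherwise fall back to the title; an empty title never matches.
    title_a = normalise_title(a.get("title"))
    title_b = normalise_title(b.get("title"))
    return title_a != "" and title_a == title_b


def merge_update(existing, update):
    """Overlay non-empty fields from the update onto the existing record."""
    merged = dict(existing)
    for key, value in update.items():
        if value not in (None, "", []):
            merged[key] = value
    return merged


# Hypothetical example: a record captured at acceptance, then a later update.
accepted = {"title": "Open Access and the REF: A Study", "authors": ["A. Author"]}
published = {
    "title": "Open access and the REF: a study.",
    "doi": "10.1234/example",  # made-up DOI for illustration
    "volume": "12",
}

if same_article(accepted, published):
    accepted = merge_update(accepted, published)
```

Title matching of this kind will miss retitled articles and can confuse papers with identical titles, which is one reason the decision is better made locally, where staff can review uncertain matches.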
As part of a package of measures to support repositories, we will be working with the Repository Support Project, which is based at the University of Nottingham, to provide a workshop on de-duplication. It will be one of a series focusing on technical issues, building on our co-ordination with the vendors of the systems that institutions use. The dates will be announced soon.
What can publishers provide?
Would publishers be willing to provide this information for us? Could they provide it at acceptance stage?
We have been discussing this with a range of publishers, including both some that are predominantly subscription-based and some that are wholly open access.
The good news is that nearly all of them have said that they are open in principle to providing something at acceptance stage, or soon afterwards, for at least some of their content. This will depend on the detailed technical considerations. They use a range of different workflows, and these present some substantial technical challenges, both for the publishers and for the Router.
Several of them also indicated that they are still deliberating on the policy implications, and that may affect the extent to which they feel able to engage with it. On the whole, though, the response to the ‘in-principle’ discussions was positive, and even enthusiastic in some cases.
Between them, these publishers represent a range of content from around the world and across the academic spectrum, so this is encouraging. We will be following up with more substantial technical discussions and proposals soon.
Priorities – what is most useful and achievable?
The Router predates the policy on OA and the REF. Its initial priority was to try to capture the full text, so content that was available under Gold OA terms seemed the low-hanging fruit. As a result, the Router entered into a relationship with Europe PubMed Central, for example.
The REF policy was a major game-changer, however, and has stimulated a substantial rethink of the Router’s priorities and direction. It would be unrealistic to imagine that the Router could provide a single overall solution to satisfy the policy’s requirements. But how can we help maximise repositories’ chances of capturing as much content as possible in time for the start of the policy in April 2016?
We hope that content providers such as publishers will be willing and able to supply full-text authors’ accepted manuscripts upon acceptance. But feedback, from the Repository Fringe workshop for instance, suggests that institutions would find even metadata-only records helpful upon acceptance, since these would alert them to contact the author for the full-text manuscript.
(The REF policy will provide a major incentive for researchers to deposit their work onto repositories, and this ‘notification’ approach could help institutions to engage them.)
We will take this into account as we aim to broaden the range of providers both of full-text and of metadata-only records. While we will still aim to transmit full-text content, such as that supplied by Europe PubMed Central, we will also aim to expand our sources to enable us to deliver usable records of as many relevant articles as we can.
This means that we are looking at sources that cover metadata across multiple publishers, as well as the publishers themselves. We are investigating a number of such sources, and we are considering the SHARE and CHORUS initiatives in the USA to see whether there are opportunities or lessons to be learned in supporting requirements in the UK.
If publishers can realistically let us have full text at or soon after acceptance as well, then that will be even better: it’s something we’ll continue to explore.