Ouvrir la Science

Open identifiers for open science
2019
Committee notes
Many identifiers coexist, and they are essential for distinguishing and finding scientific objects on the Internet. Good practice guides and action plans are being developed at the international level. The Open Science Committee is defining a strategy to develop and adopt these identifiers with openness as the objective.

A concerted action for the benefit of researchers and institutions

English translation by Jean-François Nominé, Translation Unit, Inist-CNRS.

IdHal, OrcID, WosId, ArXivID, DOI, ISSN, ISBN, Handle, IdRef, VIAF, ISNI, etc.

Why so many identifiers? For whom? What for?

Almost all scientific output is now indexed or available on the web. Millions of scientific objects (publications, data and other digital objects), produced by as many authors or contributors affiliated to hundreds of thousands of organizations, can now be found through numerous and varied permanent identification systems.

These systems have been developed in recent years to meet the specific needs of each community or usage, but most are still being consolidated or are even still under development.

The best known projects today are conducted by non-profit organisations but supported by private funds.

To achieve the goals of open science, i.e. to ensure the long-term availability of scientific information that is free for everyone, identifiers must be based on an open, documented, free architecture, and they must be endorsed and driven by and for the scientific communities.

The establishment of the Committee for Open Science paved the way for discussions led by French higher education and research operators aimed at improving the structuring of the most useful identifiers, accelerating their adoption by communities, and making them more open and sustainable.

What’s in an identifier?

An identifier is an opaque or explicit number or alphanumeric label that is machine or human readable. It uniquely and permanently identifies an object, a document, a person, a place, an organization or any other entity, in the real world and on the Internet, and makes it possible to retrieve it.

Currently, the best-known permanent identifiers or “PIDs” (Permanent IDentifiers) are DOIs (Digital Object Identifiers, https://doi.org/) for articles, book chapters, general documents or data sets, ORCID iDs (Open Researcher and Contributor IDs, https://orcid.org/) for authors of publications, and ISSNs (International Standard Serial Numbers, https://issn.org) for journals. Handles (https://www.handle.net/) are widespread in some scholarly communities.
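Some of these identifiers end with a check character, so a mistyped value can be caught before any lookup is attempted. The sketch below is purely illustrative and is not part of the Committee's note: it assumes the ISO 7064 MOD 11-2 scheme that ORCID documents for its iDs and the weighted modulo-11 scheme used for ISSNs. The function names are hypothetical.

```python
import re


def orcid_check_digit(base_digits: str) -> str:
    """Check character for the first 15 digits of an ORCID iD (ISO 7064 MOD 11-2)."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Accept 'dddd-dddd-dddd-dddc' (hyphens optional) and verify the check character."""
    compact = orcid.replace("-", "").upper()
    if not re.fullmatch(r"\d{15}[\dX]", compact):
        return False
    return orcid_check_digit(compact[:15]) == compact[15]


def issn_check_digit(base_digits: str) -> str:
    """Check character for the first 7 digits of an ISSN (weights 8 down to 2, modulo 11)."""
    total = sum(int(ch) * weight for ch, weight in zip(base_digits, range(8, 1, -1)))
    result = (11 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_issn(issn: str) -> bool:
    """Accept 'dddd-dddc' (hyphen optional) and verify the check character."""
    compact = issn.replace("-", "").upper()
    if not re.fullmatch(r"\d{7}[\dX]", compact):
        return False
    return issn_check_digit(compact[:7]) == compact[7]
```

A check of this kind only confirms that a value is well formed. It says nothing about whether the identifier has actually been assigned, which still requires querying the relevant registry.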

For researchers and producers of scientific output, several identification systems are available, whose use varies according to countries or communities (ORCID, ScopusID, WosID, IdHal, ArXivID, etc.). Thanks to the work of information specialists, these identifiers are already rather well connected (or aligned) to one another compared to other types of identifiers, through large repositories such as IdRef (https://idref.fr) in France or ISNI (https://isni.org) on an international scale.

However, to date there is no standard international identifier for organizations or affiliations. These entities are listed in national and international registries (RNSR and AURéHAL structures in France, or ROR, GRID, Ringgold, etc. globally) that are still imperfectly updated and connected, even though the reference systems used for alignments (IdRef, ISNI) are improving the situation day by day.

Despite these limitations, every day new research communities, institutions, and more and more individual researchers can be seen to adopt identifiers to facilitate the retrieval of their publications and data, on their own initiative or at the request of publishers or funders, and increasingly to ease open access to research results.

Identifiers wanted in research

The development of open science and the FAIR principles for findable, accessible, interoperable and reusable research outputs go hand in hand. Each of these principles is broken down into several requirements. The first requirement of the first principle specifically provides for the assignment of a unique identifier to the data (“F1. (meta)data are assigned a globally unique and eternally persistent identifier”). The role of unique identification should therefore not be underestimated.

The European Union supports transnational projects aimed at developing the coordinated use of identifiers by researchers and institutions in member countries. Freya, "Connected open identifiers for discovery, access, and use of research resources" (https://www.project-freya.eu), is one of these projects, and France is represented in it through the Freya “ambassadors” programme. Freya seems to have an important role to play in defining the use of identifiers in the European Open Science Cloud (https://www.eosc-portal.eu), a European platform that will federate tools enabling researchers to store, manage, analyze and reuse large amounts of research data.

Since 2012, the Wikidata database, created by the Wikimedia Foundation, which supports Wikipedia, has gradually become the global hub for open identifiers. The main identification systems described above are aligned with Wikidata or upload their own data to it.
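As an illustration of what this alignment allows, the sketch below builds a SPARQL query that, starting from an ORCID iD, asks Wikidata for the other person identifiers recorded on the same item. It is an assumption-laden example rather than a recommended method: the property numbers (P496 for ORCID iD, P213 for ISNI, P214 for VIAF, P269 for IdRef) are Wikidata's own and do not appear in this note, the function name is hypothetical, and the code only builds the query text without sending it.

```python
import re

# Wikidata properties for the person identifiers mentioned in this note
# (assumed from Wikidata's property pages, not taken from the note itself).
ALIGNED_PROPERTIES = {
    "isni": "P213",
    "viaf": "P214",
    "idref": "P269",
}
ORCID_PROPERTY = "P496"


def build_alignment_query(orcid: str) -> str:
    """Return a SPARQL query listing the identifiers aligned with an ORCID iD."""
    # Reject anything that is not shaped like an ORCID iD before
    # interpolating it into the query text.
    if not re.fullmatch(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]", orcid):
        raise ValueError(f"not an ORCID iD: {orcid!r}")
    variables = " ".join(f"?{name}" for name in ALIGNED_PROPERTIES)
    optional_lines = "\n".join(
        f"  OPTIONAL {{ ?person wdt:{prop} ?{name} . }}"
        for name, prop in ALIGNED_PROPERTIES.items()
    )
    return (
        f"SELECT ?person {variables} WHERE {{\n"
        f'  ?person wdt:{ORCID_PROPERTY} "{orcid}" .\n'
        f"{optional_lines}\n"
        f"}}"
    )
```

The resulting text can be pasted into the Wikidata Query Service, where the `wdt:` prefix is predefined. Each identifier is requested in an OPTIONAL clause because an item may carry only some of them.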

What can be done at the national level?

The national and international context provides an enabling environment for concerted action on identifiers for open science:

  • International discussions on useful identifiers for research (mainly those relating to people, organisations, publications, data) have led communities to work out best practice guidelines or even action plans.
  • In France, the national plan for open science, in a demonstration of strong political engagement, has prompted a concerted movement towards French membership in ORCID, support for Open Citations based on DOIs, and endorsement of the FAIR principles, which imply adopting unique identifiers. The plan also provided for the creation of the Open Science Committee, which can now both draw up operational objectives and coordinate a national action plan involving field actors.
  • The experience gained in developing identifiers for publications (DOIs) and research data, the still partial progress made in the adoption of person identifiers (ORCID and others), and the trial-and-error process in the identification of organizations have shown the importance of political action in this area. Field actors have gained considerable hindsight on the strengths and limitations of PIDs and registries, and they want new services that make better use of them.

The year 2019 marks the launch of a national coordinated action for the development of sustainable identifiers for open science in France which will lead to the implementation of a real blueprint for action.

Considering the need for and relevance of adopting and developing international ID systems, including for people, organisations, publications and research data, the Open Science Committee has defined several strategic goals:

    1. set up a coordinated national strategy which is very strongly in line with the international landscape, especially to make French research more visible,
    2. accelerate the adoption of identifiers by researchers, laboratories and institutions, and by the digital service providers of the innovation and higher education and research systems,
    3. identify secure business models to ensure sustainable deployment of identifiers,
    4. improve the interoperability and standardization of identifiers while ensuring control by the scientific community,
    5. contribute to the management and development of identification systems to secure long-term openness and independence.

Practically, the aims of this coordinated national action are to

  • foster the development of useful services based on real-life use cases to demonstrate the advantages of PIDs
  • channel bespoke communication toward researchers to guide them in their daily use of identifiers

Given the needs identified and the level of maturity of the solutions, the strategy is divided into four parallel action lines, to be implemented in order of urgency in the following fields:

  • Entity and organization identifiers, based on the National Directory of Research Structures (RNSR)
  • Researcher identifiers, based on national membership in ORCID
  • Publication identifiers, by expanding the use of DOIs
  • Data, software and digital object identifiers, based on the input of the RDA France node, the CoSo colleges and their working groups.

For more information: Des identifiants ouverts pour la science ouverte : note d’orientation.