The Open Science Research Data Awards
For the first time this year, the Ministry of Higher Education and Research is presenting the Open Science Research Data Awards. These awards highlight projects, teams and young researchers committed to best practices in the management, dissemination and reuse of research data.
The awards are part of the second National Open Science Plan and reward young researchers, projects and research teams working on the management, description, dissemination and reuse of research data. A jury of experts chaired by Anne Laurent (Data Science Institute, University of Montpellier) selected the winners.
There are three categories of awards:
- The “reuse of data” category rewards young researchers who use already available data in their research projects and make the data they produce themselves reusable;
- The “creating the right conditions for reuse” category rewards research teams carrying out exemplary work in managing research data to make it reusable;
- The jury’s special prize category rewards ideas and projects that are exemplary in terms of opening or sharing data.
The award winners receive a trophy designed by Alix Nadeau, Rose Vidal, Hugo Bijaoui and Lorris Sahli, students at the Ecole des Arts Décoratifs in Paris, and inspired by open science's values of sharing and the common good. Each trophy has a unique shape that refers to the description of the winning project and is generated by open source code.
The “reuse of data” category – young researchers’ prize
- Victor GAY is currently an assistant professor and research fellow at Toulouse 1 Capitole University and defended his PhD thesis in economics at the University of Chicago in 2018. His project entitled “TRF-GIS: Un Système d’Information Géographique de la France de la Troisième République (1870-1940)” (TRF-GIS: A geographic information system for France under the Third Republic) traces the annual evolution of maps and statistics within the administrative structures of this period. The data resulting from his work have been published and have been the subject of data papers. Victor GAY is also the Scientific Advisor to the University Data Platform of Toulouse (PUD-T) of the PROGEDO data infrastructure.
- Naomi TRUAN is a post-doctoral fellow and linguistics researcher at the University of Leipzig working in collaboration with the Sorbonne University Linguistics Centre in Paris. Her project “FR-PARL, DE-PARL, UK-PARL” is based on a corpus of transcripts of French, German and British parliamentary debates. The data have been processed and enriched with linguistic and socio-political annotations, which supports richer analysis and makes them easily reusable. The data are available on the ortolang.fr website, have already been reused and are indexed by CLARIN (Common Language Resources and Technology Infrastructure), the dedicated European infrastructure for language resources.
The “creating the right conditions for reuse” category
- The ‘EMM Survey Registry’ is a social sciences project. The Ethnic and Migrant Minority Survey (EMM) Registry is a free online tool for searching and characterising quantitative surveys of ethnic and migrant minorities run in 34 European countries. The project involves a team of 12 people at the Centre for European Studies and Comparative Politics (Sciences Po/CNRS) and the National Institute for Demographic Studies (INED). It received funding from COST Action 16111 – EthmigSurveyData, the Horizon 2020 SSHOC project and the ANR Open Science project FAIRETHMIGQUANT. The project was submitted by Laura Morales (University Professor at Sciences Po Paris).
- The ‘NORINE nonribosomal peptides database and analysis tools’ project is in the bioinformatics field. The project provides a knowledge base and software platform dedicated to non-ribosomal peptides (NRPs) for the biology, biochemistry, phytochemistry, pharmacology and marine biology communities. NRPs are produced by bacteria and fungi and can be used as antibiotics, anti-rejection drugs for transplantation, anti-inflammatory drugs or anti-cancer drugs, with penicillin being one of the best known examples. All data from the project are freely available. The project involves a team of 7 people (academics, research engineers, document specialists) from the CNRS CRIStAL joint research unit (University of Lille, CNRS, Centrale Lille), the Charles Viollette Regional Research Institute, the BioEcoAgro joint research unit (INRAE, University of Lille, University of Liège, University of Picardie), the BiLille platform, the UMS PLBS (University of Lille, CNRS, Inserm, Pasteur Lille, CHU Lille), and Inria. The project was submitted by Areski Flissi of CRIStAL (CNRS/University of Lille).
- ‘MOBILISCOPE – interactive maps and graphs for the geovisualisation of population variations’ is a geography project. Mobiliscope, the ‘all day towns project’, provides a geovisualisation tool based on data from major public surveys. Interactive maps and graphs show population variations in 10,000 French cities hour by hour throughout the day. A long process of data anonymisation and processing of several data sources was required to achieve this. The project involves a team of 6 people led by Julie Vallée of the Géographie-Cités joint research unit (CNRS, Panthéon-Sorbonne University, Paris Cité University and the School of Advanced Studies in the Social Sciences (EHESS)).
The “jury’s special prize” category
- The ‘French Archaeological Mission of Amathonte’, an archaeological excavation mission on the island of Cyprus, is a history project. The project enabled the publication of data on the archaeological sites discovered during field surveys of the Amathonte territory between 1988 and 1992. It relies on Huma-Num, the very large research infrastructure for the digital humanities, and uses Nakala, its data sharing, publication and dissemination tool. It integrates a vocabulary linked to the standards of the discipline, in compliance with the FAIR principles, which aim to make data findable, accessible, interoperable and reusable. The project involves a team of 4 people led by Anna Cannavò from the HiSoMA Laboratory (CNRS).
- The ‘MouseTube – mouse vocalisation recording files’ project contributes to the study of animal behaviour through its website dedicated to sharing mouse vocalisation recordings for research into these animals’ ultrasonic communication. Sharing these data, enriched with metadata, enables its 650 user laboratories to work effectively despite differing practices, while also reducing the number of mice used in experiments. The project was presented by Nicolas Torquet from the Institute of Genetics and Molecular and Cellular Biology (CNRS, INSERM, University of Strasbourg).
- ‘YAGO – A High-Quality Knowledge Base’ is a computer science project that has been underway for 15 years. It is one of the pioneers of the semantic Web and is viewed as a reference in the field. Its researchers have constructed a knowledge base built from Wikipedia and Wikidata which can be used by many tools to provide knowledge about the real world, particularly in the context of artificial intelligence applications. Methods have been integrated to process and clean the input data and thus provide a high-quality, highly interoperable service that is a core component of many Linked Open Data projects. The project involves a team of 3 people from Telecom Paris and the Max Planck Institute for Informatics in Germany led by Fabian Suchanek (Telecom Paris).
ouvrirlascience. (2022, July 8). Manufacturing of the Open Science Research Data Awards 2022 trophy [Video]. Canal-U. https://www.canal-u.tv/131738