Research software as a pillar of open science
In February 2022, the Committee for Open Science created the Source Code and Software College out of the “Free and Open Source Software” group, recognizing that software deserves the same consideration as publications and data, to which it is closely linked in the research process.
The college is currently divided into five working groups, each focusing on the complexity of software practices in research from a different angle:
- Identifying and highlighting software production in higher education and research;
- Tools and technical and social best practices;
- Valorization and sustainability;
- Liaison and community engagement at the national, European, and international levels; and
- Recognition and careers.
Software affects all research fields
Software is an essential part of research processes in practically all research fields, including mathematics, physics, chemistry, biology, humanities and social sciences and, of course, informatics. Depending on whether the program is a tool, a result, or an object of study, the approach may vary from one community to another, but its role remains crucial.
Yet software development in research is still largely undervalued and underappreciated. The extent of software production in research, both in quantitative and qualitative terms, is extremely difficult to evaluate.
One of the main goals of the Source Code and Software College is to highlight these productions and their importance for research.
Free and open source software has always had strong connections with the scientific world. A lot of software developed in research laboratories is distributed under free software licenses.
The free and open source approach is intrinsically intertwined with the need for openness in science and the associated digital commons, in order to ensure reproducibility but also to encourage contributions and the emergence of community software.
Identifying the software research object is crucial
To build and maintain a national catalog of research software, which is one of the tasks of the Source Code and Software College, it is necessary to clearly define the object in question. The college therefore focused on developing a standard definition of research software. Following discussions with relevant communities, the college came up with the following definition:
“Research software is developed to meet specific scientific needs. It is designed, maintained, and used by scientists (researchers and engineers) and research institutions, possibly on an international scale. It can result from research work as well as support it, notably through publications before/on/around/with the software. It can be formalized in different ways (a platform, a middleware, a workflow or a library, a module or plugin of another software) and thus be in interaction in an ecosystem or on the contrary be more autonomous.”
This definition will help define the scope of the catalog. Other criteria must be identified to ensure that all the situations encountered in the research world are covered. In particular, the context of development can vary greatly, from software created as part of a thesis, to scripts used to analyze data and prototypes, up to community codes.
Finally, the catalog can only be relevant if the information collected is adequate. Finding the most relevant metadata to describe a software item is yet another issue.
The college intends to rely on existing initiatives, in particular the Software Heritage and HAL platforms, whose teams have already invested considerable time and effort in the subject.
Supporting the communities
The referencing of software developed in research laboratories should not be limited to cataloging. It will only make sense if the codes are truly reusable, either for purposes of reproducibility or for use in different contexts.
It is therefore important, in parallel with this referencing, to proactively encourage good practices that facilitate contributions, use, and sustainability, and to strengthen support on legal issues.
A long-standing and well-established dynamic
The process of recognizing software as an essential element of scientific production is well underway, building on a long history of sharing and strong ties with community practices developed in the context of free software. All the dimensions of the complexity of this object will be taken into account in the future work of the Source Code and Software College, which will benefit from the feedback of the whole scientific community.