This report focuses on software forges used in French Higher Education and Research Establishments. It aims to provide an initial overview of these forges and to identify ways of making open science software more visible.

Higher Education and Research Forges in France – Definition, uses, limitations encountered and needs analysis

Daniel Le Berre (University of Artois/CNRS French National Scientific Research Centre)
Jean-Yves Jeannas (Lille University/AFUL French Association of Free Software Users)
Roberto Di Cosmo (Inria French National Institute for Research in Digital Science and Technology/Paris-Cité University)
François Pellegrini (Bordeaux University/CNRS/CNIL French Data Protection Authority)

May 2023

Read the report on HAL

The first software forge, SourceForge, was launched in 1999 and was designed to help open-source software developers build their software collaboratively and distribute it to their users. Since then, software forges have become vital tools for all software developers. They offer collaborative development tools (for tracking code modifications and managing user tickets, contributions and projects), tools that industrialise the process of building software from its source code (compilation, automated tests, quality assurance and distribution of deliverables), and communication tools such as forums.

Software forges also act as social networks for developers. Whenever developers want to encourage people to use and contribute to their software, they need to decide which forge to choose based on the target audience and network. Targeting Higher Education and Research developers in France or abroad is one potential option. A number of identity federations, such as RENATER or eduGAIN, have been providing long-term support for these collaborations, and a number of Higher Education and Research forges provide access to these collaboration networks. Should a developer wish to open and share source code coming from research with wider society, there are two alternatives available to them – open-source community forges or commercial forges. Open-source community forges can be used to distribute open-source software within a community which has co-opted it. The challenge here lies in finding the right community for the software under development. Commercial forges boast many features with very few constraints, and often offer a range of services when the developed software is distributed under an open-source licence. These commercial forges include GitHub (owned by Microsoft), which is the most widely used, followed by Bitbucket (owned by Atlassian) and GitLab (owned by GitLab Inc.).

Some forges, be they community-based or commercial, such as GitLab, can be self-hosted by Higher Education and Research Institutions, some of which have their own public forge. This report lists 40 forges of this type, as well as the forges for internal use only. These self-hosted forges are often easy to install, ranging from a simple executable for solutions such as Gogs, Gitea and Forgejo to a preconfigured software package integrated into a Linux distribution for GitLab, for example. GitLab is basically a commercial forge (gitlab.com) based on open-source forge software that can be installed on premises. GitLab Inc.’s financial model is based on selling licences for additional features to be used by online-service users or self-hosted forge administrators.

In reality, installing a self-hosted forge for internal collaborative development requires few human or material resources, and a wide range of solutions is available. However, as soon as developers want to take this collaborative development outside the institution, integrate solutions to industrialise software development and implement good development practice, more substantial efforts are needed, and the choice of solution may be guided by different criteria such as the platform’s popularity, its functionalities and its robustness.

In Higher Education and Research, developers of supporting software and software based on research work can choose between a number of forges to host their software. The simplest solution is when their institution has its own forge, particularly if no interaction is needed outside the institution.

When wider interaction is required, communities developing research software often look to online commercial forges. This is reflected in the winners of the first French open science award for open-source research software, with nine projects hosted on GitHub and one on SourceForge. The social network effect of “people go where most people are” and the international scope of the projects were the reasons for their selection. However, it should be noted that commercial forges can suddenly disappear, as was the case with the Google forge, Google Code, which was shut down in a matter of months after nine years in operation. The same thing happened with the Gitorious hosting solution. In addition, these forges have terms and conditions of use which each member must agree to as an individual, rather than on behalf of their institution.

Self-hosted forges are one way of mitigating this kind of problem. However, it may be the case that the solution selected is no longer being maintained, or no longer developed under an open-source licence. This is what happened with the SourceForge code: it was maintained in a community version under the name “GForge”, which in turn changed licences and was then maintained in a community version under the name “FusionForge”, which is itself now unmaintained (the latest version of the software dates back to 2018).

Therefore, decisions around self-hosting and which forge to use are important. Of the 40 forges listed, 38 are GitLab platforms (the other two forges use Tuleap and Gogs respectively). GitLab’s domination can be explained by how easy it is to install and maintain, and the wide range of functionalities which are available.

Hence the interest in having a specific Higher Education and Research forge operating at any level (institutional, national, European or international). Institutional forges are the answer when software is being developed internally within an institution and an institutional forge already exists. In this case, the functionalities available and access to data are controlled, but such forges offer little or no scope for development across multiple institutions. Where an institutional forge does not already exist or does not allow project owners to invite contributions from outside the institution, a national or European forge would provide an alternative to using commercial forges.

In the mid-2000s, a French national forge, SourceSup, was set up by RENATER (which manages the national electronic communications network for technology, education and research) in order to get around these restrictions on interaction. However, this forge, which was a state-of-the-art platform when it was created, now only offers a set of tools that have fallen behind current development standards.

This report provides a comprehensive picture of the existing forges and practices in Higher Education and Research in France, and posits a number of observations and points of concern as regards the current situation.

 

Table of contents

Overview

Background

1. Introduction

2. About the importance of forges

2.1. Monitoring changes to the source code 

2.2. Steering the entire life cycle of the software 

2.3. Enabling and encouraging collaborative work

2.4. Building software and analyzing its source code

2.5. Looking beyond the source code

2.6. Evolution of the target audience for forges

2.7. Open-source research software

2.8. Target audiences for forges within Higher Education and Research

3. The outlook for Higher Education and Research forges 

3.1. Major use of commercial forges

3.2. Using open-source community forges

3.3. The SourceSup national forge

3.4. 40 self-hosted public Higher Education and Research forges

3.5. Analysis of the situation 

3.5.1. Chronological history of the forges referred to in this document

3.5.2. Why are there so many self-hosted forges?

3.5.3. Difficulties in interacting with society

3.5.4. Support levels and the need for trust

3.5.5. An open-source or a proprietary license?

4. Points to consider when it comes to forges

4.1. A showcase platform or a simple tool?

4.2. Project organisation

4.3. Copyright management

4.4. Managing a project across a number of forges

4.5. More and more continuous integration services

5. An overview of the solutions

6. Conclusion