Relying on a dataset of more than one million articles published between 2013 and 2020, the study quantifies the costs generated for French research institutions by the open access publishing model based on authors paying article processing charges (APC). It also produces several scenarios for the evolution of these costs up to 2030.

Retrospective and prospective study of the evolution of APC costs and electronic subscriptions for French institutions

Antoine Blanchard (Datactivist)

Diane Thierry (Datactivist)

Maurits van der Graaf (Pleiade Management and Consultancy)

December 2022

A journal article dataset has been developed with metadata of articles by France-based authors in the period 2013-2020. The purpose of this dataset was to form a basis for the retrospective and prospective analyses of the total costs of APCs paid by French institutions. APCs (article processing charges) are the fees that researchers must pay to have their articles published in some open access journals.

The dataset has been built as follows:

  • BSO data as the basis: the basis of the dataset is formed by data from the French Open Science Barometer (called BSO as in Baromètre de la science ouverte), which we have defined for this study as being the ‘total universe of journal articles by France-based authors’. Compiled from the Unpaywall database and enriched with other sources, BSO focuses on publications with a DOI. It uses an algorithm to assess whether an author is affiliated with a French institution, and collects APC information using an algorithm based on the OpenAPC data.
  • enriching the BSO data with data from Web of Science and OpenAlex: for the assessment of APC costs for French institutions, information on the affiliation country of the corresponding author (who pays the APC), information on whether an APC has been paid, and the amount of APC are crucial elements. Therefore, we enriched the BSO data by adding information on the corresponding author derived from the Web of Science. Another enrichment with OpenAlex data took place in order to ascertain whether the article was Open Access as a result of an APC payment (APC-paid articles). The data were further enriched with data from Couperin and QOAM.
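The enrichment steps above amount to joining several sources on the article DOI. A minimal sketch, assuming hypothetical record layouts and field names (`doi`, `corresponding_country`, `is_apc_paid`) that are not taken from the study:

```python
def normalise_doi(doi: str) -> str:
    """Trim and lower-case a DOI (DOIs are case-insensitive) so sources match."""
    return doi.strip().lower()


def enrich(bso: list[dict], wos: list[dict], openalex: list[dict]) -> list[dict]:
    """Left-join BSO records with Web of Science and OpenAlex records on DOI.

    All field names are hypothetical placeholders for illustration only.
    """
    wos_by_doi = {normalise_doi(r["doi"]): r for r in wos}
    openalex_by_doi = {normalise_doi(r["doi"]): r for r in openalex}

    enriched = []
    for record in bso:
        key = normalise_doi(record["doi"])
        row = dict(record)  # copy, so the BSO input is left untouched
        # A missing match yields None rather than dropping the article.
        row["corresponding_country"] = wos_by_doi.get(key, {}).get("corresponding_country")
        row["is_apc_paid"] = openalex_by_doi.get(key, {}).get("is_apc_paid")
        enriched.append(row)
    return enriched


# Illustrative records only; the DOIs and values are made up.
bso = [{"doi": "10.1234/ABC"}, {"doi": "10.1234/xyz"}]
wos = [{"doi": "10.1234/abc", "corresponding_country": "FR"}]
openalex = [{"doi": "10.1234/abc", "is_apc_paid": True}]
rows = enrich(bso, wos, openalex)
```

Keeping unmatched articles with empty fields, rather than discarding them, leaves the ‘total universe’ intact so that missing values can be handled explicitly at a later stage.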

The main results of the retrospective analysis of the above-described dataset, regarding the numbers of APC-paid articles with a France-based corresponding author, are as follows:

  • the total cost of APCs tripled over the period 2013-2020. The major driver is the growth in the number of articles in Gold OA journals, i.e. fully open access journals with APCs (without this growth, the APC cost would have been multiplied by 1,69 instead of 3)
  • the journal publishers and dissemination platforms have been categorised into four tiers, each publishing between 20% and 32% of all articles with a French co-author in 2020: tier 1 with the top publisher (Elsevier), tier 2 with three publishers, tier 3 with 16 publishers and tier 4 with the long tail of publishers (n=1995). The highest growth rate of APC-paid articles by France-based corresponding authors is seen in journals published by tier 2 publishers (Springer Nature, Wiley and MDPI)
  • more than three quarters of the APC-paid articles by France-based corresponding authors are in the fields of biology and medical research.

The main observation regarding the price evolution of the APCs is that the APC level for HybridOA articles started high in 2013 (2 453 € on average) but has remained stable over the years, with an average of 2 488 € in 2020. The APC level for articles in Gold journals started considerably lower, with an average APC of 1 395 € in 2013. However, the level of APCs for Gold OA articles has increased rapidly, reaching an average of 1 745 € in 2020.
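The averages quoted above can be turned into growth figures with simple arithmetic. The sketch below uses only the four averages from the text; the helper names are illustrative, and the annualised rate assumes smooth compounding over the seven years from 2013 to 2020, which the study does not claim.

```python
def pct_change(start: float, end: float) -> float:
    """Total relative change between two values, in percent."""
    return (end / start - 1.0) * 100.0


def annualised_rate(start: float, end: float, years: int) -> float:
    """Constant yearly growth rate (in percent) that takes start to end."""
    return ((end / start) ** (1.0 / years) - 1.0) * 100.0


# Average APC in euros, as quoted in the text (2013 -> 2020).
hybrid_2013, hybrid_2020 = 2453.0, 2488.0
gold_2013, gold_2020 = 1395.0, 1745.0

hybrid_total = pct_change(hybrid_2013, hybrid_2020)   # about +1.4 %
gold_total = pct_change(gold_2013, gold_2020)         # about +25.1 %
gold_yearly = annualised_rate(gold_2013, gold_2020, years=7)

print(f"HybridOA: {hybrid_total:+.1f} % over 2013-2020")
print(f"Gold OA:  {gold_total:+.1f} % over 2013-2020")
```

Seen this way, the average Gold APC rose by roughly a quarter while the HybridOA average barely moved, although the HybridOA level remained the higher of the two in 2020.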

This has led to a calculation of the total cost of APCs paid by French institutions between 2013 and 2020 (see figure below).

Evolution of total cost of APCs paid by France-based corresponding authors, overall and per open access color, after reconstructing missing data

In addition to the above-mentioned article dataset, 2019 and 2020 data from the ERE survey by Couperin (the national consortium of research performing organizations that negotiates with publishers the prices and conditions of access to research publications for the benefit of its members) were analysed in order to assess the total subscription costs of journal packages for French institutions. This data was compiled in Microsoft Power BI with additional information about the categories of the products, the publisher tiers, and the respondents to both surveys. This resulted in an estimate of ca. 87,5 M€ for the total expenditure on journal packages by all Couperin members in 2020. An analysis by Couperin of the ERE surveys 2014-2021 showed that the price increases per year in this period varied between -1,95% and +7,22% with an average price increase of 1,76% per year.

The evolution of the total journal subscription costs based on this average annual growth rate results in an estimated 97,5 M€ in 2030.

Using known and trusted data from our dataset (i.e. excluding articles for which the country of the corresponding author could not be determined, or whose APC price could not be precisely estimated), we developed mixed-effects models of APC prices as a function of other article features.
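The restriction to known and trusted data can be pictured as a simple filter applied before any model is fitted. This is only a sketch: the field names `corresponding_country` and `apc_eur` are hypothetical, and `None` is assumed here to mark a value that could not be determined or reliably estimated.

```python
def is_trusted(article: dict) -> bool:
    """Keep an article only if both key fields are known.

    Assumption: None marks an undetermined country or an APC price
    that could not be estimated precisely.
    """
    return (
        article.get("corresponding_country") is not None
        and article.get("apc_eur") is not None
    )


# Made-up records for illustration only.
articles = [
    {"doi": "10.1234/a", "corresponding_country": "FR", "apc_eur": 1745.0},
    {"doi": "10.1234/b", "corresponding_country": None, "apc_eur": 2488.0},
    {"doi": "10.1234/c", "corresponding_country": "FR", "apc_eur": None},
]
training_set = [a for a in articles if is_trusted(a)]
```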

These models were used to predict the evolution of the total cost of APCs for the period 2021-2030, in various situations, represented on the graph below:

  • under the assumption of continuing and unchanged trends (red line)
  • for a scenario with an acceleration towards Gold OA (green line)
  • and for a scenario with an increase in Green OA and a transition from HybridOA towards Gold (blue line).

Finally, a situation was simulated where 90% of all articles by France-based corresponding authors would be Gold OA (and 10% would be published in Diamond OA journals) in order to assess the cost ceiling of such a 100% open access situation (yellow line).
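The logic of such a cost ceiling can be sketched as a single multiplication: the number of articles, times the share published in Gold OA journals, times the average Gold APC. Diamond OA journals charge no APC and therefore add nothing. The function name, the fixed average price and the article count below are assumptions for illustration; the study itself relies on its price models rather than on a constant APC.

```python
def apc_cost_ceiling(n_articles: int, gold_share: float, mean_gold_apc_eur: float) -> float:
    """Upper bound on yearly APC spending, in euros.

    Only the Gold OA share incurs APCs; the remaining (Diamond OA)
    share is assumed to cost nothing in APCs.
    """
    if not 0.0 <= gold_share <= 1.0:
        raise ValueError("gold_share must be between 0 and 1")
    return n_articles * gold_share * mean_gold_apc_eur


# 50 000 articles is a placeholder, NOT a figure from the study;
# 1 745 EUR is the 2020 average Gold APC quoted earlier in the text.
ceiling = apc_cost_ceiling(n_articles=50_000, gold_share=0.90, mean_gold_apc_eur=1745.0)
print(f"Illustrative ceiling: {ceiling / 1e6:.1f} M EUR")
```

Because the result scales linearly with each input, the ceiling is as sensitive to the assumed article volume as it is to the assumed average price.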

Summary of predicted cost of APCs in various situations (2021-2030)






I. Building the article-level dataset

II. Preliminary analysis of the article-level dataset

III. Building the subscription expenditures dataset

IV. Analysis of the subscription expenditures dataset

V. Scoping the analysis


I. Evolution of articles with APC and France-based corresponding authors

II. Evolution of APC prices

III. Total cost of APCs paid in 2013-2020


I. Evolution of subscription expenditures

II. Building a model to predict APC prices

III. Scenario "trends continue unchanged"

IV. Simulation of full Gold APC

V. Scenario "rush"

VI. Scenario "relief"

VII. Conclusion