The Scholarly Communication unit (Pôle IST) provides researchers with a decision-making tool for sharing research data. It is available as a flat flowchart or as a dynamic flowchart (in French).
For exhaustive information, you can also refer to the summary analysis of the legal framework for research data (in French).
Political context in support of Open Science
Many countries have established a data sharing policy, for example the USA via the National Science Foundation, the UK with the recommendations of UK Research and Innovation, Germany, the Netherlands, and Switzerland. The G8 Science Ministers have also committed to data sharing as part of an Open Science policy.
At the European level, many agencies, including the ANR, grouped together in cOAlition S, strongly encourage making research data openly accessible when they are produced by research funded by coalition members. This is part of Plan S, which also requires more generally that, from 2021, scientific publications resulting from research funded by public grants be published in compliant Open Access journals or platforms.
Since 2017, the European Union has required that peer-reviewed scientific articles produced in projects it funds be made freely available without charge. The default provisions of the contract imply that research data linked to these publications should also be shared. However, the decision to share depends entirely on the recipients of the funding. Not all data can be made freely accessible, and it remains possible to opt out of this obligation, either totally or partially, under certain conditions (potential industrial application, confidentiality, danger of publishing data, …). The EU's philosophy on this issue is encapsulated in a single phrase:
« As open as possible, as closed as necessary »
In France, the National Open Science Plan (Plan national pour la Science Ouverte) also announces the obligation to disseminate publicly funded research data, but this obligation is already legally enshrined.
In their commitment to this Open Science approach, more and more publishers and funders require publications to be accompanied by the data associated with them.
Many journals now have an official Data Policy for data that supports the scientific content of an article. Some publishers recommend or even require data to be deposited in a particular data repository or repositories (e.g. SpringerNature Group, Geoscience Data Journal).
Rights ownership and dissemination
As a general rule, the École des Ponts holds the rights to the data produced by its researchers. The exceptions concern contracts with an external partner and researchers with the status of enseignant-chercheur (teacher-researcher), a university status that none of the École des Ponts' staff holds.
As for software code, the École des Ponts holds the property rights as a basic principle, regardless of the status of the researcher. If the École des Ponts exploits the code commercially, the staff members who authored it only have a preferential right and must receive an incentive bonus.
In the interests of efficiency, to remain in line with scientific practices, and to encourage data and code sharing, the École des Ponts has delegated to researchers the implementation of data dissemination in compliance with the law.
See the decision of the Director of the École des Ponts issued on December 19, 2018.
Data dissemination: obligations and exceptions
There is no unified data legislation in France. It is necessary to refer to different legal texts or to any existing contracts or agreements.
Data and code are considered administrative documents: this implies the right to access them on demand, an obligation to distribute them free of charge, and the right to reuse them freely (even commercially).
The École des Ponts cannot sell research data. The only option open to it is to commercialize tools or services associated with the data, which are themselves distributed free of charge. The principle of free competition applies: all actors have the same chances, based on the same public data.
From the right to share to the obligation to disseminate data
Since the Loi pour une République Numérique (Digital Republic Act), completed data must be published online if at least one of the following four criteria is met:
- they have been requested for communication according to the CADA procedure,
- they are recorded in the directory of the main administrative documents maintained by the École des Ponts,
- they constitute a database,
- they are of environmental, social, health or economic interest.
In all situations, the possibility or obligation to disseminate presupposes a prior check that dissemination does not infringe the protection of privacy, medical secrecy, business secrecy or defence secrecy.
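The rule above (completed data, at least one of four criteria, and no infringement of a protected secret) can be sketched as a small decision function. This is only an illustrative sketch: the `Dataset` class, its field names and the function `must_publish_online` are hypothetical names chosen for this example, not part of any official tool.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a dataset; all field names are illustrative."""
    completed: bool
    requested_via_cada: bool = False           # requested through the CADA procedure
    in_main_documents_directory: bool = False  # listed in the institution's directory
    is_database: bool = False
    of_public_interest: bool = False           # environmental, social, health or economic
    infringes_protected_secret: bool = False   # privacy, medical, business or defence secrecy


def must_publish_online(d: Dataset) -> bool:
    """Return True when this sketch considers online publication mandatory."""
    # Only completed data are covered, and a protected secret blocks dissemination.
    if not d.completed or d.infringes_protected_secret:
        return False
    # Meeting at least one of the four criteria is enough.
    return any([
        d.requested_via_cada,
        d.in_main_documents_directory,
        d.is_database,
        d.of_public_interest,
    ])


print(must_publish_online(Dataset(completed=True, is_database=True)))  # True
```

The sketch only answers the question of obligation. Data that meet none of the criteria may still be shared voluntarily, subject to the same secrecy check.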
The particular case of personal data
If the request comes from the person to whom the personal data relate, the data must be communicated to that person only. Otherwise, the data may be disseminated only after anonymisation or pseudonymisation, or if the person or persons concerned have given their prior consent.
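The personal data rule can be written as a three-way decision. The sketch below is illustrative only; the `Outcome` enum and the function `personal_data_outcome` are hypothetical names introduced for this example.

```python
from enum import Enum


class Outcome(Enum):
    COMMUNICATE_TO_SUBJECT_ONLY = "communicate to the person concerned only"
    MAY_DISSEMINATE = "may be disseminated"
    DO_NOT_DISSEMINATE = "do not disseminate as-is"


def personal_data_outcome(requested_by_subject: bool,
                          anonymised_or_pseudonymised: bool,
                          consent_given: bool) -> Outcome:
    """Apply the rule for personal data described above (illustrative sketch)."""
    # A request from the person concerned is answered to that person only.
    if requested_by_subject:
        return Outcome.COMMUNICATE_TO_SUBJECT_ONLY
    # Otherwise dissemination needs anonymisation/pseudonymisation or prior consent.
    if anonymised_or_pseudonymised or consent_given:
        return Outcome.MAY_DISSEMINATE
    return Outcome.DO_NOT_DISSEMINATE


print(personal_data_outcome(False, True, False).value)  # may be disseminated
```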
The special case of geographical data
The European INSPIRE directive aims to establish a geographical information infrastructure in the European Union to promote environmental protection. It makes it mandatory for public authorities to disseminate online the geographical data available in digital format, even if the data is not completed.
The Aarhus Convention stipulates that environmental information must be disseminated if requested. Unlike the INSPIRE Directive, the data must be completed.
Data derived from text or data mining
Whether to save time when exploring the scientific literature or, more often, to explore a large volume of texts or data, researchers use tools to automate this work. The data resulting from this work can be disseminated, but the source texts can only be disseminated in accordance with the principle of short quotation.
Researchers sometimes take photographs that then form the basis of their work. To determine whether the photographs can be disseminated, it is first necessary to check whether recognizable persons appear in them. If so, dissemination is only allowed with each person's consent. Otherwise, and even if recognizable buildings or works appear in them, nothing prevents their dissemination ("panorama exception").
Data collected by web scraping may only be disseminated in accordance with the T&Cs of the source and only as a non-substantial part. That being said, it is essential to take into account that researchers rework and analyze these scraped data before publishing them as part of their research. The potential damage to the data producer is therefore zero and the risk is also zero.
The particular case of software code
Software code is considered an administrative document, just like data. It is essential to check both the licenses applied to any external software code that is reused and the existing agreements, in order to know what can be distributed and under which license. The École des Ponts holds the property rights as a basic principle: the code is then subject to the same dissemination obligations as the data. Finally, the auxiliary elements of the software (user interface, specifications, documentation) are protected by ordinary copyright, so it is necessary to verify who holds the rights to them.
Which license to apply?
When distributing code or data, if there is no other legal obligation (agreement, reuse of data or licensed code), the law stipulates a choice between two types of licenses. The texts can be found on the data.gouv.fr website.
Permissive licenses: they only protect authorship and limit the authors' liability. They offer complete freedom of reuse, redistribution, commercial exploitation and even license modification. They are the preferred choice:
- for software code: BSD, Apache, CeCILL-B and MIT License
- for data: Etalab Open License
Reciprocal (or copyleft) licenses: they require, on the one hand, that the conditions of the original license be maintained and, on the other hand, that they be propagated to the entire derivative work. They may therefore restrict the commercial use of the data and code and may only be used in a proportionate way for reasons of public interest.
- for software code: Mozilla Public License, GNU GPL and CeCILL
- for data: Open Database Licence
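The two license families listed above can be summarised as a lookup table. The following is a minimal sketch, assuming no other legal obligation (agreement, reused data or licensed code) applies; the function `suggest_licenses` and its parameters are hypothetical names, and the requirement to state a public-interest reason is one possible reading of the proportionality condition for reciprocal licenses.

```python
from typing import List, Optional

# License names as listed in this section, keyed by (family, kind of object).
LICENSES = {
    ("permissive", "code"): ["BSD", "Apache", "CeCILL-B", "MIT"],
    ("permissive", "data"): ["Etalab Open License"],
    ("reciprocal", "code"): ["Mozilla Public License", "GNU GPL", "CeCILL"],
    ("reciprocal", "data"): ["Open Database Licence"],
}


def suggest_licenses(kind: str,
                     reciprocal: bool = False,
                     public_interest_reason: Optional[str] = None) -> List[str]:
    """Return candidate licenses for 'code' or 'data' (illustrative sketch)."""
    if kind not in ("code", "data"):
        raise ValueError("kind must be 'code' or 'data'")
    if reciprocal and not public_interest_reason:
        # Assumption: a reciprocal license needs a stated public-interest reason.
        raise ValueError("a reciprocal license requires a public-interest reason")
    family = "reciprocal" if reciprocal else "permissive"
    return list(LICENSES[(family, kind)])


print(suggest_licenses("code"))  # ['BSD', 'Apache', 'CeCILL-B', 'MIT']
```

Permissive licenses are the default here because the text names them as the preferred choice.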
See also "Research Data - 1. Introduction"
See also "Research Data - 3. Technical Questions"
The Scholarly Communication unit (Pôle IST) is constantly looking to learn more about this subject, and your community no doubt has its own information and standard practices. Feel free to contact us so we can find out how best to support you.