Data Access for Researchers

We participated with InternetLab in the consultation process with comments on a preliminary version of this regulation.

On 2 July 2025, the European Commission adopted delegated legislation on data access for researchers under Article 40(4) of the Digital Services Act (DSA). This is a long-awaited milestone: if successfully implemented, the rule could open access to a large amount of information about how online platforms operate that has so far remained outside public knowledge. At CELE we have been following this conversation for some time and, together with InternetLab, took part in the consultation process with comments on a preliminary version of this regulation.

The starting point: a long-awaited regulation

The road to data is paved with difficulties. Two complementary clauses of the DSA provide for the delivery of information to researchers: Article 40(4) and Article 40(12). Article 40(4) guarantees access for authorized (“vetted”) researchers, while Article 40(12) covers researchers in general, provided that they meet certain requirements regarding independence, transparency and data processing.

Overall, the experience with Article 40(12), which requires platforms to provide access “without undue delay to data, including, where technically possible, real-time data, provided that the data is publicly accessible on their online interface”, has not been good, as documented here and here. That is why the academic community had been waiting so long for the regulation implementing Article 40(4).

Legal limitations

The greatest limitation of this regime comes from the DSA itself. Article 40 is explicit that any request for data under paragraphs (4) or (12) must be based on research “that contributes to the detection, identification and understanding of systemic risks” described in Article 34 or, under paragraph (4), on research into the effectiveness of the mitigation measures provided for in Article 35. This limits the set of data that researchers can access and conditions the uses they can make of it, narrowing the universe of possible research.

This limitation could create tensions in the interpretation of the “systemic risks” governed by the DSA. Researchers can be expected to read the risks listed there creatively in their applications in order to obtain as much data as possible. It would then be consistent for the authorities to use, or at least take into account, those same interpretations when applying the DSA to platforms, with deeply worrying results from the point of view of the principle of legality. For this reason, formalizing a “two-track” system, one that keeps the interpretation of systemic risks under Article 34 (risk detection) independent of the interpretation under Article 40 (access to data), emerges as a possible way to mitigate a problem created by the law itself.

The second legal limitation stems from the “authorization” system for researchers under Article 40(4). The DSA does not regulate this process, and the delegated legislation does not provide clear parameters either. This gives the Digital Services Coordinators considerable discretion in evaluating researchers, an evaluation that is the gateway to requesting data from the platforms.

The third major limitation is that a data request may be rejected because the information is accessible by other means. In that case, the burden is on the applicant to demonstrate why they should be given access anyway. This is a problem because the data access mechanisms of Article 40(12), although formally available, have not worked entirely satisfactorily. In all cases, researchers seeking access to data must justify the “necessity and proportionality” of that access for the research that motivates it.

Structural limitations

Although the regulation does not exclude non-European researchers, it requires applicants, in accordance with Article 40(8) of the DSA, to submit an analysis of the risks in terms of confidentiality, data security and protection of personal data, along with a description of the technical, organizational and legal measures that will be put in place to mitigate them. Requirements of this kind disproportionately affect academic institutions in the Global South, most of which lack the means to comply with them. These institutions could be forced to depend on the support of better-resourced institutions, which could in turn constrain the autonomy of their research agendas.

The importance of research in the DSA architecture

The DSA created an interdependent system in which all actors play a fundamental role. Research is essential to its proper functioning because the DSA is built on the recognition of an enormous information asymmetry between the regulated entities and the regulator. In essence, this law regulates the unknown. That asymmetry can only be overcome through a series of mechanisms aimed at obtaining information about the regulated entities: transparency reports, requests for information, meaningful participation of civil society, independent audits, investigative powers and access to data for both regulators and researchers.

Without research, it is not possible to test against data the risk identification and mitigation reports of large companies, the grounds for the European Commission's sanctions and investigations, or any proposal to advance or change existing regulations. Without access to data, in short, regulators cannot make good decisions in applying the law or hold platforms responsible for the harm they cause.

The future of data access and its global impact

The success of this initiative is far from guaranteed and will depend on its implementation. It is, however, an auspicious starting point.

The discussion of data access extends far beyond Europe. Clauses of this nature have been incorporated into legislation such as the British Online Safety Act and into bills such as the Brazilian PL 2630. At CELE we believe in access to data as an essential means to inform public debate and to drive the creation and implementation of evidence-based public policies that guarantee the protection of human rights online.