Entrepôts de Données de Santé : The French Framework for Centralizing Health Data for Multi-Use Applications
What is an “Entrepôt de Données de Santé” under French law (Référentiel EDS)?
An "Entrepôt de données de santé" (EDS), French for "health data warehouse", is subject to a framework established by the French Data Protection Authority (CNIL) that governs the processing of personal data within health data warehouses.
The term EDS covers centralizing and structuring health data from multiple sources (medical records, study databases, RWD, ...) into databases that allow multiple reuses for research purposes. Data lakes are considered EDS under French law.
The “référentiel EDS” published by the CNIL, referred to in this article as the EDS Framework, specifies how to implement such projects in France.
The EDS Framework applies to data controllers who, as part of their public interest mission, wish to collect data with a view to re-using it for the following purposes:
- Production of indicators and strategic management of activity for hospitals/clinics,
- Improving the quality of medical information,
- Optimizing health data coding as part of the National System of Health Data (SNDS)
- Development of tools to aid medical diagnosis or treatment, including AI solutions in the health field
- Pre-screening/feasibility studies
- Studies (RWD/RWE studies, meta-analysis, observational studies, …)
The following activities are not covered by the EDS Framework:
- EDS implemented by a private company on the basis of its legitimate interest
- Centralization and structuring of data solely for care purposes rather than research, including preventive medicine, medical diagnosis, the administration of care or treatment, or the management of healthcare services
- EDS based solely on patients' consent
- Databases merged with SNDS databases (See MR-008)
Note : Private companies implementing an EDS must obtain CNIL's authorization. However, even when authorization from the CNIL is not required, the EDS framework remains the applicable reference for compliance.
Note : When the EDS is based on patients' consent, no authorization or EDS Framework compliance statement is required.
Procedure
The EDS Framework proposes a binary system:
- Either the Sponsor is 100% compliant with the EDS Framework requirements. It declares compliance to the CNIL, and the EDS can be initiated without further formalities.
- Or the Sponsor is not 100% compliant with the EDS Framework requirements. The Sponsor, as data controller, must then obtain PRIOR authorization from the CNIL to implement the EDS in France.
Note: A private company implementing an EDS must obtain authorization from the CNIL, using the EDS Framework as the reference for building its project.
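The decision logic described so far can be sketched in a few lines of Python. This is a simplified illustration, not a legal test: the class, field and enum names are hypothetical, and projects outside the scope of this article (care-only databases, databases merged with the SNDS) are not modeled.

```python
from dataclasses import dataclass
from enum import Enum


class Formality(Enum):
    NONE_REQUIRED = "no authorization or compliance declaration"
    COMPLIANCE_DECLARATION = "declaration of compliance to the CNIL"
    PRIOR_AUTHORIZATION = "prior authorization from the CNIL"


@dataclass
class EdsProject:
    # Hypothetical labels: "public_interest", "legitimate_interest" or "consent".
    legal_basis: str
    # True only if every EDS Framework requirement is met.
    fully_compliant: bool


def required_formality(project: EdsProject) -> Formality:
    """Return the CNIL formality suggested by the rules summarized in this article."""
    if project.legal_basis == "consent":
        # Consent-based EDS: neither authorization nor compliance declaration.
        return Formality.NONE_REQUIRED
    if project.legal_basis == "legitimate_interest":
        # Private company relying on legitimate interest: authorization is mandatory.
        return Formality.PRIOR_AUTHORIZATION
    if project.legal_basis == "public_interest":
        if project.fully_compliant:
            return Formality.COMPLIANCE_DECLARATION
        return Formality.PRIOR_AUTHORIZATION
    raise ValueError(f"unknown legal basis: {project.legal_basis!r}")
```

For instance, `required_formality(EdsProject("public_interest", fully_compliant=False))` returns `Formality.PRIOR_AUTHORIZATION`, which matches the second branch of the binary system.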
The CNIL has two months from the date of the authorization request to respond. In the absence of a response within this period, the authorization is deemed tacitly granted.
Applications for CNIL authorization should be made on the CNIL website, in the "Health authorization - excluding research" section.
Note: The total duration of the submission process is more likely to be at least 6 months, taking into account the conduct of the DPIA as well as interactions with the CNIL during the examination of the submission.
Therefore, compliance with the EDS Framework is a crucial step in the regulatory journey of a centralized health database for research purposes involving French patients’ data. The data controller must therefore evaluate the compliance of its project with the EDS Framework.
Note: A data controller is the entity in charge of centralizing and structuring the data and allowing its reuse, even when such operations are carried out by intermediaries or vendors. This includes AI companies gathering health data from multiple sources to train AI models.
Requirements related to EDS
Requirements related to lawfulness of the EDS
Data governance
The data controller must implement dual governance for each EDS to ensure compliance with the intended purposes. Governance bodies can be shared if the data controller operates multiple EDS.
- Primary body - Steering Committee
A primary body (steering committee or equivalent) is responsible for determining the strategic and scientific orientations of the warehouse. The data controller is free to determine its composition.
- Secondary body - Scientific and Ethics committee
A secondary body (scientific and ethical committee or equivalent) systematically provides a prior, reasoned opinion on projects using EDS data.
This committee must include:
- One person involved in health ethics,
- An independent person from the data controller (e.g., not an employee of the data controller),
- Health and medico-social professionals,
- Researchers,
- A representative of users or of a patient association.
Note: The EDS Framework defines profiles, so the same person can hold several roles (e.g., the patient representative can also be the person independent of the controller) and several people can hold the same role (e.g., several persons involved ...).
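Because the requirement is expressed as profiles rather than as a fixed list of people, checking a proposed committee comes down to a set-coverage test. The sketch below is only an illustration: the profile labels and the `missing_profiles` helper are hypothetical names chosen for this example.

```python
# Hypothetical labels for the five profiles listed above.
REQUIRED_PROFILES = {
    "health_ethics",
    "independent_of_controller",
    "health_or_medico_social_professional",
    "researcher",
    "user_or_patient_representative",
}


def missing_profiles(members: dict[str, set[str]]) -> set[str]:
    """Return the required profiles that no committee member covers.

    `members` maps a member's name to the set of profiles that person holds,
    so one person may cover several profiles and several people may share one.
    """
    covered = set().union(*members.values())
    return REQUIRED_PROFILES - covered


# Illustrative committee: three people cover all five profiles.
committee = {
    "member_a": {"health_ethics", "researcher"},
    "member_b": {"health_or_medico_social_professional"},
    "member_c": {"user_or_patient_representative", "independent_of_controller"},
}
```

An empty result from `missing_profiles(committee)` means every profile is represented. It says nothing about whether each person genuinely qualifies for the profile, which remains a governance judgment.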
Legal basis
The EDS Framework is only applicable to projects implemented by public entities as part of their public-interest mission.
This means that for a private company, the project can only be based on legitimate interests or patients' consent. When the project is based on the company's legitimate interests, an authorization from the CNIL is mandatory.
Requirements related to patients included in the EDS
Sources of data
Only data coming from the following sources can be collected and included in the EDS:
• Medical records
• Previous studies databases
Categories of data
Only pseudonymized patient data can be integrated into the EDS. If the source data is already pseudonymized, a new, unique pseudonym specific to the EDS must be generated.
Directly identifying data (name, first name, address, ...) can only be collected for limited purposes, such as contacting patients to build new study cohorts. The purposes of the EDS determine whether directly identifying data needs to be collected.
These data must be stored in a separate database from the pseudonymized patient health data.
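As an illustration of these two rules (an EDS-specific pseudonym, and identifying data kept apart from health data), here is a minimal Python sketch. It assumes random pseudonyms held in a correspondence table. The Framework text summarized here does not prescribe a particular technique, and the field names, class and function are hypothetical. In a real deployment the two outputs would go to separate, access-controlled databases.

```python
import secrets

# Hypothetical field names for directly identifying data.
IDENTIFYING_FIELDS = {"last_name", "first_name", "address"}


class CorrespondenceTable:
    """Maps (source, source identifier) pairs to EDS-specific pseudonyms."""

    def __init__(self) -> None:
        self._table: dict[tuple[str, str], str] = {}

    def pseudonym_for(self, source: str, source_id: str) -> str:
        key = (source, source_id)
        if key not in self._table:
            # Random value: it carries no information about the source identifier.
            self._table[key] = secrets.token_hex(16)
        return self._table[key]


def split_record(record: dict, source: str, table: CorrespondenceTable):
    """Split one source record into an identity row and a pseudonymized health row."""
    pseudonym = table.pseudonym_for(source, record["source_id"])
    identity = {k: v for k, v in record.items() if k in IDENTIFYING_FIELDS}
    health = {
        k: v
        for k, v in record.items()
        if k not in IDENTIFYING_FIELDS and k != "source_id"
    }
    # The source identifier is dropped; only the new pseudonym links the rows.
    identity["pseudonym"] = pseudonym
    health["pseudonym"] = pseudonym
    return identity, health
```

The same source identifier always yields the same pseudonym within one table, so later data deliveries for a patient can still be linked inside the EDS.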
Data retention period
The data may be kept for a maximum of 20 years from the date of initial collection in the context of care or research. After the retention period, all data must be anonymized in accordance with the G29 (Article 29 Working Party) criteria or destroyed.
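A retention check of this kind can be sketched as simple date arithmetic. The function names are hypothetical, and the sketch assumes that data may still be held on the anniversary date itself, a boundary the text above does not settle.

```python
from datetime import date

RETENTION_YEARS = 20


def retention_deadline(collected_on: date, years: int = RETENTION_YEARS) -> date:
    """Last day on which the data may still be kept (assumed inclusive)."""
    try:
        return collected_on.replace(year=collected_on.year + years)
    except ValueError:
        # 29 February with a non-leap target year: fall back to 28 February.
        return collected_on.replace(year=collected_on.year + years, day=28)


def must_anonymize_or_destroy(collected_on: date, today: date) -> bool:
    """True once the retention period has elapsed."""
    return today > retention_deadline(collected_on)
```

For data collected on 15 March 2005, `retention_deadline` gives 15 March 2025. The `years` parameter is there so that a different period can be plugged in if the CNIL authorizes one.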
Information and rights of patients
Patients must be informed by the data controller that the data collected during their care will be included in the EDS. The purposes for data reuse and the procedures for exercising access and objection rights must be clearly emphasized in the information sheet. Depending on the situation, it may be necessary to inform the public of the existence of the EDS project.
In all cases, public information is highly recommended, as follows:
- Publishing of the information note about the EDS on the data controller's website
- Communicating about the EDS on social media, regional media, and through patient associations.
- Issuing a press release to inform about the establishment of the EDS
The EDS Framework sets different information requirements based on patients' status:
- Information for new and followed-up patients (care data only)
New patients and those currently under follow-up must be individually informed about the creation of the EDS, typically via a hospital information leaflet.
- Information for lost-to-follow up patients (care data)
Patients who are no longer under follow-up must also be individually informed about the creation of the EDS.
Note: The definition of ‘lost to follow-up’ is not harmonised under French law. The data controller therefore has some leeway in defining it within the meaning of the EDS Framework, provided it gives the CNIL a rationale during the submission process.
- Information for former studies’ participants (research data)
If the EDS includes data from former research, participants must be informed individually about the reuse of their research data to constitute the said EDS.
Note: For example, when a sponsor reuses clinical trial data in an EDS, this implies collaborating with the investigational sites to provide the information to patients.
Note : When authorization is required, the CNIL may grant approval with a waiver of patient notification. The data controller must provide a justification explaining why informing patients is impossible or too complex, and how they ensure the protection of patients' rights (refer to public information).
Patients must be allowed to exercise their rights by contacting the data controller's DPO.
Note: The agreement with the data provider should include a clause requiring the provider to collaborate with the data controller's DPO to ensure the effectiveness of individuals' rights.
Data retention period
The storage of patients' health data is limited to 20 years from the date of collection, with no archiving permitted.
Note: When authorization is required, an extension of the period may be requested if justified, for example where longer periods are needed given the pathologies concerned and the purposes of the EDS.
Requirements related to Healthcare Professionals
Categories of data
The following data relating to healthcare professionals can be collected:
• Identification Data: Name, first name, title
• Professional Data: Function, service, and unit of practice
• Contact Information: Professional email address and phone number
• Professional Identification Numbers (excluding employee number)
Data is generally collected through source data (medical records, study database, …)
Information and rights of Professionals
Professionals must be informed by the data controller that their personal data is included in the EDS.
The EDS Framework differentiates information delivery requirements based on the status of the professionals.
- For Professionals currently employed by the data controller
Professionals whose data is included in the EDS must be individually informed in writing. If the data controller is the employer, the information can be provided via a letter or email attached to the payslip or employment contract.
- For Professionals no longer employed by the data controller
If the data controller is not the employer, they must individually inform each professional in writing, whatever the means.
Note: The agreement with the data provider should include a clause requiring the provider to collaborate with the data controller's DPO to ensure the effectiveness of individuals' rights.
Professionals can exercise their rights by contacting the data controller's DPO.
Security requirements
The EDS Framework from the CNIL provides 46 security measures concerning 12 themes to be implemented by the data controller.
Note: If a security measure is not implemented, the data controller must document in the DPIA how other security measures are used to ensure the same level of protection for personal data.
The 46 security measures are:
Logical and cryptographic partitioning
• SEC-LOG-1 - Logical and Cryptographic Segmentation: Personal data in the EDS must be collected and stored on systems and databases separate from those used for patient care.
• SEC-LOG-2 - Encryption at Rest: Personal data must be encrypted at rest with a formalized key management procedure.
• SEC-LOG-3 - Encrypted Backups: Backups must also be encrypted.
• SEC-LOG-4 - Separation of Directly Identifying Data: Directly identifiable data or correspondence tables stored in the EDS must be logically separated from pseudonymized data using cryptographic means.
• SEC-LOG-5 - Access Management: Access to separated data categories must be managed through different user accounts or profiles.
• SEC-LOG-6 - Encryption of Genetic and Location Data: Genetic or location tracking data must be encrypted with a specific key distinct from other data.
Setting up and feeding the warehouse
• SEC-ALM-1 - Data Collection Security: Data collection flows must have appropriate security measures, including regular purging of transit directories and strict access control.
• SEC-ALM-2 - Secure Data Entry Software: If the EDS is manually fed through data entry software, access to this software must be secured with strong authentication.
Data pseudonymization
• SEC-PSB-1 - Re-Pseudonymization: Original patient numbers, such as patient file numbers, cannot be reused directly as identifiers in the EDS; only a unique pseudonymous identifier can be used.
• SEC-PSB-2 - New Pseudonyms: Existing pseudonymized datasets integrated into the warehouse must be assigned a new unique pseudonymous identifier.
• SEC-PSB-3 - Pseudonymization of Professional Data: Data related to healthcare professionals must be pseudonymized.
• SEC-PSB-4 - Masking of source data: Unstructured documents (medical records, PDF scans, ...) added to the EDS must remove or mask identifying data before integration.
• SEC-PSB-5 - Comprehensive Masking Process: The masking or removal process must apply to visible content, metadata, and file attributes.
Physical access to data
• SEC-PHY-1 - Physical Access Security: Physical access to servers and premises hosting the EDS infrastructure must be secured with adequate protection measures.
Management of authorizations and logical access to data
• SEC-HAB-1 - Profiles: Different authorization profiles must be defined to manage data access on a need-to-know basis.
• SEC-HAB-2 - Granular Access Control: Access granularity must be defined for each authorization profile, respecting the separation of correspondence tables and directly identifiable data.
• SEC-HAB-3 - Authorization Validation: Authorized personnel must be individually validated by a governance body or hierarchical superior.
• SEC-HAB-4 - Restricted Privileged Access: Privileged access rights for administration and maintenance must be limited to a restricted team and kept to the strict necessary.
• SEC-HAB-5 - Regular Review of Authorizations: Access permissions must be reviewed regularly, at least annually, and at the end of each research project.
• SEC-HAB-6 - Immediate Revocation of Permissions: Access permissions must be revoked immediately upon the withdrawal of authorization.
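The periodic review in SEC-HAB-5 lends itself to a small automated check. The following Python sketch flags access grants that have not been reviewed for a year or whose research project has ended since their last review. The `AccessGrant` structure, its fields and the 365-day interval are assumptions made for this example, not part of the Framework.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# "At least annually" is approximated here as 365 days.
REVIEW_INTERVAL = timedelta(days=365)


@dataclass
class AccessGrant:
    user: str
    profile: str
    last_reviewed: date
    project_end: Optional[date] = None  # None for grants not tied to a project


def grants_due_for_review(grants: list[AccessGrant], today: date) -> list[AccessGrant]:
    """Return grants needing review: overdue, or project ended since last review."""
    due = []
    for grant in grants:
        overdue = today - grant.last_reviewed >= REVIEW_INTERVAL
        project_over = (
            grant.project_end is not None
            and grant.project_end <= today
            and grant.last_reviewed < grant.project_end
        )
        if overdue or project_over:
            due.append(grant)
    return due
```

A check like this only lists candidates. The decision to maintain or revoke each authorization stays with the governance body or the hierarchical superior.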
Authentication
• SEC-AUT-1 - Strong Authentication: Access to personal data must require strong authentication involving at least two distinct factors.
• SEC-AUT-2 - Internal and External Authentication: Strong authentication must be implemented for both internal and external access to the warehouse.
• SEC-AUT-3 - Server Authentication: All data transmissions to and from the warehouse must be performed by mutually authenticated servers.
Projects space
• SEC-ESP-1 - Internal Workspaces: Data in the EDS must be manipulated by researchers only within internal workspaces specific to each research project.
• SEC-ESP-2 - Minimization of Source Data: Imported datasets in a specific research project workspace must be minimized and limited to necessary data.
• SEC-ESP-3 - Reuse of Pseudonyms: In cohort follow-ups, the same unique pseudonymous identifier can be reused in multiple workspaces.
Data exports
• SEC-EXP-1 - Export of Anonymized Data: Only anonymized datasets can be exported.
• SEC-EXP-2 - Pre-Approval of Exports: Data exports must be pre-approved by a responsible party to validate the principle.
• SEC-EXP-3 - Monitoring of Exports: Data exports must be automatically or manually monitored to verify anonymity, with non-compliant exports quarantined and manually reviewed.
• SEC-EXP-4 - Anonymous Data Outputs: Systems for producing indicators and strategic management must only allow anonymous data outputs.
• SEC-EXP-5 - Compliance with Anonymization Requirements: Data outputs must be exported in compliance with anonymization requirements.
Users training and workstation security
• SEC-SEN-1 - Training and Awareness: Authorized personnel must be trained in medical confidentiality and regularly sensitized to data processing risks and obligations.
• SEC-SEN-2 - Confidentiality Charter: Authorized personnel must sign a confidentiality charter outlining their obligations and the sanctions for non-compliance.
• SEC-SEN-3 - Workstation Security: Workstations of authorized personnel must have specific security measures, such as named accounts, appropriate authentication, automatic session locking, encrypted storage, and filtering measures.
Logs / Traceability
• SEC-JRN-1 - Logging User Actions: User actions in the EDS workspaces must be logged, including connections, queries, and operations.
• SEC-JRN-2 - Administrator Access Logging: System and network administrator access must be through a specific system ensuring strong authentication and detailed traceability.
• SEC-JRN-3 - Regular Log Review: Logs must be regularly reviewed, at least bimonthly, and at the end of each research project.
• SEC-JRN-4 - Log Retention: Logs must be retained for a period between six months and one year.
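The retention window in SEC-JRN-4 can be expressed as a three-way classification of each log entry by age. The sketch below is a simplified illustration: the function name and return labels are hypothetical, and six months and one year are approximated as 183 and 365 days.

```python
from datetime import date, timedelta

# Approximations of "six months" and "one year" in days (an assumption).
MIN_RETENTION = timedelta(days=183)
MAX_RETENTION = timedelta(days=365)


def log_action(logged_on: date, today: date) -> str:
    """Classify a log entry by age against the retention window."""
    age = today - logged_on
    if age < MIN_RETENTION:
        return "keep"        # younger than the minimum retention period
    if age <= MAX_RETENTION:
        return "may_purge"   # inside the window: purge according to local policy
    return "purge"           # older than the maximum retention period
```

With `today = date(2025, 6, 30)`, an entry logged on 1 May 2025 is classified `"keep"`, one logged on 1 November 2024 `"may_purge"`, and one logged on 1 January 2024 `"purge"`.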
Re-identification of patients
• SEC-REC-1 - Secure Re-identification Procedure: A secure operational procedure must be in place to ensure the exercise of individuals' rights and, if necessary, the lifting of pseudonymity and proper re-identification.
• SEC-REC-2 - Recontacting Patients for Research: A secure operational procedure must be in place to recontact patients for research participation, using medical criteria to select pseudonymous identifiers.
• SEC-REC-3 - Emergency Re-identification: A secure operational procedure must be in place to re-identify patients in case of medical emergencies.
• SEC-REC-4 - Restricted Re-identification Access: Authorizations and access for re-identification procedures must be restricted to a limited team and strictly necessary.
• SEC-REC-5 - Risk Management in Re-identification: Adequate measures must be in place to manage the risks inherent in re-identification procedures, ensuring they are only used for legitimate requests.
Other Requirements
A Data Protection Impact Assessment ("DPIA") is required from the data controller. This analysis must include a presentation of the data flow, the identification of security measures, and the analysis of potential risks to the rights and freedoms of the data subjects.
Note: The CNIL requires the DPIA during the authorization process. In principle, no data transfers outside the European Economic Area are allowed.
Note: When authorization is required, the data controller may request authorization for data transfers by providing safeguards as provided for by the GDPR.
The agreements between the data controller and its service providers must include the mandatory clauses set out in Article 28 of the GDPR.
The controller must appoint a DPO, internal or external, and keep a register of processing activities.