Modern health technology relies on increased availability and quality of personal health information to improve the quality of treatment and prevent diseases. Data in its non-identifying form is also valuable in the health care context, as it can drive innovation and inform decision-makers on public health matters and health system planning. Further, the growing use of artificial intelligence in health care often depends on large datasets to support the development, improvement, and overall accuracy of such technology.
Techniques to transform personal health information into non-identifying information can help organizations increase the value of their data assets and reduce the risks associated with privacy breaches. However, these techniques come with risks of their own, especially in the health care sector, where personal health information is recognized as highly sensitive by nature, and where re-identification of a dataset could seriously prejudice an individual.
This insight focuses on a recent decision (the Decision) of Ontario’s Information and Privacy Commissioner (the IPC), concerning a group of medical clinics (the custodians) that collected and used personal health information in the course of providing health care, and then sold de-identified information derived from this personal health information to a third-party corporation. To do so, the custodians retained one service provider to de-identify the personal health information on their behalf, and another service provider to enter into a sale agreement with the purchaser of the de-identified information.
Key considerations when contemplating the transformation of personal information in health care
1) What constitutes personal health information and de-identified information?
Under Ontario’s health privacy law, the Personal Health Information Protection Act, 2004 (PHIPA), “personal health information” generally encompasses any identifying information concerning someone’s physical or mental health, and often includes a wide scope of information that may not, on its own, be health information. For example, personal information collected for the purpose of, or in connection with, providing health care, and information contained in the same record as personal health information, may fall under the definition.
Under PHIPA, to “de-identify” personal health information “means to remove any information that identifies the individual or for which it is reasonably foreseeable in the circumstances that it could be utilized, either alone or with other information, to identify the individual.” Note that PHIPA was amended in 2020 to change this definition and allow for future regulations to prescribe expectations around the process and standards for de-identifying personal information.
This definition is consistent with the definition of “identifying information” under PHIPA, which means “information that identifies an individual or for which it is reasonably foreseeable in the circumstances that it could be utilized, either alone or with other information, to identify an individual.”
The IPC has previously stated in Dispelling the Myths Surrounding De-identification: Anonymization Remains a Strong Tool for Protecting Privacy (the Guidance) that de-identified information falls outside the application of PHIPA. According to the Guidance, applying legislative privacy protections to this information may result in “unintended consequences,” such as reducing the incentive to de-identify and imposing unnecessarily burdensome requirements on certain data. In last year’s comments in response to the Ontario government’s White Paper regarding a possible provincial private-sector privacy law, the IPC supported a scheme where some information considered “de-identified” would still be subject to privacy law but afforded greater flexibility for its use in certain situations, while “anonymized” information (that is, information meeting a higher threshold of de-identification) would be outside “the four corners of the law.”
2) Who has authority to de-identify or anonymize the dataset? Is consent required?
The IPC made it clear: the process of de-identifying personal information is considered a “use” of that information under PHIPA. This finding is, according to the IPC, in line with the public interest in ensuring that personal health information is protected at every stage of its handling by custodians.
Canadian health privacy laws generally rely on a structure of accountability where a custodian, the entity prescribed under law, is responsible for the collection, use, and disclosure of personal health information for the purpose of, or in relation to, providing health care to a patient. In the course of providing health care, the custodian may need to use services offered by third-party service providers, for example, to document the personal health information or store it on secure servers. That said, the custodian remains the entity accountable for the information, even while it is transferred to a third party acting on its behalf, and may “use” the personal health information to transform it into a de-identified form, without consent, subject to certain conditions. From the third party’s perspective, however, mere access to personal health information does not in and of itself authorize an organization to de-identify such information in order to use it for its own purposes. A third party acting on behalf of another entity may de-identify personal information only if it has authorization to do so.
While “using” personal health information for de-identification purposes may be allowed without the consent of the individual under PHIPA, custodians must be transparent with the public about their information practices by describing them clearly and explicitly in their privacy notice. In addition, the privacy notice must include information on the purposes for which the information is used, including the purpose of a sale of the de-identified information to a third party: essentially its “destiny,” as the IPC describes it in her May 17 blog post.
3) How much information to disclose to the public to meet transparency requirements?
While the IPC determined in the Decision that the intention of the legislature was not to require custodians to describe each of their information practices in their public notice, a custodian must at least provide a general description of its information practices by giving notice of routine and wide-ranging practices that affect all, most, or a substantial number of individuals, or by giving notice of significant practices. Accordingly, the custodian’s description of its information practices in relation to de-identification efforts must include, at a minimum, information about:
- The de-identification process, including how personal health information is used and how it is modified, or which elements are removed, in order to conceal individuals’ identities; and
- The purposes for de-identification as part of its information practices, whether it is research, sale, or licensing of the de-identified information to a third party, or for safeguarding purposes.
4) What is considered appropriate safeguarding?
PHIPA requires custodians to take reasonable steps to ensure the personal health information they hold is protected and kept secure at all times. In this circumstance, this means that the process of de-identification must address whether appropriate measures are taken to ensure that the information sold is properly de-identified and to ensure that it is sufficiently unlikely that the information can be re-identified.
In the Decision, the IPC was satisfied that the custodian had met its safeguarding obligations under PHIPA in light of the circumstances. The measures in place included:
- De-identification methods: Prior to the disclosure of the information to the data purchaser, the information was loaded onto a separate secure server, and de-identification algorithms, which used sophisticated de-identification techniques developed by industry-recognized privacy experts, were applied to the information;
- Re-identification Risk Analysis: As part of its efforts, the third party implemented a de-identification and masking strategy and conducted two re-identification risk analyses. Context risk was also assessed through questionnaires completed by the data purchaser, which addressed, among other things, the controls and safeguards applied to the data, the accountability and contractual measures in place, and the motive and capability of the recipient to re-identify the data. An appropriate risk threshold was calculated based on the sensitivity of the data and the potential for injury to individuals in the event of re-identification, as well as on industry precedents for such thresholds. The analysis revealed a low risk that the data could be used, alone or in combination with other reasonably available information, by the data purchaser to identify an individual. Note that this aligns with the aforementioned Guidance, which sets out factors to consider when conducting re-identification risk assessments: the re-identification probability; any mitigating controls in place to discourage the data recipient from re-identifying the information; any motive and capacity for the data recipient to re-identify the information (for example, whether a party would have motive to harm the custodian or the individuals concerned by the information, whether any recipient of the data has the resources or expertise to attempt to re-identify the dataset, and whether the dataset has commercial or criminal value); and the extent to which an inappropriate use or disclosure would constitute an invasion of privacy.
- Contractual Safeguards: Any sale agreement between a custodian and a third party must include additional privacy and security controls to ensure the de-identified data remains de-identified. These agreements are not only required under law, but also serve as a strong safeguard to assist in preventing the misuse of the information and in mitigating risks of harm to individuals. The sale agreement in the Decision contained several such provisions, namely: a prohibition on providing any identifiable information to the data purchaser; a mandatory notification requirement in the event that such information was inadvertently provided, together with a prohibition on the use of any of the identifiable information; and a requirement that the identifiable information be de-identified to the satisfaction of the third party and the data purchaser. In addition, one of the third parties was contractually obligated to ensure that no identifiable information was provided.
- Audit Rights: A provision in the sale agreement gave the custodians audit rights under which an independent third party could verify the data purchaser’s practices, to ensure the data purchaser is not collecting identifiable information unless it is authorized to do so.
- Confidentiality Agreements: As part of its practices, the data purchaser required all employees, consultants and sub-contractors to sign a data confidentiality agreement which prohibits re-identification as well as “data linking.” The data purchaser’s clients were required to comply with standard operating procedures on data security and usage that specifically prohibit re-identification.
- Probability of “data linking”: The IPC was assured during the investigation that data linking was not technically feasible, as no common patient identifiers were included in the data.
- Privacy and Security Controls: An amended sale agreement explicitly required that a number of measures be in place, including:
  - Signed confidentiality agreements prohibiting data linking and/or re-identification, by all employees, consultants, and sub-contractors;
  - Access controls only allowing authorized staff to access and use data on a “need-to-know” basis and role-based access policies, which are regularly enforced and audited;
  - Training requirements for all employees, consultants, and sub-contractors working with the data;
  - Privacy, security, and usage standard operating procedures, specifically prohibiting re-identification;
  - Retention, destruction, and storage policies;
  - Record-keeping obligations regarding all signed data-sharing agreements and confidentiality agreements, and making such agreements available to the custodian upon request;
  - A privacy program to monitor practices within the organization, including a breach protocol that is regularly reviewed and tested; and
  - Regular internal and external audits where gaps are identified and mitigated.
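For readers who want a sense of what measuring re-identification probability against a threshold can look like in practice, the following is a minimal sketch only. It is not the method used in the Decision, which does not disclose the algorithms or thresholds applied. It assumes one common measurement approach: records are grouped by a set of quasi-identifiers, and the risk for each record is taken as one divided by the size of its group. The field names (`birth_year`, `postal_prefix`, `sex`), the sample records, and the threshold values are all hypothetical.

```python
from collections import Counter

# Hypothetical quasi-identifiers: fields that are not direct identifiers
# but could, in combination, single out an individual.
QUASI_IDENTIFIERS = ("birth_year", "postal_prefix", "sex")


def equivalence_class_sizes(records, quasi_identifiers=QUASI_IDENTIFIERS):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def reidentification_risk(records, quasi_identifiers=QUASI_IDENTIFIERS):
    """Return the maximum and average per-record risk.

    A record's risk is 1 / (size of its equivalence class), so the
    maximum risk comes from the smallest class, and the average risk
    equals (number of classes) / (number of records).
    """
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    if not sizes:
        return {"max_risk": 0.0, "avg_risk": 0.0}
    return {
        "max_risk": 1 / min(sizes.values()),
        "avg_risk": len(sizes) / len(records),
    }


def meets_threshold(records, threshold, quasi_identifiers=QUASI_IDENTIFIERS):
    """True if no record's risk exceeds the chosen (assumed) threshold."""
    return reidentification_risk(records, quasi_identifiers)["max_risk"] <= threshold


# Illustrative, made-up records: three share one profile, two share another.
records = [
    {"birth_year": 1980, "postal_prefix": "M5V", "sex": "F"},
    {"birth_year": 1980, "postal_prefix": "M5V", "sex": "F"},
    {"birth_year": 1980, "postal_prefix": "M5V", "sex": "F"},
    {"birth_year": 1975, "postal_prefix": "K1A", "sex": "M"},
    {"birth_year": 1975, "postal_prefix": "K1A", "sex": "M"},
]

risk = reidentification_risk(records)
print(risk)
print(meets_threshold(records, threshold=0.5))
```

A score of this kind captures only the data-side probability. The contextual factors listed above (mitigating controls, and the recipient’s motive and capacity) would still need to be assessed separately, for example through the questionnaires and contractual measures described in the Decision.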
Organizations should consider the above in their approach to the de-identification of personal information, as there may be regulatory, legal, and reputational risks associated with the practice.