Contributed by: Samvad Partners
The concept of data privacy and personal data protection has gained significant traction over the last couple of years. With the decision of the Supreme Court in 2017 in Justice K.S. Puttaswamy (Retd) vs Union of India, a fundamental right to privacy was recognised. With the release of the Justice Srikrishna Committee Report on Personal Data Protection in 2018, a regime for the protection of an individual’s rights to their data was envisaged. This envisaged regulatory regime, through the draft Personal Data Protection Bill (“PDP Bill”), is in the process of eventually becoming legally enforceable. However, most of the emphasis over the last few years has been on personal data, i.e. the data belonging to and capable of identifying an individual.
Until recently, there has been little thought in India or elsewhere devoted to the regulation of other categories of data, including aggregated data sets, data belonging to a particular community or anonymised personal data. The Justice Srikrishna Committee did recommend a law to protect community data. Additionally, there were calls from the Government to ‘break up big tech’ and allow the sale of government data, all harbingers of an anticipated regulation of non-personal data. On September 16, 2019, a committee was formed to recommend a governance framework for Non-Personal Data, headed by Infosys co-founder Kris Gopalakrishnan. On July 7, 2020, the Non-Personal Data Framework Committee (“NPD Framework Committee”) released its draft report on the regulation of non-personal data (“Original NPD Report”). The NPD Framework Committee also invited comments on the draft report, and several stakeholders submitted comments raising concerns about a multitude of issues regarding non-personal data regulation. In December 2020, the NPD Framework Committee released a revised report (“Revised NPD Report”) making a number of modifications and limitations to the regulation of non-personal data.
However, the core of the framework, a regulatory regime for the acquisition and sharing of non-personal data from government entities as well as private companies, remains. In other words, both the Original and the Revised NPD Reports envisage a regime for the compulsory acquisition, regulation and sharing of non-personal data. It is important to note that a regulatory regime of this nature has not been envisaged elsewhere in the world and, accordingly, legislation deriving from the NPD Framework Committee’s reports could be the first of its kind worldwide. However, it is an open question whether the NPD Framework Committee’s reports address this vital issue with sufficient nuance and detail. This article seeks to lay out the salient points of this proposed regulatory regime, as reflected in the Revised NPD Report.
Understanding Non-Personal Data
Non-Personal Data (“NPD”) is defined in two ways in the NPD Reports. Firstly, NPD is defined through a negative definition, as data which is not ‘Personal Data’ (as defined under the PDP Bill), or data without any Personally Identifiable Information (PII). PII is information that is capable of identifying the individual from whom it originated. Secondly, NPD is defined through a positive definition, as data that either:
never related to an identified or identifiable natural person, such as weather data, industrial machine sensor data, or other such data, or
was initially personal data but was later made anonymous. For instance, aggregated personal data which has undergone anonymisation techniques to the extent that individual-specific events are not identifiable would qualify as anonymous data.
The Revised NPD Report goes on to provide several illustrations of scenarios describing non-personal data collected by public entities as well as private entities. It is important to note that NPD, as a result, pertains to all categories of data collected by the Government or by private entities that are not personal data. It could pertain to government data about pollution, or to aggregated government hospital health data pertaining to patients. It could also pertain to aggregated or statistical data collected by a private company pertaining to its employees, vendors, partners, product orders or customers, or to a private company’s satellite remote sensing information about India’s forest coverage. In other words, the scope of non-personal data is vast and capable of extending to most databases and datasets collected and utilised by companies for their analytics operations.
The Original NPD Report had categorised NPD into three groups: Public, Private and Community NPD. Each such category had specific regulations governing its ownership as well as sharing regimes. The Revised NPD Report has done away with this explicit categorisation. However, the treatment of NPD pertaining to these categories remains in the Revised Report.
Concept of a Data Business and Obligations
Data Business: The NPD Framework Committee, in both of its reports, defined a Data Business as a public or private sector entity that collects, processes, stores or otherwise manages NPD. The concept of a Data Business is not limited to any industry sector but is instead a horizontal classification, capable of covering entities ranging from banking and finance businesses to internet services, universities and non-governmental organisations, if they meet an as yet undetermined threshold of data collected or processed. The concept of a Data Business is applicable to Government entities as well.
Meta-Data Sharing Obligation: All Data Businesses are required to share meta-data (i.e. data that provides information about the data being collected by the business) under the NPD Framework. For instance, a hospital collecting patient data would be required to share data about what data it is collecting, such as the fact that it collects “patient name”, “age”, “weight”, “symptoms” and other such data pertaining to the patient. This meta-data is envisaged as being stored in open access ‘meta-data directories’.
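The distinction between data and meta-data in the hospital illustration can be sketched in a few lines of Python. This is only an illustrative sketch: the records, the `describe_fields` helper and the shape of the resulting directory entry are assumptions made for this example, and the NPD Reports do not prescribe any format for meta-data directories.

```python
# Hypothetical patient records held by a hospital (illustrative values only).
patient_records = [
    {"patient name": "A", "age": 41, "weight": 70.5, "symptoms": "fever"},
    {"patient name": "B", "age": 29, "weight": 58.0, "symptoms": "cough"},
]


def describe_fields(records):
    """Return meta-data only: each field name mapped to the type of its values.

    No actual patient values are copied into the result.
    """
    fields = {}
    for record in records:
        for name, value in record.items():
            # Keep the first type seen for each field name.
            fields.setdefault(name, type(value).__name__)
    return fields


# Assumed shape of an entry that might be published in a meta-data directory.
metadata_entry = describe_fields(patient_records)
print(metadata_entry)
```

The published entry lists what is collected (field names and value types) while the underlying patient data stays with the hospital.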
Registration Obligation: All data businesses are required to register with the authority envisaged under the Revised NPD Report once they achieve certain threshold parameters. Such parameters could include gross revenue, number of consumers/households/devices handled, percentage of revenues from consumer information and other such factors.
Community: Both the Original and Revised NPD Reports envisage community rights over NPD. Under both reports, the definition of a community is broad and nebulous, encompassing any group of people that are bound by common interests and purposes, and involved in social and/or economic interactions. In other words, under the present regime, a community could be a specific religious or ethnic sub-group such as the Orthodox Syrian Christians of Kerala, or it could also be a culturally, linguistically and ethnically diverse group focused on the playing of board games online. The NPD Framework Committee envisages that communities should be able to benefit from the processing of NPD pertaining to them, and that the community should be able to raise a complaint with a regulatory authority about harms emerging from sharing NPD about them.
Data Principal: The Data Principal is the entity to whom the NPD being collected relates.
Data Custodian: The Data Custodian is the entity which collects, stores and processes data in the best interests of the data principal. The data custodian may be either a private entity or a government entity. Data custodians are expected to practise responsible data stewardship and owe a duty of care to the concerned community in relation to handling non-personal data related to it.
Data Processor: Similar to the PDP Bill, Data Processors are entities that process Non-Personal Data on behalf of a data custodian. The data processor will be considered a data custodian only for data that it collects, stores, processes, uses, etc. as part of its business operation, and not for data processed on behalf of any data custodian. This concept of a data processor and the exemption from Data Custodian obligations was only introduced and clarified in the Revised NPD Report.
High-Value Datasets (HVD): An HVD is a dataset that is beneficial to the community at large and shared as a public good, subject to certain guidelines. The HVD is a concept created under the Revised NPD Report, whereas the Original NPD Report envisaged the concept of Data Trusts instead.
Data Trustee: A Data Trustee is an entity that will exercise rights over community data for a community. The Data Trustee could either be a government entity or a non-profit private organisation. The Data Trustee is responsible for creating, maintaining and sharing such HVDs in India. The Revised Report states that any group of individuals can constitute a Data Trustee. This could result in situations of conflict between different entities claiming to be the appropriate Data Trustee for a particular HVD. A Data Trustee is also considered a Data Business under the Revised NPD Report. Each HVD will be managed by a Data Trustee. A Data Trustee can request the creation of a specific HVD subject to certain guidelines and can request NPD from Data Custodians having data relevant to the creation and maintenance of the HVD. Data Trustees, in turn, are required to share access to datasets in the HVD with public or private organisations (not individuals) registered in India who request such data (“Data Requester”), for which the Data Trustee may levy a nominal charge.
Data Sharing Framework
Under both the Original NPD Report and the Revised NPD Report, a framework is envisaged with two enforceable regimes for the sharing of data. The Revised Report mentions that data sharing for sovereign purposes (such as national security or legal purposes) or business purposes (shared between two or more for-profit entities) is excluded from the ambit of the NPD Framework Committee’s Reports. The focus of data sharing under the NPD Framework Committee’s Reports, however, is limited to public good purposes such as, for instance, health, academic research or agriculture.
Regime 1: Sharing Data for HVDs: A Data Trustee examines the meta-data compulsorily provided by different Data Custodians. The Data Trustee then identifies the meta-data relevant to the creation, maintenance or improvement of an HVD and, in accordance with specific guidelines and procedures, requests the relevant data from all Data Custodians. Such data is then formed into an HVD or used for the maintenance or improvement of an HVD. The Revised NPD Report also envisages a granular system of differentiating data being requested for an HVD. Raw, factual or transactional data (the base level of data provided or observed) cannot be requested as a complete dataset, though subsets of data fields can be requested. Aggregate data (such as means or medians of raw data) can be requested. Finally, inferred data (such as insights developed by combining different data points) may not be requested from private entities but may be requested from public entities except in cases of national security. By way of illustration, the details pertaining to a taxi trip of a traveller would be raw data, aggregated details of daily taxi trips of all travellers in a locality would constitute aggregate data, and insights into traveller behaviour, travel patterns or payment preferences pertaining to such trips would be inferred data. It is important to note that the Revised Report specifically excludes from its ambit the sharing of private companies’ trade secrets or other proprietary information regarding their employees, internal processes and productivity data.
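The three tiers in the taxi illustration can be made concrete with a short Python sketch. The trip records, field names and locality label below are hypothetical and chosen only for illustration; the Revised NPD Report does not specify any data schema.

```python
from collections import Counter
from statistics import mean

# Raw data: hypothetical individual taxi trips, one record per trip.
trips = [
    {"locality": "Locality X", "fare": 120.0, "payment": "card"},
    {"locality": "Locality X", "fare": 80.0, "payment": "cash"},
    {"locality": "Locality X", "fare": 100.0, "payment": "card"},
]

# A subset of raw data fields (here, fares only), as opposed to the
# complete raw dataset, which cannot be requested in full.
fare_subset = [{"fare": trip["fare"]} for trip in trips]

# Aggregate data: summary statistics computed over the raw records.
aggregate = {
    "trip_count": len(trips),
    "mean_fare": mean(trip["fare"] for trip in trips),
}

# Inferred data: an insight derived by analysing the raw records,
# here the most common payment method among travellers.
payment_counts = Counter(trip["payment"] for trip in trips)
inferred = {"preferred_payment": payment_counts.most_common(1)[0][0]}

print(aggregate)
print(inferred)
```

In this sketch, `fare_subset` and `aggregate` correspond to the kinds of data that may be requested, while `inferred` corresponds to the insights that may not be requested from private entities.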
Regime 2: Requests for HVD Data: Once the HVD is created under the management of a Data Trustee, a Data Requester can request a Data Trustee for access to the datasets in the HVDs managed by the Data Trustee. Though the Revised NPD Report examines data sharing for the creation of HVDs in much detail, there is not much detail provided on the regime for managing requests for HVD data from data trustees.
Non-Personal Data Authority
The NPD Framework Committee proposes the creation of a separate authority for administration of the NPD Framework, the Non-Personal Data Authority (NPDA). The NPDA is intended to work harmoniously and without conflict with other authorities like the Personal Data Protection Authority and the Competition Commission of India. Its function consists both of enablement (creating a data sharing framework and rules for the same, as well as managing the directory consisting of meta-data provided by Data Businesses) and enforcement (adjudicating data sharing requests, addressing privacy concerns and preventing re-identification of anonymised data, and establishing rights over Indian NPD).
Intellectual Property Rights in the Data Sharing Regime
The Revised Report examined the conflict between the NPD sharing regime and intellectual property rights from the perspective of copyright law and trade secret law.
Copyright Law: The NPD Framework Committee, in the Revised Report, determined that copyright protection may vest in databases where data has been compiled or organised by the exercise of some skill or creativity. However, where the grant of copyright over a compilation of data would in effect amount to conferring a property right over the underlying data, the database would not be copyrightable. The NPD Framework Committee recommended, on this basis, that data sharing may be mandated only for designated high-value datasets, where the fields for data to be shared are pre-determined (and are expected to be a subset of the fields in the original database) and relatively straightforward. In the opinion of the NPD Framework Committee, if the extraction is done per given pre-set fields, it would not violate the copyright in the database from which such extraction is done.
Trade Secret Law: The NPD Framework Committee, in the Revised Report, noted that trade secrets are protected only by contract law and equity in India, and not by any specific legislation. It further noted that, if the act of compiling or processing any non-personal data leads to an inherently secret compilation of data, then such a compilation of data would be entitled to trade secret protection. However, the NPD Framework Committee observed that this protection is unlikely to prevent the exercise of eminent domain over such data.
Treatment of Conflicts and Overlaps with Personal Data Regulation
The NPD Framework Committee, in both the Original and Revised NPD Reports, has sought to minimise conflicts with the regulation of Personal Data. In the Original Report, however, ambiguities still arose, the most important of which was the question of how to deal with NPD derived from anonymised Personal Data. Given that such anonymised Personal Data is capable of being re-identified with the right tools and techniques, this raised a question of how such data would be treated. While the Original Report acknowledged this situation and recommended that appropriate standards of anonymisation be defined to prevent or minimise the risks of re-identification, the Revised Report provides a deeper level of distinction between the regulatory regimes for Personal Data and NPD.
The Revised Report provides that NPD which becomes re-identifiable would once again be governed by the PDP Bill. Further, the Revised Report provided that mixed datasets that have inextricably linked personal and non-personal data would be governed by the PDP Bill. The Revised Report also recommended amendments to the PDP Bill to minimise conflict with the proposed NPD Framework, specifically suggesting amendment of Section 91(2) of the PDP Bill which allows the Central Government to direct any data fiduciary or data processor to provide any anonymised personal data or other non-personal data to enable better targeting of delivery of services or formulation of evidence-based policies by the Central Government, and other provisions deriving from Section 91(2) of the PDP Bill.
It is important to note that one of the major changes in the Revised Report is the treatment of individual consent for anonymised data. Whereas in the Original Report, prior consent from an individual was a requirement for anonymisation of personal data into NPD, the Revised Report envisages an opt-out regime which only allows the data principal (the individual providing the data) to opt out on a prospective basis. This provision is contrary to a number of provisions of the PDP Bill, in which prior informed consent is a significant theme.
Issues and Concerns with the Non-Personal Data Framework
While the NPD Framework is a new and unique approach to the sharing of non-personal data, there are several issues that arise from the reports of the NPD Framework Committee. At the very outset, the question of appropriation arises. The NPD Framework envisages the compulsory acquisition (subject to guidelines) of datasets and databases from public entities as well as private companies, and their effective redistribution through HVD sharing mechanisms. Private companies who have spent considerable time and effort collecting and acquiring data would find significant issues with such an initiative, which effectively amounts to the exercise of eminent domain over databases. Additionally, a number of issues arise regarding the practical implementation of such a sharing mechanism. The definition of a community within the scope of the NPD Framework is currently nebulous. Neither the criteria for determining which Data Trustee is the appropriate entity to manage a particular community’s data, nor the mechanism for adjudicating between competing Data Trustees, has been detailed. This could result in conflicts between public and private entities, as well as conflicts between specific private entities lobbying for the position. Further, the actual mechanism by which Data Requesters may access HVDs has not been fleshed out in sufficient detail. While the NPD Framework is still at a conceptual level and has not proposed draft legislation as yet, there are fundamental concerns about the desirability of legislation that appropriates data from entities. Additionally, there are fundamental concerns with the working of such a regulatory regime that will need to be effectively addressed before determining whether a framework for acquisition and redistribution of databases is desirable or not.
The above article has been authored by Mr Rohan K George (Partner), Samvad Partners.