Is the Right to Be Forgotten a Real Problem for Blockchain?
New technologies that give people power and convenience tend to have their shortcomings. Even the Internet, the lifeblood of today’s post-industrial society, is sometimes used for harm. Blockchain technology has disadvantages of its own. One of them is an inherent property of a blockchain ledger: immutability, which is also one of the key strengths of the technology.
Adoption of immutable blockchain ledgers contradicts what appears to be a basic human right and one of the existing rights stated by the EU’s Data Protection Directive: “the right to erasure”. In some cases this conflict may cause substantial harm. Yet, a deeper dive into the details shows that even blockchain immutability has its loopholes.
The Right to be Forgotten in a Nutshell
In 1998, Spanish national Mario Costeja González faced financial difficulties. To solve them, he decided to auction his house. By 2009, his problems had been solved, but their consequences remained. The sale was mentioned in the online version of the Spanish newspaper La Vanguardia, and Google’s search engine had indexed the page. Even years later, whenever people searched for Mr. González’s name, Google returned the notice about the 1998 auction.
Eventually, Mr. González argued that those search results seriously damaged his reputation, as he was no longer in debt. La Vanguardia declined to remove the notice, which it had published on the order of the Spanish Ministry of Labour and Social Affairs, so he went directly to Google and asked it to remove the information about the auction from its index. Google refused as well. On May 13th, 2014, the European Court of Justice (ECJ) settled the dispute between Mr. González and the search giant, obliging the latter not to show search results related to the auction.
Various search engines help find and store information about almost everything, including people’s personal data. Moreover, even if a particular website that contains the information goes offline, the search engine keeps the cache, which means that this information is still accessible. To be clear, according to the ECJ, personal information is “any information relating to a living individual who is or can be identified either from the data or from the data in conjunction with other information that is in.”
In the aforementioned decision the ECJ ruled that:
- Information indexing by a search engine should be considered the “processing of personal data.”
- Considering the previous point, Google and other search engines are “data controllers.”
- Because Google Spain is a subsidiary of Google Inc. that promotes and sells the search giant’s advertising space in Spain, Spanish data protection law applies. National data protection legislation therefore applies regardless of where the indexing itself takes place, whether in Spain, in the US, or in any other country.
- Google should remove the links to websites that contain personal data, even if those websites are lawful and publish the data legally.
- A fair balance should be found between the legitimate interests of search engine users and the data subject’s (data owner’s) fundamental rights. The ruling states that such a balance “may however depend, in specific cases, on the nature of the information in question and its sensitivity for the data subject’s private life and on the interest of the public in having that information, an interest which may vary, in particular, according to the role played by the data subject in public life.”
Shortly after the decision, Google established a system that lets users exercise this right by submitting an online removal request. The company may reject a request, and such a rejection may be challenged before a supervisory or judicial authority.
Said “claim for information removal” is similar to the “right to be forgotten”, also known as the “right to erasure”, recognized by the ECJ under the Data Protection Directive. This right gives lawful grounds to ask for the removal of links to websites containing personal information that is inadequate, irrelevant or no longer relevant, or excessive, considering the purposes for which the information was processed and the time that has elapsed.
The Right to be Forgotten vs Blockchain Immutability
According to the new General Data Protection Regulation (GDPR) of the European Union, which becomes applicable on May 25th, 2018, the right to be forgotten empowers any person to demand the correction or even removal of their personal data, and of information that affects them, from public sources if that information is no longer necessary for the purposes it was collected for, or if the person has not consented to its storage or processing.
Enter blockchain technology. Generally, any information registered in a blockchain becomes part of a unique, unrepeatable, and indelible record. This feature is a problem and an advantage at the same time. On the one hand, it secures the information and lets the system defend itself against illegal or duplicate transactions. On the other, it makes deleting that information practically impossible, and data that cannot be corrected can keep causing harm to the people it concerns. It appears that the main feature of this new technology conflicts with the fundamental principles of the right to be forgotten.
Let’s first figure out what “immutability” means in terms of blockchain technology. In general, it means that once a block containing a transaction has received a certain number of confirmations from the network, it will not be replaced, changed, or reversed. This feature is deemed central to all blockchains and the basis of their “trustlessness”. It distinguishes blockchain technology from traditional databases, where administrators can freely delete or edit data.
Yet, this property is not absolute.
“In blockchains, there is no such thing as perfect immutability. The real question is: What are the conditions under which a particular blockchain can and cannot be changed? Do those conditions match the problem we’re trying to solve?” said Gideon Greenspan, CEO of Coin Sciences Ltd.
To find out whether there is a conflict between the regulations and blockchain technology, it is necessary to note that the technology underlies different kinds of solutions. Generally, there are two kinds of blockchain networks:
- Public chains (or open blockchains).
- Private chains (also called closed or exclusive blockchains).
The main differences between these two are as follows:
- Consensus algorithms. In most cases, public chains use Proof-of-Work (PoW), although Ethereum and some other projects are going to use or already use Proof-of-Stake (PoS) or some hybrid consensus mechanism. Private blockchains mostly use well-established consensus algorithms with authenticated participants, such as Practical Byzantine Fault Tolerance (PBFT) or modified Proof-of-Authority (PoA). PoW may be applied as well.
- Control over the network. Public chains are controlled by the wide community of core developers, miners and users. In turn, when it comes to private blockchains, the governance is carried out by a specific group of people or enterprises.
- Application. While public chains are mostly used for payments (as seen in Bitcoin) or as a platform for decentralized applications’ development (as seen in Ethereum), almost all private chains are used for solving specific business tasks such as medical data storage, property rights governance, bank settlements etc.
Thus, the right to be forgotten might be less of a problem for private networks, while conflicting with some key properties of public blockchains.
The Right to be Forgotten vs Public Chains
The most popular public blockchain networks are Bitcoin and Ethereum. These two blockchains are used to exchange value (even deploying a smart contract might be deemed a value transaction that makes the contract live on the network), and both store their networks’ transaction history. In both systems, there is an option to include somebody’s personal information in transaction metadata, thus uploading it to a publicly available source. In theory, this is exactly the ground for applying the right to be forgotten.
However, actually exercising the right to be forgotten, as stated by the GDPR, would be complicated at best.
Most public blockchains are p2p networks without any centralized authority. If the need to enforce the right arises, there is no authority able to delete or hide the data the way Google can. Even in the case of Ethereum, a public blockchain with something of an administration, the development team can’t just alter the ledger as it pleases.
The reason for that lies, of course, in immutability. Changes to a public blockchain are made by achieving consensus among the validating nodes in the system. If a transaction containing undesired metadata is validated by a sufficient number of participants, it stays the way it is, metadata included. Furthermore, nodes can’t just wipe a certain transaction from their disks, as that would change the corresponding block’s hash and break a link in the chain. The next time the blockchain is scanned or shared, validation would fail from that block onward.
To enforce someone’s right to be forgotten, network nodes would have to rewrite the block without the problematic transaction, calculate the block’s new hash, then change the hash embedded in the next block to match, and so on to the end of the chain. This is theoretically possible, yet it would take hours or days to complete in a blockchain with millions of blocks and transactions. Even worse, the nodes engaged in this process would be unable to process new incoming network activity normally, thus impairing the network’s work for a long time.
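The cascade described above can be sketched with a toy hash chain. This is a minimal illustration, not how Bitcoin or Ethereum actually serialize blocks: the block layout and the helper names (`build_chain`, `first_broken_link`, `rewrite_from`) are assumptions made for the example, and proof-of-work is left out entirely, so the real cost of a rewrite would be far higher.

```python
import hashlib

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(prev_hash: str, payload: str) -> str:
    """Hash a block from its predecessor's hash and its own payload."""
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()


def build_chain(payloads):
    """Build a list of blocks, each storing the hash of the previous one."""
    chain, prev = [], GENESIS_PREV
    for payload in payloads:
        h = block_hash(prev, payload)
        chain.append({"prev": prev, "payload": payload, "hash": h})
        prev = h
    return chain


def first_broken_link(chain):
    """Return the index of the first inconsistent block, or None if valid."""
    prev = GENESIS_PREV
    for i, block in enumerate(chain):
        expected = block_hash(block["prev"], block["payload"])
        if block["prev"] != prev or block["hash"] != expected:
            return i
        prev = block["hash"]
    return None


def rewrite_from(chain, index, new_payload):
    """Replace one payload and re-hash every block up to the chain tip.

    Returns the number of blocks that had to be re-hashed.
    """
    chain[index]["payload"] = new_payload
    prev = chain[index]["prev"]
    rehashed = 0
    for block in chain[index:]:
        block["prev"] = prev
        block["hash"] = block_hash(prev, block["payload"])
        prev = block["hash"]
        rehashed += 1
    return rehashed


chain = build_chain([f"tx-{i}" for i in range(10)])

# Simply wiping data from one block breaks validation at that block.
chain[3]["payload"] = "tx-3 (personal data removed)"
broken_at = first_broken_link(chain)

# Repairing the chain means re-hashing that block and every later one.
rehashed = rewrite_from(chain, 3, chain[3]["payload"])
```

In this ten-block toy chain, editing block 3 forces seven blocks to be re-hashed. The work grows with the distance between the edited block and the tip, which is why old data is the hardest to remove.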
Apart from transactions, some blockchains work with smart contracts that also store information within the network. In the case of Ethereum, if the undesired information is stored in a smart contract, the contract can be erased, but only if its code includes a ‘selfdestruct()’ function. If this function isn’t set beforehand, the data remains where it is.
In addition, in a decentralized and pseudonymous network it is hard to identify the violator. And even if the data is somehow erased from the ledger, there is no guarantee that it won’t appear again, since anybody can add data to a public blockchain by executing a valid transaction.
In theory, there are several solutions. The most obvious one is to fork the chain from a block that precedes the one with the data in question. This can be achieved through a so-called 51% attack, in which miners or validators controlling more than half of the network’s computing power (possibly a single one) agree to build a new valid chain from the selected block.
Another way is a hard fork prompted by the network developers, like the one that took place after the notorious breach of the DAO. However, in both cases the undesired data will remain somewhere in the orphaned blocks. In addition, the time and effort required to make the changes will be too high in comparison with traditional databases.
According to the report by the law firm Osborne Clarke, in cases when there is a possibility that some pieces of data might need to be erased, blockchain developers should consider the “tokenization” concept. It means that the data in the transaction blocks are replaced with unique identifiers acting as a link to removable “tokens” containing the actual personal data. This concept doesn’t seem hard to implement in the software (e.g. wallets), and would allow the system to comply with the GDPR by removing the personal data if required without compromising the integrity of the entire ledger.
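A minimal sketch of the tokenization idea follows, under stated assumptions: the off-chain store is a plain in-memory dictionary, and the names `tokenize`, `erase`, and `resolve` are hypothetical helpers invented for this example, not part of any wallet or of the Osborne Clarke report. It shows that deleting the off-chain record leaves every block hash untouched.

```python
import hashlib
import secrets

# Hypothetical off-chain store mapping token -> personal data.
# In practice this would be an ordinary database that supports deletion.
off_chain_store = {}


def tokenize(personal_data: str) -> str:
    """Store personal data off-chain and return a random identifier."""
    token = secrets.token_hex(16)
    off_chain_store[token] = personal_data
    return token


def erase(token: str) -> None:
    """Honor an erasure request by deleting the off-chain record."""
    off_chain_store.pop(token, None)


def resolve(token: str):
    """Look up the personal data behind a token (None once erased)."""
    return off_chain_store.get(token)


def block_hash(prev_hash: str, payload: str) -> str:
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()


# Only the token, never the personal data, goes into the ledger.
token = tokenize("Jane Doe, passport X1234567")
ledger, prev = [], "0" * 64
for payload in ["payment 1", f"customer-ref:{token}", "payment 2"]:
    h = block_hash(prev, payload)
    ledger.append({"prev": prev, "payload": payload, "hash": h})
    prev = h

hashes_before = [block["hash"] for block in ledger]
erase(token)
hashes_after = [block_hash(b["prev"], b["payload"]) for b in ledger]
```

The token here is random rather than a hash of the personal data, so once the off-chain record is deleted, the identifier left on the chain no longer leads back to the person.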
The Right to be Forgotten vs Private Chains
As private chains are mostly used in more specific cases than public blockchains, it’s more likely that administrators of such chains who collect and store personal data do so much as traditional services like Google do. In this case, exercising the right to be forgotten doesn’t necessarily pose any problems. For companies and institutions that use private chains, immutability rests on the validators’ and other network participants’ compliance with the existing rules, so they can modify their databases accordingly.
In the medical or banking fields, which store highly sensitive data such as passport numbers, Social Security Numbers (SSN), HIPAA-protected information (e.g., health, medical, or psychological information), and so forth, such information must be stored and protected accordingly. In this case the information shouldn’t be considered publicly available, so the right to be forgotten shouldn’t be relevant.
Private chain validators can restrict access to the data to specific groups (like banks, police, etc.), so the case won’t fall under the “public source” notion. As validators have full control of the blockchain, they can easily change or delete the data on the chain and carry out the erasure this way. Moreover, in this case the network in question would have an entity in charge that can be asked, or at least forced, to remove the data. It is safe to assume that for private blockchains the GDPR and the right to be forgotten won’t pose any big problems.
Another potentially useful concept is the chameleon hash (or redactable blockchain). It relies on a special chameleon hash function that, if used in place of the original hash function (such as SHA-256 employed in Bitcoin), can output the same hash value as before even after the block has been edited. In other words, an authorized person holding a secret trapdoor key can alter the contents of a block with hash A and use the chameleon function to produce the same hash A for the new input data.
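To make the idea concrete, here is a toy sketch of one discrete-logarithm-based chameleon hash, CH(m, r) = g^m · h^r mod p, where h = g^x and x is the trapdoor. It is an illustration under loud assumptions: the group parameters are far too small to be secure, the function names are invented for this example, and nothing here is claimed to match the construction used by any particular redactable blockchain proposal.

```python
import hashlib
import secrets

# Toy group parameters (assumption: illustrative only, trivially breakable).
P = 2039  # safe prime, P = 2*Q + 1
Q = 1019  # prime order of the subgroup generated by G
G = 4     # generator of the order-Q subgroup


def digest(data: bytes) -> int:
    """Map block contents to an exponent modulo Q."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def keygen():
    """Return (trapdoor, public_key); the trapdoor is held by the redactor."""
    trapdoor = secrets.randbelow(Q - 1) + 1  # secret x in [1, Q-1]
    return trapdoor, pow(G, trapdoor, P)


def chameleon_hash(data: bytes, r: int, public_key: int) -> int:
    """Compute g^digest(data) * h^r mod P, where h is the public key."""
    return (pow(G, digest(data), P) * pow(public_key, r, P)) % P


def find_collision(old: bytes, r: int, new: bytes, trapdoor: int) -> int:
    """Use the trapdoor to find r_new so that (new, r_new) hashes like (old, r).

    Solves digest(old) + x*r = digest(new) + x*r_new (mod Q) for r_new.
    """
    inverse = pow(trapdoor, -1, Q)  # modular inverse, Python 3.8+
    return (r + (digest(old) - digest(new)) * inverse) % Q


trapdoor, public_key = keygen()
original = b"tx: Alice -> Bob, note: Alice's home address"
redacted = b"tx: Alice -> Bob, note: [redacted]"

r = secrets.randbelow(Q)  # randomness stored alongside the block
hash_a = chameleon_hash(original, r, public_key)

r_new = find_collision(original, r, redacted, trapdoor)
hash_b = chameleon_hash(redacted, r_new, public_key)
```

Here `hash_a` and `hash_b` are equal, so the next block’s link stays valid even though the contents changed. Whoever holds the trapdoor can do this for any block, which is exactly the trust problem such a scheme brings with it.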
Although the chameleon hash concept may enable ledgers to comply with the GDPR and with the right to be forgotten in particular, it is applicable only in certain situations that generally involve private ledgers. Moreover, chameleon functions should not be implemented in cases related to transferring financial value, as they allow the trusted parties to alter the chain for their own benefit.
A Conflict for Some, an Obstacle for the Others
To sum up, the apparent conflict between blockchain technology and the right to be forgotten may be critical in certain cases, but given wider adoption and further development of both the legislation and the technologies, it is solvable.
The “tokenization” and the chameleon hash concepts may contribute substantially to the optimal solution for the existing contradictions between blockchain properties and the right to be forgotten. However, there have been no cases of their implementation so far.
It is important that the regulators and the blockchain developers work towards a mutually acceptable solution. Considering that the GDPR was developed without taking blockchain into account, it would be wise to start optimizing the legal framework for blockchains, and at the same time continue optimizing blockchains for the existing and future legal frameworks.