Securing sensitive data with the blockchain
Introduction
This tutorial explains how to secure sensitive data using a hybrid solution that stores the sensitive data in a centralized database, and places a unique proof of all operations on the public blockchain.
Ever since the European Parliament adopted the General Data Protection Regulation (commonly known as GDPR) in April 2016, decentralized data stores such as DLTs have faced a major challenge in complying with it.
One of blockchain's founding principles is that data is as immutable as possible, which inherently conflicts with the regulation protecting EU citizens' personal data. Other regions, such as Asia and North America, have also begun initiatives to protect personal data in a similar way.
Thanks to the architecture of the Ardor platform, it is possible to quickly develop a solution that secures sensitive data while still using a public blockchain. When reviewing the solution described in the following sections, keep in mind that it does not treat a hash of sensitive data as pseudonymous personal data, because hashing is a one-way function. When replicating the solution, the more personal data is included in the hash, the higher the level of security.
Architecture of the solution
The schema that describes how this solution works is as follows:
The basic concept is to combine a centralized database, which stores personal data such as name, surname, nationality and credit score, with an immutable ledger, so that this essential data can't be tampered with unnoticed. The interface for inserting, querying, updating or deleting these fields is provided by an add-on that extends the node's capabilities.
Once a record is inserted, a message is stored on the blockchain, sent from the manager account to the account configured in the nxt.properties file as the receiver of the information.
Therefore, every operation except querying the centralized CUSTOMER table leaves a public fingerprint, so any change can be easily tracked. In this example, the credit score is highly sensitive data, and credit rating companies have struggled to find a secure way to share it with financial entities. With this extension of the Ardor software, the data can even be shared using the blockchain's encryption capabilities.
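The fingerprint mentioned above can be sketched as a hash over the fields of one customer record. This is a minimal illustration, not the add-on's actual code: the class name CustomerHash, the choice of SHA-256, and the field separator are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of the Ardor code base.
final class CustomerHash {

    // Unit separator keeps field boundaries unambiguous,
    // so ("ab", "c") and ("a", "bc") produce different inputs.
    private static final String SEPARATOR = "\u001f";

    /** Returns the lowercase hex SHA-256 digest of the record's fields. */
    static String hash(String name, String surname, String nationality, int creditRating) {
        String canonical = String.join(SEPARATOR,
                name, surname, nationality, Integer.toString(creditRating));
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] bytes = digest.digest(canonical.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(bytes.length * 2);
            for (byte b : bytes) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

Because all fields feed into one digest, changing any single field, such as the credit rating, yields a different fingerprint.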
Centralized database
The "customer" table that stores the sensitive data could be defined as follows:
"CREATE TABLE customer (id INT IDENTITY, " +
"name VARCHAR NOT NULL, " +
"surname VARCHAR NOT NULL, " +
"nationality VARCHAR NOT NULL, " +
"credit_rating INT NOT NULL)"
Here id is an auto-increment identity column in H2, with a sequence as its default. A column declared this way is implicitly the primary key of the table.
The values are stored as can be seen below:
As shown in the architecture diagram, the add-on functionality makes it possible to create new APIs.
Proof of concept
An example of the code structure could be as follows:
- Class for managing the add-on: GdprAddon
  - It can take care of the personal bundler that allows zero-fee transactions when setting up the hashes as account properties, as well as managing the properties and setting up the add-on operations.
- Class for the centralized database operations: GdprDb
  - This class would have the methods for interacting with the sensitive data.
- Class for managing the centralized database: GdprVersion
  - This class would create the table for the sensitive data if it has not been created previously.
- Class for inserting sensitive data: GdprInsertAPI
  - This class can contain all the code related to inserting the sensitive data.
- Class for querying and checking the data: GdprQueryAPI
  - It can contain all the code related to querying the data as well as checking its correctness against the blockchain.
- Class for updating an entry: GdprUpdateAPI
  - This class contains the code related to updating an entry and creating a new hash to store as a message in the blockchain layer.
- Class for deleting an entry: GdprDeleteAPI
  - This class contains the code related to deleting an entry and creating a new hash to store as a message in the blockchain layer as proof.
Add-on nxt.properties configuration
It is recommended to make parameters such as the following configurable:
nxt.addOns=com.jelurida.ardor.gdpr.GdprAddOn
gdpr.secretPhrase=I won't tell you
gdpr.recipientAccountRS=ARDOR-GHKP-XWB5-XMZB-CTUE3
- gdpr.secretPhrase is the secret phrase of the master account that sends the messages with the data.
- gdpr.recipientAccountRS is the recipient of the messages triggered by any change of state in the stored data.
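As an illustration of how the two gdpr.* parameters could be read and validated, the following sketch uses the standard java.util.Properties class. It is an assumption made for the example: the class name GdprConfig is hypothetical, and a real add-on would more likely read these values through the node's own configuration facilities.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical holder for the add-on settings; not part of the Ardor code base.
final class GdprConfig {
    final String secretPhrase;
    final String recipientAccountRS;

    GdprConfig(Properties properties) {
        this.secretPhrase = require(properties, "gdpr.secretPhrase");
        this.recipientAccountRS = require(properties, "gdpr.recipientAccountRS");
    }

    // Fails fast when a mandatory setting is missing or empty.
    private static String require(Properties properties, String key) {
        String value = properties.getProperty(key);
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("Missing required property: " + key);
        }
        return value;
    }

    /** Parses settings from text in the usual key=value properties format. */
    static GdprConfig parse(String text) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new GdprConfig(properties);
    }
}
```

Failing at startup when a setting is missing avoids discovering the problem only when the first proof message has to be sent.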
Walk through the functionality
Once the node is up and running with the add-on configured, the "customer" table is created. From the add-ons page you can insert, query, update and delete data:
Click on the add-ons section of the API page:
The operations that were developed are then accessible:
Insert sensitive data
The operation gdprInsert is used for inserting sensitive data as shown in the following image:
The insert operation triggers a transaction on the blockchain:
A message from the master account to the recipient account is sent. This message contains the following information:
- Operation
- Hash
- Customer Id
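The three items above could be serialized into the message body as a small JSON object. The sketch below is only an assumption about the layout: the class name ProofMessage, the JSON key names and the enum values are hypothetical, and the actual add-on may format its messages differently. A null hash is allowed so that the same helper can describe a deletion.

```java
// Hypothetical message builder; the JSON layout is an assumption for illustration.
final class ProofMessage {

    enum Operation { INSERT, UPDATE, DELETE }

    /**
     * Builds the message body. The hash must be lowercase hex, or null when
     * the record no longer has content to hash (as after a delete).
     */
    static String toJson(Operation operation, String hashHex, long customerId) {
        if (hashHex != null && !hashHex.matches("[0-9a-f]+")) {
            throw new IllegalArgumentException("hash must be lowercase hex");
        }
        // Hex digits never need JSON escaping, so plain concatenation is safe here.
        String hashValue = hashHex == null ? "null" : "\"" + hashHex + "\"";
        return "{\"operation\":\"" + operation.name() + "\""
                + ",\"hash\":" + hashValue
                + ",\"customerId\":" + customerId + "}";
    }
}
```

Only the hash and the numeric id appear in the message, so no personal field ever reaches the public ledger.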
Query sensitive data
The operation gdprQuery is used for querying sensitive data stored in the customer table.
There are two options:
- Option 1: without checking the data with the blockchain
- Option 2: checking the data with the blockchain
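The second option can be pictured as comparing a hash recomputed from the database row with the most recent hash published on the blockchain for the same customer id. The sketch below assumes a hypothetical ProofLookup interface standing in for whatever blockchain query the add-on performs; neither name comes from the Ardor API.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Optional;

// Hypothetical verifier; ProofLookup abstracts the blockchain query.
final class QueryVerifier {

    /** Returns the latest hash recorded on chain for a customer, if any. */
    interface ProofLookup {
        Optional<String> latestHash(long customerId);
    }

    /**
     * True only when a proof exists on chain and equals the hash
     * recomputed from the current database row.
     */
    static boolean matchesBlockchain(String rowHashHex, long customerId, ProofLookup lookup) {
        Optional<String> onChain = lookup.latestHash(customerId);
        if (!onChain.isPresent()) {
            return false; // no proof was ever published, or the record was deleted
        }
        return MessageDigest.isEqual(
                rowHashHex.getBytes(StandardCharsets.UTF_8),
                onChain.get().getBytes(StandardCharsets.UTF_8));
    }
}
```

A mismatch indicates that the row was changed in the database without a corresponding proof being published.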
Update sensitive data
The operation gdprUpdate performs the following actions:
- It updates the field or fields in the customer table that are specified in the operation's arguments
- It sends a message with the operation, new hash and customer id from the master account to the recipient account
Delete sensitive data
The operation gdprDelete clears all fields except the id, so that the id sequence stays consistent for subsequent inserts in the customer table.
As this operation changes the state, it triggers a message with the operation, a null hash and the id of the customer that was erased from the database.
Conclusion and use cases
This proof of concept applies to any use case where sensitive data has to be secured with a public or private blockchain. When someone exercises their right to be forgotten, you can simply clear all their data in the centralized database and assign a new account to that user. The hashes that remain on the blockchain become useless without the matching data in the centralized database. This guarantees that the data can be erased if the customer asks for deletion, while still making it possible to secure and track sensitive data on the blockchain.
The centralized database can be part of a legacy environment, or it can be a more robust centralized database for which a high-availability cluster can be designed.
To make the solution fully consistent, the commit to the centralized database must happen only once the triggered message has been confirmed, in the manner of a two-phase commit transaction.
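The ordering described above can be sketched as follows. This is a simplified illustration in the spirit of a two-phase commit, not a full protocol: the interfaces PendingDbChange and ProofPublisher are hypothetical stand-ins for an open database transaction and for sending the message and waiting for its confirmation.

```java
// Hypothetical coordination helper; all names are assumptions for illustration.
final class ConfirmedWrite {

    /** A database change that has been prepared but not yet committed. */
    interface PendingDbChange {
        void commit();
        void rollback();
    }

    /** Sends the proof message and reports whether it was confirmed on chain. */
    interface ProofPublisher {
        boolean publishAndAwaitConfirmation(String message);
    }

    /**
     * Commits the database change only after the proof is confirmed;
     * rolls it back if confirmation fails or publishing throws.
     */
    static boolean apply(PendingDbChange change, ProofPublisher publisher, String message) {
        boolean confirmed;
        try {
            confirmed = publisher.publishAndAwaitConfirmation(message);
        } catch (RuntimeException e) {
            change.rollback();
            throw e;
        }
        if (confirmed) {
            change.commit();
        } else {
            change.rollback();
        }
        return confirmed;
    }
}
```

With this ordering, the database never holds a committed change that lacks a confirmed proof. The opposite failure, a confirmed proof followed by a failed commit, still has to be handled separately, for example by retrying the commit.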