Open source eKYC blockchain built on Hyperledger Sawtooth

By October 18, 2018 Blog

Guest post: Rohas Nagpal, Primechain Technologies

1. Introduction

Financial and capital markets use the KYC (Know Your Customer) system to identify “bad” customers and minimize money laundering, tax evasion, and terrorism financing. Efforts to prevent money laundering and the financing of terrorism are costing the financial sector billions of dollars. Banks are also exposed to huge penalties for failure to follow KYC guidelines. Costs aside, KYC can delay transactions and lead to duplication of effort between banks.

Blockchain-eKYC is a permissioned Hyperledger Sawtooth blockchain for sharing corporate KYC records amongst banks and other financial institutions.

The records are stored in the blockchain in an encrypted form and can only be viewed by entities that have been “whitelisted” by the issuer entity. This ensures data privacy and confidentiality while at the same time ensuring that records are shared only between entities that trust each other.

Blockchain-eKYC is maintained by Rahul Tiwari, Blockchain Developer, Primechain Technologies Pvt. Ltd.

The source code of Blockchain-eKYC is available on GitHub at:

Primary benefits

  1. Removes duplication of effort, automates processes and reduces compliance errors.
  2. Enables the distribution of encrypted updates to client information in real time.
  3. Provides the historical record of all compliance activities undertaken for each customer.
  4. Provides the historical record of all documents pertaining to each customer.
  5. Establishes records that can be used as evidence to prove to regulators that the bank has complied with all relevant regulations.
  6. Enables identification of entities attempting to create fraudulent histories.
  7. Enables data and records to be analyzed to spot criminal activities.

2. Uploading records

Records can be uploaded in any format (doc, pdf, jpg etc.) up to a maximum of 10 MB per record. These records are automatically encrypted using AES symmetric encryption algorithm and the decryption keys are automatically stored in the exclusive web application of the uploading entity.

When a new record is uploaded to the blockchain, the following information must be provided:

  1. Corporate Identity Number (CIN) of the entity to which this document relates – this information is stored in the blockchain in plain text / un-encrypted form and cannot be changed.
  2. Document category – this information is stored in the blockchain in plain text / un-encrypted form and cannot be changed.
  3. Document type – this information is stored in the blockchain in plain text / un-encrypted form and cannot be changed.
  4. A brief description of the document – this information is stored in the blockchain in plain text / un-encrypted form and cannot be changed.
  5. The document – this can be in pdf, word, excel, image or other format and is stored in the blockchain in AES-encrypted form and cannot be changed. The decryption key is stored in the relevant bank’s dedicated database and does NOT go into the blockchain.

When the above information is provided, this is what happens:

  1. Hash of the uploaded file is calculated.
  2. The file is digitally signed using the private key of the uploader bank.
  3. The file is encrypted using AES symmetric encryption.
  4. The encrypted data is converted into hexadecimal.
  5. The non-encrypted data is converted into hexadecimal.
  6. Hexadecimal content is uploaded to the blockchain.

Sample output:

  {file_hash: 84a9ceb1ee3a8b0dc509dded516483d1c4d976c13260ffcedf508cfc32b52fbe
     file_txid: 2e770002051216052b3fdb94bf78d43a8420878063f9c3411b223b38a60da81d
     data_txid: 85fc7ff1320dd43d28d459520fe5b06ebe7ad89346a819b31a5a61b01e7aac74
     signature: IBJNCjmclS2d3jd/jfepfJHFeevLdfYiN22V0T2VuetiBDMH05vziUWhUUH/tgn5HXdpSXjMFISOqFl7JPU8Tt8=
     secrect_key: ZOwWyWHiOvLGgEr4sTssiir6qUX0g3u0
     initialisation_vector: FAaZB6MuHIuX}


3. Transaction Processor and State

This section uses the following terminology:

  • Transaction Processor – this is the business logic / smart contracts layer.
  • Validator Process – this is the Global State Store layer.
  • Client Application (User) – this implies a user of the solution; the user’s public key executes the transactions.

The Transaction Processor of the eKYC application is written in Java. It contains all the business logic of the application. Hyperledger Sawtooth stores data within a Merkle Tree. Data is stored in leaf nodes and each node is accessed using an addressing scheme that is composed of 35 bytes, represented as 70 hex characters.

Using the Corporate Identity Number, or CIN, provided by the user while uploading, a 70 characters (35 bytes) address is created for uploading a record to the blockchain. To understand the address creation and namespace design process, see the documentation regarding Address and Namespace Design.

Below is the address creation logic in the application:


  • uniqueValue is the type of data (can be any value)
  • kycAddress is the CIN of the uploaded document.

The User can upload multiple files using the same CIN. However, state will return only the latest uploaded document. To get all the uploaded documents on the same address, business logic is written in Transaction Processor.

The else { part will do the uploading of multiple documents on the same address and fetching every uploaded document from the state.

4. Client Application

The client application uses REST API endpoints to upload (POST) and get (GET) documents on the Sawtooth blockchain platform. It is written in Nodejs. In case of uploading, few steps to be considered:

  • Creating and encoding transactions having header, header signature, and payload.(Transaction payloads are composed of binary-encoded data that is opaque to the validator.)

  • Creating BatchHeader, Batch, and encoding Batches.

  • Submitting batches to the validator.

When getting uploaded data from blockchain, the following steps needs to be considered:

  1. Creating the same address from the CIN given by User, using GET method to fetch the data stored on the particular address. As shown in  the following code snippet, updatedAddress is created by getting user input either from User (search using CIN in the network) or from the private database of the user (Records uploaded by the user). Similarly, splitStringArray splits the data returned from a particular address because of the transaction logic written in the Transaction Processor to upload multiple documents on the same address while updating state with the list of all the uploaded data (not only the current payload).

2. The client side logic is then written to convert the splitStringArray by decoding it to the required format and giving User an option to download the same in the form of a file.

5. Installation and setup

Please refer to the guide here:

6. Third party software and components

Third party software and components: bcryptjs, body-parser, connect-flash, cookie-parser, express, express-fileupload, express-handlebars, express-session, express-validator, mongodb, mongoose, multichain, passport, passport-local, sendgrid/mail.

7. License

Blockchain-eKYC is available under Apache License 2.0. This license does not extend to third party software and components.