
Hash utility to create Cryptographic Linkage Keys


CLK Hash

Python implementation of cryptographic longterm key (CLK) hashing. Supports Python 2.7+ and 3.4+.

The method is described by Rainer Schnell, Tobias Bachteler, and Jörg Reiher in "A Novel Error-Tolerant Anonymous Linking Code".
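
The idea behind such a linkage key can be illustrated with a minimal sketch that uses only the standard library. It is not clkhash's implementation: the helper names (`bigrams`, `toy_clk`, `dice`), the filter length of 1024 bits, the 10 positions per bigram, and the HMAC-SHA1/HMAC-MD5 double-hashing scheme are all assumptions chosen for illustration. Each identifier is split into bigrams, and every bigram sets a few keyed positions in one shared Bloom filter, so records that differ by a typo still produce largely overlapping bit patterns.

```python
import hashlib
import hmac


def bigrams(value):
    """Split a padded, lower-cased string into overlapping 2-character tokens."""
    padded = " " + value.strip().lower() + " "
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


def toy_clk(fields, key1, key2, length=1024, k=10):
    """Return the set of bit positions set in a toy Bloom-filter encoding.

    Hypothetical helper for illustration only; not the clkhash API.
    """
    bits = set()
    for field in fields:
        for token in bigrams(field):
            data = token.encode("utf-8")
            h1 = int(hmac.new(key1, data, hashlib.sha1).hexdigest(), 16)
            h2 = int(hmac.new(key2, data, hashlib.md5).hexdigest(), 16)
            # Double hashing: the i-th position is (h1 + i * h2) mod length.
            for i in range(k):
                bits.add((h1 + i * h2) % length)
    return bits


def dice(a, b):
    """Dice coefficient of two sets of bit positions."""
    return 2.0 * len(a & b) / (len(a) + len(b))


keys = (b"horse", b"staple")
original = toy_clk(["Libby Slemmer", "1933/09/13"], *keys)
typo = toy_clk(["Libby Slemer", "1933/09/13"], *keys)
other = toy_clk(["Garold Staten", "1928/11/23"], *keys)

# The misspelled record stays much closer to the original than an unrelated one.
print(dice(original, typo), dice(original, other))
```

Comparing encodings with a set-overlap measure such as the Dice coefficient is what makes the scheme error-tolerant: a single-character typo changes only a few bigrams, and therefore only a few bits.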


Installation

Install clkhash with all dependencies using pip:

pip install clkhash

If the installation of bitarray fails on Windows, you may need to install the appropriate Visual Studio C++ compiler for your version of Python. This is required because the bitarray library compiles a C extension.

Documentation

https://clkhash.readthedocs.io

CLI Tool

After installing the clkhash library, a clkutil program should be available on your path. Alternatively, you can run python -m clkhash.cli.

This command line tool processes personally identifiable information (PII) into Cryptographic Longterm Keys (CLKs). It can also generate fake PII data, and it has commands to upload hashes to an entity matching service.

$ clkutil generate 1000 fake-pii-out.csv
$ head -n 4 fake-pii-out.csv
INDEX,NAME freetext,DOB YYYY/MM/DD,GENDER M or F
0,Libby Slemmer,1933/09/13,F
1,Garold Staten,1928/11/23,M
2,Yaritza Edman,1972/11/30,F
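
As a quick sanity check before hashing, the generated file can be inspected with Python's standard csv module. The sketch below (written for Python 3) embeds the sample rows shown above in a string so that it runs on its own; with a real file you would pass an open handle to fake-pii-out.csv instead.

```python
import csv
import io

# Sample rows copied from the output above; a real run would read the file
# written by `clkutil generate`.
sample = """INDEX,NAME freetext,DOB YYYY/MM/DD,GENDER M or F
0,Libby Slemmer,1933/09/13,F
1,Garold Staten,1928/11/23,M
2,Yaritza Edman,1972/11/30,F
"""

reader = csv.reader(io.StringIO(sample))
header = next(reader)  # first line holds the column names
rows = list(reader)    # remaining lines are the entity records

print(header)
for index, name, dob, gender in rows:
    print(index, name, dob, gender)
```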

A schema is required to hash this data. You can retrieve the default schema with:

$ clkutil generate-default-schema fake-pii-schema.json

or you can make your own.

To hash this data using its schema, with the shared secret keys horse and staple:

$ clkutil hash fake-pii-out.csv horse staple fake-pii-schema.json /tmp/fake-clk.json
CLK data written to /tmp/fake-clk.json

Note that the keys should be shared only with the other entity, and not with anyone carrying out the record linkage.
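
Why the keys must stay away from the linkage party can be sketched with a keyed hash from the standard library. This illustrates the general principle rather than clkhash internals, and `keyed_token` is a hypothetical helper. Anyone who knows the key can hash a list of candidate values and look for a match, whereas guesses made without the key do not line up.

```python
import hashlib
import hmac


def keyed_token(key, value):
    """Hypothetical helper: HMAC-SHA256 digest of a value under a secret key."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


secret = b"horse"
observed = keyed_token(secret, "Libby Slemmer")

candidates = ["Garold Staten", "Libby Slemmer", "Yaritza Edman"]

# With the key, a dictionary of guesses reveals the underlying name.
with_key = [c for c in candidates if keyed_token(secret, c) == observed]

# With a wrong key, none of the guesses match.
without_key = [c for c in candidates if keyed_token(b"guess", c) == observed]

print(with_key, without_key)
```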

To use the command line tool without installing clkhash, install the dependencies, then run:

python -m clkhash.cli

clkhash API

To hash a CSV file of entities using the default schema:

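from clkhash import clk, randomnames
# Schema for the fake name data from clkhash's randomnames module
fake_pii_schema = randomnames.NameList.SCHEMA
# The context manager closes the file once hashing is done
with open('fake-pii-out.csv', 'r') as f:
    clks = clk.generate_clk_from_csv(f, ('key1', 'key2'), fake_pii_schema)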
