Document fingerprint generator

Project description

  • What is it?

    The “fingerprint” module generates fingerprints of a document.

  • Fingerprint!!!! :( What is that?

    A fingerprint is like a signature of the document. More specifically, in our context, it is a subset of the hash values calculated from the document.

  • Okay, I now know a bit about it. Tell me, how does it calculate fingerprints?

    Generating the fingerprints of a document is a three-phase process:

    • (1st phase) generates the k-grams from the standard string

    • (2nd phase) generates the hash value of each k-gram using a rolling hash function

    • (3rd phase) generates the fingerprints from the hash values using winnowing
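The three phases above can be sketched in plain Python as follows. This is only an illustration of the general technique, not the module's actual code: the function names `kgrams`, `rolling_hashes` and `winnow`, the polynomial hash with its `base` and `mod` values, and the tie-breaking rule (rightmost minimum per window) are all assumptions made for this example.

```python
def kgrams(text, k):
    """Phase 1: every contiguous substring of length k (empty list if text is shorter)."""
    return [text[i:i + k] for i in range(len(text) - k + 1)]


def rolling_hashes(text, k, base=256, mod=1000000007):
    """Phase 2: polynomial hash of every k-gram, updated in O(1) per step."""
    if len(text) < k:
        return []
    high = pow(base, k - 1, mod)  # weight of the character leaving the window
    h = 0
    for ch in text[:k]:
        h = (h * base + ord(ch)) % mod
    hashes = [h]
    for i in range(k, len(text)):
        h = (h - ord(text[i - k]) * high) % mod  # drop the oldest character
        h = (h * base + ord(text[i])) % mod      # append the newest character
        hashes.append(h)
    return hashes


def winnow(hashes, window):
    """Phase 3: keep the minimum hash of each window (rightmost on ties).

    Returns (hash, position) pairs; a position is recorded only once even
    if it is the minimum of several consecutive windows.
    """
    selected = []
    last_pos = -1
    for start in range(len(hashes) - window + 1):
        chunk = hashes[start:start + window]
        smallest = min(chunk)
        # index of the rightmost occurrence of the minimum in this window
        pos = start + (window - 1 - chunk[::-1].index(smallest))
        if pos != last_pos:
            selected.append((smallest, pos))
            last_pos = pos
    return selected


# A small hand-made list of hash values, winnowed with a window of 4.
sample = [77, 74, 42, 17, 98, 50, 17, 98, 8, 88, 67, 39, 77, 74, 42, 17, 98]
print(winnow(sample, window=4))
# → [(17, 3), (17, 6), (8, 8), (39, 11), (17, 15)]
```

For a real string, the phases chain together as `winnow(rolling_hashes(s, k), window)`; suitable values for `k` and `window` depend on the documents being compared and are not specified here.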

  • How can I install it on my machine?

    You can install it in two ways:

    • using source
      1. git clone git@github.com:kailashbuki/fingerprint.git

      2. cd fingerprint

      3. sudo python setup.py install

    • using pip
      1. sudo pip install fingerprint

  • Hmm! … How can I use it?

    It’s plain and simple. Here’s an example for you:

            from fingerprint.fingerprintgenerator import file_content_refiner, FingerprintGenerator

            # Get the standard string from a document:
            s = file_content_refiner("path/to/file")
            # OR pass the standard string directly if you already have one
            # (this assignment replaces the value read from the file above):
            s = "some sample string"

            fpg = FingerprintGenerator(input_string=s)
            fpg.generate_fingerprints()
            # print(...) with a single argument works on both Python 2 and Python 3
            print(fpg.fingerprints)

>>Feel free to contact at kailash<DOT>buki<AT>gmail<DOT>com (kailash.buki@gmail.com)<<

Download files

Source Distribution

fingerprint-0.1.0.tar.gz (4.6 kB view details)

Uploaded Source

Built Distribution

fingerprint-0.1.0.macosx-10.7-intel.exe (67.8 kB view details)

Uploaded Source

File details

Details for the file fingerprint-0.1.0.tar.gz.

File metadata

  • Download URL: fingerprint-0.1.0.tar.gz
  • Size: 4.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for fingerprint-0.1.0.tar.gz
Algorithm Hash digest
SHA256 8ef6bdd1887bfd36ff63de6b0ceacefba46aa523d661d5a6a711fc00154da05a
MD5 9f52e3ad7e56c2e2c4e2980cb8e61297
BLAKE2b-256 02f2d3692fd7600de68a9f6ba9c77501f77849ec96befc8f5673ed446ae5b468
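
A downloaded archive can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. This is a minimal sketch: the helper names `sha256_of_file` and `matches_published_digest` are made up for this example, and the local file path is assumed to be wherever you saved the download.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """True if the file's SHA256 digest equals the published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Digest copied from the listing above for fingerprint-0.1.0.tar.gz.
PUBLISHED_SHA256 = "8ef6bdd1887bfd36ff63de6b0ceacefba46aa523d661d5a6a711fc00154da05a"

# Example call, assuming the archive sits in the current directory:
# matches_published_digest("fingerprint-0.1.0.tar.gz", PUBLISHED_SHA256)
```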

File details

Details for the file fingerprint-0.1.0.macosx-10.7-intel.exe.

File hashes

Hashes for fingerprint-0.1.0.macosx-10.7-intel.exe
Algorithm Hash digest
SHA256 e0b63170cb789b8a4d73424baacbf23ae4f04e38fb594508c1da4ada65f5bec0
MD5 e784d8189388d957290a831b3b78de50
BLAKE2b-256 d39a0d2e477b5c6cea488bff9bfd4bdbd69e8f2840eb31073c43352b7595bb02
