C implementation of parts of difflib

cdifflib

Python difflib sequence matcher reimplemented in C.

Strictly speaking, only parts of it are reimplemented: the package creates a CSequenceMatcher type which inherits most functions from difflib.SequenceMatcher.

cdifflib is about 4x as fast as the pure-Python difflib when diffing large streams.
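The speedup depends on the input, so it is worth measuring on your own data. The following is only a rough sketch of how that could be done: the helper names `make_sequences` and `time_matcher`, the synthetic input, and the `try`/`except ImportError` guard are assumptions made for illustration, not part of cdifflib.

```python
import difflib
import random
import timeit

try:
    from cdifflib import CSequenceMatcher
except ImportError:  # assumption: skip the comparison if cdifflib is absent
    CSequenceMatcher = None


def make_sequences(n, seed=0):
    """Build two lists of n mostly unique strings that differ in about 10% of places."""
    rng = random.Random(seed)
    a = ["line %d" % rng.randrange(10**6) for _ in range(n)]
    b = list(a)
    for _ in range(n // 10):
        b[rng.randrange(n)] = "changed %d" % rng.randrange(10**6)
    return a, b


def time_matcher(matcher_cls, a, b, repeat=3):
    """Return the best time in seconds for building a matcher and its opcodes."""
    run = lambda: matcher_cls(None, a, b).get_opcodes()
    return min(timeit.repeat(run, number=1, repeat=repeat))


a, b = make_sequences(2000)
py_time = time_matcher(difflib.SequenceMatcher, a, b)
print("pure Python: %.4fs" % py_time)

if CSequenceMatcher is not None:
    c_time = time_matcher(CSequenceMatcher, a, b)
    print("cdifflib:    %.4fs (%.1fx)" % (c_time, py_time / c_time))
```

Synthetic data like this will not necessarily reproduce the figure quoted above; real inputs are the better benchmark.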

Limitations

The C part of the code only works on lists rather than generic iterables, so anything that isn't a list is converted to a list in the CSequenceMatcher constructor. This may cause undesirable behavior if you're not expecting it.
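One way to avoid surprises from the implicit conversion is to build the lists yourself before constructing the matcher. This is a minimal sketch, not part of the package: `stripped_lines` and `line_ratio` are hypothetical helpers, and the fallback to difflib.SequenceMatcher when cdifflib is not installed is an assumption added so the snippet runs either way.

```python
import difflib

try:
    from cdifflib import CSequenceMatcher as Matcher
except ImportError:  # assumption: fall back to the pure-Python class
    Matcher = difflib.SequenceMatcher


def stripped_lines(text):
    """Yield each line of text with trailing whitespace removed (a generator)."""
    for line in text.splitlines():
        yield line.rstrip()


def line_ratio(text_a, text_b):
    """Return the similarity ratio of two texts, compared line by line."""
    # Materialise the generators explicitly, so it is obvious at the call
    # site that they are consumed and that the matcher receives real lists.
    lines_a = list(stripped_lines(text_a))
    lines_b = list(stripped_lines(text_b))
    return Matcher(None, lines_a, lines_b).ratio()


ratio = line_ratio("a\nb\nc\n", "a  \nx\nc\n")
print(round(ratio, 3))
```

Doing the conversion at the call site also keeps the lists around for later use, which an implicitly consumed generator would not allow.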

Works with Python 2.7 and 3.6 (should work on all 3.3+).

Usage

CSequenceMatcher can be used just like difflib.SequenceMatcher as long as you pass lists (other sequences, such as the strings below, are converted to lists first). These examples come straight from the difflib docs:

>>> from cdifflib import CSequenceMatcher
>>> s = CSequenceMatcher(None, ' abcd', 'abcd abcd')
>>> s.find_longest_match(0, 5, 0, 9)
Match(a=0, b=4, size=5)
>>> s = CSequenceMatcher(lambda x: x == " ",
...                      "private Thread currentThread;",
...                      "private volatile Thread currentThread;")
>>> print(round(s.ratio(), 3))
0.866

It's completely compatible, so you can replace the difflib version on startup and other libraries will then use CSequenceMatcher too, e.g.:

from cdifflib import CSequenceMatcher
import difflib
difflib.SequenceMatcher = CSequenceMatcher
import library_that_uses_difflib

# Now the library will transparently be using the C SequenceMatcher - other
# things remain the same
library_that_uses_difflib.do_some_diffing()
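The same replacement can be wrapped so that it degrades gracefully when cdifflib is missing. The sketch below is an illustration under assumptions: `install_c_matcher` is a hypothetical helper name and the `ImportError` guard is not something the package provides. It uses difflib's own get_close_matches to show that difflib helpers pick up the replaced class.

```python
import difflib

try:
    from cdifflib import CSequenceMatcher
except ImportError:  # assumption: keep the pure-Python matcher if absent
    CSequenceMatcher = None


def install_c_matcher():
    """Point difflib.SequenceMatcher at the C version; return True if patched."""
    if CSequenceMatcher is None:
        return False
    difflib.SequenceMatcher = CSequenceMatcher
    return True


patched = install_c_matcher()

# difflib's helper functions look up SequenceMatcher when they are called,
# so they use whichever class is installed at that point.
close = difflib.get_close_matches("appel", ["ape", "apple", "peach", "puppy"])
print(patched, close)
```

Modules that ran `from difflib import SequenceMatcher` before the patch keep their own reference to the original class, which is why the replacement needs to happen on startup, before those modules are imported.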

Making

To install:

python setup.py install

To test:

python setup.py test

License etc

This code lives at https://github.com/mduggan. See LICENSE for the license.

Changelog

  • 1.2.5 - Fix some memory leaks (#7)
  • 1.2.4 - Repackage yet again using twine for pypi upload (no binary changes)
  • 1.2.3 - Repackage again with changelog update and corrected src package (no binary changes)
  • 1.2.2 - Repackage to add README.md in a way pypi supports (no binary changes)
  • 1.2.1 - Fix bug for longer sequences with "autojunk"
  • 1.2.0 - Python 3 support for other versions
  • 1.1.0 - Added Python 3.6 support (thanks Bclavie)
  • 1.0.4 - Changes to make it compile on MSVC++ compiler, no change for other platforms
  • 1.0.2 - Bugfix - also replace set_seq1 implementation so difflib.compare works with a CSequenceMatcher
  • 1.0.1 - Implement more bits in C to squeeze a bit more speed out
  • 1.0.0 - First release
