Mutable variant of collections.namedtuple (recordclass.recordclass) that supports assignment, plus other memory-saving variants.
Project description
Recordclass library
Recordclass is an MIT-licensed Python library.
It started as a proof of concept for a fast, mutable alternative to namedtuple (see the question on Stack Overflow).
It implements a factory function recordclass (a variant of collections.namedtuple) that creates record-like classes with the same API as collections.namedtuple.
It later evolved to provide more memory-efficient, fast, and flexible types.
The recordclass library provides record-like classes that do not participate in the cyclic garbage collection (CGC) mechanism and rely only on reference counting for garbage collection.
Instances of such classes have no PyGC_Head prefix in memory, which decreases their size.
This makes sense when object size must be kept as small as possible, provided that the objects will never be part of reference cycles in the application.
One example is an object that represents a record whose fields hold, by convention, simple values (int, float, str, date/time/datetime, timedelta, etc.).
In order to illustrate this, consider a simple class with type hints:
class Point:
    x: int
    y: int
By contract, instances of the class Point have attributes x and y with values of type int.
Assigning values of other types, which are not subclasses of int, should be considered a violation of the contract.
Other examples are non-recursive data structures in which all leaf elements represent values of atomic types. Of course, nothing in Python prevents you from "shooting yourself in the foot" by creating a reference cycle in script or application code. But in many cases this can be avoided, provided that the developer understands what they are doing and uses such classes with care. Another option is to use static code analyzers together with type annotations to monitor compliance with data types.
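The trade-off described above can be observed with the standard library alone. The sketch below is a stdlib-only illustration, not recordclass code, and the class name Node is hypothetical. It shows that values of atomic types are not tracked by the cyclic collector, and that a reference cycle is reclaimed only when the cyclic collector runs, which is why records that opt out of CGC must never take part in a cycle.

```python
import gc
import weakref


class Node:
    """Plain Python class; its instances are tracked by the cyclic GC."""

    def __init__(self, value):
        self.value = value
        self.other = None


# Atomic values cannot reference other objects, so CPython does not track them.
atomic_tracked = [gc.is_tracked(v) for v in (1, 2.5, "text")]

# Build a two-object reference cycle and drop every external reference to it.
gc.disable()                      # keep the collector from running on its own
a, b = Node(1), Node(2)
a.other, b.other = b, a
probe = weakref.ref(a)            # lets us observe whether `a` is still alive
del a, b

alive_after_del = probe() is not None      # reference counting cannot free a cycle
gc.collect()                               # the cyclic collector can
alive_after_collect = probe() is not None
gc.enable()

print(atomic_tracked, alive_after_del, alive_after_collect)
```

An object that is excluded from CGC and ends up in such a cycle would stay in the "alive after del" state for the lifetime of the process.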
- The recordclass library provides the base class dataobject. The type of dataobject is the special metaclass datatype. It controls the creation of subclasses of dataobject, which do not participate in CGC by default. As a result, instances of such classes need less memory. Their memory footprint is similar to that of instances of classes with __slots__; the difference is equal to the size of PyGC_Head. The metaclass also tunes the basicsize of the instances, creates descriptors for the fields, and so on. All subclasses of dataobject created with a class statement support an attrs/dataclasses-like API. For example:

      from recordclass import dataobject, astuple, asdict

      class Point(dataobject):
          x: int
          y: int

      >>> p = Point(1, 2)
      >>> astuple(p)
      (1, 2)
      >>> asdict(p)
      {'x':1, 'y':2}

- The recordclass factory creates a dataobject-based subclass with the specified fields and supports a namedtuple-like API. By default it does not participate in CGC either.

      >>> from recordclass import recordclass
      >>> Point = recordclass('Point', 'x y')
      >>> p = Point(1, 2)
      >>> p.y = -1
      >>> print(p._astuple)
      (1, -1)

- It provides a factory function make_dataclass for creating subclasses of dataobject with the specified field names. These subclasses support an attrs/dataclasses-like API. This is equivalent to creating subclasses of dataobject with a class statement. For example:

      >>> Point = make_dataclass('Point', 'x y')
      >>> p = Point(1, 2)
      >>> p.y = -1
      >>> print(p.x, p.y)
      1 -1

- It provides a factory function make_arrayclass for creating a subclass of dataobject that can be treated as an array of simple values. For example:

      >>> Pair = make_arrayclass(2)
      >>> p = Pair(2, 3)
      >>> p[1] = -1
      >>> print(p)
      Pair(2, -1)

- It provides the classes litelist and litetuple, which serve as lightweight list-like and tuple-like containers that save memory. The mutable variant of litetuple is called mutabletuple. Instances of both types don't participate in CGC. For example:

      >>> import sys
      >>> from recordclass import litetuple, mutabletuple
      >>> lt = litetuple(1, 2, 3)
      >>> mt = mutabletuple(1, 2, 3)
      >>> lt == mt
      True
      >>> mt[-1] = -3
      >>> lt == mt
      False
      >>> print(sys.getsizeof(litetuple(1,2,3)), sys.getsizeof((1,2,3)))
      48 64
The main repository for recordclass is on Bitbucket.
Here is also a simple example.
Quick start
Installation
Installation from directory with sources
Install:
$ python setup.py install
Run tests:
$ python test_all.py
Installation from PyPI
Install:
$ pip install recordclass
Run tests:
$ python -c "from recordclass.test import *; test_all()"
Quick start with recordclass
The recordclass factory function is designed to create classes that support the namedtuple API, can be mutable or immutable, provide fast instance creation, and have a minimal memory footprint.
First, import what is needed:
>>> import sys
>>> from recordclass import recordclass
Example with recordclass:
>>> Point = recordclass('Point', 'x y')
>>> p = Point(1,2)
>>> print(p)
Point(1, 2)
>>> print(p.x, p.y)
1 2
>>> p.x, p.y = 10, 20
>>> print(p)
Point(10, 20)
>>> sys.getsizeof(p) # the output below is for 64-bit CPython 3.8+
32
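For comparison, here is how the same record behaves with the standard collections.namedtuple, whose API the recordclass factory mirrors. This sketch uses only the standard library, and the name NTPoint is chosen here purely for illustration. Field assignment fails on a namedtuple, and the usual workaround, _replace, builds a new object instead of updating the existing one in place.

```python
from collections import namedtuple

NTPoint = namedtuple("NTPoint", "x y")
p = NTPoint(1, 2)

try:
    p.x = 10                  # namedtuple fields are read-only
    assignment_error = None
except AttributeError as exc:
    assignment_error = exc

q = p._replace(x=10)          # returns a new instance; `p` is unchanged
print(p, q, q is p)
```

This is the gap that a mutable record type fills: the update happens on the existing instance, with no extra allocation.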
Example with a class statement and type hints:
>>> from recordclass import RecordClass
class Point(RecordClass):
    x: int
    y: int
>>> print(Point.__annotations__)
{'x': <class 'int'>, 'y': <class 'int'>}
>>> p = Point(1, 2)
>>> print(p)
Point(1, 2)
>>> print(p.x, p.y)
1 2
>>> p.x, p.y = 10, 20
>>> print(p)
Point(10, 20)
By default, instances of recordclass-based classes do not participate in CGC and are therefore smaller than namedtuple-based ones. To use them in scenarios with reference cycles, pass the option gc=True (gc=False by default):
>>> Node = recordclass('Node', 'root children', gc=True)
or
@clsconfig(gc=True)
class Node(RecordClass):
    root: 'Node'
    children: list
Quick start with dataobject
dataobject is the base class for creating data classes with fast instance creation and a small memory footprint. Such classes do not provide a namedtuple-like API.
First, import what is needed:
>>> import sys
>>> from recordclass import dataobject, asdict, astuple
class Point(dataobject):
    x: int
    y: int
>>> print(Point.__annotations__)
{'x': <class 'int'>, 'y': <class 'int'>}
>>> p = Point(1,2)
>>> print(p)
Point(x=1, y=2)
>>> sys.getsizeof(p) # the output below is for 64-bit Python 3.8+
32
>>> p.__sizeof__() == sys.getsizeof(p) # no additional space for CGC support
True
>>> p.x, p.y = 10, 20
>>> print(p)
Point(x=10, y=20)
>>> asdict(p)
{'x':10, 'y':20}
>>> astuple(p)
(10, 20)
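The check `p.__sizeof__() == sys.getsizeof(p)` above is meaningful because, in CPython, sys.getsizeof adds the size of the GC header for objects whose type supports cyclic GC, while __sizeof__ reports the object alone. The sketch below shows the other side of that comparison with a plain __slots__ class from the standard library; the class name SlotsPoint is hypothetical and the exact numbers depend on the interpreter build, so none are stated here.

```python
import gc
import sys


class SlotsPoint:
    """Ordinary class with __slots__; its instances are GC-tracked."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = SlotsPoint(1, 2)

# Difference between the two measurements is the per-object GC overhead.
gc_overhead = sys.getsizeof(p) - p.__sizeof__()
print(gc.is_tracked(p), p.__sizeof__(), sys.getsizeof(p), gc_overhead)
```

For a dataobject instance without gc=True, the same subtraction is expected to give zero, which is what the equality check above expresses.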
By default, subclasses of dataobject are mutable. To make one immutable, use the option readonly=True:
@clsconfig(readonly=True)
class Point(dataobject):
    x: int
    y: int
>>> p = Point(1,2)
>>> p.x = -1
TypeError: item is readonly
By default, subclasses of dataobject are not iterable. To make one iterable, use the option iterable=True:
@clsconfig(iterable=True)
class Point(dataobject):
    x: int
    y: int
>>> p = Point(1,2)
>>> for x in p: print(x)
1
2
Another way to create subclasses of dataobject is the factory function make_dataclass:
>>> from recordclass import make_dataclass
>>> Point = make_dataclass("Point", [("x",int), ("y",int)])
or
>>> Point = make_dataclass("Point", {"x":int, "y":int})
Default values are also supported:
class CPoint(dataobject):
    x: int
    y: int
    color: str = 'white'
or
>>> CPoint = make_dataclass("CPoint", [("x",int), ("y",int), ("color",str)], defaults=("white",))
>>> p = CPoint(1,2)
>>> print(p)
CPoint(x=1, y=2, color='white')
But
class PointInvalidDefaults(dataobject):
    x: int = 0
    y: int
is not allowed. A field without a default value may not appear after a field with a default value.
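The same ordering rule applies to the standard dataclasses module, which makes it easy to see the failure mode without installing anything. The sketch below is a stdlib analogy rather than recordclass code, and the helper name try_make is hypothetical.

```python
from dataclasses import field, make_dataclass


def try_make(name, fields):
    """Return the new class, or the TypeError raised while building it."""
    try:
        return make_dataclass(name, fields)
    except TypeError as exc:
        return exc


# Valid: the field with a default comes last.
CPoint = try_make(
    "CPoint",
    [("x", int), ("y", int), ("color", str, field(default="white"))],
)

# Invalid: a field without a default follows a field with one.
bad = try_make(
    "PointInvalidDefaults",
    [("x", int, field(default=0)), ("y", int)],
)

print(CPoint(1, 2))
print(type(bad).__name__)
```

The rule exists because positional arguments are matched left to right, so a required field placed after an optional one could never be supplied unambiguously.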
There is also the option fast_new=True, which allows faster instance creation. Here is an example:
class FastPoint(dataobject, fast_new=True):
    x: int
    y: int
The following timings (taken in a Jupyter notebook) show the speedup from the fast_new option:
%timeit l1 = [Point(i,i) for i in range(100000)]
%timeit l2 = [FastPoint(i,i) for i in range(100000)]
# output with python 3.9 64bit
25.6 ms ± 2.4 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
10.4 ms ± 426 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
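Outside a notebook, a comparable measurement can be taken with the standard timeit module. The sketch below is a minimal harness under stated assumptions: it times a namedtuple and a __slots__ class from the standard library as stand-ins, and the helper name time_creation is hypothetical. Passing it Point or FastPoint from the example above should reproduce the comparison on your own machine; no timings are quoted here because they vary by system.

```python
import timeit
from collections import namedtuple


class SlotsPoint:
    """Stand-in class; replace with any two-argument record class."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


NTPoint = namedtuple("NTPoint", "x y")


def time_creation(cls, count=100_000, repeat=3):
    """Best time in seconds for building `count` instances of `cls`."""

    def build():
        return [cls(i, i) for i in range(count)]

    return min(timeit.repeat(build, number=1, repeat=repeat))


for cls in (NTPoint, SlotsPoint):
    print(f"{cls.__name__}: {time_creation(cls) * 1e3:.1f} ms")
```

Taking the minimum over several repeats reduces the influence of background load on the result.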
Using dataobject-based classes for recursive data without reference cycles
There is the option deep_dealloc (default value is False) for deallocation of recursive data structures.
Consider a simple example:
class LinkedItem(dataobject, fast_new=True):
    val: object
    next: 'LinkedItem'

class LinkedList(dataobject, deep_dealloc=True):
    start: LinkedItem = None
    end: LinkedItem = None

    def append(self, val):
        link = LinkedItem(val, None)
        if self.start is None:
            self.start = link
        else:
            self.end.next = link
        self.end = link
Without deep_dealloc=True, deallocation of a LinkedList instance will fail if the linked list is too long.
This can be resolved with a __del__ method that clears the linked list:
def __del__(self):
    curr = self.start
    while curr is not None:
        next = curr.next
        curr.next = None
        curr = next
With deep_dealloc=True there is a faster built-in deallocation method that uses the finalization mechanism. In that case the __del__ method for clearing the list is not needed.
Note that for classes with gc=True this method is disabled: Python's cyclic GC is used instead.
For more details see notebook example_datatypes.
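The iterative clearing idea can be tried without recordclass. The following sketch is an analogy built from plain __slots__ classes, and the names Item, Chain, and clear are hypothetical. It unlinks the nodes one at a time, so that no single deallocation has to walk recursively through the whole chain.

```python
class Item:
    __slots__ = ("val", "next")

    def __init__(self, val, next=None):
        self.val = val
        self.next = next


class Chain:
    __slots__ = ("start", "end", "size")

    def __init__(self):
        self.start = self.end = None
        self.size = 0

    def append(self, val):
        link = Item(val)
        if self.start is None:
            self.start = link
        else:
            self.end.next = link
        self.end = link
        self.size += 1

    def clear(self):
        """Unlink nodes iteratively so each one is freed on its own."""
        curr = self.start
        self.start = self.end = None
        while curr is not None:
            nxt = curr.next
            curr.next = None      # break the link before moving on
            curr = nxt
        self.size = 0


chain = Chain()
for i in range(200_000):
    chain.append(i)
size_before = chain.size
chain.clear()
print(size_before, chain.size, chain.start)
```

Each node loses its last reference as soon as the loop moves past it, so memory is released incrementally instead of in one deep cascade.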
Memory footprint
The following table shows the memory footprints of recordclass-based and dataobject-based objects:
| namedtuple | class with __slots__ | recordclass | dataobject |
|---|---|---|---|
| $g+b+s+n*p$ | $g+b+n*p$ | $b+n*p$ | $b+n*p$ |
where:
- b = sizeof(PyObject)
- s = sizeof(Py_ssize_t)
- n = number of items
- p = sizeof(PyObject*)
- g = sizeof(PyGC_Head)
This is useful when you are absolutely sure that reference cycles cannot occur, for example when all field values are instances of atomic types. As a result, the size of an instance decreases by 24-32 bytes (for CPython 3.4-3.7) and by 16 bytes since CPython 3.8.
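To make the formulas concrete, the sketch below evaluates them for a two-field record. The default sizes are assumptions for a 64-bit CPython 3.8+ build (b = 16, s = 8, p = 8, g = 16), and the function name footprints is hypothetical.

```python
def footprints(n, b=16, s=8, p=8, g=16):
    """Instance size in bytes for n fields, following the table's formulas."""
    return {
        "namedtuple": g + b + s + n * p,
        "class with __slots__": g + b + n * p,
        "recordclass": b + n * p,
        "dataobject": b + n * p,
    }


for kind, size in footprints(2).items():
    print(f"{kind}: {size}")
```

For n = 2 this gives 56, 48, 32, and 32 bytes, which agrees with the sizes reported for namedtuple, class+slots, and dataobject in the performance table.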
Performance counters
Here is the table of performance counters (Python 3.9, Debian Linux, x86-64), measured with the utils/perfcount.py script:
| id | new | getattr | setattr | size |
|---|---|---|---|---|
| namedtuple | 2.643526 | 0.471421 | | 56 |
| class+slots | 1.851441 | 0.536047 | 0.549807 | 48 |
| dataobject | 2.017816 | 0.466287 | 0.534306 | 32 |
| dataobject+fast_new | 0.927759 | 0.468668 | 0.523788 | 32 |
| dataobject+gc | 2.162687 | 0.463672 | 0.523189 | 48 |
| dataobject+fast_new+gc | 1.046897 | 0.468382 | 0.525876 | 48 |
Changes:
0.15
- The library now supports only Python >= 3.6.
- The 'gc' and 'fast_new' options can now be specified as kwargs in the class statement.
- Add a function astuple(ob) for transforming a dataobject instance ob into a tuple.
- Drop datatuple-based classes.
- Add function make(cls, args, **kwargs) to create an instance of the class cls.
- Add function clone(ob, **kwargs) to clone a dataobject instance ob.
- Make structclass an alias of make_dataclass.
- Add option 'deep_dealloc' (@clsconfig(deep_dealloc=True)) for deallocating instances of dataobject-based recursive subclasses.
0.14.3:
- Subclasses of dataobject now support iterable and hashable protocols by default.
0.14.2:
- Fix compilation issue for python 3.9.
0.14.1:
- Fix issue with hash when subclassing recordclass-based classes.
0.14:
- Add doc to the generated dataobject-based class in order to support inspect.signature.
- Add fast_new argument/option for fast instance creation.
- Fix refleak in litelist.
- Fix sequence protocol ability for dataobject/datatuple.
- Fix typed interface for StructClass.
0.13.2
- Fix issue #14 with deepcopy of dataobjects.
0.13.1
- Restore join_classes and add new function join_dataclasses.
0.13.0.1
- Remove redundant debug code.
0.13
- Make recordclass compile and work with CPython 3.8.
- Move repository to git instead of mercurial since Bitbucket will drop support of mercurial repositories.
- Fix some potential reference leaks.
0.12.0.1
- Fix missing .h files.
0.12
- clsconfig is now the main decorator for tuning dataobject-based classes.
- Fix concatenation of mutabletuples (issue #10).
0.11.1:
- dataobject instances may be deallocated faster now.
0.11:
- Rename memoryslots to mutabletuple.
- mutabletuple and immutabletuple don't participate in cyclic garbage collection.
- Add litelist type for list-like objects, which doesn't participate in cyclic garbage collection.
0.10.3:
- Introduce DataclassStorage and RecordclassStorage. They allow caching classes and reusing them without creating new ones.
- Add iterable decorator and argument. Now a dataobject with fields isn't iterable by default.
- Move astuple to dataobject.c.
0.10.2
- Fix error with dataobject's __copy__.
- Fix error with pickling of recordclasses and structclasses, which had appeared in 0.8.5 (thanks to Connor Wolf).
0.10.1
- By default, the sequence protocol is no longer supported if a dataobject has fields, but iteration is supported.
- By default argsonly=False for usability reasons.
0.10
- Introduce new factory function make_class for creating different kinds of dataobject classes without GC support by default.
- Introduce new metaclass datatype and new base class dataobject for creating dataobject classes with the class statement. GC support is disabled by default, but can be enabled with the decorator dataobject.enable_gc. It supports type hints (for Python >= 3.6) and default values. The sequence of field names in __fields__ may be omitted when type hints are applied to all data attributes (for Python >= 3.6).
- recordclass-based classes may now also omit support for cyclic garbage collection. This reduces the memory footprint by the size of PyGC_Head. By default, recordclass-based classes now don't support cyclic garbage collection.
0.9
- Change version to 0.9 to indicate a step forward.
- Cleanup dataobject.__cinit__.
0.8.5
- Make arrayclass-based objects support setitem/getitem and structclass-based objects able to not support them. By default, as before, structclass-based objects support the setitem/getitem protocol.
- Now only instances of dataobject are comparable to arrayclass-based and structclass-based instances.
- Now generated classes can be hashable.
0.8.4
- Improve support for readonly mode for structclass and arrayclass.
- Add tests for arrayclass.
0.8.3
- Add typehints support to structclass-based classes.
0.8.2
- Remove usedict, gc, weaklist from the class __dict__.
0.8.1
- Remove Cython dependence by default for building recordclass from the sources [Issue #7].
0.8
- Add structclass factory function. It is an analog of recordclass but with a smaller memory footprint for its instances (the same as for instances of classes with __slots__) in comparison with recordclass and namedtuple (it is currently implemented with Cython).
- Add arrayclass factory function, which produces a class for creating fixed-size arrays. The benefit of this approach is also a smaller memory footprint (it is currently implemented with Cython).
- The structclass factory now has the argument gc. If gc=False (the default), support for cyclic garbage collection is switched off for instances of the created class.
- Add function join(C1, C2) in order to join two structclass-based classes C1 and C2.
- Add sequenceproxy function for creating an immutable and hashable proxy object from class instances that implement access by index (it is currently implemented with Cython).
- Add support for access to recordclass object attributes by the idiom ob['attrname'] (Issue #5).
- Add argument readonly to the recordclass factory to produce an immutable namedtuple. In contrast to collections.namedtuple, it uses the same descriptors as regular recordclasses to improve performance.
0.7
- Make creation of mutabletuple objects faster. As a side effect, when the number of fields is >= 8, recordclass instance creation time is not bigger than the creation time of instances of dataclasses with __slots__.
- The recordclass factory function now creates new recordclass classes in the same way as namedtuple in 3.7 (there is no compilation of generated Python source of the class).
0.6
- Add support for default values in recordclass factory function in correspondence to same addition to namedtuple in python 3.7.
0.5
- Change version to 0.5
0.4.4
- Add support for default values in RecordClass (patches from Pedro von Hertwig)
- Add tests for RecordClass (adapted from the Python tests for NamedTuple)
0.4.3
- Add support for typing for python 3.6 (patches from Vladimir Bolshakov).
- Resolve memory leak issue.
0.4.2
- Fix memory leak in property getter/setter