Type and shape validation and serialization for numpy arrays in pydantic models
numpydantic
This package was extracted from nwb-linkml, a translation of the NWB schema language and data format to LinkML and pydantic models.
It does two primary things:
- Provide types - Annotations (based on nptyping) for specifying numpy arrays in pydantic models, and
- Generate models from LinkML - extend the LinkML pydantic generator to create models that use the linkml-arrays syntax
Parameterized Arrays
Arrays use the nptyping syntax:
from typing import Union
from pydantic import BaseModel
from numpydantic import NDArray, Shape, UInt8, Float, Int

class Image(BaseModel):
    """
    Data values. Data can be in 1-D, 2-D, 3-D, or 4-D. The first dimension should always represent time. This can also be used to store binary data (e.g., image frames). This can also be a link to data stored in an external file.
    """
    array: Union[
        NDArray[Shape["* x, * y"], UInt8],
        NDArray[Shape["* x, * y, 3 rgb"], UInt8],
        NDArray[Shape["* x, * y, 4 rgba"], UInt8],
        NDArray[Shape["* t, * x, * y, 3 rgb"], UInt8],
        NDArray[Shape["* t, * x, * y, 4 rgba"], Float]
    ]
Validation:
import numpy as np
# works
frame_gray = Image(array=np.ones((1280, 720), dtype=np.uint8))
frame_rgb = Image(array=np.ones((1280, 720, 3), dtype=np.uint8))
frame_rgba = Image(array=np.ones((1280, 720, 4), dtype=np.uint8))
video_rgb = Image(array=np.ones((100, 1280, 720, 3), dtype=np.uint8))
# fails
wrong_n_dimensions = Image(array=np.ones((1280,), dtype=np.uint8))
wrong_shape = Image(array=np.ones((1280,720,10), dtype=np.uint8))
wrong_type = Image(array=np.ones((1280,720,3), dtype=np.float64))
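# shapes and types are checked together:
# works, a float array is accepted with the 4-D rgba shape
float_video = Image(array=np.ones((100, 1280, 720, 4), dtype=float))
# fails, the 4-D rgb shape is only allowed with UInt8
wrong_shape_float_video = Image(array=np.ones((100, 1280, 720, 3), dtype=float))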
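To make the pass/fail cases above easier to follow, here is a minimal sketch of the shape-matching idea in plain Python. It is not numpydantic's implementation: `shape_matches` is a hypothetical helper, and it only handles the subset of the syntax used in this example (a `*` wildcard or a fixed integer per dimension, followed by a label).

```python
from typing import Tuple


def shape_matches(spec: str, shape: Tuple[int, ...]) -> bool:
    """Hypothetical helper: check a shape tuple against a spec like "* x, * y, 3 rgb"."""
    # Keep only the size token of each dimension ("*" or an integer); drop the label.
    dims = [part.strip().split()[0] for part in spec.split(",")]
    if len(dims) != len(shape):
        return False  # wrong number of dimensions
    for expected, actual in zip(dims, shape):
        if expected != "*" and int(expected) != actual:
            return False  # a fixed-size dimension has the wrong length
    return True


rgb_spec = "* x, * y, 3 rgb"
print(shape_matches(rgb_spec, (1280, 720, 3)))   # True
print(shape_matches(rgb_spec, (1280, 720, 10)))  # False
print(shape_matches(rgb_spec, (1280,)))          # False
```

A Union of annotations would then accept an array if any one member matches both its shape and its dtype.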
JSON schema generation:
import json

class MyArray(BaseModel):
    array: NDArray[Shape["2 x, * y, 4 z"], Float]

>>> print(json.dumps(MyArray.model_json_schema(), indent=2))
{
  "properties": {
    "array": {
      "items": {
        "items": {
          "items": {
            "type": "number"
          },
          "maxItems": 4,
          "minItems": 4,
          "type": "array"
        },
        "type": "array"
      },
      "maxItems": 2,
      "minItems": 2,
      "title": "Array",
      "type": "array"
    }
  },
  "required": [
    "array"
  ],
  "title": "MyArray",
  "type": "object"
}
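The nesting in the schema above follows the shape: one "array" level per dimension, with minItems and maxItems set only where the size is fixed. The sketch below reproduces that mapping in plain Python. `shape_to_schema` is a hypothetical helper written for illustration, not the function numpydantic uses internally.

```python
import json
from typing import Any, Dict


def shape_to_schema(spec: str, item_type: str = "number") -> Dict[str, Any]:
    """Hypothetical helper: nest one JSON-schema "array" level per dimension."""
    dims = [part.strip().split()[0] for part in spec.split(",")]
    schema: Dict[str, Any] = {"type": item_type}
    # Build from the innermost dimension outward.
    for dim in reversed(dims):
        level: Dict[str, Any] = {"type": "array", "items": schema}
        if dim != "*":
            # Fixed-size dimensions pin both bounds to the same value.
            level["minItems"] = int(dim)
            level["maxItems"] = int(dim)
        schema = level
    return schema


print(json.dumps(shape_to_schema("2 x, * y, 4 z"), indent=2, sort_keys=True))
```

Apart from the "title" key that pydantic adds, the printed result has the same structure as the "array" property shown above.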
Serialization
class SmolArray(BaseModel):
    array: NDArray[Shape["2 x, 2 y"], Int]

class BigArray(BaseModel):
    array: NDArray[Shape["1000 x, 1000 y"], Int]
Small arrays are serialized as lists of lists, and large arrays as a base64-encoded, blosc-compressed string:
>>> smol = SmolArray(array=np.array([[1,2],[3,4]], dtype=int))
>>> big = BigArray(array=np.random.randint(0,255,(1000,1000),int))
>>> print(smol.model_dump_json())
{"array":[[1,2],[3,4]]}
>>> print(big.model_dump_json())
{
  "array": "( long b64 encoded string )",
  "shape": [1000, 1000],
  "dtype": "int64",
  "unpack_fns": ["base64.b64decode", "blosc2.unpack_array2"]
}
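The "unpack_fns" list suggests that a consumer can recover the array by applying the named functions to the "array" string in order. The sketch below illustrates that idea under stated assumptions: `resolve` and `unpack` are hypothetical helpers, the in-order application is an assumption rather than documented behavior, and `zlib.decompress` stands in for `blosc2.unpack_array2` so that the example needs only the standard library.

```python
import base64
import importlib
import json
import zlib
from typing import Any


def resolve(dotted_name: str) -> Any:
    """Import "module.function" and return the function object."""
    module_name, _, attr = dotted_name.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


def unpack(payload: dict) -> Any:
    """Apply each function named in "unpack_fns" to the value, in order."""
    value = payload["array"]
    for name in payload["unpack_fns"]:
        value = resolve(name)(value)
    return value


# Stand-in payload: zlib replaces blosc2 (an assumption made for this sketch).
raw = json.dumps([[1, 2], [3, 4]]).encode()
payload = {
    "array": base64.b64encode(zlib.compress(raw)).decode(),
    "unpack_fns": ["base64.b64decode", "zlib.decompress"],
}
restored = json.loads(unpack(payload))
print(restored)  # [[1, 2], [3, 4]]
```

Resolving function names taken from a payload is only reasonable for trusted data, since it imports and calls whatever the payload names.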