Skip to main content

Generate Avro Schemas from a Python class

Project description

Dataclasses Avro Schema Generator

Generate Avro Schemas from a Python class

Build Status GitHub license codecov python version

Requirements

python 3.7+

Installation

pip install dataclasses-avroschema

Documentation

https://marcosschroh.github.io/dataclasses-avroschema/

Usage

Generating the avro schema

import typing

from dataclasses_avroschema import AvroModel, types


class User(AvroModel):
    "An User"
    name: str
    age: int
    pets: typing.List[str]
    accounts: typing.Dict[str, int]
    favorite_colors: types.Enum = types.Enum(["BLUE", "YELLOW", "GREEN"])
    country: str = "Argentina"
    address: str = None

    class Meta:
        namespace = "User.v1"
        aliases = ["user-v1", "super user"]

User.avro_schema()

'{
    "type": "record",
    "name": "User",
    "doc": "An User",
    "namespace": "User.v1",
    "aliases": ["user-v1", "super user"],
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "age", "type": "int"},
        {"name": "pets", "type": "array", "items": "string"},
        {"name": "accounts", "type": "map", "values": "int"},
        {"name": "favorite_colors", "type": "enum", "symbols": ["BLUE", "YELLOW", "GREEN"]},
        {"name": "country", "type": "string", "default": "Argentina"},
        {"name": "address", "type": ["null", "string"], "default": "null"}
    ]
}'

Serialization to avro or avro-json and json payload

For serialization is neccesary to use python class/dataclasses instance

from dataclasses import dataclass

import typing

from dataclasses_avroschema import AvroModel


@dataclass
class Address(AvroModel):
    "An Address"
    street: str
    street_number: int

@dataclass
class User(AvroModel):
    "User with multiple Address"
    name: str
    age: int
    addresses: typing.List[Address]

address_data = {
    "street": "test",
    "street_number": 10,
}

# create an Address instance
address = Address(**address_data)

data_user = {
    "name": "john",
    "age": 20,
    "addresses": [address],
}

# create an User instance
user = User(**data_user)

user.serialize()
# >>> b"\x08john(\x02\x08test\x14\x00"

user.serialize(serialization_type="avro-json")
# >>> b'{"name": "john", "age": 20, "addresses": [{"street": "test", "street_number": 10}]}'

# Get the json from the instance

user.to_json()
# python dict >>> {'name': 'john', 'age': 20, 'addresses': [{'street': 'test', 'street_number': 10}]}

Deserialization

Deserialization could take place with an instance dataclass or the dataclass itself

import typing

from dataclasses_avroschema import AvroModel


class Address(AvroModel):
    "An Address"
    street: str
    street_number: int

class User(AvroModel):
    "User with multiple Address"
    name: str
    age: int
    addresses: typing.List[Address]

avro_binary = b"\x08john(\x02\x08test\x14\x00"
avro_json_binary = b'{"name": "john", "age": 20, "addresses": [{"street": "test", "street_number": 10}]}'

User.deserialize(avro_binary)
# >>> {"name": "john", "age": 20, "addresses": [{"street": "test", "street_number": 10}]}

User.deserialize(avro_json_binary, serialization_type="avro-json")
# >>> {"name": "john", "age": 20, "addresses": [{"street": "test", "street_number": 10}]}

Features

  • Primitive types: int, long, float, boolean, string and null support
  • Complex types: enum, array, map, fixed, unions and records support
  • Logical Types: date, time, datetime, uuid support
  • Schema relations (oneToOne, oneToMany)
  • Recursive Schemas
  • Generate Avro Schemas from faust.Record
  • Instance serialization correspondent to avro schema generated
  • Data deserialization
  • Generate json from python class instance

Development

  1. Create a virtualenv: python3.7 -m venv venv && source venv/bin/activate
  2. Install requirements: pip install -r requirements.txt
  3. Code linting: ./scripts/lint
  4. Run tests: ./scripts/test

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dataclasses-avroschema-0.14.5.tar.gz (15.3 kB view details)

Uploaded Source

File details

Details for the file dataclasses-avroschema-0.14.5.tar.gz.

File metadata

  • Download URL: dataclasses-avroschema-0.14.5.tar.gz
  • Upload date:
  • Size: 15.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/39.0.1 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.7.0

File hashes

Hashes for dataclasses-avroschema-0.14.5.tar.gz
Algorithm Hash digest
SHA256 d1b6a2e9e41cd7b8331739c6b73ad56aaf17c8230c8f0ceb814ff5aceff0795b
MD5 bbaae5d3e5cb6bde76f9b1285b094402
BLAKE2b-256 5d280f1e69416ca941b48efc575ba0c41846bdf720e35a58da7bb6fe0cdad8a6

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page