stream-write-ods

Python function to construct an ODS (OpenDocument Spreadsheet) on the fly - without having to store the entire file in memory or disk.

Can be used to convert CSV, SQLite, or JSON to ODS format.

Installation

pip install stream-write-ods

Usage

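In general, pass a nested iterable to stream_write_ods and it will return an iterable of bytes of the ODS file. Each item of the outer iterable describes one sheet as a tuple of sheet name, column names, and an iterable of rows, as follows.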

from stream_write_ods import stream_write_ods

def get_sheets():
    def get_rows_of_sheet_1():
        yield 'Value A', 'Value B'
        yield 'Value C', 'Value D'

    yield 'Sheet 1 name', ('col_1_name', 'col_2_name'), get_rows_of_sheet_1()

    def get_rows_of_sheet_2():
        yield 'col_1_value',

    yield 'Sheet 2 name', ('col_1_name',), get_rows_of_sheet_2()

ods_chunks = stream_write_ods(get_sheets())
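
Because the result is an iterable of bytes, it can be written out chunk by chunk without first joining it into a single bytes object. The following is a minimal sketch of doing so; save_chunks is a hypothetical helper name, not part of stream-write-ods, and it works with any iterable of bytes.

```python
def save_chunks(chunks, path):
    """Write an iterable of bytes to path, one chunk at a time.

    Hypothetical helper, not part of stream-write-ods.
    Returns the total number of bytes written.
    """
    total = 0
    with open(path, 'wb') as f:
        for chunk in chunks:
            total += f.write(chunk)
    return total
```

With the example above, this would be called as save_chunks(ods_chunks, 'example.ods').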

Usage: Convert CSV to ODS

The following recipe converts a CSV to ODS.

import codecs
import csv
from stream_write_ods import stream_write_ods

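# Any iterable that yields the bytes of a CSV file, one line per item
# Hard coded for the purposes of this example
# Strings, including the header, are quoted so that QUOTE_NONNUMERIC
# converts only the unquoted values to floats
csv_bytes_iter = (
    b'"col_1","col_2"\n',
    b'1,"value"\n',
)

def get_sheets(sheet_name, csv_reader):
    yield sheet_name, next(csv_reader), csv_reader

csv_str_iter = codecs.iterdecode(csv_bytes_iter, 'utf-8')
# quoting must be passed by keyword: the second positional argument is the dialect
csv_reader = csv.reader(csv_str_iter, quoting=csv.QUOTE_NONNUMERIC)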
ods_chunks = stream_write_ods(get_sheets('Sheet 1', csv_reader))
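
The same pattern extends to several CSV files on disk, one sheet per file. The sketch below uses only the standard library; get_sheets_from_csv_files is a hypothetical name, and the sheet names are taken from the file names purely as an illustration. It assumes that the rows of one sheet are fully read before the next sheet is requested, because each file is closed as soon as the generator moves on.

```python
import csv
from pathlib import Path

def get_sheets_from_csv_files(csv_paths):
    """Yield (sheet name, column names, rows) for each CSV file.

    Hypothetical helper, not part of stream-write-ods. Each file is only
    open while its sheet is being consumed, so the rows of one sheet must
    be read before asking for the next sheet.
    """
    for path in csv_paths:
        with open(path, newline='', encoding='utf-8') as f:
            reader = csv.reader(f)
            columns = next(reader, None)
            if columns is None:
                # Empty file: no header row, so no sheet is produced
                continue
            yield Path(path).stem, tuple(columns), reader
```

The result would then be passed on as stream_write_ods(get_sheets_from_csv_files(['a.csv', 'b.csv'])). All values are left as strings here, since the default csv.reader does no type conversion.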

Usage: Convert large/chunked pandas dataframe to ODS

from io import BytesIO
from itertools import chain
import pandas as pd
from stream_write_ods import stream_write_ods

# Hard coded for the purposes of this example,
# but could be any file-like object
csv_file = BytesIO(
    b'col_1,col_2\n'
    b'1,"value"\n'
    b'2,"other value"\n'
)

def get_sheets():
    columns = None

    def get_rows():
        nonlocal columns
        # Directly reads the hard-coded CSV and saves it as ODS for the purposes
        # of this example, but there could be calculations / manipulations in between
        with pd.read_csv(csv_file, chunksize=1024) as reader:
            for chunk in reader:
                if columns is None:
                    columns = tuple(chunk.columns.tolist())
                yield from (row for index, row in chunk.iterrows())

    rows = get_rows()
    first_row = next(rows)

    yield 'Sheet 1', columns, chain((first_row,), rows)

ods_chunks = stream_write_ods(get_sheets())

with open('t.ods', 'wb') as f:
    for chunk in ods_chunks:
        f.write(chunk)

Usage: Convert JSON to ODS

Using ijson to stream-parse a JSON file, it's possible to convert JSON data to ODS on the fly:

import ijson
import itertools
from stream_write_ods import stream_write_ods

# Any iterable that yields the bytes of a JSON file
# Hard coded for the purposes of this example
json_bytes_iter = (b'''{
  "data": [
      {"id": 1, "name": "Foo"},
      {"id": 2, "name": "Bar"}
  ]
}''',)

# ijson requires a file-like object
def to_file_like_obj(bytes_iter):
    chunk = b''
    offset = 0
    it = iter(bytes_iter)

    def up_to_iter(size):
        nonlocal chunk, offset

        while size:
            if offset == len(chunk):
                try:
                    chunk = next(it)
                except StopIteration:
                    break
                else:
                    offset = 0
            to_yield = min(size, len(chunk) - offset)
            offset = offset + to_yield
            size -= to_yield
            yield chunk[offset - to_yield:offset]

    class FileLikeObj:
        def read(self, size=-1):
            return b''.join(up_to_iter(float('inf') if size is None or size < 0 else size))

    return FileLikeObj()

def get_sheets(json_file):
    columns = None

    def rows():
        nonlocal columns
        for item in ijson.items(json_file, 'data.item'):
            if columns is None:
                columns = list(item.keys())
            yield tuple(item[column] for column in columns)

    # Ensure columns populated
    rows_it = rows()
    first_row = next(rows_it)

    yield 'Sheet 1', columns, itertools.chain((first_row,), rows_it)

json_file = to_file_like_obj(json_bytes_iter)
ods_chunks = stream_write_ods(get_sheets(json_file))

Usage: Convert SQLite to ODS

SQLite isn't particularly streaming-friendly since typically you need random access to the file. But it's still possible to use stream-write-ods to convert SQLite to ODS.

import contextlib
import sqlite3
import tempfile
from stream_write_ods import stream_write_ods

@contextlib.contextmanager
def get_db():
    # Hard coded in memory database for the purposes of this example
    with sqlite3.connect(':memory:') as con:
        cur = con.cursor()
        cur.execute("CREATE TABLE my_table_a (my_col text);")
        cur.execute("CREATE TABLE my_table_b (my_col text);")
        cur.execute("INSERT INTO my_table_a VALUES ('Value A')")
        cur.execute("INSERT INTO my_table_b VALUES ('Value B')")
        yield con

def quote_identifier(value):
    return '"' + value.replace('"', '""') + '"'

def get_sheets(db):
    cur_table = db.cursor()
    cur_table.execute('''
        SELECT name FROM sqlite_master
        WHERE type = 'table' AND name NOT LIKE 'sqlite\\_%' ESCAPE '\\'
    ''')
    cur_data = db.cursor()
    for table, in cur_table:
        cur_data.execute(f'SELECT * FROM {quote_identifier(table)} ORDER BY rowid')
        yield table, tuple(col[0] for col in cur_data.description), cur_data

with get_db() as db:
    ods_chunks = stream_write_ods(get_sheets(db))
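
Since the sheets are a plain nested iterable, they can be inspected without generating any ODS output, which can help when checking a query. The sketch below repeats a condensed get_sheets so that it runs on its own with only the standard library; preview and max_rows are hypothetical names. Note that the rows of each sheet are read before the next sheet is requested, because all sheets share one cursor.

```python
import sqlite3
from itertools import islice

def quote_identifier(value):
    return '"' + value.replace('"', '""') + '"'

def get_sheets(db):
    cur_table = db.cursor()
    cur_table.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite\\_%' ESCAPE '\\' "
        "ORDER BY name"
    )
    cur_data = db.cursor()
    for table, in cur_table:
        cur_data.execute(f'SELECT * FROM {quote_identifier(table)} ORDER BY rowid')
        yield table, tuple(col[0] for col in cur_data.description), cur_data

def preview(sheets, max_rows=5):
    """Return [(name, columns, first rows)] for each sheet (hypothetical helper)."""
    return [
        (name, columns, list(islice(rows, max_rows)))
        for name, columns, rows in sheets
    ]

# Hard coded in memory database for the purposes of this example
con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE my_table_a (my_col text)')
con.execute('CREATE TABLE my_table_b (my_col text)')
con.execute("INSERT INTO my_table_a VALUES ('Value A')")
con.execute("INSERT INTO my_table_b VALUES ('Value B')")

summary = preview(get_sheets(con))
con.close()

print(summary)
# [('my_table_a', ('my_col',), [('Value A',)]), ('my_table_b', ('my_col',), [('Value B',)])]
```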

Types

There are eight possible data types in an OpenDocument Spreadsheet: boolean, currency, date, float, percentage, string, time, and void. Four of these (boolean, date, float, and string) can be output by stream-write-ods, chosen automatically from the Python type of each value according to the following table.

| Python type | ODS type |
| --- | --- |
| bool | boolean |
| date | date - without time component |
| datetime | date - with time component |
| int | float |
| float | float |
| str | string |
| bytes | string - base64 encoded |
| NoneType | string - as #NA |

It is possible to change how each type is encoded by overriding the encoders parameter of the stream_write_ods function. See stream-write-ods.py for the default implementation.
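
The mapping in the table can be read as a dispatch on the Python type of each value. The sketch below illustrates that mapping only; it is not the library's implementation, ods_type_and_text is a hypothetical name, and the text forms it returns (ISO 8601 dates, 'true'/'false') are assumptions rather than the library's actual encoders.

```python
import base64
from datetime import date, datetime

def ods_type_and_text(value):
    """Return (ODS type name, text form) following the table above.

    Illustrative only: the text forms are assumptions and are not taken
    from the encoders of stream-write-ods.
    """
    if value is None:
        return 'string', '#NA'
    # bool must be tested before int, because bool is a subclass of int
    if isinstance(value, bool):
        return 'boolean', 'true' if value else 'false'
    # datetime is a subclass of date, so this covers both rows of the table
    if isinstance(value, date):
        return 'date', value.isoformat()
    if isinstance(value, (int, float)):
        return 'float', str(value)
    if isinstance(value, str):
        return 'string', value
    if isinstance(value, bytes):
        return 'string', base64.b64encode(value).decode('ascii')
    raise TypeError(f'No ODS type for {type(value).__name__}')

print(ods_type_and_text(True))               # ('boolean', 'true')
print(ods_type_and_text(1))                  # ('float', '1')
print(ods_type_and_text(date(2022, 1, 5)))   # ('date', '2022-01-05')
print(ods_type_and_text(b'hi'))              # ('string', 'aGk=')
print(ods_type_and_text(None))               # ('string', '#NA')
```

The ordering of the checks matters: testing for int before bool would classify True as a float.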

Modified at

ODS files are ZIP files, and as such require a "modified at" time for each member file. This defaults to datetime.now, but can be overridden by the get_modified_at parameter of the stream_write_ods function. See stream-write-ods.py for the default implementation.

This is useful if you want to make sure generated ODS files are byte-for-byte identical to a fixed reference, for example in automated tests.
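
To see why the modified time affects the output bytes, the following sketch builds a small ZIP archive with the standard library zipfile module rather than with stream-write-ods. The member name and contents are made up for the illustration; the point is only that a fixed time gives identical bytes and a different time does not.

```python
import io
import zipfile

def build_zip(modified_at):
    """Return the bytes of a one-member ZIP with the given modified time.

    modified_at is a (year, month, day, hour, minute, second) tuple.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, 'w') as zf:
        info = zipfile.ZipInfo('content.xml', date_time=modified_at)
        zf.writestr(info, b'<example/>')
    return buffer.getvalue()

same_a = build_zip((2022, 1, 5, 0, 0, 0))
same_b = build_zip((2022, 1, 5, 0, 0, 0))
# ZIP stores times with a 2 second resolution, so differ by 2 seconds
different = build_zip((2022, 1, 5, 0, 0, 2))

print(same_a == same_b)     # True
print(same_a == different)  # False
```

With stream-write-ods, the equivalent would presumably be to pass something like get_modified_at=lambda: datetime(2022, 1, 5), assuming the parameter takes a zero-argument callable that returns a datetime, as the datetime.now default suggests.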

Limitations

ODS spreadsheets are essentially ZIP archives containing several member files. While ZIP archives in general can be up to 16 EiB (exbibytes) in size using ZIP64, LibreOffice does not support ZIP64, so ODS files are de facto limited to 4 GiB (gibibytes). This limit applies to the size of the entire compressed archive, the compressed size of each member file, and the uncompressed size of each member file.
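
The three limits can be checked on a finished file with the standard library. This is a rough sketch, not something provided by stream-write-ods: within_limits and its limit parameter are hypothetical names, and the 4 GiB threshold is taken from the paragraph above.

```python
import os
import zipfile

FOUR_GIB = 4 * 1024 ** 3

def within_limits(path, limit=FOUR_GIB):
    """Return True if the archive and every member are below limit bytes.

    Checks the size of the whole archive, then the compressed and
    uncompressed size of each member file.
    """
    if os.path.getsize(path) >= limit:
        return False
    with zipfile.ZipFile(path) as zf:
        return all(
            info.compress_size < limit and info.file_size < limit
            for info in zf.infolist()
        )
```

For a file written as t.ods, the check would be within_limits('t.ods').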


