pandabase links pandas DataFrames to SQL databases. Upsert, append, read, drop, describe...

These details have not been verified by PyPI

Project links

Homepage

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

pandabase

pandabase links pandas DataFrames to SQL databases, supporting read, append, and upsert.

By default, uses DataFrame.index as the primary key. By using an explicit primary key, pandabase makes rational database schemas the obvious choice, and makes it easy to maintain clean data even when it must be updated frequently.

Designed for especially for time-series datasets that need to be updated over time and stored to disk, but are used primarily in-memory for computation.

Tested under Python 3.6 and 3.7, with new versions of Pandas (>= 0.24) SQLAlchemy (>= 1.3). Requires psycopg2 for postgres (8+) support.

It's a relatively new tool, but for my purposes it works great. Comments and contributions welcome.

Features

primary keys: by default, any named index is assumed to be the PK
- also supports auto_index (with parameter auto_index=True)
insert modes: 'create_only', 'upsert', and 'append'
replaces pd.DataFrame.to_sql and pd.read_sql
tested under SQLite and PostgresQL
automated test suite in pytest
- 93% test coverage
also includes pandabase.companda.companda(df1, df2) for rich comparisons of DataFrames

Design Considerations

Minimal dependencies: SQLAlchemy and Pandas are the only requirements
Database is the source of truth: will coerce incoming DataFrames to fit existing schema
- but also is reasonably smart about how new tables are created from DataFrames
Not horrendously slow

License

MIT license

Thanks

Code partially stolen from Dataset and pandas.sql

Installation

From your inside your virtual environment of choice:

~/$ pip install pandabase

For latest version:

~/$ git clone https://github.com/notsambeck/pandabase
~/$ cd pandabase
~/pandabase/$ pip install -r requirements.txt
~/pandabase/$ pip install .

Usage

# Python >= 3.6
>>> import pandas as pd
>>> import pandabase
>>> my_data = pd.DataFrame(index=range(7, 12), 
                           columns=['some_number'],
                           data=pd.np.random.random((5,1)))
>>> my_data.index.name = 'made_up_name'        # index must be named to use as PK
>>> pandabase.to_sql(my_data, table_name='my_table', con='sqlite:///new_sqlite_db.sqlite', how='create_only')
Table('my_table', ...
>>> exit()

That's all!

Your data is now persistently stored in a SQLite database, using my_data.index as primary key. To append or update data, replace 'create_only' with 'append' or 'upsert'. To store records without an explicit index, use 'autoindex=True'.

~/pandabase$ ls
new_sqlite_db.sqlite

>>> import pandabase
>>> df = pandabase.read_sql('my_table', con='sqlite:///new_sqlite_db.sqlite'))
>>> df
    some_number 
7   0.722416 
8   0.076045 
9   0.213118 
10  0.453716 
11  0.406995

Additional Features

Companda - rich comparisons of DataFrames. call companda on two DataFrames, get a Companda object back (that evaluates to True/False).

>>> from pandabse.companda import companda
>>> df = pandabase.read_sql('my_table', con='sqlite:///new_sqlite_db.sqlite'))
>>> companda(df, df.copy())
Companda(True, message='Equal DataFrames')
>>> bool(companda(df, df.copy()))
True

>>> df2 = df.copy
>>> df2.iloc[1, 2] = -1000
>>> companda(df, df2)
Companda(False, message='Columns, indices are equal, but unqual values in columns...')
>>> bool(companda(df, df2))
False

Table tools: pandabase. ...

add_columns_to_db(new_col, table_name, con):
- """Make new columns as needed with ALTER TABLE, as a weak substitute for migrations"""
drop_db_table(table_name, con):
- """Drop table [table_name] from con"""
get_db_table_names(con):
- """get a list of table names from database"""
get_table_column_names(con, table_name):
- """get a list of column names from database, table"""
describe_database(con):
- """get a description of database content: table_name: {table_info_dict}"""

Project details

These details have not been verified by PyPI

Project links

Homepage

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Release history Release notifications | RSS feed

0.5.0

Jul 30, 2020

0.4.5

Feb 14, 2020

0.3

Nov 28, 2019

This version

0.2.1

Sep 10, 2019

0.1.1

Aug 12, 2019

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pandabase-0.2.1.tar.gz (12.3 kB view details)

Uploaded Sep 10, 2019 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

pandabase-0.2.1-py3-none-any.whl (12.9 kB view details)

Uploaded Sep 10, 2019 Python 3

File details

Details for the file pandabase-0.2.1.tar.gz.

File metadata

Download URL: pandabase-0.2.1.tar.gz
Upload date: Sep 10, 2019
Size: 12.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.33.0 CPython/3.6.7

File hashes

Hashes for pandabase-0.2.1.tar.gz
Algorithm	Hash digest
SHA256	`5be24f147f5e0d77df126227e1c10d94f9ba4a98298332a5a4219d6eed5da2e3`
MD5	`334d424f06b2db269d62b42c61d1cece`
BLAKE2b-256	`03a97c4644d8a6d2cef4bc2093cdb01f279620ad80d2813311cadba4c0ba00c2`

See more details on using hashes here.

File details

Details for the file pandabase-0.2.1-py3-none-any.whl.

File metadata

Download URL: pandabase-0.2.1-py3-none-any.whl
Upload date: Sep 10, 2019
Size: 12.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.33.0 CPython/3.6.7

File hashes

Hashes for pandabase-0.2.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f2634d36bb384198f90804b5ae89f798f36bea2a0c2bb524ad9bc707754f9a7d`
MD5	`24152057d1968994ef236046d9c2285f`
BLAKE2b-256	`809714dd35b2a5c8a153180361d344f34a1da379127704ddd46b7b0d50f2bbcb`

See more details on using hashes here.

pandabase 0.2.1

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

pandabase

Features

Design Considerations

License

Thanks

Installation

Usage

Additional Features

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes