Skip to main content

Embrace SQL keeps your SQL queries in SQL files. An anti-ORM inspired by HugSQL and PugSQL

Project description

Does writing complex queries in an ORM feel like driving with the handbrake on? Embrace SQL! Put your SQL queries in regular .sql files, and embrace will load them.

Installation:

pip install embrace

Usage:

import embrace

# Connect to your database, using any db-api connector.
# If python supports it, so does embrace.
conn = psycopg2.connect("postgresql:///mydb")

# Create a module populated with queries from a collection of *.sql files:
queries = embrace.Module("resources/sql")

# Run a query
users = queries.list_users(conn, order_by='created_at')

Add resources/sql/list_users.sql containing an SQL query:

-- :name list_users :many
select * from users where active = :active order by :identifier:order_by

What is the format of a query SQL file?

Embrace-SQL is based on the format used by HugSQL and PugSQL. SQL files can contain special comments to specify the query name and expected return types

-- :name get_user_count
-- :returns int
SELECT count(1) FROM users

If :name is omitted, it will default to the filename without extension.

A single file may contain multiple SQL queries, separated by a structured SQL comment. For example to create two query objects accessible as queries.list_users() and queries.get_user_by_id():

-- :name list_users
select * from users

-- :name get_user_by_id
select * from users where id=:id

If you don’t have the separating comment, embrace-sql will run multiple statements in a single query call, returning the result from just the last one.

Why? Because it makes this possible in MySQL:

insert into users (name, email) values (:name, :email);
select last_insert_id();

Including queries

You may include one query inside another using the :include: directive. For example:

-- :name select_valid_users
SELECT * FROM users WHERE deleted_at IS NULL;

-- :name select_user_by_email
SELECT * FROM (:include:select_valid_users) WHERE email = :email;

What can queries return?

Queries return tuples by default. Calling Query.returning() or specifiying a :returns hint in the SQL can change this to a dictionary, dataclass, or any other object. See later on in this document for how to do this.

To return a single row, use Query.one(), Query.first() or Query.one_or_none(). For multiple rows, use Query.many(). You can also return a single value, or the values of a single column.

The following Query methods return different result types:

affected, rowcount

The number of rows affected

first

The first row, as returned by cursor.fetchone(), or None if no row is found.

one

A single row, as returned by cursor.fetchone(), usually as a tuple (but most db-api modules have extensions allowing you to access rows as dicts or named tuples.

If no row is generated by the query, embrace.exceptions.NoResultFound will be raised. If more than one row is generated by the query, embrace.exceptions.MultipleResultsFound will be raised.

one_or_none

As one, but returns None if no row is returned by the query.

many

An iterator over a number of rows. Each row will be the value returned by cursor.fetchone(), usually a tuple.

cursor

The cursor object.

scalar

The value of the first column of the first row returned by the query.

If no row is generated by the query, a NoResultFound will be raised.

column

An iterator over the values in the first column returned.

execute, resultset

An object supporting access to query results as any of the above result types.

Use resultset to get access to both the results and cursor metadata, for example:

result = Query("SELECT * from mytable").resultset(conn)
print(result.many())
print(result.cursor.description)

How do I return rows as dicts, namedtuples or dataclasses?

Queries return rows directly from the underlying db-api driver. Many drivers have options to return data structures other than tuples (for example sqlite3.Row or psycopg2.extras.DictCursor). You will need to configure these at the connection level.

See the next section for how to use embrace.query.mapobject to map rows to dicts, namedtuples, dataclasses or your own ORM-style model classes.

How do I map rows onto objects?

Embrace supports simple ORM style mapping.

Example:

import embrace
from dataclasses import dataclass

@dataclass
class User:
    id: int
    name: str

query = queries.query("SELECT * from users").returning(User)
users = query.many(conn)

Map multiple classes in a single query:

query = queries.query(
    "SELECT * FROM posts JOIN users ON posts.user_id = users.id"
).returning((Post, User))
for post, user in query.many(conn):
    …

By default embrace looks for fields named id (case insensitive) to split up the row.

If you need to split on different columns, use mapobject to specify how to map the returned columns onto objects:

from embrace import mapobject

query = queries.query(
    """
    SELECT posts.*, users.*
    FROM posts JOIN users ON posts.user_id = users.user_id
    """
).returning(
    (
        mapobject(Post, split="post_id"),
        mapobject(User, split="user_id")
    )
)
for post, user in query.many(conn):
    …

mapobject can also load columns into dicts, namedtuples and dataclasses:

from embrace import mapobject

query = queries.query(
    """
    SELECT posts.*, categories.*, users.*
    FROM posts
        JOIN categories ON posts.category_id = categories.id
        JOIN users ON posts.user_id = users.id
    """
).returning(
    (
        mapobject.dict(),
        mapobject.namedtuple()
        mapobject.dataclass()
    )
)
for post, category, user in query.many(conn):
    …

When mapping dataclasses, you can specify additional fields for the dataclass by providing a list of fields in the format expected by dataclasses.make_dataclass:

mapobject.dataclass(
    [
        ('owner', typing.Any),
        ('images', list[Image], dataclasses.fields(default_factory=list))
    ]
)

Or as keyword arguments:

mapobject.dataclass(
    owner=typing.Any,
    images=(list[Image], dataclasses.fields(default_factory=list)),
)

mapobject.passthrough passes individual columns through unchanged:

query = queries.query(
    """
    SELECT posts.*, count(*) as reply_count
    FROM posts JOIN replies ON posts.id = replies.post_id
    """
).returning(
    (
        mapobject(Post, split="post_id"),
        mapobject.passthrough(split="reply_count"),
    )
)
for post, reply_count in query.many(conn):
    …

You can also tell embrace to populate join relationships:

from embrace import joinmany
from embrace import joinone

query = queries.query(
    """
    SELECT users.*, orders.*, products.*
    FROM users
    JOIN orders ON orders.user_id = users.id
    JOIN products ON orders.product_id = products.id
    ORDER BY users.id, orders.id
    """
).returning(
    # Each row of this query returns data for a User, Order and Product
    # object. The `key` parameter tells embrace to map items with identical
    # key values to the same python object.
    (
        mapobject(User, key="id"),
        mapobject(Order, key="id"),
        mapobject(Product, key="id"),
    ),
    joins=[
        # Populate User.orders with the list of Order objects
        joinmany(User, 'orders', Order),

        # Populate Order.product with the product object
        joinone(Order, 'product', Product),
    ],
)

for user in query.many(conn):
    for order in user.order:
        product = order.product
        …

The same object mapping and joining can be expressed in comment metadata:

-- :returns tuple[mapobject(User, key="id"), mapobject(Order, key="id"), mapobject(Product, key="id")]
-- :joinmany User.orders = Order
-- :joinone Order.product = Product
SELECT users.*, orders.*, products.*
FROM users
JOIN orders ON orders.user_id = users.id
JOIN products ON orders.product_id = products.id
ORDER BY users.id, orders.id

Note that methods like query.one operate at the level of the database cursor. If you use joins to consolidate multiple database rows into a single object, you will still need to call query.many even if you only require a single object to be returned.

How do parameters work?

Placeholders inserted using the :name syntax are escaped by the db-api driver:

-- Outputs `select * from user where name = 'o''brien'`;
select * from users where name = :name

You can add type hints for placeholders (this only works with the code generation module, see below):

select * from users where id = :id:int

You can interpolate lists and tuples too:

:tuple: creates a placeholder like this (?, ?, ?)

:value*: creates a placeholder like this ?, ?, ?

:tuple*: creates a placeholder like this (?, ?, ?), (?, ?, ?), (useful for multiple insert queries)

-- Call this with `queries.insert_foo(data=(1, 2, 3))`
INSERT INTO foo (a, b, c) VALUES :tuple:data

-- Call this with `queries.get_matching_users(names=("carolyn", "douglas"))`
SELECT * from users WHERE name in (:value*:names)

You can escape identifiers with :identifier:, like this:

-- Outputs `select * from "some random table"`
select * from :identifier:table_name

You can pass through raw sql too. This leaves you open to SQL injection attacks if you allow user input into such parameters:

-- Outputs `select * from users order by name desc`
select * from users order by :raw:order_clause

How do I handle connections? Transactions?

You must pass a db-api connection object every time you call a query. You can manage these connections yourself, but Embrace also offers a connection pooling module.

from embrace import pool

# Create a connection pool
connection_pool = pool.ConnectionPool(
    partial(psycopg2.connect, database='mydb'),
    limit=10
)

# Example 1 - explicit calls to getconn/release
conn = connection_pool.getconn()
try:
    queries.execute_some_query(conn)
finally:
    connection_pool.release(conn)

# Example 2 - context manager
with connection_pool.connect() as conn:
    queries.execute_some_query(conn)

Transaction handling may be handled manually by calling commit() or rollback() on the connection object, or you can also use the transaction context run to queries in a transaction:

with queries.transaction(conn) as q:
    q.increment_counter()

The transaction will be commited when the with block exits, or rolled back if an exception occurred.

Connection pooling

Embrace’s connection pooling maintains a single connection per-thread.

If a thread checks out a connection, subsequent calls will return the same connection object:

connection_pool = pool.ConnectionPool(partial(psycopg2.connect, database='mydb'))
with connection_pool.connect() as conn1:
    with connection_pool.connect() as conn2:

        # The same connection object is returned
        assert conn1 is conn2

If an independent connection is required, pass contextual=False:

connection_pool = pool.ConnectionPool(partial(psycopg2.connect, database='mydb'))
with connection_pool.connect() as conn1:
    with connection_pool.connect(contextual=False) as conn2:

        # A new connection object is returned
        assert conn1 is not conn2

Disable this behaviour globally by configuring the connection pool with contextual=False:

connection_pool = pool.ConnectionPool(partial(psycopg2.connect, database='mydb', contextual=False))

Compiling and type hinting queries

Embrace offers a script that compiles queries into a Python module. This improves script startup time (because the parse step can be skipped) and allows embrace to write type hints into the generated code:

$ python -m embrace.codegen myproject/sql-files -o myproject/queries.py

This searches for all query SQL files under myproject/sql-files and writes a python module under myproject/queries.py. For example if you have a query in the file myproject/sql-files/select_foo.py, you could then access it from Python by writing from myproject.queries import select_foo.

Compiled queries are type hinted, so mypy or pyright can read the signature of your queries and warn you when you are missing parameters, or when you are using the return type incorrectly.

Type hints are infered from the metadata provided in sql comments.

Note that the :returns metadata tells embrace what types to return from the query, before any joins are attempted.

Returns examples

Query with metadata

Notes

-- :returns tuple[int]
SELECT id FROM mytable

Each row is returned as a tuple, containing one data item, an integer.

-- :returns tuple[int, str]
SELECT id, name FROM mytable

The query returns rows containing an int and a str

-- :returns dict
SELECT id, name FROM mytable

Each row will be mapped onto a single dict having the structure {"id": ..., "name": ...}

-- :name foo :one
-- :import myapp.model
-- :returns myapp.model.User
SELECT * FROM users WHERE user_id=:id:int

Each row will be mapped onto an instance of myapp.model.User. Note the import, required to resolve the myapp.model module

-- :name foo
-- :from myapp.model import User
-- :returns User
SELECT * FROM users WHERE user_id=:id:int

Has the same effect as above, but with a from ... import ... style import

-- :name foo
-- :returns User, mapobject(Post, split_on='post_id')
SELECT * FROM users JOIN posts on posts.user_id=users.id
WHERE user_id=:id:int

The embrace.mapobject.mapobject class is always available without import

-- :name foo :one
-- :from myapp.model import User, Post, Tag
-- :returns User, Post, Tag
-- :joinone Post.user = User
-- :joinmany Post.tag = Post
SELECT * FROM users WHERE user_id=:id

The returns field defines the return type of each row. :joinone or :joinmany then causes returned objects to be linked together, narrowing the list of objects returned by the function.

Note that specifying embrace.mapobject.dataclass, or embrace.mapobject.namedtuple will result in a return type of typing.Any, because the structure of the dataclass/namedtuple unknown until the query is executed for the first time

How do I reload queries when the underlying files change?

Pass auto_reload=True when constructing a module:

m = Module('resources/sql', auto_reload=True)

If you are using code generation, then you need to use a file watcher. For example, using entr

ls path/to/queries/*.sql | entr python -m embrace.codegen -o mypackage/queries.py path/to/queries

Exceptions

Exceptions raised from the underlying db-api connection are wrapped in exception classes from embrace.exceptions, with PEP-249 compliant names. You can use this like so:

try:
    queries.execute("SELECT 1.0 / 0.0")
except embrace.exceptions.DataError:
    pass

The original exception is available in the __cause__ attribute of the embrace exception object.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

embrace-5.2.0.tar.gz (48.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

embrace-5.2.0-py3-none-any.whl (40.5 kB view details)

Uploaded Python 3

File details

Details for the file embrace-5.2.0.tar.gz.

File metadata

  • Download URL: embrace-5.2.0.tar.gz
  • Upload date:
  • Size: 48.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for embrace-5.2.0.tar.gz
Algorithm Hash digest
SHA256 4b0043f49b316bccaa471ac5ff9d41133c9c2eb7f2cabe5016d49474ad0c7167
MD5 98d7c1253a5f08f22308cc2244e283be
BLAKE2b-256 9f37420f0fef027f1d767c78cfef060d3a9476ca6417fa9c3ea04140436addcf

See more details on using hashes here.

File details

Details for the file embrace-5.2.0-py3-none-any.whl.

File metadata

  • Download URL: embrace-5.2.0-py3-none-any.whl
  • Upload date:
  • Size: 40.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for embrace-5.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 43c6deed264e112c468c1d61fdee19086a423518623b2e8718de19655b8af2a9
MD5 95d145b59e15f6e03426bb6611f07cdc
BLAKE2b-256 8f5f7a58c8a3b787c9e8d9f79d1e2f34d0ebf3c1397c1d91dfbdb34cfc9a8101

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page