
Building simple pipelines simply.


lined

Building simple pipelines, simply.

And lightly too: no dependencies, just pure built-in Python.

A really simple example:

>>> from lined import Line
>>> p = Line(sum, str)
>>> p([2, 3])
'5'

A still quite simple example:

>>> def first(a, b=1):
...     return a * b
>>>
>>> def last(c) -> float:
...     return c + 10
>>>
>>> f = Line(first, last)
>>>
>>> assert f(2) == 12
>>> assert f(2, 10) == 30

Let's check out the signature of f:

>>> from inspect import signature
>>>
>>> assert str(signature(f)) == '(a, b=1) -> float'
>>> assert signature(f).parameters == signature(first).parameters
>>> assert signature(f).return_annotation == signature(last).return_annotation == float
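To see how a pipeline could end up with that signature, here is a minimal sketch using only the standard library. It is not lined's actual code: `merged_signature` is a hypothetical helper, and the plain function `f` merely stands in for a `Line`. The idea is to take the parameters of the first function and the return annotation of the last one.

```python
from inspect import signature


def merged_signature(first_func, last_func):
    """Hypothetical helper: parameters of first_func, return annotation of last_func."""
    return signature(first_func).replace(
        return_annotation=signature(last_func).return_annotation
    )


def first(a, b=1):
    return a * b


def last(c) -> float:
    return c + 10


def f(*args, **kwargs):
    # Stand-in for a pipeline: call first, then feed its result to last.
    return last(first(*args, **kwargs))


# inspect.signature honors a __signature__ attribute when one is present.
f.__signature__ = merged_signature(first, last)
sig_text = str(signature(f))
```

With this in place, `signature(f)` reports the inputs of `first` and the output type of `last`, which is what you want to see when inspecting a pipeline.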

Edge case: a single function

>>> same_as_first = Line(first)
>>> assert same_as_first(42) == first(42)
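If you wonder what such a pipeline boils down to, the examples above can be reproduced with a few lines of plain Python. This is only a sketch of the idea, not how lined implements it: `compose` is a hypothetical name, and it does none of the naming or signature work that `Line` does.

```python
from functools import reduce


def compose(head, *rest):
    """Hypothetical stand-in for Line: head gets all the arguments,
    each following function gets the previous function's result."""

    def pipeline(*args, **kwargs):
        return reduce(lambda value, func: func(value), rest, head(*args, **kwargs))

    return pipeline


def first(a, b=1):
    return a * b


def last(c) -> float:
    return c + 10


p = compose(sum, str)        # same as: lambda x: str(sum(x))
f = compose(first, last)     # same as: lambda a, b=1: last(first(a, b))
same_as_first = compose(first)  # a single function is just that function
```

Note that only the first function can take several arguments. Every later function receives exactly one value: the output of the function before it.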

More?

string and dot digraph representations

Line's string representation (__repr__) and how it deals with callables that don't have a __name__ (hint: it makes one up):

import numpy as np  # only needed for np.log in this example
from functools import partial
from lined.base import Line

pipe = Line(sum, np.log, str, print, partial(map, str), name='some_name')
pipe
Line(sum, log, str, print, unnamed_func_001, name='some_name')
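Why does `partial(map, str)` need a made-up name? Because `functools.partial` objects have no `__name__` attribute. The sketch below shows one way such names could be generated. It is an illustration under assumptions: `make_namer` and `name_of` are hypothetical, and lined's own naming logic may differ.

```python
from functools import partial
from itertools import count


def make_namer(prefix="unnamed_func_"):
    """Hypothetical helper: return a function that names callables,
    numbering the ones that have no __name__ of their own."""
    counter = count(1)

    def name_of(func):
        name = getattr(func, "__name__", None)
        if name is not None:
            return name
        return f"{prefix}{next(counter):03d}"

    return name_of


name_of = make_namer()
names = [name_of(func) for func in (sum, str, print, partial(map, str))]
```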

If you have graphviz installed, you can also do this:

pipe.dot_digraph()

[Image: the rendered dot digraph of pipe]

And if you don't, but have some other dot language interpreter, you can just get the body (and fiddle with it):

print('\n'.join(pipe.dot_digraph_body()))
rankdir="LR"
sum [shape="box"]
log [shape="box"]
str [shape="box"]
print [shape="box"]
unnamed_func_001 [shape="box"]
sum -> log
log -> str
str -> print
print -> unnamed_func_001
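The body above is simple enough to rebuild by hand, which helps if you want to fiddle with it. The following sketch regenerates the same lines from a list of node names and wraps them in a `digraph { ... }` block that a dot interpreter accepts. `dot_body` is a hypothetical function written for this illustration, not part of lined.

```python
def dot_body(names, rankdir="LR"):
    """Hypothetical helper: yield dot lines for a left-to-right chain of boxes."""
    yield f'rankdir="{rankdir}"'
    for name in names:
        yield f'{name} [shape="box"]'
    # One edge per consecutive pair of nodes.
    for src, dst in zip(names, names[1:]):
        yield f"{src} -> {dst}"


body = list(dot_body(["sum", "log", "str", "print", "unnamed_func_001"]))

# Wrap the body to get a complete dot source string.
dot_source = "digraph {\n" + "\n".join(body) + "\n}"
```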

Optionally, a pipeline can have an input_name and/or an output_name. These will be used in the string representation and the dot digraph.

pipe = Line(sum, np.log, str, print, partial(map, str), input_name='x', output_name='y')
str(pipe)
"Line(sum, log, str, print, unnamed_func_001, name='some_name')"
pipe.dot_digraph()

[Image: the rendered dot digraph of pipe, with its input and output names]

Tools

iterize and iterate

from lined import Line

pipe = Line(lambda x: x * 2, 
            lambda x: f"hello {x}")
pipe(1)
'hello 2'

But what if you wanted to use the pipeline on a "stream" of data? The following wouldn't work:

try:
    pipe(iter([1,2,3]))
except TypeError as e:
    print(f"{type(e).__name__}: {e}")
TypeError: unsupported operand type(s) for *: 'list_iterator' and 'int'

Remember that error: You'll surely encounter it at some point.

The solution is (often) iterize, which transforms a function meant to be applied to a single object into a function meant to be applied to an array, or any iterable, of such objects. If you use numpy, you might be familiar with the related concept of "vectorization", or array programming.

from lined import Line, iterize
from typing import Iterable

pipe = Line(iterize(lambda x: x * 2), 
            iterize(lambda x: f"hello {x}"))
iterable = pipe([1, 2, 3])
assert isinstance(iterable, Iterable)  # see that the result is an iterable
list(iterable)  # consume the iterable and gather its items
['hello 2', 'hello 4', 'hello 6']
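For intuition, the behavior of iterize shown above can be approximated with the built-in `map`. This is a sketch under that assumption, not lined's implementation (which may, for instance, also take care of names and signatures); `my_iterize` is a hypothetical name.

```python
from functools import partial


def my_iterize(func):
    """Hypothetical stand-in for iterize: turn func(item) into func(iterable_of_items)."""
    return partial(map, func)


double_all = my_iterize(lambda x: x * 2)

lazy = double_all(iter([1, 2, 3]))  # works on lists, iterators, generators...
doubled = list(lazy)                # nothing is computed until we consume it
```

The laziness is the point: items are transformed one at a time as they are requested, so the same pipeline works on a stream that doesn't fit in memory.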

Now say that, instead of just computing the string, the last step actually printed it: a "callback" function whose result is less important than its effect (like storing something, etc.).

from lined import Line, iterize

pipe = Line(iterize(lambda x: x * 2), 
            iterize(lambda x: print(f"hello {x}")),
            )

for _ in pipe([1, 2, 3]):
    pass
hello 2
hello 4
hello 6

It could be a bit awkward to have to "consume" the iterable to have it take effect.

Just doing a

pipe([1, 2, 3])

to get those prints seems like a more natural way.

This is where you can use iterate. It basically "launches" that consuming loop for you.

from lined import Line, iterize, iterate

pipe = Line(iterize(lambda x: x * 2), 
            iterize(lambda x: print(f"hello {x}")),
            iterate
            )

pipe([1, 2, 3])
hello 2
hello 4
hello 6
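What iterate does can be sketched in a couple of lines: consume the iterable and throw the items away. The version below is an assumption about the behavior, not lined's code, and `my_iterate` is a hypothetical name. It uses a zero-length `deque`, a common idiom for exhausting an iterator without storing anything.

```python
from collections import deque


def my_iterate(iterable):
    """Hypothetical stand-in for iterate: run through the iterable for its side effects."""
    deque(iterable, maxlen=0)  # consumes every item, keeps none


seen = []
# map is lazy: seen.append is only called once my_iterate consumes the map object.
my_iterate(map(seen.append, [2, 4, 6]))
```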

Ramblings

Decorating

Toddlers write lines of code. Grown-ups write functions. Plenty of them.

Why break lines of code into small functions? Where to start...

  • It's called modularity, and that's good
  • You can reuse functions (and no, copy/paste isn't D.R.Y. -- and if you don't know what D.R.Y. is, grow up).
  • Because 7±2, a.k.a. chunking or Miller's Law.
  • You can decorate functions, not lines of code.

lined sets you up to take advantage of these goodies.

Note this line (currently 117) of lined/base.py, in the __init__ of Line:

self.funcs = tuple(map(fnode, self.funcs))

That is, every function is wrapped with fnode.

fnode is:

def fnode(func, name=None):
    return Fnode(func, name)

and Fnode is just a class that "transparently" wraps the function. This is so that we can then use Fnode to do all kinds of things to the function (without actually touching the function itself).

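from dataclasses import dataclass
from functools import wraps
from typing import Callable, Optional

@dataclass
class Fnode:
    func: Callable
    __name__: Optional[str] = None

    def __post_init__(self):
        wraps(self.func)(self)  # copy the function's metadata onto the wrapper
        # func_name: lined's helper for naming a callable (not shown here)
        self.__name__ = self.__name__ or func_name(self.func)

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)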
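To make "do all kinds of things to the function without touching it" concrete, here is a small sketch of the same wrapping idea with one extra feature: counting calls. `CountingNode` is a hypothetical class written for this illustration. It is not part of lined, and it does not use Fnode itself.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CountingNode:
    """Hypothetical wrapper: behaves like func, but remembers how often it was called."""
    func: Callable
    calls: int = 0

    def __call__(self, *args, **kwargs):
        self.calls += 1
        return self.func(*args, **kwargs)


def double(x):
    return x * 2


node = CountingNode(double)
results = [node(1), node(2)]
```

The wrapped `double` is left exactly as it was. All the bookkeeping lives in the wrapper, which is what makes wrapping every function of a pipeline a convenient place to hang extra behavior.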
