Skip to main content

No project description provided

Project description

task-py

Overview

One day you need to write a script to process some data, so you sit down and write it. The filepath and other parameters are hard-coded, but it works. A few days later, you need to do the same thing, but need to accept a filepath on the command line and maybe one more parameter, so you refactor some functionality into functions and create a simple CLI using sys.argv. Then a coworker asks if he/she/they could use the script, so you finally break down and use argparse to write a better CLI. If this cycle sounds familiar, task-py can help by allowing you to skip steps one and two, while writing interoperable, composable tools.

Installation

Install from PyPi (preferred method)

pip install lc-task

Install from GitHub with Pip

pip install git+https://github.com/libcommon/task-py.git@vx.x.x#egg=lc_task

where x.x.x is the version you want to download.

Install by Manual Download

To download the source distribution and/or wheel files, navigate to https://github.com/libcommon/task-py/tree/releases/vx.x.x/dist, where x.x.x is the version you want to install, and download either via the UI or with a tool like wget. Then to install run:

pip install <downloaded file>

Do not change the name of the file after downloading, as Pip requires a specific naming convention for installation files.

Dependencies

task-py does not have external dependencies. Only Python versions >= 3.6 are officially supported.

Getting Started

Creating a Task

Creating a Task is simple: specify some (optional) attributes and implement Task._perform_task, which performs the primary unit of work. For example, you could implement a task that takes a file path and prints the data of that file to stdout:

from pathlib import Path

from lc_task import Task


class CatFileTask(Task):
    """Print contents of a file to stdout."""
    __slots__ = ("_input_path",)

    def __init__(self, input_path: Path, *args, **kwargs) -> None:
        super().__init__(*args, **kwargs)
        self._input_path

    def _perform_task(self) -> None:
        if not self._input_path.is_file():
            raise FileNotFoundError(str(self._input_path))
        with self._input_path.open() as input_file:
            print(input_file.read())

In the example above, we defined an attribute _input_path on CatFileTask to store the input file path. However, you can also use the state dict attribute to store arbitrary state. For tasks that require some setup and teardown steps, use the Task._preamble and Task._postamble functions, which get called before and after the primary unit of work is completed. To run this task explicitly, you would use the Task.run method.

Accepting Command Line Input

To create a single task that takes command line input (using argparse), create a child class of CLITask that defines the name of the command, a brief description, and the command line arguments. Taking the example above:

from pathlib import Path

from lc_task import CLITask


class CatFileTask(CLITask):
    __slots__ = ("_input_path",)

    COMMAND = "cat"
    DESCRIPTION = "Print contents of a file to stdout."

    @classmethod
    def gen_command_parser(cls, parser: Optional[ArgumentParser] = None) -> ArgumentParser:
        parser = super().gen_command_parser()
        parser.add_argument("_input_path", type=Path, metavar="INPUT_PATH", help="Path to input file")
        return parser

    def _perform_task(self) -> None:
        if not self._input_path.is_file():
            raise FileNotFoundError(str(self._input_path))
        with self._input_path.open() as input_file:
            print(input_file.read())


if __name__ == "__main__":
    CatFileTask.run_command()

For defining more complex, hierarchical command line interfaces with subcommands, take a look at the docstring for cli.gen_cli_parser. It allows you to define your command line app with a mapping of commands to actions, then generates the CLI for you.

Pipelines and Message Passing

Anyone who has used Bash, PowerShell, or other scripting languages is familiar with the idea of composability and pipes: writing simple commands that return structured data recognized by other commands can be very powerful. Python does not support this style of programming out of the box, but it does support overloading operators for custom types, such as the bitwise or operator (|). In Python, the __ror__ builtin method is called when using the bitwise or operator on two objects where the first (left side) doesn't implement __or__.

task-py takes advantage of this flexibility to allow piping Tasks together to create pipelines. For example, suppose you were writing a CSV handing toolkit and wanted to createa a command line app that reads data from a CSV and removes some columns. There are two clear steps in this pipeline:

  1. Read data from the CSV into some data structure
  2. Remove specified columns (and write to stdout)
import csv
from pathlib import Path
from typing import List, Optional, TextIO

from lc_task import CLITask, Task, TaskResult


class CSVColumnRemovalTask(Task):
    """Remove specified columns from frows in a CSV."""
    __slots__ = ("columns", "input_file", "reader")

    def __init__(self, *args, columns: Optional[List[str]] = None, **kwargs) -> None:
        super().__init__(*args, **kwargs)
        self.columns = columns

    def _perform_task(self) -> None:
        for record in reader:
            for column in self.columns:
                del record[column]
            print(", ".join(record))

    def _preamble(self) -> None:
        self.input_file.close()


class CSVReaderTaskResult(TaskResult):
    __slots__ = ("input_file", "reader")

    def __init__(self: input_file: Optional[TextIO] = None, reader: Optional[csv.DictReader] = None, **kwargs) -> None:
        super().__init__(**kwargs)
        self.input_file = input_file
        self.reader = reader


class CSVReaderTask(Task):
    """Read data from a CSV into a `csv.DictReader` object."""
    __slots__ = ("input_path",)

    @staticmethod
    def gen_task_result() -> TaskResult:
        return CSVReaderTaskResult()

    def __init__(self, *args, input_path: Optional[Path] = None, **kwargs) -> None:
        super().__init__(*args, **kwargs)
        self.input_path = input_path

    def _perform_task(self) -> None:
        # Need to check self.input_path because it could be provided in
        # __init__ or via merge_object
        if not self.input_path:
            raise ValueError("must provide path to CSV file")

        input_file = self.input_path.open(newline="")
        reader = csv.DictReader(input_file, dialect=csv.unix_dialect)
        # Using a defined TaskResult type here, but could also just use a dict
        self.result = CSVReaderTaskResult(input_file = input_file, reader = reader)


class RemoveColumnsCLITask(CLITask):
    __slots__ = ("columns", "input_path")

    COMMAND = "remove-columns"
    DESCRIPTION = "Remove columns from a CSV and write to disk."

    @classmethod
    def gen_command_parser(cls, parser: Optional[ArgumentParser] = None) -> ArgumentParser:
        parser = super().gen_command_parser()
        parser.add_argument("input_path", type=Path, help="Path to input file")
        parser.add_argument("columns", nargs=+, help="Column names to remove")
        return parser

    def _perform_task(self) -> None:
        # Pipe the output of CSVReaderTask to CSVColumnRemovalTask
        (CSVReaderTask(input_path = self.input_path).run() |
         CSVColumnRemovalTask(columns = self.columns))


if __name__ == "__main__":
    RemoveColumnsCLITask.run_command()

Contributing/Suggestions

Contributions and suggestions are welcome! To make a feature request, report a bug, or otherwise comment on existing functionality, please file an issue. For contributions please submit a PR, but make sure to lint, type-check, and test your code before doing so. Thanks in advance!

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

lc_task-0.2.1.tar.gz (9.9 kB view hashes)

Uploaded Source

Built Distribution

lc_task-0.2.1-py3-none-any.whl (11.6 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page