Skip to main content

Port of dplyr and other related R packages in python, using pipda.

Project description

datar

A Grammar of Data Manipulation in python

Pypi Github Building Docs and API Codacy Codacy coverage

Documentation | Reference Maps | Notebook Examples | API | Blog

datar is a re-imagining of APIs of data manipulation libraries in python (currently only pandas supported) so that you can manipulate your data with it like with dplyr in R.

datar is an in-depth port of tidyverse packages, such as dplyr, tidyr, forcats and tibble, as well as some functions from base R.

Installation

pip install -U datar

# install pdtypes support
pip install -U datar[pdtypes]

# install dependencies for modin as backend
pip install -U datar[modin]
# you may also need to install dependencies for modin engines
# pip install -U modin[ray]

Example usage

from datar import f
from datar.dplyr import mutate, filter, if_else
from datar.tibble import tibble
# or
# from datar.all import f, mutate, filter, if_else, tibble

df = tibble(
    x=range(4),  # or f[:4]
    y=['zero', 'one', 'two', 'three']
)
df >> mutate(z=f.x)
"""# output
        x        y       z
  <int64> <object> <int64>
0       0     zero       0
1       1      one       1
2       2      two       2
3       3    three       3
"""

df >> mutate(z=if_else(f.x>1, 1, 0))
"""# output:
        x        y       z
  <int64> <object> <int64>
0       0     zero       0
1       1      one       0
2       2      two       1
3       3    three       1
"""

df >> filter(f.x>1)
"""# output:
        x        y
  <int64> <object>
0       2      two
1       3    three
"""

df >> mutate(z=if_else(f.x>1, 1, 0)) >> filter(f.z==1)
"""# output:
        x        y       z
  <int64> <object> <int64>
0       2      two       1
1       3    three       1
"""
# works with plotnine
# example grabbed from https://github.com/has2k1/plydata
import numpy
from datar.base import sin, pi
from plotnine import ggplot, aes, geom_line, theme_classic

df = tibble(x=numpy.linspace(0, 2*pi, 500))
(df >>
  mutate(y=sin(f.x), sign=if_else(f.y>=0, "positive", "negative")) >>
  ggplot(aes(x='x', y='y')) +
  theme_classic() +
  geom_line(aes(color='sign'), size=1.2))

example

# easy to integrate with other libraries
# for example: klib
import klib
from datar.core.factory import verb_factory
from datar.datasets import iris
from datar.dplyr import pull

dist_plot = verb_factory(func=klib.dist_plot)
iris >> pull(f.Sepal_Length) >> dist_plot()

example

See also some advanced examples from my answers on StackOverflow:

Project details


Release history Release notifications | RSS feed

This version

0.7.2

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

datar-0.7.2.tar.gz (10.2 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

datar-0.7.2-py3-none-any.whl (10.3 MB view details)

Uploaded Python 3

File details

Details for the file datar-0.7.2.tar.gz.

File metadata

  • Download URL: datar-0.7.2.tar.gz
  • Upload date:
  • Size: 10.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.13 CPython/3.10.4 Linux/5.13.0-1021-azure

File hashes

Hashes for datar-0.7.2.tar.gz
Algorithm Hash digest
SHA256 aa14a4e23829b3845f3ffd97d42253a527a647ed4a9c9958f3183a3b5acf9e33
MD5 3e6d424c1d23d8f7e390d6c11938f9cf
BLAKE2b-256 208764191212cf91cc7cf459ccf6a8714b381c789b9a140ca295a5e2b8f33951

See more details on using hashes here.

File details

Details for the file datar-0.7.2-py3-none-any.whl.

File metadata

  • Download URL: datar-0.7.2-py3-none-any.whl
  • Upload date:
  • Size: 10.3 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.13 CPython/3.10.4 Linux/5.13.0-1021-azure

File hashes

Hashes for datar-0.7.2-py3-none-any.whl
Algorithm Hash digest
SHA256 ca089b8d011fe9fe38a523738fe9fee9d0c55aab881a2e85428a6baa89f58a4e
MD5 6198e8535dfecd66cfc853e24a72b8e9
BLAKE2b-256 4b7ba6278e1feb4ef04c482fa4a6883d85144df99c62187a4284bf4209d01dff

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page