Treasure Data extension for pyspark
Project description
td_pyspark
Treasure Data extension of pyspark.
Usage
import td_pyspark
from pyspark.sql import SparkSession
spark = SparkSession\
.builder\
.appName("td-pyspark-app")\
.getOrCreate()
td = td_pyspark.TDSparkContext(spark)
# Read the table data within the -1d (yesterday) range as a DataFrame
df = td.table("sample_datasets.www_access")\
.within("-1d")\
.df()
df.show()
# Submit a Presto query
q = td.presto("select 1")
q.show()
For Developers
Running pyspark with td_pyspark:
$ ./bin/spark-submit --master "local[4]" --driver-class-path td-spark-assembly.jar --properties-file=td-spark.conf --py-files ~/work/git/td-spark/td_pyspark/td_pyspark/td_pyspark.py ~/work/git/td-spark/td_pyspark/td_pyspark/tests/test_pyspark.py
How to publish
Prerequisites
Twine is a utility for securely publishing Python packages, and it is commonly used to upload them to PyPI. Install it before publishing.
$ pip install twine
A configuration file that holds your PyPI credentials can be convenient.
$ cat << 'EOF' > ~/.pypirc
[distutils]
index-servers =
    pypi
    pypitest
[pypi]
repository=https://upload.pypi.org/legacy/
username=<your_username>
password=<your_password>
[pypitest]
repository=https://test.pypi.org/legacy/
username=<your_username>
password=<your_password>
EOF
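To confirm that the file parses as intended, it can be read with Python's standard configparser module. The following is a minimal sketch, assuming the continuation lines under index-servers are indented as INI syntax requires; the helper name list_index_servers is hypothetical, and the sample text reuses the placeholder credentials from above.

```python
import configparser


def list_index_servers(pypirc_text):
    """Return {server_name: repository_url} for servers listed under [distutils]."""
    # RawConfigParser avoids '%' interpolation, which could mangle passwords.
    parser = configparser.RawConfigParser()
    parser.read_string(pypirc_text)
    # index-servers is a multi-line value, one server name per indented line.
    names = parser.get("distutils", "index-servers").split()
    return {name: parser.get(name, "repository") for name in names}


# Sample content mirroring the file above (placeholder credentials).
SAMPLE = """\
[distutils]
index-servers =
    pypi
    pypitest

[pypi]
repository=https://upload.pypi.org/legacy/
username=<your_username>
password=<your_password>

[pypitest]
repository=https://test.pypi.org/legacy/
username=<your_username>
password=<your_password>
"""

if __name__ == "__main__":
    for name, url in list_index_servers(SAMPLE).items():
        print(name, url)
```

To check a real file, pass the contents of ~/.pypirc to the helper instead of SAMPLE.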
Build Package
Build the package in both source distribution (sdist) and wheel formats.
$ python setup.py sdist bdist_wheel
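Before uploading, it may be worth checking that the build produced both formats. This is a small sketch using only the standard library; the function names classify_artifacts and has_both_formats are hypothetical, and the default dist directory is assumed to be the output location of the build command above.

```python
from pathlib import Path


def classify_artifacts(filenames):
    """Split distribution file names into (source archives, wheels)."""
    sdists = sorted(f for f in filenames if f.endswith((".tar.gz", ".zip")))
    wheels = sorted(f for f in filenames if f.endswith(".whl"))
    return sdists, wheels


def has_both_formats(dist_dir="dist"):
    """Return True if dist_dir holds at least one sdist and one wheel."""
    directory = Path(dist_dir)
    if not directory.is_dir():
        return False
    names = [p.name for p in directory.iterdir() if p.is_file()]
    sdists, wheels = classify_artifacts(names)
    return bool(sdists) and bool(wheels)


if __name__ == "__main__":
    print("sdist and wheel present:", has_both_formats())
```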
Publish Package
Upload the package to the test repository first.
$ twine upload \
--repository pypitest \
dist/*
If everything looks correct in the test repository, publish the package to PyPI.
$ twine upload \
--repository pypi \
dist/*
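If the upload is scripted, the two commands above can be built from one helper so that only the repository name differs. This is a rough sketch, not part of td_pyspark; the names twine_upload_command and upload are hypothetical, and the repository names are assumed to match the sections in ~/.pypirc.

```python
import subprocess


def twine_upload_command(repository, files):
    """Build the argument list for `twine upload` against a named repository."""
    return ["twine", "upload", "--repository", repository, *files]


def upload(repository, files, run=subprocess.run):
    """Upload the given distribution files to one repository."""
    if not files:
        raise ValueError("no distribution files to upload")
    # check=True raises CalledProcessError if twine exits with an error.
    run(twine_upload_command(repository, list(files)), check=True)
```

A typical flow would call upload("pypitest", files), inspect the result on the test repository by hand, and only then call upload("pypi", files).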
Download files
Source Distribution
td_pyspark-19.5.0.tar.gz (3.2 kB)
Built Distribution
Hashes for td_pyspark-19.5.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 71ea79f6a84ae7cc32ba5923694366416254a87789b97e5c63cbe6b49aea0847
MD5 | 3dada29e9e97c137c5b15a0adfba0d6d
BLAKE2b-256 | 44f67781921289bbb439c9784acd5af7098f1699b99b6a5a885e8357c2b405a1
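A downloaded file can be checked against the published SHA256 digest with Python's hashlib. The sketch below is illustrative only; the helper names are hypothetical, and the path passed to matches_sha256 is whatever location you saved the wheel to.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=65536):
    """Return the hex SHA-256 digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """Return True if the file at path has the expected SHA-256 digest."""
    with open(path, "rb") as f:
        return sha256_of_stream(f) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Self-check against the well-known SHA-256 digest of b"abc".
    sample = io.BytesIO(b"abc")
    assert sha256_of_stream(sample) == (
        "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
    )
```

For the wheel listed above, expected_hex would be the SHA256 value from the table.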