
Trifacta client

trifacta

A Python client that makes it easy to integrate Trifacta into your production and data science workflows.

Usage Scenarios

  • Jupyter: Invoke Trifacta jobs from a Jupyter notebook and pass data back and forth between Jupyter and Trifacta
  • Other Notebooks: Integrate Trifacta with Azure Databricks, Zeppelin, or any other notebook-style interface that supports Python
  • Scripts: Automate Trifacta jobs and their input/output using Python scripts that can be run from the command line or called from an external scheduler

Functionality

This library makes it simple to do the following:

  1. Connect to a Trifacta instance
  2. Run a job
  3. Download the results to a CSV file and view them in a pandas DataFrame

Note that file uploads and downloads go through Amazon S3, using the boto3 API.
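Because transfers go through S3, input data has to be in a bucket before a job can read it. The following is a minimal sketch of an upload helper, assuming you pass in a boto3 S3 client. The function name `upload_input` and the bucket and key in the usage comment are hypothetical placeholders and are not part of this library.

```python
def upload_input(s3_client, local_path, bucket, key):
    """Upload a local file to S3 and return its s3:// URI.

    s3_client is expected to be a boto3 S3 client, for example
    boto3.client('s3', region_name='us-west-2').
    """
    # upload_file is the boto3 S3 client method for managed uploads
    s3_client.upload_file(Filename=local_path, Bucket=bucket, Key=key)
    return f"s3://{bucket}/{key}"

# Usage (placeholder bucket and key):
#   import boto3
#   upload_input(boto3.client('s3'), 'input.csv', 'my-bucket', 'uploads/input.csv')
```

Taking the client as a parameter keeps the helper independent of how credentials and region are configured.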

#!pip install trifacta
import trifacta

If you need an access token, you can generate it as follows:

#Step 1: Connect to Trifacta by providing the URL and API Access Token
t = trifacta.Client('http://partnerdemo.amer.trifacta.net:3005', 'YOUR_ACCESS_TOKEN')
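To keep the access token out of notebooks and scripts, one option is to read the connection settings from environment variables. This is only a sketch: the variable names `TRIFACTA_URL` and `TRIFACTA_TOKEN` and the helper `trifacta_settings` are assumptions made for this example, and the trifacta package does not read them itself.

```python
import os

def trifacta_settings(env=os.environ):
    """Return (url, token) read from environment variables.

    TRIFACTA_URL and TRIFACTA_TOKEN are names chosen for this sketch.
    """
    url = env.get("TRIFACTA_URL", "http://partnerdemo.amer.trifacta.net:3005")
    token = env.get("TRIFACTA_TOKEN")
    if not token:
        raise RuntimeError("Set TRIFACTA_TOKEN to your API access token")
    return url, token

# Usage, mirroring Step 1:
#   import trifacta
#   t = trifacta.Client(*trifacta_settings())
```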

Get the wrangled dataset ID from the URL in the Trifacta UI.
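If you would rather not copy the ID by hand, it can be pulled out of a pasted URL. The sketch below assumes that the ID is the last all-digit segment of the URL path. The exact URL layout may differ between Trifacta versions, and both the helper name and the example URL are made up for illustration.

```python
from urllib.parse import urlparse

def wrangled_dataset_id(url):
    """Return the last all-digit path segment of a URL as an int."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    for segment in reversed(segments):
        if segment.isdecimal():
            return int(segment)
    raise ValueError(f"No numeric ID found in {url!r}")

# Hypothetical URL; check the result against what the UI shows
dataset_id = wrangled_dataset_id("http://example.com:3005/data/12/23")
```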

Make sure that you have run the job manually at least once.

Note the output path, and be sure to set the publishing action to "replace" so that each run writes to the same file.


#Step 2: Run the job
t.run_job(23)
About to run job
{'sessionId': '9d339e65-8898-4165-871b-b9db848dc099', 'reason': 'JobStarted', 'jobGraph': {'vertices': [76, 77], 'edges': [{'source': 76, 'target': 77}]}, 'id': 42, 'jobs': {'data': [{'id': 76}, {'id': 77}]}}
2020-02-25 11:19:58.508231 InProgress
2020-02-25 11:20:03.700189 InProgress
2020-02-25 11:20:08.887794 Complete
True
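The dictionary printed by run_job can be useful for logging or for looking a run up later. The sketch below pulls the IDs out of the response shown above, used here as literal data. The helper name is hypothetical, and reading the top-level `id` as the job group ID is an inference from the output, not something this library documents.

```python
def summarize_job_response(response):
    """Extract the top-level ID, the individual job IDs, and the reason."""
    return {
        "job_group_id": response["id"],  # assumed to identify the whole run
        "job_ids": [job["id"] for job in response["jobs"]["data"]],
        "reason": response["reason"],
    }

# The response printed in the run above
response = {
    'sessionId': '9d339e65-8898-4165-871b-b9db848dc099',
    'reason': 'JobStarted',
    'jobGraph': {'vertices': [76, 77], 'edges': [{'source': 76, 'target': 77}]},
    'id': 42,
    'jobs': {'data': [{'id': 76}, {'id': 77}]},
}
summary = summarize_job_response(response)
print(summary)
```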
%env AWS_PROFILE=trifacta_master_trial
env: AWS_PROFILE=trifacta_master_trial
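`%env` is an IPython magic and is not available in a plain Python script. A rough equivalent, assuming the same profile name, is to set the variable through `os.environ` before any boto3 client is created; boto3 reads `AWS_PROFILE` when it builds a session. The helper name is hypothetical.

```python
import os

def use_aws_profile(profile_name):
    """Select the AWS credentials profile for clients created afterwards."""
    os.environ["AWS_PROFILE"] = profile_name
    return os.environ["AWS_PROFILE"]

active_profile = use_aws_profile("trifacta_master_trial")
```

Alternatively, `boto3.Session(profile_name=...)` selects a profile without touching the process environment.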
#Step 3: Download results to a CSV file and view them in a pandas DataFrame
import boto3
import pandas as pd

# Credentials come from the AWS profile selected above
s3 = boto3.client('s3', region_name='us-west-2')
s3.download_file(Bucket='trifacta-partnerdemo-trifactabucket-kkcpnw234feu',
                 Key='trifacta/queryResults/admin@trifacta.local/MarketingAnalytics.csv',
                 Filename='MarketingAnalytics.csv')
df = pd.read_csv('MarketingAnalytics.csv')
df.head()
  | user_id | customerkey | event_type | event_subtype | Date | advertiser_id | creative_id | url | product_id | domain_url | ... | customeraccount_number | customerphone | customeraddress | cusotmerstate | customerzipcode | customercountry | socialmedia | totalsale | Outlier_Identifier | currencykey
0 | 1126310400000-424 | 1126310400000-424 | click | click | 10-19-2005 | 164332 | 543027 | http://zdnet.com/praesent/lectus/vestibulum/qu... | 1124064000000-475 | zdnet | ... | 310170445527596 | (817)718-7309 | 156 Cozy Berry Arc | CA | 78710 | USA | deneleaf | 7004.54 | False | 1
1 | 1229126400000-20 | 1229126400000-20 | click | click | 08-17-2009 | 164332 | 252030 | http://hostgator.com/a/feugiat.js?pid=12331008... | 1233100800000-528 | hostgator | ... | 310150240507900 | (469)201-1812 | 3641 Euismod Avenue | CA | 10769 | USA | kinphanng | 4853.35 | False | 1
2 | 1126828800000-518 | 1126828800000-518 | view | view | 04-05-2006 | 164332 | 562765 | http://fc2.com/convallis/duis/consequat/dui/ne... | 1121904000000-509 | fc2 | ... | 310170133079761 | (443)585-1769 | Ap #543-7410 Accumsan Rd. | CA | 92845 | USA | waldeelbailarin | 6885.15 | False | 1
3 | 1130112000000-336 | 1130112000000-336 | click | click | 04-05-2006 | 164332 | 466942 | http://biblegateway.com/est/phasellus/sit/amet... | 1130284800000-343 | biblegateway | ... | 310120073380564 | (215)669-3055 | 900-8123 Aliquam Av. | CA | 85517 | USA | charlrey | 2593.31 | False | 1
4 | 1121990400000-216 | 1121990400000-216 | view | view | 09-27-2005 | 164332 | 400316 | https://zdnet.com/elementum/nullam/varius/null... | 1108339200000-416 | zdnet | ... | 310160496868669 | 301 742 1112 | 164 Cozy Anchor Rd | CA | 60101 | USA | scottylago | 3958.25 | False | 1

5 rows × 31 columns
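For the script scenario listed under Usage Scenarios, the connect and run steps can be wrapped in a small command-line entry point. This is a sketch under stated assumptions: the flag names and the functions `build_parser` and `main` are invented for this example, only `trifacta.Client(url, token)` and `run_job(id)` come from the steps above, and treating a `True` return value as success is inferred from the notebook output.

```python
import argparse

def build_parser():
    """Build the command-line interface (flag names are illustrative)."""
    parser = argparse.ArgumentParser(description="Run a Trifacta job")
    parser.add_argument("--url", required=True, help="Trifacta instance URL")
    parser.add_argument("--token", required=True, help="API access token")
    parser.add_argument("--dataset-id", type=int, required=True,
                        help="Wrangled dataset ID to run")
    return parser

def main(argv=None):
    args = build_parser().parse_args(argv)
    import trifacta  # imported late so --help works without the package
    client = trifacta.Client(args.url, args.token)
    # run_job returned True in the run above; mapping that to exit
    # code 0 is an assumption of this sketch
    ok = client.run_job(args.dataset_id)
    return 0 if ok else 1

# In a script file, finish with:
#   raise SystemExit(main())
```

Returning an exit code lets an external scheduler detect a failed run.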
