Metaflow: More Data Science, Less Engineering
Metaflow
Metaflow is a human-friendly library that helps scientists and engineers build and manage real-life data science projects. Metaflow was originally developed at Netflix to boost the productivity of data scientists who work on a wide variety of projects, from classical statistics to state-of-the-art deep learning.
For more information, see Metaflow's website and documentation.
From prototype to production (and back)
Metaflow provides a simple, friendly API that covers the foundational needs of ML, AI, and data science projects:
- Rapid local prototyping, support for notebooks, and built-in experiment tracking and versioning.
- Horizontal and vertical scalability to the cloud, utilizing both CPUs and GPUs, and fast data access.
- One-click deployments to highly available production orchestrators.
Getting started
Getting up and running is easy. If you don't know where to start, click the link to get started in your Metaflow sandbox, where you can run code and explore Metaflow in seconds.
Installing Metaflow in your Python environment
To install Metaflow into your local environment, install it from PyPI:
pip install metaflow
Alternatively, you can install from Conda:
conda install -c conda-forge metaflow
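After either install command, one way to confirm that the package is visible to your interpreter is to query its installed version. The following is a minimal sketch that uses only the Python standard library; the helper name `installed_version` is hypothetical and is not part of Metaflow.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    `installed_version` is an illustrative helper, not a Metaflow API.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("metaflow")
    if version is None:
        print("metaflow is not installed in this environment")
    else:
        print(f"metaflow {version} is installed")
```

Run the script with the same interpreter you installed into, so that it checks the right environment.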
Resources
Slack
An active community of data scientists and ML engineers discussing the ins and outs of applied machine learning. Get answers to your MLOps questions in minutes.
Tutorials
Demonstrations of Metaflow in real-world data science contexts. Beginner tutorials are linked below, and the budding Metaflow power user can find more advanced content by exploring here.
- Introduction to Metaflow
- Natural Language Processing with Metaflow
- Computer Vision with Metaflow
- Recommender Systems with Metaflow
Generative AI and LLM use cases
- Parallelizing Stable Diffusion for Production Use Cases
- Whisper with Metaflow on Kubernetes
- Training a Large Language Model With Metaflow, Featuring Dolly
Guides for data scientists
Short posts written to unblock data scientists without breaking their focus. Check here for answers to common Metaflow questions written in a Stack Overflow Q&A style.
Case studies and best practices
This section is a sample of Metaflow content and is not updated frequently. To find the latest Metaflow case studies and best practices, head over to the Outerbounds blog.
Post title | Patterns | Complementary tools |
---|---|---|
Fast Data: Loading Tables From S3 At Lightning Speed | Optimizing I/O for data transfer in ML systems | Apache Arrow |
Case Study: MLOps for FinTech using Metaflow | Versioning, CI/CD, and Why Metaflow? | Snowflake, TensorFlow, MLflow, Optuna, Arize |
Scaling Media ML at Netflix | Feature store, Multi-GPU, Model Serving, Vector Search | Ray |
Better Airflow with Metaflow | ML Orchestration, Data Engineering Workflows | Airflow |
Accelerating ML at CNN | ML experimentation platform | AWS Batch, Terraform |
Developing Safe and Reliable ML products at 23andMe | Privacy, compliance, data security, and testing ML systems | MLflow, Jenkins, AWS Fargate |
Case Study: MLOps for NLP-powered Media Intelligence using Metaflow | Testing, optimizing, and deploying ML workflows | HuggingFace, Neptune, Gradio, PyTorch, ONNX |
Machine Learning with the Modern Data Stack: A Case Study | End-to-end ML with Python and SQL | dbt, Snowflake, CometML, Sagemaker |
Metaflow Deployment Guides
To set up and operate the full stack of ML/data science infrastructure for Metaflow on your own infrastructure, see these guides for engineers. This link will help you navigate the decision for deploying Metaflow in your organization's cloud accounts.
If you are looking for a managed Metaflow offering, see the Outerbounds platform.
Effective Data Science Infrastructure: How to make data scientists productive
A book about data science infrastructure, with many Metaflow tales included! You can find all code examples in this repository.
Get in touch
There are several ways to get in touch with us:
- Slack
- Open an issue at: https://github.com/Netflix/metaflow
Contributing
We welcome contributions to Metaflow. Please see our contribution guide for more details.
Hashes for metaflow-2.9.2-py2.py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 3240ad71c9a26308a2aec03be99b858782f4952426a46eab6b7867370f4bd4de |
MD5 | 16321099dc6373de7e92b6759879d4e1 |
BLAKE2b-256 | 57c9b011e19800864d1b5e83aad16edb57bc2bfdea50283489ef9f444d619d44 |