DAGs for LLM interactions

These details have not been verified by PyPI

Project description

Diagraph

Diagraph represents Large Language Model (LLM) interactions as a graph, which makes it easy to build, edit, and re-execute a chain of interactions.

Quickstart

pip install diagraph

import openai
from diagraph import Diagraph, Depends, prompt, llm

openai.api_key = 'sk-<OPENAI_TOKEN>'

@prompt
def tell_me_a_joke():
  return 'Computer! Tell me a joke about tomatoes.'

@prompt
def explanation(joke: str = Depends(tell_me_a_joke)) -> str:
  return f'Explain why the joke "{joke}" is funny.'

dg = Diagraph(explanation).run()
print(dg.result) # 'The joke is a play on words and concepts. There are two main ideas that make it humorous...'
dg # in a Jupyter notebook, returning the diagraph renders its visualization

Quickstart visualization

Features

👩‍💻 Build a graph by Expressing Dependencies as Parameters

def fetch_table_schemas():
  ...

def generate_sql(user_query: str, table_schemas = Depends(fetch_table_schemas)):
  ...

🏃‍♀️ Cache and rerun parts of the graph, saving time and money

dg[generate_sql].run()

🖼️ Visualize your graph in a Jupyter Notebook

Visualize a diagraph

Motivation

Who is this for?

Diagraph is primarily for Python developers looking to build software that leverages LLM interactions. It is designed as a low-level tool for crafting interactions by hand.

Why would I want this?

Refactoring is Easy: Diagraph simplifies code restructuring. Specify dependencies using parameters for clean, readable, and refactorable code.
Code Faster, Save Money: Diagraph's ability to cache and rerun specific parts of the code saves on execution time and cuts down costs.
Quick Edits: Edit interactions on the fly. Rewrite LLM results as well as functions.
Bring Your Own Code: Use as much or as little as you want.
See The System: Get a straightforward view of the graph with a built-in Jupyter visualization tool.

Usage

Building a Diagraph

Consider a scenario where you need an LLM to generate SQL code based on a user query, incorporating information about table schemas. Traditionally, you might structure it like this:

import openai

def fetch_table_schemas():
  # fetch table schemas from the local database
  ...

def prompt_to_generate_sql(user_query: str):
  table_schemas = fetch_table_schemas()
  # legacy (pre-1.0) openai SDK call; `model` is required, and the model name here is an assumption
  return openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{
      "role": "user",
      "content": f"Generate a SQL query for the user query: {user_query}. The table schemas are: {table_schemas}"
    }]
  )

print(prompt_to_generate_sql('Fetch all active users for the past 30 days'))

Now, imagine you want to run another function concurrently: you would have to rewrite your prompt_to_generate_sql function to call both dependent functions concurrently, ensuring you handle error cases, etc.
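To make that cost concrete, here is a minimal sketch of the hand-written concurrent version using only the standard library. The second dependency (fetch_example_queries) and all return values are hypothetical stand-ins, and no LLM is called:

```python
import asyncio

# Hypothetical stand-ins for real work; none of this is Diagraph's API.
async def fetch_table_schemas() -> str:
  await asyncio.sleep(0)  # pretend to query the database
  return 'users(id, last_seen_at)'

async def fetch_example_queries() -> str:
  await asyncio.sleep(0)  # pretend to load example queries
  return 'SELECT 1;'

async def build_sql_prompt(user_query: str) -> str:
  # The caller has to know about both dependencies and schedule them itself.
  results = await asyncio.gather(
    fetch_table_schemas(),
    fetch_example_queries(),
    return_exceptions=True,
  )
  # Error handling is manual too: surface the first failure, if any.
  for result in results:
    if isinstance(result, Exception):
      raise result
  table_schemas, examples = results
  return (
    f'Generate a SQL query for the user query: {user_query}. '
    f'The table schemas are: {table_schemas}. Example queries: {examples}'
  )

manual_prompt = asyncio.run(build_sql_prompt('Fetch all active users for the past 30 days'))
print(manual_prompt)
```

Every additional dependency means editing build_sql_prompt again, which is the bookkeeping a graph-based approach aims to remove.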

Diagraph lets you express your functions as a graph:

import openai
from diagraph import Diagraph, Depends

def fetch_table_schemas():
  # fetch table schemas from the local database
  ...

def generate_sql(user_query: str, table_schemas = Depends(fetch_table_schemas)):
  # legacy (pre-1.0) openai SDK call; `model` is required, and the model name here is an assumption
  return openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{
      "role": "user",
      "content": f"Generate a SQL query for the user query: {user_query}. The table schemas are: {table_schemas}"
    }]
  )

print(Diagraph(generate_sql).run('Fetch me all active users for the past 30 days').result)

Visualization of Diagraph

Here are some key points:

Dependencies are expressed by passing them as parameters using Depends(fn), a familiar pattern if you've used FastAPI.
Diagraph(fn) defines the graph. You provide the terminal nodes (final functions). You can pass multiple terminal nodes and get a tuple in return.
run(*args) accepts user input passed to each function. Multiple arguments can be provided.
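For intuition about how dependencies declared as default parameters can be turned into an execution order, here is a toy resolver built on the standard inspect module. It is not Diagraph's implementation: ToyDepends and resolve are hypothetical names, and the sketch runs sequentially with no LLM calls.

```python
import inspect

class ToyDepends:
  # Toy marker (not Diagraph's Depends): records which function supplies a parameter.
  def __init__(self, fn):
    self.fn = fn

def resolve(fn, *args, _cache=None):
  # Run fn after recursively running every function named in its ToyDepends defaults.
  cache = {} if _cache is None else _cache
  if fn in cache:
    return cache[fn]  # each node runs at most once per resolve call
  kwargs = {}
  positional = list(args)  # every function sees the same user input
  for name, param in inspect.signature(fn).parameters.items():
    if isinstance(param.default, ToyDepends):
      kwargs[name] = resolve(param.default.fn, *args, _cache=cache)
    elif positional:
      kwargs[name] = positional.pop(0)
  cache[fn] = fn(**kwargs)
  return cache[fn]

def fetch_table_schemas():
  return 'users(id, last_seen_at)'

def generate_sql(user_query: str, table_schemas=ToyDepends(fetch_table_schemas)):
  return f'{user_query} | {table_schemas}'

toy_result = resolve(generate_sql, 'Fetch all active users')
print(toy_result)
```

The terminal function is the only entry point the caller names; everything upstream is discovered from the signatures.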

Expressing dependencies as a graph allows for easy rearrangement of execution order. Suppose you want to add a step to formalize a user's query. Make it a dependency of fetch_table_schemas:

def formalize_query(user_query: str):
  return f'The user has provided the following query: {user_query}. Formalize it, fill it out, etc.'

def fetch_table_schemas(formalized_query = Depends(formalize_query)):
  # fetch table schemas from the local database
  ...

Visualization of Diagraph with extra function

Diagraph automatically inserts the new function at the top of the graph. If fetch_table_schemas doesn't need the formalized query, simply move the dependency to generate_sql:

def formalize_query(user_query: str):
  return f'The user has provided the following query: {user_query}. Formalize it, fill it out, etc.'

def fetch_table_schemas():
  # fetch table schemas from the local database
  ...

def generate_sql(user_query: str, formalized_query = Depends(formalize_query), table_schemas = Depends(fetch_table_schemas)):
  ...

Visualization of Diagraph with extra function

Now the first two functions execute concurrently, and the final generate_sql function receives both of their results as arguments.
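Which nodes are free to run side by side follows from the shape of the graph alone. As an illustration (this uses Python's standard graphlib module, not Diagraph itself), the three-node graph above can be grouped into layers whose members have no unmet dependencies:

```python
from graphlib import TopologicalSorter

# Each key maps to the set of nodes it depends on.
dependencies = {
  'generate_sql': {'formalize_query', 'fetch_table_schemas'},
  'formalize_query': set(),
  'fetch_table_schemas': set(),
}

sorter = TopologicalSorter(dependencies)
sorter.prepare()

layers = []
while sorter.is_active():
  ready = sorted(sorter.get_ready())  # nodes whose dependencies are all done
  layers.append(ready)
  sorter.done(*ready)

print(layers)
# [['fetch_table_schemas', 'formalize_query'], ['generate_sql']]
```

Nodes in the same layer do not depend on one another, so a scheduler may run them concurrently.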

Re-execution

Encountering a syntax error at the end of a long chain of LLM interactions can be exasperating.

Diagraph comes to the rescue by enabling selective re-execution of parts of your graph, leveraging cached results from prior runs. Consider the following example:

def formalize_query(user_query: str):
  return f'The user has provided the following query: {user_query}. Formalize it, fill it out, etc.'

def fetch_table_schemas():
  # fetch table schemas from the local database
  ...

def generate_sql(user_query: str, formalized_query = Depends(formalize_query), table_schemas = Depends(fetch_table_schemas)):
  ...

dg = Diagraph(generate_sql)

Visualization of Diagraph with extra function

Assume both formalize_query and fetch_table_schemas executed successfully, but generate_sql encountered a failure. Rerun that specific function with:

dg[generate_sql].run()

Selecting a function with dg[fn] tells Diagraph to rerun the subgraph that starts at that node and continues downstream. Ancestors are not rerun; their cached results are reused automatically. To rerun several nodes at once, use dg[fn1, fn2, fn3].run().
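The rerun rule can be sketched in a few lines of plain Python. This is a toy model under stated assumptions, not Diagraph's internals: deps, execute, descendants, and run are hypothetical names, and nodes just return strings instead of calling an LLM.

```python
# Toy graph: node -> nodes it depends on, listed in dependency order.
deps = {
  'formalize_query': [],
  'fetch_table_schemas': [],
  'generate_sql': ['formalize_query', 'fetch_table_schemas'],
}
calls = []  # records which nodes actually executed

def execute(node, inputs):
  calls.append(node)
  return f'{node}({", ".join(inputs)})'

def descendants(start):
  # start plus every node that depends on it, directly or indirectly
  found = {start}
  changed = True
  while changed:
    changed = False
    for node, parents in deps.items():
      if node not in found and found.intersection(parents):
        found.add(node)
        changed = True
  return found

def run(cache, start=None):
  # With no start node everything runs; otherwise only start and its descendants do.
  stale = set(deps) if start is None else descendants(start)
  for node in deps:
    if node in stale:
      cache[node] = execute(node, [cache[p] for p in deps[node]])
  return cache

cache = run({})                   # first run executes all three nodes
calls.clear()
run(cache, start='generate_sql')  # rerun only the failed node
print(calls)
# ['generate_sql']
```

Starting from formalize_query instead would rerun formalize_query and generate_sql, while fetch_table_schemas would still come from the cache.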

Nodes can be edited on the fly. For instance, modify a cached result:

dg[formalize_query].result = 'My fake formalized result'

dg[generate_sql].run()

Functions can also be edited on the fly:

def new_formalize_query():
  ...

dg[formalize_query] = new_formalize_query

dg[formalize_query].run()

When editing a node, use the original function as a key - here, formalize_query.

Visualizations

Larger graphs can be complicated to visualize. If you're in a Jupyter notebook, Diagraph provides a handy visualization tool for inspecting your graphs.

Simply return a diagraph to view it:

Visualize a diagraph

You can open a node to view its function, prompt, and result:

An expanded node

Prompt

Diagraph accepts regular functions, but functions decorated with the @prompt decorator sprout superpowers.

Decorate a function with @prompt and return a plain string:

@prompt
def formalize_query(user_query: str):
  return f'The user has provided the following query: {user_query}. Formalize it, fill it out, etc.'

The returned string is automatically passed as a prompt to the LLM (default is OpenAI GPT-3.5-turbo).

@prompt accepts additional arguments:

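from diagraph import OpenAI, prompt

def handle_log(event, data):
  if event == 'start':
    print('*' * 20)
  elif event == 'end':
    print('\n')
  else:
    print(data, end='')

def error_handler(e: Exception):
  print(e)
  raise e

@prompt(
  llm=OpenAI('gpt-4'),
  log=handle_log,
  error=error_handler
)
def formalize_query(user_query: str):
  return f'The user has provided the following query: {user_query}. Formalize it, fill it out, etc.'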

These same arguments can be passed to the Diagraph constructor as well, to apply to all nodes:

Diagraph(terminal_node, llm=OpenAI('gpt-4'), error=error_handler, log=handle_log)

Or can be set at a global level:

Diagraph.llm = OpenAI('gpt-4')
Diagraph.error = error_handler
Diagraph.log = handle_log

Settings defined on a node take precedence over those passed to the Diagraph constructor, which in turn take precedence over the global settings.
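As an illustration of that lookup order, here is a toy resolver in which the first explicitly set value wins. ToyGraph and resolve_llm are hypothetical names used only for this sketch, and plain strings stand in for LLM objects; Diagraph's own lookup may be implemented differently.

```python
_UNSET = object()  # sentinel meaning "not configured at this level"

class ToyGraph:
  llm = _UNSET  # global (class-level) default

  def __init__(self, llm=_UNSET):
    self._llm = llm  # graph-level setting

  def resolve_llm(self, node_llm=_UNSET):
    # First explicitly set value wins: node, then graph, then global.
    for candidate in (node_llm, self._llm, type(self).llm):
      if candidate is not _UNSET:
        return candidate
    raise LookupError('no llm configured at any level')

ToyGraph.llm = 'global-model'
graph = ToyGraph(llm='graph-model')

print(graph.resolve_llm(node_llm='node-model'))  # node-model
print(graph.resolve_llm())                       # graph-model
print(ToyGraph().resolve_llm())                  # global-model
```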

License

MIT

Project details

Release history

0.4.6

Dec 2, 2023

0.4.5

Dec 2, 2023

0.4.4

Dec 1, 2023

0.4.3

Nov 29, 2023

0.4.2

Nov 29, 2023

0.4.1

Nov 28, 2023

0.4.0

Nov 26, 2023

0.3.3

Nov 26, 2023

0.3.2

Nov 26, 2023

0.3.1

Nov 26, 2023

0.3.0

Nov 26, 2023

This version

0.2.0

Nov 14, 2023

0.1.11

Nov 14, 2023

0.1.10

Nov 9, 2023

0.1.9

Nov 9, 2023

0.1.8

Nov 8, 2023

0.1.7

Nov 4, 2023

0.1.6

Nov 4, 2023

0.1.5

Nov 4, 2023

0.1.4

Nov 3, 2023

0.1.4rc3 pre-release

Nov 3, 2023

0.1.3

Nov 3, 2023

0.1.2

Oct 31, 2023

0.1.1

Oct 31, 2023

0.1.0

Oct 30, 2023

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

diagraph-0.2.0.tar.gz (949.4 kB)

Uploaded Nov 14, 2023 Source

Built Distribution

diagraph-0.2.0-py3-none-any.whl (961.1 kB)

Uploaded Nov 14, 2023 Python 3

Hashes for diagraph-0.2.0.tar.gz

Algorithm	Hash digest
SHA256	`c78e72ab84bcca1443b833dfb87fe734fcfb12280c1d53726023f959825ecf28`
MD5	`422feea2ab5c9f23f8195cc1b2cf4cc9`
BLAKE2b-256	`347f16371eb9443c1375f715b86313a3cffa0300b661a28395c189e6f5c36fa1`

Hashes for diagraph-0.2.0-py3-none-any.whl

Algorithm	Hash digest
SHA256	`c3b4c08d4db4c2aa514201cd072f68579568ad038b81c0703f6f64fa31531bd1`
MD5	`29d2c57de49a61f6f44d3f72cc1a2616`
BLAKE2b-256	`f8b25ddeb2455aa4b43aabc37113fc27e666679229d2bb6c994868f167e78f74`