Python bindings for the Neo4j Graph Data Science library

gdsclient

gdsclient is a Python wrapper API for working with the Neo4j Graph Data Science (GDS) library. It lets users write pure Python code to project graphs, run algorithms, and define and use machine learning pipelines in GDS.

The API is designed to mimic the GDS Cypher procedure API in Python code. It abstracts the necessary operations of the Neo4j Python driver to offer a simpler surface.

Please leave any feedback as issues on the source repository. Happy coding!

NOTE

This is a work in progress and several GDS features are known to be missing or not working properly (see Known limitations below). Further, this library targets GDS versions 2.0+ (not yet released) and as such may not work with older versions.

Installation

To install the latest deployed version of gdsclient, simply run:

pip install gdsclient

Usage

What follows is a high-level description of some of the operations supported by gdsclient. For extensive documentation of all operations supported by GDS, please refer to the GDS Manual.

Extensive end-to-end examples in ready-to-run Jupyter notebooks can be found in the examples source directory.

Imports and setup

The library wraps the Neo4j Python driver with a GraphDataScience object through which most calls to GDS will be made.

from neo4j import GraphDatabase
from gdsclient import Neo4jQueryRunner, GraphDataScience

# Replace Neo4j Python driver settings according to your setup
URI = "bolt://localhost:7687"
driver = GraphDatabase.driver(URI)
gds = GraphDataScience(Neo4jQueryRunner(driver))
gds.set_database("my-db")  # (Optional) Use a specific Neo4j database
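
Connection settings usually differ between environments. As a minimal sketch, the helper below reads them from environment variables instead of hard-coding them. The variable names (NEO4J_URI, NEO4J_USER, NEO4J_PASSWORD) and the function connection_settings are assumptions made for illustration; gdsclient does not define them.

```python
import os


def connection_settings(env=None):
    """Return (uri, auth) for the Neo4j Python driver.

    The environment variable names are hypothetical; adapt them to your setup.
    """
    env = os.environ if env is None else env
    uri = env.get("NEO4J_URI", "bolt://localhost:7687")
    user = env.get("NEO4J_USER")
    password = env.get("NEO4J_PASSWORD")
    # Only build credentials when both are set; auth=None means no authentication.
    auth = (user, password) if user and password else None
    return uri, auth


uri, auth = connection_settings()
# With the driver installed, these values feed into the setup shown above:
#   driver = GraphDatabase.driver(uri, auth=auth)
#   gds = GraphDataScience(Neo4jQueryRunner(driver))
```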

Projecting a graph

Supposing that we have some graph data in our Neo4j database, we can project the graph into memory.

# Optionally we can estimate memory of the operation first
res = gds.graph.project.estimate("*", "*")
# "requiredMemory" is a human-readable string; "bytesMax" is the numeric upper bound
assert res[0]["bytesMax"] < 1e12

G = gds.graph.project("graph", "*", "*")

The returned G is a Graph object: a client-side handle for the projection that lives in memory on the server.

The analogous calls for Cypher-based projection, gds.graph.project.cypher and gds.graph.project.cypher.estimate, are also supported.
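
As a rough illustration of Cypher-based projection, the sketch below estimates memory first and only projects if the estimate fits a budget. The helper project_with_cypher and its max_bytes parameter are hypothetical. It assumes the Cypher calls take the same arguments as the corresponding GDS procedures, that results are indexable as res[0][...], and that the projection call returns a Graph as the native one does.

```python
def project_with_cypher(gds, name, max_bytes=1e12):
    """Project a graph from Cypher queries if the memory estimate fits the budget."""
    # GDS expects the node query to return "id" and the
    # relationship query to return "source" and "target".
    node_query = "MATCH (n) RETURN id(n) AS id"
    rel_query = "MATCH (a)-->(b) RETURN id(a) AS source, id(b) AS target"

    # Assumption: rows are indexable by column name; "bytesMax" is the
    # numeric upper bound reported by GDS estimate procedures.
    res = gds.graph.project.cypher.estimate(node_query, rel_query)
    if res[0]["bytesMax"] >= max_bytes:
        raise MemoryError(f"estimated {res[0]['bytesMax']} bytes exceeds the budget")

    # Assumption: like gds.graph.project, this returns a Graph object.
    return gds.graph.project.cypher(name, node_query, rel_query)
```

Passing gds in as a parameter keeps the helper free of connection details, so it can be exercised against a stub before it is pointed at a real server.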

Running algorithms

We can take a projected graph, represented to us by a Graph object named G, and run algorithms on it.

# Optionally we can estimate memory of the operation first (if the algo supports it)
res = gds.pageRank.mutate.estimate(G, tolerance=0.5, mutateProperty="pagerank")
assert res[0]["bytesMax"] < 1e12

# mutate mode takes "mutateProperty"; "writeProperty" belongs to write mode
res = gds.pageRank.mutate(G, tolerance=0.5, mutateProperty="pagerank")
assert res[0]["nodePropertiesWritten"] == G.node_count()

These calls take one positional argument and a number of keyword arguments depending on the algorithm. The first (positional) argument is a Graph, and the keyword arguments map directly to the algorithm's configuration map.
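
To make the mapping concrete, the sketch below shows how such a call could correspond to a Cypher procedure call: the dotted Python name becomes the procedure name and the keyword arguments become the configuration map. The function to_cypher_call and the parameter names are hypothetical, and this is not necessarily how gdsclient builds its queries internally.

```python
def to_cypher_call(procedure, graph_name, **config):
    """Build a Cypher statement and its parameters for a GDS procedure call."""
    # The keyword arguments are passed through unchanged as the configuration map.
    query = f"CALL {procedure}($graph_name, $config)"
    params = {"graph_name": graph_name, "config": config}
    return query, params


query, params = to_cypher_call(
    "gds.pageRank.mutate", "graph", tolerance=0.5, mutateProperty="pagerank"
)
print(query)   # CALL gds.pageRank.mutate($graph_name, $config)
print(params["config"])
```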

The other algorithm execution modes - stats, stream and write - are also supported via analogous calls.
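
For instance, stream mode returns one row per node rather than a summary. The following sketch picks the highest-scoring nodes from a PageRank stream. The helper top_ranked is hypothetical, and it assumes that stream results are rows indexable by column name, with the "nodeId" and "score" columns that the GDS PageRank stream procedure yields.

```python
def top_ranked(gds, G, k=10):
    """Return the k highest-scoring (nodeId, score) pairs from PageRank."""
    # Assumption: each row can be indexed by column name, matching the
    # res[0]["..."] access pattern used in the examples above.
    rows = gds.pageRank.stream(G)
    ranked = sorted(rows, key=lambda row: row["score"], reverse=True)
    return [(row["nodeId"], row["score"]) for row in ranked[:k]]
```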

Most algorithms are supported this way, but not all of them yet. Please see Known limitations below for more on this.

The Graph object

In this library, graphs projected onto server-side memory are represented by Graph objects. There are convenience methods on the Graph object that let us extract information about our projected graph. Some examples are (where G is a Graph):

# Get the graph's node count
n = G.node_count()

# Get a list of all relationship properties present on
# relationships of the type "myRelType"
rel_props = G.relationship_properties("myRelType")

# Drop the projection represented by G
G.drop()
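
Because projections occupy server memory until they are dropped, it can be convenient to tie a projection's lifetime to a block of code. The context manager below is a sketch of that idea; projected is a hypothetical helper, not part of gdsclient, and it relies only on gds.graph.project and G.drop() as shown above.

```python
from contextlib import contextmanager


@contextmanager
def projected(gds, name, node_spec="*", rel_spec="*"):
    """Project a graph, yield its Graph object, and drop it afterwards."""
    G = gds.graph.project(name, node_spec, rel_spec)
    try:
        yield G
    finally:
        # Runs even if the body raises, so the projection is not left behind.
        G.drop()


# Usage, given a configured gds object:
#   with projected(gds, "graph") as G:
#       print(G.node_count())
```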

Graph catalog utils

All procedures from the GDS Graph catalog are supported with gdsclient. Some examples are (where G is a Graph):

res = gds.graph.list()
assert len(res) == 1  # Exactly one graph is projected

res = gds.graph.streamNodeProperties(G, "rank")
assert len(res) == G.node_count()

In addition, gdsclient provides a call of its own, gds.graph.get, which has no counterpart in GDS. It takes a graph name and returns a Graph object if a projection with that name exists in the user's graph catalog. This makes it possible to obtain a Graph for an existing projection without projecting again.
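
One way to use this is a "get or project" helper that reuses a projection when it is already in the catalog. The sketch below is hypothetical: get_or_project is not part of gdsclient, and it assumes that gds.graph.list returns rows indexable by column name, with the "graphName" column that the GDS procedure yields.

```python
def get_or_project(gds, name, node_spec="*", rel_spec="*"):
    """Return a Graph for `name`, projecting it only if it does not exist yet."""
    # Assumption: each catalog row exposes the graph's name under "graphName".
    existing = {row["graphName"] for row in gds.graph.list()}
    if name in existing:
        return gds.graph.get(name)
    return gds.graph.project(name, node_spec, rel_spec)
```

Checking the catalog first avoids depending on how gds.graph.get behaves when no projection of that name exists.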

Known limitations

Several operations are known to not yet work with gdsclient:

  • Path finding algorithms
  • Topological link prediction
  • Supervised machine learning (GraphSAGE, Link prediction, Node classification)
  • Progress logging and system monitoring
  • Some utility functions

License

gdsclient is licensed under the Apache Software License version 2.0. All content is copyright © Neo4j Sweden AB.

Acknowledgements

This work has been inspired by the great work done in the following libraries:
