AgensGraph Query

Graph Query

Introduction

To retrieve and manipulate graph data, AgensGraph supports the Cypher query language. Cypher is a declarative language similar to SQL. Cypher is easy to learn since its syntax visually describes the patterns found in graphs.

This guide briefly explains how to write Cypher queries using an example graph.

Creating an Example Graph

AgensGraph can store multiple graphs in a single database. However, Cypher has no way of discerning multiple graphs. Therefore, AgensGraph supports additional Data Definition Languages and variables to create and manage graphs using Cypher.

The following statements create a graph called network and set it as a current graph.

CREATE GRAPH network;
SET graph_path = network;

In this example, the graph_path variable is explicitly set to network. However, if graph_path is not set before creating the graph, it will be set automatically after creating the graph.

Creating the Labels

Before creating graph data, generating a label is basic. Although this is the default, label is generated automatically when you specify label in the CREATE statement in cypher(VLABEL/ELABEL can both be created).

note: Be careful not to confuse the label with the lable. Unintentional new labels can be created.

All graph elements have one label. For vertex, if no label is specified, have ag_vertex as a default label. For edge, the label can not be omitted. ag_edge label also exists but is used for other purposes.

AgensGraph supports DDL’s to create such labels.

The following statements create a vertex label person and a edge label knows.

CREATE VLABEL person;
CREATE ELABEL knows;
CREATE (n:movie {title:'Matrix'});

Creating the Vertices and Edges

Now, we can create vertices for person and edges for knows by using Cypher’s CREATE clause. The CREATE clause creates a pattern that consists of vertices and edges. A vertex has the form: (variable:label {property: value, ...}), and a edge has: -[variable:label {property: value, ...}]-. An additional < on the leftmost side or > on the rightmost side is used to denote the direction of the edge. variable can be omitted if created vertices and edges are not to be referenced.

note: AgensGraph does not support -- grammar for an edge in a pattern because -- means a comment to the end of the line.

The following statements create three simple patterns: “Tom knows Summer”, “Pat knows Nikki” and “Olive knows Todd”.

CREATE (:person {name: 'Tom'})-[:knows {fromdate:'2011-11-24'}]->(:person {name: 'Summer'});
CREATE (:person {name: 'Pat'})-[:knows {fromdate:'2013-12-25'}]->(:person {name: 'Nikki'});
CREATE (:person {name: 'Olive'})-[:knows {fromdate:'2015-01-26'}]->(:person {name: 'Todd'});
MATCH (p:Person {name: 'Tom'}),(k:Person{name: 'Pat'}) 
CREATE (p)-[:KNOWS {fromdate:'2017-02-27'} ]->(k);

To store properties of vertices and edges, AgensGraph uses PostgreSQL’s jsonb type. Properties can have nested JSON objects as their values. Since AgensGraph uses PostgreSQL’s type system, any data type supported by PostgreSQL can be stored into the properties of vertices and edges.

Querying the Graph

Let’s retrieve the pattern we created above. Cypher has the MATCH clause to find a pattern in a graph.

The following statement finds the pattern, “A person called Tom knows a person”.

MATCH (n:person {name: 'Tom'})-[:knows]->(m:person) RETURN n.name AS n, m.name AS m;

Result :

   n   |    m
-------+----------
 "Tom" | "Summer"
 "Tom" | "Pat"
(2 rows)

Since properties are of the jsonb type, we need methods to access their property values. PostgreSQL supports those methods through operators such as ->, ->>, #>, and #>>. If a user wants to access to the property name of vertex m, one can write (m)->>name. AgensGraph offers an alternate way to access these elements. AgensGraph uses the dot operator . and bracket operators [] on vertices and edges to access property values in JSON objects and elements in JSON arrays as shown above.

The RETURN clause returns variables and its properties as a result of the query. The result is a table which has multiple matched patterns in its rows.

Variable Length Edges

Let’s consider a query that finds knows of ‘Tom’ and knows of knows. We can use the UNION clause:

MATCH (p:person {name: 'Tom'})-[:knows]->(f:person)
RETURN f.name
UNION ALL
MATCH (p:person {name: 'Tom'})-[:knows]->()-[:knows]->(f:person)
RETURN f.name;

It can also be written as:

MATCH (p:person {name: 'Tom'})-[r:knows*1..2]->(f:person)
RETURN f.name, r[1].fromdate;

A query looking for vertices located after a variable length of edge-vertex paths is typical in graph databases. *1..2 used in the edge represents such a variable-length edge. Where 1 is the minimum length of the edge and 2 is the maximum length. If you do not specify a value, the default range values are 1 and infinity.

You can also use update clauses such as CREATE, SET, REMOVE and DELETE after MATCH clauses. In the next section, we will see how data in a graph can be modified and deleted.

Manipulating the Graph

You can set properties on vertices and edges using the SET clause. If you set a null value to a property, the property will be removed.

The following statement finds the given pattern and updates the property in the matched edge.

MATCH (:person {name: 'Tom'})-[r:knows]->(:person {name: 'Summer'})
SET r.since = '2009-01-08';

To delete vertices and edges in a graph, we can use the DELETE clause.

The following statement finds a vertex and deletes it.

MATCH (n:person {name: 'Pat'}) DETACH DELETE (n);

The above example actually uses the DETACH DELETE clause to delete vertices and edges attached to the vertices all at once.

The final shape of the graph network is as follows:

MATCH (n)-[r]->(m) RETURN n.name AS n, properties(r) AS r, m.name AS m;

Result :

    n    |                         r                         |    m
---------+---------------------------------------------------+----------
 "Tom"   | {"since": "2009-01-08", "fromdate": "2011-11-24"} | "Summer"
 "Olive" | {"fromdate": "2015-01-26"}                        | "Todd"
(2 rows)

MERGE

If you need to ensure that a pattern exists in the graph, you can use MERGE. It will try to find the pattern if the pattern exists in the graph, or else it creates the pattern if it does not exist. If the pattern exists, it is treated like a MATCH clause, otherwise it is treated like a CREATE clause. MERGE is a MATCH or CREATE of the entire pattern. This means that if any element of the pattern does NOT exist, AgensGraph will create the entire pattern.

The following statement guarantees that everyone’s city exists in the graph.

CREATE VLABEL customer;
CREATE VLABEL city;

CREATE (:customer {name:'Tom', city:'santa clara'}),
       (:customer {name:'Summer ', city:'san jose'}),
       (:customer {name:'Pat', city:'santa clara'}),
       (:customer {name:'Nikki', city:'san jose'}),
       (:customer {name:'Olive', city:'san francisco'});

MATCH (a:customer)
MERGE (c:city {name:a.city});

MATCH (c:city) RETURN properties(c);

Result :

       properties        
-------------------------
 {"name": "santa clara"}
 {"name": "san jose"}
 {"name": "san francisco"}
(3 rows)

MERGE can perform SET depending on whether the pattern is MATCHed or CREATed. If it is MATCHed, it will execute ON MATCH SET clause. If it is CREATE-ed, it will execute ON CREATE SET clause.

CREATE (:customer {name:'Todd', city:'palo alto'});

MATCH (a:customer)
MERGE (c:city {name:a.city})
    ON MATCH SET c.matched = 'true'
    ON CREATE SET c.created = 'true';
    
MATCH (c:city) RETURN properties(c);

Result :

                  properties
----------------------------------------------
 {"name": "santa clara", "matched": "true"}
 {"name": "san jose", "matched": "true"}
 {"name": "san francisco", "matched": "true"}
 {"name": "palo alto", "created": "true"}
(4 rows)

The default isolation level of the transaction is Read committed on AgensGraph. Therefore, if another transaction executes MERGE for the same pattern at the same time, two (or more) identical patterns can be created at the same time. To prevent this, you can execute a transaction at the serializable isolation level as in the example below. If an attempt is made to create the same pattern concurrently, one transaction fails with an error. If you retry a failed transaction, it will behave like MATCH instead of CREATE because of the pattern created by the succeeded transaction already exists.

BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
MATCH (a:customer)
MERGE (c:city {name:a.city});
COMMIT;

Finding the Shortest Path

The shortestpath function can be used to find the shortest path between two vertices. If you want to find all paths, you can use the allshortestpaths function.

MATCH (p:person {name:'Tom'}), (f:person {name:'Olive'}) CREATE (p)-[:knows]->(f);
MATCH (p1:person {name: 'Tom'}), (p2:person {name: 'Todd'}), 
       path=shortestpath((p1)-[:knows*1..5]->(p2)) RETURN path;

In this example, we create a ‘knows’ edge path from ‘Tom’ to ‘Olive’. To find the ‘knows’ path from ‘Tom’ to ‘Todd’, we can use the shortestpath function. The shortestpath function takes a pattern consisting of a start vertex, a edge and an end vertex. We can use a variable length edge expression in the edge to specify if we are looking for a certain degree of connection.

The query results are as follows.

[person[3.1]{"name": "Tom"},knows[4.5][3.1,3.5]{},person[3.5]{"name": "Olive"},knows[4.3][3.5,3.6]
{"fromdate": "2015-01-26"},person[3.6]{"name": "Todd"}]

Hybrid Query

Introduction

In this section, we will see how to use SQL and Cypher together in AgensGraph using the following example graph.
Hybrid Query performs aggregation and statistical processing on table and column by using SQL query used in RDB,and the Cypher query used by GDB supports better data query than RDB’s Join operation.

CREATE GRAPH skai;
CREATE VLABEL dev;
CREATE (:dev {name: 'someone', year: 2015});
CREATE (:dev {name: 'somebody', year: 2016});

CREATE TABLE history (year, event)
AS VALUES (1996, 'PostgreSQL'), (2016, 'AgensGraph');

Cypher in SQL

Since the result of a Cypher query is a relation, you can use a Cypher query in the FROM clause of SQL as if it is a subquery. It is possible to use Cypher syntax inside the FROM clause to utilize dataset of vertex or edge stored in graph DB as data in SQL statement.

Syntex :

SELECT [column_name]
FROM ({table_name|SQLquery|CYPHERquery})
WHERE [column_name operator value];

Example :

SELECT n->>'name' as name 
FROM history, (MATCH (n:dev) RETURN n) as dev 
WHERE history.year > (n->>'year')::int;

Result :

  name
---------
 someone
(1 row)

SQL in Cypher

When querying the content of the graph DB with the cypher queries, you can use the match and where clause when you want to search using specific data of the RDB. However, the resulting dataset in the SQL queries must be configured to return a single row of results.

Syntex :

MATCH [table_name]
WHERE (column_name operator {value|SQLquery|CYPHERquery})
RETURN [column_name];

Example :

MATCH (n:dev)
WHERE n.year < (SELECT year FROM history WHERE event = 'AgensGraph')
RETURN properties(n) AS n;

Result :

                 n
-----------------------------------
 {"name": "someone", "year": 2015}
(1 row)