4 SQL-Based Property Graph Query and Analytics
You can use SQL to query property graph data in Oracle Spatial and Graph.
For the property graph support in Oracle Spatial and Graph, all the vertices and edges data are persisted in relational form in Oracle Database. For detailed information about the Oracle Spatial and Graph property graph schema objects, see Property Graph Schema Objects for Oracle Database.
This chapter provides examples of typical graph queries implemented using SQL. The audience includes DBAs as well as application developers who understand SQL syntax and property graph schema objects.
The benefits of querying directly property graph using SQL include:
-
There is no need to bring data outside Oracle Database.
-
You can leverage the industry-proven SQL engine provided by Oracle Database.
-
You can easily join or integrate property graph data with other data types (relational, JSON, XML, and so on).
-
You can take advantage of existing Oracle SQL tuning and database management tools and user interface.
The examples assume that there is a property graph named connections
in the current schema. The SQL queries and example output are for illustration purpose only, and your output may be different depending on the data in your connections
graph. In some examples, the output is reformatted for readability.
- Simple Property Graph Queries
The examples in this topic query vertices, edges, and properties of the graph. - Text Queries on Property Graphs
If values of a property (vertex property or edge property) contain free text, then it might help performance to create an Oracle Text index on the V column. - Navigation and Graph Pattern Matching
A key benefit of using a graph data model is that you can easily navigate across entities (people, movies, products, services, events, and so on) that are modeled as vertices, following links and relationships modeled as edges. In addition, graph matching templates can be defined to do such things as detect patterns, aggregate individuals, and analyze trends. - Navigation Options: CONNECT BY and Parallel Recursion
The CONNECT BY clause and parallel recursion provide options for advanced navigation and querying. - Pivot
The PIVOT clause lets you dynamically add columns to a table to create a new table. - SQL-Based Property Graph Analytics
In addition to the analytical functions offered by the in-memory analyst, the property graph feature in Oracle Spatial and Graph supports several native, SQL-based property graph analytics. - Property Graph Query Language (PGQL)
PGQL is a SQL-like query language for property graph data structures that consist of nodes that are connected to other nodes by edges, each of which can have key-value pairs (properties) associated with them. - Executing PGQL Queries Directly Against Oracle Database
This topic explains how you can execute PGQL queries directly against the graph in Oracle Database (as opposed to in-memory).
4.1 Simple Property Graph Queries
The examples in this topic query vertices, edges, and properties of the graph.
Example 4-1 Find a Vertex with a Specified Vertex ID
This example find the vertex with vertex ID 1 in the connections
graph.
SQL> select vid, k, v, vn, vt
from connectionsVT$
where vid=1;
The output might be as follows:
1 country United States 1 name Barack Obama 1 occupation 44th president of United States of America ...
Example 4-2 Find an Edge with a Specified Edge ID
This example find the edge with edge ID 100 in the connections
graph.
SQL> select eid,svid,dvid,k,t,v,vn,vt
from connectionsGE$
where eid=1000;
The output might be as follows:
1000 1 2 weight 3 1 1
In the preceding output, the K of the edge property is "weight" and the type ID of the value is 3, indicating a float value.
Example 4-3 Perform Simple Counting
This example performs simple counting in the connections
graph.
SQL> -- Get the total number of K/V pairs of all the vertices
SQL> select /*+ parallel */ count(1)
from connectionsVT$;
299
SQL> -- Get the total number of K/V pairs of all the edges
SQL> select /*+ parallel(8) */ count(1)
from connectionsGE$;
164
SQL> -- Get the total number of vertices
SQL> select /*+ parallel */ count(distinct vid)
from connectionsVT$;
78
SQL> -- Get the total number of edges
SQL> select /*+ parallel */ count(distinct eid)
from connectionsGE$;
164
Example 4-4 Get the Set of Property Keys Used
This example gets the set of property keys used for the vertices n the connections
graph.
SQL> select /*+ parallel */ distinct k
from connectionsVT$;
company
show
occupation
type
team
religion
criminal charge
music genre
genre
name
role
political party
country
13 rows selected.
SQL> -- get the set of property keys used for edges
SQL> select /*+ parallel */ distinct k
from connectionsGE$;
weight
Example 4-5 Find Vertices with a Value
This example finds vertices with a value (of any property) that is of String type, and where and the value contains two adjacent occurrences of a, e, i, o, or u, regardless of case.n the connections
graph.
SQL> select vid, t, k, v
from connectionsVT$
where t=1
and regexp_like(v, '([aeiou])\1', 'i');
6 1 name Jordan Peele
6 1 show Key and Peele
54 1 name John Green
...
It is usually hard to leverage a B-Tree index for the preceding kind of query because it is difficult to know beforehand what kind of regular expression is going to be used. For the above query, you might get the following execution plan. Note that full table scan is chosen by the optimizer.
------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 15 | 795 | 28 (0)| 00:00:01 | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM)| :TQ10000 | 15 | 795 | 28 (0)| 00:00:01 | | | Q1,00 | P->S | QC (RAND) |
| 3 | PX BLOCK ITERATOR | | 15 | 795 | 28 (0)| 00:00:01 | 1 | 8 | Q1,00 | PCWC | |
|* 4 | TABLE ACCESS FULL| CONNECTIONSVT$ | 15 | 795 | 28 (0)| 00:00:01 | 1 | 8 | Q1,00 | PCWP | |
------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - filter(INTERNAL_FUNCTION("V") AND REGEXP_LIKE ("V",U'([aeiou])\005C1','i') AND "T"=1 AND INTERNAL_FUNCTION("K"))
Note
-----
- Degree of Parallelism is 2 because of table property
If the Oracle Database In-Memory option is available and memory is sufficient, it can help performance to place the table (full table or a set of relevant columns) in memory. One way to achieve that is as follows:
SQL> alter table connectionsVT$ inmemory;
Table altered.
Now, entering the same SQL containing the regular expression shows a plan that performs a "TABLE ACCESS INMEMORY FULL".
--------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 15 | 795 | 28 (0)| 00:00:01 | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10000 | 15 | 795 | 28 (0)| 00:00:01 | | | Q1,00 | P->S | QC (RAND) |
| 3 | PX BLOCK ITERATOR | | 15 | 795 | 28 (0)| 00:00:01 | 1 | 8 | Q1,00 | PCWC | |
|* 4 | TABLE ACCESS INMEMORY FULL| CONNECTIONSVT$ | 15 | 795 | 28 (0)| 00:00:01 | 1 | 8 | Q1,00 | PCWP | |
---------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - filter(INTERNAL_FUNCTION("V") AND REGEXP_LIKE ("V",U'([aeiou])\005C1','i') AND "T"=1 AND INTERNAL_FUNCTION("K"))
Note
-----
- Degree of Parallelism is 2 because of table property
Parent topic: SQL-Based Property Graph Query and Analytics
4.2 Text Queries on Property Graphs
If values of a property (vertex property or edge property) contain free text, then it might help performance to create an Oracle Text index on the V column.
Oracle Text can process text that is directly stored in the database. The text can be short strings (such as names or addresses), or it can be full-length documents. These documents can be in a variety of textual format.
The text can also be in many different languages. Oracle Text can handle any space-separated languages (including character sets such as Greek or Cyrillic). In addition, Oracle Text is able to handle the Chinese, Japanese and Korean pictographic languages)
Because the property graph feature uses NVARCHAR typed column for better support of Unicode, it is highly recommended that UTF8 (AL32UTF8) be used as the database character set.
To create an Oracle Text index on the vertices table (or edges table), the ALTER SESSION privilege is required. For example:
SQL> grant alter session to <YOUR_USER_SCHEMA_HERE>;
If customization is required, also grant the EXECUTE privilege on CTX_DDL:
SQL> grant execute on ctx_ddl to <YOUR_USER_SCHEMA_HERE>;
The following shows some example statements for granting these privileges to SCOTT.
SQL> conn / as sysdba
Connected.
SQL> -- This is a PDB setup --
SQL> alter session set container=orcl;
Session altered.
SQL> grant execute on ctx_ddl to scott;
Grant succeeded.
SQL> grant alter session to scott;
Grant succeeded.
Example 4-6 Create a Text Index
This example creates an Oracle Text index on the vertices table (V column) of the connections graph in the SCOTT schema. Note that unlike the text index capabilities provided by either Apache Lucene or Apache SolrCloud, the Oracle Text index created here is for all property keys, not just one or a subset of property keys. In addition, if a new property is added to the graph and the property value is of String data type, then it will automatically be included in the same text index.
The example uses the OPG_AUTO_LEXER lexer owned by MDSYS.
SQL> execute opg_apis.create_vertices_text_idx('scott', 'connections', pref_owner=>'MDSYS', lexer=>'OPG_AUTO_LEXER', dop=>2);
If customization is desired, you can use the ctx_ddl.create_preference API. For example:
SQL> -- The following requires access privilege to CTX_DDL
SQL> exec ctx_ddl.create_preference('SCOTT.OPG_AUTO_LEXER', 'AUTO_LEXER');
PL/SQL procedure successfully completed.
SQL> execute opg_apis.create_vertices_text_idx('scott', 'connections', pref_owner=>'scott', lexer=>'OPG_AUTO_LEXER', dop=>2);
PL/SQL procedure successfully completed.
You can now use a rich set of functions provided by Oracle Text to perform queries against graph elements.
Note:
If you no longer need an Oracle Text index, you can use the drop_vertices_text_idx or opg_apis.drop_edges_text_idx API to drop it. The following statements drop the text indexes on the vertices and edges of a graph named connections
owned by SCOTT:
SQL> exec opg_apis.drop_vertices_text_Idx('scott', 'connections');
SQL> exec opg_apis.drop_edges_text_Idx('scott', 'connections');
Example 4-7 Find a Vertex that Has a Property Value
The following example find a vertex that has a property value (of string type) containing the keyword "Obama".
SQL> select vid, k, t, v
from connectionsVT$
where t=1
and contains(v, 'Obama', 1) > 0
order by score(1) desc
;
The output and SQL execution plan from the preceding statement may appear as follows. Note that DOMAIN INDEX appears as an operation in the execution plan.
1 name 1 Barack Obama
Execution Plan
----------------------------------------------------------
Plan hash value: 1619508090
-----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 56 | 5 (20) | 00:00:01 | | |
| 1 | SORT ORDER BY | | 1 | 56 | 5 (20) | 00:00:01 | | |
|* 2 | TABLE ACCESS BY GLOBAL INDEX ROWID| CONNECTIONSVT$ | 1 | 56 | 4 (0)| 00:00:01 | ROWID | ROWID |
|* 3 | DOMAIN INDEX | CONNECTIONSXTV$ | | | 4 (0)| 00:00:01 | | |
-----------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("T"=1 AND INTERNAL_FUNCTION("K") AND INTERNAL_FUNCTION("V"))
3 - access("CTXSYS"."CONTAINS"("V",'Obama',1)>0)
Example 4-8 Fuzzy Match
The following example finds a vertex that has a property value (of string type) containing variants of "ameriian" (a deliberate misspelling for this example) Fuzzy match is used.
SQL> select vid, k, t, v
from connectionsVT$
where contains(v, 'fuzzy(ameriian,,,weight)', 1) > 0
order by score(1) desc;
The output and SQL execution plan from the preceding statement may appear as follows.
8 role 1 american business man 9 role 1 american business man 4 role 1 american economist 6 role 1 american comedian actor 7 role 1 american comedian actor 1 occupation 1 44th president of United States of America 6 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1619508090 ----------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | ----------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 56 | 5 (20)| 00:00:01 | | | | 1 | SORT ORDER BY | | 1 | 56 | 5 (20)| 00:00:01 | | | |* 2 | TABLE ACCESS BY GLOBAL INDEX ROWID| CONNECTIONSVT$ | 1 | 56 | 4 (0)| 00:00:01 | ROWID | ROWID | |* 3 | DOMAIN INDEX | CONNECTIONSXTV$ | | | 4 (0)| 00:00:01 | | | ----------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter(INTERNAL_FUNCTION("K") AND INTERNAL_FUNCTION("V"))
Example 4-9 Query Relaxation
The following example is a sophisticated Oracle Text query that implements query relaxation, which enables you to execute the most restrictive version of a query first, progressively relaxing the query until the required number of matches is obtained. Using query relaxation with queries that contain multiple strings, you can provide guidance for determining the “best” matches, so that these appear earlier in the results than other potential matches.
This example searchs for "american actor" with a query relaxation sequence.
SQL> select vid, k, t, v
from connectionsVT$
where CONTAINS (v,
'<query>
<textquery lang="ENGLISH" grammar="CONTEXT">
<progression>
<seq>{american} {actor}</seq>
<seq>{american} NEAR {actor}</seq>
<seq>{american} AND {actor}</seq>
<seq>{american} ACCUM {actor}</seq>
</progression>
</textquery>
<score datatype="INTEGER" algorithm="COUNT"/>
</query>') > 0;
The output and SQL execution plan from the preceding statement may appear as follows.
7 role 1 american comedian actor 6 role 1 american comedian actor 44 occupation 1 actor 8 role 1 american business man 53 occupation 1 actor film producer 52 occupation 1 actor 4 role 1 american economist 47 occupation 1 actor 9 role 1 american business man 9 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 2158361449 ---------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | ---------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 56 | 4 (0)| 00:00:01 | | | |* 1 | TABLE ACCESS BY GLOBAL INDEX ROWID| CONNECTIONSVT$ | 1 | 56 | 4 (0)| 00:00:01 | ROWID | ROWID | |* 2 | DOMAIN INDEX | CONNECTIONSXTV$ | | | 4 (0)| 00:00:01 | | | ---------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(INTERNAL_FUNCTION("K") AND INTERNAL_FUNCTION("V")) 2 - access("CTXSYS"."CONTAINS"("V",'<query> <textquery lang="ENGLISH" grammar="CONTEXT"> <progression> <seq>{american} {actor}</seq> <seq>{american} NEAR {actor}</seq> <seq>{american} AND {actor}</seq> <seq>{american} ACCUM {actor}</seq> </progression> </textquery> <score datatype="INTEGER" algorithm="COUNT"/> </query>')>0)
Example 4-10 Find an Edge
Just as with vertices, you can create an Oracle Text index on the V column of the edges table (GE$) of a property graph. The following example uses the OPG_AUTO_LEXER lexer owned by MDSYS.
SQL> exec opg_apis.create_edges_text_idx('scott', 'connections', pref_owner=>'mdsys', lexer=>'OPG_AUTO_LEXER', dop=>4);
If customization is required, use the ctx_ddl.create_preference API.
Parent topic: SQL-Based Property Graph Query and Analytics
4.3 Navigation and Graph Pattern Matching
A key benefit of using a graph data model is that you can easily navigate across entities (people, movies, products, services, events, and so on) that are modeled as vertices, following links and relationships modeled as edges. In addition, graph matching templates can be defined to do such things as detect patterns, aggregate individuals, and analyze trends.
This topic provides graph navigation and pattern matching examples using the example property graph named connections. Most of the SQL statements are relatively simple, but they can be used as building blocks to implement requirements that are more sophisticated. It is generally best to start from something simple, and progressively add complexity.
Example 4-11 Who Are a Person's Collaborators?
The following SQL ststement finds all entities that a vertex with ID 1 collaborates with. For simplicity, it considers only outgoing relationships.
SQL> select dvid, el, k, vn, v
from connectionsGE$
where svid=1
and el='collaborates';
Note:
To find the specific vertex ID of interest, you can perform a text query on the property graph using keywords or fuzzy matching. (For details and examples, see Text Queries on Property Graphs.)
The preceding example's output and execution plan may be as follows.
2 collaborates weight 1 1 21 collaborates weight 1 1 22 collaborates weight 1 1 .... 26 collaborates weight 1 1 10 rows selected. ------------------------------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib | ------------------------------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 460 | 2 (0)| 00:00:01 | | | | | | | 1 | PX COORDINATOR | | | | | | | | | | | | 2 | PX SEND QC (RANDOM) | :TQ10000 | 10 | 460 | 2 (0)| 00:00:01 | | | Q1,00 | P->S | QC (RAND) | | 3 | PX PARTITION HASH ALL | | 10 | 460 | 2 (0)| 00:00:01 | 1 | 8 | Q1,00 | PCWC | | |* 4 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| CONNECTIONSGE$ | 10 | 460 | 2 (0)| 00:00:01 | 1 | 8 | Q1,00 | PCWP | | |* 5 | INDEX RANGE SCAN | CONNECTIONSXSE$ | 20 | | 1 (0)| 00:00:01 | 1 | 8 | Q1,00 | PCWP | | ------------------------------------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 4 - filter(INTERNAL_FUNCTION("EL") AND "EL"=U'collaborates' AND INTERNAL_FUNCTION("K") AND INTERNAL_FUNCTION("V")) 5 - access("SVID"=1)
Example 4-12 Who Are a Person's Collaborators and What are Their Occupations?
The following SQL statement finds collaborators of the vertex with ID 1, and the occupation of each collaborator. A join with the vertices table (VT$) is required.
SQL> select dvid, vertices.v
from connectionsGE$, connectionsVT$ vertices
where svid=1
and el='collaborates'
and dvid=vertices.vid
and vertices.k='occupation';
The preceding example's output and execution plan may be as follows.
21 67th United States Secretary of State 22 68th United States Secretary of State 23 chancellor 28 7th president of Iran 19 junior United States Senator from New York ... -------------------------------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib | -------------------------------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 7 | 525 | 7 (0)| 00:00:01 | | | | | | | 1 | PX COORDINATOR | | | | | | | | | | | | 2 | PX SEND QC (RANDOM) | :TQ10000 | 7 | 525 | 7 (0)| 00:00:01 | | | Q1,00 | P->S | QC (RAND) | | 3 | NESTED LOOPS | | 7 | 525 | 7 (0)| 00:00:01 | | | Q1,00 | PCWP | | | 4 | PX PARTITION HASH ALL | | 10 | 250 | 2 (0)| 00:00:01 | 1 | 8 | Q1,00 | PCWC | | |* 5 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| CONNECTIONSGE$ | 10 | 250 | 2 (0)| 00:00:01 | 1 | 8 | Q1,00 | PCWP | | |* 6 | INDEX RANGE SCAN | CONNECTIONSXSE$ | 20 | | 1 (0)| 00:00:01 | 1 | 8 | Q1,00 | PCWP | | | 7 | PARTITION HASH ITERATOR | | 1 | | 0 (0)| 00:00:01 | KEY | KEY | Q1,00 | PCWP | | |* 8 | TABLE ACCESS BY LOCAL INDEX ROWID | CONNECTIONSVT$ | | | | | KEY | KEY | Q1,00 | PCWP | | |* 9 | INDEX UNIQUE SCAN | CONNECTIONSXQV$ | 1 | | 0 (0)| 00:00:01 | KEY | KEY | Q1,00 | PCWP | | -------------------------------------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 5 - filter(INTERNAL_FUNCTION("EL") AND "EL"=U'collaborates') 6 - access("SVID"=1) 8 - filter(INTERNAL_FUNCTION("VERTICES"."V")) 9 - access("DVID"="VERTICES"."VID" AND "VERTICES"."K"=U'occupation') filter(INTERNAL_FUNCTION("VERTICES"."K"))
Example 4-13 Find a Person's Enemies and Aggregate Them by Their Country
The following SQL statement finds enemies (that is, those with the feuds
relationship) of the vertex with ID 1, and aggregates them by their countries. A join with the vertices table (VT$) is required.
SQL> select vertices.v, count(1)
from connectionsGE$, connectionsVT$ vertices
where svid=1
and el='feuds'
and dvid=vertices.vid
and vertices.k='country'
group by vertices.v;
The example's output and execution plan may be as follows. In this case, the vertex with ID 1 has 3 enemies in the United States and 1 in Russia.
United States 3 Russia 1 ------------------------------------------------------------------------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 5 | 375 | 5 (20)| 00:00:01 | | | | | | | 1 | PX COORDINATOR | | | | | | | | | | | | 2 | PX SEND QC (RANDOM) | :TQ10001 | 5 | 375 | 5 (20)| 00:00:01 | | | Q1,01 | P->S | QC (RAND) | | 3 | HASH GROUP BY | | 5 | 375 | 5 (20)| 00:00:01 | | | Q1,01 | PCWP | | | 4 | PX RECEIVE | | 5 | 375 | 5 (20)| 00:00:01 | | | Q1,01 | PCWP | | | 5 | PX SEND HASH | :TQ10000 | 5 | 375 | 5 (20)| 00:00:01 | | | Q1,00 | P->P | HASH | | 6 | HASH GROUP BY | | 5 | 375 | 5 (20)| 00:00:01 | | | Q1,00 | PCWP | | | 7 | NESTED LOOPS | | 5 | 375 | 4 (0)| 00:00:01 | | | Q1,00 | PCWP | | | 8 | PX PARTITION HASH ALL | | 5 | 125 | 2 (0)| 00:00:01 | 1 | 8 | Q1,00 | PCWC | | |* 9 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| CONNECTIONSGE$ | 5 | 125 | 2 (0)| 00:00:01 | 1 | 8 | Q1,00 | PCWP | | |* 10 | INDEX RANGE SCAN | CONNECTIONSXSE$ | 20 | | 1 (0)| 00:00:01 | 1 | 8 | Q1,00 | PCWP | | | 11 | PARTITION HASH ITERATOR | | 1 | | 0 (0)| 00:00:01 | KEY | KEY | Q1,00 | PCWP | | |* 12 | TABLE ACCESS BY LOCAL INDEX ROWID | CONNECTIONSVT$ | | | | | KEY | KEY | Q1,00 | PCWP | | |* 13 | INDEX UNIQUE SCAN | CONNECTIONSXQV$ | 1 | | 0 (0)| 00:00:01 | KEY | KEY | Q1,00 | PCWP | | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 9 - filter(INTERNAL_FUNCTION("EL") AND "EL"=U'feuds') 10 - access("SVID"=1) 12 - filter(INTERNAL_FUNCTION("VERTICES"."V")) 13 - access("DVID"="VERTICES"."VID" AND "VERTICES"."K"=U'country') filter(INTERNAL_FUNCTION("VERTICES"."K"))
Example 4-14 Find a Person's Collaborators, and aggregate and sort them
The following SQL statement finds the collaborators of the vertex with ID 1, aggregates them by their country, and sorts them in ascending order.
SQL> select vertices.v, count(1)
from connectionsGE$, connectionsVT$ vertices
where svid=1
and el='collaborates'
and dvid=vertices.vid
and vertices.k='country'
group by vertices.v
order by count(1) asc;
The example output and execution plan may be as follows. In this case, the vertex with ID 1 has the most collaborators in the United States.
Germany 1 Japan 1 Iran 1 United States 7 --------------------------------------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 750 | 9 (23)| 00:00:01 | | | | | | | 1 | PX COORDINATOR | | | | | | | | | | | | 2 | PX SEND QC (ORDER) | :TQ10002 | 10 | 750 | 9 (23)| 00:00:01 | | | Q1,02 | P->S | QC (ORDER) | | 3 | SORT ORDER BY | | 10 | 750 | 9 (23)| 00:00:01 | | | Q1,02 | PCWP | | | 4 | PX RECEIVE | | 10 | 750 | 9 (23)| 00:00:01 | | | Q1,02 | PCWP | | | 5 | PX SEND RANGE | :TQ10001 | 10 | 750 | 9 (23)| 00:00:01 | | | Q1,01 | P->P | RANGE | | 6 | HASH GROUP BY | | 10 | 750 | 9 (23)| 00:00:01 | | | Q1,01 | PCWP | | | 7 | PX RECEIVE | | 10 | 750 | 9 (23)| 00:00:01 | | | Q1,01 | PCWP | | | 8 | PX SEND HASH | :TQ10000 | 10 | 750 | 9 (23)| 00:00:01 | | | Q1,00 | P->P | HASH | | 9 | HASH GROUP BY | | 10 | 750 | 9 (23)| 00:00:01 | | | Q1,00 | PCWP | | | 10 | NESTED LOOPS | | 10 | 750 | 7 (0)| 00:00:01 | | | Q1,00 | PCWP | | | 11 | PX PARTITION HASH ALL | | 10 | 250 | 2 (0)| 00:00:01 | 1 | 8 | Q1,00 | PCWC | | |* 12 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED | CONNECTIONSGE$ | 10 | 250 | 2 (0)| 00:00:01 | 1 | 8 | Q1,00 | PCWP | | |* 13 | INDEX RANGE SCAN | CONNECTIONSXSE$ | 20 | | 1 (0)| 00:00:01 | 1 | 8 | Q1,00 | PCWP | | | 14 | PARTITION HASH ITERATOR | | 1 | | 0 (0)| 00:00:01 | KEY | KEY | Q1,00 | PCWP | | |* 15 | TABLE ACCESS BY LOCAL INDEX ROWID | CONNECTIONSVT$ | | | | | KEY | KEY | Q1,00 | PCWP | | |* 16 | INDEX UNIQUE SCAN | CONNECTIONSXQV$ | 1 | | 0 (0)| 00:00:01 | KEY | KEY | Q1,00 | PCWP | | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 12 - filter(INTERNAL_FUNCTION("EL") AND "EL"=U'collaborates') 13 - access("SVID"=1) 15 - filter(INTERNAL_FUNCTION("VERTICES"."V")) 16 - access("DVID"="VERTICES"."VID" AND "VERTICES"."K"=U'country') filter(INTERNAL_FUNCTION("VERTICES"."K"))
Parent topic: SQL-Based Property Graph Query and Analytics
4.4 Navigation Options: CONNECT BY and Parallel Recursion
The CONNECT BY clause and parallel recursion provide options for advanced navigation and querying.
-
CONNECT BY lets you navigate and find matches in a hierarchical order. To follow outgoing edges, you can use prior dvid = svid to guide the navigation.
-
Parallel recursion lets you perform navigation up to a specified number of hops away.
The examples use a property graph named connections.
Example 4-15 CONNECT WITH
The following SQL statement follows the outgoing edges by 1 hop.
SQL> select G.dvid
from connectionsGE$ G
start with svid = 1
connect by nocycle prior dvid = svid and level <= 1;
The preceding example's output and execution plan may be as follows.
2 3 4 5 6 7 8 9 10 ... ------------------------------------------------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib | ------------------------------------------------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 7 | 273 | 3 (67)| 00:00:01 | | | | | | |* 1 | CONNECT BY WITH FILTERING| | | | | | | | | | | | 2 | PX COORDINATOR | | | | | | | | | | | | 3 | PX SEND QC (RANDOM) | :TQ10000 | 2 | 12 | 0 (0)| 00:00:01 | | | Q1,00 | P->S | QC (RAND) | | 4 | PX PARTITION HASH ALL | | 2 | 12 | 0 (0)| 00:00:01 | 1 | 8 | Q1,00 | PCWC | | |* 5 | INDEX RANGE SCAN | CONNECTIONSXSE$ | 2 | 12 | 0 (0)| 00:00:01 | 1 | 8 | Q1,00 | PCWP | | |* 6 | FILTER | | | | | | | | | | | | 7 | NESTED LOOPS | | 5 | 95 | 1 (0)| 00:00:01 | | | | | | | 8 | CONNECT BY PUMP | | | | | | | | | | | | 9 | PARTITION HASH ALL | | 2 | 12 | 0 (0)| 00:00:01 | 1 | 8 | | | | |* 10 | INDEX RANGE SCAN | CONNECTIONSXSE$ | 2 | 12 | 0 (0)| 00:00:01 | 1 | 8 | | | | ------------------------------------------------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("SVID"=PRIOR "DVID") filter(LEVEL<=2) 5 - access("SVID"=1) 6 - filter(LEVEL<=2) 10 - access("connect$_by$_pump$_002"."prior dvid "="SVID")
To extend from 1 hop to multiple hops, change 1 in the preceding example to another integer. For example, to change it to 2 hops, specify: level <= 2
Example 4-16 Parallel Recursion
The following SQL statement uses recursion within the WITH clause to perform navigation up to 4 hops away, a using recursively defined graph expansion: g_exp
references g_exp
in the query, and that defines the recursion. The example also uses the PARALLEL optimizer hint for parallel execution.
SQL> WITH g_exp(svid, dvid, depth) as
(
select svid as svid, dvid as dvid, 0 as depth
from connectionsGE$
where svid=1
union all
select g2.svid, g1.dvid, g2.depth + 1
from g_exp g2, connectionsGE$ g1
where g2.dvid=g1.svid
and g2.depth <= 3
)
select /*+ parallel(4) */ dvid, depth
from g_exp
where svid=1
;
The example's output and execution plan may be as follows. Note that CURSOR DURATION MEMORY
is chosen in the execution, which indicates the graph expansion stores the intermediate data in memory.
22 4
25 4
24 4
1 4
23 4
33 4
22 4
22 4
... ...
Execution Plan
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 801 | 31239 | 147 (0)| 00:00:01 | | | | | |
| 1 | TEMP TABLE TRANSFORMATION | | | | | | | | | | |
| 2 | LOAD AS SELECT (CURSOR DURATION MEMORY) | SYS_TEMP_0FD9D6614_11CB2D2 | | | | | | | | | |
| 3 | UNION ALL (RECURSIVE WITH) BREADTH FIRST | | | | | | | | | | |
| 4 | PX COORDINATOR | | | | | | | | | | |
| 5 | PX SEND QC (RANDOM) | :TQ20000 | 2 | 12 | 0 (0)| 00:00:01 | | | Q2,00 | P->S | QC (RAND) |
| 6 | LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D6614_11CB2D2 | | | | | | | Q2,00 | PCWP | |
| 7 | PX PARTITION HASH ALL | | 2 | 12 | 0 (0)| 00:00:01 | 1 | 8 | Q2,00 | PCWC | |
|* 8 | INDEX RANGE SCAN | CONNECTIONSXSE$ | 2 | 12 | 0 (0)| 00:00:01 | 1 | 8 | Q2,00 | PCWP | |
| 9 | PX COORDINATOR | | | | | | | | | | |
| 10 | PX SEND QC (RANDOM) | :TQ10000 | 799 | 12M| 12 (0)| 00:00:01 | | | Q1,00 | P->S | QC (RAND) |
| 11 | LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D6614_11CB2D2 | | | | | | | Q1,00 | PCWP | |
|* 12 | HASH JOIN | | 799 | 12M| 12 (0)| 00:00:01 | | | Q1,00 | PCWP | |
| 13 | BUFFER SORT (REUSE) | | | | | | | | Q1,00 | PCWP | |
| 14 | PARTITION HASH ALL | | 164 | 984 | 2 (0)| 00:00:01 | 1 | 8 | Q1,00 | PCWC | |
| 15 | INDEX FAST FULL SCAN | CONNECTIONSXDE$ | 164 | 984 | 2 (0)| 00:00:01 | 1 | 8 | Q1,00 | PCWP | |
| 16 | PX BLOCK ITERATOR | | | | | | | | Q1,00 | PCWC | |
|* 17 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6614_11CB2D2 | | | | | | | Q1,00 | PCWP | |
| 18 | PX COORDINATOR | | | | | | | | | | |
| 19 | PX SEND QC (RANDOM) | :TQ30000 | 801 | 31239 | 135 (0)| 00:00:01 | | | Q3,00 | P->S | QC (RAND) |
|* 20 | VIEW | | 801 | 31239 | 135 (0)| 00:00:01 | | | Q3,00 | PCWP | |
| 21 | PX BLOCK ITERATOR | | 801 | 12M| 135 (0)| 00:00:01 | | | Q3,00 | PCWC | |
| 22 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6614_11CB2D2 | 801 | 12M| 135 (0)| 00:00:01 | | | Q3,00 | PCWP | |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
8 - access("SVID"=1)
12 - access("G2"."DVID"="G1"."SVID")
17 - filter("G2"."INTERNAL_ITERS$"=LEVEL AND "G2"."DEPTH"<=3)
20 - filter("SVID"=1)
Parent topic: SQL-Based Property Graph Query and Analytics
4.5 Pivot
The PIVOT clause lets you dynamically add columns to a table to create a new table.
The schema design (VT$ and GE$) of the property graph is narrow ("skinny") rather than wide ("fat"). This means that if a vertex or edge has multiple properties, those property keys, values, data types, and so on will be stored using multiple rows instead of multiple columns. Such a design is very flexible in the sense that you can add properties dynamically without having to worry about adding too many columns or even reaching the physical maximum limit of number of columns a table may have. However, for some applications you may prefer to have a wide table if the properties are somewhat homogeneous.
Example 4-17 Pivot
The following CREATE TABLE ... AS SELECT statement uses PIVOT to add four columns: ‘company’,’ occupation’,’ name’, and ‘religion’.
SQL> CREATE TABLE table pg_wide
as
with G AS (select vid, k, t, v
from connectionsVT$
)
select *
from G
pivot (
min(v) for k in ('company', 'occupation', 'name', 'religion')
);
Table created.
The following DESCRIBE statement shows the definition of the new table, including the four added columns. (The output is reformatted for readability.)
SQL> DESCRIBE pg_wide;
Name Null? Type
--------------------------------------------------- -------- --------------------
VID NOT NULL NUMBER
T NUMBER(38)
'company' NVARCHAR2(15000)
'occupation' NVARCHAR2(15000)
'name' NVARCHAR2(15000)
'religion' NVARCHAR2(15000)
Parent topic: SQL-Based Property Graph Query and Analytics
4.6 SQL-Based Property Graph Analytics
In addition to the analytical functions offered by the in-memory analyst, the property graph feature in Oracle Spatial and Graph supports several native, SQL-based property graph analytics.
The benefits of SQL-based analytics are:
-
Easier analysis of larger graphs that do not fit in physical memory
-
Cheaper analysis since no graph data is transferred outside the database
-
Better analysis using the current state of a property graph database
-
Simpler analysis by eliminating the step of synchronizing an in-memory graph with the latest updates from the graph database
However, when a graph (or a subgraph) fits in memory, then running analytics provided by the in-memory analyst usually provides better performance than using SQL-based analytics.
Because many of the analytics implementation require using intermediate data structures, most SQL- (and PL/SQL-) based analytics APIs have parameters for working tables (wt). A typical flow has the following steps:
-
Prepare the working table or tables.
-
Perform analytics (one or multiple calls).
-
Perform cleanup
The following subtopics provide SQL-based examples of some popular types of property graph analytics.
Parent topic: SQL-Based Property Graph Query and Analytics
4.6.1 Shortest Path Examples
The following examples demonstrate SQL-based shortest path analytics.
Example 4-18 Shortest Path Setup and Computation
Consider shortest path, for example. Internally, Oracle Database uses the bidirectional Dijkstra algorithm. The following code snippet shows an entire prepare, perform, and cleanup workflow.
set serveroutput on
DECLARE
wt1 varchar2(100); -- intermediate working tables
n number;
path varchar2(1000);
weights varchar2(1000);
BEGIN
-- prepare
opg_apis.find_sp_prep('connectionsGE$', wt1);
dbms_output.put_line('working table name ' || wt1);
-- compute
opg_apis.find_sp(
'connectionsGE$',
1, -- start vertex ID
53, -- destination vertex ID
wt1, -- working table (for Dijkstra expansion)
dop => 1, -- degree of parallelism
stats_freq=>1000, -- frequency to collect statistics
path_output => path, -- shortest path (a sequence of vertices)
weights_output => weights, -- edge weights
options => null
);
dbms_output.put_line('path ' || path);
dbms_output.put_line('weights ' || weights);
-- cleanup (commented out here; see text after the example)
-- opg_apis.find_sp_cleanup('connectionsGE$', wt1);
END;
/
This example may produce the following output. Note that if no working table name is provided, the preparation step will automatically generate a temporary table name and create it. Because the temporary working table name uses the session ID, your output will probably be different.
working table name "CONNECTIONSGE$$TWFS12" path 1 3 52 53 weights 4 3 1 1 1 PL/SQL procedure successfully completed.
If you want to know the definition of the working table or tables, then skip the cleanup phase (as shown in the preceding example that comments out the call to find_sp_cleanup
). After the computation is done, you can describe the working table or tables.
SQL> describe "CONNECTIONSGE$$TWFS12"
Name Null? Type
--------- -------- ----------------------------
NID NUMBER
D2S NUMBER
P2S NUMBER
D2T NUMBER
P2T NUMBER
F NUMBER(38)
B NUMBER(38)
For advanced users who want to try different table creation options, such as using in-memory or advanced compression, you can pre-create the preceding working table and pass the name in.
Example 4-19 Shortest Path: Create Working Table and Perform Analytics
The following statements show some advanced options, first creating a working table with the same column structure and basic compression enabled, then passing it to the SQL-based computation. The code optimizes the intermediate table for computations with CREATE TABLE compression and in-memory options.
create table connections$MY_EXP(
NID NUMBER,
D2S NUMBER,
P2S NUMBER,
D2T NUMBER,
P2T NUMBER,
F NUMBER(38),
B NUMBER(38)
) compress nologging;
DECLARE
wt1 varchar2(100) := 'connections$MY_EXP';
n number;
path varchar2(1000);
weights varchar2(1000);
BEGIN
dbms_output.put_line('working table name ' || wt1);
-- compute
opg_apis.find_sp(
'connectionsGE$',
1,
53,
wt1,
dop => 1,
stats_freq=>1000,
path_output => path,
weights_output => weights,
options => null
);
dbms_output.put_line('path ' || path);
dbms_output.put_line('weights ' || weights);
-- cleanup
-- opg_apis.find_sp_cleanup('connectionsGE$', wt1);
END;
/
At the end of the computation, if the working table has not been dropped or truncated, you can check the content of the working table, as follows. Note that the working table structure may vary between releases.
SQL> select * from connections$MY_EXP;
NID D2S P2S D2T P2T F B
---------- ---------- ---------- ---------- ---------- ---------- ----------
1 0 1.000E+100 1 -1
53 1.000E+100 0 -1 1
54 1.000E+100 1 53 -1 1
52 1.000E+100 1 53 -1 1
5 1 1 1.000E+100 0 -1
26 1 1 1.000E+100 0 -1
8 1000 1 1.000E+100 0 -1
3 1 1 2 52 0 0
15 1 1 1.000E+100 0 -1
21 1 1 1.000E+100 0 -1
19 1 1 1.000E+100 0 -1
...
Example 4-20 Shortest Path: Perform Multiple Calls to Same Graph
To perform multiple calls to the same graph, only a single call to the preparation step is needed. The following shows an example of computing shortest path for multiple pairs of vertices in the same graph.
DECLARE
wt1 varchar2(100); -- intermediate working tables
n number;
path varchar2(1000);
weights varchar2(1000);
BEGIN
-- prepare
opg_apis.find_sp_prep('connectionsGE$', wt1);
dbms_output.put_line('working table name ' || wt1);
-- find shortest path from vertex 1 to vertex 53
opg_apis.find_sp( 'connectionsGE$', 1, 53,
wt1, dop => 1, stats_freq=>1000, path_output => path, weights_output => weights, options => null);
dbms_output.put_line('path ' || path);
dbms_output.put_line('weights ' || weights);
-- find shortest path from vertex 2 to vertex 36
opg_apis.find_sp( 'connectionsGE$', 2, 36,
wt1, dop => 1, stats_freq=>1000, path_output => path, weights_output => weights, options => null);
dbms_output.put_line('path ' || path);
dbms_output.put_line('weights ' || weights);
-- find shortest path from vertex 30 to vertex 4
opg_apis.find_sp( 'connectionsGE$', 30, 4,
wt1, dop => 1, stats_freq=>1000, path_output => path, weights_output => weights, options => null);
dbms_output.put_line('path ' || path);
dbms_output.put_line('weights ' || weights);
-- cleanup
opg_apis.find_sp_cleanup('connectionsGE$', wt1);
END;
/
The example's output may be as follows: three shortest paths have been found for the multiple pairs of vertices provided.
working table name "CONNECTIONSGE$$TWFS12" path 1 3 52 53 weights 4 3 1 1 1 path 2 36 weights 2 1 1 path 30 21 1 4 weights 4 3 1 1 1 PL/SQL procedure successfully completed.
Parent topic: SQL-Based Property Graph Analytics
4.6.2 Collaborative Filtering Overview and Examples
Collaborative filtering, also referred to as social filtering, filters information by using the recommendations of other people. Collaborative filtering is widely used in systems that recommend purchases based on purchases by others with similar preferences.
The following examples demonstrate SQL-based collaborative filtering analytics.
Example 4-21 Collaborative Filtering Setup and Computation
This example shows how to use SQL-based collaborative filtering, specifically using matrix factorization to recommend telephone brands to customers. This example assumes there exists a graph called "PHONES" in the database. This example graph contains customer and item vertices, and edges with a 'rating' label linking some customer vertices to other some item vertices. The rating labels have a numeric value corresponding to the rating that a specific customer (edge OUT vertex) assigned to the specified product (edge IN vertex).
The following figure shows this graph.
Figure 4-1 Phones Graph for Collaborative Filtering
Description of "Figure 4-1 Phones Graph for Collaborative Filtering"
set serveroutput on
DECLARE
wt_l varchar2(32); -- working tables
wt_r varchar2(32);
wt_l1 varchar2(32);
wt_r1 varchar2(32);
wt_i varchar2(32);
wt_ld varchar2(32);
wt_rd varchar2(32);
edge_tab_name varchar2(32) := 'phonesge$';
edge_label varchar2(32) := 'rating';
rating_property varchar2(32) := '';
iterations integer := 100;
min_error number := 0.001;
k integer := 5;
learning_rate number := 0.001;
decrease_rate number := 0.95;
regularization number := 0.02;
dop number := 2;
tablespace varchar2(32) := null;
options varchar2(32) := null;
BEGIN
-- prepare
opg_apis.cf_prep(edge_tab_name,wt_l,wt_r,wt_l1,wt_r1,wt_i,wt_ld,wt_rd);
dbms_output.put_line('working table wt_l ' || wt_l);
dbms_output.put_line('working table wt_r ' || wt_r);
dbms_output.put_line('working table wt_l1 ' || wt_l1);
dbms_output.put_line('working table wt_r1 ' || wt_r1);
dbms_output.put_line('working table wt_i ' || wt_i);
dbms_output.put_line('working table wt_ld ' || wt_ld);
dbms_output.put_line('working table wt_rd ' || wt_rd);
-- compute
opg_apis.cf(edge_tab_name,edge_label,rating_property,iterations,
min_error,k,learning_rate,decrease_rate,regularization,dop,
wt_l,wt_r,wt_l1,wt_r1,wt_i,wt_ld,wt_rd,tablespace,options);
END;
/
The flow starts by creating the temporary working tables that are later on passed to the computation. At the end of computation, the example may produce the following output. Note that if no working table name is provided, the preparation step will automatically generate a temporary table name and create it. Because the temporary working table name uses the session ID, your output will probably be different.working table wt_l "PHONESGE$$CFL57" working table wt_r "PHONESGE$$CFR57" working table wt_l1 "PHONESGE$$CFL157" working table wt_r1 "PHONESGE$$CFR157" working table wt_i "PHONESGE$$CFI57" working table wt_ld "PHONESGE$$CFLD57" working table wt_rd "PHONESGE$$CFRD57" PL/SQL procedure successfully completed.
Example 4-22 Collaborative Filtering: Validating the Intermediate Error
At the end of every computation, you can check the current error of the algorithm with the following query as long as the data in the working tables has not been already deleted. The following SQL query illustrates how to get the intermediate error of a current run of the collaborative filtering algorithm.
SELECT /*+ parallel(48) */ SQRT(SUM((w1-w2)*(w1-w2) +
<regularization>/2 * (err_reg_l+err_reg_r))) AS err
FROM <wt_i>;
Note that the regularization parameter and the working table name (parameter wt_i
) should be replaced according to the values used when running the OPG_APIS.CF algorithm. In the preceding previous example, replace <regularization>
with 0.02 and <wt_i>
with "PHONESGE$$CFI149" as follows:
SELECT /*+ parallel(48) */ SQRT(SUM((w1-w2)*(w1-w2) + 0.02/2 * (err_reg_l+err_reg_r))) AS err
FROM "PHONESGE$$CFI149";
This query may produce the following output.
ERR ---------- 4.82163662
f the value of the current error is too high or if the predictions obtained from the matrix factorization results of the collaborative filtering are not yet useful, you can run more iterations of the algorithm, by reusing the working tables and the progress made so far. The following example shows how to make predictions using the SQL-based collaborative filtering.
Example 4-23 Collaborative Filtering: Making Predictions
The result of the collaborative filtering algorithm is stored in the tables wt_l
and wt_r
, which are the two factors of a matrix product. These matrix factors should be used when making the predictions of the collaborative filtering.
In a typical flow of the algorithm, the two matrix factors can be used to make the predictions before calling the OPG_APIS.CF_CLEANUP procedure, or they can be copied and persisted into other tables for later use. The following example demonstrates the latter case:
DECLARE
wt_l varchar2(32); -- working tables
wt_r varchar2(32);
wt_l1 varchar2(32);
wt_r1 varchar2(32);
wt_i varchar2(32);
wt_ld varchar2(32);
wt_rd varchar2(32);
edge_tab_name varchar2(32) := 'phonesge$';
edge_label varchar2(32) := 'rating';
rating_property varchar2(32) := '';
iterations integer := 100;
min_error number := 0.001;
k integer := 5;
learning_rate number := 0.001;
decrease_rate number := 0.95;
regularization number := 0.02;
dop number := 2;
tablespace varchar2(32) := null;
options varchar2(32) := null;
BEGIN
-- prepare
opg_apis.cf_prep(edge_tab_name,wt_l,wt_r,wt_l1,wt_r1,wt_i,wt_ld,wt_rd);
-- compute
opg_apis.cf(edge_tab_name,edge_label,rating_property,iterations,
min_error,k,learning_rate,decrease_rate,regularization,dop,
wt_l,wt_r,wt_l1,wt_r1,wt_i,wt_ld,wt_rd,tablespace,options);
-- save only these two tables for later predictions
EXECUTE IMMEDIATE 'CREATE TABLE customer_mat AS SELECT * FROM ' || wt_l;
EXECUTE IMMEDIATE 'CREATE TABLE item_mat AS SELECT * FROM ' || wt_r;
-- cleanup
opg_apis.cf_cleanup('phonesge$',wt_l,wt_r,wt_l1,wt_r1,wt_i,wt_ld,wt_rd);
END;
/
This example will produce the only the following output.
PL/SQL procedure successfully completed.
Now that the matrix factors are saved in the tables customer_mat and item_mat, you can use the following query to check the "error" (difference) between the real values (those values that previously existed in the graph as 'ratings') and the estimated predictions (the result of the matrix multiplication in a certain customer row and item column).
Note that the following query is customized with a join on the vertex table in order return an NVARCHAR property of the vertices (for example, the name property) instead of a numeric ID. This query will return all the predictions for every single customer vertex to every item vertex in the graph.
SELECT /*+ parallel(48) */ MIN(vertex1.v) AS customer,
MIN(vertex2.v) AS item,
MIN(edges.vn) AS real,
SUM(l.v * r.v) AS predicted
FROM PHONESGE$ edges,
CUSTOMER_MAT l,
ITEM_MAT r,
PHONESVT$ vertex1,
PHONESVT$ vertex2
WHERE l.k = r.k
AND l.c = edges.svid(+)
AND r.p = edges.dvid(+)
AND l.c = vertex1.vid
AND r.p = vertex2.vid
GROUP BY l.c, r.p
ORDER BY l.c, r.p -- This order by clause is optional
;
This query may produce an output similar to the following (some rows are omitted for brevity).
CUSTOMER ITEM REAL PREDICTED ------------------------------------------------ Adam Apple 5 3.67375703 Adam Blackberry 3.66079652 Adam Danger 2.77049596 Adam Ericsson 4.21764858 Adam Figo 3.10631337 Adam Google 4 4.42429022 Adam Huawei 3 3.4289115 Ben Apple 2.82127589 Ben Blackberry 2 2.81132282 Ben Danger 3 2.12761307 Ben Ericsson 3 3.2389595 Ben Figo 2.38550534 Ben Google 3.39765075 Ben Huawei 2.63324582 ... Don Apple 1.3777496 Don Blackberry 1 1.37288909 Don Danger 1 1.03900439 Don Ericsson 1.58172236 Don Figo 1 1.16494421 Don Google 1.65921807 Don Huawei 1 1.28592648 Erik Apple 3 2.80809351 Erik Blackberry 3 2.79818695 Erik Danger 2.11767182 Erik Ericsson 3 3.2238255 Erik Figo 2.3743591 Erik Google 3 3.38177526 Erik Huawei 3 2.62094201
If you want to check only some rows to decide whether the prediction results are ready or more iterations of the algorithm should be run, the previous query can be wrapped in an outer query. The following example will select only the first 11 results.
SELECT /*+ parallel(48) */ * FROM (
SELECT /*+ parallel(48) */ MIN(vertex1.v) AS customer,
MIN(vertex2.v) AS item,
MIN(edges.vn) AS real,
SUM(l.v * r.v) AS predicted
FROM PHONESGE$ edges,
CUSTOMER_MAT l,
ITEM_MAT r,
PHONESVT$ vertex1,
PHONESVT$ vertex2
WHERE l.k = r.k
AND l.c = edges.svid(+)
AND r.p = edges.dvid(+)
AND l.c = vertex1.vid
AND r.p = vertex2.vid
GROUP BY l.c, r.p
ORDER BY l.c, r.p
) WHERE rownum <= 11;
This query may produce an output similar to the following.
CUSTOMER ITEM REAL PREDICTED ------------------------------------------------ Adam Apple 5 3.67375703 Adam Blackberry 3.66079652 Adam Danger 2.77049596 Adam Ericsson 4.21764858 Adam Figo 3.10631337 Adam Google 4 4.42429022 Adam Huawei 3 3.4289115 Ben Apple 2.82127589 Ben Blackberry 2 2.81132282 Ben Danger 3 2.12761307 Ben Ericsson 3 3.2389595
To get a prediction for a specific vertex (customer, item, or both) the query can be restricted with the desired ID values. For example, to get the predicted value of vertex 1 (customer) and vertex 105 (item), you can use the following query.
SELECT /*+ parallel(48) */ MIN(vertex1.v) AS customer,
MIN(vertex2.v) AS item,
MIN(edges.vn) AS real,
SUM(l.v * r.v) AS predicted
FROM PHONESGE$ edges,
CUSTOMER_MAT l,
ITEM_MAT r,
PHONESVT$ vertex1,
PHONESVT$ vertex2
WHERE l.k = r.k
AND l.c = edges.svid(+)
AND r.p = edges.dvid(+)
AND l.c = vertex1.vid
AND vertex1.vid = 1 /* Remove to get all predictions for item 105 */
AND r.p = vertex2.vid
AND vertex2.vid = 105 /* Remove to get all predictions for customer 1 */
/* Remove both lines to get all predictions */
GROUP BY l.c, r.p
ORDER BY l.c, r.p;
This query may produce an output similar to the following.
CUSTOMER ITEM REAL PREDICTED ------------------------------------------------ Adam Ericsson 4.21764858
Parent topic: SQL-Based Property Graph Analytics
4.7 Property Graph Query Language (PGQL)
PGQL is a SQL-like query language for property graph data structures that consist of nodes that are connected to other nodes by edges, each of which can have key-value pairs (properties) associated with them.
The language is based on the concept of graph pattern matching, which allows you to specify patterns that are matched against vertices and edges in a data graph.
The Oracle Spatial and Graph property graph support provides two ways to execute Property Graph Query Language (PGQL) queries through Java APIs:
-
Use the
oracle.pgx.api
Java package to query an in-memory snapshot of a graph that has been loaded into the in-memory analyst (PGX), as described in Using the In-Memory Analyst (PGX). -
Use the
oracle.pg.rdbms
Java package to directly query graph data stored in Oracle Database, as described in Executing PGQL Queries Directly Against Oracle Database.
- Topology Constraints with PGQL
- Constraints are Directional with PGQL
- Vertex and Edge Labels with PGQL
- Regular Path Queries with PGQL
- Aggregation and Sorting with PGQL
Related Topics
Parent topic: SQL-Based Property Graph Query and Analytics
4.7.1 Topology Constraints with PGQL
Pattern matching is done using topology constraints, which describe a pattern of connections between nodes in the graph, Value constraints (similar to their SQL equivalents) let you further constrain matches by specifying properties that those connections and nodes must have.
For example, assume a graph of TCP/IP connections on a computer network, and you want to detect cases where someone logged into one machine, from there into another, and from there into yet another. You would query for that pattern like this:
SELECT host1.id(), host2.id(), host3.id()
WHERE /* choose what to return */
(host1) -[c1 WITH toPort = 22 and opened = true]-> (host2) /* topology must match this pattern */
-[connection2 WITH toPort = 22 and opened = true]-> (host3),
connection1.bytes > 300, /* meaningful amount of data was exchanged */
connection2.bytes > 300,
connection1.start < connection2.start, /* second connection within time-frame of first */
connection2.start + connection2.duration < connection1.start + connection1.duration
GROUP BY host1.id(), host2.id(), host3.id() /* aggregate multiple matching connections */
ORDER BY DESC(connection1.when) /* reverse sort chronologically */
Parent topic: Property Graph Query Language (PGQL)
4.7.2 Constraints are Directional with PGQL
A topological constraint has a direction, as edges in graphs do. Thus, (a) <-[]- (b)
specifies a case where b has an edge pointing at a, whereas (a) -[]-> (b)
looks for an edge in the opposite direction.
The following example finds common friends of April and Chris who are older than both of them.
SELECT friend.name, friend.dob
WHERE /* note the arrow directions below */
(p1:person WITH name = 'April') -[:likes]-> (friend) <-[:likes]- (p2:person WITH name = 'Chris'),
friend.dob > p1.dob AND friend.dob > p2.dob
ORDER BY friend.dob DESC
Parent topic: Property Graph Query Language (PGQL)
4.7.3 Vertex and Edge Labels with PGQL
Labels are a way of attaching type information to edges and nodes in a graph, and can be used in constraints in graphs where not all nodes represent the same thing. For example:
SELECT p WHERE (p:person) -[e:likes]-> (m:movie WITH title='Star Wars'),
(p) -[e:likes]-> (m:movie WITH title='Avatar')
Parent topic: Property Graph Query Language (PGQL)
4.7.4 Regular Path Queries with PGQL
Regular path queries allow a pattern to be reused. The following example finds all of the common ancestors of Mario and Luigi.
PATH has_parent := () -[:has_father|has_mother]-> ()
SELECT ancestor.name
WHERE
(:Person WITH name = 'Mario') -/:has_parent*/-> (ancestor:Person),
(:Person WITH name = 'Luigi') -/:has_parent*/-> (ancestor)
The preceding path specification also shows the use of anonymous constraints, because there is no need to define names for intermediate edges or nodes that will not be used in additional constraints or query results. Anonymous elements can have constraints, such as [:has_father|has_mother]
: the edge does not get a variable name (because it will not be referenced elsewhere), but it is constrained.
Parent topic: Property Graph Query Language (PGQL)
4.7.5 Aggregation and Sorting with PGQL
Like SQL, PGQL has support for the following:
-
GROUP BY to create groups of solutions
-
MIN, MAX, SUM, and AVG aggregations
-
ORDER BY to sort results
And for many other familiar SQL constructs.
Parent topic: Property Graph Query Language (PGQL)
4.8 Executing PGQL Queries Directly Against Oracle Database
This topic explains how you can execute PGQL queries directly against the graph in Oracle Database (as opposed to in-memory).
Property Graph Query Language (PGQL) queries can be executed against disk-resident property graph data stored in Oracle Database. The graph data access layer (DAL) provides a Java API for executing PGQL queries. Logic in the DAL translates a submitted PGQL query into an equivalent SQL query, and the resulting SQL is executed on the database server. The DAL then wraps the SQL query results with a convenient PGQL result set API. This PGQL query execution flow is shown in the following figure.
The basic execution flow is:
-
The PGQL query is submitted to the DAL through a Java API.
-
The PGQL query is translated to SQL in the DAL.
-
The translated SQL is submitted to Oracle Database by JDBC.
-
The SQL result set is wrapped as a PGQL result set and returned to the caller.
The ability to execute PGQL queries directly against property graph data stored in Oracle Database provides several benefits.
-
PGQL provides a more natural way to express graph queries than SQL manually written to query schema tables, including VT$, GE$, and GT$.
-
PGQL queries can be executed without the need to load a snapshot of your graph data into PGX, so there is no need to worry about staleness of frequently updated graph data.
-
PGQL queries can be executed against graph data that is too large to fit in memory.
-
The robust and scalable Oracle SQL engine can be used to execute PGQL queries.
-
Mature tools for management, monitoring and tuning of Oracle Database can be used to tune and monitor PGQL queries.
- PGQL Features Supported
- Using the oracle.pg.rdbms Java Package to Execute PGQL Queries
- Performance Considerations for PGQL Queries
Parent topic: SQL-Based Property Graph Query and Analytics
4.8.1 PGQL Features Supported
PGQL is a SQL-like query language for querying property graph data. It is based on the concept of graph pattern matching and allows you to specify, among other things, topology constraints, paths, filters, sorting and aggregation.
The Java API for PGQL defined in the oracle.pg.rdbms
package supports the PGQL 1.1 specification with a few exceptions. The PGQL 1.1 specification can be found here (http://pgql-lang.org/spec/1.1/).
The following features of PGQL 1.1 are not supported.
- FROM clause – only a single graph can be queried at one time
- Undirected query edges
- TIME and TIME WITH TIME ZONE data types
- IS NULL and IS NOT NULL testing
The following enhancements to PGQL 1.1 are supported.
- CONTAINS Built-in Function
In addition, the following features of PGQL 1.1 require special consideration.
Parent topic: Executing PGQL Queries Directly Against Oracle Database
4.8.1.1 Temporal Types
The temporal types DATE, TIMESTAMP and TIMESTAMP WITH TIMEZONE are supported in PGQL queries.
All of these value types are represented internally using the Oracle SQL TIMESTAMP WITH TIME ZONE type. DATE values are automatically converted to TIMESTAMP WITH TIME ZONE by assuming the earliest time in UTC+0 timezone (for example, 2000-01-01 becomes 2000-01-01 00:00:00.00+00:00). TIMESTAMP values are automatically converted to TIMESTAMP WITH TIME ZONE by assuming UTC+0 timezone (for example, 2000-01-01 12:00:00.00 becomes 2000-01-01 12:00:00.00+00:00).
Temporal constants are written in PGQL queries as follows.
-
DATE 'YYYY-MM-DD'
-
TIMESTAMP 'YYYY-MM-DD HH24:MI:SS.FF'
-
TIMESTAMP WITH TIMEZONE 'YYYY-MM-DD HH24:MI:SS.FFTZH:TZM'
Some examples are DATE '2000-01-01', TIMESTAMP '2000-01-01 14:01:45.23', TIMESTAMP WITH TIMEZONE '2000-01-01 13:00:00.00-05:00', and TIMESTAMP WITH TIMEZONE '2000-01-01 13:00:00.00+01:00'.
In addition, temporal values can be obtained by casting string values to a temporal type. The supported string formats are:
-
DATE 'YYYY-MM-DD'
-
TIMESTAMP 'YYYY-MM-DD HH24:MI:SS.FF' and 'YYYY-MM-DD"T"HH24:MI:SS.FF'
-
TIMESTAMP WITH TIMEZONE 'YYYY-MM-DD HH24:MI:SS.FFTZH:TZM' and 'YYYY-MM-DD"T"HH24:MI:SS.FFTZH:TZM'.
Some examples are CAST ('2005-02-04' AS DATE), CAST ('1990-01-01 12:00:00.00' AS TIMESTAMP), CAST ('1985-01-01T14:05:05.00-08:00' AS TIMESTAMP WITH TIMEZONE).
When consuming results from a ResultSet
object, getObject
returns a java.sql.Timestamp
object for temporal types.
Bind variables can only be used for the TIMESTAMP WITH TIMEZONE temporal type in PGQL, and a setTimestamp
method that takes a java.sql.Timestamp
object as input is used to set the bind value. As a simpler alternative, you can use a string bind variable in a CAST statement to bind temporal values (for example, CAST (? AS TIMESTAMP WITH TIMEZONE)
followed by setString(1, "1985-01-01T14:05:05.00-08:00")
). See also Using Bind Variables in PGQL Queries for more information about bind variables.
Parent topic: PGQL Features Supported
4.8.1.2 Type Casting
Type casting is supported in PGQL with a SQL-style CAST (VALUE AS DATATYPE) syntax, for example CAST('25' AS INT), CAST (10 AS STRING), CAST ('2005-02-04' AS DATE), CAST(e.weight AS STRING). Supported casting operations are summarized in the following table. Y indicates that the conversion is supported, and N indicates that it is not supported. Casting operations on invalid values (for example, CAST('xyz' AS INT)) or unsupported conversions (for example, CAST (10 AS TIMESTAMP)) return NULL instead of raising a SQL exception.
Table 4-1 Type Casting Support in PGQL (From and To Types)
“to” type | from STRING | from INT | from LONG | from FLOAT | from DOUBLE | from BOOLEAN | from DATE | from TIMESTAMP | from TIMESTAMP WITH TIMEZONE |
---|---|---|---|---|---|---|---|---|---|
to STRING |
Y |
Y |
Y |
Y |
Y |
Y |
Y |
Y |
Y |
to INT |
Y |
Y |
Y |
Y |
Y |
Y |
N |
N |
N |
to LONG |
Y |
Y |
Y |
Y |
Y |
Y |
N |
N |
N |
to FLOAT |
Y |
Y |
Y |
Y |
Y |
Y |
N |
N |
N |
to DOUBLE |
Y |
Y |
Y |
Y |
Y |
Y |
N |
N |
N |
to BOOLEAN |
Y |
Y |
Y |
Y |
Y |
Y |
N |
N |
N |
to DATE |
Y |
N |
N |
N |
N |
N |
Y |
Y |
Y |
to TIMESTAMP |
Y |
N |
N |
N |
N |
N |
Y |
Y |
Y |
to TIMESTAMP WITH TIMEZONE |
Y |
N |
N |
N |
N |
N |
Y |
Y |
Y |
An example query that uses type casting is:
SELECT e.name, CAST (e.birthDate AS STRING) AS dob
WHERE (e), e.birthDate < CAST ('1980-01-01' AS DATE)
Parent topic: PGQL Features Supported
4.8.1.3 Missing Properties
When a vertex or edge is missing a property that is projected from a PGQL query, the result row containing that vertex or edge will be excluded from the query result rather than included with a NULL value.
For example, the following query:
SELECT v.name
MATCH (v)
Will exclude all vertices that are missing a name
property from the query result.
Parent topic: PGQL Features Supported
4.8.1.4 CONTAINS Built-in Function
A CONTAINS built-in function is supported. It is used in conjunction with an Oracle Text index on vertex and edge properties. CONTAINS returns true
if a value matches an Oracle Text search string and false
if it does not match.
An example query is:
SELECT v.name
WHERE (v), CONTAINS(v.abstract, 'Oracle')
See also Using a Text Index with PGQL Queries for more information about using full text indexes with PGQL.
Parent topic: PGQL Features Supported
4.8.2 Using the oracle.pg.rdbms Java Package to Execute PGQL Queries
Oracle Spatial and Graph property graph support provides a Java API in the oracle.pg.rdbms
package for executing PGQL queries against Oracle Database. This section explains how to use the Java API through a series of examples.
Note:
Effective with Release 19c, the oracle.pg.rdbms.OraclePgqlResultSet
interface is deprecated and will be removed in a future version. Instead, use the standardized interface oracle.pgql.lang.ResultSet
to retrieve values from a PGQL result set.
One difference between those two interfaces is that oracle.pgql.lang.ResultSet
does not provide APIs to retrieve vertex and edge objects. Existing code using those interfaces should be changed to project IDs rather than OracleVertex
and OracleEdge
objects. You can obtain an OracleVertex
or OracleEdge
object from the projected ID values by calling OracleVertex.getInstance()
or OracleEdge.getInstance()
. (For an example, see Example 4-37.)
To use PGQL query capabilities, import the following classes into your Java program:
import oracle.pg.rdbms.*;
import oracle.pgql.lang.PgqlException;
import oracle.pgql.lang.ResultSet;
The following test_graph
data set in Oracle flat file format will be used in the examples in subtopics that follow. The data set includes a vertex file (test_graph.opv
) and an edge file (test_graph.ope
)
test_graph.opv
:
2,fname,1,Ray,, 2,lname,1,Green,, 2,mval,5,,,1985-01-01T12:00:00.000Z 2,age,2,,41, 0,bval,6,Y,, 0,fname,1,Bill,, 0,lname,1,Brown,, 0,mval,1,y,, 0,age,2,,40, 1,bval,6,Y,, 1,fname,1,John,, 1,lname,1,Black,, 1,mval,2,,27, 1,age,2,,30, 3,bval,6,N,, 3,fname,1,Susan,, 3,lname,1,Blue,, 3,mval,6,N,, 3,age,2,,35,
test_graph.ope
:
4,0,1,knows,mval,1,Y,, 4,0,1,knows,firstMetIn,1,MI,, 4,0,1,knows,since,5,,,1990-01-01T12:00:00.000Z 16,0,1,friendOf,strength,2,,6, 7,1,0,knows,mval,5,,,2003-01-01T12:00:00.000Z 7,1,0,knows,firstMetIn,1,GA,, 7,1,0,knows,since,5,,,2000-01-01T12:00:00.000Z 17,1,0,friendOf,strength,2,,7, 9,1,3,knows,mval,6,N,, 9,1,3,knows,firstMetIn,1,SC,, 9,1,3,knows,since,5,,,2005-01-01T12:00:00.000Z 10,2,0,knows,mval,1,N,, 10,2,0,knows,firstMetIn,1,TX,, 10,2,0,knows,since,5,,,1997-01-01T12:00:00.000Z 12,2,3,knows,mval,3,,342.5, 12,2,3,knows,firstMetIn,1,TX,, 12,2,3,knows,since,5,,,2011-01-01T12:00:00.000Z 19,2,3,friendOf,strength,2,,4, 14,3,1,knows,mval,1,a,, 14,3,1,knows,firstMetIn,1,CA,, 14,3,1,knows,since,5,,,2010-01-01T12:00:00.000Z 15,3,2,knows,mval,1,z,, 15,3,2,knows,firstMetIn,1,CA,, 15,3,2,knows,since,5,,,2004-01-01T12:00:00.000Z 5,0,2,knows,mval,2,,23, 5,0,2,knows,firstMetIn,1,OH,, 5,0,2,knows,since,5,,,2002-01-01T12:00:00.000Z 6,0,3,knows,mval,3,,159.7, 6,0,3,knows,firstMetIn,1,IN,, 6,0,3,knows,since,5,,,1994-01-01T12:00:00.000Z 8,1,2,knows,mval,6,Y,, 8,1,2,knows,firstMetIn,1,FL,, 8,1,2,knows,since,5,,,1999-01-01T12:00:00.000Z 18,1,3,friendOf,strength,2,,5, 11,2,1,knows,mval,2,,1001, 11,2,1,knows,firstMetIn,1,OK,, 11,2,1,knows,since,5,,,2003-01-01T12:00:00.000Z 13,3,0,knows,mval,5,,,2001-01-01T12:00:00.000Z 13,3,0,knows,firstMetIn,1,CA,, 13,3,0,knows,since,5,,,2006-01-01T12:00:00.000Z 20,3,1,friendOf,strength,2,,3,
- Basic Query Execution
- Using Bind Variables in PGQL Queries
- Using a Text Index with PGQL Queries
- Obtaining the SQL Translation for a PGQL Query
- Additional Options for PGQL Translation and Execution
- Querying Another User’s Property Graph
- Using Query Optimizer Hints with PGQL
Parent topic: Executing PGQL Queries Directly Against Oracle Database
4.8.2.1 Basic Query Execution
Two main Java Interfaces, OraclePgqlStatement
and oracle.pgql.lang.ResultSet
, are used for PGQL execution. This topic includes several examples of basic query execution.
Example 4-24 PgqlExample1.java
PgqlExample1.java
shows an example of executing a PGQL query and iterating through the query result. OraclePgqlExecutionFactory
is used to obtain an OraclePgqlStatement
from an OraclePropertyGraph
object. Next, it calls the executeQuery
method of OraclePgqlStatement
, which returns an oracle.pgql.lang.ResultSet
object. ResultSet
provides an interface for consuming the query result that is very similar to the java.sql.ResultSet
interface.
oracle.pgql.lang.ResultSet
provides a next()
method for moving through the query result and provides a close()
method to release resources after the application is through reading the query result. In addition, ResultSet
provides getters for String, Integer, Long, Float, Double, Boolean, LocalDateTime, OffsetDateTime, and it provides a generic getObject()
method for values of any type.
The ResultSet
and OraclePgqlStatement
objects should be closed after consuming the query result.
import oracle.pg.rdbms.*;
import oracle.pg.common.*;
import oracle.pgql.lang.PgqlException;
import oracle.pgql.lang.ResultSet;
import java.util.*;
import java.text.*;
/**
* This example shows how to execute a basic PGQL query against disk-resident
* PG data stored in Oracle Database and iterate through the result.
*/
public class PgqlExample1
{
public static void main(String[] szArgs) throws Exception
{
int iArgIdx=0;
String szHost = szArgs[iArgIdx++];
String szPort = szArgs[iArgIdx++];
String szSID = szArgs[iArgIdx++];
String szUser = szArgs[iArgIdx++];
String szPassword = szArgs[iArgIdx++];
String szGraph = szArgs[iArgIdx++];
String szVertexFile = szArgs[iArgIdx++];
String szEdgeFile = szArgs[iArgIdx++];
Oracle oracle = null;
OraclePropertyGraph opg = null;
OraclePgqlStatement ops = null;
ResultSet rs = null;
try {
// Create a connection to Oracle
oracle =
new Oracle("jdbc:oracle:thin:@"+szHost+":"+szPort +":"+szSID, szUser, szPassword);
// Create a property graph
opg = OraclePropertyGraph.getInstance(oracle, szGraph);
// Clear any existing data
opg.clearRepository();
// Load data
OraclePropertyGraphDataLoader opgLoader = OraclePropertyGraphDataLoader.getInstance();
opgLoader.loadData(opg, szVertexFile, szEdgeFile, 1);
// Create an OraclePgqlStatement
ops = OraclePgqlExecutionFactory.createStatement(opg);
// Execute query to get a ResultSet object
String pgql =
"SELECT v.fname AS fname, v.lname AS lname, v.mval AS mval "+
"MATCH (v)";
rs = ops.executeQuery(pgql, /* query string */
"" /* options */);
// Consume the result set using the oracle.pgql.lang.ResultSet interface
SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX");
while (rs.next()) {
// get fname
String fname = rs.getString(1);
// get lname
String lname = rs.getString(2);
// get mval
Object mval = rs.getObject(3);
String mStr = "";
if (mval instanceof java.lang.String) {
mStr = "STRING: "+mval.toString();
}
else if (mval instanceof java.lang.Integer) {
mStr = "INTEGER: "+mval.toString();
}
else if (mval instanceof java.lang.Long) {
mStr = "LONG: "+mval.toString();
}
else if (mval instanceof java.lang.Float) {
mStr = "FLOAT: "+mval.toString();
}
else if (mval instanceof java.lang.Double) {
mStr = "DOUBLE: "+mval.toString();
}
else if (mval instanceof java.sql.Timestamp) {
mStr = "DATE: "+sdf.format((Date)mval);
}
else if (mval instanceof java.lang.Boolean) {
mStr = "BOOLEAN: "+mval.toString();
}
System.out.println("[fname = "+fname+", lname = "+lname+", mval = "+mStr+"]");
}
}
finally {
// close the result set
if (rs != null) {
rs.close();
}
// close the statement
if (ops != null) {
ops.close();
}
// close the property graph
if (opg != null) {
opg.shutdown();
}
// close oracle
if (oracle != null) {
oracle.dispose();
}
}
}
}
PgqlExample1.java
gives the following output for test_graph
.
[fname = Susan, lname = Blue, mval = BOOLEAN: false] [fname = Bill, lname = Brown, mval = STRING: y] [fname = Ray, lname = Green, mval = DATE: 1985-01-01T04:00:00.000-08:00] [fname = John, lname = Black, mval = INTEGER: 27]
Example 4-25 PgqlExample2.java
PgqlExample2.java
shows a PGQL query with a temporal filter on an edge property.
import oracle.pg.rdbms.*;
import oracle.pg.common.*;
import oracle.pgql.lang.PgqlException;
import oracle.pgql.lang.ResultSet;
import java.util.*;
import java.text.*;
/**
* This example shows how to execute a PGQL query with a temporal edge
* property filter against disk-resident PG data stored in Oracle Database
* and iterate through the result.
*/
public class PgqlExample2
{
public static void main(String[] szArgs) throws Exception
{
int iArgIdx=0;
String szHost = szArgs[iArgIdx++];
String szPort = szArgs[iArgIdx++];
String szSID = szArgs[iArgIdx++];
String szUser = szArgs[iArgIdx++];
String szPassword = szArgs[iArgIdx++];
String szGraph = szArgs[iArgIdx++];
String szVertexFile = szArgs[iArgIdx++];
String szEdgeFile = szArgs[iArgIdx++];
Oracle oracle = null;
OraclePropertyGraph opg = null;
OraclePgqlStatement ops = null;
ResultSet rs = null;
try {
// Create a connection to Oracle
oracle =
new Oracle("jdbc:oracle:thin:@"+szHost+":"+szPort +":"+szSID, szUser, szPassword);
// Create a property graph
opg = OraclePropertyGraph.getInstance(oracle, szGraph);
// Clear any existing data
opg.clearRepository();
// Load data
OraclePropertyGraphDataLoader opgLoader = OraclePropertyGraphDataLoader.getInstance();
opgLoader.loadData(opg, szVertexFile, szEdgeFile, 1);
// Create an OraclePgqlStatement
ops = OraclePgqlExecutionFactory.createStatement(opg);
// Execute query to get a ResultSet object
String pgql =
"SELECT v.fname AS n1, v2.fname AS n2, e.firstMetIn AS loc "+
"MATCH (v)-[e:knows]->(v2) "+
"WHERE e.since > TIMESTAMP '2000-01-01 00:00:00.00+00:00'";
rs = ops.executeQuery(pgql, "");
// Print results
printResults(rs, 3);
}
finally {
// close the result set
if (rs != null) {
rs.close();
}
// close the statement
if (ops != null) {
ops.close();
}
// close the property graph
if (opg != null) {
opg.shutdown();
}
// close oracle
if (oracle != null) {
oracle.dispose();
}
}
}
/**
* Prints an PGQL ResultSet
*/
static void printResults(ResultSet rs, int numCols) throws Exception
{
StringBuffer buff = new StringBuffer("");
SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX");
while (rs.next()) {
buff.append("[");
for (int i = 1; i <= numCols; i++) {
// use generic getObject to handle all types
Object mval = rs.getObject(i);
String mStr = "";
if (mval instanceof java.lang.String) {
mStr = "STRING: "+mval.toString();
}
else if (mval instanceof java.lang.Integer) {
mStr = "INTEGER: "+mval.toString();
}
else if (mval instanceof java.lang.Long) {
mStr = "LONG: "+mval.toString();
}
else if (mval instanceof java.lang.Float) {
mStr = "FLOAT: "+mval.toString();
}
else if (mval instanceof java.lang.Double) {
mStr = "DOUBLE: "+mval.toString();
}
else if (mval instanceof java.sql.Timestamp) {
mStr = "DATE: "+sdf.format((Date)mval);
}
else if (mval instanceof java.lang.Boolean) {
mStr = "BOOLEAN: "+mval.toString();
}
if (i > 1) {
buff.append(",\t");
}
buff.append(mStr);
}
buff.append("]\n");
}
System.out.println(buff.toString());
}
}
PgqlExample2.java
gives the following output for test_graph
.
[STRING: Ray, STRING: John, STRING: OK] [STRING: Bill, STRING: Ray, STRING: OH] [STRING: Susan, STRING: Bill, STRING: CA] [STRING: John, STRING: Susan, STRING: SC] [STRING: Ray, STRING: Susan, STRING: TX] [STRING: John, STRING: Bill, STRING: GA] [STRING: Susan, STRING: John, STRING: CA] [STRING: Susan, STRING: Ray, STRING: CA]
Example 4-26 PgqlExample3.java
PgqlExample3.java
shows a PGQL query with grouping and aggregation.
import oracle.pg.rdbms.*;
import oracle.pg.common.*;
import oracle.pgql.lang.PgqlException;
import oracle.pgql.lang.ResultSet;
import java.util.*;
import java.text.*;
/**
* This example shows how to execute a PGQL query with aggregation
* against disk-resident PG data stored in Oracle Database and iterate
* through the result.
*/
public class PgqlExample3
{
public static void main(String[] szArgs) throws Exception
{
int iArgIdx=0;
String szHost = szArgs[iArgIdx++];
String szPort = szArgs[iArgIdx++];
String szSID = szArgs[iArgIdx++];
String szUser = szArgs[iArgIdx++];
String szPassword = szArgs[iArgIdx++];
String szGraph = szArgs[iArgIdx++];
String szVertexFile = szArgs[iArgIdx++];
String szEdgeFile = szArgs[iArgIdx++];
Oracle oracle = null;
OraclePropertyGraph opg = null;
OraclePgqlStatement ops = null;
ResultSet rs = null;
try {
// Create a connection to Oracle
oracle =
new Oracle("jdbc:oracle:thin:@"+szHost+":"+szPort +":"+szSID, szUser, szPassword);
// Create a property graph
opg = OraclePropertyGraph.getInstance(oracle, szGraph);
// Clear any existing data
opg.clearRepository();
// Load data
OraclePropertyGraphDataLoader opgLoader = OraclePropertyGraphDataLoader.getInstance();
opgLoader.loadData(opg, szVertexFile, szEdgeFile, 1);
// Create an OraclePgqlStatement
ops = OraclePgqlExecutionFactory.createStatement(opg);
// Execute query to get a ResultSet object
String pgql =
"SELECT v.fname AS fname, COUNT(v2) AS friendCnt "+
"MATCH (v)-[e:friendOf]->(v2) "+
"GROUP BY v "+
"ORDER BY friendCnt DESC";
rs = ops.executeQuery(pgql, "");
// Print results
printResults(rs, 2);
}
finally {
// close the result set
if (rs != null) {
rs.close();
}
// close the statement
if (ops != null) {
ops.close();
}
// close the property graph
if (opg != null) {
opg.shutdown();
}
// close oracle
if (oracle != null) {
oracle.dispose();
}
}
}
/**
* Prints an PGQL ResultSet
*/
static void printResults(ResultSet rs, int numCols) throws Exception
{
StringBuffer buff = new StringBuffer("");
SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX");
while (rs.next()) {
buff.append("[");
for (int i = 1; i <= numCols; i++) {
// use generic getObject to handle all types
Object mval = rs.getObject(i);
String mStr = "";
if (mval instanceof java.lang.String) {
mStr = "STRING: "+mval.toString();
}
else if (mval instanceof java.lang.Integer) {
mStr = "INTEGER: "+mval.toString();
}
else if (mval instanceof java.lang.Long) {
mStr = "LONG: "+mval.toString();
}
else if (mval instanceof java.lang.Float) {
mStr = "FLOAT: "+mval.toString();
}
else if (mval instanceof java.lang.Double) {
mStr = "DOUBLE: "+mval.toString();
}
else if (mval instanceof java.sql.Timestamp) {
mStr = "DATE: "+sdf.format((Date)mval);
}
else if (mval instanceof java.lang.Boolean) {
mStr = "BOOLEAN: "+mval.toString();
}
if (i > 1) {
buff.append(",\t");
}
buff.append(mStr);
}
buff.append("]\n");
}
System.out.println(buff.toString());
}
}
PgqlExample3.java
gives the following output for test_graph
.
[STRING: John, LONG: 2] [STRING: Bill, LONG: 1] [STRING: Ray, LONG: 1] [STRING: Susan, LONG: 1]
Example 4-27 PgqlExample4.java
PgqlExample4.java
shows a PGQL path query.
import oracle.pg.rdbms.*;
import oracle.pg.common.*;
import oracle.pgql.lang.PgqlException;
import oracle.pgql.lang.ResultSet;
import java.util.*;
import java.text.*;
/**
* This example shows how to execute a path query in PGQL against
* disk-resident PG data stored in Oracle Database and iterate
* through the result.
*/
public class PgqlExample4
{
public static void main(String[] szArgs) throws Exception
{
int iArgIdx=0;
String szHost = szArgs[iArgIdx++];
String szPort = szArgs[iArgIdx++];
String szSID = szArgs[iArgIdx++];
String szUser = szArgs[iArgIdx++];
String szPassword = szArgs[iArgIdx++];
String szGraph = szArgs[iArgIdx++];
String szVertexFile = szArgs[iArgIdx++];
String szEdgeFile = szArgs[iArgIdx++];
Oracle oracle = null;
OraclePropertyGraph opg = null;
OraclePgqlStatement ops = null;
ResultSet rs = null;
try {
// Create a connection to Oracle
oracle =
new Oracle("jdbc:oracle:thin:@"+szHost+":"+szPort +":"+szSID, szUser, szPassword);
// Create a property graph
opg = OraclePropertyGraph.getInstance(oracle, szGraph);
// Clear any existing data
opg.clearRepository();
// Load data
OraclePropertyGraphDataLoader opgLoader = OraclePropertyGraphDataLoader.getInstance();
opgLoader.loadData(opg, szVertexFile, szEdgeFile, 1);
// Create an OraclePgqlStatement
ops = OraclePgqlExecutionFactory.createStatement(opg);
// Execute query to get a ResultSet object
String pgql =
"PATH fof AS ()-[:friendOf|knows]->() "+
"SELECT v2.fname AS friend "+
"MATCH (v)-/:fof*/->(v2) "+
"WHERE v.fname = 'John' AND v != v2";
rs = ops.executeQuery(pgql, "");
// Print results
printResults(rs, 1);
}
finally {
// close the result set
if (rs != null) {
rs.close();
}
// close the statement
if (ops != null) {
ops.close();
}
// close the property graph
if (opg != null) {
opg.shutdown();
}
// close oracle
if (oracle != null) {
oracle.dispose();
}
}
}
/**
* Prints an PGQL ResultSet
*/
static void printResults(ResultSet rs, int numCols) throws Exception
{
StringBuffer buff = new StringBuffer("");
SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX");
while (rs.next()) {
buff.append("[");
for (int i = 1; i <= numCols; i++) {
// use generic getObject to handle all types
Object mval = rs.getObject(i);
String mStr = "";
if (mval instanceof java.lang.String) {
mStr = "STRING: "+mval.toString();
}
else if (mval instanceof java.lang.Integer) {
mStr = "INTEGER: "+mval.toString();
}
else if (mval instanceof java.lang.Long) {
mStr = "LONG: "+mval.toString();
}
else if (mval instanceof java.lang.Float) {
mStr = "FLOAT: "+mval.toString();
}
else if (mval instanceof java.lang.Double) {
mStr = "DOUBLE: "+mval.toString();
}
else if (mval instanceof java.sql.Timestamp) {
mStr = "DATE: "+sdf.format((Date)mval);
}
else if (mval instanceof java.lang.Boolean) {
mStr = "BOOLEAN: "+mval.toString();
}
if (i > 1) {
buff.append(",\t");
}
buff.append(mStr);
}
buff.append("]\n");
}
System.out.println(buff.toString());
}
}
PgqlExample4.java
gives the following output for test_graph
.
[[STRING: Susan] [STRING: Bill] [STRING: Ray]
4.8.2.2 Using Bind Variables in PGQL Queries
Bind variables can be used in PGQL queries for better performance and increased security. Constant scalar values in PGQL queries can be replaced with bind variables. Bind variables are denoted by a '?' (question mark). Consider the following two queries that select people who are older than a constant age value.
// people older than 30
SELECT v.fname AS fname, v.lname AS lname, v.age AS age
MATCH (v)
WHERE v.age > 30
// people older than 40
SELECT v.fname AS fname, v.lname AS lname, v.age AS age
MATCH (v)
WHERE v.age > 40
The SQL translations for these queries would use the constants 30 and 40 in a similar way for the age filter. The database would perform a hard parse for each of these queries. This hard parse time can often exceed the execution time for simple queries.
You could replace the constant in each query with a bind variable as follows.
SELECT v.fname AS fname, v.lname AS lname, v.age AS age
MATCH (v)
WHERE v.age > ?
This will allow the SQL engine to create a generic cursor for this query, which can be reused for different age values. As a result, a hard parse is no longer required to execute this query for different age values, and the parse time for each query will be drastically reduced.
In addition, applications that use bind variables in PGQL queries are less vulnerable to injection attacks than those that use string concatenation to embed constant values in PGQL queries.
See also Oracle Database SQL Tuning Guide for more information on cursor sharing and bind variables.
The OraclePgqlPreparedStatement
interface can be used to execute queries with bind variables as shown in PgqlExample5.java
. OraclePgqlPreparedStatement
provides several set methods for different value types that can be used to set values for query execution.
There are a few limitations with bind variables in PGQL. Bind variables can only be used for constant property values. That is, vertices and edges cannot be replaced with bind variables. Also, once a particular bind variable has been set to a type, it cannot be set to a different type. For example, if setInt(1, 30)
is executed for an OraclePgqlPreparedStatement
, you cannot call setString(1, "abc")
on that same OraclePgqlPreparedStatement
.
Example 4-28 PgqlExample5.java
PgqlExample5.java
shows how to use bind variables with a PGQL query.
import oracle.pg.rdbms.*;
import oracle.pg.common.*;
import oracle.pgql.lang.PgqlException;
import oracle.pgql.lang.ResultSet;
import java.util.*;
import java.text.*;
/**
* This example shows how to use bind variables with a PGQL query.
*/
public class PgqlExample5
{
public static void main(String[] szArgs) throws Exception
{
int iArgIdx=0;
String szHost = szArgs[iArgIdx++];
String szPort = szArgs[iArgIdx++];
String szSID = szArgs[iArgIdx++];
String szUser = szArgs[iArgIdx++];
String szPassword = szArgs[iArgIdx++];
String szGraph = szArgs[iArgIdx++];
String szVertexFile = szArgs[iArgIdx++];
String szEdgeFile = szArgs[iArgIdx++];
Oracle oracle = null;
OraclePropertyGraph opg = null;
OraclePgqlPreparedStatement opps = null;
ResultSet rs = null;
try {
// Create a connection to Oracle
oracle =
new Oracle("jdbc:oracle:thin:@"+szHost+":"+szPort +":"+szSID, szUser, szPassword);
// Create a property graph
opg = OraclePropertyGraph.getInstance(oracle, szGraph);
// Clear any existing data
opg.clearRepository();
// Load data
OraclePropertyGraphDataLoader opgLoader = OraclePropertyGraphDataLoader.getInstance();
opgLoader.loadData(opg, szVertexFile, szEdgeFile, 1);
// Query string with a bind variable (denoted by ?)
String pgql =
"SELECT v.fname AS fname, v.lname AS lname, v.age AS age "+
"MATCH (v) "+
"WHERE v.age > ?";
// Create an OraclePgqlPreparedStatement
opps = OraclePgqlExecutionFactory.prepareStatement(opg, pgql);
// Set filter value to 30
opps.setInt(1, 30);
// execute query
rs = opps.executeQuery("");
// Print query results
System.out.println("-- Values for v.age > 30 ------------------");
printResults(rs, 3);
// close result set
rs.close();
// set filter value to 40
opps.setInt(1, 40);
// execute query
rs = opps.executeQuery("");
// Print query results
System.out.println("-- Values for v.age > 40 ------------------");
printResults(rs, 3);
// close result set
rs.close();
}
finally {
// close the result set
if (rs != null) {
rs.close();
}
// close the statement
if (opps != null) {
opps.close();
}
// close the property graph
if (opg != null) {
opg.shutdown();
}
// close oracle
if (oracle != null) {
oracle.dispose();
}
}
}
/**
* Prints an PGQL ResultSet
*/
static void printResults(ResultSet rs, int numCols) throws Exception
{
StringBuffer buff = new StringBuffer("");
SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX");
while (rs.next()) {
buff.append("[");
for (int i = 1; i <= numCols; i++) {
// use generic getObject to handle all types
Object mval = rs.getObject(i);
String mStr = "";
if (mval instanceof java.lang.String) {
mStr = "STRING: "+mval.toString();
}
else if (mval instanceof java.lang.Integer) {
mStr = "INTEGER: "+mval.toString();
}
else if (mval instanceof java.lang.Long) {
mStr = "LONG: "+mval.toString();
}
else if (mval instanceof java.lang.Float) {
mStr = "FLOAT: "+mval.toString();
}
else if (mval instanceof java.lang.Double) {
mStr = "DOUBLE: "+mval.toString();
}
else if (mval instanceof java.sql.Timestamp) {
mStr = "DATE: "+sdf.format((Date)mval);
}
else if (mval instanceof java.lang.Boolean) {
mStr = "BOOLEAN: "+mval.toString();
}
if (i > 1) {
buff.append(",\t");
}
buff.append(mStr);
}
buff.append("]\n");
}
System.out.println(buff.toString());
}
}
PgqlExample5.java
has the following output for test_graph
.
-- Values for v.age > 30 ------------------ [STRING: Susan, STRING: Blue, INTEGER: 35] [STRING: Bill, STRING: Brown, INTEGER: 40] [STRING: Ray, STRING: Green, INTEGER: 41] -- Values for v.age > 40 ------------------ [STRING: Ray, STRING: Green, INTEGER: 41]
Example 4-29 PgqlExample6.java
PgqlExample6.java
shows a query with two bind variables: one String variable and one Timestamp variable.
import oracle.pg.rdbms.*;
import oracle.pg.common.*;
import oracle.pgql.lang.PgqlException;
import oracle.pgql.lang.ResultSet;
import java.util.*;
import java.text.*;
import java.sql.*;
import java.time.*;
import java.time.format.*;
/**
* This example shows how to use multiple bind variables with a PGQL query.
*/
public class PgqlExample6
{
public static void main(String[] szArgs) throws Exception
{
int iArgIdx=0;
String szHost = szArgs[iArgIdx++];
String szPort = szArgs[iArgIdx++];
String szSID = szArgs[iArgIdx++];
String szUser = szArgs[iArgIdx++];
String szPassword = szArgs[iArgIdx++];
String szGraph = szArgs[iArgIdx++];
String szVertexFile = szArgs[iArgIdx++];
String szEdgeFile = szArgs[iArgIdx++];
Oracle oracle = null;
OraclePropertyGraph opg = null;
OraclePgqlPreparedStatement opps = null;
ResultSet rs = null;
try {
// Create a connection to Oracle
oracle =
new Oracle("jdbc:oracle:thin:@"+szHost+":"+szPort +":"+szSID, szUser, szPassword);
// Create a property graph
opg = OraclePropertyGraph.getInstance(oracle, szGraph);
// Clear any existing data
opg.clearRepository();
// Load data
OraclePropertyGraphDataLoader opgLoader = OraclePropertyGraphDataLoader.getInstance();
opgLoader.loadData(opg, szVertexFile, szEdgeFile, 1);
// Query string with multiple bind variables
String pgql =
"SELECT v1.fname AS fname1, v2.fname AS fname2 "+
"MATCH (v1)-[e:knows]->(v2) "+
"WHERE e.since < ? AND e.firstMetIn = ?";
// Create an OraclePgqlPreparedStatement
opps = OraclePgqlExecutionFactory.prepareStatement(opg, pgql);
// Set e.since < 2006-01-01T12:00:00.00Z
Timestamp t = Timestamp.valueOf(OffsetDateTime.parse("2006-01-01T12:00:01.00Z").atZoneSameInstant(ZoneOffset.UTC).toLocalDateTime());
opps.setTimestamp(1, t);
// Set e.firstMetIn = 'CA'
opps.setString(2, "CA");
// execute query
rs = opps.executeQuery("");
// Print query results
System.out.println("-- Values for e.since < 2006-01-01T12:00:01.00Z AND e.firstMetIn = 'CA' --");
printResults(rs, 2);
// close result set
rs.close();
// Set e.since < 2000-01-01T12:00:00.00Z
t = Timestamp.valueOf(OffsetDateTime.parse("2000-01-01T12:00:00.00Z").atZoneSameInstant(ZoneOffset.UTC).toLocalDateTime());
opps.setTimestamp(1, t);
// Set e.firstMetIn = 'TX'
opps.setString(2, "TX");
// execute query
rs = opps.executeQuery("");
// Print query results
System.out.println("-- Values for e.since < 2000-01-01T12:00:00.00Z AND e.firstMetIn = 'TX' --");
printResults(rs, 2);
// close result set
rs.close();
}
finally {
// close the result set
if (rs != null) {
rs.close();
}
// close the statement
if (opps != null) {
opps.close();
}
// close the property graph
if (opg != null) {
opg.shutdown();
}
// close oracle
if (oracle != null) {
oracle.dispose();
}
}
}
/**
* Prints an PGQL ResultSet
*/
static void printResults(ResultSet rs, int numCols) throws Exception
{
StringBuffer buff = new StringBuffer("");
SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX");
while (rs.next()) {
buff.append("[");
for (int i = 1; i <= numCols; i++) {
// use generic getObject to handle all types
Object mval = rs.getObject(i);
String mStr = "";
if (mval instanceof java.lang.String) {
mStr = "STRING: "+mval.toString();
}
else if (mval instanceof java.lang.Integer) {
mStr = "INTEGER: "+mval.toString();
}
else if (mval instanceof java.lang.Long) {
mStr = "LONG: "+mval.toString();
}
else if (mval instanceof java.lang.Float) {
mStr = "FLOAT: "+mval.toString();
}
else if (mval instanceof java.lang.Double) {
mStr = "DOUBLE: "+mval.toString();
}
else if (mval instanceof java.sql.Timestamp) {
mStr = "DATE: "+sdf.format((java.util.Date)mval);
}
else if (mval instanceof java.lang.Boolean) {
mStr = "BOOLEAN: "+mval.toString();
}
if (i > 1) {
buff.append(",\t");
}
buff.append(mStr);
}
buff.append("]\n");
}
System.out.println(buff.toString());
}
}
PgqlExample6.java
gives the following output for test_graph
.
-- Values for e.since < 2006-01-01T12:00:01.00Z AND e.firstMetIn = 'CA' -- [STRING: Susan, STRING: Bill] [STRING: Susan, STRING: Ray] -- Values for e.since < 2000-01-01T12:00:00.00Z AND e.firstMetIn = 'TX' -- [STRING: Ray, STRING: Bill]
4.8.2.3 Using a Text Index with PGQL Queries
PGQL queries executed against Oracle Database can use Oracle Text indexes created for vertex and edge properties. After creating a text index, you can use the CONTAINS operator to perform a full text search. CONTAINS has two arguments: a vertex or edge property, and an Oracle Text search string. Any valid Oracle Text search string can be used, including advanced features such as wildcards, stemming and soundex.
Example 4-30 PgqlExample7.java
PgqlExample7.java
shows how to execute a CONTAINS query.
import oracle.pg.rdbms.*;
import oracle.pg.common.*;
import oracle.pgql.lang.PgqlException;
import oracle.pgql.lang.ResultSet;
import java.util.*;
import java.text.*;
import java.sql.*;
import java.time.*;
import java.time.format.*;
/**
* This example shows how to use an Oracle Text index with a PGQL query.
*/
public class PgqlExample7
{
public static void main(String[] szArgs) throws Exception
{
int iArgIdx=0;
String szHost = szArgs[iArgIdx++];
String szPort = szArgs[iArgIdx++];
String szSID = szArgs[iArgIdx++];
String szUser = szArgs[iArgIdx++];
String szPassword = szArgs[iArgIdx++];
String szGraph = szArgs[iArgIdx++];
String szVertexFile = szArgs[iArgIdx++];
String szEdgeFile = szArgs[iArgIdx++];
Oracle oracle = null;
OraclePropertyGraph opg = null;
OraclePgqlPreparedStatement opps = null;
ResultSet rs = null;
try {
// Create a connection to Oracle
oracle =
new Oracle("jdbc:oracle:thin:@"+szHost+":"+szPort +":"+szSID, szUser, szPassword);
// Create a property graph
// try to drop property graph first
try {
OraclePropertyGraphUtils.dropPropertyGraph(oracle, szGraph);
}
catch (SQLException ex) { /*do nothing*/; }
// Get property graph instance
opg = OraclePropertyGraph.getInstance(oracle, szGraph);
// Load data
OraclePropertyGraphDataLoader opgLoader = OraclePropertyGraphDataLoader.getInstance();
opgLoader.loadData(opg, szVertexFile, szEdgeFile, 1);
// Create text index with SQL API
CallableStatement cs = null;
// text index on vertices
cs = oracle.getConnection().prepareCall(
"begin opg_apis.create_vertices_text_idx(:1,:2); end;"
);
cs.setString(1,szUser);
cs.setString(2,szGraph);
cs.execute();
cs.close();
// text index on edges
cs = oracle.getConnection().prepareCall(
"begin opg_apis.create_edges_text_idx(:1,:2); end;"
);
cs.setString(1,szUser);
cs.setString(2,szGraph);
cs.execute();
cs.close();
// Query using CONTAINS text search operator on vertex property
// Find all vertices with an lname property value that starts with 'B'
String pgql =
"SELECT v.fname AS fname, v.lname AS lname "+
"MATCH (v) "+
"WHERE CONTAINS(v.lname,'B%')";
// Create an OraclePgqlStatement
opps = OraclePgqlExecutionFactory.prepareStatement(opg, pgql);
// execute query
rs = opps.executeQuery("");
// print results
System.out.println("-- Vertex Property Query ---------------");
printResults(rs, 2);
// close result set and prepared statement
rs.close();
opps.close();
// Query using CONTAINS text search operator on edge property
// Find all knows edges with a firstMetIn property value that ends with 'A'
pgql =
"SELECT v1.fname AS fname1, v2.fname AS fname2, e.firstMetIn AS loc "+
"MATCH (v1)-[e:knows]->(v2) "+
"WHERE CONTAINS(e.firstMetIn,'%A')";
// Create an OraclePgqlStatement
opps = OraclePgqlExecutionFactory.prepareStatement(opg, pgql);
// execute query
rs = opps.executeQuery("");
// print results
System.out.println("-- Edge Property Query -----------------");
printResults(rs, 3);
// close result set and statement
rs.close();
opps.close();
}
finally {
// close the result set
if (rs != null) {
rs.close();
}
// close the statement
if (opps != null) {
opps.close();
}
// close the property graph
if (opg != null) {
opg.shutdown();
}
// close oracle
if (oracle != null) {
oracle.dispose();
}
}
}
/**
* Prints an PGQL ResultSet
*/
static void printResults(ResultSet rs, int numCols) throws Exception
{
StringBuffer buff = new StringBuffer("");
SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX");
while (rs.next()) {
buff.append("[");
for (int i = 1; i <= numCols; i++) {
// use generic getObject to handle all types
Object mval = rs.getObject(i);
String mStr = "";
if (mval instanceof java.lang.String) {
mStr = "STRING: "+mval.toString();
}
else if (mval instanceof java.lang.Integer) {
mStr = "INTEGER: "+mval.toString();
}
else if (mval instanceof java.lang.Long) {
mStr = "LONG: "+mval.toString();
}
else if (mval instanceof java.lang.Float) {
mStr = "FLOAT: "+mval.toString();
}
else if (mval instanceof java.lang.Double) {
mStr = "DOUBLE: "+mval.toString();
}
else if (mval instanceof java.sql.Timestamp) {
mStr = "DATE: "+sdf.format((java.util.Date)mval);
}
else if (mval instanceof java.lang.Boolean) {
mStr = "BOOLEAN: "+mval.toString();
}
if (i > 1) {
buff.append(",\t");
}
buff.append(mStr);
}
buff.append("]\n");
}
System.out.println(buff.toString());
}
}
PgqlExample7.java
has the following output for test_graph
.
-- Vertex Property Query --------------- [STRING: Susan, STRING: Blue] [STRING: Bill, STRING: Brown] [STRING: John, STRING: Black] -- Edge Property Query ----------------- [STRING: Susan, STRING: Bill, STRING: CA] [STRING: John, STRING: Bill, STRING: GA] [STRING: Susan, STRING: John, STRING: CA] [STRING: Susan, STRING: Ray, STRING: CA]
4.8.2.4 Obtaining the SQL Translation for a PGQL Query
You can obtain the SQL translation for a PGQL query through methods in OraclePgqlStatement
and OraclePgqlPreparedStatement
. The raw SQL for a PGQL query can be useful for several reasons:
-
You can execute the SQL directly against the database with other SQL-based tools or interfaces (for example, SQL Plus or SQL Developer).
-
You can customize and tune the generated SQL to optimize performance or to satisfy a particular requirement of your application.
-
You can build a larger SQL query that joins a PGQL subquery with other data stored in Oracle Database (such as relational tables, spatial data, and JSON data).
Example 4-31 PgqlExample8.java
PgqlExample8.java
shows how to obtain the raw SQL translation for a PGQL query. The translateQuery
method of OraclePgqlStatement
returns an OraclePgqlSqlTrans
object that contains information about return columns from the query and the SQL translation itself.
The translated SQL returns different columns depending on the type of "logical" object or value projected from the PGQL query. A vertex or edge projected in PGQL has two corresponding columns projected in the translated SQL:
-
$IT : id type – NVARCHAR(1): 'V' for vertex or 'E' for edge
-
$ID : vertex or edge identifier – NUMBER: same content as VID or EID columns in VT$ and GE$ tables
A property value or constant scalar value projected in PGQL has four corresponding columns projected in the translated SQL:
-
$T : value type – NUMBER: same content as T column in VT$ and GE$ tables
-
$V: value – NVARCHAR2(15000): same content as V column in VT$ and GE$ tables
-
$VN: number value – NUMBER: same content as VN column in VT$ and GE$ tables
-
$VT: temporal value – TIMESTAMP WITH TIME ZONE: same content as VT column in VT$ and GE$ tables
import oracle.pg.rdbms.*;
import oracle.pg.common.*;
import java.util.*;
import java.text.*;
import java.sql.*;
import java.time.*;
import java.time.format.*;
/**
* This example shows how to obtain the SQL translation for a PGQL query.
*/
public class PgqlExample8
{
public static void main(String[] szArgs) throws Exception
{
int iArgIdx=0;
String szHost = szArgs[iArgIdx++];
String szPort = szArgs[iArgIdx++];
String szSID = szArgs[iArgIdx++];
String szUser = szArgs[iArgIdx++];
String szPassword = szArgs[iArgIdx++];
String szGraph = szArgs[iArgIdx++];
String szVertexFile = szArgs[iArgIdx++];
String szEdgeFile = szArgs[iArgIdx++];
Oracle oracle = null;
OraclePropertyGraph opg = null;
OraclePgqlStatement ops = null;
try {
// Create a connection to Oracle
oracle =
new Oracle("jdbc:oracle:thin:@"+szHost+":"+szPort +":"+szSID, szUser, szPassword);
// Create a property graph
opg = OraclePropertyGraph.getInstance(oracle, szGraph);
// Clear any existing data
opg.clearRepository();
// Load data
OraclePropertyGraphDataLoader opgLoader = OraclePropertyGraphDataLoader.getInstance();
opgLoader.loadData(opg, szVertexFile, szEdgeFile, 1);
// PGQL query to be translated
String pgql =
"SELECT v1, v1.fname AS fname1, e, e.since AS since "+
"MATCH (v1)-[e:knows]->(v2)";
// Create an OraclePgqlStatement
ops = OraclePgqlExecutionFactory.createStatement(opg);
// Get the SQL translation
OraclePgqlSqlTrans sqlTrans = ops.translateQuery(pgql,"");
// Get the return column descriptions
OraclePgqlColumnDescriptor[] cols = sqlTrans.getReturnTypes();
// Print column descriptions
System.out.println("-- Return Columns -----------------------");
printReturnCols(cols);
// Print SQL translation
System.out.println("-- SQL Translation ----------------------");
System.out.println(sqlTrans.getSqlTranslation());
}
finally {
// close the statement
if (ops != null) {
ops.close();
}
// close the property graph
if (opg != null) {
opg.shutdown();
}
// close oracle
if (oracle != null) {
oracle.dispose();
}
}
}
/**
* Prints return columns for a SQL translation
*/
static void printReturnCols(OraclePgqlColumnDescriptor[] cols) throws Exception
{
StringBuffer buff = new StringBuffer("");
for (int i = 0; i < cols.length; i++) {
String colName = cols[i].getColName();
OraclePgqlColumnDescriptor.Type colType = cols[i].getColType();
int offset = cols[i].getSqlOffset();
String readableType = "";
switch(colType) {
case VERTEX:
readableType = "VERTEX";
break;
case EDGE:
readableType = "EDGE";
break;
case VALUE:
readableType = "VALUE";
break;
}
buff.append("colName=["+colName+"] colType=["+readableType+"] offset=["+offset+"]\n");
}
System.out.println(buff.toString());
}
}
PgqlExample8.java
has the following output for test_graph
.
-- Return Columns ----------------------- colName=[v1] colType=[VERTEX] offset=[1] colName=[fname1] colType=[VALUE] offset=[3] colName=[e] colType=[EDGE] offset=[7] colName=[since] colType=[VALUE] offset=[9] -- SQL Translation ---------------------- SELECT n'V' AS "v1$IT", T0$0.SVID AS "v1$ID", T0$1.T AS "fname1$T", T0$1.V AS "fname1$V", T0$1.VN AS "fname1$VN", T0$1.VT AS "fname1$VT", n'E' AS "e$IT", T0$0.EID AS "e$ID", T0$0.T AS "since$T", T0$0.V AS "since$V", T0$0.VN AS "since$VN", T0$0.VT AS "since$VT" FROM "SCOTT".GRAPH1GE$ T0$0, "SCOTT".GRAPH1VT$ T0$1 WHERE T0$0.K=n'since' AND T0$1.K=n'fname' AND T0$0.SVID=T0$1.VID AND (T0$0.EL = n'knows' AND T0$0.EL IS NOT NULL)
Example 4-32 PgqlExample9.java
You can also obtain the SQL translation for PGQL queries with bind variables. In this case, the corresponding SQL translation will also contain bind variables. The OraclePgqlSqlTrans
interface has a getSqlBvList
method that returns an ordered List of Java Objects that should be bound to the SQL query (the first Object on the list should be set at position 1, and the second should be set at position 2, and so on).
PgqlExample9.java
shows how to get and execute the SQL for a PGQL query with bind variables.
import oracle.pg.rdbms.*;
import oracle.pg.common.*;
import java.util.*;
import java.text.*;
import java.sql.*;
import java.time.*;
import java.time.format.*;
/**
* This example shows how to obtain and execute the SQL translation for a
* PGQL query that uses bind variables.
*/
public class PgqlExample9
{
public static void main(String[] szArgs) throws Exception
{
int iArgIdx=0;
String szHost = szArgs[iArgIdx++];
String szPort = szArgs[iArgIdx++];
String szSID = szArgs[iArgIdx++];
String szUser = szArgs[iArgIdx++];
String szPassword = szArgs[iArgIdx++];
String szGraph = szArgs[iArgIdx++];
String szVertexFile = szArgs[iArgIdx++];
String szEdgeFile = szArgs[iArgIdx++];
Oracle oracle = null;
OraclePropertyGraph opg = null;
OraclePgqlPreparedStatement opps = null;
PreparedStatement ps = null;
try {
// Create a connection to Oracle
oracle =
new Oracle("jdbc:oracle:thin:@"+szHost+":"+szPort +":"+szSID, szUser, szPassword);
// Create a property graph
opg = OraclePropertyGraph.getInstance(oracle, szGraph);
// Clear any existing data
opg.clearRepository();
// Load data
OraclePropertyGraphDataLoader opgLoader = OraclePropertyGraphDataLoader.getInstance();
opgLoader.loadData(opg, szVertexFile, szEdgeFile, 1);
// Execute query to get a ResultSet object
String pgql =
"SELECT v1, v1.fname AS fname1, v1.age AS age, ? as constVal "+
"MATCH (v1) "+
"WHERE v1.fname = ? OR v1.age < ?";
// Create an OraclePgqlStatement
opps = OraclePgqlExecutionFactory.prepareStatement(opg, pgql);
// set bind values
opps.setDouble(1, 2.05d);
opps.setString(2, "Bill");
opps.setInt(3, 35);
// Get the SQL translation
OraclePgqlSqlTrans sqlTrans = opps.translateQuery("");
// Get the SQL String
String sqlStr = sqlTrans.getSqlTranslation();
// Get the return column descriptions
OraclePgqlColumnDescriptor[] cols = sqlTrans.getReturnTypes();
// Get the bind values
List<Object> bindVals = sqlTrans.getSqlBvList();
// Print column descriptions
System.out.println("-- Return Columns -----------------------");
printReturnCols(cols);
// Print SQL translation
System.out.println("-- SQL Translation ----------------------");
System.out.println(sqlStr);
// Print Bind Values
System.out.println("\n-- Bind Values --------------------------");
for (Object obj : bindVals) {
System.out.println(obj.toString());
}
// Execute Query
// Get PreparedStatement
ps = oracle.getConnection().prepareStatement("SELECT COUNT(*) FROM ("+sqlStr+")");
// Set bind values and execute the PreparedStatement
executePs(ps, bindVals);
// Set new bind values in the PGQL PreparedStatement
opps.setDouble(1, 3.02d);
opps.setString(2, "Ray");
opps.setInt(3, 30);
// Print Bind Values
bindVals = sqlTrans.getSqlBvList();
System.out.println("\n-- Bind Values --------------------------");
for (Object obj : bindVals) {
System.out.println(obj.toString());
}
// Execute the PreparedStatement with new bind values
executePs(ps, bindVals);
}
finally {
// close the SQL statement
if (ps != null) {
ps.close();
}
// close the statement
if (opps != null) {
opps.close();
}
// close the property graph
if (opg != null) {
opg.shutdown();
}
// close oracle
if (oracle != null) {
oracle.dispose();
}
}
}
/**
* Executes a SQL PreparedStatement with the input bind values
*/
static void executePs(PreparedStatement ps, List<Object> bindVals) throws Exception
{
ResultSet rs = null;
try {
// Set bind values
for (int idx = 0; idx < bindVals.size(); idx++) {
Object o = bindVals.get(idx);
// String
if (o instanceof java.lang.String) {
ps.setNString(idx + 1, (String)o);
}
// Int
else if (o instanceof java.lang.Integer) {
ps.setInt(idx + 1, ((Integer)o).intValue());
}
// Long
else if (o instanceof java.lang.Long) {
ps.setLong(idx + 1, ((Long)o).longValue());
}
// Float
else if (o instanceof java.lang.Float) {
ps.setFloat(idx + 1, ((Float)o).floatValue());
}
// Double
else if (o instanceof java.lang.Double) {
ps.setDouble(idx + 1, ((Double)o).doubleValue());
}
// Timestamp
else if (o instanceof java.sql.Timestamp) {
ps.setTimestamp(idx + 1, (Timestamp)o);
}
else {
ps.setString(idx + 1, bindVals.get(idx).toString());
}
}
// Execute query
rs = ps.executeQuery();
if (rs.next()) {
System.out.println("\n-- Execute Query: Result has "+rs.getInt(1)+" rows --");
}
}
finally {
// close the SQL ResultSet
if (rs != null) {
rs.close();
}
}
}
/**
* Prints return columns for a SQL translation
*/
static void printReturnCols(OraclePgqlColumnDescriptor[] cols) throws Exception
{
StringBuffer buff = new StringBuffer("");
for (int i = 0; i < cols.length; i++) {
String colName = cols[i].getColName();
OraclePgqlColumnDescriptor.Type colType = cols[i].getColType();
int offset = cols[i].getSqlOffset();
String readableType = "";
switch(colType) {
case VERTEX:
readableType = "VERTEX";
break;
case EDGE:
readableType = "EDGE";
break;
case VALUE:
readableType = "VALUE";
break;
}
buff.append("colName=["+colName+"] colType=["+readableType+"] offset=["+offset+"]\n");
}
System.out.println(buff.toString());
}
}
PgqlExample9.java
has the following output for test_graph
.
– -- Return Columns ----------------------- colName=[v1] colType=[VERTEX] offset=[1] colName=[fname1] colType=[VALUE] offset=[3] colName=[age] colType=[VALUE] offset=[7] colName=[constVal] colType=[VALUE] offset=[11] -- SQL Translation ---------------------- SELECT n'V' AS "v1$IT", T0$0.VID AS "v1$ID", T0$0.T AS "fname1$T", T0$0.V AS "fname1$V", T0$0.VN AS "fname1$VN", T0$0.VT AS "fname1$VT", T0$1.T AS "age$T", T0$1.V AS "age$V", T0$1.VN AS "age$VN", T0$1.VT AS "age$VT", 4 AS "constVal$T", to_nchar(?,'TM9','NLS_Numeric_Characters=''.,''') AS "constVal$V", ? AS "constVal$VN", to_timestamp_tz(null) AS "constVal$VT" FROM "SCOTT".GRAPH1VT$ T0$0, "SCOTT".GRAPH1VT$ T0$1 WHERE T0$0.K=n'fname' AND T0$1.K=n'age' AND T0$0.VID=T0$1.VID AND ((T0$0.T = 1 AND T0$0.V = ?) OR T0$1.VN < ?) -- Bind Values -------------------------- 2.05 2.05 Bill 35 -- Execute Query: Result has 2 rows -- -- Bind Values -------------------------- 3.02 3.02 Ray 30 -- Execute Query: Result has 1 rows --
4.8.2.5 Additional Options for PGQL Translation and Execution
Several options are available to influence PGQL query translation and execution. The following are the main ways to set query options:
-
Through explicit arguments to
executeQuery
andtranslateQuery
-
Through flags in the
options
string argument ofexecuteQuery
andtranslateQuery
-
Through Java JVM arguments.
The following table summarizes the available query arguments for PGQL translation and execution.
Table 4-2 PGQL Translation and Execution Options
Option | Default | Explict Argument | Options Flag | JVM Argument |
---|---|---|---|---|
Degree of parallelism |
1 |
parallel |
none |
none |
Timeout |
unlimited |
timeout |
none |
none |
Dynamic sampling |
2 |
dynamicSampling |
none |
none |
Maximum number of results |
unlimited |
maxResults |
none |
none |
GT$ table usage |
off |
none |
USE_GT_TAB=T |
-Doracle.pg.rdbms.pgql.useGtTab=true |
CONNECT BY usage |
off |
none |
USE_RW=F |
-Doracle.pg.rdbms.pgql.useRW=false |
Distinct recursive WITH usage |
off |
none |
USE_DIST_RW=T |
-Doracle.pg.rdbms.pgql.useDistRW=true |
Maximum path length |
unlimited |
none |
MAX_PATH_LEN=n |
-Doracle.pg.rdbms.pgql.maxPathLen=n |
Set partial |
false |
none |
EDGE_SET_PARTIAL=T |
-Doracle.pg.rdbms.pgql.edgeSetPartial=true |
4.8.2.5.1 Query Options Controlled by Explicit Arguments
Some query options are controlled by explicit arguments to methods in the Java API.
-
The
executeQuery
method ofOraclePgqlStatement
has explicit arguments for timeout in seconds, degree of parallelism, query identifier (reserved for future use), optimizer dynamic sampling, and maximum number of results. -
The
translateQuery
method has explicit arguments for degree of parallelism, optimizer dynamic sampling, and maximum number of results.OraclePgqlPreparedStatement
also provides those same additional arguments forexecuteQuery
andtranslateQuery
.
Example 4-33 PgqlExample10.java
PgqlExample10.java
shows PGQL query execution with additional options controlled by explicit arguments.
import oracle.pg.rdbms.*;
import oracle.pg.common.*;
import oracle.pgql.lang.PgqlException;
import oracle.pgql.lang.ResultSet;
import java.util.*;
import java.text.*;
/**
* This example shows how to execute a PGQL query with various options.
*/
public class PgqlExample10
{
public static void main(String[] szArgs) throws Exception
{
int iArgIdx=0;
String szHost = szArgs[iArgIdx++];
String szPort = szArgs[iArgIdx++];
String szSID = szArgs[iArgIdx++];
String szUser = szArgs[iArgIdx++];
String szPassword = szArgs[iArgIdx++];
String szGraph = szArgs[iArgIdx++];
String szVertexFile = szArgs[iArgIdx++];
String szEdgeFile = szArgs[iArgIdx++];
Oracle oracle = null;
OraclePropertyGraph opg = null;
OraclePgqlStatement ops = null;
ResultSet rs = null;
try {
// Create a connection to Oracle
oracle =
new Oracle("jdbc:oracle:thin:@"+szHost+":"+szPort +":"+szSID, szUser, szPassword);
// Create a property graph
opg = OraclePropertyGraph.getInstance(oracle, szGraph);
// Clear any existing data
opg.clearRepository();
// Load data
OraclePropertyGraphDataLoader opgLoader = OraclePropertyGraphDataLoader.getInstance();
opgLoader.loadData(opg, szVertexFile, szEdgeFile, 1);
// Create an OraclePgqlStatement
ops = OraclePgqlExecutionFactory.createStatement(opg);
// Execute query to get a ResultSet object
String pgql =
"SELECT v1.fname AS fname1, v2.fname AS fname2 "+
"MATCH (v1)-[:friendOf]->(v2)";
rs = ops.executeQuery(pgql /* query string */,
100 /* timeout (sec): 0 is default and implies no timeout */,
2 /* parallel: 1 is default */,
1001 /* query id: 0 is default */,
6 /* dynamic sampling: 2 is default */,
50 /* max results: -1 is default and implies no limit */,
"" /* options */);
// Print query results
printResults(rs, 2);
}
finally {
// close the result set
if (rs != null) {
rs.close();
}
// close the statement
if (ops != null) {
ops.close();
}
// close the property graph
if (opg != null) {
opg.shutdown();
}
// close oracle
if (oracle != null) {
oracle.dispose();
}
}
}
/**
* Prints a PGQL ResultSet
*/
static void printResults(ResultSet rs, int numCols) throws Exception
{
StringBuffer buff = new StringBuffer("");
SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX");
while (rs.next()) {
buff.append("[");
for (int i = 1; i <= numCols; i++) {
// use generic getObject to handle all types
Object mval = rs.getObject(i);
String mStr = "";
if (mval instanceof java.lang.String) {
mStr = "STRING: "+mval.toString();
}
else if (mval instanceof java.lang.Integer) {
mStr = "INTEGER: "+mval.toString();
}
else if (mval instanceof java.lang.Long) {
mStr = "LONG: "+mval.toString();
}
else if (mval instanceof java.lang.Float) {
mStr = "FLOAT: "+mval.toString();
}
else if (mval instanceof java.lang.Double) {
mStr = "DOUBLE: "+mval.toString();
}
else if (mval instanceof java.sql.Timestamp) {
mStr = "DATE: "+sdf.format((Date)mval);
}
else if (mval instanceof java.lang.Boolean) {
mStr = "BOOLEAN: "+mval.toString();
}
if (i > 1) {
buff.append(",\t");
}
buff.append(mStr);
}
buff.append("]\n");
}
System.out.println(buff.toString());
}
}
PgqlExample10.java
gives the following output for test_graph
.
[STRING: John, STRING: Bill] [STRING: John, STRING: Susan] [STRING: Ray, STRING: Susan] [STRING: Susan, STRING: John] [STRING: Bill, STRING: John]
Parent topic: Additional Options for PGQL Translation and Execution
4.8.2.5.2 Using the GT$ Skeleton Table
The property graph relational schema defines a GT$ skeleton table that stores a single row for each edge in the graph, no matter how many properties an edge has. This skeleton table is not populated by default, but if it is populated, PGQL query execution can take advantage of the GT$ table and avoid sorting operations on the GE$ table in many cases, which gives a significant performance improvement.
You can add "USE_GT_TAB=T"
to the options
argument of executeQuery
and translateQuery
or use -Doracle.pg.rdbms.pgql.useGtTab=true
in the Java command line to turn on GT$ table usage.
Example 4-34 PgqlExample11.java
PgqlExample11.java
shows a query that uses the GT$ skeleton table.
import oracle.pg.rdbms.*;
import oracle.pg.common.*;
import java.util.*;
import java.text.*;
import java.sql.*;
import java.time.*;
import java.time.format.*;
/**
* This example shows how to use the GT$ skeleton table for faster
* PGQL query execution.
*/
public class PgqlExample11
{
public static void main(String[] szArgs) throws Exception
{
int iArgIdx=0;
String szHost = szArgs[iArgIdx++];
String szPort = szArgs[iArgIdx++];
String szSID = szArgs[iArgIdx++];
String szUser = szArgs[iArgIdx++];
String szPassword = szArgs[iArgIdx++];
String szGraph = szArgs[iArgIdx++];
String szVertexFile = szArgs[iArgIdx++];
String szEdgeFile = szArgs[iArgIdx++];
Oracle oracle = null;
OraclePropertyGraph opg = null;
OraclePgqlStatement ops = null;
try {
// Create a connection to Oracle
oracle =
new Oracle("jdbc:oracle:thin:@"+szHost+":"+szPort +":"+szSID, szUser, szPassword);
// Create a property graph
opg = OraclePropertyGraph.getInstance(oracle, szGraph);
// Clear any existing data
opg.clearRepository();
// populate GT$ skeleton table with distinct edges
CallableStatement cs = null;
cs = oracle.getConnection().prepareCall(
"begin opg_graphop.populate_skeleton_tab(:1,:2); end;"
);
cs.setString(1,szGraph);
cs.setInt(2,1);
cs.execute();
cs.close();
// Load data
OraclePropertyGraphDataLoader opgLoader = OraclePropertyGraphDataLoader.getInstance();
opgLoader.loadData(opg, szVertexFile, szEdgeFile, 1);
// Execute query to get a ResultSet object
String pgql =
"SELECT id(v1), id(v2) "+
"MATCH (v1)-[knows]->(v2)";
// Create an OraclePgqlStatement
ops = OraclePgqlExecutionFactory.createStatement(opg);
// Get the SQL translation without GT table
OraclePgqlSqlTrans sqlTrans = ops.translateQuery(pgql,"");
// Print SQL translation
System.out.println("-- SQL Translation without GT Table ----------------------");
System.out.println(sqlTrans.getSqlTranslation());
// Get the SQL translation with GT table
sqlTrans = ops.translateQuery(pgql,"USE_GT_TAB=T");
// Print SQL translation
System.out.println("-- SQL Translation with GT Table -------------------------");
System.out.println(sqlTrans.getSqlTranslation());
}
finally {
// close the statement
if (ops != null) {
ops.close();
}
// close the property graph
if (opg != null) {
opg.shutdown();
}
// close oracle
if (oracle != null) {
oracle.dispose();
}
}
}
}
PgqlExample11.java
gives the following output for test_graph
.
-- SQL Translation without GT Table ---------------------- SELECT 7 AS "id(v1)$T", to_nchar(T0$0.SVID,'TM9','NLS_Numeric_Characters=''.,''') AS "id(v1)$V", T0$0.SVID AS "id(v1)$VN", to_timestamp_tz(null) AS "id(v1)$VT", 7 AS "id(v2)$T", to_nchar(T0$0.DVID,'TM9','NLS_Numeric_Characters=''.,''') AS "id(v2)$V", T0$0.DVID AS "id(v2)$VN", to_timestamp_tz(null) AS "id(v2)$VT" FROM (SELECT DISTINCT EID, SVID, DVID,EL FROM "SCOTT".GRAPH1GE$) T0$0 -- SQL Translation with GT Table ------------------------- SELECT 7 AS "id(v1)$T", to_nchar(T0$0.SVID,'TM9','NLS_Numeric_Characters=''.,''') AS "id(v1)$V", T0$0.SVID AS "id(v1)$VN", to_timestamp_tz(null) AS "id(v1)$VT", 7 AS "id(v2)$T", to_nchar(T0$0.DVID,'TM9','NLS_Numeric_Characters=''.,''') AS "id(v2)$V", T0$0.DVID AS "id(v2)$VN", to_timestamp_tz(null) AS "id(v2)$VT" FROM "SCOTT".GRAPH1GT$ T0$0
Parent topic: Additional Options for PGQL Translation and Execution
4.8.2.5.3 Path Query Options
A few options are available for executing path queries in PGQL. There are two basic evaluation methods available in Oracle SQL: CONNECT BY or recursive WITH clauses. Recursive WITH is the default evaluation method. In addition, you can further modify the recursive WITH evaluation method to include a DISTINCT modifier during the recursive step of query evaluation. Computing distinct vertices at each step helps prevent a combinatorial explosion in highly connected graphs. The DISTINCT modifier is not added by default because it requires a specific parameter setting in the database ("_recursive_with_control"=8
).
You can also control the maximum length of paths searched. Path length in this case is defined as the number of repetitions allowed when evaluating the * and + operators. The default maximum length is unlimited.
Path evaluation options are summarized as follows.
-
CONNECT BY: To use CONNECT BY, specify
'USE_RW=F'
in theoptions
argument or specify-Doracle.pg.rdbms.pgql.useRW=false
in the Java command line. -
Distinct Modifier in Recursive WITH: To use the DISTINCT modifier in the recursive step, first set
"_recursive_with_control"=8
in your database session, then specify'USE_DIST_RW=T
' in theoptions
argument or specify-Doracle.pg.rdbms.pgql.useDistRW=true
in the Java command line. You will encounter ORA-32486: unsupported operation in recursive branch of recursive WITH clause if"_recursive_with_control"
has not been set to 8 in your session. -
Path Length Restriction: To limit maximum number of repetitions when evaluating * and + to n, specify
'MAX_PATH_LEN=n'
in the queryoptions
argument or specify-Doracle.pg.rdbms.pgql.maxPathLen=n
in the Java command line.
Example 4-35 PgqlExample12.java
PgqlExample12.java
shows path query translations under various options.
import oracle.pg.rdbms.*;
import oracle.pg.common.*;
import java.util.*;
import java.text.*;
import java.sql.*;
import java.time.*;
import java.time.format.*;
/**
* This example shows how to use various options with PGQL path queries.
*/
public class PgqlExample12
{
public static void main(String[] szArgs) throws Exception
{
int iArgIdx=0;
String szHost = szArgs[iArgIdx++];
String szPort = szArgs[iArgIdx++];
String szSID = szArgs[iArgIdx++];
String szUser = szArgs[iArgIdx++];
String szPassword = szArgs[iArgIdx++];
String szGraph = szArgs[iArgIdx++];
String szVertexFile = szArgs[iArgIdx++];
String szEdgeFile = szArgs[iArgIdx++];
Oracle oracle = null;
OraclePropertyGraph opg = null;
OraclePgqlStatement ops = null;
try {
// Create a connection to Oracle
oracle =
new Oracle("jdbc:oracle:thin:@"+szHost+":"+szPort +":"+szSID, szUser, szPassword);
// Create a property graph
opg = OraclePropertyGraph.getInstance(oracle, szGraph);
// Clear any existing data
opg.clearRepository();
// Load data
OraclePropertyGraphDataLoader opgLoader = OraclePropertyGraphDataLoader.getInstance();
opgLoader.loadData(opg, szVertexFile, szEdgeFile, 1);
// Set "_recursive_with_control"=8 to enable distinct optimization
// optimization for recursive with
Statement stmt = oracle.getConnection().createStatement();
stmt.executeUpdate("alter session set \"_recursive_with_control\"=8");
stmt.close();
// populate GT$ skeleton table with distinct edges
CallableStatement cs = null;
cs = oracle.getConnection().prepareCall(
"begin opg_graphop.populate_skeleton_tab(:1,:2); end;"
);
cs.setString(1,szGraph);
cs.setInt(2,1);
cs.execute();
cs.close();
// Path Query to illustrate options
String pgql =
"PATH fof AS ()-[:friendOf]->() "+
"SELECT id(v1), id(v2) "+
"MATCH (v1)-/:fof*/->(v2) "+
"WHERE id(v1) = 2";
// Create an OraclePgqlStatement
ops = OraclePgqlExecutionFactory.createStatement(opg);
// get SQL translation with defaults - Non-distinct Recursive WITH
OraclePgqlSqlTrans sqlTrans =
ops.translateQuery(pgql /* query string */,
2 /* parallel: default is 1 */,
2 /* dynamic sampling: default is 2 */,
-1 /* max results: -1 implies no limit */,
" USE_GT_TAB=T " /* options */);
System.out.println("-- Default Path Translation --------------------");
System.out.println(sqlTrans.getSqlTranslation()+"\n");
// get SQL translation with DISTINCT reachability optimization
sqlTrans =
ops.translateQuery(pgql /* query string */,
2 /* parallel: default is 1 */,
2 /* dynamic sampling: default is 2 */,
-1 /* max results: -1 implies no limit */,
" USE_DIST_RW=T USE_GT_TAB=T " /* options */);
System.out.println("-- DISTINCT RW Path Translation --------------------");
System.out.println(sqlTrans.getSqlTranslation()+"\n");
// get SQL translation with CONNECT BY
sqlTrans =
ops.translateQuery(pgql /* query string */,
2 /* parallel: default is 1 */,
2 /* dynamic sampling: default is 2 */,
-1 /* max results: -1 implies no limit */,
" USE_RW=F USE_GT_TAB=T " /* options */);
System.out.println("-- CONNECT BY Path Translation --------------------");
System.out.println(sqlTrans.getSqlTranslation()+"\n");
}
finally {
// close the statement
if (ops != null) {
ops.close();
}
// close the property graph
if (opg != null) {
opg.shutdown();
}
// close oracle
if (oracle != null) {
oracle.dispose();
}
}
}
}
PgqlExample12.java
gives the following output for test_graph
.
-- Default Path Translation -------------------- SELECT /*+ PARALLEL(2) */ * FROM(SELECT 7 AS "id(v1)$T", to_nchar(T0$0.SVID,'TM9','NLS_Numeric_Characters=''.,''') AS "id(v1)$V", T0$0.SVID AS "id(v1)$VN", to_timestamp_tz(null) AS "id(v1)$VT", 7 AS "id(v2)$T", to_nchar(T0$0.DVID,'TM9','NLS_Numeric_Characters=''.,''') AS "id(v2)$V", T0$0.DVID AS "id(v2)$VN", to_timestamp_tz(null) AS "id(v2)$VT" FROM (/*Path[*/SELECT DISTINCT SVID, DVID FROM ( SELECT 2 AS SVID, 2 AS DVID FROM SYS.DUAL UNION ALL SELECT SVID,DVID FROM (WITH RW (ROOT, DVID) AS ( SELECT ROOT, DVID FROM (SELECT SVID ROOT, DVID FROM (SELECT T0$0.SVID AS SVID, T0$0.DVID AS DVID FROM "SCOTT".GRAPH1GT$ T0$0 WHERE T0$0.SVID = 2 AND (T0$0.EL = n'friendOf' AND T0$0.EL IS NOT NULL)) ) UNION ALL SELECT RW.ROOT, R.DVID FROM (SELECT T0$0.SVID AS SVID, T0$0.DVID AS DVID FROM "SCOTT".GRAPH1GT$ T0$0 WHERE (T0$0.EL = n'friendOf' AND T0$0.EL IS NOT NULL)) R, RW WHERE RW.DVID = R.SVID ) CYCLE DVID SET cycle_col TO 1 DEFAULT 0 SELECT ROOT SVID, DVID FROM RW))/*]Path*/) T0$0 WHERE T0$0.SVID = 2) -- DISTINCT RW Path Translation -------------------- SELECT /*+ PARALLEL(2) */ * FROM(SELECT 7 AS "id(v1)$T", to_nchar(T0$0.SVID,'TM9','NLS_Numeric_Characters=''.,''') AS "id(v1)$V", T0$0.SVID AS "id(v1)$VN", to_timestamp_tz(null) AS "id(v1)$VT", 7 AS "id(v2)$T", to_nchar(T0$0.DVID,'TM9','NLS_Numeric_Characters=''.,''') AS "id(v2)$V", T0$0.DVID AS "id(v2)$VN", to_timestamp_tz(null) AS "id(v2)$VT" FROM (/*Path[*/SELECT DISTINCT SVID, DVID FROM ( SELECT 2 AS SVID, 2 AS DVID FROM SYS.DUAL UNION ALL SELECT SVID,DVID FROM (WITH RW (ROOT, DVID) AS ( SELECT ROOT, DVID FROM (SELECT SVID ROOT, DVID FROM (SELECT T0$0.SVID AS SVID, T0$0.DVID AS DVID FROM "SCOTT".GRAPH1GT$ T0$0 WHERE T0$0.SVID = 2 AND (T0$0.EL = n'friendOf' AND T0$0.EL IS NOT NULL)) ) UNION ALL SELECT DISTINCT RW.ROOT, R.DVID FROM (SELECT T0$0.SVID AS SVID, T0$0.DVID AS DVID FROM "SCOTT".GRAPH1GT$ T0$0 WHERE (T0$0.EL = n'friendOf' AND T0$0.EL IS NOT NULL)) R, RW WHERE RW.DVID = R.SVID ) CYCLE DVID SET cycle_col TO 1 DEFAULT 0 SELECT ROOT SVID, DVID FROM RW))/*]Path*/) T0$0 WHERE T0$0.SVID = 2) -- CONNECT BY Path Translation -------------------- SELECT /*+ PARALLEL(2) */ * FROM(SELECT 7 AS "id(v1)$T", to_nchar(T0$0.SVID,'TM9','NLS_Numeric_Characters=''.,''') AS "id(v1)$V", T0$0.SVID AS "id(v1)$VN", to_timestamp_tz(null) AS "id(v1)$VT", 7 AS "id(v2)$T", to_nchar(T0$0.DVID,'TM9','NLS_Numeric_Characters=''.,''') AS "id(v2)$V", T0$0.DVID AS "id(v2)$VN", to_timestamp_tz(null) AS "id(v2)$VT" FROM (/*Path[*/SELECT DISTINCT SVID, DVID FROM ( SELECT 2 AS SVID, 2 AS DVID FROM SYS.DUAL UNION ALL SELECT SVID, DVID FROM (SELECT CONNECT_BY_ROOT T0$0.SVID AS SVID, T0$0.DVID AS DVID FROM( SELECT T0$0.SVID AS SVID, T0$0.DVID AS DVID FROM "SCOTT".GRAPH1GT$ T0$0 WHERE (T0$0.EL = n'friendOf' AND T0$0.EL IS NOT NULL)) T0$0 START WITH T0$0.SVID = 2 CONNECT BY NOCYCLE PRIOR DVID = SVID))/*]Path*/) T0$0 WHERE T0$0.SVID = 2)
The query plan for the first query with the default recursive WITH strategy should look similar to the following.
-- default RW --------------------------------------------------------------------------------------- | Id | Operation | Name | --------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | 1 | TEMP TABLE TRANSFORMATION | | | 2 | LOAD AS SELECT (CURSOR DURATION MEMORY) | SYS_TEMP_0FD9D6642_133DFF | | 3 | UNION ALL (RECURSIVE WITH) BREADTH FIRST | | | 4 | PX COORDINATOR | | | 5 | PX SEND QC (RANDOM) | :TQ20000 | | 6 | LOAD AS SELECT (CURSOR DURATION MEMORY) | SYS_TEMP_0FD9D6642_133DFF | | 7 | PX BLOCK ITERATOR | | |* 8 | TABLE ACCESS FULL | GRAPH1GT$ | | 9 | PX COORDINATOR | | | 10 | PX SEND QC (RANDOM) | :TQ10000 | | 11 | LOAD AS SELECT (CURSOR DURATION MEMORY) | SYS_TEMP_0FD9D6642_133DFF | | 12 | NESTED LOOPS | | | 13 | PX BLOCK ITERATOR | | |* 14 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6642_133DFF | | 15 | PARTITION HASH ALL | | |* 16 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| GRAPH1GT$ | |* 17 | INDEX RANGE SCAN | GRAPH1XSG$ | | 18 | PX COORDINATOR | | | 19 | PX SEND QC (RANDOM) | :TQ30001 | | 20 | VIEW | | | 21 | HASH UNIQUE | | | 22 | PX RECEIVE | | | 23 | PX SEND HASH | :TQ30000 | | 24 | HASH UNIQUE | | | 25 | VIEW | | | 26 | UNION-ALL | | | 27 | PX SELECTOR | | | 28 | FAST DUAL | | | 29 | VIEW | | |* 30 | VIEW | | | 31 | PX BLOCK ITERATOR | | | 32 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6642_133DFF | ---------------------------------------------------------------------------------------
The query plan for the second query that adds a DISTINCT modifier in the recursive step should look similar to the following.
---------------------------------------------------------------------------------------------- | Id | Operation | Name | ---------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | 1 | TEMP TABLE TRANSFORMATION | | | 2 | LOAD AS SELECT (CURSOR DURATION MEMORY) | SYS_TEMP_0FD9D663B_133DFF | | 3 | UNION ALL (RECURSIVE WITH) BREADTH FIRST | | | 4 | PX COORDINATOR | | | 5 | PX SEND QC (RANDOM) | :TQ20000 | | 6 | LOAD AS SELECT (CURSOR DURATION MEMORY) | SYS_TEMP_0FD9D663B_133DFF | | 7 | PX BLOCK ITERATOR | | |* 8 | TABLE ACCESS FULL | GRAPH1GT$ | | 9 | PX COORDINATOR | | | 10 | PX SEND QC (RANDOM) | :TQ10001 | | 11 | LOAD AS SELECT (CURSOR DURATION MEMORY) | SYS_TEMP_0FD9D663B_133DFF | | 12 | SORT GROUP BY | | | 13 | PX RECEIVE | | | 14 | PX SEND HASH | :TQ10000 | | 15 | SORT GROUP BY | | | 16 | NESTED LOOPS | | | 17 | PX BLOCK ITERATOR | | |* 18 | TABLE ACCESS FULL | SYS_TEMP_0FD9D663B_133DFF | | 19 | PARTITION HASH ALL | | |* 20 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| GRAPH1GT$ | |* 21 | INDEX RANGE SCAN | GRAPH1XSG$ | | 22 | PX COORDINATOR | | | 23 | PX SEND QC (RANDOM) | :TQ30001 | | 24 | VIEW | | | 25 | HASH UNIQUE | | | 26 | PX RECEIVE | | | 27 | PX SEND HASH | :TQ30000 | | 28 | HASH UNIQUE | | | 29 | VIEW | | | 30 | UNION-ALL | | | 31 | PX SELECTOR | | | 32 | FAST DUAL | | | 33 | VIEW | | |* 34 | VIEW | | | 35 | PX BLOCK ITERATOR | | | 36 | TABLE ACCESS FULL | SYS_TEMP_0FD9D663B_133DFF | ----------------------------------------------------------------------------------------------
The query plan for the third query that uses CONNECTY BY should look similar to the following.
------------------------------------------------------------------ | Id | Operation | Name | ------------------------------------------------------------------ | 0 | SELECT STATEMENT | | | 1 | VIEW | | | 2 | HASH UNIQUE | | | 3 | VIEW | | | 4 | UNION-ALL | | | 5 | FAST DUAL | | |* 6 | VIEW | | |* 7 | CONNECT BY NO FILTERING WITH START-WITH| | | 8 | PX COORDINATOR | | | 9 | PX SEND QC (RANDOM) | :TQ10000 | | 10 | PX BLOCK ITERATOR | | |* 11 | TABLE ACCESS FULL | GRAPH1GT$ | ------------------------------------------------------------------
Example 4-36 PgqlExample13.java
PgqlExample13.java
shows how to set length restrictions during path query evaluation.
import oracle.pg.rdbms.*;
import oracle.pg.common.*;
import oracle.pgql.lang.PgqlException;
import oracle.pgql.lang.ResultSet;
import java.util.*;
import java.text.*;
import java.sql.*;
import java.time.*;
import java.time.format.*;
/**
* This example shows how to use the maximum path length option for
* PGQL path queries.
*/
public class PgqlExample13
{
public static void main(String[] szArgs) throws Exception
{
int iArgIdx=0;
String szHost = szArgs[iArgIdx++];
String szPort = szArgs[iArgIdx++];
String szSID = szArgs[iArgIdx++];
String szUser = szArgs[iArgIdx++];
String szPassword = szArgs[iArgIdx++];
String szGraph = szArgs[iArgIdx++];
String szVertexFile = szArgs[iArgIdx++];
String szEdgeFile = szArgs[iArgIdx++];
Oracle oracle = null;
OraclePropertyGraph opg = null;
OraclePgqlStatement ops = null;
ResultSet rs = null;
try {
// Create a connection to Oracle
oracle =
new Oracle("jdbc:oracle:thin:@"+szHost+":"+szPort +":"+szSID, szUser, szPassword);
// Create a property graph
opg = OraclePropertyGraph.getInstance(oracle, szGraph);
// Clear any existing data
opg.clearRepository();
// Load data
OraclePropertyGraphDataLoader opgLoader = OraclePropertyGraphDataLoader.getInstance();
opgLoader.loadData(opg, szVertexFile, szEdgeFile, 1);
// Path Query to illustrate options
String pgql =
"PATH fof AS ()-[:friendOf]->() "+
"SELECT v1.fname AS fname1, v2.fname AS fname2 "+
"MATCH (v1)-/:fof*/->(v2) "+
"WHERE v1.fname = 'Ray'";
// Create an OraclePgqlStatement
ops = OraclePgqlExecutionFactory.createStatement(opg);
// execute query for 1-hop
rs = ops.executeQuery(pgql, " MAX_PATH_LEN=1 ");
// print results
System.out.println("-- Results for 1-hop ----------------");
printResults(rs, 2);
// close result set
rs.close();
// execute query for 2-hop
rs = ops.executeQuery(pgql, " MAX_PATH_LEN=2 ");
// print results
System.out.println("-- Results for 2-hop ----------------");
printResults(rs, 2);
// close result set
rs.close();
// execute query for 3-hop
rs = ops.executeQuery(pgql, " MAX_PATH_LEN=3 ");
// print results
System.out.println("-- Results for 3-hop ----------------");
printResults(rs, 2);
// close result set
rs.close();
}
finally {
// close the result set
if (rs != null) {
rs.close();
}
// close the statement
if (ops != null) {
ops.close();
}
// close the property graph
if (opg != null) {
opg.shutdown();
}
// close oracle
if (oracle != null) {
oracle.dispose();
}
}
}
/**
* Prints a PGQL ResultSet
*/
static void printResults(ResultSet rs, int numCols) throws Exception
{
StringBuffer buff = new StringBuffer("");
SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX");
while (rs.next()) {
buff.append("[");
for (int i = 1; i <= numCols; i++) {
// use generic getObject to handle all types
Object mval = rs.getObject(i);
String mStr = "";
if (mval instanceof java.lang.String) {
mStr = "STRING: "+mval.toString();
}
else if (mval instanceof java.lang.Integer) {
mStr = "INTEGER: "+mval.toString();
}
else if (mval instanceof java.lang.Long) {
mStr = "LONG: "+mval.toString();
}
else if (mval instanceof java.lang.Float) {
mStr = "FLOAT: "+mval.toString();
}
else if (mval instanceof java.lang.Double) {
mStr = "DOUBLE: "+mval.toString();
}
else if (mval instanceof java.sql.Timestamp) {
mStr = "DATE: "+sdf.format((java.util.Date)mval);
}
else if (mval instanceof java.lang.Boolean) {
mStr = "BOOLEAN: "+mval.toString();
}
if (i > 1) {
buff.append(",\t");
}
buff.append(mStr);
}
buff.append("]\n");
}
System.out.println(buff.toString());
}
}
PgqlExample13.java
has the following output for test_graph
.
-- Results for 1-hop ---------------- [STRING: Ray, STRING: Ray] [STRING: Ray, STRING: Susan] -- Results for 2-hop ---------------- [STRING: Ray, STRING: Susan] [STRING: Ray, STRING: Ray] [STRING: Ray, STRING: John] -- Results for 3-hop ---------------- [STRING: Ray, STRING: Susan] [STRING: Ray, STRING: Bill] [STRING: Ray, STRING: Ray] [STRING: Ray, STRING: John]
Parent topic: Additional Options for PGQL Translation and Execution
4.8.2.5.4 Options for Partial Object Construction
When reading edges from a query result, there are two possible behaviors when adding the start and end vertex to any local caches:
-
Add only the vertex ID, which is available from the edge itself. This option is the default, for efficiency.
-
Add the vertex ID, and retrieve all properties for the start and end vertex. For this behavior, you can call
setPartial(true)
on eachOracleVertex
object constructed from your PGQL query result set.
Example 4-37 PgqlExample14.java
PgqlExample14.java
illustrates this difference in behavior. This program first executes a query to retrieve all edges, which causes the incident vertices to be added to a local cache. The second query retrieves all vertices. The program then prints each OracleVertex
object to show which properties have been loaded.
import oracle.pg.rdbms.*;
import oracle.pg.common.*;
import oracle.pgql.lang.PgqlException;
import oracle.pgql.lang.ResultSet;
import java.util.*;
import java.text.*;
import java.sql.*;
import java.time.*;
import java.time.format.*;
/**
* This example shows the behavior of setPartial(true) for OracleVertex objects
* created from PGQL query results.
*/
public class PgqlExample14
{
public static void main(String[] szArgs) throws Exception
{
int iArgIdx=0;
String szHost = szArgs[iArgIdx++];
String szPort = szArgs[iArgIdx++];
String szSID = szArgs[iArgIdx++];
String szUser = szArgs[iArgIdx++];
String szPassword = szArgs[iArgIdx++];
String szGraph = szArgs[iArgIdx++];
String szVertexFile = szArgs[iArgIdx++];
String szEdgeFile = szArgs[iArgIdx++];
Oracle oracle = null;
OraclePropertyGraph opg = null;
OraclePgqlStatement ops = null;
ResultSet rs = null;
try {
// Create a connection to Oracle
oracle =
new Oracle("jdbc:oracle:thin:@"+szHost+":"+szPort +":"+szSID, szUser, szPassword);
// Create a property graph
opg = OraclePropertyGraph.getInstance(oracle, szGraph);
// Clear any existing data
opg.clearRepository();
// Load data
OraclePropertyGraphDataLoader opgLoader = OraclePropertyGraphDataLoader.getInstance();
opgLoader.loadData(opg, szVertexFile, szEdgeFile, 1);
// Query to illustrate set partial
String pgql =
"SELECT id(e), label(e) "+
"MATCH (v1)-[e:knows]->(v2)";
// Create an OraclePgqlStatement
ops = OraclePgqlExecutionFactory.createStatement(opg);
// execute query for 1-hop
rs = ops.executeQuery(pgql, " ");
// print results
System.out.println("-- Results for edge query -----------------");
printResults(rs, 2);
// close result set
rs.close();
// Query to retrieve vertices
pgql =
"SELECT id(v) "+
"MATCH (v)";
// Get each vertex object in result and print with toString()
rs = ops.executeQuery(pgql, " ");
// iterate through result
System.out.println("-- Vertex objects retrieved from vertex query --");
while (rs.next()) {
Long vid = rs.getLong(1);
OracleVertex v = OracleVertex.getInstance(opg, vid);
System.out.println(v.toString());
}
// close result set
rs.close();
// Execute the same query but call setPartial(true) for each vertex
rs = ops.executeQuery(pgql, " ");
System.out.println("-- Vertex objects retrieved from vertex query with setPartial(true) --");
while (rs.next()) {
Long vid = rs.getLong(1);
OracleVertex v = OracleVertex.getInstance(opg, vid);
v.setPartial(true);
System.out.println(v.toString());
}
// close result set
rs.close();
}
finally {
// close the result set
if (rs != null) {
rs.close();
}
// close the statement
if (ops != null) {
ops.close();
}
// close the property graph
if (opg != null) {
opg.shutdown();
}
// close oracle
if (oracle != null) {
oracle.dispose();
}
}
}
/**
* Prints a PGQL ResultSet
*/
static void printResults(ResultSet rs, int numCols) throws Exception
{
StringBuffer buff = new StringBuffer("");
SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX");
while (rs.next()) {
buff.append("[");
for (int i = 1; i <= numCols; i++) {
// use generic getObject to handle all types
Object mval = rs.getObject(i);
String mStr = "";
if (mval instanceof java.lang.String) {
mStr = "STRING: "+mval.toString();
}
else if (mval instanceof java.lang.Integer) {
mStr = "INTEGER: "+mval.toString();
}
else if (mval instanceof java.lang.Long) {
mStr = "LONG: "+mval.toString();
}
else if (mval instanceof java.lang.Float) {
mStr = "FLOAT: "+mval.toString();
}
else if (mval instanceof java.lang.Double) {
mStr = "DOUBLE: "+mval.toString();
}
else if (mval instanceof java.sql.Timestamp) {
mStr = "DATE: "+sdf.format((java.util.Date)mval);
}
else if (mval instanceof java.lang.Boolean) {
mStr = "BOOLEAN: "+mval.toString();
}
if (i > 1) {
buff.append(",\t");
}
buff.append(mStr);
}
buff.append("]\n");
}
System.out.println(buff.toString());
}
}
The output for PgqlExample14.java
is:
-- Results for edge query ----------------- [LONG: 11, STRING: knows] [LONG: 6, STRING: knows] [LONG: 10, STRING: knows] [LONG: 5, STRING: knows] [LONG: 4, STRING: knows] [LONG: 13, STRING: knows] [LONG: 12, STRING: knows] [LONG: 9, STRING: knows] [LONG: 8, STRING: knows] [LONG: 15, STRING: knows] [LONG: 7, STRING: knows] [LONG: 14, STRING: knows] -- Vertex objects retrieved from vertex query -- Vertex ID 3 [NULL] {} Vertex ID 0 [NULL] {} Vertex ID 2 [NULL] {} Vertex ID 1 [NULL] {} -- Vertex objects retrieved from vertex query with setPartial(true) -- Vertex ID 3 [NULL] {bval:bol:false, fname:str:Susan, lname:str:Blue, mval:bol:false, age:int:35} Vertex ID 0 [NULL] {bval:bol:true, fname:str:Bill, lname:str:Brown, mval:str:y, age:int:40} Vertex ID 2 [NULL] {fname:str:Ray, lname:str:Green, mval:dat:1985-01-01 04:00:00.0, age:int:41} Vertex ID 1 [NULL] {bval:bol:true, fname:str:John, lname:str:Black, mval:int:27, age:int:30}
Parent topic: Additional Options for PGQL Translation and Execution
4.8.2.6 Querying Another User’s Property Graph
You can query another user’s property graph data if you have been granted the appropriate privileges in the database. For example, to query GRAPH1 in SCOTT’s schema you must have READ privilege on SCOTT.GRAPH1GE$ and SCOTT.GRAPH1VT$, and you must also have READ privilege on SCOTT.GRAPH1GT$ if you want to query with the USE_GT_TAB=T
option.
Example 4-38 PgqlExample15.java
PgqlExample15.java
shows how another user can query a graph in SCOTT’s schema.
import oracle.pg.rdbms.*;
import oracle.pg.common.*;
import oracle.pgql.lang.PgqlException;
import oracle.pgql.lang.ResultSet;
import java.util.*;
import java.text.*;
/**
* This example shows how to query a property graph located in another user's
* schema. READ privilege on GE$, VT$ and GT$ tables for the other user's
* property graph are required to avoid ORA-00942: table or view does not exist.
*/
public class PgqlExample15
{
public static void main(String[] szArgs) throws Exception
{
int iArgIdx=0;
String szHost = szArgs[iArgIdx++];
String szPort = szArgs[iArgIdx++];
String szSID = szArgs[iArgIdx++];
String szUser = szArgs[iArgIdx++];
String szPassword = szArgs[iArgIdx++];
String szGraph = szArgs[iArgIdx++];
String szVertexFile = szArgs[iArgIdx++];
String szEdgeFile = szArgs[iArgIdx++];
Oracle oracle = null;
OraclePropertyGraph opg = null;
OraclePgqlStatement ops = null;
ResultSet rs = null;
try {
// Create a connection to Oracle
oracle =
new Oracle("jdbc:oracle:thin:@"+szHost+":"+szPort +":"+szSID, szUser, szPassword);
// Get a property graph instance for Scott's graph
opg = OraclePropertyGraph.getInstance(oracle, "SCOTT", szGraph, 1, 1, null, null);
// Create an OraclePgqlStatement
ops = OraclePgqlExecutionFactory.createStatement(opg);
// Execute query to get a ResultSet object
String pgql =
"SELECT v.fname AS fname, v.lname AS lname "+
"MATCH (v)";
rs = ops.executeQuery(pgql, "");
// Print query results
printResults(rs, 2);
}
finally {
// close the result set
if (rs != null) {
rs.close();
}
// close the statement
if (ops != null) {
ops.close();
}
// close oracle
if (oracle != null) {
oracle.dispose();
}
}
}
/**
* Prints a PGQL ResultSet
*/
static void printResults(ResultSet rs, int numCols) throws Exception
{
StringBuffer buff = new StringBuffer("");
SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX");
while (rs.next()) {
buff.append("[");
for (int i = 1; i <= numCols; i++) {
// use generic getObject to handle all types
Object mval = rs.getObject(i);
String mStr = "";
if (mval instanceof java.lang.String) {
mStr = "STRING: "+mval.toString();
}
else if (mval instanceof java.lang.Integer) {
mStr = "INTEGER: "+mval.toString();
}
else if (mval instanceof java.lang.Long) {
mStr = "LONG: "+mval.toString();
}
else if (mval instanceof java.lang.Float) {
mStr = "FLOAT: "+mval.toString();
}
else if (mval instanceof java.lang.Double) {
mStr = "DOUBLE: "+mval.toString();
}
else if (mval instanceof java.sql.Timestamp) {
mStr = "DATE: "+sdf.format((java.util.Date)mval);
}
else if (mval instanceof java.lang.Boolean) {
mStr = "BOOLEAN: "+mval.toString();
}
if (i > 1) {
buff.append(",\t");
}
buff.append(mStr);
}
buff.append("]\n");
}
System.out.println(buff.toString());
}
}
The following SQL statements create database user USER2 and grant the necessary privileges. You can also use the OraclePropertyGraph.grantAccess
Java API to achieve the same effect.
SQL> grant connect, resource, unlimited tablespace to user2 identified by user2; Grant succeeded. SQL> grant read on scott.graph1vt$ to user2; Grant succeeded. SQL> grant read on scott.graph1ge$ to user2; Grant succeeded. SQL> grant read on scott.graph1gt$ to user2; Grant succeeded.
The output for PgqlExample15.java
for the test_graph
data set when connected to the database as USER2 is as follows. Note that test_graph
should have already been loaded as GRAPH1 by user SCOTT before running PgqlExample15
.
[STRING: Susan, STRING: Blue] [STRING: Bill, STRING: Brown] [STRING: Ray, STRING: Green] [STRING: John, STRING: Black]
4.8.2.7 Using Query Optimizer Hints with PGQL
The Java API allows query optimizer hints that influence the join type when executing PGQL queries. The executeQuery
and translateQuery
methods in OraclePgqlStatement
and OraclePgqlPreparedStatement
accept the following strings in the options argument to influence the query plan for the corresponding SQL query.
-
ALL_EDGE_NL – Use Nested Loop join for all joins that involve the $GE and $GT tables.
-
ALL_EDGE_HASH – Use HASH join for all joins that involve the $GE and $GT tables.
-
ALL_VERTEX_NL – Use Nested Loop join for all joins that involve the $VT table.
-
ALL_VERTEX_HASH – Use HASH join for all joins that involve the $VT table.
Example 4-39 PgqlExample16.java
PgqlExample16.java
shows how to use optimizer hints to influence the joins used for a graph traversal.
import oracle.pg.rdbms.*;
import oracle.pg.common.*;
import java.util.*;
import java.text.*;
import java.sql.*;
import java.time.*;
import java.time.format.*;
/**
* This example shows how to use query optimizer hints with PGQL queries.
*/
public class PgqlExample16
{
public static void main(String[] szArgs) throws Exception
{
int iArgIdx=0;
String szHost = szArgs[iArgIdx++];
String szPort = szArgs[iArgIdx++];
String szSID = szArgs[iArgIdx++];
String szUser = szArgs[iArgIdx++];
String szPassword = szArgs[iArgIdx++];
String szGraph = szArgs[iArgIdx++];
String szVertexFile = szArgs[iArgIdx++];
String szEdgeFile = szArgs[iArgIdx++];
Oracle oracle = null;
OraclePropertyGraph opg = null;
OraclePgqlStatement ops = null;
try {
// Create a connection to Oracle
oracle =
new Oracle("jdbc:oracle:thin:@"+szHost+":"+szPort +":"+szSID, szUser, szPassword);
// Create a property graph
opg = OraclePropertyGraph.getInstance(oracle, szGraph);
// Clear any existing data
opg.clearRepository();
// Load data
OraclePropertyGraphDataLoader opgLoader = OraclePropertyGraphDataLoader.getInstance();
opgLoader.loadData(opg, szVertexFile, szEdgeFile, 1);
// populate GT$ skeleton table with distinct edges
CallableStatement cs = null;
cs = oracle.getConnection().prepareCall(
"begin opg_graphop.populate_skeleton_tab(:1,:2); end;"
);
cs.setString(1,szGraph);
cs.setInt(2,1);
cs.execute();
cs.close();
// Query to illustrate join hints
String pgql =
"SELECT id(v1), id(v4) "+
"MATCH (v1)-[:friendOf]->(v2)-[:friendOf]->(v3)-[:friendOf]->(v4)";
// Create an OraclePgqlStatement
ops = OraclePgqlExecutionFactory.createStatement(opg);
// get SQL translation with hash join hint
OraclePgqlSqlTrans sqlTrans =
ops.translateQuery(pgql /* query string */,
" USE_GT_TAB=T ALL_EDGE_HASH " /* options */);
// print SQL translation
System.out.println("-- Query with ALL_EDGE_HASH --------------------");
System.out.println(sqlTrans.getSqlTranslation()+"\n");
// get SQL translation with nested loop join hint
sqlTrans =
ops.translateQuery(pgql /* query string */,
" USE_GT_TAB=T ALL_EDGE_NL " /* options */);
// print SQL translation
System.out.println("-- Query with ALL_EDGE_NL ---------------------");
System.out.println(sqlTrans.getSqlTranslation()+"\n");
}
finally {
// close the statement
if (ops != null) {
ops.close();
}
// close the property graph
if (opg != null) {
opg.shutdown();
}
// close oracle
if (oracle != null) {
oracle.dispose();
}
}
}
}
The output for PgqlExample16.java
for test_graph
is:
-- Query with ALL_EDGE_HASH -------------------- SELECT /*+ USE_HASH(T0$0 T0$1 T0$2) */ 7 AS "id(v1)$T", to_nchar(T0$0.SVID,'TM9','NLS_Numeric_Characters=''.,''') AS "id(v1)$V", T0$0.SVID AS "id(v1)$VN", to_timestamp_tz(null) AS "id(v1)$VT", 7 AS "id(v4)$T", to_nchar(T0$2.DVID,'TM9','NLS_Numeric_Characters=''.,''') AS "id(v4)$V", T0$2.DVID AS "id(v4)$VN", to_timestamp_tz(null) AS "id(v4)$VT" FROM "SCOTT".GRAPH1GT$ T0$0, "SCOTT".GRAPH1GT$ T0$1, "SCOTT".GRAPH1GT$ T0$2 WHERE T0$0.DVID=T0$1.SVID AND T0$1.DVID=T0$2.SVID AND (T0$0.EL = n'friendOf' AND T0$0.EL IS NOT NULL) AND (T0$1.EL = n'friendOf' AND T0$1.EL IS NOT NULL) AND (T0$2.EL = n'friendOf' AND T0$2.EL IS NOT NULL) -- Query with ALL_EDGE_NL --------------------- SELECT /*+ USE_NL(T0$0 T0$1 T0$2) */ 7 AS "id(v1)$T", to_nchar(T0$0.SVID,'TM9','NLS_Numeric_Characters=''.,''') AS "id(v1)$V", T0$0.SVID AS "id(v1)$VN", to_timestamp_tz(null) AS "id(v1)$VT", 7 AS "id(v4)$T", to_nchar(T0$2.DVID,'TM9','NLS_Numeric_Characters=''.,''') AS "id(v4)$V", T0$2.DVID AS "id(v4)$VN", to_timestamp_tz(null) AS "id(v4)$VT" FROM "SCOTT".GRAPH1GT$ T0$0, "SCOTT".GRAPH1GT$ T0$1, "SCOTT".GRAPH1GT$ T0$2 WHERE T0$0.DVID=T0$1.SVID AND T0$1.DVID=T0$2.SVID AND (T0$0.EL = n'friendOf' AND T0$0.EL IS NOT NULL) AND (T0$1.EL = n'friendOf' AND T0$1.EL IS NOT NULL) AND (T0$2.EL = n'friendOf' AND T0$2.EL IS NOT NULL)
The query plan for the first query that uses ALL_EDGE_HASH should look similar to the following.
------------------------------------------ | Id | Operation | Name | ------------------------------------------ | 0 | SELECT STATEMENT | | |* 1 | HASH JOIN | | | 2 | PARTITION HASH ALL | | |* 3 | TABLE ACCESS FULL | GRAPH1GT$ | |* 4 | HASH JOIN | | | 5 | PARTITION HASH ALL| | |* 6 | TABLE ACCESS FULL| GRAPH1GT$ | | 7 | PARTITION HASH ALL| | |* 8 | TABLE ACCESS FULL| GRAPH1GT$ | ------------------------------------------
The query plan for the second query that uses ALL_EDGE_NL should look similar to the following.
------------------------------------------ | Id | Operation | Name | ------------------------------------------ | 0 | SELECT STATEMENT | | | 1 | NESTED LOOPS | | | 2 | NESTED LOOPS | | | 3 | PARTITION HASH ALL| | |* 4 | TABLE ACCESS FULL| GRAPH1GT$ | | 5 | PARTITION HASH ALL| | |* 6 | TABLE ACCESS FULL| GRAPH1GT$ | | 7 | PARTITION HASH ALL | | |* 8 | TABLE ACCESS FULL | GRAPH1GT$ | ------------------------------------------
4.8.3 Performance Considerations for PGQL Queries
Many factors affect the performance of PGQL queries in Oracle Database. The following are some recommended practices for query performance.
Query Optimizer Statistics
Good, up-to-date query optimizer statistics are critical for query performance. Ensure that you run OPG_APIS.ANALYZE_PG after any significant updates to your property graph data.
GT$ Skeleton Table
Without the GT$ skeleton table populated, many queries may require several sorts of the GE$ edge property table to evaluate graph traversals. Whenever possible, you should populate the GT$ skeleton table before running PGQL queries to avoid these expensive sorting operations.
Parallel Query Execution
Use parallel query execution to take advantage of Oracle’s parallel SQL engine. Parallel execution often gives a significant speedup versus serial execution. Parallel execution is especially critical for path queries evaluated using the recursive WITH strategy.
See also the Oracle Database VLDB and Partitioning Guide for more information about parallel query execution.
Optimizer Dynamic Sampling
Due to the inherent flexibility of the graph data model, static information may not always produce optimal query plans. In such cases, dynamic sampling can be used by the query optimizer to sample data at run time for better query plans. The amount of data sampled is controlled by the dynamic sampling level used. Dynamic sampling levels range from 0 to 11. The best level to use depends on a particular dataset and workload, but levels of 2 (default), 6 or 11 often give good results.
See also Supplemantal Dynamic Statistics in the Oracle Database SQL Tuning Guide.
Bind Variables
Use bind variables for constants whenever possible. The use of bind variables gives a very large reduction in query compilation time, which dramatically increases throughput for query workloads with queries that differ only in the constant values used. In addition, queries with bind variables are less vulnerable to injection attacks.
Path Queries
Path queries in PGQL that use the +
(plus sign) or *
(asterisk) operator to search for arbitrary length paths require special consideration because of their high computational complexity. You should use parallel execution and use the DISTINCT option for Recursive WITH (USE_DIST_RW=T) for the best performance. Also, for large, highly connected graphs, it is a good idea to use MAX_PATH_LEN=n to limit the number of repetitions of the recursive step to a reasonable number. A good strategy can be to start with a small repetition limit, and iteratively increase the limit to find more and more results.
Parent topic: Executing PGQL Queries Directly Against Oracle Database