3 Using the In-Memory Analyst (PGX)

The in-memory analyst feature of Oracle Spatial and Graph supports a set of analytical functions.

This chapter provides examples using the in-memory analyst (also referred to as Property Graph In-Memory Analytics, and often abbreviated as PGX in the Javadoc, command line, path descriptions, error messages, and examples). It contains the following major topics:

3.1 Reading a Graph into Memory

This topic provides an example of reading a graph interactively into memory using the shell interface.

These are the major steps:

3.1.1 Connecting to an In-Memory Analyst Server Instance

To start the in-memory analyst shell:

  1. Open a terminal session on the system where property graph support is installed.
  2. Either start a local (embedded) in-memory analyst instance or connect to a remote in-memory analyst instance:
    • Java example of starting a local (embedded) instance:
      import java.util.Map;
      import java.util.HashMap;
      import oracle.pgx.api.*;
      import oracle.pgx.config.PgxConfig.Field;
       
      String url = Pgx.EMBEDDED_URL; // local JVM
      ServerInstance instance = Pgx.getInstance(url);
      instance.startEngine(); // will use default configuration
      PgxSession session = instance.createSession("test");
    • Java example of connecting to a remote instance:
      import java.util.Map;
      import java.util.HashMap;
      import oracle.pgx.api.*;
      import oracle.pgx.config.PgxConfig.Field;
       
      String url = "http://my-server.com:8080/pgx"; // replace with base URL of your setup
      ServerInstance instance = Pgx.getInstance(url);
      PgxSession session = instance.createSession("test");
  3. In the shell, enter the following commands, but select only one of the commands to start or connect to the desired type of instance:
    export PGX_HOME=$ORACLE_HOME/md/property_graph/pgx
    cd $PGX_HOME
    ./bin/pgx --help
    ./bin/pgx --version
     
    # start embedded shell
    ./bin/pgx
     
    # start remote shell
    ./bin/pgx --base_url http://my-server.com:8080/pgx

    For the embedded shell, the output should be similar to the following:

    10:43:46,666 [main] INFO Ctrl$2 - >>> PGX engine running.
    pgx>
  4. Optionally, show the predefined variables:
    pgx> instance
    ==> PGX Server Instance running on embedded mode
    pgx> session
    ==> PGX session pgxShell registered at PGX Server Instance running on embedded mode
    pgx> analyst
    ==> Analyst for PGX session pgxShell registered at PGX Server Instance running on embedded mode
    pgx>

    Examples in some other topics assume that the instance and session variables have been set as shown here.

If the in-memory analyst software is installed correctly, you will see an engine-running log message and the in-memory analyst shell prompt (pgx>), as in the output shown in step 3.

The variables instance, session, and analyst are ready to use.

In the preceding example in this topic, the shell started a local instance because the pgx command did not specify a remote URL.

3.1.2 Using the Shell Help

The in-memory analyst shell provides a help system, which you access using the :help command.

3.1.3 Providing Graph Metadata in a Configuration File

This topic presents an example of providing graph metadata in a configuration file. Follow these steps to create a directory and some example files.

  1. Create a directory to hold the example files that you will create. For example:

    mkdir -p ${ORACLE_HOME}/md/property_graph/examples/pgx/graphs/
  2. In that directory, create a text file named sample.adj.json with the following content for the graph configuration file. This configuration file describes how the in-memory analyst reads the graph.

    {
      "uri": "sample.adj", 
      "format": "adj_list",
      "node_props": [{ 
        "name": "prop", 
        "type": "integer" 
      }],
      "edge_props": [{ 
        "name": "cost", 
        "type": "double" 
      }],
      "separator": " "
    }
  3. In the same directory, create a text file named sample.adj with the following content for the graph data:

    128 10 1908 27.03 99 8.51 
    99 2 333 338.0
    1908 889
    333 6 128 51.09

In the configuration file, the uri field provides the location of the graph data. This path resolves relative to the parent directory of the configuration file. When the in-memory analyst loads the graph, it searches for a file named sample.adj containing the graph data.

The other fields in the configuration file indicate that the graph data is provided in adjacency list format, and consists of one node property of type integer and one edge property of type double.
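
The layout that the configuration file describes can be sketched in plain Java. The following is a minimal illustration only, assuming that each line holds a vertex ID, the integer property, and then zero or more pairs of destination ID and cost, all separated by a single space. The class and method names (AdjListSketch, parse) are hypothetical and are not part of the in-memory analyst API; the actual loader may handle more cases.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative parser for the sample.adj layout; not the in-memory analyst's own loader. */
final class AdjListSketch {
    /** One directed edge with its single double property ("cost"). */
    static final class Edge {
        final int src;
        final int dst;
        final double cost;
        Edge(int src, int dst, double cost) { this.src = src; this.dst = dst; this.cost = cost; }
    }

    final Map<Integer, Integer> vertexProp = new LinkedHashMap<>(); // vertex ID -> "prop"
    final List<Edge> edges = new ArrayList<>();

    /** Assumed line layout: vertexId prop (dstId cost)*, separated by single spaces. */
    static AdjListSketch parse(String data) {
        AdjListSketch g = new AdjListSketch();
        for (String line : data.split("\n")) {
            String[] t = line.trim().split(" ");
            if (t.length < 2) continue; // skip blank or incomplete lines
            int src = Integer.parseInt(t[0]);
            g.vertexProp.put(src, Integer.parseInt(t[1]));
            for (int i = 2; i + 1 < t.length; i += 2) {
                g.edges.add(new Edge(src, Integer.parseInt(t[i]), Double.parseDouble(t[i + 1])));
            }
        }
        return g;
    }

    /** The graph data from sample.adj. */
    static final String SAMPLE =
        "128 10 1908 27.03 99 8.51\n" +
        "99 2 333 338.0\n" +
        "1908 889\n" +
        "333 6 128 51.09\n";

    public static void main(String[] args) {
        AdjListSketch g = parse(SAMPLE);
        System.out.println(g.vertexProp.size() + " vertices, " + g.edges.size() + " edges");
    }
}
```

Read this way, the sample data yields four vertices and four edges; vertex 1908 has a property value but no outgoing edges.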

The following figure shows a property graph created from the data:

Figure 3-1 Property Graph Rendered by sample.adj Data

3.1.4 Reading Graph Data into Memory

To read a graph into memory, you must pass the following information:

  • The path to the graph configuration file that specifies the graph metadata

  • A unique alphanumeric name that you can use to reference the graph

    An error results if you previously loaded a different graph with the same name.

Example: Using the Shell to Read a Graph

pgx> graph = session.readGraphWithProperties("<ORACLE_HOME>/md/property_graph/examples/pgx/graphs/sample.adj.json", "sample");
==> PGX Graph named sample bound to PGX session pgxShell ...
pgx> graph.getNumVertices()
==> 4

Example: Using Java to Read a Graph

import oracle.pgx.api.*;
 
PgxGraph graph = session.readGraphWithProperties("<ORACLE_HOME>/md/property_graph/examples/pgx/graphs/sample.adj.json");

The following topics contain additional examples of reading a property graph into memory:

3.1.4.1 Read a Graph Stored in Oracle Database into Memory

To read a property graph stored in Oracle Database, you can create a JSON based configuration file as follows. Note that the hosts, store name, graph name, and other information must be customized for your own setup.

% cat /tmp/my_graph_oracle.json
{"loading":{"load_edge_label":false},
"vertex_props":[
{"default":"default_name","name":"name","type":"string"}
],
"password":"<YOUR_PASSWORD>",
"db_engine":"RDBMS",
"max_num_connections":8,
"username":"scott",
"error_handling":{},"format":"pg","jdbc_url":"jdbc:oracle:thin:@127.0.0.1:1521:<SID>",
"name":"connections",
"edge_props":[
{"default":"1000000","name":"cost","type":"double"}
]
}

Then, read the graph into memory using the configuration file. The following example snippet reads the graph into memory and counts the number of triangles.

pgx> g = session.readGraphWithProperties("/tmp/my_graph_oracle.json", "connections")
pgx> analyst.countTriangles(g, false)
==> 8

3.1.4.2 Read a Graph Stored in the Local File System into Memory

The following command uses the configuration file from Providing Graph Metadata in a Configuration File and the name my-graph:

pgx> g = session.readGraphWithProperties("<ORACLE_HOME>/md/property_graph/examples/pgx/graphs/sample.adj.json", "my-graph")

3.2 Reading Custom Graph Data

You can read your own custom graph data.

This example creates a graph, alters it, and shows how to read it properly. This graph uses the adjacency list format, but the in-memory analyst supports several graph formats.

The main steps are:

3.2.1 Creating a Simple Graph File

This example creates a small, simple graph in adjacency list format with no vertex or edge properties. Each line contains the vertex (node) ID, followed by the vertex IDs to which its outgoing edges point:

1 2
2 3 4
3 4
4 2

In this list, a single space separates the individual tokens. The in-memory analyst supports other separators, which you can specify in the graph configuration file.

The following figure shows the data rendered as a property graph with 4 vertices and 5 edges. (There are two edges between vertex 2 and vertex 4, each pointing in the direction opposite from the other.)

Figure 3-2 Simple Custom Property Graph

Reading a graph into the in-memory analyst requires a graph configuration. You can provide the graph configuration using either of these methods:

  • Write the configuration settings in JSON format into a file

  • Use a Java GraphConfigBuilder object

The following examples show both methods.

JSON Configuration

{
    "uri": "graph.adj",
    "format":"adj_list",
    "separator":" "
}

Java Configuration

import oracle.pgx.config.FileGraphConfig;
import oracle.pgx.config.Format;
import oracle.pgx.config.GraphConfigBuilder;
FileGraphConfig config = GraphConfigBuilder 
   .forFileFormat(Format.ADJ_LIST) 
   .setUri("graph.adj") 
   .setSeparator(" ") 
   .build();

3.2.2 Adding a Vertex Property

The graph in Creating a Simple Graph File consists of vertices and edges, without vertex or edge properties. Vertex properties are positioned directly after the source vertex ID in each line. The graph data would look like this if you added a double vertex (node) property with values 0.1, 2.0, 0.3, and 4.56789 to the graph:

1 0.1 2
2 2.0 3 4
3 0.3 4
4 4.56789 2

Note:

The in-memory analyst supports only homogeneous graphs, in which all vertices have the same number and type of properties.

For the in-memory analyst to read the modified data file, you must add a vertex (node) property in the configuration file or the builder code. The following examples provide a descriptive name for the property and set the type to double.

JSON Configuration

{
    "uri": "graph.adj",
    "format":"adj_list",
    "separator":" ",
    "node_props":[{
        "name":"double-prop",
        "type":"double"
    }]
}

Java Configuration

import oracle.pgx.common.types.PropertyType;
import oracle.pgx.config.FileGraphConfig;
import oracle.pgx.config.Format;
import oracle.pgx.config.GraphConfigBuilder;

FileGraphConfig config = GraphConfigBuilder.forFileFormat(Format.ADJ_LIST) 
    .setUri("graph.adj") 
    .setSeparator(" ") 
    .addNodeProperty("double-prop", PropertyType.DOUBLE) 
    .build();

3.2.3 Using Strings as Vertex Identifiers

The previous examples used integer vertex (node) IDs. The default in In-Memory Analytics is integer vertex IDs, but you can define a graph to use string vertex IDs instead.

This data file uses "node 1", "node 2", and so forth instead of just the digit:

"node 1" 0.1 "node 2"
"node 2" 2.0 "node 3" "node 4"
"node 3" 0.3 "node 4"
"node 4" 4.56789 "node 2"

Again, you must modify the graph configuration to match the data file:

JSON Configuration

{
    "uri": "graph.adj",
    "format":"adj_list",
    "separator":" ",
    "node_props":[{
        "name":"double-prop",
        "type":"double"
    }],
    "node_id_type":"string"
}

Java Configuration

import oracle.pgx.common.types.IdType;
import oracle.pgx.common.types.PropertyType;
import oracle.pgx.config.FileGraphConfig;
import oracle.pgx.config.Format;
import oracle.pgx.config.GraphConfigBuilder;

FileGraphConfig config = GraphConfigBuilder.forFileFormat(Format.ADJ_LIST) 
    .setUri("graph.adj") 
    .setSeparator(" ") 
    .addNodeProperty("double-prop", PropertyType.DOUBLE) 
    .setNodeIdType(IdType.STRING) 
    .build();

Note:

String vertex IDs consume much more memory than integer vertex IDs.

Any single or double quotes inside the string must be escaped with a backslash (\).

Newlines (\n) inside strings are not supported.
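
The quoting rules in the preceding note can be illustrated with a small tokenizer. This is a sketch under stated assumptions, not the in-memory analyst's parser: it treats double quotes as token delimiters, takes any character that follows a backslash literally, and splits on the separator only outside quotes. The class and method names (QuotedTokenizerSketch, tokenize) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative tokenizer for quoted IDs; the actual parser may behave differently. */
final class QuotedTokenizerSketch {
    static List<String> tokenize(String line, char separator) {
        List<String> tokens = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        boolean hasToken = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '\\' && i + 1 < line.length()) {
                cur.append(line.charAt(++i)); // escaped character is taken literally
                hasToken = true;
            } else if (c == '"') {
                inQuotes = !inQuotes;         // quotes delimit the token but are not part of it
                hasToken = true;
            } else if (c == separator && !inQuotes) {
                if (hasToken) tokens.add(cur.toString());
                cur.setLength(0);
                hasToken = false;
            } else {
                cur.append(c);
                hasToken = true;
            }
        }
        if (inQuotes) throw new IllegalArgumentException("Unterminated quote: " + line);
        if (hasToken) tokens.add(cur.toString());
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [node 2, 2.0, node 3, node 4]
        System.out.println(tokenize("\"node 2\" 2.0 \"node 3\" \"node 4\"", ' '));
    }
}
```

Because the separator is ignored inside quotes, an ID such as "node 2" stays a single token even though the separator is a space.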

3.2.4 Adding an Edge Property

This example adds an edge property of type string to the graph. The edge properties are positioned after the destination vertex (node) ID.

"node1" 0.1 "node2" "edge_prop_1_2"
"node2" 2.0 "node3" "edge_prop_2_3" "node4" "edge_prop_2_4"
"node3" 0.3 "node4" "edge_prop_3_4"
"node4" 4.56789 "node2" "edge_prop_4_2"

The graph configuration must match the data file:

JSON Configuration

{
    "uri": "graph.adj",
    "format":"adj_list",
    "separator":" ",
    "node_props":[{
        "name":"double-prop",
        "type":"double"
    }],
    "node_id_type":"string",
    "edge_props":[{
        "name":"edge-prop",
        "type":"string"
    }]
}

Java Configuration

import oracle.pgx.common.types.IdType;
import oracle.pgx.common.types.PropertyType;
import oracle.pgx.config.FileGraphConfig;
import oracle.pgx.config.Format;
import oracle.pgx.config.GraphConfigBuilder;

FileGraphConfig config = GraphConfigBuilder.forFileFormat(Format.ADJ_LIST) 
    .setUri("graph.adj") 
    .setSeparator(" ") 
    .addNodeProperty("double-prop", PropertyType.DOUBLE) 
    .setNodeIdType(IdType.STRING) 
    .addEdgeProperty("edge-prop", PropertyType.STRING) 
    .build();

3.3 Storing Graph Data on Disk

After reading a graph into memory using either Java or the Shell, you can store it on disk in different formats. You can then use the stored graph data as input to the in-memory analyst at a later time.

Storing graphs over HTTP/REST is currently not supported.

The options include:

3.3.1 Storing the Results of Analysis in a Vertex Property

This example reads a graph into memory and analyzes it using the Pagerank algorithm. This analysis creates a new vertex property to store the PageRank values.

Using the Shell to Run PageRank

pgx> g = session.readGraphWithProperties("<ORACLE_HOME>/md/property_graph/examples/pgx/graphs/sample.adj.json", "my-graph")
==> ...
pgx> rank = analyst.pagerank(g, 0.001, 0.85, 100)

Using Java to Run PageRank

PgxGraph g = session.readGraphWithProperties("<ORACLE_HOME>/md/property_graph/examples/pgx/graphs/sample.adj.json", "my-graph");
VertexProperty<Integer, Double> rank = session.createAnalyst().pagerank(g, 0.001, 0.85, 100);

3.3.2 Storing a Graph in Edge-List Format on Disk

This example stores the graph, the result of the Pagerank analysis, and all original edge properties as a file in edge-list format on disk.

To store a graph, you must specify:

  • The graph format

  • A path where the file will be stored

  • The properties to be stored. Specify VertexProperty.ALL or EdgeProperty.ALL to store all properties, or VertexProperty.NONE or EdgeProperty.NONE to store no properties. To specify individual properties, pass in the VertexProperty or EdgeProperty objects you want to store.

  • A flag that indicates whether to overwrite an existing file with the same name

The following examples store the graph data in /tmp/sample_pagerank.elist, with the /tmp/sample_pagerank.elist.json configuration file. The return value is the graph configuration for the stored file. You can use it to read the graph again.

Using the Shell to Store a Graph

pgx> config = g.store(Format.EDGE_LIST, "/tmp/sample_pagerank.elist", [rank], EdgeProperty.ALL, false)
==> {"uri":"/tmp/sample_pagerank.elist","edge_props":[{"type":"double","name":"cost"}],"vertex_id_type":"integer","loading":{},"format":"edge_list","attributes":{},"vertex_props":[{"type":"double","name":"pagerank"}],"error_handling":{}}

Using Java to Store a Graph

import java.util.Collections;
import oracle.pgx.api.*;
import oracle.pgx.config.*;
 
FileGraphConfig config = g.store(Format.EDGE_LIST, "/tmp/sample_pagerank.elist", Collections.singletonList(rank), EdgeProperty.ALL, false);

3.4 Executing Built-in Algorithms

The in-memory analyst contains a set of built-in algorithms that are available as Java APIs.

This topic describes the use of the in-memory analyst using Triangle Counting and Pagerank analytics as examples.

3.4.1 About the In-Memory Analyst

The in-memory analyst contains a set of built-in algorithms that are available as Java APIs. The details of the APIs are documented in the Javadoc that is included in the product documentation library. Specifically, see the BuiltinAlgorithms interface Method Summary for a list of the supported in-memory analyst methods.

For example, this is the Pagerank procedure signature:

/**
   * Classic pagerank algorithm. Time complexity: O(E * K) with E = number of edges, K is a given constant (max
   * iterations)
   *
   * @param graph
   *          graph
   * @param e
   *          maximum error for terminating the iteration
   * @param d
   *          damping factor
   * @param max
   *          maximum number of iterations
   * @return Vertex Property holding the result as a double
   */
  public <ID extends Comparable<ID>> VertexProperty<ID, Double> pagerank(PgxGraph graph, double e, double d, int max);

3.4.2 Running the Triangle Counting Algorithm

For triangle counting, the sortByDegree boolean parameter of countTriangles() allows you to control whether the graph should first be sorted by degree (true) or not (false). If true, more memory will be used, but the algorithm will run faster; however, if your graph is very large, you might want to turn this optimization off to avoid running out of memory.

Using the Shell to Run Triangle Counting

pgx> analyst.countTriangles(graph, true)
==> 1

Using Java to Run Triangle Counting

import oracle.pgx.api.*;
 
Analyst analyst = session.createAnalyst();
long triangles = analyst.countTriangles(graph, true);

The algorithm finds one triangle in the sample graph.
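
To see why the result is 1, the following plain-Java sketch counts triangles in the sample graph while ignoring edge direction. It is an illustration only: it uses a simple neighbor-set intersection, does not model the sortByDegree optimization, and its names (TriangleCountSketch, countTriangles) are hypothetical rather than part of the in-memory analyst API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative triangle counting; not the built-in countTriangles() implementation. */
final class TriangleCountSketch {
    /** Edges of the sample graph as {source, destination} pairs. */
    static final int[][] SAMPLE = { {128, 1908}, {128, 99}, {99, 333}, {333, 128} };

    /** Counts each triangle once by visiting its vertices in increasing ID order. */
    static long countTriangles(int[][] edges) {
        Map<Integer, Set<Integer>> nbrs = new HashMap<>();
        for (int[] e : edges) {
            if (e[0] == e[1]) continue; // a self-loop cannot be part of a triangle
            nbrs.computeIfAbsent(e[0], k -> new HashSet<>()).add(e[1]);
            nbrs.computeIfAbsent(e[1], k -> new HashSet<>()).add(e[0]);
        }
        long count = 0;
        for (Map.Entry<Integer, Set<Integer>> entry : nbrs.entrySet()) {
            int u = entry.getKey();
            for (int v : entry.getValue()) {
                if (v <= u) continue;
                for (int w : nbrs.get(v)) {
                    // u < v < w, and w must also be adjacent to u
                    if (w > v && entry.getValue().contains(w)) count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countTriangles(SAMPLE)); // prints 1 (vertices 99, 128, 333)
    }
}
```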

Tip:

When using the in-memory analyst shell, you can increase the amount of log output during execution by changing the logging level. See information about the :loglevel command with :h :loglevel.

3.4.3 Running the Pagerank Algorithm

Pagerank computes a rank value between 0 and 1 for each vertex (node) in the graph and stores the values in a double property. The algorithm therefore creates a vertex property of type double for the output.

In the in-memory analyst, there are two types of vertex and edge properties:

  • Persistent Properties: Properties that are loaded with the graph from a data source are fixed, in-memory copies of the data on disk, and are therefore persistent. Persistent properties are read-only, immutable and shared between sessions.

  • Transient Properties: Values can only be written to transient properties, which are session private. You can create transient properties by calling createVertexProperty and createEdgeProperty on PgxGraph objects.

This example obtains the top three vertices with the highest Pagerank values. It uses a transient vertex property of type double to hold the computed Pagerank values. The Pagerank algorithm uses the following default values for the input parameters: error (tolerance) = 0.001, damping factor = 0.85, and maximum number of iterations = 100.

Using the Shell to Run Pagerank

pgx> rank = analyst.pagerank(graph, 0.001, 0.85, 100);
==> ...
pgx> rank.getTopKValues(3)
==> 128=0.1402019732468347
==> 333=0.12002296283541904
==> 99=0.09708583862990475

Using Java to Run Pagerank

import java.util.Map.Entry;
import oracle.pgx.api.*;
 
Analyst analyst = session.createAnalyst();
VertexProperty<Integer, Double> rank = analyst.pagerank(graph, 0.001, 0.85, 100);
for (Entry<Integer, Double> entry : rank.getTopKValues(3)) {
 System.out.println(entry.getKey() + "=" + entry.getValue());
}
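
The roles of the three parameters (error, damping factor, and maximum number of iterations) can be sketched with a textbook power iteration in plain Java. Several choices here are assumptions rather than documented behavior of the built-in algorithm: every vertex starts at 1/N, rank held by vertices without outgoing edges is not redistributed, and iteration stops when the sum of absolute rank changes falls below the error. The names PageRankSketch and its pagerank method are hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative PageRank power iteration; not the in-memory analyst implementation. */
final class PageRankSketch {
    static Map<Integer, Double> pagerank(int[] vertices, int[][] edges, double e, double d, int max) {
        int n = vertices.length;
        Map<Integer, Integer> outDegree = new HashMap<>();
        for (int[] edge : edges) outDegree.merge(edge[0], 1, Integer::sum);

        Map<Integer, Double> rank = new LinkedHashMap<>();
        for (int v : vertices) rank.put(v, 1.0 / n); // assumed uniform start

        for (int iter = 0; iter < max; iter++) {
            Map<Integer, Double> next = new LinkedHashMap<>();
            for (int v : vertices) next.put(v, (1.0 - d) / n);
            for (int[] edge : edges) {
                // each source vertex splits its damped rank evenly among its outgoing edges
                double share = d * rank.get(edge[0]) / outDegree.get(edge[0]);
                next.merge(edge[1], share, Double::sum);
            }
            double diff = 0.0;
            for (int v : vertices) diff += Math.abs(next.get(v) - rank.get(v));
            rank = next;
            if (diff < e) break; // assumed stopping rule: total change below the error
        }
        return rank;
    }

    public static void main(String[] args) {
        int[] vertices = {128, 99, 1908, 333};
        int[][] edges = { {128, 1908}, {128, 99}, {99, 333}, {333, 128} };
        System.out.println(pagerank(vertices, edges, 0.001, 0.85, 100));
    }
}
```

Under these assumptions, the ranks computed for the sample graph land close to the values in the shell output above, with vertex 128 ranked highest, followed by 333 and 99. That agreement suggests, but does not confirm, that the built-in algorithm follows a similar scheme.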

3.5 Creating Subgraphs

You can create subgraphs based on a graph that has been loaded into memory. You can use filter expressions or create bipartite subgraphs based on a vertex (node) collection that specifies the left set of the bipartite graph.

For information about reading a graph into memory, see Reading Graph Data into Memory.

3.5.1 About Filter Expressions

Filter expressions are expressions that are evaluated for each edge. The expression can define predicates that an edge must fulfil to be contained in the result, in this case a subgraph.

Consider the graph in Providing Graph Metadata in a Configuration File, which consists of four vertices (nodes) and four edges. For an edge to match the filter expression src.prop == 10, the source vertex prop property must equal 10. Two edges match that filter expression, as shown in the following figure.

Figure 3-3 Edges Matching src.prop == 10

The following figure shows the graph that results when the filter is applied. The filter excludes the edges associated with vertex 333, and the vertex itself.

Figure 3-4 Graph Created by the Simple Filter

Using filter expressions to select a single vertex or a set of vertices is difficult. For example, selecting only the vertex with the property value 10 is impossible, because the only way to match the vertex is to match an edge where 10 is either the source or destination property value. However, when you match an edge you automatically include the source vertex, destination vertex, and the edge itself in the result.
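
The per-edge evaluation can be modeled in plain Java with a predicate over edges. The sketch below is illustrative only and does not use the in-memory analyst filter engine; the names EdgeFilterSketch, filter, and vertices are hypothetical. It applies the equivalent of src.prop == 10 to the sample graph and shows that every kept edge brings both of its endpoints into the result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.function.Predicate;

/** Illustrative edge filtering over the sample graph; not the PGX filter engine. */
final class EdgeFilterSketch {
    static final class Edge {
        final int src;
        final int dst;
        final double cost;
        Edge(int src, int dst, double cost) { this.src = src; this.dst = dst; this.cost = cost; }
    }

    /** Vertex ID -> value of the integer property "prop" from sample.adj. */
    static final Map<Integer, Integer> PROP = new HashMap<>();
    static {
        PROP.put(128, 10);
        PROP.put(99, 2);
        PROP.put(1908, 889);
        PROP.put(333, 6);
    }

    static final List<Edge> EDGES = Arrays.asList(
        new Edge(128, 1908, 27.03), new Edge(128, 99, 8.51),
        new Edge(99, 333, 338.0), new Edge(333, 128, 51.09));

    /** Keeps the edges that satisfy the predicate. */
    static List<Edge> filter(List<Edge> edges, Predicate<Edge> predicate) {
        List<Edge> kept = new ArrayList<>();
        for (Edge e : edges) {
            if (predicate.test(e)) kept.add(e);
        }
        return kept;
    }

    /** A kept edge brings both of its endpoints into the subgraph. */
    static TreeSet<Integer> vertices(List<Edge> edges) {
        TreeSet<Integer> ids = new TreeSet<>();
        for (Edge e : edges) {
            ids.add(e.src);
            ids.add(e.dst);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Edge> kept = filter(EDGES, edge -> PROP.get(edge.src) == 10); // src.prop == 10
        // Prints: 2 edges, vertices [99, 128, 1908]
        System.out.println(kept.size() + " edges, vertices " + vertices(kept));
    }
}
```

The result contains vertices 99 and 1908 even though their property values are not 10, which is why an edge-based filter cannot isolate a single vertex.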

3.5.2 Using a Simple Filter to Create a Subgraph

The following examples create the subgraph described in About Filter Expressions.

Using the Shell to Create a Subgraph

pgx> subgraph = graph.filter(new EdgeFilter("src.prop == 10"))

Using Java to Create a Subgraph

import oracle.pgx.api.*;
import oracle.pgx.api.filter.*;

PgxGraph graph = session.readGraphWithProperties(...);
PgxGraph subgraph = graph.filter(new EdgeFilter("src.prop == 10"));

3.5.3 Using a Complex Filter to Create a Subgraph

This example uses a slightly more complex filter. It uses the outDegree function, which calculates the number of outgoing edges for an identifier (source src or destination dst). The following filter expression matches all edges with a cost property value greater than 50 and a destination vertex (node) with an outDegree greater than 1.

dst.outDegree() > 1 && edge.cost > 50

One edge in the sample graph matches this filter expression, as shown in the following figure.

Figure 3-5 Edges Matching the outDegree Filter

The following figure shows the graph that results when the filter is applied. The filter excludes the edges associated with vertices 99 and 1908, and so excludes those vertices also.

Figure 3-6 Graph Created by the outDegree Filter
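
The outcome of the outDegree filter can be checked by hand with a short plain-Java sketch. It is an illustration only, not the in-memory analyst filter engine, and the names OutDegreeFilterSketch, outDegrees, and matching are hypothetical. It first counts the outgoing edges of each vertex and then keeps the edges for which the destination has more than one outgoing edge and the cost exceeds 50.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative evaluation of dst.outDegree() > 1 && edge.cost > 50 on the sample graph. */
final class OutDegreeFilterSketch {
    static final class Edge {
        final int src;
        final int dst;
        final double cost;
        Edge(int src, int dst, double cost) { this.src = src; this.dst = dst; this.cost = cost; }
    }

    static final List<Edge> EDGES = Arrays.asList(
        new Edge(128, 1908, 27.03), new Edge(128, 99, 8.51),
        new Edge(99, 333, 338.0), new Edge(333, 128, 51.09));

    /** Number of outgoing edges per source vertex. */
    static Map<Integer, Integer> outDegrees(List<Edge> edges) {
        Map<Integer, Integer> out = new HashMap<>();
        for (Edge e : edges) out.merge(e.src, 1, Integer::sum);
        return out;
    }

    static List<Edge> matching(List<Edge> edges) {
        Map<Integer, Integer> out = outDegrees(edges);
        List<Edge> kept = new ArrayList<>();
        for (Edge e : edges) {
            int dstOutDegree = out.getOrDefault(e.dst, 0); // no outgoing edges means degree 0
            if (dstOutDegree > 1 && e.cost > 50) kept.add(e);
        }
        return kept;
    }

    public static void main(String[] args) {
        for (Edge e : matching(EDGES)) {
            System.out.println(e.src + " -> " + e.dst + " (cost " + e.cost + ")");
        }
    }
}
```

Only the edge from 333 to 128 qualifies: the edge from 99 to 333 has a cost above 50, but vertex 333 has a single outgoing edge.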

3.5.4 Using a Vertex Set to Create a Bipartite Subgraph

You can create a bipartite subgraph by specifying a set of vertices (nodes), which are used as the left side. A bipartite subgraph has edges only between the left set of vertices and the right set of vertices. There are no edges within those sets, such as between two nodes on the left side. In the in-memory analyst, vertices that are isolated because all incoming and outgoing edges were deleted are not part of the bipartite subgraph.

The following figure shows a bipartite subgraph. No properties are shown.

The following examples create a bipartite subgraph from the simple graph created in Providing Graph Metadata in a Configuration File. They create a vertex collection and fill it with the vertices for the left side.

Using the Shell to Create a Bipartite Subgraph

pgx> s = graph.createVertexSet()
==> ...
pgx> s.addAll([graph.getVertex(333), graph.getVertex(99)])
==> ...
pgx> s.size()
==> 2
pgx> bGraph = graph.bipartiteSubGraphFromLeftSet(s)
==> PGX Bipartite Graph named sample-sub-graph-4

Using Java to Create a Bipartite Subgraph

import oracle.pgx.api.*;
 
VertexSet<Integer> s = graph.createVertexSet();
s.addAll(graph.getVertex(333), graph.getVertex(99));
BipartiteGraph bGraph = graph.bipartiteSubGraphFromLeftSet(s);

When you create a subgraph, the in-memory analyst automatically creates a Boolean vertex (node) property that indicates whether the vertex is on the left side. You can specify a unique name for the property.

The resulting bipartite subgraph looks like this:

Vertex 1908 is excluded from the bipartite subgraph. The only edge that connected that vertex extended from 128 to 1908. The edge was removed, because it violated the bipartite properties of the subgraph. Vertex 1908 had no other edges, and so was removed also.
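
The rule behind this result can be sketched in plain Java: keep only the edges that have exactly one endpoint in the left set, and then keep only the vertices that still have an edge. This is an illustration of the rule as described in this topic, not the in-memory analyst implementation, and the names BipartiteSketch, crossingEdges, and remainingVertices are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative bipartite subgraph construction from a left vertex set. */
final class BipartiteSketch {
    /** Edges of the sample graph as {source, destination} pairs. */
    static final int[][] EDGES = { {128, 1908}, {128, 99}, {99, 333}, {333, 128} };

    /** Keeps edges that connect the left set with the right set, in either direction. */
    static List<int[]> crossingEdges(int[][] edges, Set<Integer> left) {
        List<int[]> kept = new ArrayList<>();
        for (int[] e : edges) {
            if (left.contains(e[0]) != left.contains(e[1])) kept.add(e);
        }
        return kept;
    }

    /** Vertices left isolated by the edge removal are dropped. */
    static Set<Integer> remainingVertices(List<int[]> edges) {
        Set<Integer> ids = new TreeSet<>();
        for (int[] e : edges) {
            ids.add(e[0]);
            ids.add(e[1]);
        }
        return ids;
    }

    public static void main(String[] args) {
        Set<Integer> left = new HashSet<>(Arrays.asList(333, 99));
        // Prints [99, 128, 333]
        System.out.println(remainingVertices(crossingEdges(EDGES, left)));
    }
}
```

With 333 and 99 on the left side, the edge from 99 to 333 is dropped because both endpoints are on the left, and the edge from 128 to 1908 is dropped because both endpoints are on the right.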

3.6 Using Automatic Delta Refresh to Handle Database Changes

You can automatically refresh (auto-refresh) graphs periodically to keep the in-memory graph synchronized with changes to the underlying property graph in the database.

3.6.1 Configuring the In-Memory Server for Auto-Refresh

Because auto-refresh can create many snapshots and can therefore lead to high memory usage, the option to enable auto-refresh for graphs is, by default, available only to administrators.

To allow all users to auto-refresh graphs, include the following setting in the in-memory analyst configuration file (located in $ORACLE_HOME/md/property_graph/pgx/conf/pgx.conf):

{
  "allow_user_auto_refresh": true
}

3.6.2 Configuring Basic Auto-Refresh

Auto-refresh is configured in the loading section of the graph configuration. The example in this topic sets up auto-refresh to check for updates every minute, and to create a new snapshot when the data source has changed.

The following block (JSON format) enables the auto-refresh feature in the configuration file of the sample graph:

{
  "format": "pg",
  "jdbc_url": "jdbc:oracle:thin:@mydatabaseserver:1521/dbName",
  "username": "scott",
  "password": "<password>",
  "name": "my_graph",
  "vertex_props": [{
    "name": "prop",
    "type": "integer"
  }],
  "edge_props": [{
    "name": "cost",
    "type": "double"
  }],
  "separator": " ",
  "loading": {
    "auto_refresh": true,
    "update_interval_sec": 60
  }
}

Notice the additional loading section containing the auto-refresh settings. You can also use the Java APIs to construct the same graph configuration programmatically:

GraphConfig config = GraphConfigBuilder.forPropertyGraphRdbms()
  .setJdbcUrl("jdbc:oracle:thin:@mydatabaseserver:1521/dbName")
  .setUsername("scott")
  .setPassword("<password>")
  .setName("my_graph")
  .addVertexProperty("prop", PropertyType.INTEGER)
  .addEdgeProperty("cost", PropertyType.DOUBLE)
  .setAutoRefresh(true)
  .setUpdateIntervalSec(60)
  .build();

3.6.3 Reading the Graph Using the In-Memory Analyst or a Java Application

After creating the graph configuration, you can load the graph into the in-memory analyst using the regular APIs.

pgx> G = session.readGraphWithProperties("graphs/my-config.pg.json")

After the graph is loaded, a background task is started automatically, and it periodically checks the data source for updates.

3.6.4 Checking Out a Specific Snapshot of the Graph

The database is queried every minute for updates. If the graph has changed in the database after the time interval passed, the graph is reloaded and a new snapshot is created in-memory automatically.

You can "check out" (move a pointer to) a different in-memory snapshot of the graph. To list the available snapshots, use the getAvailableSnapshots() method of PgxSession. Example output is as follows:

pgx> session.getAvailableSnapshots(G)
==> GraphMetaData [getNumVertices()=4, getNumEdges()=4, memoryMb=0, dataSourceVersion=1453315103000, creationRequestTimestamp=1453315122669 (2016-01-20 10:38:42.669), creationTimestamp=1453315122685 (2016-01-20 10:38:42.685), vertexIdType=integer, edgeIdType=long]
==> GraphMetaData [getNumVertices()=5, getNumEdges()=5, memoryMb=3, dataSourceVersion=1452083654000, creationRequestTimestamp=1453314938744 (2016-01-20 10:35:38.744), creationTimestamp=1453314938833 (2016-01-20 10:35:38.833), vertexIdType=integer, edgeIdType=long]

The preceding example output contains two entries, one for the originally loaded graph with 4 vertices and 4 edges, and one for the graph created by auto-refresh with 5 vertices and 5 edges.

To check out a specific snapshot of the graph, use the setSnapshot() method of PgxSession and give it the creationTimestamp of the snapshot you want to load.

For example, if G is pointing to the newer graph with 5 vertices and 5 edges, but you want to analyze the older version of the graph, you need to set the snapshot to 1453315122685. In the in-memory analyst shell:

pgx> G.getNumVertices()
==> 5
pgx> G.getNumEdges()
==> 5

pgx> session.setSnapshot( G, 1453315122685 )
==> null

pgx> G.getNumVertices()
==> 4
pgx> G.getNumEdges()
==> 4

You can also load a specific snapshot of a graph directly using the readGraphAsOf() method of PgxSession. This is a shortcut for loading a graph with readGraphWithProperties() followed by a setSnapshot(). For example:

pgx> G = session.readGraphAsOf( config, 1453315122685 )

If you do not know or care which snapshots are currently available in memory, you can instead specify how “old” an acceptable snapshot may be by passing a maximum allowed age. For example, to specify a maximum snapshot age of 60 minutes, you can use the following:

pgx> G = session.readGraphWithProperties( config, 60l, TimeUnit.MINUTES )

If there are one or more snapshots in memory younger (newer) than the specified maximum age, the youngest (newest) of those snapshots will be returned. If all the available snapshots are older than the specified maximum age, or if there is no snapshot available at all, then a new snapshot will be created automatically.
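
The selection rule can be expressed as a few lines of plain Java. The sketch below is an illustration of the rule as described here, not the in-memory analyst implementation; the names SnapshotAgeSketch and newestWithin are hypothetical, and the assumption that a snapshot exactly at the age limit still qualifies is a guess. An empty result stands for the case in which a new snapshot would be created.

```java
import java.util.OptionalLong;
import java.util.concurrent.TimeUnit;

/** Illustrative snapshot selection by maximum age. */
final class SnapshotAgeSketch {
    /**
     * Returns the newest creation timestamp that is not older than maxAge,
     * or an empty result if no snapshot qualifies.
     */
    static OptionalLong newestWithin(long[] creationTimestamps, long nowMillis, long maxAge, TimeUnit unit) {
        long cutoff = nowMillis - unit.toMillis(maxAge);
        boolean found = false;
        long best = 0L;
        for (long ts : creationTimestamps) {
            if (ts >= cutoff && (!found || ts > best)) {
                best = ts;
                found = true;
            }
        }
        return found ? OptionalLong.of(best) : OptionalLong.empty();
    }

    public static void main(String[] args) {
        // creationTimestamp values taken from the example snapshot listing
        long[] created = {1453315122685L, 1453314938833L};
        long now = 1453315122685L + TimeUnit.MINUTES.toMillis(10); // hypothetical current time
        System.out.println(newestWithin(created, now, 60L, TimeUnit.MINUTES));
    }
}
```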

3.6.5 Advanced Auto-Refresh Configuration

You can specify advanced options for auto-refresh configuration.

Internally, the in-memory analyst fetches the changes since the last check from the database and creates a new snapshot by applying the delta (changes) to the previous snapshot. There are two timers: one for fetching and caching the deltas from the database, the other for actually applying the deltas and creating a new snapshot.

Additionally, you can specify a threshold for the number of cached deltas. If the number of cached changes grows above this threshold, a new snapshot is created automatically. The number of cached changes is a simple sum of the number of vertex changes plus the number of edge changes.

The deltas are fetched periodically and cached on the in-memory analyst server for two reasons:

  • To speed up the actual snapshot creation process

  • To account for the case that the database can "forget" changes after a while

You can specify both a threshold and an update timer, which means that both conditions will be checked before a new snapshot is created. At least one of these parameters (threshold or update timer) must be specified to prevent the delta cache from becoming too large. The interval at which the source is queried for changes must not be omitted.

The following parameters show a configuration where the data source is queried for new deltas every 5 minutes. New snapshots are created every 20 minutes or if the cached deltas reach a size of 1000 changes.

{
  "format": "pg",
  "jdbc_url": "jdbc:oracle:thin:@mydatabaseserver:1521/dbName",
  "username": "scott",
  "password": "<your_password>",
  "name": "my_graph",

  "loading": {
    "auto_refresh": true,
    "fetch_interval_sec": 300,
    "update_interval_sec": 1200,
    "update_threshold": 1000,
    "create_edge_id_index": true,
    "create_edge_id_mapping": true
  }
}
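
The interplay of the update timer and the threshold in this configuration can be sketched in plain Java. The class below is a simplified model, not the in-memory analyst implementation: its names (DeltaCacheSketch, cacheDeltas, snapshotDue, snapshotCreated) are hypothetical, and treating the threshold as reached when the cached change count equals it is an assumption.

```java
/** Illustrative model of when auto-refresh would create a new snapshot. */
final class DeltaCacheSketch {
    private final long updateIntervalSec;
    private final long updateThreshold;
    private long cachedChanges;
    private long lastSnapshotSec;

    DeltaCacheSketch(long updateIntervalSec, long updateThreshold, long startSec) {
        this.updateIntervalSec = updateIntervalSec;
        this.updateThreshold = updateThreshold;
        this.lastSnapshotSec = startSec;
    }

    /** Called after each fetch; the cache size is vertex changes plus edge changes. */
    void cacheDeltas(long vertexChanges, long edgeChanges) {
        cachedChanges += vertexChanges + edgeChanges;
    }

    /** A snapshot is due when either the timer has elapsed or the threshold is reached. */
    boolean snapshotDue(long nowSec) {
        boolean timerElapsed = nowSec - lastSnapshotSec >= updateIntervalSec;
        boolean thresholdReached = cachedChanges >= updateThreshold;
        return timerElapsed || thresholdReached;
    }

    /** Applying the cached deltas empties the cache and restarts the timer. */
    void snapshotCreated(long nowSec) {
        cachedChanges = 0;
        lastSnapshotSec = nowSec;
    }

    public static void main(String[] args) {
        // update_interval_sec = 1200 and update_threshold = 1000, as in the configuration
        DeltaCacheSketch cache = new DeltaCacheSketch(1200, 1000, 0);
        cache.cacheDeltas(400, 300);               // first fetch, at 300 seconds
        System.out.println(cache.snapshotDue(300));
        cache.cacheDeltas(200, 100);               // second fetch, at 600 seconds
        System.out.println(cache.snapshotDue(600));
    }
}
```

In this model the second fetch brings the cache to 1000 changes, so a snapshot becomes due after 10 minutes instead of waiting for the 20-minute timer.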

3.7 Deploying to Apache Tomcat

You can deploy the in-memory analyst to Apache Tomcat or Oracle WebLogic. This example shows how to deploy In-Memory Analytics as a web application with Apache Tomcat.

The in-memory analyst ships with BASIC Auth enabled, which requires a security realm. Tomcat supports many different types of realms. This example configures the simplest one, MemoryRealm. See the Tomcat Realm Configuration How-to for information about the other types.

  1. Copy the in-memory analyst WAR file into the Tomcat webapps directory. For example:
    cp $PGX_HOME/server/pgx-webapp-<VERSION>.war $CATALINA_HOME/webapps/pgx.war
    
    Where PGX_HOME should be defined as:
    export PGX_HOME=$ORACLE_HOME/md/property_graph/pgx

    Do not copy a file named pgx-webapp-<VERSION>-wls.war, which is specific to WebLogic Server.

  2. Open $CATALINA_HOME/conf/server.xml in an editor and add the following realm class declaration under the <Engine> element:
    <Realm className="org.apache.catalina.realm.MemoryRealm" />
    
  3. Open $CATALINA_HOME/conf/tomcat-users.xml in an editor and define a user for the USER role. Replace scott and <password> in this example with an appropriate user name and password:
    <role rolename="USER" />
    <user username="scott" password="<password>" roles="USER" />
    
  4. Ensure that port 8080 is not already in use.
  5. Start Tomcat:
    cd $CATALINA_HOME
    ./bin/startup.sh
    
  6. Verify that Tomcat is working:
    cd $PGX_HOME
    ./bin/pgx --base_url http://scott:<password>@localhost:8080/pgx
    

Note:

Oracle recommends BASIC Auth only for testing. Use stronger authentication mechanisms for all other types of deployments.

3.7.1 About the Authentication Mechanism

The in-memory analyst web deployment uses BASIC Auth by default. You should change to a more secure authentication mechanism for a production deployment.

To change the authentication mechanism, modify the security-constraint element of the web.xml deployment descriptor in the web application archive (WAR) file.

3.8 Deploying to Oracle WebLogic Server

You can deploy the in-memory analyst to Apache Tomcat or Oracle WebLogic Server. This example shows how to deploy the in-memory analyst as a web application with Oracle WebLogic Server.

3.8.1 Installing Oracle WebLogic Server

To download and install the latest version of Oracle WebLogic Server, see

http://www.oracle.com/technetwork/middleware/weblogic/documentation/index.html

3.8.2 Deploying the In-Memory Analyst

To deploy the in-memory analyst to Oracle WebLogic, use commands like the following. Substitute your administrative credentials and WAR file for the values shown in this example:

. $MW_HOME/user_projects/domains/mydomain/bin/setDomainEnv.sh
. $MW_HOME/wlserver/server/bin/setWLSEnv.sh
java weblogic.Deployer -adminurl http://localhost:7001 -username username -password password -deploy -source $PGX_HOME/server/pgx-webapp-<VERSION>-wls.war

Where PGX_HOME should be defined as:

export PGX_HOME=$ORACLE_HOME/md/property_graph/pgx

If the script runs successfully, you will see a message like this one:

Target state: deploy completed on Server myserver

3.8.3 Verifying That the Server Works

Verify that you can connect to the server by entering a command in the following format:

$PGX_HOME/bin/pgx --base_url http://scott:<password>@localhost:7001/pgx

3.8.4 About the Authentication Mechanism

The in-memory analyst web deployment uses BASIC Auth by default. You should change to a more secure authentication mechanism for a production deployment.

To change the authentication mechanism, modify the security-constraint element of the web.xml deployment descriptor in the web application archive (WAR) file.

3.9 Connecting to the In-Memory Analyst Server

After the property graph in-memory analyst is installed on a computer running Oracle Database, or on a client system without Oracle Database server software as a web application on Apache Tomcat or Oracle WebLogic Server, you can connect to the in-memory analyst server.

3.9.1 Connecting with the In-Memory Analyst Shell

The simplest way to connect to an in-memory analyst instance is to specify the base URL of the server. The following base URL can connect the SCOTT user to the local instance listening on port 8080:

http://scott:<password>@localhost:8080/pgx

To start the in-memory analyst shell with this base URL, use the --base_url command line argument:

cd $PGX_HOME
./bin/pgx --base_url http://scott:<password>@localhost:8080/pgx

You can connect to a remote instance the same way. However, the in-memory analyst currently does not provide remote support for the Control API.

3.9.1.1 About Logging HTTP Requests

The in-memory analyst shell suppresses all debugging messages by default. To see which HTTP requests are executed, set the log level for oracle.pgx to DEBUG, as shown in this example:

pgx> :loglevel oracle.pgx DEBUG
===> log level of oracle.pgx logger set to DEBUG
pgx> session.readGraphWithProperties("sample_http.adj.json", "sample")
10:24:25,056 [main] DEBUG RemoteUtils - Requesting POST http://scott:<password>@localhost:8080/pgx/core/session/session-shell-6nqg5dd/graph HTTP/1.1 with payload {"graphName":"sample","graphConfig":{"uri":"http://path.to.some.server/pgx/sample.adj","separator":" ","edge_props":[{"type":"double","name":"cost"}],"node_props":[{"type":"integer","name":"prop"}],"format":"adj_list"}}
10:24:25,088 [main] DEBUG RemoteUtils - received HTTP status 201
10:24:25,089 [main] DEBUG RemoteUtils - {"futureId":"87d54bed-bdf9-4601-98b7-ef632ce31463"}
10:24:25,091 [pool-1-thread-3] DEBUG PgxRemoteFuture$1 - Requesting GET http://scott:<password>@localhost:8080/pgx/future/session/session-shell-6nqg5dd/result/87d54bed-bdf9-4601-98b7-ef632ce31463 HTTP/1.1
10:24:25,300 [pool-1-thread-3] DEBUG RemoteUtils - received HTTP status 200
10:24:25,301 [pool-1-thread-3] DEBUG RemoteUtils - {"stats":{"loadingTimeMillis":0,"estimatedMemoryMegabytes":0,"numEdges":4,"numNodes":4},"graphName":"sample","nodeProperties":{"prop":"integer"},"edgeProperties":{"cost":"double"}}

This example requires that the graph URI points to a file that the in-memory analyst server can access using HTTP or HDFS.

3.9.2 Connecting with Java

You can specify the base URL when you initialize the in-memory analyst using Java. In the following example, the URL of an in-memory analyst server is passed to the Pgx.getInstance API call.

import oracle.pg.rdbms.*;
import oracle.pgx.api.*;
import oracle.pgx.common.types.PropertyType;
import oracle.pgx.config.*;
 
// Describe the property graph stored in Oracle Database
PgRdbmsGraphConfig cfg = GraphConfigBuilder.forPropertyGraphRdbms()
   .setJdbcUrl("jdbc:oracle:thin:@127.0.0.1:1521:orcl")
   .setUsername("scott").setPassword("<password>").setName("mygraph")
   .setMaxNumConnections(2).setLoadEdgeLabel(false)
   .addVertexProperty("name", PropertyType.STRING, "default_name")
   .addEdgeProperty("weight", PropertyType.DOUBLE, "1000000")
   .build();
OraclePropertyGraph opg = OraclePropertyGraph.getInstance(cfg);
 
// Connect to the remote in-memory analyst server and create a session
ServerInstance remoteInstance = Pgx.getInstance("http://scott:<password>@hostname:port/pgx");
PgxSession session = remoteInstance.createSession("my-session");
 
// Read the graph into memory on the remote server
PgxGraph graph = session.readGraphWithProperties(opg.getConfig());

3.9.3 Connecting with the PGX REST API

You can connect to an in-memory analyst instance using the REST API PGX endpoints. This enables you to interact with the in-memory analyst in a language other than Java to implement your own client.

The examples in this topic assume that:

  • Linux with curl is installed. (curl is a simple command-line utility for interacting with REST endpoints.)
  • The PGX server is up and running on http://localhost:7007.
  • The PGX server has authentication/authorization disabled; that is, $ORACLE_HOME/md/property_graph/pgx/conf/server.conf contains "enable_tls": false. (This is a non-default setting and not recommended for production).
  • PGX allows reading graphs from the local file system; that is, $ORACLE_HOME/md/property_graph/pgx/conf/pgx.conf contains "allow_local_filesystem": true. (This is a non-default setting and not recommended for production).

To see the full list of supported endpoints as a Swagger specification in JSON, open http://localhost:7007/swagger.json in your browser.

Step 1: Obtain a CSRF token

Request a CSRF token:

curl -v http://localhost:7007/token

The response will look like this:

*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 7007 (#0)
> GET /token HTTP/1.1
> Host: localhost:7007
> User-Agent: curl/7.47.0
> Accept: */*
> 
< HTTP/1.1 201
< SET-COOKIE: _csrf_token=9bf51c8f-1c75-455e-9b57-ec3ca1c63cc0;Version=1; HttpOnly
< Content-Length: 0

As the response shows, the server sets a cookie named _csrf_token to a token value. The token 9bf51c8f-1c75-455e-9b57-ec3ca1c63cc0 is used as an example in the following requests. For any write request, the PGX server requires the same token to be present in both the cookie and the payload.
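
A client written in Java has to pull the token out of the SET-COOKIE response header before it can reuse it. The following is a minimal sketch of that step using only the JDK. The class CookieValues and its extract method are hypothetical names introduced for illustration, and the sketch assumes the header has the shape shown in the response above (a leading name=value pair followed by attributes).

```java
import java.util.Optional;

/** Hypothetical helper that reads a cookie value from a Set-Cookie header value. */
class CookieValues {
    /**
     * Returns the value of the cookie called name, if the header defines it.
     * Only the leading name=value pair is inspected; attributes such as
     * Version or HttpOnly are ignored.
     */
    static Optional<String> extract(String setCookieHeader, String name) {
        String firstPair = setCookieHeader.split(";", 2)[0].trim();
        int eq = firstPair.indexOf('=');
        if (eq < 0 || !firstPair.substring(0, eq).trim().equals(name)) {
            return Optional.empty();
        }
        return Optional.of(firstPair.substring(eq + 1).trim());
    }

    public static void main(String[] args) {
        String header = "_csrf_token=9bf51c8f-1c75-455e-9b57-ec3ca1c63cc0;Version=1; HttpOnly";
        // Prints the token, or <missing> if the header sets a different cookie.
        System.out.println(extract(header, "_csrf_token").orElse("<missing>"));
    }
}
```

The same helper can be applied to any other cookie the server sets, by passing that cookie's name.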

Step 2: Create a session

To create a new session, send a JSON payload:

curl -v --cookie '_csrf_token=9bf51c8f-1c75-455e-9b57-ec3ca1c63cc0' -H 'content-type: application/json' -X POST http://localhost:7007/core/v1/sessions -d '{"source":"my-application", "idleTimeout":0, "taskTimeout":0, "timeUnitName":"MILLISECONDS", "_csrf_token":"9bf51c8f-1c75-455e-9b57-ec3ca1c63cc0"}'

Replace my-application with a value describing the application that you are running. This value can be used by server administrators to map sessions to their applications. Setting idle and task timeouts to 0 means the server will determine when the session and submitted tasks time out. You must provide the same CSRF token in both the cookie header and the JSON payload.

The response will look similar to the following:

*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 7007 (#0)
> POST /core/v1/sessions HTTP/1.1
> Host: localhost:7007
> User-Agent: curl/7.47.0
> Accept: */*
> Cookie: _csrf_token=9bf51c8f-1c75-455e-9b57-ec3ca1c63cc0
> content-type: application/json
> Content-Length: 159
> 
* upload completely sent off: 159 out of 159 bytes
< HTTP/1.1 201
< SET-COOKIE: SID=abae2811-6dd2-48b0-93a8-8436e078907d;Version=1; HttpOnly
< Content-Length: 0

The response sets a cookie to the session ID value that was created for us. Session ID abae2811-6dd2-48b0-93a8-8436e078907d is used as an example for subsequent requests.
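
The session request can also be assembled from Java. The sketch below builds (but does not send) the equivalent of the curl command in this step. It assumes Java 11 or later for the java.net.http package, and the class CreateSessionRequest with its methods is a hypothetical name used only for illustration. Note how the CSRF token appears both in the Cookie header and in the JSON payload.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Sketch of the create-session request; class and method names are hypothetical. */
class CreateSessionRequest {
    /**
     * Builds the JSON payload, repeating the CSRF token that is also sent as a cookie.
     * The values are assumed to contain no characters that need JSON escaping.
     */
    static String payload(String source, String csrfToken) {
        return "{\"source\":\"" + source + "\", \"idleTimeout\":0, \"taskTimeout\":0, "
                + "\"timeUnitName\":\"MILLISECONDS\", \"_csrf_token\":\"" + csrfToken + "\"}";
    }

    /** Builds the POST request against <baseUrl>/core/v1/sessions without sending it. */
    static HttpRequest build(String baseUrl, String source, String csrfToken) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/core/v1/sessions"))
                .header("Cookie", "_csrf_token=" + csrfToken)
                .header("content-type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(payload(source, csrfToken)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("http://localhost:7007", "my-application",
                "9bf51c8f-1c75-455e-9b57-ec3ca1c63cc0");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(request.headers().map());
    }
}
```

To actually create the session, the request would be passed to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.discarding()), and the SID cookie read from the SET-COOKIE header of the response.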

Step 3: Read a graph

Note:

If you want to analyze a pre-loaded graph or a graph that is already published by another session, you can skip this step. All you need to access pre-loaded or published graphs is the name of the graph.

To read a graph, send the graph configuration as JSON to the server as shown in the following example (replace <graph-config> with the JSON representation of an actual PGX graph config).

curl -v -X POST --cookie '_csrf_token=9bf51c8f-1c75-455e-9b57-ec3ca1c63cc0;SID=abae2811-6dd2-48b0-93a8-8436e078907d' http://localhost:7007/core/v1/loadGraph -H 'content-type: application/json' -d '{"graphConfig":<graph-config>,"graphName":null,"_csrf_token":"9bf51c8f-1c75-455e-9b57-ec3ca1c63cc0"}'

Here is an example of a graph config that reads a property graph from the Oracle database:

{
  "format": "pg",
  "db_engine": "RDBMS",
  "jdbc_url":"jdbc:oracle:thin:@127.0.0.1:1521:orcl122",
  "username":"scott",
  "password":"tiger",
  "max_num_connections": 8,
  "name": "connections",
  "vertex_props": [
    {"name":"name", "type":"string"},
    {"name":"role", "type":"string"},
    {"name":"occupation", "type":"string"},
    {"name":"country", "type":"string"},
    {"name":"political party", "type":"string"},
    {"name":"religion", "type":"string"}
  ],
  "edge_props": [
    {"name":"weight", "type":"double", "default":"1"}
  ],
  "edge_label": true,
  "loading": {
    "load_edge_label": true
  }
}

Passing "graphName": null tells the server to generate a name.

The server will reply with something like the following:

* upload completely sent off: 315 out of 315 bytes
< HTTP/1.1 202
< Location: http://localhost:7007/core/v1/futures/8a46ef65-01a9-4bd0-87d3-ffe9dfd2ce3c/status
< Content-Type: application/json;charset=utf-8
< Content-Length: 51
< Date: Mon, 05 Nov 2018 17:22:22 GMT
<
* Connection #0 to host localhost left intact
{"futureId":"8a46ef65-01a9-4bd0-87d3-ffe9dfd2ce3c"}

About Asynchronous Requests

Most of the PGX REST endpoints are asynchronous. Instead of keeping the connection open until the result is ready, the PGX server submits a task and immediately returns a future ID with status code 202. The client can then use the future ID to periodically request the status of the task, or to request the result value once the task is done.

From the preceding response, you can request the future status like this:

curl -v --cookie 'SID=abae2811-6dd2-48b0-93a8-8436e078907d'  http://localhost:7007/core/v1/futures/8a46ef65-01a9-4bd0-87d3-ffe9dfd2ce3c/status

This request returns something like:

< HTTP/1.1 200
< Content-Type: application/json;charset=utf-8
< Content-Length: 730
< Date: Mon, 05 Nov 2018 17:35:19 GMT
< 
* Connection #0 to host localhost left intact
{"id":"eb17f75b-e4c1-4a66-81a0-4ff0f8b4cb92","links":[{"href":"http://localhost:7007/core/v1/futures/eb17f75b-e4c1-4a66-81a0-4ff0f8b4cb92/status","rel":"self","method":"GET","interaction":["async-polling"]},{"href":"http://localhost:7007/core/v1/futures/eb17f75b-e4c1-4a66-81a0-4ff0f8b4cb92","rel":"abort","method":"DELETE","interaction":["async-polling"]},{"href":"http://localhost:7007/core/v1/futures/eb17f75b-e4c1-4a66-81a0-4ff0f8b4cb92/status","rel":"canonical","method":"GET","interaction":["async-polling"]},{"href":"http://localhost:7007/core/v1/futures/eb17f75b-e4c1-4a66-81a0-4ff0f8b4cb92/value","rel":"related","method":"GET","interaction":["async-polling"]}],"progress":"succeeded","completed":true,"intervalToPoll":1}

Besides the status (succeeded in this case), this output also includes links to cancel the task (DELETE) and to retrieve the result of the task once completed (GET <future-id>/value):

curl -X GET --cookie 'SID=abae2811-6dd2-48b0-93a8-8436e078907d' http://localhost:7007/core/v1/futures/8a46ef65-01a9-4bd0-87d3-ffe9dfd2ce3c/value

This request returns details about the loaded graph, including the name that was generated by the server (sample):

{"id":"sample","links":[{"href":"http://localhost:7007/core/v1/graphs/sample","rel":"self","method":"GET","interaction":["async-polling"]},{"href":"http://localhost:7007/core/v1/graphs/sample","rel":"canonical","method":"GET","interaction":["async-polling"]}],"nodeProperties":{"prop1":{"id":"prop1","links":[{"href":"http://localhost:7007/core/v1/graphs/sample/properties/prop1","rel":"self","method":"GET","interaction":["async-polling"]},{"href":"http://localhost:7007/core/v1/graphs/sample/properties/prop1","rel":"canonical","method":"GET","interaction":["async-polling"]}],"dimension":0,"name":"prop1","entityType":"vertex","type":"integer","transient":false}},"vertexLabels":null,"edgeLabel":null,"metaData":{"id":null,"links":null,"numVertices":4,"numEdges":4,"memoryMb":0,"dataSourceVersion":"1536029578000","config":{"format":"adj_list","separator":" ","edge_props":[{"type":"double","name":"cost"}],"error_handling":{},"vertex_props":[{"type":"integer","name":"prop1"}],"vertex_uris":["PATH_TO_FILE"],"vertex_id_type":"integer","loading":{}},"creationRequestTimestamp":1541242100335,"creationTimestamp":1541242100774,"vertexIdType":"integer","edgeIdType":"long","directed":true},"graphName":"sample","edgeProperties":{"cost":{"id":"cost","links":[{"href":"http://localhost:7007/core/v1/graphs/sample/properties/cost","rel":"self","method":"GET","interaction":["async-polling"]},{"href":"http://localhost:7007/core/v1/graphs/sample/properties/cost","rel":"canonical","method":"GET","interaction":["async-polling"]}],"dimension":0,"name":"cost","entityType":"edge","type":"double","transient":false}},"ageMs":0,"transient":false}

For simplicity, the remaining steps omit the additional requests to request the status or value of asynchronous tasks.
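
The polling pattern described in this section can be sketched as a small client-side loop. The following Java example is an illustration under stated assumptions, not part of the PGX client: FuturePoller and its methods are hypothetical, the status document is supplied by a caller-provided function (which in a real client would issue GET /core/v1/futures/<future-id>/status), and completion is detected with a simple text match on the "completed" field instead of a JSON parser. The "running" progress value in the demo is made up for the stand-in server.

```java
import java.util.function.Supplier;

/** Sketch of client-side polling for an asynchronous task; names are hypothetical. */
class FuturePoller {
    /**
     * Calls statusFetcher until the returned status JSON contains "completed":true
     * or maxAttempts is exhausted. Returns the last status document seen, or null
     * if maxAttempts is less than 1.
     */
    static String pollUntilCompleted(Supplier<String> statusFetcher, int maxAttempts,
                                     long pauseMillis) {
        String status = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            status = statusFetcher.get();
            if (isCompleted(status)) {
                return status;
            }
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop polling
                return status;
            }
        }
        return status;
    }

    /** Naive check that avoids a JSON library; tolerates whitespace around the colon. */
    static boolean isCompleted(String statusJson) {
        return statusJson != null
                && statusJson.matches("(?s).*\"completed\"\\s*:\\s*true.*");
    }

    public static void main(String[] args) {
        // Stand-in for the status endpoint: reports completion on the third call.
        int[] calls = {0};
        Supplier<String> fakeServer = () -> ++calls[0] < 3
                ? "{\"progress\":\"running\",\"completed\":false}"
                : "{\"progress\":\"succeeded\",\"completed\":true,\"intervalToPoll\":1}";
        System.out.println(pollUntilCompleted(fakeServer, 10, 10));
        System.out.println("status requests: " + calls[0]);
    }
}
```

A production client would more likely parse the status document with a JSON library and follow the links it contains, rather than matching text.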

Step 4: Create a property

Before you can run the PageRank algorithm on the loaded graph, you must create a vertex property of type DOUBLE on the graph, which can hold the computed ranking values:

curl -v -X POST --cookie '_csrf_token=9bf51c8f-1c75-455e-9b57-ec3ca1c63cc0;SID=abae2811-6dd2-48b0-93a8-8436e078907d'  http://localhost:7007/core/v1/graphs/sample/properties -H 'content-type: application/json'  -d '{"entityType":"vertex","type":"double","name":"pagerank", "hardName":false,"dimension":0,"_csrf_token":"9bf51c8f-1c75-455e-9b57-ec3ca1c63cc0"}'

Requesting the result of the returned future will return something like:

{"id":"pagerank","links":[{"href":"http://localhost:7007/core/v1/graphs/sample/properties/pagerank","rel":"self","method":"GET","interaction":["async-polling"]},{"href":"http://localhost:7007/core/v1/graphs/sample/properties/pagerank","rel":"canonical","method":"GET","interaction":["async-polling"]}],"dimension":0,"name":"pagerank","entityType":"vertex","type":"double","transient":true}

Step 5: Run the PageRank algorithm on the loaded graph

The following example shows how to run an algorithm (PageRank in this case). The algorithm ID is part of the URL, and the parameters to be passed into the algorithm are part of the JSON payload:

curl -v -X POST --cookie '_csrf_token=9bf51c8f-1c75-455e-9b57-ec3ca1c63cc0;SID=abae2811-6dd2-48b0-93a8-8436e078907d' http://localhost:7007/core/v1/analyses/pgx_builtin_k1a_pagerank/run -H  'content-type: application/json' -d  '{"args":[{"type":"GRAPH","value":"sample"},{"type":"DOUBLE_IN","value":0.001},{"type":"DOUBLE_IN","value":0.85},{"type":"INT_IN","value":100},{"type":"BOOL_IN","value":true},{"type":"NODE_PROPERTY","value":"pagerank"}],"expectedReturnType":"void","workloadCharacteristics":["PARALLELISM.PARALLEL"],"_csrf_token":"9bf51c8f-1c75-455e-9b57-ec3ca1c63cc0"}'

Once the future is completed, the result will look something like this:

{"success":true,"canceled":false,"exception":null,"returnValue":null,"executionTimeMs":50}

Step 6: Execute a PGQL query

To query the results of the PageRank algorithm, you can run a PGQL query as shown in the following example:

curl -v -X POST --cookie '_csrf_token=9bf51c8f-1c75-455e-9b57-ec3ca1c63cc0;SID=abae2811-6dd2-48b0-93a8-8436e078907d' http://localhost:7007/core/v1/pgql/run -H 'content-type: application/json'  -d '{"pgqlQuery":"SELECT x.pagerank MATCH (x) WHERE x.pagerank > 0","semantic":"HOMOMORPHISM", "schemaStrictnessMode":true, "graphName" : "sample", "_csrf_token":"9bf51c8f-1c75-455e-9b57-ec3ca1c63cc0"}'

The result is a set of links you can use to interact with the result set of the query:

{"id":"pgql_1","links":[{"href":"http://localhost:7007/core/v1/pgqlProxies/pgql_1","rel":"self","method":"GET","interaction":["sync"]},{"href":"http://localhost:7007/core/v1/pgqlResultProxies/pgql_1/elements","rel":"related","method":"GET","interaction":["sync"]},{"href":"http://localhost:7007/core/v1/pgqlResultProxies/pgql_1/results","rel":"related","method":"GET","interaction":["sync"]},{"href":"http://localhost:7007/core/v1/pgqlProxies/pgql_1","rel":"canonical","method":"GET","interaction":["async-polling"]}],"exists":true,"graphName":"sample","resultSetId":"pgql_1","numResults":4}

To request the first 2048 elements of the result set, send:

curl -X GET --cookie 'SID=abae2811-6dd2-48b0-93a8-8436e078907d' 'http://localhost:7007/core/v1/pgqlProxies/pgql_1/results?size=2048'

The response looks something like this:

{"id":"/pgx/core/v1/pgqlProxies/pgql_1/results","links":[{"href":"http://localhost:7007/core/v1/pgqlProxies/pgql_1/results","rel":"self","method":"GET","interaction":["sync"]},{"href":"http://localhost:7007/core/v1/pgqlProxies/pgql_1/results","rel":"canonical","method":"GET","interaction":["async-polling"]}],"count":4,"totalItems":4,"items":[[0.3081206521195582],[0.21367103988538017],[0.21367103988538017],[0.2645372681096815]],"hasMore":false,"offset":0,"limit":4,"showTotalResults":true}
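
A client typically needs the values inside the "items" array rather than the whole response document. The following Java sketch extracts them for the single-column case shown above. It is a deliberately simple illustration that uses a regular expression from the JDK instead of a JSON library, and it assumes each row holds exactly one numeric value; the class PgqlResultValues is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch that reads a single numeric column from a PGQL result page; hypothetical helper. */
class PgqlResultValues {
    // Matches a row holding exactly one numeric value, such as [0.3081206521195582].
    private static final Pattern SINGLE_VALUE_ROW =
            Pattern.compile("\\[\\s*(-?\\d+(?:\\.\\d+)?(?:[eE][-+]?\\d+)?)\\s*\\]");

    /** Returns the values found after the "items" key, in response order. */
    static List<Double> singleColumn(String responseJson) {
        List<Double> values = new ArrayList<>();
        int start = responseJson.indexOf("\"items\"");
        if (start < 0) {
            return values; // no result rows in this document
        }
        Matcher rows = SINGLE_VALUE_ROW.matcher(responseJson);
        rows.region(start, responseJson.length()); // skip the links section before "items"
        while (rows.find()) {
            values.add(Double.parseDouble(rows.group(1)));
        }
        return values;
    }

    public static void main(String[] args) {
        String response = "{\"count\":4,\"totalItems\":4,\"items\":[[0.3081206521195582],"
                + "[0.21367103988538017],[0.21367103988538017],[0.2645372681096815]],"
                + "\"hasMore\":false,\"offset\":0,\"limit\":4}";
        for (double rank : singleColumn(response)) {
            System.out.println(rank);
        }
    }
}
```

For queries that select several columns or non-numeric values, a JSON parser is the more robust choice.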

3.10 Managing Property Graph Snapshots

Oracle Spatial and Graph Property Graph lets you manage property graph snapshots.

You can persist different versions of a property graph as binary snapshots in the database. The binary snapshots represent a subgraph of graph data computed at runtime that may be needed for future use. The snapshots can be read back later as input for the in-memory analyst, or as an output stream that can be used by the parallel property graph data loader.

Note:

Managing property graph snapshots is intended for advanced users.

You can store binary snapshots in the <graph_name>SS$ table of the property graph using the Java API OraclePropertyGraphUtils.storeBinaryInMemGraphSnapshot. This operation requires a connection to the Oracle database holding the property graph instance, the name of the graph and its owner, the ID of the snapshot, and an input stream from which the binary snapshot can be read. You can also specify the time stamp of the snapshot and the degree of parallelism to be used when storing the snapshot in the table.

You can read a stored binary snapshot using OraclePropertyGraphUtils.readBinaryInMemGraphSnapshot. This operation requires a connection to the Oracle database holding the property graph instance, the name of the graph and its owner, the ID of the snapshot to read, and an output stream to which the binary snapshot will be written. You can also specify the degree of parallelism to be used when reading the binary snapshot from the table.

The following code snippet creates a property graph from the data file in Oracle Flat-file format, adds a new vertex, and exports the graph into an output stream using GraphML format. This output stream represents a binary file snapshot, and it is stored in the property graph snapshot table. Finally, this example reads back the file from the snapshot table and creates a second graph from its contents.

import java.io.*;
import com.tinkerpop.blueprints.Vertex;
import oracle.pg.rdbms.*;

// args, szGraphName, szGraphOwner, and conn (the database connection)
// are assumed to be defined by the surrounding application.
int dop = 2; // degree of parallelism

String szOPVFile = "../../data/connections.opv";
String szOPEFile = "../../data/connections.ope";
OraclePropertyGraph opg = OraclePropertyGraph.getInstance(args, szGraphName);
OraclePropertyGraphDataLoader opgdl = OraclePropertyGraphDataLoader.getInstance();
opgdl.loadData(opg, szOPVFile, szOPEFile, dop, 1000, true,
               "PDML=T,PDDL=T,NO_DUP=T,");

// Add a new vertex
Vertex v = opg.addVertex(Long.valueOf("1000"));
v.setProperty("name", "Alice");
opg.commit();

System.out.println("Graph " + szGraphName + " total vertices: " +
                   opg.countVertices(dop));
System.out.println("Graph " + szGraphName + " total edges: " +
                   opg.countEdges(dop));


// Get a snapshot of the current graph as a file in GraphML format.
// ByteArrayOutputStream is declared explicitly so that toByteArray() is available.
ByteArrayOutputStream os = new ByteArrayOutputStream();
OraclePropertyGraphUtils.exportGraphML(opg,
                                       os /* output stream */,
                                       System.out /* stream to show progress */);

// Save the snapshot into the SS$ table
InputStream is = new ByteArrayInputStream(os.toByteArray());
OraclePropertyGraphUtils.storeBinaryInMemGraphSnapshot(szGraphName,
                                                     szGraphOwner /* owner of the
                                                                   property graph */,
                                                     conn /* database connection */,
                                                     is,
                                                     (long) 1 /* snapshot ID */,
                                                     1 /* dop */);
os.close();
is.close();

// Read the snapshot back from the SS$ table
ByteArrayOutputStream snapshotOS = new ByteArrayOutputStream();
OraclePropertyGraphUtils.readBinaryInMemGraphSnapshot(szGraphName,
                                                    szGraphOwner /* owner of the
                                                                   property graph */,
                                                    conn /* database connection */,
                                                    new OutputStream[] {snapshotOS},
                                                    (long) 1 /* snapshot ID */,
                                                    1 /* dop */);

// Create a second graph from the contents of the snapshot
InputStream snapshotIS = new ByteArrayInputStream(snapshotOS.toByteArray());
String szGraphNameSnapshot = szGraphName + "_snap";
OraclePropertyGraph opgSnapshot = OraclePropertyGraph.getInstance(args, szGraphNameSnapshot);

OraclePropertyGraphUtils.importGraphML(opgSnapshot,
                                       snapshotIS /* input stream */,
                                       System.out /* stream to show progress */);

snapshotOS.close();
snapshotIS.close();


System.out.println("Graph " + szGraphNameSnapshot + " total vertices: " +
                   opgSnapshot.countVertices(dop));
System.out.println("Graph " + szGraphNameSnapshot + " total edges: " +
                   opgSnapshot.countEdges(dop));

The preceding example will produce output similar to the following:

Graph test total vertices: 79
Graph test total edges: 164
Graph test_snap total vertices: 79
Graph test_snap total edges: 164