Oracle® Database Semantic Technologies Developer's Guide
11g Release 2 (11.2)

Part Number E25609-03

5 Fine-Grained Access Control for RDF Data

The default control of access to the Oracle Database semantic data store is at the model level: the owner of a model may grant select, delete, and insert privileges on the model to other users by granting appropriate privileges on the view named RDFM_<model_name>. However, for applications with stringent security requirements, you can enforce a fine-grained access control mechanism by using either the Virtual Private Database (VPD) feature or the Oracle Label Security (OLS) option of Oracle Database.

Some factors to consider in choosing whether to use VPD or OLS with RDF data include the following:

The application programming interface (API) for implementing VPD or OLS with semantic data is provided in the SEM_RDFSA PL/SQL package. Chapter 13 provides reference information about the programs in the SEM_RDFSA package.

VPD and OLS support for RDF data is included in the semantic technologies support for Release 11.2. (For information about enabling, downgrading, or removing semantic technologies support, see Appendix A.)

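This chapter contains the following major sections:

  • Section 5.1, "Virtual Private Database (VPD) for RDF Data"

  • Section 5.2, "Oracle Label Security (OLS) for RDF Data"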

5.1 Virtual Private Database (VPD) for RDF Data

The Virtual Private Database (VPD) feature is a row-level security mechanism that restricts access to specific rows in a relational table or view using a security policy and an application context. The security policy includes a policy function that dynamically generates predicates that are enforced for each row returned for the user query. The security predicates returned by the policy function associated with a table are typically expressed using the columns in the table and are thus dependent on the table metadata. Effectively, the security predicates ensure that the rows returned for a user query satisfy additional conditions that are applied on the contents of the row.

When the relational data is mapped to RDF, the data stored in a specific relational table represent triples describing instances of a specific RDF class. In this representation, the columns in the relational table map to RDF properties that are used to describe a resource. This mapping may be further extended to the application of VPD policies.

A VPD policy applied to RDF data restricts users' access to instances of a specific RDF class or property by applying predicates, in the form of graph patterns and filter conditions, on the instance data. For example, a VPD policy may be defined to restrict access to instances of a Contract RDF class only to the users belonging to a specific department. Furthermore, access to the hasContractValue property for a resource identified as instance of the Contract RDF class may be restricted only to the manager of the contract. VPD support for RDF data allows security conditions or data access constraints to be associated with RDF classes and properties, so that access to corresponding instance data is restricted.

A data access constraint associated with an RDF class or property specifies a graph query pattern that must be enforced for all corresponding data instances that are returned as the query result. For example, a SPARQL query pattern to find the due dates for all instances of a Contract class, {?contract :hasDueDate ?due}, may activate a data access constraint that ensures that the information returned pertains to contracts belonging to a specific department. This is achieved by logically rewriting the user's graph query pattern to include additional graph patterns, as shown in the following example:

{ ?contract   :hasDueDate  ?due . 
  ?contract   :drivenBy    dept:Dept1 }

Furthermore, the values bound into the rewritten graph query pattern may make use of session context to enforce dynamic access restrictions. In the following example, the sys_context function in the object position of the triple pattern binds the appropriate department value based on the session context:

{ ?contract   :hasDueDate   ?due . 
  ?contract   :drivenBy
             "sys_context('sa$appctx','user_dept')"^^orardf:instruction }

In a relational data model, the metadata, in the form of a table definition, always exists along with the data (the rows stored in the table); thus, the VPD policies defined using the metadata are well formed, and the security conditions are generated using procedural logic in the policy function.

However, the RDF data model allows data with no accompanying metadata, and therefore the class information for instance data may not always be available for a given RDF graph. For example, in an RDF graph, a resource known to be a contract might not be accompanied by a triple asserting that the resource is an instance of the Contract class. Usually such triples can be inferred using available domain and range specifications for the properties describing the resource.

Similarly, a VPD policy relies on the properties' domain and range specifications for deriving the class information for the instance data and for enforcing appropriate data access constraints. However, to avoid runtime dependencies on the user data, a VPD policy maintains the minimal metadata required to derive the class information in its dictionary, separate from the asserted and inferred triples. This also ensures that the metadata maintained by a VPD policy is complete even when some necessary information is missing from the asserted triples and that a VPD policy, with its data access constraints and the metadata, is self-contained and portable with no external dependencies.

A VPD policy with specific data access constraints and RDF metadata specifications can be used to enforce access restrictions for the data stored in an RDF model. Each SPARQL query issued on the model is analyzed to deduce the class information for the resources accessed in the query, and appropriate data access constraints are applied. To facilitate the compile-time analysis and derivation of class information for instance data, a graph query pattern with an unbound predicate is restricted when a VPD policy is in effect. For example, a graph pattern of the following form, anywhere in a SPARQL query pattern, raises an exception when any underlying model has a VPD policy:

{ <contract:projectHLS>  ?pred  ?obj }

VPD policies are only enforced for SEM_MATCH queries expressed in SPARQL syntax. All other forms of data access (such as classic syntax for graph pattern or direct query on the model view) are not permitted.
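As a minimal sketch of the permitted access path, the following query uses SEM_MATCH with a graph pattern in SPARQL syntax (enclosed in curly braces) and a bound predicate. The model name contracts and the mapping of the pred alias are assumptions made for illustration; they are not defined by this chapter.

```sql
-- Assumed names: model CONTRACTS; alias pred => http://www.myorg.com/pred/
SELECT contract, due
  FROM TABLE(SEM_MATCH(
         '{ ?contract  pred:hasDueDate  ?due }',  -- SPARQL syntax, bound predicate
         SEM_Models('contracts'),
         NULL,                                     -- no rulebases
         SEM_ALIASES(SEM_ALIAS('pred', 'http://www.myorg.com/pred/')),
         NULL));                                   -- no additional filter
```

Because the predicate is bound, the query analysis can use the policy metadata for hasDueDate to decide which data access constraints apply.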

5.1.1 VPD Policy for RDF Data

A VPD policy for RDF data is a named dictionary entity that can be used to enforce access restrictions for the data stored in one or more RDF models. A VPD policy defined for RDF data has unique characteristics, and it cannot be reused to enforce security policies for relational data. An RDF-VPD policy defined in the database includes the following:

  • The RDF Schema statements or metadata necessary for deriving class information for the data referenced in a SPARQL user query

  • The data access constraints that enforce access restrictions for the instance data

  • Application context that allows conditional inclusions of groups of data access constraints based on the runtime environment

An RDF-VPD policy is defined, owned, and maintained by a user with a security administrator role in an organization. This user must have at least EXECUTE privileges on the SYS.DBMS_RLS package. The owner of an RDF-VPD policy can maintain the metadata associated with the policy, define new data access constraints, and apply the policy to one or more RDF models.

A SPARQL query issued on an RDF model with a VPD policy is analyzed, and zero or more data access constraints defined in the policy are enforced such that the data instances that are returned as the query result also satisfy these constraints. The exact data access constraints enforced for a user query vary, based on the resources referenced in the query and the application context. For example, a policy that restricts a manager's access to the hasContractValue property may be relaxed for a user with the Vice President role.

Based on the role of the user, as captured in the application context, specific constraints to be applied are determined at runtime. To facilitate this dynamic inclusion of subsets of constraints defined in a VPD policy, the data access constraints are arranged into named groups that can be activated and deactivated based on the application context. During query analysis, only the constraints defined in the active groups are considered for enforcement.

The constraint groups within a VPD policy are managed using an application context and its package implementation. Each VPD policy can specify the namespace for a context created with the CREATE CONTEXT command. Each attribute associated with the context is treated as the name of a constraint group that can be activated by initializing its value to 1. For example, setting the value of the MANAGER attribute of the context associated with a VPD policy to 1 activates the constraints associated with the MANAGER group for the user session. The logic that initializes specific constraint groups based on the user context is typically embedded in the package associated with the context type. The following example shows an excerpt from a sample implementation for one such package:

CREATE CONTEXT contracts_constr_ctx using sec_admin.contracts_ctx_pack;
 
begin
  -- create the VPD policy with a context -- 
  sem_rdfsa.create_vpd_policy(policy_name    => 'CONTRACTS_POLICY',
                              policy_context => 'contracts_constr_ctx');
end;
/
 
create or replace package sec_admin.contracts_ctx_pack as
  procedure init_constr_groups;
end;
/
 
create or replace package body sec_admin.contracts_ctx_pack as
  -- procedure name matches the declaration in the package specification
  procedure init_constr_groups is 
    hrdata EmpRole%rowtype; 
  begin
    -- specific users with FULL access to the data associated with 
    -- the policy -- 
    if (sys_context('userenv', 'session_user') = 'RDF_ADMIN') then 
      dbms_session.set_context('contracts_constr_ctx',
                                sem_rdfsa.VPD_FULL_ACCESS, '1'); 
      return;
    end if; 
 
    SELECT * into hrdata FROM EmpRole WHERE guid = 
                          sys_context('userenv','session_user');
 
    if (hrdata.emprole = 'VP') then 
      -- if the user logged in has VP role, activate the constraint
      -- group named VP and keep all other groups inactive. 
      dbms_session.set_context('contracts_constr_ctx','VP', '1'); 
    elsif (hrdata.emprole = 'MANAGER') then 
      dbms_session.set_context('contracts_constr_ctx', 'MANAGER', '1'); 
    elsif ...
      ...  
    else 
      raise_application_error(-20010, 'unknown user role'); 
    end if;
 
  exception when others then 
    -- on any error, clear the context so that no group is active and
    -- the constraints for all groups are enforced --
    dbms_session.clear_all_context('contracts_constr_ctx');
  end init_constr_groups; 
end;
/

By default, when a namespace is not associated with an RDF-VPD policy or when a specific constraint group is not activated in a session, all the constraints defined in the policy are active and they are enforced for each user query. However, when a specific constraint group is activated by setting the corresponding namespace-attribute value to 1, only the constraints belonging to the group and any other constraints that are not associated with any group are enforced. For a given session, one or more constraint groups may be activated, in which case all the applicable constraints are enforced conjunctively.
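The following sketch shows one way a session might activate its constraint groups and then check which groups are active. It assumes that the session user has EXECUTE privilege on the sec_admin.contracts_ctx_pack package from the preceding example and that the procedure is named init_constr_groups, as declared in the package specification; where the call is placed (for example, in the application's connection setup) is left to the application.

```sql
-- Initialize the constraint groups for the current session.
begin
  sec_admin.contracts_ctx_pack.init_constr_groups;
end;
/

-- List the context attributes (constraint groups) set in this session.
-- SESSION_CONTEXT shows only attributes set in the current session.
select attribute, value
  from session_context
 where namespace = 'CONTRACTS_CONSTR_CTX';
```

If the query returns no rows, no constraint group has been activated, and all constraints defined in the policy are enforced for the session.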

At the time of creation, the data access constraints defined in an RDF-VPD policy may specify the name of a constraint group (explained in Section 5.1.3, "Data Access Constraints"). Within a database session, appropriate groups of constraints are activated based on the session context set by the context package. For all subsequent SPARQL queries in the database session, the constraints belonging to the active groups are consulted for enforcing appropriate security policies.

Maintenance operations on an RDF model with a VPD policy require unconditional access to data in the model. These operations include creation of an entailment using at least one VPD-protected model, and load or data manipulation operations. You can grant unconditional access to the data stored in an RDF model by initializing a reserved attribute for the namespace associated with the VPD policy. The reserved attribute is defined by the package constant sem_rdfsa.VPD_FULL_ACCESS, and the context package implementation shown in the preceding example grants FULL access to the RDF_ADMIN user.

DML operations on the application table are not validated for VPD constraint violations, so only a user with FULL access to the corresponding model can add or modify existing triples.

You can use the SEM_MATCH operator to query an RDF model with a VPD policy in a standard SQL query, and to perform a multi-model query on a combination of VPD-enabled models and models with no VPD policy. However, when more than one model in a multi-model query is VPD-enabled, they must all be associated with the same VPD policy. A VPD policy associated with an RDF model is automatically extended to any data inferred from the model. When multiple RDF models are specified during inference, all VPD-enabled models within the set should use the same VPD policy.
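As a sketch of how a policy might be associated with a model and then queried together with an unprotected model, consider the following. The model names contracts and projects and the pred alias mapping are hypothetical; the SEM_RDFSA.APPLY_VPD_POLICY call is shown with the policy_name and model_name parameters, which should be checked against the reference information in Chapter 13.

```sql
-- Associate the policy with the (assumed) CONTRACTS model.
begin
  sem_rdfsa.apply_vpd_policy(
      policy_name => 'contracts_policy',
      model_name  => 'contracts');
end;
/

-- Multi-model query: CONTRACTS is VPD-enabled, PROJECTS (assumed) has no policy.
SELECT contr, due
  FROM TABLE(SEM_MATCH(
         '{ ?contr  pred:hasDueDate  ?due }',
         SEM_Models('contracts', 'projects'),
         NULL,
         SEM_ALIASES(SEM_ALIAS('pred', 'http://www.myorg.com/pred/')),
         NULL));
```

If a second VPD-enabled model were added to the SEM_Models list, it would have to be associated with the same contracts_policy policy.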

5.1.2 RDF Metadata for Enforcing VPD Policies

The types of RDF metadata used to enforce VPD policies include the following:

  • Domain and range information for the properties used in the graph

  • Subclass relationships in the graph

  • Subproperty relationships in the graph

  • Equivalent properties in the graph

The RDF metadata associated with a VPD policy is specified as one or more RDF Schema statements using one of the following property URIs:

  • http://www.w3.org/2000/01/rdf-schema#domain

  • http://www.w3.org/2000/01/rdf-schema#range

  • http://www.w3.org/2000/01/rdf-schema#subClassOf

  • http://www.w3.org/2000/01/rdf-schema#subPropertyOf

  • http://www.w3.org/2002/07/owl#equivalentProperty

For example, the following RDF Schema statement associated with contracts_policy asserts that the domain of the hasContractValue property is the Contract class. Note that the range specification for a predicate can be omitted if it is not relevant or if the range is a literal type.

begin
  sem_rdfsa.maint_vpd_metadata(
        policy_name => 'contracts_policy',
        t_subject   => '<http://www.myorg.com/pred/hasContractValue>',
        t_predicate => '<http://www.w3.org/2000/01/rdf-schema#domain>',
        t_object    => '<http://www.myorg.com/classes/Contract>');
end;
/

An RDF-VPD policy maintains its metadata separate from the asserted and inferred triples. You can derive this metadata programmatically from the RDF models and the corresponding entailments. For example, if the domain and range information for the properties and subclass and subproperty relationships are already established in the asserted or inferred triples, you can use a SQL query on the underlying model views to populate the metadata for an RDF-VPD policy.
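One possible way to derive the metadata programmatically is sketched below. Instead of a SQL query directly on the model view, this sketch uses a SEM_MATCH query to find rdfs:domain assertions and passes each one to SEM_RDFSA.MAINT_VPD_METADATA. The model name contracts is an assumption, and the sketch presumes that the caller is able to query the model (for example, before the policy is applied, or in a session with FULL access).

```sql
begin
  for r in (
    select p, c
      from table(sem_match(
             '{ ?p  rdfs:domain  ?c }',
             sem_models('contracts'),      -- assumed model name
             null, null, null))
     where p$rdfvtyp = 'URI'               -- skip blank nodes and literals
       and c$rdfvtyp = 'URI'
  ) loop
    sem_rdfsa.maint_vpd_metadata(
        policy_name => 'contracts_policy',
        t_subject   => '<' || r.p || '>',
        t_predicate => '<http://www.w3.org/2000/01/rdf-schema#domain>',
        t_object    => '<' || r.c || '>');
  end loop;
end;
/
```

The same loop can be repeated for the range, subClassOf, subPropertyOf, and equivalentProperty property URIs listed earlier in this section.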

The domain and range information for the properties aids the query analysis in determining the RDF class type for the terms and unbound variables referenced in the query. This information is further used to enforce appropriate data access constraints on the data accessed by the query. The metadata relating to the subclass property is used to ensure that a data access constraint defined for a specific class in a class hierarchy is automatically enforced for all its subclasses. Similarly, the subproperty specification in a VPD policy is used to enforce any constraints associated with a property for all its subproperties.

The RDF Schema statements associated with a VPD policy are not used to infer additional statements, and the security administrator should ensure that the metadata captured in a VPD policy is complete by cross checking it with inferred data. For example, a subproperty schema statement does not automatically infer the domain and range information for the property based on the domain and range specified for the super-property.
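To illustrate the point about subproperties, the following sketch registers a subproperty and then states its domain explicitly, because the domain is not derived from the super-property. The hasEstimatedValue property is hypothetical and is used only for illustration.

```sql
begin
  -- hasEstimatedValue (hypothetical) is a subproperty of hasContractValue.
  sem_rdfsa.maint_vpd_metadata(
      policy_name => 'contracts_policy',
      t_subject   => '<http://www.myorg.com/pred/hasEstimatedValue>',
      t_predicate => '<http://www.w3.org/2000/01/rdf-schema#subPropertyOf>',
      t_object    => '<http://www.myorg.com/pred/hasContractValue>');

  -- The domain is not inferred from the super-property, so assert it as well.
  sem_rdfsa.maint_vpd_metadata(
      policy_name => 'contracts_policy',
      t_subject   => '<http://www.myorg.com/pred/hasEstimatedValue>',
      t_predicate => '<http://www.w3.org/2000/01/rdf-schema#domain>',
      t_object    => '<http://www.myorg.com/classes/Contract>');
end;
/
```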

Certain owl and rdfs properties in the asserted triples, when left unchecked, may be used to infer data that can circumvent the VPD policies. For example, when a new property is defined as a super-property of a property that has a specific data access constraint, the inferred data may duplicate all instances of the subproperty using the super-property. Unless the VPD policy explicitly defines access constraints for the super-property, the inferred data may be used to circumvent the access restrictions.

The ability to infer new data is only granted to users with FULL access, and such users should ensure that the metadata associated with the VPD policy is complete in light of newly inferred data. Specifically, the metadata associated with the VPD policy should be maintained if some new rdfs:subClassOf, rdfs:superClassOf, rdfs:subPropertyOf, rdfs:superPropertyOf, or owl:equivalentProperty assertions are generated during inference. Also, any new properties introduced by the rulebases used for inference may need domain and range specifications, as well as data access constraints, if they are associated with some sensitive information.

In a VPD policy, a property can be declared to be equivalent to another property so that the domain and range information, as well as any constraints defined for the original property, are automatically duplicated for the equivalent property. However, within a VPD policy, additional metadata or data access constraints cannot be directly assigned to the property declared to be an equivalent of another property.

5.1.3 Data Access Constraints

The data access constraints associated with a VPD policy fall into two general categories, based on the types of access restrictions that they enforce:

  • Those that restrict access to instances of specific RDF classes

  • Those that restrict access to assertions using specific RDF properties

The access restrictions are enforced conditionally, based on the application context and the characteristics of the resources being accessed in a SPARQL query. Data access constraints restrict access to instances of an RDF class or property using some properties associated with the resource. For example, access to a resource that is a member of the Contract class may be restricted only to the users who work on the contract, identified using the hasMember property associated with the resource. Similarly, access to the hasContractValue property for a resource may be restricted to a user identified as the manager of the contract using the hasManager property associated with the same resource.

Each data access constraint is expressed using two graph patterns identified as a match pattern and an apply pattern. The match pattern of a constraint determines the type of access restriction it enforces and binds one or more variables to the corresponding data instances accessed in the user query. For example, the following match pattern is defined for instances of the Contract class, and it binds a variable to all such instances accessed through a SPARQL query:

{ ?contract  rdf:type  <http://www.myorg.com/classes/Contract> }

Similarly, a match pattern for a constraint involving an RDF property matches the instances of the property accessed in a SPARQL query, and binds two variables to the resources in the subject and object position of such instances. For example, the match pattern for a constraint on the hasContractValue property is defined as follows:

{ ?contract  <http://www.myorg.com/pred/hasContractValue>  ?cvalue }

The apply pattern of a data access constraint defines additional graph patterns to be applied on the resources that match the match pattern before they can be used to construct the query results. One or more variables defined in the match pattern of a data access constraint are used in the corresponding apply pattern to enforce the access restrictions on the identified resources. For example, the following match pattern and apply pattern combination ensures that the hasContractValue of a contract can be accessed only if Andy is the manager of the contract being accessed:

Match:  { ?contract  pred:hasContractValue  ?cvalue  }
Apply:  { ?contract  pred:hasManager        emp:Andy }

A data access constraint with its match and apply patterns expressed in SPARQL syntax can be added to a VPD policy to enforce access restrictions on the data stored in RDF models that are associated with the VPD policy. The following example, which adds a constraint to the VPD policy, assumes that the VPD policy is defined with an appropriate namespace map for the pred and emp namespace prefixes. (To associate a namespace map with a VPD policy, use the SEM_RDFSA.CREATE_VPD_POLICY procedure.)

begin
  sem_rdfsa.add_vpd_constraint(
          policy_name   => 'contracts_policy',
          constr_name   => 'andy_constraint_1',
          match_pattern => '{?contract  pred:hasContractValue ?cvalue }',
          apply_pattern => '{?contract  pred:hasManager       emp:Andy }', 
          constr_group  => 'andy');
end;
/
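For the pred and emp prefixes used in the preceding constraint to resolve, the policy needs a namespace map. A sketch of creating the policy with such a map follows; the namespace_map parameter name and the MDSYS.RDF_ALIASES constructor usage are assumptions based on the NAMESPACE_MAP column type described in Section 5.1.4 and should be verified against Chapter 13. The namespace URIs follow the examples in this chapter.

```sql
begin
  sem_rdfsa.create_vpd_policy(
      policy_name    => 'contracts_policy',
      namespace_map  => mdsys.rdf_aliases(
                          mdsys.rdf_alias('pred', 'http://www.myorg.com/pred/'),
                          mdsys.rdf_alias('emp',  'http://www.myorg.com/employee/')),
      policy_context => 'contracts_constr_ctx');
end;
/
```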

The ability to arrange data access constraints into groups could ensure that the previous constraint is applied only for the sessions associated with Andy. However, to avoid proliferation of structurally similar constraints for each user, you can define a common constraint that uses the application context in the object position of the apply graph patterns, as shown in the following example:

begin
  sem_rdfsa.add_vpd_constraint(
          policy_name   => 'contracts_policy',
          constr_name   => 'manager_constraint_1',
          match_pattern => '{?contract  pred:hasContractValue ?cvalue }',
          apply_pattern => '{?contract  pred:hasManager     
             "sys_context(''sa$appctx'',''app_user_uri'')"^^orardf:instruction }',
          constr_group  => 'manager');
end;
/

In the preceding example, the data access constraint, defined within the manager constraint group, can be activated for all sessions involving users with a manager role. In this case, the secure application context can be programmed to initialize the attribute app_user_uri of the sa$appctx namespace with the URI for the user logged in. For example, when user Andy logs into the application, the app_user_uri attribute can be initialized to <http://www.myorg.com/employee/Andy>, in which case the constraint will ensure that user Andy can view the value for a contract only if user Andy manages the contract. Generally, the sys_context function can be used in the object position of any graph pattern to allow dynamic URIs or literal values to be bound at the time of query execution. Note that if the context is not initialized properly, the preceding constraint will fail for all data instances and effectively restrict the user from accessing any data.

A SPARQL query issued on an RDF model with a VPD policy is analyzed using the match patterns of all the active data access constraints that are defined in the policy. In the next example, the SPARQL query refers to the hasContractValue property, thereby enforcing the constraint if the group is active. Logically, the enforcement of a constraint is equivalent to rewriting the original SPARQL graph pattern to include the apply patterns for all the relevant constraints, using appropriate variables and terms from the user query. With the previous access restriction on the hasContractValue property, the following SPARQL graph pattern passed to a SEM_MATCH operator is logically rewritten as shown in the following example:

Query:     
{ ?contr  pred:drivenBy         ?dept . 
  ?contr  pred:hasContractValue ?val }
 
Rewritten query:     
{ ?contr  pred:drivenBy         ?dept . 
  ?contr  pred:hasContractValue ?val .
  ?contr  pred:hasManager
                "sys_context('sa$appctx','app_user_uri')"^^orardf:instruction }

When the match pattern of a data access constraint on an RDF property matches the pattern being accessed in a user query, the equivalent VPD-enforced query appends the corresponding apply patterns to the SPARQL query using the variables and terms appearing in the matched pattern. When a SPARQL query has nested graph patterns, the data access constraints are applied to the appropriate basic graph pattern block. In the following example, the hasContractValue property is referenced in the OPTIONAL graph pattern, and therefore the corresponding apply pattern is enforced just for this block of the graph pattern.

Query:     
{ ?contr  pred:drivenBy         ?dept . 
   OPTIONAL { ?contr  pred:hasContractValue ?val } } 
 
Rewritten query:     
{ ?contr  pred:drivenBy         ?dept . 
   OPTIONAL { ?contr  pred:hasContractValue ?val .
              ?contr  pred:hasManager
                "sys_context('sa$appctx','app_user_uri')"^^orardf:instruction } }

The apply pattern for a data access constraint can be any valid basic graph pattern with multiple triple patterns and a FILTER clause. For example, the access constraint on the hasContractValue property for a user with the VP role may stipulate that the user can access the property only if he or she is the Vice President of the department driving the contract. The match and apply patterns for such a constraint can be defined as follows:

Match:  { ?contract  pred:hasContractValue  ?cvalue  }
Apply:  { ?contract  pred:drivenBy          ?dept . 
          ?dept      pred:hasVP
               "sys_context('sa$appctx','app_user_uri')"^^orardf:instruction }

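The match and apply patterns above might be registered as follows. This is a sketch only: the constraint name vp_constraint_1 and the group name vp are hypothetical, and the policy is assumed to have a namespace map for the pred prefix.

```sql
begin
  sem_rdfsa.add_vpd_constraint(
      policy_name   => 'contracts_policy',
      constr_name   => 'vp_constraint_1',   -- hypothetical name
      match_pattern => '{ ?contract  pred:hasContractValue  ?cvalue }',
      apply_pattern => '{ ?contract  pred:drivenBy  ?dept .
                          ?dept      pred:hasVP
             "sys_context(''sa$appctx'',''app_user_uri'')"^^orardf:instruction }',
      constr_group  => 'vp');               -- active only when the VP group is set
end;
/
```

Because this constraint belongs to the vp group and the earlier manager constraint belongs to the manager group, the context package decides at run time which of the two structurally different apply patterns guards the hasContractValue property.
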
A match pattern defined for a data access constraint associated with an RDF class identifies all variables and terms that are known to be instances of the class. The RDF metadata defined in the VPD policy is used to determine the type for each variable and term in a SPARQL query, and the appropriate access constraints are applied on these variables and terms. For example, the following VPD constraint ensures that a resource that is a member of the Contract class can only be accessed by a user who has a hasMember relationship with the resource:

Match:  { ?contract  rdf:type  <http://www.myorg.com/classes/Contract> }
Apply:  { ?contract  pred:hasMember           
               "sys_context('sa$appctx','app_user_uri')"^^orardf:instruction }

The class information for a variable or term appearing in a SPARQL query is derived using the domain and range information for the properties appearing in the query. In the SPARQL query in the next example, if the VPD policy has an RDF Schema statement that asserts that the domain of the drivenBy property is the Contract class, the variable ?contr is known to hold instances of the Contract class. Therefore, with the previously defined access restriction for the Contract class, the user query is rewritten to include an appropriate apply pattern, as shown in the following example:

Query:     
{ ?contr  pred:drivenBy    ?dept . 
  ?contr  pred:hasDueDate  ?due } 
 
Rewritten query: 
{ ?contr  pred:drivenBy    ?dept . 
  ?contr  pred:hasDueDate  ?due  . 
  ?contr  pred:hasMember           
               "sys_context('sa$appctx','app_user_uri')"^^orardf:instruction }

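The class-based rewrite shown above depends on two pieces of policy content: the domain statement for drivenBy and the constraint on the Contract class. A sketch of defining both follows. The constraint name is hypothetical, and the constr_group parameter is omitted on the assumption that it is optional, so that the constraint is not tied to any group.

```sql
begin
  -- Metadata: resources in the subject position of drivenBy are Contracts.
  sem_rdfsa.maint_vpd_metadata(
      policy_name => 'contracts_policy',
      t_subject   => '<http://www.myorg.com/pred/drivenBy>',
      t_predicate => '<http://www.w3.org/2000/01/rdf-schema#domain>',
      t_object    => '<http://www.myorg.com/classes/Contract>');

  -- Constraint: a Contract is visible only to users related to it by hasMember.
  sem_rdfsa.add_vpd_constraint(
      policy_name   => 'contracts_policy',
      constr_name   => 'contract_member_1',   -- hypothetical name
      match_pattern => '{ ?contract  rdf:type  <http://www.myorg.com/classes/Contract> }',
      apply_pattern => '{ ?contract  pred:hasMember
             "sys_context(''sa$appctx'',''app_user_uri'')"^^orardf:instruction }');
end;
/
```
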
When a basic graph pattern in a SPARQL query matches multiple data access constraints, the corresponding apply patterns are combined to form a conjunctive graph pattern, which is subsequently enforced for the specific graph pattern by logically rewriting the SPARQL query. While considering the data access constraints to be enforced for a given SPARQL query, the class and property hierarchy associated with the VPD policy is consulted to automatically enforce all applicable constraints.

  • A variable or term identified as an instance of a specific RDF class enforces constraints associated with the class and all its superclasses.

  • A constraint associated with a property is enforced when the user query references the property or any property defined as its subproperty or an equivalent property.

You can use the sys_context function in a data access constraint to enforce context-dependent access restrictions with structurally similar graph patterns. You can dynamically activate and deactivate constraint groups, based on the application context, to enforce alternate access restrictions using structurally different graph patterns.

5.1.4 RDFVPD_POLICIES View

The MDSYS.RDFVPD_POLICIES view contains information about all VPD policies defined in the schema or the policies to which the user has FULL access. If the same policy is associated with multiple models, this view has one entry for each such association. This view exists only after the semantic network and a VPD policy have been created.

The MDSYS.RDFVPD_POLICIES view contains the columns shown in Table 5-1.

Table 5-1 MDSYS.RDFVPD_POLICIES View Columns

Column Name     Data Type      Description
-------------   ------------   ------------------------------------------------------------------------------
POLICY_OWNER    VARCHAR2(32)   Owner of the VPD policy.
POLICY_NAME     VARCHAR2(32)   Name of the VPD policy.
NAMESPACE_MAP   RDF_ALIASES    Mapping for namespace entries that are used in the VPD constraint definitions.
CONTEXT_NAME    VARCHAR2(32)   Name of the context used to manage constraint groups.


5.1.5 RDFVPD_MODELS View

The MDSYS.RDFVPD_MODELS view contains information about RDF models and their associated VPD policies. This view exists only after the semantic network and a VPD policy have been created.

The MDSYS.RDFVPD_MODELS view contains the columns shown in Table 5-2.

Table 5-2 MDSYS.RDFVPD_MODELS View Columns

Column Name      Data Type      Description
--------------   ------------   ---------------------------------------------------------------------
MODEL_NAME       VARCHAR2(25)   Name of the model.
POLICY_OWNER     VARCHAR2(32)   Owner of the VPD policy.
POLICY_NAME      VARCHAR2(32)   Name of the VPD policy.
OPERATION_TYPE   VARCHAR2(9)    Type of operation for which the VPD policy is enforced: QUERY or DML.


5.1.6 RDFVPD_POLICY_CONSTRAINTS View

The MDSYS.RDFVPD_POLICY_CONSTRAINTS view contains information about the constraints defined in the VPD policy that are accessible to the current user. This view exists only after the semantic network and a VPD policy have been created.

The MDSYS.RDFVPD_POLICY_CONSTRAINTS view contains the columns shown in Table 5-3.

Table 5-3 MDSYS.RDFVPD_POLICY_CONSTRAINTS View Columns

Column Name        Data Type        Description
----------------   --------------   ------------------------------------------------------------------------------------
POLICY_OWNER       VARCHAR2(32)     Owner of the VPD policy.
POLICY_NAME        VARCHAR2(32)     Name of the VPD policy.
CONSTRAINT_NAME    VARCHAR2(32)     Name of the constraint.
MATCH_PATTERN      VARCHAR2(1000)   Match pattern for the constraint.
APPLY_PATTERN      VARCHAR2(4000)   Apply pattern for the constraint.
CONSTRAINT_GROUP   VARCHAR2(32)     Name of the constraint group to which the constraint belongs. (Not case-sensitive.)
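
As a brief sketch, the constraints defined in a policy can be reviewed by group with a query such as the following. The policy name is the one used in the examples in this chapter; UPPER is applied because the case in which the name is stored is not assumed here.

```sql
select constraint_group, constraint_name, match_pattern, apply_pattern
  from mdsys.rdfvpd_policy_constraints
 where upper(policy_name) = 'CONTRACTS_POLICY'
 order by constraint_group, constraint_name;
```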


5.1.7 RDFVPD_PREDICATE_MDATA View

The MDSYS.RDFVPD_PREDICATE_MDATA view contains information about the predicate metadata associated with a VPD policy. This view exists only after the semantic network and a VPD policy have been created.

The MDSYS.RDFVPD_PREDICATE_MDATA view contains the columns shown in Table 5-4.

Table 5-4 MDSYS.RDFVPD_PREDICATE_MDATA View Columns

Column Name    Data Type        Description
------------   --------------   ----------------------------------------------------------------------------
POLICY_OWNER   VARCHAR2(32)     Owner of the VPD policy.
POLICY_NAME    VARCHAR2(32)     Name of the VPD policy.
PREDICATE      VARCHAR2(4000)   URI for the predicate for which the domain and range information is defined.
HASDOMAIN      VARCHAR2(4000)   URI representing the domain of the predicate.
HASRANGE       VARCHAR2(4000)   URI representing the range of the predicate.


5.1.8 RDFVPD_RESOURCE_REL View

The MDSYS.RDFVPD_RESOURCE_REL view contains information about the subclass, subproperty, and equivalent property relationships that are defined between resources in a VPD policy. This view exists only after the semantic network and a VPD policy have been created.

The MDSYS.RDFVPD_RESOURCE_REL view contains the columns shown in Table 5-5.

Table 5-5 MDSYS.RDFVPD_RESOURCE_REL View Columns

Column Name         Data Type        Description
POLICY_OWNER        VARCHAR2(32)     Owner of the VPD policy.
POLICY_NAME         VARCHAR2(32)     Name of the VPD policy.
SUBJECT_RESOURCE    VARCHAR2(4000)   Subject resource.
OBJECT_RESOURCE     VARCHAR2(4000)   Object resource.
RELATIONSHIP_TYPE   VARCHAR2(4000)   Relationship that exists between the subject resource and the object resource.
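Because the values stored in RELATIONSHIP_TYPE are not listed here, a simple way to explore the view is to count the rows for each relationship type. This is a sketch only; the policy name is a placeholder.

```sql
-- Sketch: summarize the relationships recorded for one VPD policy.
-- CONTRACTS_POLICY is a hypothetical policy name.
SELECT relationship_type, COUNT(*) AS relationships
  FROM mdsys.rdfvpd_resource_rel
 WHERE policy_name = 'CONTRACTS_POLICY'
 GROUP BY relationship_type;
```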


5.2 Oracle Label Security (OLS) for RDF Data

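Oracle Label Security (OLS) for RDF data provides two options for securing semantic data:

  • Triple-level security (explained in Section 5.2.1)

  • Resource-level security (explained in Section 5.2.2)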

To specify an option, use the SEM_RDFSA.APPLY_OLS_POLICY procedure with the appropriate rdfsa_options parameter value.

To switch from one option to the other, remove the existing policy by using the SEM_RDFSA.REMOVE_OLS_POLICY procedure, and then apply the new policy by using the SEM_RDFSA.APPLY_OLS_POLICY procedure with the appropriate rdfsa_options parameter value.
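The switch might look like the following sketch, which moves an existing policy named defense to triple-level security. It assumes that SEM_RDFSA.REMOVE_OLS_POLICY takes no arguments; check the reference information in Chapter 13 before relying on it.

```sql
-- Sketch: switch the RDF repository to triple-level security.
-- Assumes REMOVE_OLS_POLICY is called without arguments.
BEGIN
  sem_rdfsa.remove_ols_policy;                      -- detach the current OLS policy
  sem_rdfsa.apply_ols_policy('defense',
                             sem_rdfsa.TRIPLE_LEVEL_ONLY);  -- reapply with the new option
END;
/
```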

5.2.1 Triple-Level Security

The triple-level security option provides a thin layer of RDF-specific capabilities on top of the Oracle Database native support for label security. This option provides better performance and is easier to use than resource-level security (described in Section 5.2.2), especially for performing inference while using OLS. The main difference is that with triple-level security there is no need to assign labels, explicitly or implicitly, to individual triple resources (subjects, properties, objects).

Note:

To use triple-level security, you must first install Patch 9819833 SEMANTIC TECHNOLOGIES 11G R2 FIX BUNDLE 2 (available from My Oracle Support).

To use triple-level security, specify SEM_RDFSA.TRIPLE_LEVEL_ONLY as the rdfsa_options parameter value when you execute the SEM_RDFSA.APPLY_OLS_POLICY procedure. For example:

EXECUTE sem_rdfsa.apply_ols_policy('defense', SEM_RDFSA.TRIPLE_LEVEL_ONLY);

Do not specify any of the other available parameters for the SEM_RDFSA.APPLY_OLS_POLICY procedure.

When you use triple-level security, OLS is applied to each semantic model in the network. That is, label security is applied to the relevant internal tables and to all the application tables; there is no need to manually apply policies to the application tables of existing semantic models. However, if you need to create additional models after applying the OLS policy, you must use the SEM_OLS.APPLY_POLICY_TO_APP_TAB procedure to apply OLS to the application table before creating the model. Similarly, if you have dropped a semantic model and you no longer need to protect the application table, you can use the SEM_OLS.REMOVE_POLICY_FROM_APP_TAB procedure. (These procedures are described in Chapter 10.)
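For a model created after the policy is in place, the order of operations could be sketched as follows. The schema RDFUSER, the table PROJECT_TPL, and the model name projects are hypothetical, and the argument order shown for SEM_OLS.APPLY_POLICY_TO_APP_TAB (policy name, schema name, table name) is an assumption to be verified against Chapter 10.

```sql
-- Sketch with hypothetical names; verify the procedure signature in Chapter 10.
CREATE TABLE rdfuser.project_tpl (triple SDO_RDF_TRIPLE_S);

BEGIN
  -- Protect the new application table with the existing OLS policy first.
  sem_ols.apply_policy_to_app_tab('defense', 'RDFUSER', 'PROJECT_TPL');
END;
/

-- Only then create the model on the protected table (run as the table owner).
EXECUTE sem_apis.create_sem_model('projects', 'project_tpl', 'triple');
```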

With triple-level security, duplicate triples with different labels can be inserted in the semantic model. (Such duplicates are not allowed with resource-level security.) For example, assume that you have a triple with a very sensitive label, such as:

(<urn:X>,<urn:P>,<urn:Y>, "TOPSECRET")

This does not prevent a low-privileged (UNCLASSIFIED) user from inserting the triple (<urn:X>,<urn:P>,<urn:Y>, "UNCLASSIFIED"). Because SPARQL and SEM_MATCH do not return label information, a query will return both rows (assuming the user has appropriate privileges), and it will not be easy to distinguish between the TOPSECRET and UNCLASSIFIED triples.

To filter out such low-security triples when querying the semantic models, you can use one or more of the following options with SEM_MATCH:

  • POLICY_NAME specifies the OLS policy name.

  • MIN_LABEL specifies the minimum label for triples that are included in the query.

In other words, every triple whose label is strictly dominated by MIN_LABEL is excluded from the query results. For example, to filter out the "UNCLASSIFIED" triple, you could use the following query (assuming the OLS policy name is DEFENSE and that the query user has read privileges over UNCLASSIFIED and TOPSECRET triples):

SELECT s,p,y FROM table(sem_match('{?s ?p ?y}' , 
  sem_models('TEST'), null, null, null, null, 
  'MIN_LABEL=TOPSECRET POLICY_NAME=DEFENSE'));

Note that the filtering in the preceding example occurs in addition to the security checks performed by the native OLS software.

After a triple has been inserted, you can view and update its label information through the CTXT1 column in the application table for the semantic model (assuming that you have the WRITEUP and WRITEDOWN privileges to modify the labels).
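As a rough sketch, the labels could be inspected and changed with ordinary SQL against the application table. The table name TEST_TPL is hypothetical; LABEL_TO_CHAR and CHAR_TO_LABEL are the Oracle Label Security conversion functions, and the policy name defense follows the earlier examples.

```sql
-- Sketch: TEST_TPL is a hypothetical application table for model TEST.
-- Count the triples carrying each label (LABEL_TO_CHAR converts the numeric tag).
SELECT label_to_char(ctxt1) AS label, COUNT(*) AS triples
  FROM test_tpl
 GROUP BY label_to_char(ctxt1);

-- For illustration only: relabel every UNCLASSIFIED triple as TOPSECRET.
UPDATE test_tpl
   SET ctxt1 = char_to_label('defense', 'TOPSECRET')
 WHERE ctxt1 = char_to_label('defense', 'UNCLASSIFIED');
```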

There are no restrictions on who can perform inference or bulk loading with triple-level security; all of the inferred or bulk loaded triples are inserted with the user's session row label. Note that you can change the session labels by using the SA_UTL package. (For more information, see Oracle Label Security Administrator's Guide.)

5.2.2 Resource-Level Security

The resource-level security option enables you to assign one or more security labels that define a security level for table rows. Conceptually, a table in a relational data model can be mapped to an equivalent RDF graph. Specifically, a row in a relational table can be mapped to a set of triples, each asserting some fact about a specific subject. In this scenario, the subject represents the primary key for the row and each non-key column-value combination from the row is mapped to a predicate-object value combination for the corresponding triples.

A row in a relational data model is identified by its key, and OLS, as a row-level access control mechanism, effectively restricts access to the values associated with the key. With this conceptual mapping between relational and RDF data models, restricting access to a row in a relational table is equivalent to restricting access to a subgraph involving a specific subject. In a model that supports sensitivity labels for each triple, this is enforced by applying the same label to all the triples involving the given subject. However, you can also achieve greater flexibility by allowing the individual triples to have different labels, while maintaining a minimum bound for all the labels.

OLS support for RDF data employs a multilevel approach in which sensitivity labels associated with the triple components (subject, predicate, object) collectively form a minimum bound for the sensitivity label for the triple. With this approach, a data sensitivity label associated with an RDF resource (used as subject, predicate, or object) restricts unauthorized users from accessing any triples involving the resource and from creating new triples with the resource. For example, projectHLS as a subject may have a minimum sensitivity label, which ensures that all triples describing this subject have a sensitivity label that at least covers the label for projectHLS. Additionally, hasContractValue as a predicate may have a higher sensitivity label; and when this predicate is used with projectHLS to form a triple, that triple minimally has a label that covers both the subject and the predicate labels, as in the following example:

Triple 1: <http://www.myorg.com/contract/projectHLS> :ownedBy
                               <http://www.myorg.com/department/Dept1>
Triple 2: <http://www.myorg.com/contract/projectHLS> :hasContractValue
                               "100000"^^xsd:integer

Sensitivity labels are associated with the RDF resources (URIs) based on the position in which they appear in a triple. For example, the same RDF resource may appear in different positions (subject, predicate, or object) in different triples. Three unique labels can be assigned to each resource, so that the appropriate label is used to determine the label for a triple based on the position of the resource in the triple. You can choose the specific resource positions to be secured in a database instance when you apply an OLS policy to the RDF repository. You can secure subjects, objects, predicates, or any combination, as explained in separate sections to follow. The following example applies an OLS policy named defense to the RDF repository and allows sensitivity labels to be associated with RDF subjects and predicates.

begin
  sem_rdfsa.apply_ols_policy(
        policy_name   => 'defense',
        rdfsa_options => sem_rdfsa.SECURE_SUBJECT+
                         sem_rdfsa.SECURE_PREDICATE); 
end;
/

The same RDF resource can appear in both the subject and object positions (and sometimes even as the predicate), and such a resource can have distinct sensitivity labels based on its position. A triple using the resource at a specific position should have a label that covers the label corresponding to the resource's position. In such cases, the triple can be asserted or accessed only by the users with labels that cover the triple and the resource labels.

In a specific RDF repository, security based on data classification techniques can be turned on for subjects, predicates, objects, or a combination of these. This ensures that all the triples added to the repository automatically conform to the label relationships described above.

5.2.2.1 Securing RDF Subjects

An RDF resource (typically a URI) appears in the subject position of a triple when an assertion is made about the resource. In this case, a sensitivity label associated with the resource has the following characteristics:

  • The label represents the minimum sensitivity label for any triple using the resource as a subject. In other words, the sensitivity label for the triple should dominate or cover the label for the subject.

  • The label for a newly added triple is initialized to the user initial row label or is generated using the label function, if one is specified. Such operations are successful only if the triple's label dominates the label associated with the triple's subject.

  • Only a user with an access label that dominates the subject's label and the triple's label can read the triple.

By default, the sensitivity label for a subject is derived from the user environment when an RDF resource is used in the subject position of a triple for the first time. The default sensitivity label in this case is set to the user's initial row label (the default that is assigned to all rows inserted by the user).

However, you can categorize an RDF resource as a subject and assign a sensitivity label to it even before it is used in a triple. The following example assigns a sensitivity label named SECRET:HLS:US to the projectHLS resource, thereby restricting the users who are able to define new triples about this resource and who are able to access existing triples with this resource as the subject:

begin
  sem_rdfsa.set_resource_label(
         model_name   => 'contracts',
         resource_uri => '<http://www.myorg.com/contract/projectHLS>',
         label_string => 'SECRET:HLS:US',
         resource_pos => 'S');
end;
/

5.2.2.2 Securing RDF Predicates

An RDF predicate defines the relationship between a subject and an object. You can use sensitivity labels associated with RDF predicates to restrict access to specific types of relationships with all subjects.

RDF predicates are analogous to columns in a relational table, and the ability to restrict access to specific predicates is equivalent to securing relational data at the column level. As in the case of securing the subject, the predicate's sensitivity label creates a minimum bound for any triples using this predicate. It is also the minimum authorization that a user must possess to define a triple with the predicate or to access a triple with the predicate.

The following example assigns the label HSECRET:FIN (in this scenario, a label that is Highly Secret and that also belongs to the Finance department) to triples with the hasContractValue predicate, to ensure that only a user with such clearance can define the triple or access it:

begin
  sem_rdfsa.set_predicate_label( 
         model_name   => 'contracts',
         predicate    => '<http://www.myorg.com/pred/hasContractValue>',
         label_string => 'HSECRET:FIN');
end;  
/

You can secure predicates in combination with subjects. In such cases, any triple using a secured subject and a secured predicate is guaranteed to have a sensitivity label that at least covers the labels for both the subject and the predicate. Extending the preceding example, if projectHLS as a subject is secured with label SECRET:HLS:US and if hasContractValue as a predicate is secured with label HSECRET:FIN, a triple assigning a monetary value for projectHLS should at least have a label HSECRET:HLS,FIN:US. Effectively, a user's label must dominate this triple's label to be able to define or access the triple.

5.2.2.3 Securing RDF Objects

An RDF triple can have a URI or a literal in its object position. A URI in the object position of a triple represents some resource. You can secure a resource in the object position by associating a sensitivity label with it, to restrict the ability to use the resource as an object in triples.

Typically, a resource (URI or non-literal) appearing in the object position of a triple may itself be described using additional RDF statements. Effectively, an RDF resource in the object position could appear in the subject position in some other triples. When the RDF resources are secured at the object position without explicit sensitivity labels, the label associated with the same resource in the subject position is used as the default label for the object.
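By analogy with the subject example in Section 5.2.2.1, an explicit label for a resource in the object position might be assigned as in the following sketch. It assumes that 'O' is the resource_pos value for the object position and that the OLS policy was applied with objects secured (for example, with sem_rdfsa.SECURE_OBJECT included in rdfsa_options); the resource URI and label are taken from the earlier examples for illustration.

```sql
-- Sketch: label a resource for the object position.
-- Assumes resource_pos => 'O' denotes the object position.
begin
  sem_rdfsa.set_resource_label(
         model_name   => 'contracts',
         resource_uri => '<http://www.myorg.com/department/Dept1>',
         label_string => 'SECRET:HLS:US',
         resource_pos => 'O');
end;
/
```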

5.2.2.4 Generating Labels for Inferred Triples

The RDF data model allows for the specification of declarative rules, enabling it to infer the presence of RDF statements that are not explicitly added to the repository. The following shows a simple declarative rule associated with the logic that projects can be owned by departments and departments have Vice Presidents, and in such cases the project leader is by default the Vice President of the department that owns the project.

RuleID -> projectLedBy
Antecedent Expression -> (?proj :ownedBy ?dept) (?dept :hasVP ?person)
Consequent Expression -> (?proj :isLedBy ?person)

An RDF rule uses some explicitly asserted triples as well as previously inferred triples as antecedents, and infers one or more consequent triples. Traditionally, the inference process is executed as an offline operation to pregenerate all the inferred triples and to make them available for subsequent query operations.

When the underlying RDF graph is secured using OLS, any additional data inferred from the graph should also be secured to avoid exposing the data to unauthorized users. Additionally, the inference process should run with higher privileges, specifically with full access to data, in order to ensure completeness.

OLS support for RDF data offers techniques to generate sensitivity labels for inferred triples based on labels associated with one or more RDF artifacts. It provides label generation techniques that you can invoke at the time of inference. Additionally, it provides an extensibility framework, which allows an extensible implementation to receive a set of possible labels for a specific triple and determine the most appropriate sensitivity label for the triple based on some application-specific logic. The techniques that you can use for generating the labels for inferred triples include the following (each technique, except for Use Antecedent Labels, is associated with a SEM_RDFSA package constant):

  • Use Rule Label (SEM_RDFSA.LABELGEN_RULE): An inferred triple is directly generated by a specific rule, and it may be indirectly dependent on other rules through its antecedents. Each rule may have a sensitivity label, which is used as the sensitivity label for all the triples directly inferred by the rule.

  • Use Subject Label (SEM_RDFSA.LABELGEN_SUBJECT): Derives the label for the inferred triple by considering any sensitivity labels associated with the subject in the new triple. Each inferred triple has a subject, which could in turn be a subject, predicate, or object in any of the triple's antecedents. When such RDF resources are secured, the subject in the newly inferred triple may have one or more labels associated with it. With the Use Subject Label technique, the label for the inferred triple is set to the unique label associated with the RDF resource. When more than one label exists for the resource, you can implement the extensible logic to determine the most relevant label for the new triple.

  • Use Predicate Label (SEM_RDFSA.LABELGEN_PREDICATE): Derives the label for the inferred triple by considering any sensitivity labels associated with the predicate in the new triple. Each inferred triple has a predicate, which could in turn be a subject, predicate, or object in any of the triple's antecedents. When such RDF resources are secured, the predicate in the newly inferred triple may have one or more labels associated with it. With the Use Predicate Label technique, the label for the inferred triple is set to the unique label associated with the RDF resource. When more than one label exists for the resource, you can implement the extensible logic to determine the most relevant label for the new triple.

  • Use Object Label (SEM_RDFSA.LABELGEN_OBJECT): Derives the label for the inferred triple by considering any sensitivity labels associated with the object in the new triple. Each inferred triple has an object, which could in turn be a subject, predicate, or object in any of the triple's antecedents. When such RDF resources are secured, the object in the newly inferred triple may have one or more labels associated with it. With the Use Object Label technique, the label for the inferred triple is set to the unique label associated with the RDF resource. When more than one label exists for the resource, you can implement the extensible logic to determine the most relevant label for the new triple.

  • Use Dominating Label (SEM_RDFSA.LABELGEN_DOMINATING): Each inferred triple minimally has four direct components: subject, predicate, object, and the rule that produced the triple. With the Use Dominating Label technique, at the time of inference the label generator computes the most dominating of the sensitivity labels associated with each of the components and assigns it as the sensitivity label for the inferred triple. Exception labels are assigned when a clear dominating relationship cannot be established between various labels.

  • Use Antecedent Labels: In addition to the four direct components for each inferred triple (subject, predicate, object, and the rule that produced the triple), a triple may have one or more antecedent triples, which are instrumental in deducing the new triple. With the Use Antecedent Labels technique, the labels for all the antecedent triples are considered, and conflict resolution criteria are implemented to determine the most appropriate label for the new triple. Since an inferred triple may be dependent on other inferred triples, a strict order is followed while generating the labels for all the inferred triples.

    The Use Antecedent Labels technique requires that you use a custom label generator. For information about creating and using a custom label generator, see Section 5.2.2.5.

The following example creates an entailment (rules index) for the contracts data using a specific rule base. This operation can only be performed by a user with FULL access privilege with the OLS policy applied to the RDF repository. In this case, the labels generated for the inferred triples are based on the labels associated with their predicates, as indicated by the use of the SEM_RDFSA.LABELGEN_PREDICATE package constant in the label_gen parameter.

begin
  sem_apis.create_entailment(
         index_name_in   => 'contracts_inf',
         models_in       => SDO_RDF_Models('contracts'),
         rulebases_in    => SDO_RDF_Rulebases('contracts_rb'),
         options         => 'USER_RULES=T',
         label_gen       => sem_rdfsa.LABELGEN_PREDICATE);
end;
/

When the predefined or extensible label generation implementation cannot compute a unique label to be applied to an inferred triple, an exception label is set for the triple. Such triples are not accessible by any user other than the user with full access to RDF data (also the user initiating the inference process). The triples with exception labels are clearly marked, so that a privileged user can access them and apply meaningful labels manually. After the sensitivity labels are applied to inferred triples, only users with compatible labels can access these triples. The following example updates the sensitivity label for triples for which an exception label was set:

update mdsys.rdfi_contracts_inf 
     set ctxt1 = char_to_label('defense', 'SECRET:HLS:US')
     where ctxt1 = -1;
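Before running an update like the preceding one, a privileged user might first check how many inferred triples received the exception label. This sketch reuses the entailment name from the example and the exception value -1 described above.

```sql
-- Sketch: count inferred triples that still carry the exception label.
SELECT COUNT(*) AS exception_triples
  FROM mdsys.rdfi_contracts_inf
 WHERE ctxt1 = -1;
```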

Inferred triples accessed through generated labels might not be the same as conceptual triples inferred directly from the user-accessible triples and rules. The labels generated using system-defined or custom implementations cannot be guaranteed to be precise. See the information about Fine-Grained Access Control (VPD and OLS) Considerations in the Usage Notes for the SEM_APIS.CREATE_ENTAILMENT procedure in Chapter 9 for details.

5.2.2.5 Using Labels Based on Application Logic

The MDSYS.RDFSA_LABELGEN type is used to apply appropriate label generator logic at the time of index creation; however, you can also extend this type to implement a custom label generator and generate labels based on application logic. The label generator is specified using the label_gen parameter with the SEM_APIS.CREATE_ENTAILMENT procedure. To use a system-defined label generator, specify a SEM_RDFSA package constant, as explained in Section 5.2.2.4; to use a custom label generator, you must implement a custom label generator type and specify an instance of that type instead of a SEM_RDFSA package constant.

To create a custom label generator type, you must have the UNDER privilege on the RDFSA_LABELGEN type. In addition, to create an index for RDF data, you must have the EXECUTE privilege on this type. The following example grants these privileges to a user named RDF_ADMIN:

GRANT under, execute ON mdsys.rdfsa_labelgen TO rdf_admin;

The custom label generator type must implement a constructor, which should set the dependent resources and specify the getNumericLabel method to return the label computed from the information passed in, as shown in the following example:

CREATE OR REPLACE TYPE CustomSPORALabel UNDER mdsys.rdfsa_labelgen  (
   constructor function CustomSPORALabel return self as result,
   overriding member function getNumericLabel (
                                    subject   rdfsa_resource,
                                    predicate rdfsa_resource,
                                    object    rdfsa_resource,
                                    rule      rdfsa_resource,
                                    anteced   rdfsa_resource)
        return number);
/

The label generator constructor uses a set of constants defined in the SEM_RDFSA package to indicate the list of resources on which the label generator relies. The dependent resources are identified as an inferred triple's subject, its predicate, its object, the rule that produced the triple, and its antecedents. A custom label generator can rely on any subset of these resources for generating the labels, and you can specify this in its constructor by using the constants defined in the SEM_RDFSA package: USE_SUBJECT_LABEL, USE_PREDICATE_LABEL, USE_OBJECT_LABEL, USE_RULE_LABEL, USE_ANTCED_LABEL.

Example 5-1 creates the type body, specifying the constructor function and the getNumericLabel member function. (Application-specific logic is not included in this example.)

Example 5-1 Creating a Custom Label Generator Type

CREATE OR REPLACE TYPE BODY CustomSPORALabel AS
 
   constructor function CustomSPORALabel return self as result as
   begin
     self.setDepResources(sem_rdfsa.USE_SUBJECT_LABEL+
                          sem_rdfsa.USE_PREDICATE_LABEL+
                          sem_rdfsa.USE_OBJECT_LABEL+
                          sem_rdfsa.USE_RULE_LABEL+
                          sem_rdfsa.USE_ANTECED_LABELS);
     return;
   end CustomSPORALabel;
   
   overriding member function getNumericLabel (
                                    subject   rdfsa_resource,
                                    predicate rdfsa_resource,
                                    object    rdfsa_resource,
                                    rule      rdfsa_resource,
                                    anteced   rdfsa_resource)
        return number as
     labellst mdsys.int_array := mdsys.int_array(); 
   begin
    -- Find dominating label of S P O R A --
    -- Application-specific logic for computing the triple label --
    -- Copy over all labels to labellst --
    for li in 1 .. subject.getLabelCount() loop
      labellst.extend; 
      labellst(labellst.COUNT) := subject.getLabel(li); 
    end loop; 
    --- Copy over other labels as well --- 
    --- Find a dominating of all the labels. Generates -1 if no
    --- dominating label within the set
    return self.findDominatingOf(labellst); 
   end getNumericLabel;
  end;
  /

In Example 5-1, the sample label generator implementation uses all the resources contributing to the inferred triple for generating a sensitivity label for the triple. Thus, the constructor uses the setDepResources method defined in the superclass to set all its dependent components. The list of dependent resources set with this step determines the exact list of values passed to the label generating routine.

The getNumericLabel method is the label generation routine that has one argument for each resource that an inferred triple may depend on. Some arguments may be null values if the corresponding dependent resource is not set in the constructor implementation.

The label generator implementation can make use of a general-purpose static routine defined in the RDFSA_LABELGEN type to find a dominating label for a given set of labels. A set of labels is passed in an instance of MDSYS.INT_ARRAY type, and the method finds a dominating label among them. If no such label exists, an exception label -1 is returned.

After you have implemented the custom label generator type, you can use the custom label generator for inferred data by assigning an instance of this type to the label_gen parameter in the SEM_APIS.CREATE_ENTAILMENT procedure, as shown in the following example:

begin
  sem_apis.create_entailment(
         index_name_in  => 'contracts_rdfsinf',
         models_in      => SDO_RDF_Models('contracts'),
         rulebases_in   => SDO_RDF_Rulebases('RDFS'),
         options        => '',
         label_gen      => CustomSPORALabel());
end;
/

5.2.2.6 RDFOLS_SECURE_RESOURCE View

The MDSYS.RDFOLS_SECURE_RESOURCE view contains information about resources secured with Oracle Label Security (OLS) policies and the sensitivity labels associated with these resources.

Select privileges on this view can be granted to appropriate users. To view the resources associated with a specific model, you must also have select privileges on the model (or the corresponding RDFM_model-name view).

The MDSYS.RDFOLS_SECURE_RESOURCE view contains the columns shown in Table 5-6.

Table 5-6 MDSYS.RDFOLS_SECURE_RESOURCE View Columns

Column Name     Data Type      Description
MODEL_NAME      VARCHAR2(25)   Name of the model.
MODEL_ID        NUMBER         Internal identifier for the model.
RESOURCE_ID     NUMBER         Internal identifier for the resource; to be joined with MDSYS.RDF_VALUE$.VALUE_ID column for information about the resource.
RESOURCE_TYPE   VARCHAR2(16)   One of the following string values to indicate the resource type for which the label is assigned: SUBJECT, PREDICATE, OBJECT, GLOBAL.
CTXT1           NUMBER         Sensitivity label assigned to the resource.
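The join mentioned for RESOURCE_ID could be written as in the following sketch, which lists the secured resources of one model together with readable labels. It assumes that the querying user can read MDSYS.RDF_VALUE$ and its VALUE_NAME column, and that the model name is stored in uppercase; LABEL_TO_CHAR is the Oracle Label Security function that converts a numeric label tag to its character form.

```sql
-- Sketch: show secured resources and their labels for the CONTRACTS model.
-- Assumes read access to MDSYS.RDF_VALUE$ and an uppercase model name.
SELECT r.resource_type,
       v.value_name,
       label_to_char(r.ctxt1) AS sensitivity_label
  FROM mdsys.rdfols_secure_resource r
  JOIN mdsys.rdf_value$ v
    ON v.value_id = r.resource_id
 WHERE r.model_name = 'CONTRACTS'
 ORDER BY v.value_name, r.resource_type;
```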