38.14 PREDICTION_COST

Syntax

prediction_cost::=

Analytic Syntax

prediction_cost_analytic::=

cost_matrix_clause::=

mining_attribute_clause::=

mining_analytic_clause::=

See Also:

"Analytic Functions" for information on the syntax, semantics, and restrictions of mining_analytic_clause

Purpose

PREDICTION_COST returns a cost for each row in the selection. The cost refers to the lowest cost class or to the specified class. The cost is returned as BINARY_DOUBLE.

PREDICTION_COST can perform classification or anomaly detection. For classification, the returned cost refers to a predicted target class. For anomaly detection, the returned cost refers to a classification of 1 (for typical rows) or 0 (for anomalous rows).

You can use PREDICTION_COST in conjunction with the PREDICTION function to obtain the prediction and the cost of the prediction.

cost_matrix_clause

Costs are a biasing factor for minimizing the most harmful kinds of misclassifications. For example, false positives might be considered more costly than false negatives. Costs are specified in a cost matrix that can be associated with the model or defined inline in a VALUES clause. All classification algorithms can use costs to influence scoring.

Decision Tree is the only algorithm that can use costs to influence the model build. The cost matrix used to build a Decision Tree model is also the default scoring cost matrix for the model.

The following cost matrix table specifies that the misclassification of 1 is five times more costly than the misclassification of 0.

ACTUAL_TARGET_VALUE PREDICTED_TARGET_VALUE       COST
------------------- ---------------------- ----------
                  0                      0          0
                  0                      1          1
                  1                      0          5
                  1                      1          0

In cost_matrix_clause:

  • COST MODEL indicates that scoring should be performed by taking into account the scoring cost matrix associated with the model. If the cost matrix does not exist, then the function returns an error.

  • COST MODEL AUTO indicates that the existence of a cost matrix is unknown. If a cost matrix exists, then the function uses it to return the lowest cost prediction. Otherwise the function returns the highest probability prediction.

  • The VALUES clause specifies an inline cost matrix for class_value. For example, you could specify that the misclassification of 1 is five times more costly than the misclassification of 0 as follows:

       PREDICTION (nb_model COST (0,1) VALUES ((0, 1),(1, 5)) USING *)
    

    If a model that has a scoring cost matrix is invoked with an inline cost matrix, then the inline costs are used.

    See Also:

    Oracle Data Mining User's Guide for more information about cost-sensitive prediction.

Syntax Choice

PREDICTION_COST can score the data in one of two ways: It can apply a mining model object to the data, or it can dynamically mine the data by executing an analytic clause that builds and applies one or more transient mining models. Choose Syntax or Analytic Syntax:

  • Syntax — Use the first syntax to score the data with a pre-defined model. Supply the name of a model that performs classification or anomaly detection.

  • Analytic Syntax — Use the analytic syntax to score the data without a pre-defined model. The analytic syntax uses mining_analytic_clause , which specifies if the data should be partitioned for multiple model builds. The mining_analytic_clause supports a query_partition_clause and an order_by_clause. (See "analytic_clause::=".)

    • For classification, specify FOR expr, where expr is an expression that identifies a target column that has a character data type.

    • For anomaly detection, specify the keywords OF ANOMALY.

The syntax of the PREDICTION_COST function can use an optional GROUPING hint when scoring a partitioned model. See GROUPING Hint.

mining_attribute_clause

mining_attribute_clause identifies the column attributes to use as predictors for scoring. When the function is invoked with the analytic syntax, these predictors are also used for building the transient models. The mining_attribute_clause behaves as described for the PREDICTION function. (See "mining_attribute_clause".)

See Also:

Note:

The following example is excerpted from the Data Mining sample programs. For more information about the sample programs, see Appendix A in Oracle Data Mining User's Guide.

Example

This example predicts the ten customers in Italy who would respond to the least expensive sales campaign (offering an affinity card).

SELECT cust_id 
FROM (SELECT cust_id,rank()
       OVER (ORDER BY PREDICTION_COST(DT_SH_Clas_sample, 1 COST MODEL USING *)
            ASC, cust_id) rnk
        FROM mining_data_apply_v
        WHERE country_name = 'Italy')
  WHERE rnk <= 10
  ORDER BY rnk;
 
   CUST_ID
----------
    100081
    100179
    100185
    100324
    100344
    100554
    100662
    100733
    101250
    101306