5.2 The CREATE_MODEL Procedure
The CREATE_MODEL procedure in the DBMS_DATA_MINING package uses the specified data to create a mining model with the specified name and mining function. The model can be created with configuration settings and user-specified transformations.
PROCEDURE CREATE_MODEL(
    model_name            IN VARCHAR2,
    mining_function       IN VARCHAR2,
    data_table_name       IN VARCHAR2,
    case_id_column_name   IN VARCHAR2,
    target_column_name    IN VARCHAR2 DEFAULT NULL,
    settings_table_name   IN VARCHAR2 DEFAULT NULL,
    data_schema_name      IN VARCHAR2 DEFAULT NULL,
    settings_schema_name  IN VARCHAR2 DEFAULT NULL,
    xform_list            IN TRANSFORM_LIST DEFAULT NULL);
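As an illustration of the signature, the following is a minimal sketch of a call that supplies only the required arguments plus a target. The model name my_class_model, the table my_build_data, and the columns cust_id and affinity_card are hypothetical names chosen for this example; they are not part of the procedure definition.

```sql
-- Minimal sketch: build a classification model with default settings.
-- Assumes a hypothetical table MY_BUILD_DATA in the current schema with
-- a unique CUST_ID column and an AFFINITY_CARD column to predict.
BEGIN
  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'my_class_model',                  -- hypothetical model name
    mining_function     => DBMS_DATA_MINING.CLASSIFICATION,   -- package constant
    data_table_name     => 'my_build_data',                   -- hypothetical build data
    case_id_column_name => 'cust_id',                         -- hypothetical case ID
    target_column_name  => 'affinity_card');                  -- hypothetical target
END;
/
```

Because settings_table_name and xform_list are omitted, they take their NULL defaults, so the model is built without a settings table or a user-specified transformation list.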
5.2.1 Choosing the Mining Function
Describes how to provide the mining function to CREATE_MODEL.
The mining function is a required argument to the CREATE_MODEL procedure. A data mining function specifies a class of problems that can be modeled and solved.
Data mining functions implement either supervised or unsupervised learning. Supervised learning uses a set of independent attributes to predict the value of a dependent attribute or target. Unsupervised learning does not distinguish between dependent and independent attributes. Supervised functions are predictive. Unsupervised functions are descriptive.
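The practical effect of this distinction on a CREATE_MODEL call can be sketched as follows: an unsupervised function such as clustering has no target, so target_column_name is left at its NULL default. The model name my_cluster_model, the table my_build_data, and the column cust_id are hypothetical names used only for illustration.

```sql
-- Minimal sketch: an unsupervised (clustering) model has no target column.
-- Assumes a hypothetical table MY_BUILD_DATA with a unique CUST_ID column.
BEGIN
  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'my_cluster_model',            -- hypothetical model name
    mining_function     => DBMS_DATA_MINING.CLUSTERING,   -- unsupervised function
    data_table_name     => 'my_build_data',               -- hypothetical build data
    case_id_column_name => 'cust_id');                    -- no target_column_name
END;
/
```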
Note:
In data mining terminology, a function is a general type of problem to be solved by a given approach to data mining. In SQL language terminology, a function is an operator that returns a value.
In Oracle Data Mining documentation, the term function or mining function refers to a data mining function; the term SQL function or SQL Data Mining function refers to a SQL function for scoring (applying data mining models).
You can specify any of the values in the following table for the mining_function parameter to CREATE_MODEL.
Table 5-2 Mining Model Functions
5.2.2 Choosing the Algorithm
Learn about providing the algorithm settings for a model.
The ALGO_NAME setting specifies the algorithm for a model. If you use the default algorithm for the mining function, or if there is only one algorithm available for the mining function, you do not need to specify the ALGO_NAME setting. Instructions for specifying model settings are in "Specifying Model Settings".
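As a rough sketch of how a non-default algorithm might be requested, the following creates a settings table, records an ALGO_NAME setting, and passes the table name to CREATE_MODEL. The table my_settings, the model name my_dt_model, the table my_build_data, and its columns are hypothetical; the choice of Decision Tree is only an example of an algorithm for classification.

```sql
-- Minimal sketch: choose an algorithm through a settings table.
-- MY_SETTINGS, MY_DT_MODEL, and MY_BUILD_DATA are hypothetical names.
CREATE TABLE my_settings (
  setting_name  VARCHAR2(30),
  setting_value VARCHAR2(4000));

BEGIN
  -- One row per setting; here only the algorithm is specified.
  INSERT INTO my_settings (setting_name, setting_value)
  VALUES (DBMS_DATA_MINING.ALGO_NAME, DBMS_DATA_MINING.ALGO_DECISION_TREE);
  COMMIT;

  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'my_dt_model',
    mining_function     => DBMS_DATA_MINING.CLASSIFICATION,
    data_table_name     => 'my_build_data',
    case_id_column_name => 'cust_id',
    target_column_name  => 'affinity_card',
    settings_table_name => 'my_settings');   -- settings are read from this table
END;
/
```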
Table 5-3 Data Mining Algorithms
5.2.3 Supplying Transformations
5.2.3.1 Creating a Transformation List
The following are the ways to create a transformation list:
- The STACK interface in DBMS_DATA_MINING_TRANSFORM.
  The STACK interface offers a set of pre-defined transformations that you can apply to an attribute or to a group of attributes. For example, you can specify supervised binning for all categorical attributes.
- The SET_TRANSFORM procedure in DBMS_DATA_MINING_TRANSFORM.
  The SET_TRANSFORM procedure applies a specified SQL expression to a specified attribute. For example, the following statement appends a transformation instruction for country_id to a list of transformations called my_xforms. The transformation instruction divides country_id by 10 before algorithmic processing begins. The reverse transformation multiplies country_id by 10.
    dbms_data_mining_transform.SET_TRANSFORM (my_xforms, 'country_id', NULL, 'country_id/10', 'country_id*10');
  The reverse transformation is applied in the model details. If country_id is the target of a supervised model, the reverse transformation is also applied to the scored target.
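To show where such a list fits in a complete build, here is a minimal sketch that declares the list, adds the country_id transformation described above, and passes the list to CREATE_MODEL through xform_list. The model name my_xform_model, the table my_build_data, and the columns cust_id and affinity_card are hypothetical; the sketch assumes the table also contains a numeric country_id column.

```sql
-- Minimal sketch: build a transformation list and embed it in a model.
-- MY_XFORM_MODEL and MY_BUILD_DATA (with CUST_ID, COUNTRY_ID, AFFINITY_CARD)
-- are hypothetical names used only for illustration.
DECLARE
  my_xforms DBMS_DATA_MINING_TRANSFORM.TRANSFORM_LIST;
BEGIN
  -- Divide country_id by 10 for the build; multiply by 10 to reverse it.
  DBMS_DATA_MINING_TRANSFORM.SET_TRANSFORM(
    my_xforms, 'country_id', NULL, 'country_id/10', 'country_id*10');

  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'my_xform_model',
    mining_function     => DBMS_DATA_MINING.CLASSIFICATION,
    data_table_name     => 'my_build_data',
    case_id_column_name => 'cust_id',
    target_column_name  => 'affinity_card',
    xform_list          => my_xforms);   -- transformations are embedded in the model
END;
/
```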
5.2.3.2 Transformation List and Automatic Data Preparation
Understand the interaction between the transformation list and Automatic Data Preparation (ADP).
The transformation list argument to CREATE_MODEL interacts with the PREP_AUTO setting, which controls ADP:
- When ADP is on and you specify a transformation list, your transformations are applied with the automatic transformations and embedded in the model. The transformations that you specify are executed before the automatic transformations.
- When ADP is off and you specify a transformation list, your transformations are applied and embedded in the model, but no system-generated transformations are performed.
- When ADP is on and you do not specify a transformation list, the system-generated transformations are applied and embedded in the model.
- When ADP is off and you do not specify a transformation list, no transformations are embedded in the model; you must separately prepare the data sets you use for building, testing, and scoring the model.
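The first case in the list (ADP on, with a transformation list) might be set up along the following lines. This is a sketch under stated assumptions: the settings table my_adp_settings, the model name my_adp_model, the table my_build_data, and its columns are hypothetical. Replacing PREP_AUTO_ON with PREP_AUTO_OFF would correspond to the second case.

```sql
-- Minimal sketch: turn ADP on explicitly and also supply a transformation list.
-- MY_ADP_SETTINGS, MY_ADP_MODEL, and MY_BUILD_DATA are hypothetical names.
CREATE TABLE my_adp_settings (
  setting_name  VARCHAR2(30),
  setting_value VARCHAR2(4000));

DECLARE
  my_xforms DBMS_DATA_MINING_TRANSFORM.TRANSFORM_LIST;
BEGIN
  -- PREP_AUTO controls ADP; use PREP_AUTO_OFF to disable it instead.
  INSERT INTO my_adp_settings (setting_name, setting_value)
  VALUES (DBMS_DATA_MINING.PREP_AUTO, DBMS_DATA_MINING.PREP_AUTO_ON);
  COMMIT;

  -- User-specified transformation; per the list above, it is executed
  -- before the automatic transformations.
  DBMS_DATA_MINING_TRANSFORM.SET_TRANSFORM(
    my_xforms, 'country_id', NULL, 'country_id/10', 'country_id*10');

  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'my_adp_model',
    mining_function     => DBMS_DATA_MINING.CLASSIFICATION,
    data_table_name     => 'my_build_data',
    case_id_column_name => 'cust_id',
    target_column_name  => 'affinity_card',
    settings_table_name => 'my_adp_settings',
    xform_list          => my_xforms);
END;
/
```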