• Aucun résultat trouvé

Lift Chart Using SAS Macro LIFT

Dans le document Data Mining Using (Page 186-191)

Supervised Learning Methods: Prediction

Step 1. Fit a simple BLR model:

5.6 Lift Chart Using SAS Macro LIFT

The LIFT macro is a powerful SAS application for producing a LIFT chart (if–then analysis) in regression models. Options are available for graph-ically comparing the predicted response from the full model and the predicted response from the reduced model (keeping the variable of interest at a constant level). SAS procedures REG, GLM, and LOGISTIC can be used in the macro. In addition to these SAS procedures, the GPLOT procedure is also utilized in the LIFT macro. No procedures or options are currently available in the SAS system to produce LIFT charts automatically.

Software requirements for using the LIFT macro include:

3456_Book.book Page 177 Wednesday, November 20, 2002 11:34 AM

SAS/CORE, SAS/BASE, SAS/STAT, and SAS/GRAPH must be licensed and installed at the site.

SAS version 8.0 and above is recommended for full utilization.

An active Internet connection is required for downloading the LIFT macro from the book website if the companion CD-ROM is not available.

5.6.1 Steps Involved in Running the Lift Macro

1. Create an SAS dataset (permanent or temporary) containing at least one response (target) variable and many continuous or categorical predictor (input) variables.

2. If the companion CD-ROM is not available, first verify that the Internet connection is active. Open the LIFT.sas macro-call file in the SAS PROGRAM EDITOR window. Instructions are given in the Appendix regarding downloading the macro-call and sample data files from the book website. If the companion CD-ROM is available, the LIFT.sas macro-call file can be found in the mac-call folder in the CD-ROM. Open the LIFT.sas macro-call file in the SAS PROGRAM EDITOR window. Click the RUN icon to submit the macro-call file LIFT.sas to open the macro-call window called LIFT (Figure 5.16).

3. Input the appropriate parameters in the macro-call window by following the instructions provided in the LIFT macro help file in Section 5.6.2. After inputting all the required macro parameters, be sure the cursor is in the last input field and the RESULTS VIEWER window is closed, then hit the ENTER key (not the RUN icon) to submit the macro.

4. Examine the LOG window (only in DISPLAY mode) for any macro execution errors. If any errors are reported in the LOG window, activate the PROGRAM EDITOR window, resubmit the LIFT.sas macro-call file, check the macro input values, and correct any input errors.

5. If no errors are found in the LOG window, activate the PROGRAM EDITOR window, resubmit the LIFT.sas macro-call file, and change the macro input value from DISPLAY to any other desirable format (see Section 5.6.2). The SAS output files from the LIFT analysis and LIFT charts can be saved as user-specified-format files in the user-specified folder.

5.6.2 Help File for Using SAS Macro LIFT

1. Macro-call parameter: Input SAS dataset name (r equired parameter).

3456_Book.book Page 178 Wednesday, November 20, 2002 11:34 AM

Descriptions and explanation: Include the name of the tempo-rary (member_name) or permanent (libname.member_name) SAS dataset on which the regression analysis is to be performed.

Options/examples:

Permanent SAS dataset: gf.sales (LIBNAME: gf; SAS dataset name: sales)

Temporary SAS dataset: sales (SAS dataset name)

2. Macro-call parameter: Input class variable names (optional statement).

Descriptions and explanation: To include categorical variables from the SAS dataset as predictors in regression modeling, input the names of these variables. Use this with SAS procedures GLM and LOGISTIC (version 8.0 and later).

Options/examples:

month manager (regression modeling using PROC GLM or LOGISTIC)

Blank (if the macro input field is left blank, PROC REG or LOGISTIC)

3. Macro-call parameter: Input response variable name (required parameter).

Descriptions and explanation: Input the response variable name from the SAS dataset to be modeled as the target variable.

Options/examples:

Y (name of a continuous response; PROC REG or GLM) Y (name of a binary response; PROC LOGISTIC)

4. Macro-call parameter: Input variable name of interest (required option).

Descriptions and explanation: Input the name of the predictor variable to control in the if–then analysis. The variable of interest can be continuous (PROC REG, GLM, or LOGISTIC) or a binary categorical variable (GLM or LOGISTIC).

Option/example:

X1 Manager

5. Macro-call parameter: Input the SAS PROC name (required statement).

Descriptions and explanation: Input the SAS procedure name for fitting the regression models.

Options/examples:

REG: MLR with continuous response and continuous pre-dictor variables

GLM: MLR with continuous response and continuous and categorical predictor variables

3456_Book.book Page 179 Wednesday, November 20, 2002 11:34 AM

LOGISTIC: Logistic regression with binary response and continuous and categorical predictor variables

(For details about specifying model statements, refer to the SAS online manuals on PROC REG,34 PROC GLM,35 and PROC LOGISTIC.36) 6. Macro-call parameter: LIFT variable fixed value (required

statement).

Descriptions and explanation: Input the fixed value for the variable of interest.

Options/examples:

50000 (fixed value for variable X1; applicable in REG, GLM, and LOGISTIC)

D (fixed categorical level when the variable of interest is categorical; valid only in GLM and LOGISTIC)

7. Macro-call parameter: Input model terms (required statement).

Descriptions and explanation: This macro input is equivalent to the right side of the equal sign in the PROC REG, PROC GLM, or PROC LOGISTIC model statement. In the case of fitting MLR with continuous variables using PROC REG, input names of the continuous variables, quadratic, and cross-product terms. Note that quadratic and cross-product terms should be created in the SAS dataset before specifying them in the macro field; however, to fit a regression model with indicator variables using PROC GLM/LOGISTIC (version 8.0), input continuous predictor variable names, categorical variable names you specified in macro input field #3, and any possible quadratic and interaction terms.

Options/examples:

X1 X2 X3 X2X3 X2SQ (MLR using PROC REG, where X1, X2, and X3 are linear predictors; X2X3, the interaction term;

and X2SQ, the quadratic term for X2)

X1 SOURCE X1*SOURCE (regression with indicator variable SOURCE using PROC GLM/LOGISTIC, where X1 is the linear predictor; SOURCE, the indicator variable; and X1*SOURCE, the interaction term between X1 and SOURCE)

8. Macro-call parameter: Other optional statements (optional statement).

Descriptions and explanation: Input any other optional state-ments associated with the SAS procedures; see the example below.

Options/examples:

1.LSMEANS source/pdiff ; contrast ‘source a vs. b’ source –1 1 0

(only valid in GLM and LOGISTIC)

Note that if you are using more than one statement, use a

“;” at the end of the first statement.

3456_Book.book Page 180 Wednesday, November 20, 2002 11:34 AM

2.Test x1–x2=0; Restrict intercept=0 (only valid in PROC REG)

Note that if you are using more than one statement, use a

“;” at the end of the first statement.

(For details about specifying other PROC statements, refer to the SAS online manuals on PROC REG,34 PROC GLM,35 and PROC LOGISTIC.36)

9. Macro-call parameter: Input ID variable (optional statement).

Descriptions and explanation: For a unique ID variable that can be used to identify each record in the database, input that variable name here. This will be used as the ID variable so that any out-lier/influential observations can be detected. If no ID variable is available in the dataset, leave this field blank. This macro can create an ID variable based on the observation number from the database.

Option/example:

ID NUM

10. Macro-call parameter: Input weight variable (optional parameter).

Descriptions and explanation: To perform a weighted regression analysis (weights to adjust for influential observations or heterosce-dasticity) and the weight exists in the dataset, input the name of the weight variable. If this parameter is left blank, a weight of 1 will be used for all observations.

Options/examples:

Wt (name of the weight variable) Blank (1 will be used as the weight)

11. Macro-call parameter: Input any optional procedure options (optional parameter).

Descriptions and explanation: To include any optional PROC options (REG and GLM: NOPRINT; LOGISTIC: DESCENDING), specify these options here. If PROC LOGISTIC is being used and the DESCENDING option is not specified, by default PROC LOGIS-TIC will model the probability of a non-event as 0.

Options/examples:

NOPRINT (REG and GLM) DESCENDING (LOGISTIC)

(For details about specifying PROC options, refer to the SAS online manuals on PROC REG,34 PROC GLM,35 and PROC LOGISTIC.36) 12. Macro-call parameter: Folder to save SAS graphics and output

files (optional statement).

Descriptions and explanation: To save the SAS graphics files in an EMF format suitable for inclusion in PowerPoint presentations, specify output format as TXT in version 8.0 or later. In pre-8.0 versions, all graphic format files will be saved in a user-specified

3456_Book.book Page 181 Wednesday, November 20, 2002 11:34 AM

folder. Similarly, output files in WORD, HTML, PDF, and TXT formats will be saved in the user-specified folder. If this macro field is left blank, the graphics and output files will be saved in the default folder.

Option/example:

c:\output\ — folder named “OUTPUT”

13. Macro-call parameter: ith number of analysis (required statement).

Descriptions and explanation: SAS output files will be saved by forming a file name from the original SAS dataset name and the counter value provided in this field. For example, if the original SAS dataset name is “sales” and the counter number included is 1, the SAS output files will be saved as “sales1” in the user-specified folder. By changing the counter value, users can avoid replacing previous SAS output files with new outputs.

14. Macro-call parameter: Display or save SAS output and graphs (required statement).

Descriptions and explanation: Option for displaying all out-put/graphics files in the OUTPUT/GRAPHICS window or saving as a specific format in a folder specified in macro input option #12.

Options/examples: See Section 5.5.2 (macro input option #11) for detailed explanations for these options:

DISPLAY WORD WEB PDF TXT

Note: System messages are deleted from the LOG window if DIS-PLAY is not selected as the input.

5.7 Scoring New Regression Data Using the SAS

Dans le document Data Mining Using (Page 186-191)