Task: Generate Test Design

Generate a procedure or mechanism to test the model’s quality and validity. For example, in supervised data mining tasks such as classification, error rates are commonly used as quality measures. The dataset is therefore typically split into training and test sets: the model is built on the training set, and its quality is estimated on the separate test set.
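The holdout procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not part of the ASUM-DM method itself: the 70/30 split, the fixed seed, the toy data, and the helper names (`train_test_split`, `error_rate`, `fit_majority_class`) are all hypothetical, and the majority-class "model" is only a placeholder for whatever modeling technique is actually chosen.

```python
import random

def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def error_rate(predict, test_rows):
    """Fraction of test rows whose predicted label differs from the true label."""
    wrong = sum(1 for features, label in test_rows if predict(features) != label)
    return wrong / len(test_rows)

def fit_majority_class(train_rows):
    """Placeholder model (assumption): always predicts the most common training label."""
    labels = [label for _, label in train_rows]
    majority = max(set(labels), key=labels.count)
    return lambda features: majority

# Hypothetical toy data: (features, label) pairs.
data = [((i,), "yes" if i % 4 else "no") for i in range(100)]

train, test = train_test_split(data)   # build on train, estimate on test
model = fit_majority_class(train)
print(f"train={len(train)} test={len(test)} error={error_rate(model, test):.2f}")
```

The key property of the design is that the rows used to estimate the error rate are never seen while the model is built.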

Purpose

Relationships

Roles	Primary Performer: Data Miner/Data Scientist	Additional Performers:
Process Usage	ASUM-DM > Analyze-Design-Configure&Build > Build Model > Generate Test Design

Key Considerations

Create a test design that describes the steps you will take to test the models produced. Because modeling is an iterative process, it is important to know when to stop adjusting parameters and try another method or model.
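One way to make "when to stop adjusting parameters" explicit is to write the stopping rule into the test design up front. The sketch below assumes two hypothetical limits, a maximum number of runs (`max_runs`) and a minimum improvement in error rate (`min_gain`); the function name, both thresholds, and the example error values are illustrative assumptions rather than anything prescribed by the method.

```python
def tune_with_budget(evaluate, candidate_settings, max_runs=5, min_gain=0.01):
    """Try settings in order; stop when the budget is spent or gains stall.

    evaluate(settings) is assumed to return an error rate on held-out data
    (lower is better). Returns (best_settings, best_error, runs_used).
    """
    best_settings, best_error = None, float("inf")
    runs = 0
    for settings in candidate_settings[:max_runs]:
        runs += 1
        error = evaluate(settings)
        gain = best_error - error  # improvement over the best result so far
        if error < best_error:
            best_settings, best_error = settings, error
        # After the first run, a gain below min_gain signals diminishing returns.
        if runs > 1 and gain < min_gain:
            break
    return best_settings, best_error, runs

# Hypothetical error rates for four parameter settings of one model type.
hypothetical_errors = {1: 0.30, 2: 0.22, 3: 0.215, 4: 0.10}
result = tune_with_budget(hypothetical_errors.get, [1, 2, 3, 4])
print(result)
```

With these made-up numbers the loop stops after the third run, because the third setting improves the error by less than `min_gain`. It therefore never evaluates the fourth setting, which shows the trade-off any such rule makes between effort spent and results missed.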

When creating a test design, consider the following questions:

What data will be used to test the models? Have you partitioned the data into train/test sets? (This is a commonly used approach in modeling.)

How might you measure the success of supervised models (such as C5.0)?

How might you measure the success of unsupervised models (such as Kohonen cluster nets)?

How many times are you willing to rerun a model with adjusted settings before attempting another type of model?
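
For the unsupervised question above, one commonly used measure is the silhouette coefficient, which scores how well each record fits its own cluster compared with the nearest other cluster. The sketch below is an illustrative assumption, not a measure named by the source: it works on cluster labels produced by any clustering model (a Kohonen net included), uses Euclidean distance, and the function name and toy points are hypothetical.

```python
import math
from collections import defaultdict

def mean_silhouette(points, labels):
    """Mean silhouette coefficient in [-1, 1]; higher means better-separated clusters."""
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the point's zero distance to itself is included in the sum only)
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return sum(scores) / len(scores)

# Hypothetical toy data: two obvious groups, scored under two labelings.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
good = mean_silhouette(points, [0, 0, 1, 1])  # labels follow the groups
bad = mean_silhouette(points, [0, 1, 0, 1])   # labels cut across the groups
print(f"good={good:.2f} bad={bad:.2f}")
```

A labeling that follows the natural groups scores close to 1, while one that cuts across them scores below 0, so the measure can be used to compare candidate cluster models on the same data.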

Licensed Materials - Property of IBM. (c) Copyright IBM Corp. 2015. IBM, the IBM logo, and SPSS are trademarks of International Business Machines Corp, registered in many jurisdictions worldwide. Other products and service names may be trademarks of IBM or other companies. You may use the Content "AS IS" or modify them, however IBM will not be responsible for any deficiencies or errors that result from modifications that you make.