Task: Generate Test Design
Purpose

Generate a procedure or mechanism to test the model’s quality and validity before building it. For example, in supervised data mining tasks such as classification, it is common to use error rates as quality measures for data mining models. An error rate measured on the same data used to build the model tends to be optimistic, so we typically separate the dataset into train and test sets, build the model on the train set, and estimate its quality on the separate test set.
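
The train/test procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a prescribed implementation: the helper names (train_test_split, fit_majority_class, error_rate), the 30% test fraction, and the toy dataset are all assumptions made for the example, and the majority-class "model" is only a stand-in for whatever classifier is actually being evaluated.

```python
import random
from collections import Counter


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_majority_class(train):
    """Stand-in 'model': always predicts the most common training label."""
    majority = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda features: majority


def error_rate(model, rows):
    """Fraction of rows whose predicted label differs from the true label."""
    wrong = sum(1 for features, label in rows if model(features) != label)
    return wrong / len(rows)


# Hypothetical toy dataset of (features, label) pairs.
data = [((i,), "yes" if i % 4 else "no") for i in range(100)]

# Build on the train set only, then estimate quality on the held-out test set.
train, test = train_test_split(data, test_fraction=0.3, seed=42)
model = fit_majority_class(train)
print(f"test error rate: {error_rate(model, test):.2f}")
```

The fixed seed makes the partition reproducible, so that later reruns with adjusted settings are compared on the same test set.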

Key Considerations

Create a test design that describes the steps you will take to test the models produced. Because modeling is an iterative process, it is important to know when to stop adjusting parameters and try another method or model.

When creating a test design, consider the following questions:

What data will be used to test the models? Have you partitioned the data into train/test sets? (This is a commonly used approach in modeling.)

How might you measure the success of supervised models (such as C5.0)?

How might you measure the success of unsupervised models (such as Kohonen cluster nets)?

How many times are you willing to rerun a model with adjusted settings before attempting another type of model?
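
Two of the questions above can be made concrete with a short sketch. It assumes, purely for illustration, that an unsupervised model is judged by within-cluster sum of squared distances (lower means tighter clusters) and that the rerun limit is expressed as a maximum number of runs plus a minimum score gain. The function names, the default thresholds, and the scores in the usage example are hypothetical, and other measures or stopping rules may suit a given project better.

```python
def within_cluster_sse(points, assignments, centers):
    """Sum of squared distances from each point to its assigned cluster center."""
    return sum(
        sum((p - c) ** 2 for p, c in zip(point, centers[k]))
        for point, k in zip(points, assignments)
    )


def tune_with_budget(evaluate, settings_to_try, max_reruns=5, min_gain=0.01):
    """Try settings in order; stop when the budget is spent or gains stall.

    evaluate(setting) must return a score where higher is better, for
    example 1 - error rate on the test set, or the negated SSE above.
    Returns (best_setting, best_score, runs_used).
    """
    best_setting, best_score, runs_used = None, float("-inf"), 0
    for setting in settings_to_try[:max_reruns]:
        score = evaluate(setting)
        runs_used += 1
        gain = score - best_score  # improvement over the best run so far
        if gain > 0:
            best_setting, best_score = setting, score
        if runs_used > 1 and gain < min_gain:
            break  # improvement stalled: consider another type of model
    return best_setting, best_score, runs_used


# Hypothetical test-set scores for four parameter settings of one model type.
scores = {1: 0.70, 2: 0.78, 3: 0.781, 4: 0.90}
result = tune_with_budget(scores.get, [1, 2, 3, 4], max_reruns=5, min_gain=0.01)
print(result)

# Cluster tightness for three 2-D points assigned to two clusters.
sse = within_cluster_sse([(0, 0), (0, 2), (10, 10)], [0, 0, 1], [(0, 1), (10, 10)])
print(sse)
```

In this example the loop stops after the third run because the gain falls below min_gain, so the fourth setting is never tried even though it would have scored higher. That trade-off is the kind of decision the test design should state explicitly in advance.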