Data Cleaning: The Burke Way Marketing Research Help

According to Damon Jones, Office Manager, Data Acquisition and Processing, Burke, Inc., completed questionnaires from the field often have many small errors because of the inconsistent quality of interviewing. For example, qualifying responses are not circled, or skip patterns are not followed accurately.

These small errors can be costly. When responses from such questionnaires are put into a computer, Burke runs a cleaning program that checks for completeness and logic. Discrepancies are identified and checked by the tabulation supervisors. Once the errors are identified, appropriate corrective action is taken before data analysis is carried out. Burke has found that this procedure substantially increases the quality of statistical results.

The department store example describes the various phases of the data-preparation process. Note that the process is initiated while the fieldwork is still in progress. The Burke example describes the importance of cleaning data and identifying and correcting errors before the data are analyzed. A systematic description of the data-preparation process follows.


The Data-Preparation Process 

The data-preparation process is shown in Figure 14.1. The entire process is guided by the preliminary plan of data analysis that was formulated in the research design phase (Chapter 3). The first step is to check for acceptable questionnaires. This is followed by editing, coding, and transcribing the data. The data are cleaned and a treatment for missing responses prescribed. Often, statistical adjustment of the data may be necessary to make them representative of the population of interest. The researcher should then select an appropriate data analysis strategy. The final data analysis strategy differs from the preliminary plan of data analysis due to the information and insights gained since the preliminary plan was formulated. Data preparation should begin as soon as the first batch of questionnaires is received from the field, while the fieldwork is still going on. Thus, if any problems are detected, the fieldwork can be modified to incorporate corrective action.

Questionnaire Checking 

The initial step in questionnaire checking involves a check of all questionnaires for completeness and interviewing quality. Often these checks are made while fieldwork is still underway. If the fieldwork was contracted to a data-collection agency, the researcher should make an independent check after it is over. A questionnaire returned from the field may be unacceptable for several reasons.

[Figure 14.1: The data-preparation process. Preparing Preliminary Plan of Data Analysis -> Questionnaire Checking -> Editing -> Coding -> Transcribing -> Data Cleaning -> Statistically Adjusting the Data -> Selecting a Data Analysis Strategy]

  1. Parts of the questionnaire may be incomplete.
  2. The pattern of responses may indicate that the respondent did not understand or follow the instructions. For example, skip patterns may not have been followed.
  3. The responses show little variance. For example, a respondent has checked only 4s on a series of 7-point rating scales.
  4. The returned questionnaire is physically incomplete: one or more pages are missing.
  5. The questionnaire is received after the preestablished cutoff date.
  6. The questionnaire is answered by someone who does not qualify for participation.
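
The six reasons above lend themselves to an automated first pass. The following is a minimal sketch in Python, not a description of any particular firm's cleaning program. The record layout (`Questionnaire`), the function name `check_questionnaire`, and the skip-rule format are all assumptions made for illustration. The sketch also treats identical answers across a block of rating scales as the only case of "little variance"; a real check would probably use a looser threshold.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass
class Questionnaire:
    # Hypothetical record layout; none of these field names come from the text.
    answers: Dict[str, Optional[int]]  # question id -> coded answer, None if blank
    ratings: List[int]                 # answers to one block of rating scales
    pages_returned: int
    received: date
    qualifies: bool                    # respondent passed the screening questions


def check_questionnaire(
    q: Questionnaire,
    total_pages: int,
    cutoff: date,
    skip_rules: List[Tuple[str, int, str]],
) -> List[str]:
    """Return the reasons, if any, for treating a questionnaire as unacceptable.

    Each skip rule (source, value, target) means: when `source` was answered
    with `value`, the respondent should have skipped `target`.
    """
    problems = []
    # Questions this respondent should have left blank under the skip rules.
    should_skip = {
        target for source, value, target in skip_rules
        if q.answers.get(source) == value
    }
    for qid, answer in q.answers.items():
        if qid in should_skip and answer is not None:
            problems.append(f"skip pattern not followed at {qid}")  # reason 2
        elif qid not in should_skip and answer is None:
            problems.append(f"{qid} left unanswered")               # reason 1
    if len(q.ratings) > 1 and len(set(q.ratings)) == 1:
        problems.append("no variance in ratings")                   # reason 3
    if q.pages_returned < total_pages:
        problems.append("pages missing")                            # reason 4
    if q.received > cutoff:
        problems.append("received after cutoff date")               # reason 5
    if not q.qualifies:
        problems.append("respondent does not qualify")              # reason 6
    return problems
```

An empty list means the questionnaire passed every check. Anything else is a list of reasons for a supervisor to review, not an automatic rejection.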

If quotas or cell group sizes have been imposed, the acceptable questionnaires should be classified and counted accordingly. Any problems in meeting the sampling requirements should be identified and corrective action taken, such as conducting additional interviews in the underrepresented cells, before the data are edited.
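
Counting acceptable questionnaires against quotas can be sketched in a few lines of Python. The function name `quota_shortfalls` and the cell labels in the usage note are hypothetical; the only idea taken from the text is that acceptable questionnaires are classified by cell and compared with the required cell sizes.

```python
from collections import Counter
from typing import Dict, Iterable


def quota_shortfalls(accepted_cells: Iterable[str], quotas: Dict[str, int]) -> Dict[str, int]:
    """Count acceptable questionnaires per quota cell and report unmet quotas.

    accepted_cells holds one cell label per acceptable questionnaire.
    The result maps each underrepresented cell to the number of
    additional interviews still needed.
    """
    counts = Counter(accepted_cells)
    return {
        cell: target - counts[cell]
        for cell, target in quotas.items()
        if counts[cell] < target
    }
```

For instance, with quotas of 3 for a cell labeled "men 18-34" and 2 for "women 18-34", and one and two acceptable questionnaires respectively, the function reports a shortfall of 2 in the first cell only.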


Editing

Editing is the review of the questionnaires with the objective of increasing accuracy and precision. It consists of screening questionnaires to identify illegible, incomplete, inconsistent, or ambiguous responses.

Responses may be illegible if they have been poorly recorded. This is particularly common in questionnaires with a large number of unstructured questions. The data must be legible if they are to be properly coded. Likewise, questionnaires may be incomplete to varying degrees. A few or many questions may be unanswered.

At this stage, the researcher makes a preliminary check for consistency. Certain obvious inconsistencies can be easily detected. For example, a respondent reports an annual income of less than $20,000, yet indicates frequent shopping at prestigious department stores such as Saks Fifth Avenue and Neiman Marcus.

Responses to unstructured questions may be ambiguous and difficult to interpret clearly. The answer may be abbreviated, or some ambiguous words may have been used. For structured questions, more than one response may be marked for a question designed to elicit a single response.

Suppose a respondent circles 2 and 3 on a 5-point rating scale. Does this mean that 2.5 was intended? To complicate matters further, the coding procedure may allow for only a single-digit response.
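
Two of the editing screens described above, the income consistency check and the detection of multiple marks on a single-response question, can be sketched as follows. This is an illustration under assumptions: the record layout (a dict with `marks`, `income`, and `frequent_stores` keys) and the function name `edit_flags` are hypothetical. The sketch only flags problems for the editor. It deliberately does not resolve them, for example by turning a circled 2 and 3 into 2.5.

```python
from typing import Dict, List, Set


def edit_flags(record: Dict, single_response_items: List[str],
               prestige_stores: Set[str]) -> List[str]:
    """Return editing flags for one questionnaire; an empty list means none found."""
    flags = []
    # Ambiguous or incomplete: an item that expects exactly one mark.
    for item in single_response_items:
        marks = record["marks"].get(item, [])
        if len(marks) > 1:
            flags.append(f"{item}: multiple responses {marks}")
        elif len(marks) == 0:
            flags.append(f"{item}: no response")
    # Inconsistent: income under $20,000 combined with frequent shopping
    # at prestigious stores (the example given in the text).
    income = record.get("income")
    shops_prestige = bool(set(record.get("frequent_stores", [])) & prestige_stores)
    if income is not None and income < 20000 and shops_prestige:
        flags.append("low income but frequent prestige-store shopping")
    return flags
```

A flagged questionnaire is then handled by one of the treatments for unsatisfactory responses, depending on how serious and how frequent the problems are.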

Treatment of Unsatisfactory Responses 

Unsatisfactory responses are commonly handled by returning to the field to get better data, assigning missing values, or discarding unsatisfactory respondents.

RETURNING TO THE FIELD The questionnaires with unsatisfactory responses may be returned to the field, where the interviewers recontact the respondents. This approach is particularly attractive for business and industrial marketing surveys, where the sample sizes are small and the respondents are easily identifiable. However, the data obtained the second time may be different from those obtained during the original survey. These differences may be attributed to changes over time or differences in the mode of questionnaire administration (e.g., telephone versus in-person interview).

ASSIGNING MISSING VALUES If returning the questionnaires to the field is not feasible, the editor may assign missing values to unsatisfactory responses. This approach may be desirable if (1) the number of respondents with unsatisfactory responses is small, (2) the proportion of unsatisfactory responses for each of these respondents is small, or (3) the variables with unsatisfactory responses are not the key variables.

DISCARDING UNSATISFACTORY RESPONDENTS In this approach, the respondents with unsatisfactory responses are simply discarded. This approach may have merit when (1) the proportion of unsatisfactory respondents is small (less than 10 percent), (2) the sample size is large, (3) the unsatisfactory respondents do not differ from satisfactory respondents in obvious ways (e.g., demographics, product usage characteristics), (4) the proportion of unsatisfactory responses for each of these respondents is large, or (5) responses on key variables are missing. However, unsatisfactory respondents may differ from satisfactory respondents in systematic ways, and the decision to designate a respondent as unsatisfactory may be subjective. Both these factors bias the results. If the researcher decides to discard unsatisfactory respondents, the procedure adopted to identify these respondents and their number should be reported.
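
One way to combine the last two treatments is sketched below. The rule used here is an assumption made for illustration, not a procedure prescribed in the text: a respondent is discarded when a key variable is unsatisfactory or when more than `max_bad_share` of that respondent's answers are unsatisfactory, and otherwise the unsatisfactory answers are replaced with a missing-value code. The names `UNSAT`, `MISSING`, `treat_respondents`, and the 0.5 default are hypothetical. The 10 percent figure in the report is the one mentioned above.

```python
from typing import Dict, List, Set, Tuple

UNSAT = "UNSAT"   # hypothetical marker an editor puts on an unsatisfactory answer
MISSING = None    # hypothetical missing-value code


def treat_respondents(
    respondents: Dict[int, Dict[str, object]],
    key_vars: Set[str],
    max_bad_share: float = 0.5,   # assumed threshold, not from the text
) -> Tuple[Dict[int, Dict[str, object]], List[int], Dict[str, object]]:
    """Assign missing values or discard, and report what was done."""
    kept, discarded = {}, []
    for rid, answers in respondents.items():
        bad = [var for var, value in answers.items() if value == UNSAT]
        share = len(bad) / len(answers) if answers else 1.0
        if any(var in key_vars for var in bad) or share > max_bad_share:
            discarded.append(rid)
        else:
            kept[rid] = {
                var: (MISSING if value == UNSAT else value)
                for var, value in answers.items()
            }
    share_discarded = len(discarded) / len(respondents) if respondents else 0.0
    report = {
        "rule": f"key variable unsatisfactory or more than {max_bad_share:.0%} of answers unsatisfactory",
        "discarded": len(discarded),
        "total": len(respondents),
        "share_discarded": share_discarded,
        # Discarding is easier to justify when this share is under 10 percent.
        "exceeds_10_percent": share_discarded >= 0.10,
    }
    return kept, discarded, report
```

Returning the rule and the count alongside the cleaned data makes it straightforward to report how unsatisfactory respondents were identified and how many were dropped.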

Posted on November 30, 2015 in Data Preparation
