The code for a response to a question includes an indication of the column position (field) and data record it will occupy. For example, gender of respondents may be coded as 1 for females and 2 for males. A field represents a single variable value or item of data, such as the gender of a single respondent. Although numeric information is most common in marketing research, a field can also contain alphabetic or symbolic information. A record consists of related fields, i.e., variable values, such as sex, marital status, age, household size, occupation, and so forth, all pertaining to a single respondent. Thus, each record can have several columns. Generally, all the data for a respondent will be stored on a single record, although a number of records may be used for each respondent. Data files are sets of records, generally data from all the respondents in a study, that are grouped together for storage in the computer. If a single record is used for each respondent, records represent rows in a data file. In such a case, a data file may be viewed as an n × m matrix of numbers or values, where n is the number of respondents and m is the number of variables or fields. It is often helpful to prepare a code book containing the coding instructions and the necessary information about the variables in the data set (see the opening Project Research example).
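As a minimal sketch of these definitions, the Python fragment below represents a data file as a list of records, one per respondent, with gender coded 1 for females and 2 for males as in the example above. The field names, the other variables, and the three records are hypothetical and serve only to show the n × m layout.

```python
# Hypothetical field (variable) names; only the gender coding comes from the text.
FIELDS = ["respondent_id", "gender", "age", "household_size"]

# Each inner list is one record: all the field values for a single respondent.
# The values are invented for illustration.
data_file = [
    [1, 1, 34, 3],  # respondent 1: female (1), age 34, household of 3
    [2, 2, 51, 1],  # respondent 2: male (2), age 51, household of 1
    [3, 1, 27, 4],  # respondent 3: female (1), age 27, household of 4
]

n = len(data_file)  # number of respondents (rows)
m = len(FIELDS)     # number of variables or fields (columns)

# Every record must supply exactly one value per field.
assert all(len(record) == m for record in data_file)

print(f"The data file is a {n} x {m} matrix")
```

With one record per respondent, adding a respondent adds a row and adding a variable adds a column.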
One can use a spreadsheet program, such as EXCEL, to enter the data, as most analysis programs can import data from a spreadsheet. In this case, the data for each respondent for each field is a cell. Typically, each row of the (EXCEL) spreadsheet contains the data of one respondent or case. The columns will contain the variables, with one column for each variable or response. The use of EXCEL can be complicated if there are more than 256 variables, which is the column limit of a worksheet in older versions of EXCEL (2003 and earlier).
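Because most analysis programs can import spreadsheet data, a worksheet saved as comma-separated text can be read one row at a time. The sketch below uses Python's standard `csv` module on a small in-memory sample; the column names and the two rows are hypothetical, and a real file would be opened with `open()` instead of `io.StringIO`.

```python
import csv
import io

# Hypothetical spreadsheet export: a header row, then one row per respondent.
sample = io.StringIO(
    "id,preference,income\n"
    "1,5,3\n"
    "2,7,6\n"
)

# DictReader maps each cell to its column (variable) name.
# All cells in this sample hold numeric codes, so they are converted to int.
records = [
    {name: int(value) for name, value in row.items()}
    for row in csv.DictReader(sample)
]

for record in records:
    print(record)
```

Each dictionary corresponds to one spreadsheet row, that is, one respondent or case.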
We illustrate these concepts using the data of Table 14.1. For illustrative purposes, we consider only a small number of observations. In actual practice, data analysis is performed on a much larger sample, such as that in the Dell running case and other cases with real data that are presented in this book. Table 14.1 gives the data from a pretest sample of 20 respondents on preferences for restaurants.
Each respondent was asked to rate preference to eat in a familiar restaurant (1 = Weak Preference, 7 = Strong Preference), and to rate the restaurant in terms of quality of food, quantity of portions, value, and service (1 = Poor, 7 = Excellent). Annual household income was also obtained and coded as: 1 = Less than $20,000; 2 = $20,000 to 34,999; 3 = $35,000 to 49,999; 4 = $50,000 to 74,999; 5 = $75,000 to 99,999; 6 = $100,000 or more. The code book for coding these data is given in Figure 14.2. Figure 14.3 is an example of questionnaire coding, showing the coding of demographic data typically obtained in consumer surveys. This questionnaire was precoded.
If the data of Table 14.1 are entered using either EXCEL or SPSS, the resulting data files will resemble Table 14.1. You can verify this by downloading EXCEL and SPSS files for Table 14.1 from the student Web site for this book. Note that the SPSS data file has two views: the data view and the variable view. The data view gives a listing of the data and resembles Table 14.1. The variable view gives a listing of the variables, showing the type, labels or description, values, and underlying coding for each variable, as shown in Table 14.2. Clicking on the “Values” column of the SPSS file opens a “Value Labels” dialog box. Value labels are unique labels assigned to each possible value of a variable. For example, 1 denotes weak preference and 7 denotes strong preference. If descriptors were used for the other preference values, those other preference values would also be assigned the corresponding “Value Labels.” The other columns of Table 14.2 are self-explanatory.
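The income coding and the idea of value labels can be sketched in Python as follows. The category boundaries and the two preference labels are those given in the text; the names `code_income`, `INCOME_LABELS`, and `PREFERENCE_LABELS` are assumptions made for illustration, as is the choice to place an amount between two listed categories (for example, $34,999.50) in the lower one.

```python
from bisect import bisect_right

# Lower bounds, in dollars, of income categories 2 through 6 (from the text).
INCOME_LOWER_BOUNDS = [20_000, 35_000, 50_000, 75_000, 100_000]

# Value labels: one descriptive label per coded value.
INCOME_LABELS = {
    1: "Less than $20,000",
    2: "$20,000 to 34,999",
    3: "$35,000 to 49,999",
    4: "$50,000 to 74,999",
    5: "$75,000 to 99,999",
    6: "$100,000 or more",
}

# Only the scale endpoints carry descriptors in the text.
PREFERENCE_LABELS = {1: "Weak Preference", 7: "Strong Preference"}


def code_income(annual_income):
    """Return the income code (1-6) for an annual household income in dollars."""
    if annual_income < 0:
        raise ValueError("income cannot be negative")
    # Count how many category lower bounds the income has reached.
    return bisect_right(INCOME_LOWER_BOUNDS, annual_income) + 1


code = code_income(42_500)
print(code, INCOME_LABELS[code])
```

Keeping the labels in a dictionary separate from the stored codes mirrors the split between the data view and the variable view: the data hold only the numeric codes, while the labels document what each code means.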
In Table 14.1, as well as in the corresponding EXCEL and SPSS files, the columns represent the fields, and the rows represent the records or respondents, as there is one record per respondent. Notice that there are seven columns. The first column contains the respondent ID, and the second column contains the preference for the restaurant. Columns three to six contain the evaluations of the restaurant on quality of food, quantity of portions, value, and service, respectively. Finally, the seventh column contains the respondent’s income, coded as specified in the code book. Each row contains all the data of a single respondent and represents a record. There are 20 rows or records, indicating that data for 20 respondents are stored in this data file. Note that Table 14.1 is a 20 × 7 matrix, as there are 20 respondents and 7 variables (including ID). Databases consist of one or more files that are interrelated. For example, a database may contain all the customer satisfaction surveys conducted quarterly for the last 5 years.
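The seven-column record layout described above lends itself to a simple consistency check before analysis. The sketch below is one possible way to do this in Python: the column order and the valid ranges (1 to 7 for the ratings, 1 to 6 for income) come from the text, while the short field names, the `check_record` function, and the sample records are hypothetical. The sample records are invented and are not rows of Table 14.1.

```python
# Column order follows the text; the short names themselves are hypothetical.
# Each field maps to its (minimum, maximum) valid code; None means unchecked.
LAYOUT = {
    "id": None,
    "preference": (1, 7),
    "quality": (1, 7),
    "quantity": (1, 7),
    "value": (1, 7),
    "service": (1, 7),
    "income": (1, 6),
}


def check_record(record):
    """Return a list of problems found in one record (empty if none)."""
    if len(record) != len(LAYOUT):
        return [f"expected {len(LAYOUT)} fields, got {len(record)}"]
    problems = []
    for (name, valid), value in zip(LAYOUT.items(), record):
        if valid is not None and not valid[0] <= value <= valid[1]:
            problems.append(f"{name}={value} outside {valid[0]}-{valid[1]}")
    return problems


# Invented records for illustration only.
print(check_record([1, 5, 6, 4, 5, 7, 3]))  # every code within range
print(check_record([2, 8, 6, 4, 5, 7, 0]))  # preference and income out of range
```

Applying such a check to every row also confirms that the file has the expected number of columns, so a data file that passes with 20 records is a 20 × 7 matrix.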