Please evaluate how accurately each word or phrase describes each of the department stores. Select a plus number for the phrases you think describe the store accurately. The more accurately you think the phrase describes the store, the larger the plus number you should choose. You should select a minus number for phrases you think do not describe it accurately. The less accurately you think the phrase describes the store, the larger the minus number you should choose. You can select any number, from +5 for phrases you think are very accurate, to -5 for phrases you think are very inaccurate.
The data obtained by using a Stapel scale are generally treated as interval and can be analyzed in the same way as semantic differential data. The Stapel scale produces results similar to the semantic differential. The Stapel scale's advantages are that it does not require a pretest of the adjectives or phrases to ensure true bipolarity, and it can be administered over the telephone. However, some researchers believe the Stapel scale is confusing and difficult to apply. Of the three itemized rating scales considered, the Stapel scale is used least. However, this scale merits more attention than it has received.
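Because Stapel ratings are generally treated as interval data, a store's profile can be summarized by the mean rating of each phrase. The following Python sketch illustrates this under stated assumptions: the phrases and ratings are invented, the function names are hypothetical, and the scale is assumed to consist of integers from -5 to +5 with no zero point.

```python
from statistics import mean

# Hypothetical ratings of one store by five respondents (illustrative data only).
ratings = {
    "high quality": [4, 3, 5, 2, 4],
    "poor service": [-3, -2, -4, 1, -3],
}

def is_valid_stapel(score):
    """Assumed scale form: an integer from -5 to +5, excluding 0."""
    return isinstance(score, int) and -5 <= score <= 5 and score != 0

def stapel_profile(ratings_by_phrase):
    """Return the mean rating per phrase, treating the data as interval."""
    profile = {}
    for phrase, scores in ratings_by_phrase.items():
        invalid = [s for s in scores if not is_valid_stapel(s)]
        if invalid:
            raise ValueError(f"invalid Stapel ratings for {phrase!r}: {invalid}")
        profile[phrase] = mean(scores)
    return profile

profile = stapel_profile(ratings)
for phrase, avg in profile.items():
    print(f"{phrase}: {avg:+.1f}")
```

A positive mean suggests respondents find the phrase accurate for the store, and a negative mean suggests they find it inaccurate. The same summary could be computed for each store and the profiles compared.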
Noncomparative Itemized Rating Scale Decisions
As is evident from the discussion so far, noncomparative itemized rating scales need not be used as originally proposed but can take many different forms. The researcher must make six major decisions when constructing any of these scales.
1. The number of scale categories to use
2. Balanced versus unbalanced scale
3. Odd or even number of categories
4. Forced versus nonforced choice
5. The nature and degree of the verbal description
6. The physical form of the scale
Number of Scale Categories
Two conflicting considerations are involved in deciding the number of scale categories. The greater the number of scale categories, the finer the discrimination among stimulus objects that is possible. On the other hand, most respondents cannot handle more than a few categories. Traditional guidelines suggest that the appropriate number of categories should be seven plus or minus two: between five and nine. Yet there is no single optimal number of categories. Several factors should be taken into account in deciding on the number of categories.
If the respondents are interested in the scaling task and are knowledgeable about the objects, a larger number of categories may be employed. On the other hand, if the respondents are not very knowledgeable or involved with the task, fewer categories should be used. Likewise, the nature of the objects is also relevant. Some objects do not lend themselves to fine discrimination, so a small number of categories is sufficient. Another important factor is the mode of data collection. If telephone interviews are involved, many categories may confuse the respondents. Likewise, space limitations may restrict the number of categories in mail questionnaires.
Balanced Versus Unbalanced Scales
In a balanced scale, the numbers of favorable and unfavorable categories are equal; in an unbalanced scale, they are unequal. Examples of balanced and unbalanced scales are given in Figure 9.1. In general, the scale should be balanced in order to obtain objective data. However, if the distribution of responses is likely to be skewed, either positively or negatively, an unbalanced scale with more categories in the direction of skewness may be appropriate. If an unbalanced scale is used, the nature and degree of unbalance in the scale should be taken into account in data analysis.
Odd or Even Number of Categories
With an odd number of categories, the middle scale position is generally designated as neutral or impartial. The presence, position, and labeling of a neutral category can have a significant influence on the response. The Likert scale is a balanced rating scale with an odd number of categories and a neutral point.
The decision to use an odd or even number of categories depends on whether some of the respondents may be neutral on the response being measured. If a neutral or indifferent response is possible from at least some of the respondents, an odd number of categories should be used. If, on the other hand, the researcher wants to force a response or believes that no neutral or indifferent response exists, a rating scale with an even number of categories should be used. A related issue is whether the scale should be forced or nonforced.
Forced Versus Nonforced Scales
On forced rating scales, the respondents are forced to express an opinion because a “no opinion” option is not provided. In such a case, respondents without an opinion may mark the middle scale position. If a sufficient proportion of the respondents do not have opinions on the topic, marking the middle position will distort measures of central tendency and variance. In situations where the respondents are expected to have no opinion, as opposed to simply being reluctant to disclose it, the accuracy of data may be improved by a nonforced scale that includes a “no opinion” category.
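The distortion described above can be seen in a small numerical sketch. The Python example below assumes a seven-point scale with midpoint 4, six respondents who hold an opinion, and four who do not. All values are invented for illustration, and `None` is used as a hypothetical code for a "no opinion" answer.

```python
from statistics import mean, pvariance

MIDPOINT = 4  # middle position of an assumed seven-point scale

# Nonforced scale: respondents without an opinion choose "no opinion" (None).
nonforced = [6, 7, 6, 7, 6, 7, None, None, None, None]

# Forced scale: the same respondents, lacking a "no opinion" option,
# are assumed to mark the middle position instead.
forced = [MIDPOINT if r is None else r for r in nonforced]

def summarize(responses):
    """Mean and population variance of the substantive (non-None) answers."""
    valid = [r for r in responses if r is not None]
    return mean(valid), pvariance(valid)

nonforced_mean, nonforced_var = summarize(nonforced)
forced_mean, forced_var = summarize(forced)

print(f"nonforced: mean={nonforced_mean:.2f}, variance={nonforced_var:.2f}")
print(f"forced:    mean={forced_mean:.2f}, variance={forced_var:.2f}")
```

In this invented sample the forced scale pulls the mean from 6.5 toward the midpoint (5.5) and raises the variance from 0.25 to 1.65, even though no respondent's actual opinion differs between the two designs.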
Nature and Degree of Verbal Description
The nature and degree of verbal description associated with scale categories vary considerably and can have an effect on the responses. Scale categories may have verbal, numerical, or even pictorial descriptions. Furthermore, the researcher must decide whether to label every scale category, some scale categories, or only extreme scale categories. Surprisingly, providing a verbal description for each category may not improve the accuracy or reliability of the data. Yet an argument can be made for labeling all or many scale categories to reduce scale ambiguity. The category descriptions should be located as close to the response categories as possible.
Physical Form or Configuration
A number of options are available with respect to scale form or configuration. Scales can be presented vertically or horizontally. Categories can be expressed by boxes, discrete lines, or units on a continuum and may or may not have numbers assigned to them. If numerical values are used, they may be positive, negative, or both. Several possible configurations are presented in Figure 9.2.
1. Develop Likert, semantic differential, and Stapel scales for measuring customer satisfaction toward Sears.
2. Illustrate the six itemized rating scale decisions of Table 9.2 in the context of measuring customer satisfaction toward Sears.
A multi-item scale consists of multiple items, where an item is a single question or statement to be evaluated. The Likert, semantic differential, and Stapel scales presented earlier to measure attitudes toward Sears are examples of multi-item scales. Note that each of these scales has multiple items. The development of multi-item rating scales requires considerable technical expertise. Figure 9.4 is a paradigm for constructing multi-item scales. The researcher begins by developing the construct of interest. A construct is a specific type of concept that exists at a higher level of abstraction than do everyday concepts, such as brand loyalty, product involvement, attitude, satisfaction, and so forth. Next, the researcher must develop a theoretical definition of the construct that states the meaning of the central idea or concept of interest. For this, we need an underlying theory of the construct being measured. A theory is necessary not only for constructing the scale but also for interpreting the resulting scores. For example, brand loyalty may be defined as the consistent repurchase of a brand prompted by a favorable attitude toward the brand. The construct must be operationalized in a way that is consistent with the theoretical definition. The operational definition specifies which observable characteristics will be measured and the process of assigning value to the construct. For example, in the context of toothpaste purchases, consumers will be characterized as brand loyal if they exhibit a highly favorable attitude (top quartile) and have purchased the same brand on at least four of the last five purchase occasions.
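The toothpaste example of an operational definition can be expressed as a short decision rule. The Python sketch below is one possible reading, not a prescribed procedure. It assumes each consumer has a single attitude score toward the brand he or she buys most often, takes "top quartile" to mean a score above the sample's third quartile cut point, and uses invented data and hypothetical function names.

```python
from collections import Counter
from statistics import quantiles

# Hypothetical sample: consumer -> (attitude score, brands bought on last five occasions).
consumers = {
    "A": (7, ["X", "X", "X", "Y", "X"]),
    "B": (6, ["X", "Y", "X", "Y", "X"]),
    "C": (5, ["X", "X", "X", "X", "X"]),
    "D": (5, ["Y", "Y", "Z", "Y", "X"]),
    "E": (4, ["Z", "Z", "Z", "Z", "Y"]),
    "F": (4, ["X", "Y", "Z", "X", "Y"]),
    "G": (3, ["Y", "Y", "Y", "Y", "Y"]),
    "H": (2, ["X", "Z", "X", "Z", "Y"]),
}

def top_quartile_cutoff(scores):
    """Third quartile cut point of the sample's attitude scores."""
    return quantiles(scores, n=4)[2]

def is_brand_loyal(attitude, last_five, cutoff):
    """Apply both conditions: top-quartile attitude and >= 4 of 5 repeat purchases."""
    if len(last_five) != 5:
        raise ValueError("expected exactly five purchase occasions")
    _, repeat_count = Counter(last_five).most_common(1)[0]
    return attitude > cutoff and repeat_count >= 4

cutoff = top_quartile_cutoff([attitude for attitude, _ in consumers.values()])
loyal = [name for name, (attitude, purchases) in consumers.items()
         if is_brand_loyal(attitude, purchases, cutoff)]
print(cutoff, loyal)
```

In this sample only consumer A satisfies both conditions. Consumer C repurchases consistently but lacks a top-quartile attitude, and consumer B has a favorable attitude but bought the same brand on only three of five occasions, so neither meets the definition.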
The next step is to generate an initial pool of scale items. Typically, this is done based on theory, analysis of secondary data, and qualitative research. From this pool, a reduced set of potential scale items is generated by the judgment of the researcher and other knowledgeable individuals. Some qualitative criterion is adopted to aid their judgment. The reduced set of items is still too large to constitute a scale. Thus, further reduction is achieved in a quantitative manner.
Data are collected on the reduced set of potential scale items from a large pretest sample of respondents. The data are analyzed using techniques such as correlations, exploratory factor analysis, confirmatory factor analysis, cluster analysis, discriminant analysis, and statistical tests discussed later in this book. As a result of these statistical analyses, several more items are eliminated, resulting in a purified scale. The purified scale is evaluated for reliability and validity by collecting more data from a different sample (see the following section). On the basis of these assessments, a set of scale items is selected. As can be seen from Figure 9.4, the scale development process is an iterative one with several feedback loops.
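As one illustration of how correlations can be used to eliminate items, the Python sketch below computes a corrected item-total correlation for each item (the correlation between the item and the sum of the remaining items) and drops items that fall below a cutoff. This is only one of the techniques listed above and is not presented as the procedure the text prescribes. The pretest data, the item names, and the 0.3 cutoff are all assumptions chosen for illustration.

```python
from math import sqrt

# Hypothetical pretest data: each row is one respondent's ratings on four items.
ITEMS = ["q1", "q2", "q3", "q4"]
responses = [
    [1, 2, 1, 3],
    [2, 2, 3, 1],
    [3, 4, 3, 4],
    [4, 4, 5, 1],
    [5, 5, 5, 3],
]

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def corrected_item_total(rows, index):
    """Correlate one item with the sum of all the other items."""
    item = [row[index] for row in rows]
    rest = [sum(row) - row[index] for row in rows]
    return pearson(item, rest)

CUTOFF = 0.3  # illustrative threshold, an assumption rather than a fixed rule

correlations = {name: corrected_item_total(responses, i)
                for i, name in enumerate(ITEMS)}
kept = [name for name in ITEMS if correlations[name] >= CUTOFF]
print({name: round(r, 2) for name, r in correlations.items()})
print("retained items:", kept)
```

Here the fourth item barely correlates with the rest of the pool and is eliminated, leaving a three-item set. In practice the retained items would then be assessed for reliability and validity on a different sample, as described above.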