Viewpoints & Discussion:
Slider Scales and Web-Based Surveys: A Cautionary Note
Slider scales (or visual analog scales) are becoming increasingly popular in web-based surveys. These psychometric response scales use Likert-type items in which respondents select a point along a line labelled with bipolar endpoints to indicate their preference for, or agreement with, a statement. This article is a response to an article that appeared earlier in the Journal of Research Practice, in which the writers treated Likert-type data as interval-level measurements. The authors of this response article contend that it is erroneous to suppose that the data obtained by visual analog scales are quantitative and continuous (on an interval scale) when, in fact, these are discrete data (on an ordinal scale), which are essentially qualitative. Qualitative data are not quantitative, and no means of analysis can make them so.
Index Terms: data analysis; Likert; Likert-type; measurement; measurement scales; ordinal data; slider scale; visual analog scale
Suggested Citation: Kero, P., & Lee, D. (2015). Slider scales and web-based surveys: A cautionary note. Journal of Research Practice, 11(1), Article V1. Retrieved from http://jrp.icaap.org/index.php/jrp/article/view/513/414
Note. This is a response to an article published in this journal:
Roster, C. A., Lucianetti, L., & Albaum, G. (2015). Exploring slider vs. categorical response formats in web-based surveys. Journal of Research Practice, 11(1), Article D1.
We wish to express our appreciation to authors Roster, Lucianetti, and Albaum (2015) for bringing an important issue to the attention of readers of the Journal of Research Practice. Their research question, “Do the use of sliders to express numerical amounts and the use of the more traditional radio-button scales give the same, or different, measurements?” deserves a thoughtful response. Visual analog scales (VAS) are becoming increasingly popular; their screen presence is deceptively elegant and seemingly deserving of sophisticated parametric analysis (i.e., statistical tests involving assumptions about the distribution of the data). We maintain that, despite the inherent appeal of VAS items, nonparametric analysis is most fitting for this type of data.
Consider the first part of the question: “Do the use of sliders to express numerical amounts . . .?” This question is really about the type of data obtained by VAS items and the corresponding radio-button items. At the heart of the matter are four types of data that have been well established over the years. In his seminal work, On the Theory of Scales of Measurement, Stevens (1946) classifies measurement into four distinct scales: nominal, ordinal, interval, and ratio. These scales appear in most statistics textbooks and are commonly accepted by researchers concerned with measurement.
Nominal scales merely label variables (e.g., gender, eye color, political affiliation) whose categories are mutually exclusive. While numbers may be used to label responses, these numbers have no numerical significance. These are discrete data with finite values. We can count and determine a mode (as a measure of central tendency), but that is all we can do with nominal scale data.
Ordinal scales also have limited numerical significance because, as the name implies, the scale “orders” data: we know what is first, second, third, and so on, but we have no sense of the interval between each value. Indeed, in all probability the intervals between values are not equal. Measures of central tendency for ordinal data are limited to the mode or median. Nominal and ordinal scales are considered “qualitative” in nature, a point we shall return to later.
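To make the distinction concrete, the short sketch below (ours, not the original authors’; the data are invented for illustration) shows the summary statistics that remain defensible at each of these two levels of measurement.

    from statistics import mode, median

    # Nominal data: labels only; the mode is the only meaningful measure of
    # central tendency.
    eye_color = ["brown", "blue", "brown", "green", "brown", "blue"]
    print("Modal eye color:", mode(eye_color))      # brown

    # Ordinal data: the codes carry order but not distance; mode and median
    # are defensible, whereas a mean would assume equal intervals between codes.
    agreement = [1, 2, 2, 3, 5, 4, 2, 5, 1, 3]      # 1 = strongly disagree ... 5 = strongly agree
    print("Median agreement:", median(agreement))   # 2.5
    print("Modal agreement:", mode(agreement))      # 2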
We contend that VAS items are deceptive: while they appear to be on an interval scale, they are characteristically ordinal. Roster et al. (2015, Figure 2) compare five-point survey questions in which respondents report sensitivity using a slider scale (1 = Not at all sensitive ---------- 5 = Extremely sensitive) to a corresponding survey item using radio buttons. Certainly the label “Extremely sensitive” represents greater sensitivity than “Not at all sensitive,” but the difference, or interval, between these labels cannot be quantified. The interval between radio buttons #2 and #3 may be greater for Respondent A than it is for Respondent B. We cannot say with certainty that respondents selecting #4 report four times more sensitivity than those selecting #1; we simply do not know, as these values are wholly subjective and left to each respondent’s interpretation. A respondent using a VAS item might stop at 2.5 while another stops at 4.3. The response seems precise, but the argument is the same. Data from VAS slider items and corresponding radio-button items are either categorical or, at best, ordinal scale data, nothing else.
Gardner and Martin (2007) describe Likert scales as inherently “lumpy” because of respondents’ tendency to bunch their responses at the extremes of a Likert scale item. Albaum (1997) outlines three “form-related” errors associated with different item formats: leniency (the tendency to rate too low or too high), central tendency (a reluctance to rate at the extremes), and proximity (the tendency to rate similarly on questions that appear close to one another in the survey). We contend that these form-related errors are found in VAS items as well as radio-button response items, which only underscores the subjective nature of any Likert-scale item.
Why is all this so important? Because the type of data determines the method of analysis. Ordinal scale data do not lend themselves to analysis of means and standard deviations but rather to modes, medians, and quartiles. This point is not lost on the authors, as they report χ² values on sample demographic variables by group; all demographic characteristics are categorical, and some are ordinal scale data requiring nonparametric analysis.
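For readers less familiar with this kind of analysis, the sketch below (our illustration, with an invented contingency table rather than the authors’ data) shows the form such a chi-square test of independence takes for a categorical demographic variable compared across the two response-format groups.

    import numpy as np
    from scipy.stats import chi2_contingency

    # Hypothetical counts of a categorical demographic variable by group.
    #                     Female  Male
    observed = np.array([[48,     52],    # slider-scale group
                         [55,     45]])   # radio-button group

    chi2, p, dof, expected = chi2_contingency(observed)
    print(f"chi-square = {chi2:.3f}, df = {dof}, p = {p:.3f}")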
This brings us to the second part of their research question: “Do . . . the use of the more traditional radio-button scales [compared to slider scales] give the same, or different, measurements?” The authors use survey item means and t-tests to compare slider scales and radio-button scales, and this is where we part ways. We contend that these are ordinal scale data, not interval or ratio scale data.
Table 2 in Roster et al.’s (2015) article contains two columns with item means for each group computed, in many instances, to the thousandths place. Jöreskog (1994) writes, “Ordinal variables are not continuous variables and should not be treated as if they are. Ordinal variables do not have origins or units of measurements. Means, variances, and covariances of ordinal variables have no meaning” (p. 383). How, then, should we interpret the item means contained in the article, in particular those means that are computed to the thousandths place? Can the concept of sensitivity truly be measured with such precision?
At the very least, the authors should consider presenting the percentage or frequency of responses for each item and let readers decide how to interpret the results. Averaging smooths or blurs ordinal data, so the results are no longer restricted to whole numbers. We are reminded of a quote by Justice Louis Brandeis: “I abhor averages. I like the individual case. A man may have six meals one day and none the next, making an average of three meals per day, that is not a good way to live.”
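As a simple illustration of the kind of summary we have in mind (the responses below are invented, not drawn from the authors’ data), a frequency and percentage table for a single item might be produced as follows.

    import pandas as pd

    # Hypothetical responses to one item: 1 = Not at all sensitive ... 5 = Extremely sensitive
    responses = pd.Series([1, 2, 2, 3, 3, 3, 4, 5, 5, 2], name="sensitivity")

    counts = responses.value_counts().sort_index()
    percents = (counts / len(responses) * 100).round(1)
    print(pd.DataFrame({"frequency": counts, "percent": percents}))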
Ordinal scale data obtained from VAS slider items and radio-button items can be described in frequency tables or by modes. In this particular inquiry, a nonparametric test such as the Mann-Whitney U test could be used in lieu of a t-test to determine whether the summed ranks for slider scales and radio-button scales differ significantly. The fact that these results would be limited to the survey sample does not materially degrade the authors’ investigation.
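A minimal sketch of that alternative analysis, again with invented ratings rather than the authors’ data, could look like this.

    from scipy.stats import mannwhitneyu

    # Hypothetical ordinal responses to the same item from the two groups.
    slider_ratings = [2, 3, 3, 4, 5, 2, 4, 3, 5, 4]
    radio_ratings  = [1, 2, 3, 3, 2, 4, 2, 3, 1, 3]

    # Rank-based comparison of the two groups (no interval-scale assumption).
    u_stat, p_value = mannwhitneyu(slider_ratings, radio_ratings, alternative="two-sided")
    print(f"U = {u_stat}, p = {p_value:.3f}")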
To summarize, visual analog scales are becoming increasingly popular as most web-based survey systems offer very engaging response formats. It is therefore highly likely that more investigators will employ powerful parametric tests to analyze these data. We respectfully caution, however, that it is easy to erroneously suppose that the data obtained by these methods are quantitative and continuous by nature (interval scale or ratio scale) when, in fact, these are discrete data (nominal scale or ordinal scale) that are essentially qualitative. Simply put, qualitative data are not quantitative, and no means of analysis can make them so.
Albaum, G. (1997). The Likert scale revisited: An alternate version. Journal of the Market Research Society, 39(2), 331-348.
Gardner, H. J., & Martin, M. A. (2007). Analyzing ordinal scales in studies of virtual environments: Likert or lump it! Presence, 16(4), 439-446.
Jöreskog, K. G. (1994). On the estimation of polychoric correlations and their asymptotic covariance matrix. Psychometrika, 59(3), 381-389.
Roster, C. A., Lucianetti, L., & Albaum, G. (2015). Exploring slider vs. categorical response formats in web-based surveys. Journal of Research Practice, 11(1), Article D1. Retrieved from http://jrp.icaap.org/index.php/jrp/article/view/509/413
Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103(2684), 677-680.
Published 31 July 2015
Copyright © 2015 Journal of Research Practice and the authors