Is dissatisfied the opposite of satisfied?
Is discourage the opposite of recommend?
And is not recommending the same as recommending against?
When computing the Net Promoter Score (NPS), people who rate the 0–10 likelihood-to-recommend item high (a 9 or 10) are categorized as promoters, and those who give low ratings (6 or lower) are categorized as detractors.
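The NPS itself is the percentage of promoters minus the percentage of detractors. As a minimal sketch of that computation (the function name and sample ratings are ours, for illustration only):

```python
def nps(ratings):
    """Compute the Net Promoter Score from 0-10 likelihood-to-recommend ratings.

    Promoters rate 9 or 10; detractors rate 0 through 6; passives (7-8)
    count toward the total but not the numerator. Returns a score
    between -100 and +100.
    """
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)

hypothetical_responses = [10, 9, 9, 8, 7, 6, 3, 0]
print(nps(hypothetical_responses))  # 3 promoters, 3 detractors of 8 -> 0.0
```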
The term detractor suggests (intentionally) that these customers play a role in spreading negative word of mouth by discouraging other people from making purchases. Hearing bad things (negative word of mouth) may have as large an impact (if not larger) on company growth as positive word of mouth.
But is being less likely to recommend or even saying negative things about a brand or product the same as actively discouraging others from purchasing or using?
We are generally a little skeptical of bold claims and like to see the data or attempt to replicate findings ourselves. In earlier research, we examined the claim that detractors account for 80% of negative word of mouth. We found most (90%) of the negative comments in our study were, indeed, associated with detractors.
However, as noted in that analysis, it’s important to differentiate between the majority of detractors saying negative things and the majority of negative comments coming from detractors. Not all detractors say bad things. In fact, in our analysis (and in others), some positive comments come from detractors and some negative comments come from promoters. However, this doesn’t negate the value of knowing where more negative comments are likely to come from.
But making a negative comment isn’t necessarily the same as actively discouraging others from engaging with a brand (regardless of where the person was on the likelihood to recommend scale, high or low). For example, here’s a negative comment about a bad experience on United Airlines from our earlier analysis of detractors:
“I took a flight back from Boston to San Francisco 2 weeks ago on United. It was so terrible. My seat was tiny and the flight attendants were rude. It also took forever to board and deboard.”
While hardly an endorsement of the airline, does this explicitly discourage others from using United?
Comments that suggest actual discouragement (for example, “I would tell my friends to stay away!”) are less common in our data.
In this article, we review data in the relevant literature to understand whether the Net Promoter Score adequately measures discouragement.
Review of Relevant Published Studies
We identified five relevant papers, four from the same group of authors led by Robert East. All the papers investigated the frequency of word-of-mouth (WOM) behaviors that are usually defined as statements that advise others to make purchases (positive word-of-mouth, PWOM) or to avoid purchases (negative word-of-mouth, NWOM) from a company or brand.
Review of the East et al. Papers
In The Relative Incidence of Positive and Negative Word of Mouth (2007), East, Hammond, and Wright analyzed 15 categories (e.g., cars, ISPs, and dentists) with responses from a few thousand participants (on average, 153 respondents per brand, collected in the UK between 2001 and 2003). Participants were asked how many times they had either recommended or advised against using a brand.
They reported that 78% of PWOM was directed at the participant’s main/current brand, whereas 77% of NWOM was directed toward a former brand. Thus, as a rough approximation, three-fourths of PWOM and one-fourth of NWOM are about the main brand.
After the publication of the 2007 paper, East presented that research at the 2008 ANZMAC conference (Measurement Deficiencies in the Net Promoter Score), augmented with some additional data. With the additional data, their estimates of PWOM and NWOM for the current brand changed slightly to 72% and 25%—still the same approximately three-fourths to one-fourth ratio. He also provided some early results from a study that was published later in 2008, discussed next.
In Measuring the Impact of Positive and Negative Word of Mouth on Brand Purchase Probability (East, Hammond, & Lomax, 2008), the authors conducted 11 new in-person surveys in the UK between 2005 and 2007, with 2,544 respondents answering questions about one or two categories of purchases (e.g., cell phones, credit cards, restaurants). Respondents were asked whether they had received advice about any of the brands listed, whether that advice was positive or negative, and whether it affected their brand choice. The survey used the 11-point Juster Scale to assess purchase intention before and after the WOM.
Technical note: Like the standard item for likelihood to recommend, the Juster scale has eleven response options from 0 to 10. Unlike the typical unlabeled response options of likelihood to recommend (except for its endpoints), the Juster scale is fully labeled with probability statements (e.g., 0: No chance, almost no chance [1 in 100]; 1: Very slight possibility [1 in 10]; … 9: Almost sure [9 in 10]; 10: Certain, practically certain [99 in 100]). Because likelihood to recommend and Juster scale formats differ in ways that tend to be more cosmetic than substantial (e.g., full versus endpoint labeling of response options), we suspect the format differences likely have little effect on respondent behavior (but we have not yet tested this).
Overall, 64% of respondents claimed that PWOM and 48% claimed that NWOM affected their decisions. When impact was measured as the shift in purchase probability, PWOM produced a mean shift of 0.20 and NWOM a mean shift of −0.11. This indicates that PWOM is more influential than NWOM, but NWOM is still influential and worth measuring. Because most NWOM against a brand is produced by consumers who are not current users, the authors concluded that a major weakness of typical NPS measurement is that companies ask the question only of their current customers.
In a follow-up paper published three years later (The NPS and the ACSI: A Critique and Alternative Metric), East, Romaniuk, and Lomax (2011) argued that neither the NPS nor the American Customer Satisfaction Index (ACSI) adequately measures NWOM because ex-customers and never-customers aren’t sampled in their methodologies. They further criticized the NPS, reasoning that the intention to recommend likely has less influence on future purchase behaviors than the memory of having received a recommendation for or against a brand.
The authors delivered surveys to homes in the UK between 2007 and 2008 and received 2,254 usable responses. Respondents reported whether they had given or received positive and negative advice across several categories including grocery stores, banks, luxury brands, and cell phones. Respondents completed Juster scales to measure behavioral likelihoods and a version of the three-item ACSI to measure satisfaction.
The authors totaled the negative advice reported given and then determined what percent came from detractors on their NPS-like Juster scale and ACSI-like scale (see Table 1).
| Category | % of NWOM from NPS-like detractors | % of NWOM from ACSI-like detractors |
| --- | --- | --- |
| Main Supermarket | | |
| Main Coffee Shop | | |
| Skin Care Products | | |
Table 1: Percent of negative advice given by NPS-like detractors and ACSI-like detractors for their main brands in three categories.
Similar to the authors’ earlier published findings, NPS detractors accounted for a minority of the total NWOM across used and unused brands (31%). The authors also showed the NPS correlated highly with the ACSI (which we also found in our earlier analysis of CSAT and NPS), with the ACSI accounting for a similar amount of NWOM (28%). This suggests the inability to fully account for NWOM has less to do with the measure used and more to do with who is measured (the sampling strategy).
Review of the Schneider et al. Manuscript
In an unpublished manuscript (Measuring Customer Satisfaction and Loyalty: Improving the ‘Net-Promoter’ Score) by a different set of researchers (Schneider, Berent, Thomas, & Krosnick, 2008), the authors conducted two studies in which they manipulated rating scale labels for likelihood-to-recommend items. Although it was not the focus of their research, they did measure the association between the Net Promoter Score (standard likelihood-to-recommend item) and stated past positive and negative recommendations.
As part of their research, Schneider et al. asked 4,883 respondents questions about eight brands (automotive manufacturers and airlines), also asking whether they were familiar with the brands and whether they were customers. This research was highly exploratory, including over 150 regression analyses with varying outcomes, making it difficult to construct a comprehensive narrative that satisfactorily accounts for all the results.
Despite this, we were especially intrigued by the comparison of the standard unipolar likelihood-to-recommend scale with a 7-point bipolar version designed to let respondents indicate the extent to which they would recommend for or against purchasing from a brand (see Figure 1).
Figure 1: Standard unipolar and Schneider et al. (2008) bipolar items for assessing likelihood to recommend (created in our MUiQ® platform from text descriptions of the items).
The key regression results for these two items appear in Table 2. Before including variables in regression analyses, Schneider et al. standardized all values to a 0-1 scale where 0 was the lowest possible value for a rating scale and 1 was the highest possible value, permitting interpretation of regression weights (the cells in Table 2) as measures of the strength of the regression model.
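The 0-1 standardization Schneider et al. describe is an ordinary min-max rescaling over each scale’s possible (not observed) range. A brief sketch of that transformation (function and variable names are ours, not from the manuscript):

```python
def rescale(value, scale_min, scale_max):
    """Rescale a rating so the lowest possible scale point maps to 0
    and the highest maps to 1 (min-max over the scale's possible range,
    not the observed range of responses)."""
    return (value - scale_min) / (scale_max - scale_min)

print(rescale(9, 0, 10))  # a 9 on the 0-10 unipolar item -> 0.9
print(rescale(5, 1, 7))   # a 5 on the 7-point bipolar item -> ~0.667
```

Because every predictor and outcome shares the same 0-1 range after rescaling, the regression weights can be compared directly as measures of predictive strength.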
Positive WOM

| Predictor | # Recommendations (All Respondents) | # Recommendations (Customers Only) | # Respondents Recommending (All Respondents) | # Respondents Recommending (Customers Only) |
| --- | --- | --- | --- | --- |
| Unipolar | | | | |
| Bipolar | | | | |

Negative WOM

| Predictor | # Recommendations (All Respondents) | # Recommendations (Customers Only) | # Respondents Recommending (All Respondents) | # Respondents Recommending (Customers Only) |
| --- | --- | --- | --- | --- |
| Unipolar | | | | |
| Bipolar | | | | |
Table 2: Regression weights for prediction of numbers of recommendations and numbers of respondents making recommendations, both positive and negative, for all respondents and those who were brand customers. Larger absolute values suggest stronger predictive abilities.
The table shows the results for regressions modeling the predictive strength of each item (unipolar and bipolar) for PWOM (number of positive recommendations and number of people who made at least one positive recommendation for all respondents and for respondents who were brand customers) and NWOM (number of negative recommendations and number of people who made at least one negative recommendation for all respondents and for respondents who were brand customers). The standard unipolar scale was as good as or better than the bipolar scale when modeling PWOM, especially for the Customers Only condition. In contrast, the bipolar scale was as good as or better than the unipolar scale when modeling NWOM, especially for the All Respondents condition.
Findings of Schneider et al. relevant to this review are:
- Likelihood to recommend was a consistently significant predictor of the number of positive recommendations either alone or in combination with predictive measures of liking and satisfaction.
- Likelihood to recommend was a statistically significant predictor of the number of negative recommendations in simple linear regression but was not significant when combined with liking and satisfaction in multiple regression.
- Likelihood to recommend (and therefore the NPS) may significantly predict negative recommendations, but probably not as well as a measure designed to capture whether people intend to recommend against (e.g., the bipolar recommendation scale or some other metric of discouragement).
Summary and Discussion
Our review of one unpublished and four published papers about the NPS’s ability to measure recommending against a brand (i.e., NWOM, discouragement) found:
NWOM is worth measuring. PWOM appears to be more prevalent and influential than NWOM, but NWOM is itself influential and should be measured.
Surveying only existing customers is problematic for assessing NWOM. Asking only existing customers will almost surely understate the percentage of people likely to discourage or recommend against. Proper measurement of the state of a brand requires surveying customers and noncustomers.
NPS is not necessarily the best measure of recommending against. Consistent with being based on a unipolar measure of likelihood to recommend that is a strong predictor of PWOM, NPS appears to properly measure encouragement/recommendation for. It also appears to significantly measure NWOM, but it may not be the best predictor.
A bipolar scale may better predict NWOM but at a loss of benchmarks. Although the bipolar scale better predicted NWOM, one of the desirable features of the NPS is its published benchmarks. Good benchmarks are not developed overnight; in most cases the process takes years (and even then, there is no guarantee of broad adoption).
Future research: How much might researchers gain by asking a discouragement question in addition to the standard likelihood to recommend? We’ll explore this in an upcoming study.
Cited Papers
East, R., Hammond, K., & Wright, M. (2007). The relative incidence of positive and negative word of mouth. International Journal of Research in Marketing, 24, 175–184.
East, R. (2008). Measurement deficiencies in the Net Promoter Score. Sydney, Australia: ANZMAC.
East, R., Hammond, K., & Lomax, W. (2008). Measuring the impact of positive and negative word of mouth on brand purchase probability. International Journal of Research in Marketing, 25, 215–224.
Schneider, D., Berent, M., Thomas, R., & Krosnick, J. (2008). Measuring customer satisfaction and loyalty: Improving the ‘Net-Promoter’ score. Unpublished manuscript.
East, R., Romaniuk, J., & Lomax, W. (2011). The NPS and the ACSI: A critique and an alternative metric. International Journal of Market Research, 53(3), 327–346.
