The Fallacy of the Net Promoter Score
Zaki, M., Kandeil, D. A., Neely, A., & McColl-Kennedy, J. R. (2016)
Zaki et al.’s working paper examines a longitudinal customer dataset from 2012–2015 that includes attitudinal data (NPS ratings and qualitative customer comments), behavioral data (transactional data), and demographic data across multiple touchpoints from 3,000 customers. The data come from a large international B2B asset-heavy service organization providing both products and services.
The authors used sentiment analysis algorithms to analyze the verbatim comments and categorize people as complainer, neutral, or satisfied.
The data came from monthly surveys that included questions on overall satisfaction, repurchase, referral, resource availability, responsiveness, communication, service completion duration, preparation, service quality, invoice timeliness, and invoice accuracy. Customers rated each question from 1 to 10, where 10 is “Very Satisfied” and 1 is “Very Dissatisfied.” The likelihood-to-recommend (LTR) question was also asked.
The authors aggregated product and service data to establish a link between likelihood to recommend and actual spending. Transactional data was converted into profitability scores using the recency, frequency, and monetary (RFM) model. Customers are ranked from 1 to 5 on each dimension: recency of their last transaction (customers who purchased more recently are more valuable and receive a 1), frequency (1 = most frequent), and monetary value (1 = spending the most, 5 = spending the least). These ranks are then combined by some algorithm into a total score, with higher scores indicating more valuable customers (recent, frequent, and high dollar).
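The RFM ranking described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the field names (`last_purchase`, `n_orders`, `spend`) are hypothetical, and because the combining algorithm is not specified, the sketch assumes each rank is inverted (6 minus rank) and summed so that a higher total means a more valuable customer.

```python
from datetime import date

def quintile_ranks(values, best_is_high):
    """Rank each value from 1 (best fifth) to 5 (worst fifth).

    Ties are broken by input order; a real analysis would need a tie rule.
    """
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=best_is_high)
    ranks = [0] * len(values)
    for position, idx in enumerate(order):
        ranks[idx] = position * 5 // len(values) + 1
    return ranks

def rfm_scores(customers, today):
    """Return one total RFM score per customer (higher = more valuable).

    customers: list of dicts with hypothetical keys
    'last_purchase' (date), 'n_orders' (int), 'spend' (number).
    """
    days_since = [(today - c["last_purchase"]).days for c in customers]
    r = quintile_ranks(days_since, best_is_high=False)  # fewer days since purchase = rank 1
    f = quintile_ranks([c["n_orders"] for c in customers], best_is_high=True)
    m = quintile_ranks([c["spend"] for c in customers], best_is_high=True)
    # ASSUMPTION: the combining rule is not given in the summary; here each
    # rank is inverted (6 - rank) and summed, giving totals from 3 to 15.
    return [(6 - ri) + (6 - fi) + (6 - mi) for ri, fi, mi in zip(r, f, m)]
```

Under this assumed rule, a customer ranked 1 on all three dimensions scores 15 and a customer ranked 5 on all three scores 3.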
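The authors then used a k-means cluster analysis to create 11 groups (one for each point on the 0-to-10 LTR scale) and then computed the average RFM score for each cluster. Note that the authors made the passive category include 6, 7, and 8, whereas the standard NPS definition treats only 7 and 8 as passive and counts a 6 as a detractor.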
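To make the effect of the passive cutoff concrete, the sketch below classifies an LTR rating under either definition and averages RFM totals per rating. It is a simplified stand-in: it groups customers directly by their LTR rating rather than running k-means, and the function names and the `passive_range` parameter are hypothetical.

```python
from collections import defaultdict

def nps_category(ltr, passive_range=(7, 8)):
    """Map a 0-10 LTR rating to 'detractor', 'passive', or 'promoter'.

    The default (7, 8) is the standard NPS definition; pass (6, 8)
    to mimic the wider passive band described in this summary.
    """
    low, high = passive_range
    if ltr > high:
        return "promoter"
    if ltr >= low:
        return "passive"
    return "detractor"

def mean_rfm_by_ltr(records):
    """records: iterable of (ltr, rfm_total) pairs -> {ltr: mean rfm_total}.

    NOTE: this is a plain group-by on the rating, not a k-means clustering.
    """
    buckets = defaultdict(list)
    for ltr, rfm in records:
        buckets[ltr].append(rfm)
    return {ltr: sum(vals) / len(vals) for ltr, vals in sorted(buckets.items())}
```

A rating of 6 is the only one whose label changes between the two definitions: `nps_category(6)` gives a detractor, while `nps_category(6, passive_range=(6, 8))` gives a passive.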
The authors then identified active customers as those who had transacted within the past 88 days, the average interval between transactions. Customers who exceeded 88 days were classified as churners; the rest were classified as loyal. Interestingly, this binary split (active and loyal versus churned and not loyal) was the only loyalty classification the authors used. It is the outcome they expected the NPS to predict, and it conflicts with other definitions of loyalty.
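The 88-day rule amounts to a single threshold comparison. A minimal sketch, assuming each customer's last transaction date is known and treating exactly 88 days as still active (the function and parameter names are hypothetical):

```python
from datetime import date, timedelta

CHURN_THRESHOLD_DAYS = 88  # threshold reported in the summary above

def loyalty_status(last_transaction, as_of, threshold_days=CHURN_THRESHOLD_DAYS):
    """Label a customer 'loyal' if they transacted within the threshold, else 'churner'."""
    days_inactive = (as_of - last_transaction).days
    # ASSUMPTION: "within 88 days" is read as inclusive of day 88.
    return "loyal" if days_inactive <= threshold_days else "churner"

as_of = date(2015, 12, 31)
print(loyalty_status(as_of - timedelta(days=88), as_of))  # loyal
print(loyalty_status(as_of - timedelta(days=89), as_of))  # churner
```

Note how little the label captures: a customer one day past the threshold is treated the same as one who has not purchased in years.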
Verbatim comments were coded as complaint, compliment, or suggestion.
Three thousand customers were surveyed over three years: 70% promoter, 25% passive, and 5% detractor. The authors examined only customers who changed their NPS over the three-year period, either from promoter to passive/detractor or from detractor/passive to promoter.
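For reference, the Net Promoter Score is the percentage of promoters minus the percentage of detractors. The sketch below applies that standard formula to the split reported above; the customer counts are derived from the stated percentages of 3,000 customers and are illustrative only.

```python
def net_promoter_score(n_promoters, n_passives, n_detractors):
    """NPS = % promoters - % detractors, on a -100 to +100 scale."""
    total = n_promoters + n_passives + n_detractors
    if total == 0:
        raise ValueError("at least one respondent is required")
    return 100 * (n_promoters - n_detractors) / total

# 70% / 25% / 5% of 3,000 customers (counts derived from the percentages)
print(net_promoter_score(2100, 750, 150))  # 65.0
```

Passives count toward the total but contribute to neither term, which is why the placement of a rating of 6 changes the score.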
The company has two types of customers: those who have a maintenance contract with the company, referred to as Customer Service Agreement (CSA) customers; and those who deal in a noncontractual setting, referred to as Product Support (PS) customers. The analysis was based only on PS customers.
The authors then compared the NPS classifications to the RFM categories to see how well customers’ spending patterns matched their NPS designations. The authors felt that customers who reduced their spending or purchase frequency over the three-year period should be classified as detractors. Reichheld’s definition of a detractor, however, is someone who spreads negative word of mouth.
The authors argued that organizations should link customers’ self-proclaimed referral intentions and their actual spending patterns by comparing the NPS rating with customer transactional data to have a multi-dimensional picture. The authors did not compare the NPS performance to satisfaction performance, so their argument is to use more than just attitudinal data and also look at purchase data.
While 70% of customers were promoters, the authors found that only 54% were loyal under their definition (still buying from the company or using its services), and 46% were churners. The authors take this as evidence that the NPS metric misinforms managers, diverting them away from marketing actions, because organizations rely on a sample of respondents’ memories of a service transaction.
The text analysis categorized respondents by verbatim comments: 63% of the company’s customers were complainers, 22% were satisfied, and 15% were neutral.
It’s interesting that there was a large mismatch between the customers’ sentiments and their NPS classification. This would suggest that people who were giving 9s and 10s were actually unhappy.
The authors state that the study highlights the gross inadequacies of using the NPS as a single loyalty metric, thereby supporting the criticism of the NPS in the literature. Their conclusions and the title are a bit misleading. The article might better be titled “The Fallacy of Relying Only on Self-Reported Metrics and How to Analyze Verbatim Responses.”
This study takes the unconventional approach of defining loyalty simply as being an active customer. It may be harder to generalize these findings unless a similar analysis is used.
Takeaway: Using data from one New Zealand company, the authors show that verbatim comments collected after the NPS question can help better identify customers who are likely to churn. The authors neither compare the NPS to other attitudinal metrics such as satisfaction nor correlate it with company growth.