You know that feeling, trying to find that one, perfect coffee machine on Amazon by scrolling through tons of reviews to find the one best suited to your needs? Think about it. If we as individual consumers are having difficulties browsing through the reviews to find those that add value, imagine the difficulty companies must have in analysing those large amounts of reviews.
This difficulty arises mainly from what are called the ‘4 V’s of Data’: volume, variety, velocity and veracity (Salehan and Kim, 2016). That’s where the paper ‘Predicting the performance of online consumer reviews: A sentiment mining approach to big data analytics’ by Salehan and Kim (2016) comes in. The paper looks at the predictors of both readership and helpfulness of online consumer reviews (OCR). Using different techniques, the paper aims to create an approach that companies can adopt to develop automated systems for sorting and classifying large amounts of OCR. Sounds exciting, doesn’t it?! Let’s have a look at how this works.
What this paper is about
Whereas previous literature focuses on the factors that determine the perceived helpfulness of a review, this paper takes a step back. It starts by considering the factors that determine the likelihood of a consumer paying attention to a review in the first place, since without reading a review you cannot determine its helpfulness. Hence, the research questions are as follows:
Research Question 1: Which factors determine the likelihood of a consumer paying attention to a review?
Research Question 2: Which factors determine the perceived helpfulness of a review?
To answer the research questions, the paper looks at a sample of 2616 Amazon reviews and considers several factors that the authors believe may affect review readership, helpfulness, or both. Readership is measured as the total number of votes (helpful and not helpful), whereas helpfulness is measured as the proportion of helpful votes out of total votes. Since I see no reason to bore you with detailed methodologies, I made a quick and easy-to-follow overview of the different factors the paper considers, using a random Amazon review as an example:
- Longevity is measured as the number of days since the review was created. It has a positive effect on readership, meaning older reviews are more likely to be read. Although this may sound counterintuitive, it could simply be due to the way Amazon sorts reviews: by default, users see the reviews with the most helpful votes first, unless they change the setting to show the most recent reviews first.
- Review and title sentiment are measured by conducting sentiment analysis, which scores the text depending on how emotional it is (either positive or negative). Both have a small, negative effect on helpfulness, which indicates that consumers perceive emotional content to be less rational and therefore less useful. These findings differ somewhat from previous research, which showed that reviews carrying a strong negative sentiment have a stronger impact on buyer behaviour than positive or neutral reviews.
- Title length has a small, negative effect on readership, meaning that reviews with longer titles are less likely to be read.
- Review length has a large, positive effect on both readership and helpfulness, meaning that longer reviews are read more and receive more helpfulness votes on average.
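To make the measures above concrete, here is a minimal Python sketch of how they could be computed for a single review. It is not the paper's implementation: the `Review` record, its field names and the example data are made up for illustration, and counting length in words is an assumption. The sentiment scores are left out, since they depend on the sentiment analysis tool used.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Review:
    # Hypothetical record layout; field names are illustrative, not from the paper.
    title: str
    text: str
    created: date
    helpful_votes: int
    total_votes: int


def review_measures(review: Review, today: date) -> dict:
    """Compute readership, helpfulness, longevity and length for one review."""
    # Helpfulness is undefined for a review nobody has voted on.
    if review.total_votes > 0:
        helpfulness = review.helpful_votes / review.total_votes
    else:
        helpfulness = None
    return {
        "readership": review.total_votes,  # helpful + not-helpful votes
        "helpfulness": helpfulness,        # share of helpful votes
        "longevity_days": (today - review.created).days,
        "title_length": len(review.title.split()),  # assumption: length in words
        "review_length": len(review.text.split()),  # assumption: length in words
    }


# Made-up example review
example = Review(
    title="Great coffee, noisy machine",
    text="Makes excellent espresso but the grinder is loud in the morning.",
    created=date(2018, 1, 1),
    helpful_votes=8,
    total_votes=10,
)
measures = review_measures(example, today=date(2018, 1, 31))
print(measures)
```

In this made-up case, 8 helpful votes out of 10 total votes give a helpfulness of 0.8 and a readership of 10.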
All of the findings outlined in the list above are statistically significant. Whereas previous research focussed mainly on the numerical rating and the length of the review, this paper looks at the textual information the review contains. This means that the practical implications are considerable. For example, the paper suggests that companies may use sentiment data to analyse the large amounts of OCR that are constantly produced on the Internet. The paper also shows the importance of the title: make it short and not too emotional. This is something e-commerce companies can guide their customers on when they write a review.
In my opinion, a major limitation of this paper is that it uses the number of ‘total votes’ as the number of times a review was read. I don’t know about you, but I certainly don’t hit the vote button every time I read a review. Hence, I think a different methodology might work better. For example, you could track customers as they move over a page, note how long they spend on a review, and count the review as ‘read’ if this time was anywhere between e.g. 20 and 50 seconds (since you don’t want to count people who simply left the page open).
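As a rough sketch of that idea, the snippet below counts a visit as a ‘read’ when the time spent on a review falls within a window. The function name `count_reads` and the sample dwell times are made up for illustration, and the 20 and 50 second bounds are just the example values from above, not validated thresholds.

```python
def count_reads(dwell_seconds, min_seconds=20, max_seconds=50):
    """Count visits whose time on the review lies within [min_seconds, max_seconds]."""
    return sum(1 for s in dwell_seconds if min_seconds <= s <= max_seconds)


# Made-up dwell times (in seconds) for one review
dwell_times = [3, 25, 48, 50, 600, 19.5, 20]
reads = count_reads(dwell_times)
print(reads)  # 4: the visits of 25, 48, 50 and 20 seconds
```

The very short visit (3 seconds) and the page left open (600 seconds) are both excluded, which is exactly the point of using a window instead of a single minimum.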
How about in practice?
This sounds great, but are there actually companies out there using similar approaches to make the lives of their customers easier? A company that does this very well is Coolblue. Their aim is to be the most customer-centric company in the Netherlands (Coolblue, 2018), and hence they go even further than described in the paper. Their product page contains a summary of the pros and cons of the product, so that you can see them at a glance. Whether these pros and cons are drawn from points frequently mentioned in customer reviews isn’t clear. Moreover, they ask customers to fill in pros and cons when writing a review, so that customers looking to buy don’t need to read through long, unstructured sentences. Lastly, they use review helpfulness to rank reviews according to relevance.
Salehan, M., & Kim, D. J. (2016). Predicting the performance of online consumer reviews: A sentiment mining approach to big data analytics. Decision Support Systems, 81, 30-40.
Coolblue. (2018). Yearbook 2017. Accessed via http://nieuws.coolblue.nl/jaarboek-2017/