Digital Innovation and Entrepreneurship

Discover a new model to detect the Amazon scourge of fake reviews

Researchers examined 260,000 reviews to find 12 common characteristics

It’s a familiar scenario. You’re planning a week away, a meal at a restaurant, or a new purchase. You want to make the right choice, so you check the reviews left by previous customers.

But can you really trust the glowing recommendations and horror stories you find online?

Several high-profile legal cases suggest not, exposing how some companies use fake reviews to boost their brand image, promote individual products, and tarnish their rivals’ reputations.

Chinese tech giant Alibaba sued a third-party seller in 2016 over a ‘brushing operation’, where customers paid for products and submitted positive reviews before recouping their money.

And in 2019, the US Federal Trade Commission launched a landmark case targeting phoney Amazon reviewers who left five-star recommendations for a weight-loss supplement that caused liver failure.

These are only the tip of the iceberg. Many more cases slip under the radar.

Companies like Amazon are struggling to keep up with the flood of fraudulent reviews, despite using algorithms to identify fakes.

Machine learning is simply not enough by itself. The technology needs a helping hand.

We analysed more than 260,000 real-world restaurant reviews collected from Yelp.com. This covered 5,044 restaurants across four US states over a five-year period.

In doing so, we identified 12 new characteristics shared by many fraudulent reviews. Six focused on the reviewers, while the remaining six targeted the language and content of the reviews themselves.

For example, fraudulent reviewers were likely to post uniformly extreme reviews designed to artificially enhance or damage a company's reputation. Honest reviewers, on the other hand, adopted a more balanced approach across their posts on different products and services.

Phoney reviewers were also more likely to use short-lived accounts and engage in flurries of activity before switching to a new identity, whereas genuine customers posted fewer reviews.
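
The reviewer-level signals described above can be illustrated with a small sketch. It is not the code used in the study: the function name `reviewer_signals`, the input format, and the three measures it returns are assumptions chosen only to show how account lifespan, bursts of activity, and uniformly extreme ratings might be turned into numbers.

```python
from datetime import date

def reviewer_signals(reviews):
    """Summarise one reviewer's history.

    reviews: list of (date, stars) tuples, stars from 1 to 5.
    Hypothetical helper for illustration, not the study's feature set.
    """
    dates = sorted(d for d, _ in reviews)
    ratings = [stars for _, stars in reviews]
    # Days between first and last review, floored at 1 to avoid dividing by zero.
    active_days = max((dates[-1] - dates[0]).days, 1)
    # Share of ratings at either extreme (one star or five stars).
    extreme_share = sum(stars in (1, 5) for stars in ratings) / len(ratings)
    return {
        "active_days": active_days,
        "reviews_per_day": len(reviews) / active_days,
        "extreme_share": extreme_share,
    }

# A short-lived account posting a flurry of extreme ratings.
burst = [(date(2020, 1, 1), 5), (date(2020, 1, 1), 5),
         (date(2020, 1, 2), 5), (date(2020, 1, 3), 1)]
# A long-lived account posting occasionally, with mixed ratings.
steady = [(date(2019, 3, 1), 4), (date(2019, 9, 15), 3), (date(2020, 6, 1), 5)]

print(reviewer_signals(burst))
print(reviewer_signals(steady))
```

In this toy data the first account posts far more reviews per day and only extreme ratings, which is the pattern the article associates with phoney reviewers. Any real thresholds would have to be learned from labelled data.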

Other tell-tale signs that helped to expose fraudulent reviews included:

Length: Five-star reviews are likely to be longer than one-star reviews, which in turn tend to be longer than two- or three-star reviews. However, fake reviews are likely to be shorter and less detailed. Batches of phoney reviews are also likely to contain a similar number of words.
Sentiment: Some companies hire professionals to write regular reviews. Their posts are often predictably positive or negative and out of keeping with the overall ratings for that product.
Language: Fraudulent posts are more likely to repeat particular words or patterns of phrases. They also tend to contain more misspellings, grammatical errors, verbs, and filler words.
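
As a rough illustration of the review-level signs listed above, the sketch below computes a word count, a simple repetition measure, and the gap between a review's star rating and the product's average rating. The function name `text_signals` and these three measures are assumptions made for the example. The study's actual features are not specified in this article.

```python
import re
from collections import Counter

def text_signals(text, stars, product_avg_stars):
    """Crude review-level features (hypothetical, for illustration only)."""
    # Lower-case word tokens; apostrophes are kept so "don't" stays one word.
    words = re.findall(r"[a-z']+", text.lower())
    distinct = len(Counter(words))
    # Fraction of words that repeat a word already used in the review.
    repeated_share = 1 - distinct / len(words) if words else 0.0
    return {
        "word_count": len(words),
        "repeated_share": repeated_share,
        # How far this rating sits from the product's overall rating.
        "rating_gap": abs(stars - product_avg_stars),
    }

print(text_signals("Great great food great service", stars=5, product_avg_stars=3.2))
```

A short review that repeats itself and sits well away from the product's overall rating would score high on the second and third measures, in line with the length, language, and sentiment signs described above.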

Using these characteristics, we trained our algorithm, M-SMOTE (modified synthetic minority over-sampling technique), to detect fake reviews more efficiently and accurately.
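
For readers unfamiliar with the underlying idea, the sketch below shows standard SMOTE, not our modified version, whose details are beyond the scope of this article. Fake reviews are typically a small minority of any dataset, so SMOTE creates extra synthetic minority examples by interpolating between a real minority example and one of its nearest minority neighbours. The function name `smote`, its parameters, and the toy feature values are assumptions for illustration.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points from minority-class feature vectors.

    Plain SMOTE interpolation only; needs at least two minority points.
    """
    rng = random.Random(seed)

    def dist2(a, b):
        # Squared Euclidean distance between two feature vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist2(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the line from base to neighbour
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

# Toy minority class: (reviews_per_day, extreme_share) for four fake accounts.
fakes = [(1.0, 0.9), (1.2, 1.0), (0.8, 0.95), (1.1, 0.85)]
for point in smote(fakes, n_new=6):
    print(point)
```

Each synthetic point lies on a line segment between two real minority examples, so the classifier sees a better-balanced training set without simply duplicating the fake reviews it already has.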

We repeated this approach on three more datasets – two from Amazon and one from Yelp. Each time, our model outperformed algorithms that did not use our feature engineering approach.

Further research is required to ensure our results translate to other major online retailers, such as eBay, TripAdvisor, Walmart, and Alibaba. The characteristics we identified may also need to evolve to keep up with fraudulent reviewers if they change their behaviour to avoid detection.

Nonetheless, our findings are significant, especially when you consider how much faith many online shoppers place in reviews.

One survey found 91 per cent of customers were more likely to use a firm after reading a positive review, while 82 per cent would avoid businesses based on negative feedback.

Crucially, three-quarters said they would trust online reviews just as much as a personal recommendation from a friend, family member, or colleague.

Our model offers the tantalising prospect of algorithms that are more accurate and effective when it comes to weeding out phoney feedback, leaving behind genuine reviews that deserve that level of trust.

And the benefits need not stop there. Social media platforms have become key conduits for conspiracy theories and fraudulent information, much of it disseminated and shared by bots.

Our novel machine-learning model could help to redesign the tools used to detect the automated accounts spreading fake news and reviews. If implemented, it could improve both revenue-generating opportunities and customer experience for digital platforms and businesses alike.

Our findings suggest that e-commerce platforms should encourage social interactions between customers and reviewers. Enabling such communication generates rich evidence of reviewers' behaviour, which can improve the detection of opinion spamming.

In an age of post-truth, where misinformation proliferates across the internet at lightning speed, what price do we place on posts we can trust?
