Is running text analytics on social media data worth the effort?

We’re often asked if we can analyse comments from Facebook and Twitter feeds and other social media sources. After all, it is customer feedback – and that’s our bag!

The answer is “Yes, of course we CAN”. On a technical level, we can take the comments, feed them into our text analysis engine, and provide clients with a file of data categorized by topic.

But SHOULD we be doing this? The real question is:

“Is running text analytics on Facebook and Twitter etc. content worth the effort?”


Over the last few years, we have undertaken a number of social media pilot studies for clients, all of whom were keen to see what value the data held before paying to have it analysed. Data was scraped from each client’s own Facebook and Twitter pages and the results were reviewed manually by Feedback Ferret.

Here’s what we found:

1. Too Fluffy

A few comments passed the human eyeball test, but many of these were just general statements and unlikely to generate any meaningful insight. If you plan to use social media data to make actionable customer experience improvements, you are looking in the wrong place.

2. Missing Context

In both Facebook and Twitter, there were lots of one- or two-word answers such as “thanks”, “congratulations” or “good job”, and many comments simply stated a brand name or outlet name, which tells you nothing from an insight perspective. A subject without any information is a non-starter.

3. Curated Content

Any really juicy customer issues were often taken offline (from public to private / direct message) by someone on the client side, to avoid airing too much dirty laundry in public. Your most actionable insights are not included in the data being analysed.

4. Duplication

There were lots of retweets with the same information in multiple comments. Although at first glance it looked like there were plenty of comments to analyse, many of these were duplicates. In one data set we analysed, there were 8,000 records and 4,500 of them were simply retweets of one tweet!

The only choices are to either a) eliminate the dupes, which significantly shrinks the data set, or b) leave them in, which creates redundant results. It’s worth remembering that the cost of text analytics is based on the volume of comments.

So with social media data, you are not only paying to get the data but also paying to analyse a great deal of content that yields no insight.
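Before paying for analysis, it can help to measure how much of a data set is duplicated. The Python sketch below is one way to do that; it is not how Feedback Ferret processes data. It assumes comments arrive as plain strings and that retweets carry the conventional "RT @user:" prefix, which may not hold for every scraping tool. The names `normalise` and `duplicate_report` are made up for illustration.

```python
import re
from collections import Counter

# Assumption: retweets start with "RT @username:". Adjust for your export format.
RT_PREFIX = re.compile(r"^\s*RT\s+@\w+:\s*", re.IGNORECASE)


def normalise(comment):
    """Strip the retweet prefix, lowercase, and collapse whitespace."""
    text = RT_PREFIX.sub("", comment)
    return " ".join(text.lower().split())


def duplicate_report(comments):
    """Count how many records remain once duplicates are collapsed."""
    counts = Counter(normalise(c) for c in comments)
    total = len(comments)
    unique = len(counts)
    share = (total - unique) / total if total else 0.0
    return {"total": total, "unique": unique, "duplicate_share": share}


example = [
    "Great service today!",
    "RT @fan1: Great service today!",
    "RT @fan2: great  service today!",
    "Thanks",
]
print(duplicate_report(example))
# {'total': 4, 'unique': 2, 'duplicate_share': 0.5}
```

In the data set described above, 4,500 of 8,000 records were retweets of a single tweet, so collapsing them would leave roughly 3,500 records, a reduction of about 56%.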

5. Data Quality

The quality of the language within the comments was dubious. This was probably down to the limitations of the scraping tools, which resulted in character set problems. For example, some comments contained garbled characters in words like “didn’t”, where the apostrophe was replaced by stray symbols, and some emojis were converted inconsistently by the scraping tools.

Character set mismatches remain a bugbear of the text analytics world as they affect the quality of any automatic analysis.
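One common cause of this kind of garbling is UTF-8 text being read as Windows-1252. Whether that was the cause in these pilots is an assumption; the scraping tools involved are not named here. Under that assumption, the Python sketch below shows how a curly apostrophe gets mangled and how a round trip can sometimes undo it. The function name `repair_mojibake` is hypothetical.

```python
def repair_mojibake(text):
    """Try to undo UTF-8 text that was wrongly decoded as Windows-1252.

    Assumption: the damage came from exactly that mismatch. If the round
    trip fails, the text is returned unchanged rather than guessed at.
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


clean = "didn\u2019t"  # "didn't" with a curly apostrophe
# Simulate the scraping fault: UTF-8 bytes read as Windows-1252.
garbled = clean.encode("utf-8").decode("cp1252")

print(garbled)                   # the apostrophe now shows as three stray symbols
print(repair_mojibake(garbled))  # back to the original word
```

Text that was never damaged passes through untouched, because the round trip either changes nothing or raises an error that the function catches. This only addresses one kind of mismatch; inconsistent emoji conversion would need separate handling.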

6. Privacy

Many content providers are bound by content agreements which prohibit inclusion of non-public posts. Private posts generally hold the juicy content, so this is a huge limitation which impacts both the quality and quantity of results (and likely skews the data set).

It is also difficult to establish from Facebook and Twitter comments who the real person is and what product or service they’ve bought from which outlet.

7. Marketing Skew

Organisations push out content geared towards inspiring interaction with customers. If you are looking for a snapshot about the real customer experience, responses to these social media marketing messages are probably not what you’re looking for.



After multiple attempts with different clients, we’ve concluded that the quality of comments on Facebook is better than on Twitter, but there simply isn’t a sufficient quantity of meaningful records to warrant analysing them in any detail. Even the largest brands with a high level of social media interactions are not immune to the seven pitfalls mentioned above.



Before embarking on any analysis of Facebook and Twitter data, we’d suggest eyeballing the data first. Take 500 records, read them and see if they contain any meaningful insight before paying a text analytics provider to automatically analyse them.
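To avoid reading only the most recent or most prominent posts, the 500 records can be drawn at random. A minimal Python sketch, assuming the scraped comments are already loaded into a list of strings; `sample_for_review` and its parameters are illustrative names, not part of any tool mentioned here.

```python
import random


def sample_for_review(records, sample_size=500, seed=42):
    """Return a reproducible random sample of records for manual reading."""
    records = list(records)
    rng = random.Random(seed)  # fixed seed so reviewers see the same sample
    k = min(sample_size, len(records))  # small data sets are returned whole
    return rng.sample(records, k)


# Hypothetical data standing in for a scraped export.
all_comments = [f"comment {i}" for i in range(8000)]
to_read = sample_for_review(all_comments)
print(len(to_read))
```

If the sample is dominated by retweets or one-word replies, that is a strong hint the full data set will be too.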

We don’t want to come across as a Negative Nelly, but our recommendation is to use social media as a marketing and customer service instrument. By all means use a specialised tool to monitor your channels and code important posts so that reporting is possible. Monitoring and responding to individuals on your social media channels is a critical business function; there is no question about this.

But the aggregated social media insights gleaned from automated text analytics will be far inferior to other sources of customer feedback. When you really want to know what customers think, use open-ended feedback from surveys and online reviews. They really do provide much richer customer experience insights, and are far better food for our Feedback Ferret text analytics engine to digest. For you, that means more succinct, insightful and actionable results.

Social media is seen as sexy, and often grabs the spotlight, especially in the eyes of execs. Remind them of the end goal, which is to make business improvements.

Use insights from your richest sources to inform your improvements, and those systemic improvements will cascade into all areas – including the comments you see on social media.