The Next Stage in Sentiment Analysis?
Sentiment analysis is arguably the most common and prominent tool for analysing web and social media data. Before coding data with additional parameters, e.g. themes or author groups, it makes sense to assign a sentiment to each mention. Mentions are conventionally rated as neutral, negative or positive. Some rating methods additionally allow an ambivalent rating, where a mention may be, for example, both critical (negative) and positive. Other methods break a mention down into smaller pieces of content, each of which is rated separately using the classical categories neutral, positive and negative.
Another question is how precise the analysis is. A number of tools offer AI engines that promise a sentiment accuracy of more than 95 percent. In an internal test, our human analysts counterchecked a data panel that had been processed by an AI engine, at four points in time within 12 months. The check revealed that the average accuracy of the sentiment was only slightly above 60 percent (!), not 95 percent as claimed. Though this aspect is not considered further in this post, it is worth keeping in mind.
Regardless of whether the analysis is conducted by humans or with the help of computers, we believe that it is important to examine and rate the sentiment in an even more differentiated way. This post focuses on how this can be done and what our solution is.
As with so many things, we came across the idea of a more differentiated examination of the sentiment in our daily routine. We repeatedly noticed that brands and products received either very good or very bad ratings compared to their competitors. Often, however, the negative or positive effect did not feel as intense as the reports seemed to indicate. So, we did some research in order to support our gut feeling with facts. We quickly found out that, for example, a relatively large amount of positive content was available for a particular brand, but this content was limited to a relatively small number of different domains. We also found that by comparison, other brands also received positive ratings, but the positive buzz took place on a relatively large number of different domains. This was also true of the negative ratings of different brands and products.
Based on these observations, we felt that apart from the number of mentions per sentiment, it is also necessary to consider how widely the positive and negative mentions are distributed across different domains in order to achieve a more differentiated sentiment assessment. Our assumptions are as follows:
- If positive or negative mentions of a brand or product exist on a relatively large number of different domains, these are "highly positive" or "highly negative" mentions.
- If positive or negative mentions of a brand or product exist on a relatively small number of different domains, these are "slightly positive" or "slightly negative" mentions.
The assumptions imply that positive or negative mentions on several different domains leave a generally more positive or more negative image of the brand or product, as the potential audience reach is higher. Conversely, it is implied that positive or negative mentions on fewer different domains leave a less positive or less negative image of the brand or product, as the potential audience reach is lower.
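To make the two quantities concrete, here is a minimal sketch of how the number of mentions and the number of different domains per sentiment might be counted. The input format (pairs of URL and sentiment label), the function name `domain_spread` and the sample URLs are assumptions made for illustration; the host part of the URL stands in for the "domain", so subdomains count as separate domains here.

```python
from collections import defaultdict
from urllib.parse import urlparse


def domain_spread(mentions):
    """Return {sentiment: (mention_count, distinct_domain_count)}.

    `mentions` is an iterable of (url, sentiment) pairs (assumed format).
    """
    counts = defaultdict(int)
    domains = defaultdict(set)
    for url, sentiment in mentions:
        counts[sentiment] += 1
        # netloc is the host part of the URL; lowercased so duplicates collapse
        domains[sentiment].add(urlparse(url).netloc.lower())
    return {s: (counts[s], len(domains[s])) for s in counts}


# Hypothetical sample data: three positive mentions on one domain,
# two negative mentions on two different domains.
mentions = [
    ("https://forum.example.com/t/1", "positive"),
    ("https://forum.example.com/t/2", "positive"),
    ("https://forum.example.com/t/3", "positive"),
    ("https://news.example.org/a", "negative"),
    ("https://blog.example.net/b", "negative"),
]

print(domain_spread(mentions))
# {'positive': (3, 1), 'negative': (2, 2)}
```

In this toy data the positive buzz is larger in volume but confined to a single domain, while the negative mentions are fewer but spread across two domains.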
For the implementation, we make use of statistical tools. First, we calculate the arithmetic mean (μ) of the number of positive and negative mentions as well as of the number of different domains over the past 24 months. Second, we take into account the spread (standard deviation σ) of the same figures over the same period.
These two statistics allow us to compare a recent time period (e.g. the current month) with the corresponding parameters of the past 24 months and to draw conclusions. Using a scoring model from 0 to 6, we also define the thresholds from which a positive or negative sentiment is rated as slightly or highly positive, or slightly or highly negative. In practice, we adapt the influence of the spread to the specific use case. In some use cases, for example, the thresholds are set at 0.5, 1 and 1.5 standard deviations.
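As a rough illustration, the baseline could be computed as follows. This is only a sketch under stated assumptions: the history is taken to be one count per month (24 values), the population standard deviation is used (the text does not say whether the population or sample form is meant), and the figures are invented.

```python
from statistics import mean, pstdev


def baseline(history):
    """Return (mu, sigma) for a series of per-period counts.

    Uses the population standard deviation (an assumption).
    """
    return mean(history), pstdev(history)


# Hypothetical 24 monthly counts of positive mentions for one brand.
positive_mentions = [2, 4, 4, 4, 5, 5, 7, 9] * 3

mu, sigma = baseline(positive_mentions)
print(mu, sigma)  # 5 2.0
```

The same function would be applied separately to the negative mentions and to the number of different domains per sentiment.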
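The text does not spell out how the 0 to 6 score is built, so the following is one possible reading rather than the actual model: the current value of each of the two figures (mentions and different domains) earns one point per threshold it reaches above the 24-month mean, giving 0 to 3 points each and 0 to 6 in total. The cut-off `highly_from=4` and all names and numbers are hypothetical.

```python
from bisect import bisect_right

# Example thresholds from the text, in standard deviations above the mean.
THRESHOLDS = (0.5, 1.0, 1.5)


def points(current, mu, sigma, thresholds=THRESHOLDS):
    """Return 0..3: how many thresholds the deviation from the mean reaches."""
    if sigma == 0:
        return 0  # no variation in the baseline, so no meaningful deviation
    z = (current - mu) / sigma
    return bisect_right(thresholds, z)


def intensity(mentions_now, mentions_stats, domains_now, domains_stats,
              highly_from=4):
    """Combine both figures into a 0..6 score and an intensity label.

    `*_stats` are (mu, sigma) tuples; `highly_from` is a hypothetical cut-off.
    """
    score = (points(mentions_now, *mentions_stats)
             + points(domains_now, *domains_stats))
    return score, "highly" if score >= highly_from else "slightly"


# Many mentions, and slightly more domains than usual.
print(intensity(9, (5, 2), 6, (5, 2)))  # (4, 'highly')
# Many mentions, but on fewer domains than usual.
print(intensity(9, (5, 2), 4, (5, 2)))  # (3, 'slightly')
```

The second call shows the effect described earlier: a high volume of mentions alone does not yield a "highly" rating if those mentions are concentrated on few domains.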
We assess the sentiment in five levels:
- Highly negative
- Slightly negative
- Neutral
- Slightly positive
- Highly positive
This more complex and work-intensive analysis is conducted mainly at the overall level of mentions for a brand or product, or alternatively only at site type and topic level, in order to deliver more differentiated statements on the prevailing moods.