Compute & Conquer: Generals of text mining
The general who wins the battle makes many calculations in his temple before the battle is fought. The general who loses makes but few calculations beforehand.
In the previous part, we saw how text analytics (TA) can be used to build a virtual map (key players and their relationships) of the market segment of interest (a.k.a. the battlefield). In this part we will dig deeper into the data to discover the dynamics of the fight: who is winning or losing, and why. If you missed the introductory part for some reason, please follow this link.
We will demonstrate all of this on the group of mobile network operators that we created in the previous part. Our collection of news articles has since grown to 674 stories. Recall that we have three key players: Orange Slovakia, T-com/T-mobile Slovakia (part of Deutsche Telekom) and Telefonica Slovakia (formerly known under the O2 brand).
|
Picture 1: Network operators |
First, we will analyze the T-mobile/T-com brand:
|
Picture 2: Results for Tcom/Tmobile Slovakia |
In the picture we can see an interesting topic: the region marked with a red square, with the terms "pokuta" (fine), "vypadok" (crash/outage) and "linka" (line). It represents articles dealing with the consequences of an emergency line (112) crash in the Zilina region. The line is operated by Tcom (the same company that runs the Tmobile brand), and the crash caused the unnecessary death of one woman (more details here). Being in the news because of a service failure with such an impact has a negative effect on brand perception:
|
Picture 3: Articles regarding the consequences of the emergency line crash. |
Unfortunately, this was not the only problem that got attention in the media. The results (still picture 2, 3rd row, terms "zhromazdenie" (assembly, meeting), "valne" (general, company), "dividend") show that another problematic situation occurred when the major shareholder in Tcom/T-mobile (Deutsche Telekom, DT) decided at the annual general meeting not to pay dividends to the shareholders. This mainly affected the Slovak Republic, whose government still holds a significant (but minority) stake in the company. The recently generated negative image of Tcom in the eyes of Slovak consumers was further strengthened by the fact that DT used the dividend issue to put pressure on the government to sell its remaining shares (again picture 2, 5th row, terms "podiel" (stake), "rokovat" (discussed), "akcia" (share), "vlada" (government)).
|
Picture 4: Articles regarding DT's intent to buy the remaining shares of Tcom from the Slovak government |
Fortunately, there are also topics that, on the contrary, build a positive image of the Tmobile/Tcom brand (as can be seen in picture 5). Tmobile is clearly more active (attacking strategy) than Orange, and this activity is the main reason why direct benefits for customers emerge, for example cheaper roaming offers, better network coverage, the introduction of new smartphones or price reductions on existing ones.
|
Picture 5: Positive elements of T-com/Tmobile's presence in the media |
To sum things up for Tcom/Tmobile, we have:
- Problematic dividends and DT's pressure on the Slovak government (rows 3 and 5).
- The emergency line crash and the resulting fine from the authorities (7th row).
- Better and cheaper roaming services (summer is here) for existing customers (2nd row, terms "zakaznik" (customer), "roaming", "vyhodne" (better price)).
- Cheaper smartphones and tablets (the popular Samsung Galaxy), of course in connection with customer acquisition for pre-paid services (9th row).
- Coverage expansion of the T-mobile HSPA network (for example in the towns of Kolarovo, Poltar and Fiľakovo), which means better mobile data services, faster internet, etc. (2nd row, terms "zakaznik" (customer), "roaming", "vyhodne" (better price)).
Now, let's briefly go through the results for Orange:
|
Picture 6: Results for Orange Slovakia |
- Orange responded to Tmobile's roaming offer attacks (defensive strategy) (row 7).
- The company is also offering cheaper smartphones and tablets from the popular vendors HTC and Samsung, and is adding the white iPhone 4.0 (highlighted row, also supported by the top window displaying the respective articles).
- The operator is pushing hard with LTE network tests, with uploads faster than 100 Mbit/s (rows 3 and 10).
As we can see, there is almost no negative flavor in the news about Orange. There is one isolated mention of a network outage (second row in the previous picture, term "zlyhat"), but since it appears only once we will not consider it relevant in this demonstration (in a real "production" analysis we might of course treat it differently).
Finally, we can take a look at Telefonica, which is the youngest and smallest operator in Slovakia:
|
Picture 7: Results for Telefonica Slovakia |
From the results we can see that:
- The operator is not competing in the same area as T-mobile and Orange (flanking strategy). Instead, it focuses on pre-paid offers of free SMS messages for 30 days following the renewal of pre-paid credit. See the highlighted row in picture 7 containing terms like "zakaznik" (customer), "predplateny" (prepaid) and "kredit" (credit). Supporting articles are displayed on the right side of the picture.
- Telefonica is dropping O2 from the official company/brand name (as can be seen in the previous picture, 4th row).
- Row 6 takes us to a very interesting topic. The Twitter account @o2slovakia, which since its inception had truthfully informed about all special offers by O2, suddenly displayed the controversial status "Forget about O2, we’re just lying and bullsh***ng you. Go for Orange or T-mobile, for quality for better price…". At first sight this looked like an ordinary hack of the operator's account on the popular social network, but Telefonica quickly announced that the account had never been maintained by them and had therefore provided unofficial information from the beginning. After Telefonica contacted the social network operator about the brand name abuse, the offending account was removed. We can only speculate that this might have been revenge for the aggressive tone that Telefonica used against its competitors, for example here, following the principle that stronger players do not attack weaker players... at least not officially…
|
Picture 8: Results for articles discussing attack on O2 brand |
When speaking about Telefonica, it is also interesting to take a closer look at one more topic. A virtual operator based on Telefonica's network, Tesco mobile, attracts price-sensitive customers with an offer to call "only for one cent", marketing the idea aggressively as "A price revolution in Slovakia". The name of the player behind the campaign remained hidden for the whole duration of the "heating" time (the period when only promising marketing messages were available, with no real offers to compare them with). After gaining momentum and attracting customers, the campaign quickly faded away once the full details of the offer were revealed (they were, of course, not so revolutionary). Do you recognize the signs and techniques of a guerrilla-style fight? See picture 7, 7th row, terms "tesco", "virtualny" (virtual), "jeden" (one), "volanie" (calling).
With the help of text analytics, generals fighting in marketing wars can decipher the tactics of their opponents even from publicly available sources and take appropriate action. More importantly, topics created from text analysis can be used to measure the overall positive vs. negative image (or position) associated with a brand and its impact on customers. The next part of the article goes beyond text analytics alone, but we need to go through it to illustrate the possibilities of using TA, so please take it only as an illustrative example. Using this approach in the real world will require more detailed models and computations.
Let’s assume that all discovered topics are grouped into the following categories and then given a positive or negative score according to this table (model):
|
Picture 9: Model that will be used to categorize discovered topics for each operator |
For example, the LTE showcase topic by Orange will be categorized under the category "Perception of operated network" and will get +5 points. If we cannot put a topic into any category from our model, we will treat that topic as a "generic" one. Running briefly through all the topics shown above, we might arrive at the following table.
|
Picture 10: Topics and points that were assigned to them |
The table says that Orange is clearly leading (for the period in which we reviewed articles), with T-com/Tmobile finishing second and Telefonica third. The results can be visualized as a pie chart, which is more suitable when you are going to present them to someone else.
|
Picture 11: Visualization of results after category assignments, not taking frequency into account |
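The category scoring described above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the one value taken from the article is +5 for the LTE showcase in "Perception of operated network"; every other category name, magnitude, topic polarity and the helper names `topic_score` and `operator_totals` are hypothetical and stand in for the model shown in picture 9.

```python
# Hypothetical scoring model. Only "+5 for an LTE showcase in the
# 'Perception of operated network' category" comes from the article;
# all other category names and magnitudes are illustrative assumptions.
CATEGORY_POINTS = {
    "Perception of operated network": 5,
    "Pricing and offers": 3,   # assumed
    "Corporate conduct": 4,    # assumed
}
GENERIC_POINTS = 1             # assumed magnitude for uncategorized topics


def topic_score(category, positive):
    """Return signed points for one topic; unknown categories are 'generic'."""
    magnitude = CATEGORY_POINTS.get(category, GENERIC_POINTS)
    return magnitude if positive else -magnitude


def operator_totals(topics):
    """Sum signed topic scores per operator.

    `topics` is an iterable of (operator, category, positive) tuples.
    """
    totals = {}
    for operator, category, positive in topics:
        totals[operator] = totals.get(operator, 0) + topic_score(category, positive)
    return totals


# Example input; the polarity of each topic is an analyst's judgment call
# and is assumed here purely for illustration.
example_topics = [
    ("Orange", "Perception of operated network", True),           # LTE tests
    ("T-com/T-mobile", "Perception of operated network", False),  # 112 outage
    ("T-com/T-mobile", "Pricing and offers", True),               # roaming offers
    ("Telefonica", "Brand rename", True),                         # no category: generic
]

if __name__ == "__main__":
    print(operator_totals(example_topics))
```

Keeping the model in a plain dictionary makes it easy to swap in the real categories and points without touching the scoring logic.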
As the heading of the previous chart suggests, we should also consider the frequency of articles supporting each topic. We can assume that the more frequent a topic is in the media, the more significant its impact, whether positive or negative. A table with topics and frequencies follows (if no frequency is detected, the number "1" is used instead of zero).
|
Picture 12: Frequencies of topics |
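Finally, we can compute the resulting table, where each value is the product of a topic's frequency and its original category points from the model, so that each operator's total is the dot product of its frequencies and points.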
|
Picture 13: Topics and points assigned to them taking frequencies into computation model |
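The frequency-weighted computation can be sketched as follows. The numbers are hypothetical and are not the values from the tables in pictures 12 and 13; the function names are likewise made up for illustration. The only rules taken from the article are the dot product of points and frequencies, and the use of 1 when no frequency is detected.

```python
def effective_frequency(count):
    """Apply the article's rule: an undetected frequency counts as 1, not 0."""
    return count if count and count > 0 else 1


def weighted_total(points, frequencies):
    """Dot product of topic points and topic frequencies for one operator.

    `points` and `frequencies` are dicts keyed by topic name; topics
    missing from `frequencies` are treated as undetected.
    """
    return sum(p * effective_frequency(frequencies.get(topic))
               for topic, p in points.items())


# Hypothetical numbers for illustration only.
points = {"emergency line outage": -5, "roaming offers": 3, "new smartphones": 3}
frequencies = {"emergency line outage": 4, "roaming offers": 2}  # one topic undetected

if __name__ == "__main__":
    print(weighted_total(points, frequencies))
```

With these made-up inputs, a single negative topic that is covered often outweighs two positive topics that are covered rarely, which is the kind of effect the frequency-weighted table is meant to expose.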
Picture 13 shows that the situation remained unchanged for Orange, but Telefonica and T-mobile switched positions: T-mobile lost its position due to the higher frequency of negative topics (emergency line crash, problematic dividends) in the media. Please note that the computation model presented here is tailored for the purpose of this demonstration and is therefore not suitable for real deployment. For example, in real life the impact of frequency should be weakened (for example with some weight) so that it does not influence the final results too much.
|
Picture 14: Visualization of results after taking topic frequencies into computational model. |
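One possible way to weaken the impact of frequency is to replace the raw article count with a sublinear weight. The sketch below uses `1 + ln(frequency)`, which is purely an assumption chosen for illustration and not the weighting used in this article; the function names and sample numbers are hypothetical as well.

```python
import math


def damped_weight(frequency):
    """Sublinear weight: 1.0 for a single article, growing with ln(frequency)."""
    f = max(frequency, 1)  # an undetected (zero) frequency counts as 1
    return 1.0 + math.log(f)


def damped_total(points, frequencies):
    """Sum of topic points multiplied by the damped frequency weight."""
    return sum(p * damped_weight(frequencies.get(topic, 1))
               for topic, p in points.items())


# Hypothetical numbers for illustration only.
points = {"emergency line outage": -5, "roaming offers": 3}
frequencies = {"emergency line outage": 10, "roaming offers": 2}

if __name__ == "__main__":
    raw = sum(p * frequencies[t] for t, p in points.items())
    print("raw:", raw, "damped:", round(damped_total(points, frequencies), 2))
```

A topic covered by ten articles still counts for more than a topic covered once, but no longer ten times more, so a single heavily reported story has less power to flip the ranking.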
So far we have demonstrated how TA can process raw data (free-form text articles) and turn it into valuable information that can later be evaluated, visualized and, if necessary, measured over longer time periods to discover trends.
|
Picture 15: Hypothetical graph visualization of trends in brand perception of mobile network operators. |
Picture 15 displays a purely hypothetical graph, in which we can see that T-com/T-mobile "might" systematically strengthen its position over time (suggesting that their marketing and PR people are using TA in order to succeed in their fight against competitors). Trend visualization is just one example of how to use the newly derived information, and there are many other possibilities (for example, enhancing existing BI prediction models for existing customer segments).
I hope that the marketing war trilogy helped you to better understand how you or your company can benefit from text analytics/mining techniques right now and in real-life scenarios. Again, I will appreciate your feedback in the comments under the blog.