CSPs have used customer segmentation for many years: demographic, usage-based and ARPU-driven. In the era of voice and SMS, such segmentation served the purpose of selling these basic services. With the data explosion and revenues increasingly driven by data and VAS, it is no longer fit for purpose and needs to change.
In addition, CSPs have granular customer interaction data passing through their networks: apps used, URLs accessed, locations and times of access, type of access network, recency and frequency of access, and so on. Much of this contextual data is extremely valuable for segmenting customers in a different way, using interests derived from the data. Some CSPs have started exploiting this data for segmentation, but there are challenges ahead in terms of technology, data privacy, derived interests and, more importantly, the potential business benefit in terms of uplift over existing campaigns. I would like to dwell on some of these aspects in this article.
1. Technology Issues
Much of the traffic on the mobile internet today is moving to HTTPS. By some estimates, more than 60% of traffic is HTTPS. Customers conscious of their privacy are also increasingly using VPN services to access the internet. Much of the URL and app data gathered by CSPs becomes far less useful when these connections are encrypted, because CSPs lose the long URLs they could see before encryption and are left with only the high-level part of the URL. Those long URLs are what allowed them to derive a customer’s intention and hence interest.
The second challenge is to classify the multitude of URLs and apps into meaningful categories and interests that can then be used for further segmentation. There are organisations such as similarweb and ZveloDB that offer classifications. However, the output of these classification engines is not of great quality, because the input is limited to the high-level URLs that remain after encryption, as discussed above.
So the quality of category data presents the first challenge to segmentation.
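To make the loss of signal concrete, here is a minimal sketch of how the same two page visits might be categorised with and without the full URL. The keyword map, the host map, the `example.com` addresses and the `categorise` function are all hypothetical and exist only for illustration; a commercial categorisation engine works very differently and far more thoroughly.

```python
from urllib.parse import urlsplit

# Hypothetical keyword-to-interest map (illustration only).
PATH_KEYWORDS = {
    "football": "sports/football",
    "mortgage": "finance/home-loans",
}

# Hypothetical host-level categories: all that is left once the path is hidden.
HOST_CATEGORIES = {
    "news.example.com": "news",
    "bank.example.com": "finance",
}


def categorise(url: str, encrypted: bool) -> str:
    """Return an interest label for a visit.

    When `encrypted` is True we assume only the host name is visible,
    so the path cannot be used to refine the category.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not encrypted:
        path = parts.path.lower()
        for keyword, interest in PATH_KEYWORDS.items():
            if keyword in path:
                return interest
    # Fall back to the coarse host-level category.
    return HOST_CATEGORIES.get(host, "uncategorised")


urls = [
    "http://news.example.com/sport/football/transfer-rumours",
    "http://bank.example.com/products/mortgage-calculator",
]
for url in urls:
    print(categorise(url, encrypted=False), "vs", categorise(url, encrypted=True))
```

With the full path the visits map to specific interests; with only the host they collapse into broad labels such as "news" and "finance", which is the quality problem described above.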
2. Cluster Dimensioning Issues
Each category or subcategory output by the categorisation tool becomes a potential dimension, and there can be as many as 300. With recency and frequency values across 300 dimensions, the input becomes quite complex and cluster output quality can be poor. Concepts from text analytics may therefore be needed to manage the dimensions and preprocess the data. TF-IDF (Term Frequency, Inverse Document Frequency) can be applied to the dimension matrix to remove very rare or very frequent terms and then weight the remaining terms according to their frequency across the whole corpus. Such processing has been shown to give much higher quality cluster output.
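As a rough sketch of the TF-IDF idea, each customer can be treated as a "document" and each category as a "term". The function name `tfidf_matrix`, the `min_df`/`max_df` cut-offs, the smoothed IDF formula and the sample counts below are assumptions chosen for illustration, and only frequency (not recency) is used.

```python
import math


def tfidf_matrix(counts, min_df=0.05, max_df=0.9):
    """counts: list of {category: visit_count} dicts, one per customer.

    Returns (kept_categories, rows), where each row is an L2-normalised
    {category: weight} dict restricted to the kept categories.
    """
    n = len(counts)
    # Document frequency: how many customers touched each category.
    df = {}
    for row in counts:
        for cat, c in row.items():
            if c > 0:
                df[cat] = df.get(cat, 0) + 1
    # Drop categories too rare or too common to separate customers.
    kept = sorted(cat for cat, d in df.items() if min_df <= d / n <= max_df)
    idf = {cat: math.log(n / df[cat]) + 1.0 for cat in kept}

    rows = []
    for row in counts:
        total = sum(row.get(cat, 0) for cat in kept)
        weights = {}
        if total > 0:
            for cat in kept:
                c = row.get(cat, 0)
                if c > 0:
                    weights[cat] = (c / total) * idf[cat]
        norm = math.sqrt(sum(w * w for w in weights.values()))
        rows.append({cat: w / norm for cat, w in weights.items()} if norm else {})
    return kept, rows


# Hypothetical visit counts per category for five customers.
customers = [
    {"search": 40, "sports": 12, "gaming": 1},
    {"search": 35, "finance": 9},
    {"search": 50, "sports": 20},
    {"search": 22, "finance": 14, "travel": 3},
    {"search": 30, "sports": 5, "finance": 5},
]
kept, rows = tfidf_matrix(customers, min_df=0.5, max_df=0.9)
print(kept)
for r in rows:
    print(r)
```

In this toy data "search" is used by everyone and "gaming" and "travel" by only one customer each, so all three are dropped and the remaining categories carry the signal that a clustering algorithm would then work on.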
3. Creating Personas or Segments from Clusters to Make Sense of Them
When you create clusters, it is sometimes difficult to describe them to the business community. Supervised classification with the cluster labels as the target can produce rules that help define each cluster. Another challenge is to integrate existing demographic, psychographic and value information with the clusters to create personas, a way to describe segments fully in terms of their characteristics. Since demographic and other information is available only on a sample basis, it is tricky to merge it with cluster information to create personas. Additional market research among the customers in the clustering sample might be needed to integrate this information.
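One simple way to picture "rules that define clusters" is to search, for each cluster, for the single threshold rule that best separates its members from everyone else. The sketch below does this in plain Python; the `best_rule` helper, the category shares and the cluster labels are hypothetical, and this one-rule search is a deliberately simplified stand-in for a full decision tree.

```python
def best_rule(rows, labels, target):
    """Find the 'feature >= threshold' rule that most accurately
    separates cluster `target` from all other customers.

    Returns (accuracy, feature, threshold).
    """
    best = None
    for feat in rows[0].keys():
        for threshold in sorted({r[feat] for r in rows}):
            # Count customers for whom the rule agrees with cluster membership.
            hits = sum(
                (r[feat] >= threshold) == (lab == target)
                for r, lab in zip(rows, labels)
            )
            acc = hits / len(rows)
            if best is None or acc > best[0]:
                best = (acc, feat, threshold)
    return best


# Hypothetical share of activity per category, with cluster labels
# assumed to come from an earlier clustering step.
rows = [
    {"sports": 0.70, "finance": 0.05, "video": 0.25},
    {"sports": 0.65, "finance": 0.10, "video": 0.25},
    {"sports": 0.10, "finance": 0.60, "video": 0.30},
    {"sports": 0.05, "finance": 0.75, "video": 0.20},
    {"sports": 0.15, "finance": 0.10, "video": 0.75},
    {"sports": 0.20, "finance": 0.05, "video": 0.75},
]
labels = [0, 0, 1, 1, 2, 2]

for cluster in sorted(set(labels)):
    acc, feat, thr = best_rule(rows, labels, cluster)
    print(f"cluster {cluster}: {feat} >= {thr:.2f} (accuracy {acc:.0%})")
```

A rule such as "sports share of at least 0.65" is something a marketing team can read and challenge. On real data with hundreds of dimensions, a decision tree (for example scikit-learn's `DecisionTreeClassifier` with `export_text`) would be a more realistic choice than this single-rule search.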
4. Data Privacy Issues
CSPs and OTT players are governed by different laws with respect to privacy. For example, in the US, FTC laws govern Facebook and Google while FCC laws govern CSPs. FCC laws tend to be tougher on privacy than FTC laws, so OTT players have an advantage in being able to use your data, whereas CSPs are more restricted in using data for upsell and cross-sell or for making content free through ad-supported sites. I think customer education about privacy, and about what data CSPs can really see versus what Facebook and Google know about you, might help CSPs convince their customers to give consent.
5. Return on Investment
We all talk about personalized offers generating huge benefits; Netflix and Amazon have proved it many times. But there is little evidence of CSPs achieving substantial lift from such behavioral segmentation over their existing baseline segmentation based on demographics and service usage. The key challenge is attribution: it is tricky to measure incremental value and attribute it to behavioral segmentation rather than to other aspects of the business that are changing at the same time. Sophisticated econometric modeling is needed to measure and quantify attribution.
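To illustrate the attribution problem in its simplest form, the sketch below uses a difference-in-differences calculation: the change in a group targeted with behavioral segments minus the change in a comparable control group over the same period. This is only a basic stand-in for the econometric modeling mentioned above. The `diff_in_diff` helper and the ARPU figures are hypothetical, and the method assumes both groups would have followed the same trend without the campaign.

```python
def mean(values):
    return sum(values) / len(values)


def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Incremental effect = change in treated group minus change in control.

    Subtracting the control group's change removes effects that hit
    everyone at the same time (price changes, seasonality, network upgrades).
    """
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change


# Hypothetical monthly ARPU per customer, before and after the campaign.
treated_before = [10.0, 12.0, 11.0, 9.0]
treated_after = [12.0, 14.0, 12.5, 11.5]
control_before = [10.5, 11.5, 10.0, 10.0]
control_after = [11.0, 12.5, 10.5, 11.0]

naive_uplift = mean(treated_after) - mean(treated_before)
uplift = diff_in_diff(treated_before, treated_after, control_before, control_after)
print(f"naive uplift: {naive_uplift:.2f}, attributed uplift: {uplift:.2f}")
```

The naive before/after comparison overstates the benefit because the control group's ARPU also rose; only the difference between the two changes can reasonably be credited to the new segmentation.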
Despite these challenges, there are huge opportunities for CSPs to use their data to create meaningful segments based on customers' interests. They also have a huge advantage in terms of customer context (device, location, time, network presence), which can be critical for contextual marketing in addition to clustering.
I feel that in the coming months we may see a lot of action from CSPs on this front: Verizon with Yahoo, Telenor with Tapad, Telefonica with Amobee and so on.