One of the most substantial big data workloads of the past fifteen years has been in the domain of telecom network analytics. Where does it stand today? What are its current challenges and opportunities? In a sense, there have been three phases of network analytics: the first was an appliance-based monitoring phase; the second was an open-source expansion phase; and the third, which we are in right now, is a hybrid data cloud and governance phase. Let’s examine how we got here.
The Dawn of Telco Big Data: 2007-2012
Initially, network monitoring and service assurance systems such as network probes tended not to persist information: they were designed as reactive, passive monitoring tools that let you see what was going on at a point in time, typically after a network problem had occurred, and the data was never retained. To retain it would – at the time – have been prohibitively expensive, and no one was really that interested anyway. As the first phase of telecom network analytics took shape, reductions in the cost of compute and storage, together with efficient appliance-based architectures, made it possible to understand more deeply what had actually been happening on the network over time. Advanced predictive analytics technologies were scaling up, and streaming analytics was allowing on-the-fly, data-in-motion analysis that created more options for the data architect. Suddenly, it was possible to build a data model of the network and create both a historical and a predictive view of its behaviour.
The Explosion in Telco Big Data: 2012-2017
As data volumes soared – particularly with the rise of smartphones – appliance-based models became eye-wateringly expensive and inflexible. Increasingly, skunkworks data science projects based on open-source technologies began to spring up in different departments, and as one CIO said to me at the time, ‘every department had become a data science department!’
They were using R and Python, with NoSQL and other ad hoc open-source data stores, running on small dedicated servers and, occasionally, for small jobs in the public cloud. Data governance was completely balkanized, if it existed at all. From around 2012, data monetization projects, marketing automation projects, M2M/IoT projects and others all developed siloed data science functions within telecom service providers, each with its own business case and its own agenda. They grabbed data from wherever they could get it, in some cases over the top from smartphones and digital channels: for example, taking location from the phone’s GPS sensor rather than from the network’s location functions. At the same time, centralised big data functions increasingly invested in Hadoop-based architectures, in part to move away from proprietary and expensive software, but also in part to engage with what was emerging as a horizontal industry-standard technology.
That second phase had the benefit of convincing everyone of the value of data, but several things were happening by around 2016 / 2017. First, AI was on the rise, and demanding consistent, large data sets. Second, the cost of data was getting out of control – literally. It wasn’t just that the cost was high; it was that the cost was distributed across the business in such a way as to be uncontrollable. Third, data privacy rules were being prepared in several major markets that would require, at the very least, a coherent level of visibility across data practices, which was impossible in a distributed environment. In the network itself, 5G, IoT and Edge architectures were being designed with copious ‘information services’, and network virtualization was on the cusp of being production grade. All of these network changes were designed with data in mind – and the data architectures needed to be ready to cater for them.
The Well-Governed Hybrid Data Cloud: 2018-today
The initial stage of the third phase of Telecom Data Analytics has often been mischaracterized as merely a shift to cloud. Virtualisation of the infrastructure has certainly been a part of this latest phase, but that’s only a partial picture. Service providers are increasingly designing data architectures that recognise multiple (hybrid) data clouds, edge components, and data flows that don’t merely move data from source, to processing, to applications; processing itself is distributed and separated.
The real transformation in data management in this third phase has been in governance. Integrated lineage and a unified data catalog offer the potential for consistent policy enforcement, and improved accountability and traceability across a multi-cloud infrastructure. Not only that, but integrated governance can allow service providers to distribute workloads appropriately. High volume, low value data – often the case with network data – that needs to be harvested for AI training, but not necessarily persisted for extended periods, should not necessarily be routed to the public cloud, which can be expensive. Similarly, some sensitive data should be retained on-prem, and other data should be routed to a particularly secure cloud. As the economics change, workloads should be movable to other clouds as appropriate, allowing the service provider to retain both control over costs and true flexibility.
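To make the placement idea concrete, here is a minimal sketch of how such a rule might be written down. Everything in it is an assumption for illustration: the `Dataset` fields, the thresholds, the environment names and the choice of a private cloud for high-volume, short-lived data are hypothetical, and in practice a policy like this would be driven by the governance catalog rather than hard-coded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """Hypothetical description of a dataset as a governance catalog might hold it."""
    name: str
    sensitive: bool          # subject to privacy or regulatory constraints
    daily_volume_tb: float   # approximate daily ingest, in terabytes
    retention_days: int      # how long the data needs to be persisted


# Illustrative thresholds only; real values would depend on the provider's economics.
HIGH_VOLUME_TB = 10.0
SHORT_RETENTION_DAYS = 30


def place(ds: Dataset) -> str:
    """Return an illustrative target environment for a dataset."""
    # Sensitive data stays on-prem in this sketch.
    if ds.sensitive:
        return "on-prem"
    # High-volume, short-lived data (e.g. network feeds harvested for AI training)
    # is kept off the public cloud, where it could be expensive.
    if ds.daily_volume_tb >= HIGH_VOLUME_TB and ds.retention_days <= SHORT_RETENTION_DAYS:
        return "private-cloud"
    # Everything else defaults to the public cloud.
    return "public-cloud"


if __name__ == "__main__":
    for ds in (
        Dataset("probe_feed", sensitive=False, daily_volume_tb=50.0, retention_days=7),
        Dataset("subscriber_profiles", sensitive=True, daily_volume_tb=0.2, retention_days=365),
        Dataset("campaign_results", sensitive=False, daily_volume_tb=0.1, retention_days=365),
    ):
        print(ds.name, "->", place(ds))
```

Because the rule lives in one place, changing the thresholds or the target environments as the economics change is a small edit rather than a re-architecture, which is the kind of flexibility that unified governance is meant to provide.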
The Challenge of Telecom Network Analytics Today
The primary concerns of the telco data architect in 2021 are scale and control. The amount of data continues to grow, with more devices, more network elements and more virtualized components, while – on the demand side – AI and Automation in the network and beyond are demanding ever more data. Issues of liability, compliance and consistency demand significantly enhanced governance, and a capacity to manage costs that are significant and growing. New kinds of data from IoT and Edge, faster data from connected devices, and new processing architectures – including on-device and at-the-Edge pre-processing – will create new bottlenecks and challenges. The greater the capacity for control, which is to say data governance, the more options will be available to the CIO as the applications continue to grow. In particular, as public cloud options become more widely available, the orchestration of data workloads from Edge to AI – across public, private, local, secure, low cost and on-prem cloud – will be critical in providing the transformed telco with the agility necessary to compete and win.
Cloudera President Mick Hollison alongside speakers from LG Uplus and MTN will be speaking about the challenges of Data Driven Transformation at the TM Forum Digital Leadership Summit on October 5th. Those already registered for the TM Forum Digital Transformation World Series can register for this special event here, while those who need to register can sign up here. Registration is free for service providers, analysts and press.