Driving value with AI in telecoms

Guy Daniels, TelecomTV (00:05):
Hello, you are watching TelecomTV. I'm Guy Daniels. Now, at a time when telco margins are tight and complexity is rising, how is AI reshaping the economics and operations of the telecom industry? From breaking down data silos to moving beyond pilots and into real-world impact, what does it take for operators to turn AI into measurable business value? Well, joining me now is Benjamin Hickey, who is director of product portfolio management, AI networking, at IBM. It's good to see you again. Thanks for joining us. Now, with margins under pressure, as I mentioned earlier, and 5G investments yet to deliver the expected returns, telcos are seeking operational efficiency gains. So what opportunities and challenges does AI in general present for telecom operators in 2025?

Ben Hickey, IBM (01:01):
Guy, that's a good point, and it's important we recognize the context that we're operating in. Of all the telecom operators that we talk to here at IBM, it's pretty well recognized that they are still dealing with the investment cycle from 5G and they're looking for opportunities in their business. Now, many of them have driven very strong efficiency and optimization programs, and those are delivering results. There's been a pretty widespread set of headcount redundancies across the industry. What that means is that many of the teams these operators rely on are being drawn back to the skeletal, minimal staff required. And it's not just the staff required to cover the amount of infrastructure; it's also about the technology. If you think about the investment lifecycle, many of these operators are still running technology that can be quite old, and maintaining that skillset is also important.

(02:01):
So if we merely focus, as many AI projects do right now, on the human efficiency gain, you are really going to start to hit a ceiling in terms of being able to take people out of the business. So that's an important thing for us to think about: how do we think bigger? It is definitely good to start to think about both how we can make our workforce more efficient and how we can drive productivity. But we want to think bigger, and I would argue that we need to think not just about our OPEX budget, but about our CapEx budget too. And then in turn, we need to think about our revenues and our revenue growth. CapEx budgets are typically large. While OPEX does tend to be the largest expense over the lifespan of a network, on an annual basis tier-one telecom operators spend 12 to 16% of their revenues on CapEx.

(02:58):
So for an operator in the $150 billion revenue range, that can be upwards of $20 billion of CapEx. And if you dig into that a little further, you'll see 75 to 80% of that CapEx is actually spent on network equipment. So if we think about how we actually drive more efficiencies in the equipment we buy and deploy, we're going to get a huge return from a whole new budget area. But that's not it. Look at where leading organizations, many in the hyperscaler space, are actually becoming capital-intensive businesses themselves; just look at their plans this year, and going back to last year too, on deploying data centers. They still have very enviable numbers on metrics such as revenue per employee, which are often orders of magnitude higher than we see in the telecoms industry. So ultimately we do want to think about how we drive up revenue per employee by making people more productive, so that they can then work on new revenue-generating ideas. So really, Guy, I think it's not one answer. I think we do need to acknowledge the efficiency drives in play right now, but we need to think big. We need to think about CapEx and we need to think about revenue growth.
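
As an aside, a quick back-of-the-envelope check of the figures above. The revenue, CapEx ratio and equipment share are the ranges quoted in the interview; the script itself is purely illustrative:

```python
# Back-of-the-envelope CapEx arithmetic for a tier-one operator,
# using the ranges quoted above (12-16% of revenue on CapEx,
# 75-80% of CapEx on network equipment).

def capex_breakdown(revenue_bn: float,
                    capex_ratio: float,
                    equipment_share: float) -> tuple[float, float]:
    """Return (total CapEx, network-equipment CapEx) in $bn."""
    capex = revenue_bn * capex_ratio
    return capex, capex * equipment_share

for ratio in (0.12, 0.16):
    for share in (0.75, 0.80):
        total, equipment = capex_breakdown(150.0, ratio, share)
        print(f"CapEx ratio {ratio:.0%}, equipment share {share:.0%}: "
              f"total ${total:.1f}bn, equipment ${equipment:.1f}bn")

# At $150bn revenue this lands at roughly $18-24bn of CapEx,
# of which around $13.5-19bn goes to network equipment.
```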

Guy Daniels, TelecomTV (04:19):
Thanks so much, Ben. Well look, most telcos are running complex multi-vendor environments that have evolved over years and decades, and we're seeing operators struggle with data silos and fragmented tools. How is this impacting the industry's digital transformation vision and how is AI changing this dynamic?

Ben Hickey, IBM (04:38):
Yeah, that's a good point, Guy, and it's something we also need to recognize. But to be honest, if the outcome of that is that we need to start a project inside the telcos to go and clean the data, to consolidate it, to organize it, in all honesty that's not really going to get us to where we need to be. These things could run in the order of months, more likely years. So what we need to think about is where we have data. Often the data is an untapped resource; there's latent value in it if we can tap it. And if we think about the operations of telecoms networks, often it's the humans that are required to stitch that data together across the silos. And it's not just the data, it's also the systems. But let's ask ourselves the question: why is it like this in the first place?

(05:32):
And if we think about that, there are some typical economic reasons. If people, vendors in particular, think about increasing switching costs, then one of the things they will often try to do is deliver more components in the solution, more layers in the stack. And so if they're delivering network equipment and they can deliver the systems on top, and those two are tightly linked, then it makes it harder to switch those vendors out. But let's put that to one side, because that's a separate dynamic. There's also a technology reason. If we think about the type of data we collect, we've often had to put it into separate silos, separate data stores, and use separate systems to process it. And that's really been a function of technology. The algorithms that we've had, the machine learning algorithms we've been using for 15, 20 years plus, are all very task specific.

(06:33):
And so what that means is that they've been designed with a particular purpose in mind and they're very focused on just that one purpose. They're tightly tuned to the type of data and the outcome they get. So many of these systems have actually been built with these algorithms to focus on a single type of data. Even when the system is multi-vendor, that's where you still get silos in the data and in the systems layer. Now, that is actually changing with AI. We've got an opportunity right now to tap into the data where it is, as it is, and we can use these techniques with AI foundation models to actually get value out of it by bringing the data together. And that is going to give us a huge benefit. In turn, that's going to reduce the reliance on our human glue: the human that is required to swivel-chair between one system and another, or who is pulling data and running it through separate analytics. These are all things that are manual and bespoke; they take time and they carry a huge maintenance effort, if not debt. So what we want to think about now is how we actually take advantage of AI, and the invention of these foundation models, to go after this data as it is and get value from it.

Guy Daniels, TelecomTV (07:50):
Well, let's talk about scaling now, shall we, because there is this gap between AI networking pilots that may show promise in labs and production deployments that actually reduce operational overheads. What will unlock AI implementations that transform operations, rather than just adding another tool to what's an already quite complex stack?

Ben Hickey, IBM (08:13):
Yeah, absolutely. Tool sprawl is a real problem. Obviously 'tools' is the fancy new word; I mean systems, one and the same in many respects. But really it does come down to this: if we're going to start a pilot, everybody's moving quickly here, quite rightly so, and all innovation programs should move quickly with pretty limited structure, because that's where you actually get the benefit of innovation. You don't want to over-rotate and put too much governance in place, too many rules; that's just going to limit your ability to find a new, innovative outcome. That's all quite right. The thing we need to keep in mind here is that sometimes we can end up in this pursuit chasing the buzzwords. And if you're chasing the buzzwords, sometimes you don't actually take the step back to think about what is going on here and, most importantly, what is it good for?

(09:11):
And even more importantly, what is it not good for? If we ask ourselves that question about what's happening with AI right now, and in particular gen AI and even more so LLMs, we start to be able to answer some of these questions. Well, what is it good for? It's absolutely great at consuming inputs in the form of language; all of the research around these LLMs has been on language, as well as images, video and voice. And that is a fantastic way to consolidate information, make it easier to access, and effectively allow more junior engineers to do the work of more senior engineers. So hugely valuable. However, when you think about what they're not good for, you quickly come to realize something about all of that data we were speaking about a minute ago that telecoms operators collect, that metadata for the network.

(10:05):
This is the time series data: performance metrics, events and logs, traces like IPFIX, even data that we would store in documents, unstructured as it is, configuration data, topology and design data. Much of that data is time series, and that is actually not well suited to LLMs. So when we start to think about these pilots, if all of our effort, all of our eggs, are going into one basket, then really all we're going to get are these benefits around improved usability from the language-type data we have. We are not going to get benefits from any of the data we use to actually design and operate our network. And that's where I would suggest that to get to production, to get to real value rather than just a personal assistant or a little tool that one or two or a small set of your engineers use, you need to think about how we're solving real business problems.

(11:04):
And that's where we would bring in that data. So in these pilots right now, we've been working with operators focused, as I said, on that range of data from metrics to events to logs, and we are putting those through the new time series foundation models, in effect to get a better signal-to-noise ratio from our systems. Those are then going to let us get a better output from our large language models and our gen AI, so you're going to get this multiplier effect. So really what I'd say is, all of the focus on pilots is great; we need to move fast, absolutely. Zeroing in on gen AI is absolutely necessary, but it's not sufficient. We need to think about the other assets we have in data and how we tackle those with the alternative techniques that have been brought about by this AI revolution as well, namely time series foundation models.
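
To make the "filter with time series models first, then feed the language model" idea concrete, here is a minimal sketch. A rolling z-score stands in for a time series foundation model, and `summarise_with_llm` is a hypothetical placeholder for whatever gen AI service would actually be called; none of this is IBM's implementation.

```python
import numpy as np

def anomaly_scores(metric: np.ndarray, window: int = 60) -> np.ndarray:
    """Rolling z-score per sample; a stand-in for a time series
    foundation model that would score metrics, events and logs."""
    scores = np.zeros_like(metric, dtype=float)
    for i in range(window, len(metric)):
        hist = metric[i - window:i]
        std = hist.std() or 1.0
        scores[i] = abs(metric[i] - hist.mean()) / std
    return scores

def summarise_with_llm(context: str) -> str:
    """Hypothetical placeholder: hand only the filtered, high-signal
    context to a gen AI model for explanation and next steps."""
    return f"[LLM summary of]: {context}"

# Example: only windows flagged as anomalous reach the language model,
# instead of dumping raw telemetry into a prompt.
rng = np.random.default_rng(0)
latency_ms = rng.normal(20, 2, 600)
latency_ms[500:520] += 30                     # injected incident
flags = anomaly_scores(latency_ms) > 4.0
if flags.any():
    start = int(np.argmax(flags))
    print(summarise_with_llm(f"latency spike starting at sample {start}"))
```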

Guy Daniels, TelecomTV (12:04):
You mentioned some of the traditional metrics then. So let's look at outcomes, and if we go beyond the typical telco metrics we're all used to, what business outcomes are you seeing from operators that are aiming high with AI across the entire network lifecycle? How do you go about quantifying the value of preventing issues versus responding to them faster?

Ben Hickey, IBM (12:29):
Yeah, that's interesting, isn't it, Guy? Because we've been talking about improving operational metrics like reductions in MTTR, and even breaking that down, so not just mean time to resolution but mean time to identify, and in reality there's nothing new there. Operators will continue to turn the screws to improve those metrics, but we're not going to find anything exciting. And in many respects, operators won't see a reason to reinvest in their operational systems to get a better outcome if all they're going to get is a marginal improvement on those metrics. So we need to think more broadly. As I mentioned before, many operators have already gone through efficiency drives where they've taken as much headcount out as they can. So really, if you focus on reducing headcount further, we will start to hit ceilings with respect to the number of people we can take out.

(13:27):
And of course that will vary by operator; each will have different opportunities in their different teams, but fundamentally many of them are going to be restricted. There's a component to this as well, just with the age of these operators, the amount of infrastructure they have and the long lifespan it has: you need to maintain some of those skillsets in there. So really we are hitting this point where the ability to take out more people will hit a ceiling. So then we need to look further. We can start to think about, from an operations standpoint, how time and resources are consumed. And if we get a little bit more granular on the data, what you will find is a typical Pareto distribution. Most operators will find the issues that their operations teams have to deal with follow an 80/20 distribution.

(14:22):
80% are pretty routine. They've seen them before; they come up again and again, and as a result they've got pretty good documentation, pretty good institutional knowledge. They know how to handle them, and many of them are pretty well automated: when something happens, how do we make sure that we suppress the alarms so that we're only getting a single alert? How do we then act on that, and how do we automate our workflows so that it's acted upon? So again, not a huge amount of benefit there. But if we think about that 20%, the 20% that isn't routine, isn't common, these are the ones that can last days if not weeks. These are the issues where a war room call could be kicked off and you could have a team of 10, 20 people or more on that call for hours and days. This is what's happening, right? The human glue, that's our engineers, actually having to look across their different systems.

(15:22):
They're having to speak to their colleagues who are experts in different domains, the RAN domain, the IP domain, the transport domain, the mobile core domain, and they're trying to use their expertise to identify what could be going on. Now, that's actually a big opportunity for us, because that 20% of issues consumes approximately 90% of the team's resources. So being able to zero in on that is going to have an outsized impact on our team. That's the first point. Now, that's not where we should stop. Like I mentioned at the start, operational improvements are one thing, but we need to think bigger. As we start to drive operational improvements here, we now need to think about what else we can touch, and the CapEx budget is a non-trivial budget. We can go after that.
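
A quick worked example of why targeting that 20% has an outsized effect, using the 80/20 split and the roughly 90% resource figure quoted above; the automation percentages are illustrative assumptions, not figures from the interview:

```python
# Illustrative team-time arithmetic for the 80/20 issue split above.
# Assumptions (not from the interview): automation cuts handling time
# for routine issues by 50%; AI-assisted triage cuts the complex,
# war-room-style issues by 30%.

routine_share, complex_share = 0.10, 0.90    # share of team time consumed
routine_saving, complex_saving = 0.50, 0.30  # hypothetical reductions

saved = routine_share * routine_saving + complex_share * complex_saving
print(f"Total team time saved: {saved:.0%}")
# Even a modest 30% improvement on the complex 20% of issues (~27% of
# total time) dwarfs halving the effort on the routine 80% (~5%).
```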

(16:15):
Just think about the scale of the networks we operate today and all of those processes related to capacity management. Those are still largely manual processes. And we suffer from the law of averages: because the networks are so large scale and the processes are manual, we have to group things together and handle them in step increases. So what we need to do is think about these benefits from a CapEx standpoint. Can we use the systems to get a more granular, more accurate view of what we can then go and target with capacity upgrades? We can move from a case where CapEx and capacity are deployed just in case, to a mode where we actually deploy them just in time. Now, that's going to deliver us a lot of value.
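
A minimal sketch of the just-in-time idea, assuming weekly peak-utilization histories per link and a simple linear trend as the forecaster; a real system would use a proper forecasting model and the operator's own lead times, and all names and thresholds here are illustrative:

```python
import numpy as np

LEAD_TIME_WEEKS = 12      # assumed procurement + deployment lead time
UPGRADE_THRESHOLD = 0.80  # assumed utilization level that triggers an upgrade

def weeks_until_threshold(utilization_history: np.ndarray) -> float:
    """Fit a linear trend to weekly peak utilization and estimate how
    many weeks remain before it crosses the upgrade threshold."""
    weeks = np.arange(len(utilization_history))
    slope, intercept = np.polyfit(weeks, utilization_history, 1)
    if slope <= 1e-6:                 # flat or declining demand
        return float("inf")
    current = intercept + slope * weeks[-1]
    return (UPGRADE_THRESHOLD - current) / slope

# Per-link decisions instead of block-wise "just in case" upgrades.
links = {
    "core-1": np.linspace(0.45, 0.74, 52),   # trending up
    "metro-7": np.full(52, 0.30),            # flat, no action needed
}
for name, history in links.items():
    horizon = weeks_until_threshold(history)
    action = "order upgrade now" if horizon <= LEAD_TIME_WEEKS else "defer"
    print(f"{name}: ~{horizon:.0f} weeks to {UPGRADE_THRESHOLD:.0%} -> {action}")
```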

(17:17):
Still, that's a subset of the total CapEx budget. So let's go back even further and think about how we design networks. Networks are different to other parts of the IT infrastructure. Firstly, they are complex distributed systems, and as a result they suffer from what we call the shared-fate problem. That means we need to look at the network in its entirety to understand what's going on, and that holds true as we design it. The other thing that makes networks very important is where they're placed in the stack: by residing at the bottom of the stack, when things go wrong in the network they have the largest impact. These factors lead to our network engineers, our network managers, our executives in the network domain being typically risk-averse people. And so what do we do? Well, we throw a heck of a lot of money and resources at resilience.

(18:12):
That resilience comes in the form of duplicated infrastructure, doubling up on redundancy within boxes, across boxes, across links. Put on top of that the fact that we then don't want to stress the network too much, definitely not in normal conditions, because what happens if something unanticipated occurs? We want to have some headroom. Your typical network operates at sub-40% utilization. So what's limiting that? Well, again, it's our network engineers. Those designers are risk averse; they are basically building in buffer to handle all the different things that might happen in the network that they can't conceive of and can't model out. Well, that's changing now with AI as well: our ability to build out designs for networks, to do scenario planning and to use different forms of AI. Again, another good use case, not targeted at LLMs but at the different types of AI algorithms we have out there. These are the sorts of algorithms that were used in AlphaGo to beat humans.

(19:18):
That's not an LLM that played that. If you've ever played chess with ChatGPT or Claude, you'll see it's got a great opening gambit, a great opening game, but the middle game is pretty terrible. So when you start to think about those strategic decisions, looking forward, looking at all the different permutations and scenarios and modeling them out, we actually have AI systems that can do that for us now. And if we can do that with our designs, and we can model out different scenarios, different failure scenarios, different utilization scenarios, we actually have the opportunity to run these networks at much higher utilizations and still protect the traffic which needs to be protected. This actually gets us out of a big problem we've had in the industry for a long time, where we've gold-plated our best-effort traffic, and that has led to faster cannibalization of many of our premium services. So just think about the granularity that our engineers will be able to operate at with AI designing and managing these networks, which is going to give a huge improvement to those CapEx budgets.
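
As a concrete illustration of the scenario-planning point, here is a toy sketch that enumerates single-link failures and reports the worst-case utilization after traffic shifts to a backup path. The topology, demands and rerouting rule are invented for illustration; a real planner would use proper traffic engineering and richer AI-driven search rather than this brute-force enumeration:

```python
# Toy single-failure analysis: how hot does each surviving link run
# when one link fails and its traffic is rerouted onto a backup link?
# Topology, demands and the "shift everything to one backup" rule are
# illustrative assumptions, not a real design tool.

capacity = {"A-B": 100, "B-C": 100, "A-C": 100}        # Gbps
load     = {"A-B": 55,  "B-C": 40,  "A-C": 35}          # steady-state Gbps
backup   = {"A-B": "A-C", "B-C": "A-C", "A-C": "A-B"}   # failover target

worst = 0.0
for failed, reroute_to in backup.items():
    post_failure = dict(load)
    post_failure[reroute_to] += post_failure.pop(failed)  # shift the traffic
    peak = max(gbps / capacity[l] for l, gbps in post_failure.items())
    print(f"fail {failed}: peak utilization {peak:.0%} on surviving links")
    worst = max(worst, peak)

print(f"worst-case single-failure utilization: {worst:.0%}")
# If the worst case stays below capacity, steady-state links can be run
# hotter than a blanket sub-40% rule while still protecting traffic.
```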

Guy Daniels, TelecomTV (20:25):
Well, final question then, Ben. For a telecom CXO looking to build an AI-driven network operations strategy, what should they prioritize to avoid the AI tool sprawl that you highlighted earlier, which we're seeing in some networks now?

Ben Hickey, IBM (20:41):
Yeah, Guy, I think here we often hear the answer to this question as "aim big, start small", so I won't repeat that here; there's no value in that for our interview. What I would say is that we need to go back to what we talked about at the beginning, and it comes down to data. We all know the starting point for AI is data, so we need to think about this deployment, how we roll out into production, based upon our data. If we think about the data we have from our operational systems, our day-two management of the network, and use AI to get a better, more accurate read of how we are managing those networks, that is actually going to give us real-time data for our actual network: how the network performs, the types of outages we have to deal with, the impact of those outages.

(21:34):
All of that is going to be brought together from the silos it sits in today. When you start there, you can then say: now I've got an accurate view of how my network runs, how do I use that data to help me forecast how I grow my network? So back to that capacity management use case I was talking about. Now that we've got this data, we don't need to suffer from the law of averages where we think about the network as these large building blocks. We can come down to the granularity of individual PoPs, individual devices, individual links, and roll those up, so we're really deploying CapEx just in time. And then keep going: as we've now got a view of how our network runs and how it grows over time, we can start to think about how we want to design our network, so that data will then feed into our designs.

(22:30):
We've got accurate views of topology and actual utilization levels. We can then factor in the failures that we see, moving away from just statistical numbers provided by vendors to the actual numbers we are running now. This is a pretty big vision, and I'm mapping it out just so that we can see the full opportunity set that AI will bring us. But really, starting right now in an operational sense, there's a lot of low-hanging fruit there. As I mentioned, focus on those 20% of complex issues and then layer up from there to the other use cases.

Guy Daniels, TelecomTV (23:08):
Well, it's a good vision, but we must leave it there for now. Ben, good talking with you, and thanks so much for sharing your views with us today.

Ben Hickey, IBM (23:15):
Thanks Guy.

Please note that video transcripts are provided for reference only – content may vary from the published video or contain inaccuracies.

Benjamin Hickey, Director, Product Portfolio Management, AI Networking, IBM

At a time when margins are tight and complexity is rising, Benjamin Hickey of IBM explores how AI is reshaping the economics and operations of the telecom industry. From breaking down data silos to moving beyond pilots for real-world impact, what does it take for operators to turn AI into measurable business value?

Recorded September 2025

Find out more about the upcoming AI-Native Telco Forum, which takes place on 23-24 October in Düsseldorf, including the full agenda and registration details, here.
