Sean McManus, TelecomTV (00:17):
Hi, my name is Sean McManus. I am here at the Innovative Data Infrastructure Forum in Munich, and I'm joined now by Mr. Yuan Yuan, who is the VP of Huawei's data storage product line.
Yuan Yuan, Huawei (00:29):
Yeah, yeah, yeah.
Sean McManus, TelecomTV (00:30):
Thank you for joining me. You're welcome. Tell me, first of all, what are the challenges that you see enterprises facing today in implementing AI?
Yuan Yuan, Huawei (00:38):
Many enterprise customers would like to implement AI in their organisations, but it is really hard to put into practice. The first challenge for them, I think, is mindset: a transformation in thinking about how to leverage your data, turn your data into models, and turn your models into applications. These new workflows are totally different from traditional applications. Customers need to change their mindset to make their data fully usable, organise it, and introduce model tools and new workflows. The second challenge, as I mentioned, is data. Data preparation is very important; high-quality data plays a pivotal role in AI. I can give you a very interesting example. In China, there is an authority responsible for weather forecasting, predicting whether it is going to rain tomorrow or not. This is very important for individuals and for the economy.
(01:45):
Usually they use data from the last 10 years, which is roughly 73 petabytes. But with large models, they can leverage 300 petabytes of historical data. The data is there; you can get it from websites, but previously there was no way to exploit it. With large models, they can make full use of the 300 petabytes and improve the accuracy of their weather predictions by 20 percent, a very big improvement. It is the same for enterprises and organisations: how to refine their data and generate high-quality data is the second challenge. The third, I think, is cost. If you want to do something new, you need to build something new, and you have to pay for it. How to balance your current investment against a future-proof approach is difficult. So cost is also, I think, a big hurdle for enterprises moving into the AI era.
Sean McManus, TelecomTV (02:51):
Thank you. Tell me about how AI and especially large models will influence the data strategy for enterprises.
Yuan Yuan, Huawei (02:58):
AI has existed for roughly 10 years, running applications based on small models. But large models mean a huge amount of data. That means, first, that customers need to hold and maintain far more data than before; that is one impact of large models. Second, to train high-quality models you don't need all of your data, you just need high-quality data. That means what we call cleaning the data, labelling the data, and turning the data into a corpus. This approach, turning your total volume of data into a high-quality corpus, is also very important for AI.
Sean McManus, TelecomTV (03:43):
It's clear a new data strategy is going to be needed. Tell me a bit about how AI will affect the data infrastructure.
Yuan Yuan, Huawei (03:51):
For data infrastructure, it is all about storage. Basically, customers use storage to hold all their data. If you want to maintain a dataset 10 times larger than before, you first need low-cost storage, and it must be able to scale from very small to very large. Second, to match the requirements of the training procedure, you need high-performance data storage, much faster than before. That really helps customers reduce their training times and save money across the whole process.
Sean McManus, TelecomTV (04:32):
Now, this week, you are launching a new AI-ready data storage solution. Tell me what that means.
Yuan Yuan, Huawei (04:37):
We can segment AI into two stages: first, model training; second, inference. First, you need to build and train models using a large corpus of data. At this moment, the common headache, the pain point for customers, is the utilisation of GPUs. GPUs are very, very expensive; one card may cost tens of thousands of dollars. But on average, the utilisation of a GPU card is less than 50 percent. That means customers waste half their money, because the GPU has to wait for data. How to improve the utilisation of GPU clusters is one thing we have been thinking about. We have now developed the OceanStor A800, dedicated to the training procedure, with a high-performance design and some new features. We can improve GPU utilisation by 20 percentage points, from 50 to 70.
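The utilisation argument can be sanity-checked with simple arithmetic. In the sketch below, the card price and service life are illustrative assumptions; only the 50% and 70% utilisation figures come from the interview.

```python
# Back-of-the-envelope sketch of the GPU-utilisation argument.
# Idle time inflates the cost of every hour of useful compute.

def effective_cost_per_useful_hour(card_price: float,
                                   lifetime_hours: float,
                                   utilisation: float) -> float:
    """Cost of one hour of *useful* GPU compute at a given utilisation."""
    hourly_cost = card_price / lifetime_hours
    return hourly_cost / utilisation

CARD_PRICE = 30_000.0        # dollars, illustrative "tens of thousands"
LIFETIME = 3 * 365 * 24      # assume a 3-year service life, in hours

before = effective_cost_per_useful_hour(CARD_PRICE, LIFETIME, 0.50)
after = effective_cost_per_useful_hour(CARD_PRICE, LIFETIME, 0.70)

print(f"cost per useful GPU-hour at 50%: ${before:.2f}")
print(f"cost per useful GPU-hour at 70%: ${after:.2f}")
print(f"saving: {1 - after / before:.0%}")
```

Raising utilisation from 50% to 70% cuts the cost of each useful GPU-hour by about 29% (that is, 1 - 50/70), independent of the card price assumed.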
(05:43):
That saves customers a huge amount of money. That's one part. The second stage is inference. If you ask a question of a large model and the response takes one or two minutes, you lose patience. So how to reduce the time to respond to users is one concern. Usually we implement a cache: we load and prepare data ahead of your questions. For example, Sean, I know your name. When we start the conversation, I learn that you are Sean, so I record that, and in the second round I don't have to work out your name again; I just read the recorded state. That is the meaning of the cache. This is especially important for modern AI applications with multiple rounds: if you have a conversation with a large model over several rounds, asking several questions, it can memorise the context in the cache to reduce the response time.
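In typical large-model serving, the per-conversation caching described here is a KV cache holding the attention state of earlier turns. The toy sketch below illustrates only the principle, not Huawei's implementation; all names in it are invented for illustration.

```python
# Toy illustration of multi-round caching: remember processed
# conversation history so only the NEW question costs work, instead of
# reprocessing every earlier turn on each round.

from typing import Dict, List

class ConversationCache:
    def __init__(self) -> None:
        self._history: Dict[str, List[str]] = {}
        self.work_with_cache = 0      # turns actually processed
        self.work_without_cache = 0   # turns a cache-less server would redo

    def answer(self, conv_id: str, question: str) -> str:
        history = self._history.setdefault(conv_id, [])
        history.append(question)
        self.work_with_cache += 1                # only the new turn
        self.work_without_cache += len(history)  # whole history each time
        return f"(answer using {len(history)} cached turns of context)"

cache = ConversationCache()
for q in ["What is my name?", "What did I just ask?", "Summarise so far."]:
    cache.answer("sean", q)

print(cache.work_with_cache)     # 3 turns processed with the cache
print(cache.work_without_cache)  # 6 turns reprocessed without it
```

The gap widens quadratically with conversation length, which is why reusing cached context matters so much for multi-round applications.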
(06:42):
The second scenario is long-sequence requests. For example, you can input the full text of a novel and ask it to output a digest for you, and you expect that digest quickly. In this scenario we can also keep things in memory, cache the data, and improve the response time for you. But the cache is not easy to design. Usually it is implemented in GPU memory, which costs about 16 euros per gigabyte. Our approach is to cache that data in our storage equipment at about 0.3 euros per gigabyte, dramatically reducing the customer's cost to build a high-performance inference storage system. Those are the two parts we are working on. That is the meaning of the AI-ready solution.
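The price gap quoted here can be turned into a quick cost comparison. The per-gigabyte prices are the ones from the interview; the 10 TB cache size is an illustrative assumption.

```python
# Cost sketch for the cache-placement comparison: GPU memory vs a
# storage tier, at the per-gigabyte prices quoted in the interview.

GPU_MEMORY_EUR_PER_GB = 16.0   # cache held in GPU memory
STORAGE_EUR_PER_GB = 0.3       # cache tiered out to shared storage

cache_gb = 10_000              # assume a 10 TB inference cache

gpu_cost = cache_gb * GPU_MEMORY_EUR_PER_GB
storage_cost = cache_gb * STORAGE_EUR_PER_GB

print(f"GPU-memory cache:  EUR {gpu_cost:,.0f}")
print(f"storage-tier cache: EUR {storage_cost:,.0f}")
print(f"cost ratio: {gpu_cost / storage_cost:.0f}x")
```

At those prices, tiering the cache to storage is roughly 53 times cheaper per gigabyte than holding it in GPU memory, whatever cache size is assumed.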
Sean McManus, TelecomTV (07:42):
Brilliant. Now, you mentioned cost as one of the challenges that people have in implementing AI. Are there other ways that your solution helps them to implement AI for the first time?
Yuan Yuan, Huawei (07:52):
Apart from the storage equipment, we also have software tools, which are fully open source, that help customers cut their time to market. The solution is called DCS AI, and within it we have a toolchain called ModelEngine. ModelEngine has three features. Firstly, it helps customers refine their data, clean it, and prepare a corpus. We have implemented over 50 operators in the tools; an operator is a small tool. For example, one removes redundancy in the data, getting rid of similar items to reduce the dataset. These 50-plus operators help customers quickly clean and label their data. Secondly, in the model stage, customers can use existing large models like Llama, but sometimes they need to refine those models with their private domain data.
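As a rough illustration of the kind of redundancy-removal operator described here (not ModelEngine's actual implementation), a common technique is near-duplicate detection via character shingles and Jaccard similarity:

```python
# Sketch of a data-cleaning "operator": drop documents that are too
# similar to ones already kept, using character-trigram Jaccard overlap.

def shingles(text: str, n: int = 3) -> set:
    """Character n-grams of a whitespace-normalised, lowercased string."""
    t = " ".join(text.lower().split())
    return {t[i:i + n] for i in range(max(len(t) - n + 1, 1))}

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 1.0

def dedupe(corpus: list, threshold: float = 0.8) -> list:
    """Keep a document only if it is not near-duplicate of a kept one."""
    kept = []
    for doc in corpus:
        if all(jaccard(shingles(doc), shingles(k)) < threshold for k in kept):
            kept.append(doc)
    return kept

docs = [
    "The quick brown fox jumps over the lazy dog.",
    "The quick brown fox jumps over the lazy dog!",   # near-duplicate
    "Data preparation plays a pivotal role in AI.",
]
print(len(dedupe(docs)))  # the near-duplicate is dropped, leaving 2
```

Production pipelines typically replace the pairwise comparison with MinHash or similar sketching so the cost does not grow quadratically with corpus size.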
(08:54):
We provide toolchains to help them refine their models, feeding their private data into public large models to generate new models dedicated to their own use. The last step is to turn the model into applications, because you need applications; a model on its own is only a model. We also provide tools for this: low-code graphical chains and templates for Q&A, to reduce the time it takes customers to turn a model into a real application and launch it on the market. We have released all of those tools to the open-source community, and we invite all our customers and partners to build this toolchain, ModelEngine, together.
Sean McManus, TelecomTV (09:43):
Tell me, I'd be interested to know: what do you think are the key things that CIOs need to consider in their data infrastructure, now that we're in this AI era?
Yuan Yuan, Huawei (09:51):
Yes. Let me go back to the first question and repeat my opinion: change the mindset. That's easy to say. A data lake consolidates all the data, cleans it, and prepares the corpus. But in organisations, data is distributed, segmented across different departments. It's not easy to organise all the data in one logical place, and if you cannot consolidate the data, you cannot turn it into a high-quality corpus. So firstly, I think the CIO needs to bring the senior people together and come to an agreement on data principles for the organisation: we need to share data in a secure way, and connect and aggregate it in one lake or one pool. Then, based on that data infrastructure, they can build the stack on top and gradually turn data into models and models into applications. So the first thing is to come to an agreement and establish data principles in the organisation. This is very important. If they cannot come to an agreement, nothing will happen.
Sean McManus, TelecomTV (11:07):
Excellent. Well, thank you so much for talking me through that. Thank you. You're welcome. Nice to meet you. Thank you.
Please note that video transcripts are provided for reference only – content may vary from the published video or contain inaccuracies.
Yuan Yuan, VP, Data Storage Product Line & President, Scale-Out Storage Domain, Huawei
Huawei has launched its AI Data Lake solution to accelerate AI adoption across industries. Yuan Yuan, VP of Huawei’s data storage product line, explains how AI affects enterprises' data strategies and data infrastructure, and how Huawei is helping organisations to launch their AI applications.
Recorded April 2025