Huawei launches AI Data Lake Solution to help enterprises embrace the AI era with better data

Huawei has launched a new solution that aims to solve the data storage challenges associated with AI training and inference, designed to accelerate AI adoption across industries. Training is the data-intensive process of building an AI model, whereas inference is the real-time use of that model to analyse new data and make decisions.

Called the AI Data Lake, the new solution was launched at the 4th Huawei Innovative Data Infrastructure (IDI) Forum in Munich, Germany.

“To be AI-Ready, get data-ready at first,” said Peter Zhou, Vice President of Huawei and President of Huawei Data Storage Product Line.

The concept of a data lake is not new. What is new is that storage needs to meet the demands of AI training and inference, he said, adding that Huawei is seeing less and less cold (infrequently used) data. Increasingly, warm data is used for training AI models, driving demand for high-capacity, fast, cost-effective and resilient storage.

Three data storage products are combined to deliver the AI performance required:

  • For high-performance data access, OceanStor A800 enables throughput of 500GB/s and 24 million IOPS, and as a result has the top ranking in the MLPerf™ AI benchmark. Many leading AI solution providers found that they could increase the utilisation of their AI training clusters with the OceanStor A800. To speed up inference, the solution has key-value (KV) caching built in, so that AI models don’t need to repeat the same calculations, improving GPU utilisation to 70%.
  • For large-capacity scale-out storage, OceanStor Pacific offers industry-leading density of 4PB in a 2U rack form factor. The solution has the industry’s lowest energy consumption of 0.25W/TB and has been certified by Energy Star®. The dedicated card built into OceanStor Pacific enables 2:1 data compression. A German university stored more than 50 petabytes (PB) of research data using the solution, and in the process was able to cut the space required by 70% and reduce energy consumption by 46%.
  • For backup of the AI corpus (used for training) and the vector database (used for inference), OceanProtect E8000 stores up to 16PB (post-reduction) at a rate of 255 terabytes (TB)/hour per system. It includes algorithms that enable data reduction of up to 72:1. A managed service provider in Switzerland saw a 25% reduction in data, while a Brazilian oil company achieved 10 times the backup performance using the solution. The backup storage also includes ransomware protection, with a 99.99% detection rate.
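
The idea behind the KV caching mentioned for OceanStor A800 can be illustrated with a toy example. The sketch below is not Huawei's implementation: the `PrefixKVCache` class and the `compute_kv` function are hypothetical names, and `compute_kv` is only a stand-in for the expensive key/value computation a real model performs for each token. It shows how storing the results for an earlier prompt lets a follow-up prompt that shares the same prefix skip work that has already been done.

```python
from typing import Dict, List, Tuple

KV = Tuple[int, int]


def compute_kv(token: str) -> KV:
    """Hypothetical stand-in for the costly per-token key/value computation."""
    return (sum(ord(c) for c in token), len(token))


class PrefixKVCache:
    """Toy cache that reuses results for previously seen token prefixes."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, ...], List[KV]] = {}
        self.computed = 0  # number of per-token computations actually performed

    def encode(self, tokens: List[str]) -> List[KV]:
        # Find the longest prefix of `tokens` that was stored earlier.
        cached_len = 0
        for n in range(len(tokens), 0, -1):
            if tuple(tokens[:n]) in self._store:
                cached_len = n
                break
        kv = list(self._store[tuple(tokens[:cached_len])]) if cached_len else []
        # Compute only the tokens that the cached prefix does not cover.
        for token in tokens[cached_len:]:
            kv.append(compute_kv(token))
            self.computed += 1
        self._store[tuple(tokens)] = kv
        return kv


cache = PrefixKVCache()
turn1 = ["what", "is", "a", "data", "lake"]
turn2 = turn1 + ["and", "why"]  # follow-up prompt that extends the first one

kv1 = cache.encode(turn1)  # computes all 5 tokens
kv2 = cache.encode(turn2)  # reuses the 5 cached results, computes 2 more
print(cache.computed)      # 7 computations instead of 12
```

In this toy version the cache lives in process memory. The article's point is that the storage system holds such a cache, so that GPUs spend less time recomputing results they have already produced.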

While the storage forms the foundation for the AI Data Lake, the solution is multilayered. The data management layer adds Huawei's Omni-Dataverse. This provides a global view of data assets across Huawei storage devices in different datacentres and enables data movement across these devices. Using techniques such as local cache acceleration, this speeds up access to training data. Mr Zhou’s presentation noted that Omni-Dataverse makes it possible to retrieve from 100 billion files in a matter of seconds.

The AI tool chain ModelEngine supports organisations with data ingestion, data enablement, model enablement and app enablement. This layer of the solution features low-code development, automatic evaluation and one-click deployment to accelerate the launch of AI solutions.

The final layer, called resource management, is for scheduling, and operations and maintenance (O&M). To further increase utilisation of GPU and NPU resources, the AI Data Lake supports xPU resource pooling, so that these processors can be shared across AI workloads. The solution also supports AI Copilot for O&M. For example, an AI assistant can address 80% of typical queries, and an AI inspection expert helps to explore log exceptions in real time and to identify the root causes of problems.

“When we started to have databases, we used data for recording,” said Mr Zhou. “Then, when we had big data, we started to dig into the data, to try to have more information. Then, when AI came, data became knowledge. For organisations to become AI-Ready, I really believe they have to get data-ready first.”
