Re-engineering for better results: The Huawei AI stack

Huawei has released its CloudMatrix 384 AI chip cluster, a new system for AI learning. It employs clusters of Ascend 910C processors, joined via optical links. The distributed architecture means the system can outperform traditional GPU setups, particularly in terms of resource use and on-chip time, even though the individual Ascend chips are less powerful than those of competitors.

Huawei’s new framework positions the tech giant as a “formidable challenger to Nvidia’s market-leading position, despite ongoing US sanctions,” the company claims.

To use the new Huawei framework for AI, data engineers will need to adapt their workflows, using frameworks that support Huawei’s Ascend processors, such as MindSpore, which are available from Huawei and its partners.

Framework transition: From PyTorch/TensorFlow to MindSpore

Unlike Nvidia’s ecosystem, which predominantly uses frameworks like PyTorch and TensorFlow (engineered to take full advantage of CUDA), Huawei’s Ascend processors perform best when used with MindSpore, a deep learning framework developed by the company.

If data engineers already have models built in PyTorch or TensorFlow, they will likely need to convert models to the MindSpore format or retrain them using the MindSpore API.

It is worth noting that MindSpore uses different syntax, training pipelines and function calls from PyTorch or TensorFlow, so a degree of re-engineering will be necessary to replicate the results from model architectures and training pipelines. For instance, individual operator behaviour varies, such as padding modes in convolution and pooling layers. There are also differences in default weight initialisation methods.
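To make the padding difference concrete, here is a minimal pure-Python sketch of how output sizes diverge. It assumes that MindSpore's nn.Conv2d defaults to pad_mode="same" while PyTorch's nn.Conv2d defaults to padding=0. Check both against the framework versions in use. The helper names are hypothetical and belong to neither framework.

```python
import math


def out_size_same(size: int, stride: int = 1) -> int:
    """Output length under 'same' padding: the input is padded so out = ceil(in / stride)."""
    return math.ceil(size / stride)


def out_size_explicit(size: int, kernel: int, stride: int = 1,
                      padding: int = 0, dilation: int = 1) -> int:
    """Output length when `padding` zeros are added to each side (padding=0 means no padding)."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1


# A 3x3 convolution with stride 1 on a 32-pixel-wide input.
print(out_size_same(32))                           # 'same' padding keeps the width
print(out_size_explicit(32, kernel=3))             # no padding shrinks the width
print(out_size_explicit(32, kernel=3, padding=1))  # one pixel of padding restores it
```

A layer ported without setting the padding explicitly can therefore change every downstream tensor shape, which is one reason ported models may not reproduce the original results.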

Using MindIR for model deployment

MindSpore employs MindIR (MindSpore Intermediate Representation), which plays a role in deployment loosely comparable to Nvidia NIM, although MindIR is a serialised model format rather than an inference service. According to MindSpore’s official documentation, once a model has been trained in MindSpore, it can be exported using the mindspore.export utility, which converts the trained network into the MindIR format.

According to DeepWiki’s guide, deploying a model for inference typically involves loading the exported MindIR model and then running predictions using MindSpore’s inference APIs for Ascend chips, which handle model de-serialisation, allocation, and execution.
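The export-then-load flow can be sketched roughly as follows. This is an illustrative outline rather than Huawei's reference procedure: TinyNet and export_and_reload are hypothetical names, the toy network stands in for a trained model, and the function returns None when MindSpore (or NumPy) is not installed. It relies on mindspore.export, mindspore.load and nn.GraphCell as described in MindSpore's documentation; exact behaviour may vary by version.

```python
import os
import tempfile


def export_and_reload():
    """Export a toy network to MindIR, reload it, and return the output shape.

    Returns None when MindSpore or NumPy is unavailable, so the sketch stays importable.
    """
    try:
        import numpy as np
        import mindspore as ms
        from mindspore import nn
    except ImportError:
        return None

    class TinyNet(nn.Cell):  # hypothetical stand-in for a trained model
        def __init__(self):
            super().__init__()
            self.dense = nn.Dense(4, 2)

        def construct(self, x):
            return self.dense(x)

    net = TinyNet()
    example = ms.Tensor(np.ones((1, 4), dtype=np.float32))

    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.join(tmp, "tiny_net")
        # Serialise graph and weights; MindSpore appends the ".mindir" suffix.
        ms.export(net, example, file_name=base, file_format="MINDIR")
        # Load the serialised graph and wrap it so it can be called like a Cell.
        graph = ms.load(base + ".mindir")
        restored = nn.GraphCell(graph)
        return tuple(restored(example).shape)


print(export_and_reload())
```

The example input passed to the export call fixes the input shape and dtype of the exported graph, so inference inputs need to match it.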

MindSpore separates training and inference logic more explicitly than PyTorch or TensorFlow. Therefore, all preprocessing needs to match training inputs, and static graph execution must be optimised. MindSpore Lite or Ascend Model Zoo are recommended for additional hardware-specific tuning.
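One simple way to keep preprocessing consistent between training and inference is to compute normalisation statistics once, store them alongside the exported model, and reuse them unchanged at inference time. The sketch below is framework-independent and uses hypothetical helper names; it is not a MindSpore API.

```python
import json


def fit_stats(values):
    """Compute mean and standard deviation from the training data only."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = variance ** 0.5
    # Fall back to 1.0 so constant features do not cause division by zero.
    return {"mean": mean, "std": std if std > 0 else 1.0}


def normalise(values, stats):
    """Apply the stored statistics; used identically for training and inference."""
    return [(v - stats["mean"]) / stats["std"] for v in values]


train = [1.0, 2.0, 3.0, 4.0]
stats_json = json.dumps(fit_stats(train))  # would be saved next to the exported model
stats = json.loads(stats_json)             # would be loaded by the inference service
print(normalise([2.5], stats))
```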

Adapting to CANN (Compute Architecture for Neural Networks)

Huawei’s CANN features a set of tools and libraries tailored for Ascend hardware, paralleling Nvidia’s CUDA in functionality. Huawei recommends using CANN’s profiling and debugging tools to monitor and improve model performance on Ascend hardware.

Execution modes: GRAPH_MODE vs. PYNATIVE_MODE

MindSpore provides two execution modes:

GRAPH_MODE – Compiles the computation graph before execution. This can result in faster execution and better performance optimisation, since the graph can be analysed during compilation.

PYNATIVE_MODE – Executes operations immediately, which makes debugging simpler and error tracking more granular. It is therefore better suited to the early stages of model development.

For initial development, PYNATIVE_MODE is recommended for simpler iterative testing and debugging. When models are ready to be deployed, switching to GRAPH_MODE can help achieve maximum efficiency on Ascend hardware. Switching between modes lets engineering teams balance development flexibility with deployment performance.
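Switching modes by project stage could look something like the sketch below. The helper names and the "dev"/"deploy" stage labels are hypothetical. The only MindSpore calls assumed are mindspore.set_context and the GRAPH_MODE and PYNATIVE_MODE constants, and the helper does nothing beyond returning the mode name when MindSpore is not installed.

```python
def mode_name_for(stage: str) -> str:
    """Map a project stage to the name of a MindSpore execution mode."""
    return "GRAPH_MODE" if stage == "deploy" else "PYNATIVE_MODE"


def apply_mode(stage: str) -> str:
    """Set the execution mode if MindSpore is installed; always return the mode name."""
    name = mode_name_for(stage)
    try:
        import mindspore as ms
    except ImportError:
        return name  # nothing to configure without the framework
    ms.set_context(mode=getattr(ms, name))
    return name


print(apply_mode("dev"))
```

Keeping the choice in one place means the same training script can be run eagerly while debugging and compiled as a graph for deployment without further code changes.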

Code should be adjusted for each mode. For instance, when in GRAPH_MODE, it’s best to avoid Python-native control flow where possible.

Deployment environment: Huawei ModelArts

As you might expect, Huawei’s ModelArts, the company’s cloud-based AI development and deployment platform, is tightly integrated with Huawei’s Ascend hardware and the MindSpore framework. While it is comparable to platforms like AWS SageMaker and Google Vertex AI, it is optimised for Huawei’s AI processors.

Huawei says ModelArts supports the full pipeline from data labelling and preprocessing to model training, deployment, and monitoring. Each stage of the pipeline is available via API or the web interface.

In summary

Adapting to MindSpore and CANN may take training and time, particularly for teams accustomed to Nvidia’s ecosystem, as data engineers will need to understand several new processes. These include how CANN handles model compilation and optimisation for Ascend hardware, how to adjust tooling and automation pipelines originally designed for Nvidia GPUs, and the APIs and workflows specific to MindSpore.

Although Huawei’s tools are evolving, they lack the maturity, stability, and broader ecosystem support that frameworks like PyTorch with CUDA offer. However, Huawei hopes that migrating to its processes and infrastructure will pay off in terms of results, and let organisations reduce reliance on US-based Nvidia.

Huawei’s Ascend processors may be powerful and designed for AI workloads, but they have only limited distribution in some countries. Teams outside Huawei’s core markets may struggle to test or deploy models on Ascend hardware, unless they use partner platforms, like ModelArts, that offer remote access.

Fortunately, Huawei provides extensive migration guides, support, and resources to help with any transition.

(Image source: “Huawei P9” by 405 Mi16 is licensed under CC BY-NC-ND 2.0.)


AI News is powered by TechForge Media.
