May 19, 2026

AMD says Malaysia has a role in Southeast Asia’s yotta-scale AI infrastructure push

  • AMD sees Malaysia as part of its Southeast Asia AI infrastructure focus.
  • AMD said yotta-scale AI will require enterprises to rethink AI infrastructure planning.

“In practical terms, yotta-scale AI represents an unprecedented level of global compute scale,” said Alexey Navolokin, General Manager, APAC, at AMD. “One exaflop represents a billion-billion calculations per second, while one yottaflop equals one million exaflops.”

Reaching that level would require the equivalent of a million of today’s exaflop-class supercomputers operating together. Navolokin said the move toward yotta-scale AI is being driven by AI’s transition from on-demand workloads to “always-on intelligence”, including continuous inference, reasoning, and autonomous agents serving billions of real-time interactions.
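The scale comparison above comes down to simple unit arithmetic, which can be sketched as follows. The constants use the standard SI prefixes (exa = 10^18, yotta = 10^24); the function name `exaflop_systems_needed` is a hypothetical helper for illustration and does not come from AMD.

```python
# Unit arithmetic behind the exaflop/yottaflop comparison.
# EXAFLOP and YOTTAFLOP follow the SI prefixes; the helper name is hypothetical.
EXAFLOP = 10**18    # a billion-billion (10^9 * 10^9) calculations per second
YOTTAFLOP = 10**24  # one yottaflop


def exaflop_systems_needed(target_flops: int, system_flops: int = EXAFLOP) -> int:
    """Return how many identical systems are needed to reach target_flops."""
    # Ceiling division on integers, so a partial system counts as a whole one.
    return -(-target_flops // system_flops)


print(exaflop_systems_needed(YOTTAFLOP))  # 1000000
```

This idealised count ignores networking, memory, and software overheads, which the rest of the article treats as the harder part of the problem.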

Navolokin said enterprise infrastructure planning is becoming more complex. Rather than focusing only on raw compute performance or individual components, organisations need to consider the wider system. He cited silicon, software, networking, memory, orchestration, and power efficiency as part of that planning.

Alexey Navolokin, General Manager, APAC, at AMD

“For enterprises, the real shift is that AI infrastructure planning is becoming much more complex,” Navolokin said. “Organisations can no longer focus only on raw compute performance or individual components.”

He said yotta-scale AI would require an open, distributed compute fabric where CPUs, GPUs, networking, and software are designed to work together. That fabric would need to operate across cloud platforms, centralised data centres, edge systems, and endpoint devices.

At the hardware level, Navolokin pointed to rack-scale and system-level architectures designed for large-scale inference and agentic AI workloads. These systems require high-bandwidth memory and energy-efficient compute. They also need tighter integration between CPUs, GPUs, and networking.

Navolokin said networking is becoming a core design requirement. As AI systems scale across thousands or millions of nodes, he said the challenge extends beyond compute performance to moving large volumes of data with low latency.

Navolokin said open standards such as UALink and Ultra Ethernet are relevant because they support scalable and interoperable AI infrastructure. On the software side, Navolokin said open ecosystems are needed for portability and workload optimisation across different environments.

“Developers and enterprises need portability, flexibility, and the ability to optimise workloads across diverse environments without being locked into proprietary stacks,” he said.

Platforms such as AMD ROCm, along with industry collaboration around open standards and frameworks, are part of that approach. Navolokin said this gives developers and enterprises more flexibility when building distributed AI systems.

Malaysia’s role in AMD’s regional AI focus

In Asia Pacific, Navolokin said Malaysia is building momentum through investments in digital infrastructure and AI-ready data centres. He also pointed to cloud adoption, workforce development, and the country’s position in the semiconductor and electronics ecosystem.

From AMD’s perspective, Malaysia’s role is tied not only to demand for compute, but also to the broader need for systems-level AI infrastructure. Navolokin said the company sees an opportunity to help enterprises build long-term AI environments. Those environments need to bring together silicon, software, networking, and energy efficiency.

“Malaysia is an important part of Southeast Asia’s growing AI ecosystem,” he said, adding that AMD remains focused on supporting the region through AI infrastructure and ecosystem partnerships.

Navolokin linked AMD’s regional focus to enterprise deployments across cloud, data centre, edge, and endpoint environments. He said open platforms and ecosystem collaboration are becoming more important as those deployments expand.

From pilots to production

As organisations move from AI pilots to production, Navolokin said three issues appear most often. The first is infrastructure modernisation, as many enterprises still operate legacy environments that were not designed for continuous AI workloads.

He said organisations need to improve compute and power efficiency while optimising data centre space. They also need to refresh ageing systems to support real-time AI operations. These requirements become more important as inference workloads move into production environments.

Some organisations are still using AI mainly for workflow automation and efficiency gains, while others are exploring new business models built around AI, Navolokin said. The pace of that transition depends heavily on data readiness and whether enterprise workflows are structured in ways AI systems can use.

The second issue is data readiness. Navolokin said companies need to understand where their data resides and whether it is accessible across the organisation. They also need workflows that AI systems can use.

The third issue is architectural flexibility. As AI environments evolve, enterprises are looking for infrastructure that can integrate multiple technologies and scale across different deployment models. Navolokin said the goal is to do this without adding unnecessary complexity.

“AI readiness depends on how effectively organisations can modernise their enterprise stacks to connect data flows, applications, and operational workflows in ways that make AI practical at production scale,” Navolokin said.

Cloud, edge, and endpoint deployment

Hyperscale infrastructure will remain important for large-scale model training and inference. However, Navolokin said many emerging workloads require low-latency inferencing closer to where data is generated. These include use cases in manufacturing, logistics, retail, healthcare, and physical AI.

Navolokin said enterprises are placing more emphasis on distributed AI deployment across edge, on-premises, cloud, and client devices. He said organisations are also seeking consistency across these environments, including interoperability, operational efficiency, and predictable performance.

That distributed model also extends to endpoint devices, including AI PCs. Navolokin said some real-time inference workloads are better suited to systems closer to the data source, where latency, energy use, cost, and privacy requirements can differ from those of centralised infrastructure.

Navolokin said AI infrastructure is becoming more workload-aware. Different workloads require different types of compute in different locations, from centralised data centres to edge systems and endpoint devices.

Efficiency and flexibility

Power consumption and cost are also becoming central considerations. Navolokin said enterprises are increasingly focused on infrastructure productivity, or how efficiently they can deliver performance within power, cooling, and budget constraints.
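One way to read “infrastructure productivity” is as throughput delivered per unit of power, among the options that fit a given budget. The sketch below illustrates that reading only. The `Option` class, the `best_within_budget` function, and all figures are hypothetical and do not describe any AMD product or method. Cooling is left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Option:
    """A hypothetical deployment option; all fields are illustrative."""
    name: str
    throughput: float     # inferences per second
    power_kw: float       # power draw in kilowatts
    cost_per_hour: float  # running cost in an arbitrary currency


def best_within_budget(
    options: Sequence[Option], max_power_kw: float, max_cost_per_hour: float
) -> Optional[Option]:
    """Pick the feasible option with the highest throughput per kilowatt."""
    feasible = [
        o for o in options
        if o.power_kw <= max_power_kw and o.cost_per_hour <= max_cost_per_hour
    ]
    if not feasible:
        return None  # nothing fits the power and cost constraints
    return max(feasible, key=lambda o: o.throughput / o.power_kw)


# Made-up numbers, purely for illustration.
candidates = [
    Option("small", throughput=1000.0, power_kw=10.0, cost_per_hour=5.0),
    Option("medium", throughput=1500.0, power_kw=20.0, cost_per_hour=8.0),
    Option("large", throughput=3000.0, power_kw=50.0, cost_per_hour=20.0),
]
choice = best_within_budget(candidates, max_power_kw=25.0, max_cost_per_hour=10.0)
print(choice.name if choice else "no feasible option")
```

A real evaluation would weigh many more factors, such as latency targets and utilisation, but the structure of the decision is the same: filter by constraints, then rank by efficiency.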

“Different workloads have very different requirements, so improving efficiency at scale increasingly means using the right compute engine for the right job,” he said.

Depending on workload requirements, enterprises may use CPUs, GPUs, adaptive computing, edge systems, or AI PCs. Navolokin said openness and interoperability remain part of that efficiency discussion. These considerations become more important as organisations deploy AI across cloud, on-premises, edge, and endpoint environments.

Navolokin said AMD’s regional role is centred on workload flexibility, pointing to CPUs, GPUs, adaptive computing, edge systems, and AI PCs as options for different AI requirements.

To avoid repeated re-architecture, Navolokin said enterprises should design AI infrastructure around openness and flexibility. Open ecosystems allow organisations to choose tools for specific workloads, customise deployments, and scale without being locked into proprietary architectures.

He said large-scale model training will continue to rely on centralised infrastructure. At the same time, real-time inference workloads can be better suited to edge systems or AI PCs located closer to the data source. These environments can help address latency, energy use, and data privacy requirements.


TNG – Latest News & Reviews