星球君的朋友们

Odaily Senior Author

2026-06-01 12:00

When Agents become the infrastructure of AI, NVIDIA will be behind every layer.

Original Authors: Li Hailun, Su Yang

Editor: Xu Qingyang

Source: Tencent Tech

On June 1, 2026, at the NVIDIA GTC Taipei conference held during COMPUTEX 2026, NVIDIA founder and CEO Jensen Huang delivered a keynote speech.

It had only been three months since the last GTC.

At that event, NVIDIA unveiled the Vera Rubin "chip family": the Vera CPU, Rubin GPU, Groq 3 LPU, ConnectX-9, BlueField-4 DPU, and Spectrum-6 switch, six chips that together constitute a rack-scale AI supercomputer. NVIDIA also announced that the number of GPUs required to train large MoE models would fall to one-quarter, inference throughput per watt would increase 10x, and the cost per token would drop to one-tenth.

Where the previous event emphasized system-level solutions such as the "chip family" and "computing family," three months later at COMPUTEX Jensen Huang shifted his focus to what this infrastructure will serve: Agents.

During his speech, Jensen Huang revealed that Vera Rubin has officially entered mass production, Vera CPU has begun global deliveries, DGX Station has entered the enterprise desktop for the first time in a Windows form factor, Cosmos 3 has restructured the perception framework for Physical AI, and DSX has become the operating system for AI factories. NVIDIA also partnered with Unitree to launch the H2 Plus, the first humanoid robot reference design based on Isaac GR00T, extending the Agent's boundaries from the digital world into physical form.

NVIDIA is reorganizing a complete technology stack from chips, data centers, models, and software to robotics platforms around the Agent ecosystem.

Jensen Huang said: "The era of Agent AI and Physical AI has arrived. Now tokens are profit units, AI is a GDP 'generator,' and the number of software engineers is increasing. People talk about AI reducing jobs; that's complete nonsense. In reality, more software engineers are being hired."

The Same AI Factory, Running 10x More Agent Tasks

The Vera Rubin platform has entered full production.

Unlike earlier platforms, which focused mainly on large-model training and inference, Vera Rubin was designed from the outset with Agents as a key workload.

In his speech, Jensen Huang mentioned that an Agent task often isn't just a single model inference but includes multiple steps like reasoning, search, tool calling, code execution, and result verification, potentially involving thousands of underlying steps. What future data centers need to handle will no longer be just single model requests, but a large number of continuously running, collaborative Agent tasks.
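
The multi-step shape of an Agent task described above can be sketched in a few lines of Python. This is only an illustration: `AgentTask`, `run_agent_task`, and the stand-in step functions are hypothetical names invented for this example, not part of any NVIDIA software, and a real Agent would call models, search indexes, and sandboxes rather than format strings.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentTask:
    """Hypothetical container for one Agent task and the steps it has run."""
    goal: str
    trace: list[str] = field(default_factory=list)


def run_agent_task(task: AgentTask, steps: list[tuple[str, Callable[[str], str]]]) -> str:
    """Run each named step in order, feeding each step's output into the next."""
    state = task.goal
    for name, step in steps:
        state = step(state)
        task.trace.append(name)  # record which step ran, in order
    return state


# Stand-in steps (assumptions): each one just wraps its input in a label.
steps = [
    ("reason", lambda s: f"plan({s})"),
    ("search", lambda s: f"docs({s})"),
    ("tool_call", lambda s: f"tool({s})"),
    ("execute_code", lambda s: f"run({s})"),
    ("verify", lambda s: f"checked({s})"),
]

task = AgentTask(goal="summarize sales data")
result = run_agent_task(task, steps)
print(task.trace)
print(result)
```

Even this toy version makes five dependent calls for one request, which is the point of the paragraph above: a data center serving Agents handles chains of steps, not isolated model requests.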

NVIDIA describes the platform as an AI supercomputer that works as one massive, unified computing unit, purpose-built for Agent workloads ranging from inference and retrieval to tool use. In a very large data center of the same scale, NVIDIA says, running autonomous AI Agent tasks on Vera Rubin is 10x as efficient as on the previous-generation Grace Blackwell platform.

Beyond the computing platform itself, networking has also become a key upgrade focus for Vera Rubin.

In the past, data transfer between GPUs in data centers primarily relied on traditional optical modules and switch architectures. However, as cluster sizes continue to scale, power consumption, heat dissipation, and deployment complexity increase rapidly. To address this, NVIDIA introduced the Spectrum-X Ethernet Photonics networking system in the Vera Rubin platform.

This is the first time NVIDIA has introduced Co-Packaged Optics (CPO) technology at scale into AI data center networks.

Simply put, traditional solutions require plugging optical modules outside the switch, whereas CPO integrates the optical components directly inside the switch, thereby reducing energy consumption and signal loss.

Additionally, security is another core capability heavily emphasized for the Vera Rubin platform this time.

To this end, NVIDIA has extended Confidential Computing capabilities across the entire Vera Rubin platform. Through trusted execution environments, hardware-level attestation, and end-to-end encryption mechanisms, enterprises can achieve a higher level of security assurance when processing private data, industry-sensitive information, and critical models.

Jensen Huang revealed that Vera Rubin has entered the mass production phase. As a third-generation MGX rack-scale system, it involves over 150 partners, more than 350 factories, and a supply chain covering over 30 countries and regions. According to NVIDIA's announced plan, Vera Rubin will begin official shipments this autumn.

A Processor "Born for Agents"

NVIDIA has launched a new processor, Vera, designed specifically for the Agent era, and it is already in full production.

Jensen Huang pointed out that advancements in memory systems will drive innovation and modernization in storage systems. Until now, all CPUs were built for humans. Vera is a CPU designed for the AI era, built for Agents.

As the successor to Grace, Vera adopts NVIDIA's self-designed "Olympus" CPU core architecture, increases the core count from 72 to 88, and significantly improves memory and data processing capabilities. According to NVIDIA, in tests on Agent workloads, Vera executes tasks 1.8x as fast as comparable x86 server CPUs.

More important than the pure performance improvement is the change in the relationship between Vera and the Rubin GPU: Vera connects to the Rubin GPU via second-generation NVLink-C2C, achieving an interconnect bandwidth of 1.8 TB/s, further reducing the overhead of data transfer between CPU and GPU during Agent operation.
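
To give a rough sense of scale for the 1.8 TB/s figure, the sketch below computes an idealized transfer time. It takes the quoted bandwidth at face value and ignores latency and protocol overhead; the 36 GB payload is a hypothetical example size, not a number from the keynote.

```python
NVLINK_C2C_BYTES_PER_S = 1.8e12  # 1.8 TB/s, the interconnect figure quoted above


def transfer_seconds(payload_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Idealized transfer time: payload size divided by bandwidth, no overhead."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return payload_bytes / bandwidth_bytes_per_s


# Hypothetical payload: 36 GB of Agent state moved between CPU and GPU memory.
payload_bytes = 36e9
seconds = transfer_seconds(payload_bytes, NVLINK_C2C_BYTES_PER_S)
print(f"{seconds * 1000:.0f} ms")
```

Under these assumptions the move takes about 20 ms, which is small next to the many-step Agent tasks the article describes.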

Jensen Huang stated that Vera Rubin uses HBM (High Bandwidth Memory) from Micron, SK hynix, and Samsung, and that its supply chain is "twice the size" of the previous-generation Blackwell's. Even so, deployment has become much faster: deploying a large Blackwell rack took two hours, while a Vera Rubin rack takes about five minutes.

Moving AI Factories from "Construction" to "Operation"

The NVIDIA DSX launched this time can be understood as an "AI Factory Construction and Operations Toolbox."

In the past, when building an AI data center, customers had to separately consider servers, networking, power, cooling, facility design, and operations systems, with many aspects relying on coordination between different vendors. What DSX aims to do is consolidate these disparate aspects into a single framework, providing customers with a set of referenceable and verifiable standard solutions for design, simulation, construction, and operation.

At the launch event, Jensen Huang stated: NVIDIA is not just selling chips; it is providing infrastructure builders with a complete blueprint for an AI factory.

The two most important new capabilities of DSX this time are:

First is DSX MaxLPS. It addresses the most practical problem for AI factories: given a fixed power budget, how to fit in more GPUs and generate more Tokens.

According to NVIDIA, MaxLPS combines liquid cooling with in-rack power optimization to let operators run up to 40% more GPUs without significantly impacting performance.
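
The fixed-power-budget arithmetic behind that claim can be sketched as follows. The 10 MW budget and the 2 kW provisioned per GPU are hypothetical numbers chosen for the example, and both functions are illustrative names; only the 40% uplift comes from NVIDIA's stated figure, which is an upper bound ("up to").

```python
def baseline_gpu_count(power_budget_w: int, provisioned_w_per_gpu: int) -> int:
    """GPUs that fit when each one is provisioned a fixed share of the power budget."""
    if provisioned_w_per_gpu <= 0:
        raise ValueError("provisioned power per GPU must be positive")
    return power_budget_w // provisioned_w_per_gpu


def gpus_with_uplift(baseline: int, uplift_percent: int = 40) -> int:
    """Apply a percentage uplift using integer math, rounding down."""
    return baseline * (100 + uplift_percent) // 100


# Hypothetical site: 10 MW budget, 2 kW provisioned per GPU (assumptions).
baseline = baseline_gpu_count(10_000_000, 2_000)
with_maxlps = gpus_with_uplift(baseline)
print(baseline, with_maxlps)
```

With these example inputs the same budget goes from 5,000 GPUs to at most 7,000, which is why the article frames MaxLPS as a way to generate more Tokens from fixed power.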

Second is DSX OS. It acts as the operating software for the AI factory, responsible for lifecycle management, intelligent scheduling, health monitoring, failure recovery, and multi-tenant management. Simply put, if an AI factory is a complex facility, DSX OS is responsible for keeping it running stably and continuously.

Within the DSX product matrix, Reference Design shows customers how to set up the facilities, racks, networks, power, and cooling systems of an AI factory; DSX Sim handles simulation, allowing customers to verify the feasibility of a design before building; DSX Flex connects the AI factory to the power grid, enabling data centers to adjust tasks based on electricity prices, load, and demand response signals; and DSX Exchange opens data interfaces between IT systems, operations systems, energy, and cooling systems.
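
The kind of grid-aware adjustment attributed to DSX Flex can be illustrated with a deliberately simple rule. This is not DSX Flex's interface or algorithm, which the article does not describe; `Job`, `select_jobs`, the price ceiling, and the job list are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """Hypothetical workload with a power draw and a flag for whether it can wait."""
    name: str
    power_kw: float
    deferrable: bool


def select_jobs(jobs: list[Job], price_per_kwh: float, price_ceiling: float,
                curtail_requested: bool) -> list[Job]:
    """Keep every job in normal conditions; when the price exceeds the ceiling
    or the grid requests curtailment, keep only jobs that cannot be deferred."""
    constrained = curtail_requested or price_per_kwh > price_ceiling
    return [job for job in jobs if not (constrained and job.deferrable)]


jobs = [
    Job("inference", power_kw=500.0, deferrable=False),
    Job("batch-training", power_kw=800.0, deferrable=True),
]

normal = select_jobs(jobs, price_per_kwh=0.08, price_ceiling=0.20, curtail_requested=False)
peak = select_jobs(jobs, price_per_kwh=0.35, price_ceiling=0.20, curtail_requested=False)
print([j.name for j in normal], [j.name for j in peak])
```

A production scheduler would weigh load forecasts and contractual terms as well; the sketch only shows how price and demand response signals can gate deferrable work.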

On the ecosystem side, cloud partners like CoreWeave, Crusoe, and Lambda are deploying DSX Sim, MaxLPS, and DSX OS to reduce risk and improve GPU utilization. Manufacturers like Dell, HPE, Lenovo, Supermicro, as well as ASUS, Foxconn, Gigabyte, and QCT, are building systems that support DSX.

Aligning with Windows and ARM

During his on-stage speech, Jensen Huang officially announced the "DGX Station for Windows" workstation, defined by NVIDIA as a desktop-grade AI supercomputer for the Windows ecosystem.

On the hardware side, it is powered by the GB300 Grace Blackwell Ultra Desktop Superchip, which connects the Blackwell Ultra GPU to a 72-core Grace CPU via NVLink-C2C and offers up to 748GB of unified memory, 20 PFLOPS of FP4 performance, and up to 800Gb/s of networking.

What matters most about this product is how it changes the way Agents are deployed.

NVIDIA hopes that enterprises can run multiple Agents locally, safely, and manageably within a Windows environment, integrating them into workflows for design, engineering, data science, reasoning, and Physical AI. The concurrently launched OpenShell handles Agent runtime security, using isolated sandboxes and system-level policy controls to restrict Agents from unauthorized actions or leaking credentials and private data.
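
The idea of policy-controlled Agent actions can be sketched with a simple allowlist check. This does not represent OpenShell's actual API or policy format, which the article does not describe; `Policy`, `authorize`, the tool names, and the secret markers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical runtime policy: which tools are allowed, which strings must never leave."""
    allowed_tools: frozenset[str]
    secret_markers: tuple[str, ...]


def authorize(policy: Policy, tool: str, argument: str) -> bool:
    """Allow a call only if the tool is on the allowlist and the argument
    contains none of the configured secret markers."""
    if tool not in policy.allowed_tools:
        return False
    return not any(marker in argument for marker in policy.secret_markers)


policy = Policy(
    allowed_tools=frozenset({"read_file", "run_tests"}),
    secret_markers=("API_KEY=", "BEGIN PRIVATE KEY"),
)

print(authorize(policy, "run_tests", "pytest -q"))
print(authorize(policy, "send_email", "hello"))
print(authorize(policy, "read_file", "API_KEY=abc123"))
```

A real sandbox would enforce such rules at the operating-system level rather than by string matching; the sketch only shows the decision an Agent's action has to pass before it runs.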

In addition to enterprise desktop products, Jensen Huang also unveiled the RTX Spark SoC, which integrates the N1X CPU and a Blackwell GPU on a single chip with a unified memory architecture and is designed specifically for thin-and-light laptops and small form-factor desktops.

The N1X is the first PC processor co-created by NVIDIA and Microsoft. Based on the Arm architecture, it was custom-designed by MediaTek and is manufactured on TSMC's 3nm process. It will debut this autumn in laptops from Microsoft, Dell, HP, ASUS, Lenovo, and MSI, with over 30 models at launch, targeting the high-end thin-and-light segment.

This is NVIDIA's "super chip" for the AI PC era, which Jensen Huang sees as a significant restructuring of the PC form factor.

The Agent's "Two Brains"

At this event, NVIDIA announced the latest progress on two core model product lines, corresponding to two scenarios for Agents: one running within enterprise systems, the other in the physical world.

NVIDIA launched Nemotron 3 Ultra, a 550-billion-parameter Mixture-of-Experts model designed to provide top-tier intelligence for long-running Agents in code development, scientific research, and enterprise business processes. Compared with comparable open-source frontier models, it offers up to 5x faster inference and up to 30% lower usage cost, enabling Agents to complete tasks more efficiently and cost-effectively.

Around the open Nemotron model, NVIDIA released a series of software, open-source models, and partnership updates, aiming to help enterprises build "digital colleagues" that can assist employees in scenarios like engineering design, healthcare, software development, and business operations.

Within this portfolio, Nemotron provides the base model capability, NemoClaw is responsible for organizing models into Agents, OpenShell handles runtime security, and the Agent Toolkit turns NVIDIA software libraries like CUDA-X into tools that Agents can directly call. Agents can use tools, call data, execute tasks, and interface with existing enterprise systems within a controlled environment.

Jensen Huang stated that global software companies are bringing AI Agents into real work systems, helping employees complete complex tasks faster. NemoClaw provides the open-source components needed to build long-running Agents, including capabilities for orchestration, context, memory, tool calling, and security controls.

In the past, enterprise discussions about AI focused more on what a model could answer. Now, NVIDIA is addressing how Agents can safely interface with tools, data, and business processes, and run continuously in real-world work.

Then there's Cosmos 3, the third generation of the Cosmos series, which also represents a restructuring at the architectural level.

Cosmos 3 is a world foundation model for Physical AI, providing underlying capabilities to "understand the physical world, predict what will happen, and decide what to do."

Previous Cosmos versions were aimed mainly at robotics and autonomous driving developers, for video generation and physical-world simulation, and were essentially a relatively single-modal generation framework. Cosmos 3 introduces a new architecture, a Hybrid Transformer, that for the first time unifies visual reasoning, world generation, and action prediction in a single system.

It natively understands and generates text, images, video, environmental sounds, and actions. It achieves leading physical accuracy and is the world's first fully open omnimodal model. NVIDIA claims it has the potential to compress the Physical AI training and evaluation cycle from months down to days.

Jensen Huang predicts that thanks to breakthroughs in multimodal reasoning, language, vision, and world models, a big bang in Physical AI is imminent.

The Cosmos 3 series of open frontier omnimodal models provides developers with a generational leap in capability to build robots, autonomous vehicles, and vision AI that can perceive, reason, plan, and act within the physical world.

Lowering the Barrier to Physical AI

NVIDIA and Unitree jointly released the H2 Plus, a humanoid robot development platform for researchers and developers.

"Development platform" means: Unitree provides the robot hardware, NVIDIA provides the software and computing platform. They pre-integrate the hardware and software, so development teams can start working on skill development immediately without spending time solving underlying integration issues. It is also the world's first open humanoid robot built on the NVIDIA Isaac GR00T development platform.

This platform targets a long-standing pain point in humanoid robot development: hardware integration, data collection, simulation, training, evaluation, and deployment are often siloed, making the entire process highly fragmented.

NVIDIA states that when a research team obtains a robot body, they often spend a significant amount of time on underlying assembly, pushing actual skill development further down the line. What the H2 Plus attempts to do is streamline this path, allowing research teams to bypass underlying integration and jump directly into skill development and real-world scenario validation.

In Jensen Huang's view, humanoid robots will bring Physical AI to the world's largest industries, unlocking a multi-trillion dollar economic opportunity. The H2 Plus is the starting point for moving cutting-edge research into real-world scenarios like factories, warehouses, and logistics systems.

Additionally, NVIDIA announced the official open-sourcing of a set of Physical AI Skills tools, covering core scenarios like robotics, autonomous driving, vision AI, and industrial digital twins.

These "Skills" can be understood as standardized ways of using NVIDIA platforms such as Cosmos, Omniverse, Isaac, and Metropolis, written as operational instructions that Agents can read and execute directly. The toolkit released this time packages these instructions as open source.

When an Agent receives a task, for example generating a batch of training data for defect detection, it knows which model to call, what format to output, and how to verify the results. The entire process runs automatically, without requiring humans to operate each step manually.
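
One way to picture such a Skill is as a small descriptor that bundles the model to call, the expected output format, and a verification step. The sketch below is an assumption about the general shape only: `Skill`, `run_skill`, `stub_generate`, and the model name `example-world-model` are hypothetical and do not come from NVIDIA's released toolkit.

```python
from dataclasses import dataclass
from typing import Callable

VALID_LABELS = {"ok", "defect"}


@dataclass(frozen=True)
class Skill:
    """Hypothetical skill descriptor: what to call, what to emit, how to check it."""
    name: str
    model: str
    output_format: str
    verify: Callable[[list[dict]], bool]


def stub_generate(model: str, output_format: str, count: int) -> list[dict]:
    """Stand-in generator: alternates 'ok' and 'defect' labels instead of calling a model."""
    return [
        {"model": model, "format": output_format, "label": "defect" if i % 2 else "ok"}
        for i in range(count)
    ]


def run_skill(skill: Skill, count: int) -> list[dict]:
    """Generate samples as the skill prescribes, then refuse to return unverified output."""
    samples = stub_generate(skill.model, skill.output_format, count)
    if not skill.verify(samples):
        raise ValueError(f"skill {skill.name!r} produced output that failed verification")
    return samples


defect_data_skill = Skill(
    name="defect-detection-data",
    model="example-world-model",  # hypothetical model name
    output_format="jsonl",
    verify=lambda samples: bool(samples) and all(s["label"] in VALID_LABELS for s in samples),
)

samples = run_skill(defect_data_skill, count=4)
print([s["label"] for s in samples])
```

Keeping verification inside the descriptor is what lets the process run unattended: the Agent does not need a human to confirm each batch before moving on.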