What New Stories Are Packed Into Jensen Huang's "Agent Factory"?

Friends of Mr. Planet (星球君的朋友们)

Odaily senior author

2026-06-01 12:00

When Agents become AI infrastructure, NVIDIA can be present at every layer.

AI Summary

Key takeaway: At COMPUTEX 2026, NVIDIA announced a complete technology stack, from chips and models to robotics platforms, built around "Agent AI." The Vera Rubin platform is in mass production and optimized specifically for Agent workloads, with the aim of moving AI factories from the infrastructure build-out phase into a new era of operation and deployment.
Key points:
1. The Vera Rubin platform is designed specifically for Agents and is in mass production. It delivers 10 times the processing efficiency of the previous Grace Blackwell generation and, for the first time, introduces CPO networking and confidential computing at scale.
2. The company launched the Vera CPU, designed for the AI era, which handles Agent tasks 1.8 times as fast as contemporary x86 servers and is now in full production.
3. It introduced DSX, an operating system for AI factories that includes MaxLPS (power optimization) and DSX OS (operations management), with the goal of standardizing how AI data centers are built and run.
4. It released Nemotron 3 Ultra, a 550-billion-parameter mixture-of-experts model, and NemoClaw, an open-source Agent framework, for building enterprise-grade "digital colleagues."
5. It released Cosmos 3, its third-generation physical AI model, which unifies visual reasoning, generation, and action prediction and aims to shorten physical AI training cycles from months to days.
6. Together with Unitree, it launched the H2 Plus humanoid robot reference design and open-sourced a physical AI "skills" toolkit, lowering the barrier to robot development.
7. Vera BlueField-4 STX upgrades storage security, using chip-level policy enforcement to secure interactions between Agents and enterprise data.

Original authors: Li Hailun, Su Yang

Original editor: Xu Qingyang

Original source: Tencent Technology

On June 1, 2026, at the NVIDIA GTC Taipei conference held during COMPUTEX 2026, NVIDIA founder and CEO Jensen Huang delivered a keynote speech.

It has been only three months since the last GTC.

At that time, NVIDIA released the "full chip family" of Vera Rubin: Vera CPU, Rubin GPU, Groq 3 LPU, ConnectX-9, BlueField-4 DPU, and Spectrum-6 switch. These six chips form a rack-scale AI supercomputer. NVIDIA said then that the number of GPUs required to train large MoE models had been cut to a quarter, inference throughput per watt had increased 10 times, and the cost per token had dropped to one-tenth.

Unlike the previous emphasis on system-level solutions like the "chip family" and "computing family," at COMPUTEX three months later, Jensen Huang focused on the target these infrastructures will serve — Agents.

Jensen Huang revealed in his speech: Vera Rubin has officially entered mass production, Vera CPU has begun global delivery, DGX Station has entered enterprise desktops for the first time in a Windows form, Cosmos 3 has restructured the perception framework for physical AI, and DSX has become the operating system for AI factories. NVIDIA also partnered with Unitree to release the H2 Plus — the first humanoid robot reference design based on Isaac GR00T, extending the boundaries of Agents from the digital world to physical forms.

NVIDIA is reorganizing a complete technical system from chips, data centers, models, software to robotic platforms around the Agent ecosystem.

Jensen Huang said: "The era of Agent AI and practical artificial intelligence has arrived. Now, tokens are the unit of profit, AI is a GDP 'generator,' and the number of software engineers is increasing. People talk about AI reducing jobs, which is complete nonsense. In reality, more software engineers are being hired."

The Same AI Factory, Running 10x More Agent Tasks

The Vera Rubin platform has entered full production.

Unlike earlier platforms, which focused mainly on large model training and inference, Vera Rubin has treated Agents as a key workload from the design phase.

Jensen Huang stated in his speech that an Agent task is often not just a single model inference, but includes multiple steps such as reasoning, search, tool calling, code execution, and result verification, potentially involving thousands of steps behind the scenes. Future data centers will need to handle not just single model requests, but a large number of continuously running, collaborating Agent tasks.
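The multi-step structure described here can be illustrated with a minimal Python sketch. It is not NVIDIA's implementation: the function `run_agent`, the tool names, and the stub tools are all hypothetical, and the point is only to show how one task turns into many reason, search, tool-call, execute, and verify steps.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AgentRun:
    """Records every step taken while working on one task."""
    task: str
    steps: List[str] = field(default_factory=list)


def run_agent(task: str, tools: Dict[str, Callable[[str], str]],
              verify: Callable[[str], bool], max_rounds: int = 5) -> AgentRun:
    """Repeat reason -> search -> tool call -> execute -> verify until verified."""
    run = AgentRun(task)
    for round_no in range(1, max_rounds + 1):
        run.steps.append(f"round {round_no}: reason")     # plan the next action
        context = tools["search"](task)                   # retrieval step
        run.steps.append(f"round {round_no}: search")
        draft = tools["write_code"](context)              # tool call
        run.steps.append(f"round {round_no}: tool_call")
        result = tools["execute"](draft)                  # code execution
        run.steps.append(f"round {round_no}: execute")
        run.steps.append(f"round {round_no}: verify")
        if verify(result):                                # result verification
            break
    return run


# Stub tools (all hypothetical) so the sketch runs without a model or network.
attempts = {"n": 0}


def fake_execute(code: str) -> str:
    """Pretend the generated code only succeeds on the third attempt."""
    attempts["n"] += 1
    return "ok" if attempts["n"] >= 3 else "error"


tools = {
    "search": lambda query: f"docs for {query}",
    "write_code": lambda ctx: f"print('{ctx}')",
    "execute": fake_execute,
}

run = run_agent("fix failing unit test", tools, verify=lambda r: r == "ok")
print(len(run.steps))  # 15: three rounds of five steps each
```

Even this toy task needs 15 steps for a single request, which is the sense in which a data center serving Agents handles far more work per request than one serving single model calls.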

The platform is defined as a massive AI supercomputer that operates as a single, unified computing unit, purpose-built for agent workloads from reasoning and retrieval to tool use. In a hyperscale data center of the same size, running autonomous AI agent tasks on the new Vera Rubin platform is 10 times more efficient than on the previous Grace Blackwell platform.

Beyond the computing platform itself, networking is also a key upgrade focus for Vera Rubin.

In past data centers, data transmission between GPUs mainly relied on traditional optical modules and switch architectures. However, as cluster sizes continue to expand, power consumption, heat dissipation, and deployment complexity increase rapidly. To address this, NVIDIA introduced the Spectrum-X Ethernet Photonics networking system in the Vera Rubin platform.

This is the first time NVIDIA has introduced co-packaged optics (CPO) technology into AI data center networks at scale.

Simply put, traditional solutions require plugging optical modules outside the switch, whereas CPO directly integrates optical components inside the switch, thereby reducing energy consumption and signal loss.

Furthermore, security is a core capability heavily emphasized in this Vera Rubin platform.

To this end, NVIDIA has extended Confidential Computing capabilities to the entire Vera Rubin platform. Through trusted execution environments, hardware-level verification, and end-to-end encryption mechanisms, enterprises can achieve a higher level of security when processing private data, sensitive industry information, and key models.

Jensen Huang revealed that Vera Rubin has entered mass production. As the third-generation MGX rack-scale system, it involves over 150 partners, more than 350 factories, and a supply chain covering over 30 countries and regions. According to NVIDIA's announced plan, Vera Rubin will begin formal shipments this autumn.

A Processor "Born for Agents"

NVIDIA has launched a new processor, Vera, designed specifically for the agent era, and it has entered full production.

Jensen Huang pointed out that advancements in memory systems will drive innovation and modernization in storage systems. All CPUs up to now have been designed for humans, but Vera is a CPU designed for the AI era, built for agents.

As the successor to Grace, Vera adopts NVIDIA’s custom-designed "Olympus" CPU core architecture, increasing the number of cores from 72 to 88, and significantly improving memory and data processing capabilities. According to NVIDIA, in tests related to Agent workflows, Vera achieves 1.8 times the task execution speed of comparable x86 server CPUs.

More important than the pure performance boost is the change in the relationship between Vera and the Rubin GPU: Vera connects to the Rubin GPU via second-generation NVLink-C2C, achieving an interconnect bandwidth of 1.8TB/s, further reducing the overhead of data transfer between CPU and GPU during Agent operation.
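To give a rough sense of scale for the 1.8TB/s figure, the short Python calculation below estimates an ideal transfer time. The 90 GB payload is a hypothetical example, and the result assumes the full quoted bandwidth is sustained with no protocol overhead, which real transfers will not achieve exactly.

```python
def transfer_time_ms(payload_gb: float, bandwidth_tb_per_s: float) -> float:
    """Ideal time, in milliseconds, to move payload_gb gigabytes at the given bandwidth."""
    payload_bytes = payload_gb * 1e9
    bandwidth_bytes_per_s = bandwidth_tb_per_s * 1e12
    return payload_bytes / bandwidth_bytes_per_s * 1e3


NVLINK_C2C_TB_S = 1.8   # interconnect bandwidth quoted in the article
PAYLOAD_GB = 90.0       # hypothetical: agent state staged in CPU memory

print(round(transfer_time_ms(PAYLOAD_GB, NVLINK_C2C_TB_S), 1))  # 50.0
```

Under these assumptions, moving 90 GB between CPU and GPU memory takes about 50 milliseconds.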

Jensen Huang stated that Vera Rubin uses HBM (High Bandwidth Memory) from Micron, SK Hynix, and Samsung, with a supply chain volume "double" that of the previous-generation Blackwell. Deployment is also faster: a large Blackwell rack takes two hours to deploy, whereas Vera Rubin has compressed that to a matter of minutes.

Moving AI Factories from "Construction" to "Operation"

The DSX introduced by NVIDIA this time can be understood as an "AI factory construction and operation toolkit."

In the past, building an AI data center required customers to consider servers, networking, power, cooling, facility design, and operation & maintenance systems separately, with many parts relying on coordination between different vendors. What DSX aims to do is bring these previously fragmented components into a single framework, providing customers with a referenceable and verifiable standard solution from design, simulation, and construction to operation.

Jensen Huang stated at the launch: "NVIDIA is not just selling chips, but providing infrastructure builders with a complete blueprint for an AI factory."

Two new DSX capabilities stand out this time.

The first is DSX MaxLPS. It addresses the most practical problem for AI factories: how to fit more GPUs and generate more Tokens when the power budget is fixed.

According to NVIDIA, MaxLPS, combined with liquid cooling and in-rack power optimization, allows operators to run up to 40% more GPUs without significantly impacting performance.
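The trade-off behind that claim can be made concrete with simple arithmetic. The article does not describe how MaxLPS works internally, so the Python sketch below only shows what a fixed power budget implies: the 10 MW budget and 2 kW per GPU are hypothetical figures, and only the 40% uplift comes from the text.

```python
def gpus_within_budget(budget_kw: float, per_gpu_kw: float) -> int:
    """Number of GPUs a fixed power budget supports at a given per-GPU provision."""
    return int(budget_kw // per_gpu_kw)


BUDGET_KW = 10_000.0    # hypothetical 10 MW power budget for GPUs
PER_GPU_KW = 2.0        # hypothetical power provisioned per GPU
UPLIFT_PERCENT = 40     # "up to 40% more GPUs" from the article

baseline = gpus_within_budget(BUDGET_KW, PER_GPU_KW)
expanded = baseline * (100 + UPLIFT_PERCENT) // 100   # integer math avoids float rounding
avg_kw_per_gpu = BUDGET_KW / expanded                 # average power left per GPU

print(baseline, expanded, round(avg_kw_per_gpu, 3))  # 5000 7000 1.429
```

In other words, fitting 40% more GPUs under the same budget leaves each GPU about 71% of its original average power allocation, so the claim rests on GPUs not needing their full provisioned power at the same time.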

The second is DSX OS. It acts as the operational software for the AI factory, responsible for lifecycle management, intelligent scheduling, health monitoring, fault recovery, multi-tenancy management, and more. Simply put, if an AI factory is a complex plant, DSX OS is responsible for ensuring its continuous and stable operation.

Within the DSX product matrix, the Reference Design provides AI factory reference designs, telling customers how to set up the facility, racks, networks, power, and cooling systems; DSX Sim handles simulation, allowing customers to verify design feasibility before construction; DSX Flex connects the AI factory to the power grid, enabling the data center to adjust tasks based on electricity prices, load, and demand response signals; DSX Exchange is responsible for opening data interfaces between IT systems, operational systems, energy, and cooling systems.
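The idea of adjusting tasks to electricity prices can be sketched as follows. This is not the DSX Flex API: `Job`, `plan_jobs`, the price table, and the price cap are all hypothetical, and the sketch covers only the price signal, not load or demand response events.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Job:
    name: str
    deferrable: bool   # True if the job can wait for cheaper power


def plan_jobs(jobs: List[Job], price_by_hour: Dict[int, float],
              current_hour: int, price_cap: float) -> Dict[str, int]:
    """Assign each job a start hour: urgent jobs run now, deferrable jobs
    move to the cheapest hour when the current price exceeds price_cap."""
    cheapest_hour = min(price_by_hour, key=price_by_hour.get)
    too_expensive = price_by_hour[current_hour] > price_cap
    plan = {}
    for job in jobs:
        defer = job.deferrable and too_expensive
        plan[job.name] = cheapest_hour if defer else current_hour
    return plan


# Hypothetical price signal in dollars per kWh, keyed by hour of day.
prices = {14: 0.30, 20: 0.18, 2: 0.07}
jobs = [Job("inference-serving", deferrable=False),
        Job("batch-finetune", deferrable=True)]

plan = plan_jobs(jobs, prices, current_hour=14, price_cap=0.20)
print(plan)  # {'inference-serving': 14, 'batch-finetune': 2}
```

Latency-sensitive serving stays where it is, while the batch job shifts to the cheapest hour.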

On the ecosystem front, cloud partners like CoreWeave, Crusoe, and Lambda are deploying DSX Sim, MaxLPS, and DSX OS to reduce risk and improve GPU utilization. Manufacturers like Dell, HPE, Lenovo, Supermicro, as well as ASUS, Foxconn, Gigabyte, and Quanta Cloud Technology, are building systems that support DSX.

Aligning with Windows and ARM

During the live speech, Jensen Huang officially announced the debut of the "DGX Station for Windows" workstation, defined by NVIDIA as a desktop-class AI supercomputer for the Windows ecosystem.

In terms of hardware, it is equipped with the GB300 Grace Blackwell Ultra Desktop Superchip, connecting the Blackwell Ultra GPU with a 72-core Grace CPU via NVLink-C2C, offering up to 748GB of unified memory and 20 PFLOPS FP4 performance, along with networking capabilities up to 800Gb/s.

The key point of this product lies in the change in Agent deployment methods.

NVIDIA hopes that enterprises can run multiple Agents locally, securely, and manageably within a Windows environment, and integrate them into workflows for design, engineering, data science, reasoning, and Physical AI. The simultaneously launched OpenShell is responsible for Agent runtime security, using isolated sandboxes and system-level policy controls to prevent Agent privilege escalation or leakage of credentials and private data.
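The kind of policy control described here can be illustrated with a generic allow-list check. The sketch below is not OpenShell's interface, which the article does not document: `SandboxPolicy`, `check_call`, and every tool name and path are hypothetical.

```python
import posixpath
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class SandboxPolicy:
    allowed_tools: FrozenSet[str]      # tools the agent may invoke
    allowed_roots: Tuple[str, ...]     # directories the agent may touch
    secret_env_vars: FrozenSet[str]    # variables never exposed to the agent


def check_call(policy: SandboxPolicy, tool: str,
               path: str = "", env_var: str = "") -> bool:
    """Return True only if the requested action stays inside the policy."""
    if tool not in policy.allowed_tools:
        return False                              # unknown tools are denied
    if env_var and env_var in policy.secret_env_vars:
        return False                              # block credential reads
    if path:
        resolved = posixpath.normpath(path)       # collapse ".." before comparing
        return any(resolved == root or resolved.startswith(root + "/")
                   for root in policy.allowed_roots)
    return True


policy = SandboxPolicy(
    allowed_tools=frozenset({"read_file", "run_tests"}),
    allowed_roots=("/workspace",),
    secret_env_vars=frozenset({"AWS_SECRET_ACCESS_KEY"}),
)

print(check_call(policy, "read_file", path="/workspace/src/main.py"))    # True
print(check_call(policy, "read_file", path="/workspace/../etc/passwd"))  # False
print(check_call(policy, "sudo"))                                        # False
```

A production sandbox would enforce such rules at the operating-system level rather than in application code, but the decision logic follows the same shape.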

In addition to products for the enterprise desktop, Jensen Huang also unveiled a system-on-a-chip (SoC) — the RTX Spark SoC, integrating the N1X CPU and Blackwell GPU onto a single chip with a unified memory architecture, specifically for thin-and-light laptops and small desktops.

Among these, the N1X is NVIDIA's first PC processor co-developed with Microsoft. It is based on the Arm architecture, custom-designed by MediaTek, and manufactured using TSMC's 3nm process. It will first be featured this autumn in laptops from Microsoft, Dell, HP, ASUS, Lenovo, and MSI, with over 30 models initially, targeting high-end thin-and-light devices.

This is NVIDIA's "super chip" prepared for the AI PC era, which Jensen Huang sees as a major redefinition of the PC form factor.

Agent's "Two Brains"

At this conference, NVIDIA announced the latest progress on two core model product lines, corresponding to two scenarios for Agents: one running within enterprise systems, and the other running in the physical world.

NVIDIA released a 550-billion-parameter Mixture-of-Experts model, Nemotron 3 Ultra, designed to provide top-tier intelligence for long-duration agents in code development, scientific research, and enterprise business processes. Compared to leading open-source frontier models of similar scale, this model offers up to 5x faster inference speed and up to 30% lower usage costs, helping agents complete various tasks more efficiently and cost-effectively.

Around the open Nemotron model, NVIDIA released a series of software, open-source models, and partnership updates, aiming to enable enterprises to build "digital colleagues" capable of assisting employees in scenarios like engineering design, healthcare, software development, and business operations.

Within this suite, Nemotron provides foundational model capabilities, NemoClaw is responsible for organizing models into Agents, OpenShell handles runtime security, and the Agent Toolkit transforms NVIDIA software libraries like CUDA-X into tools that Agents can directly call. Agents can use tools, call data, execute tasks in a controlled environment, and interface with existing enterprise systems.

Jensen Huang stated that global software companies are integrating AI Agents into real work systems, helping employees complete complex tasks faster. NemoClaw provides the open components needed to build long-running Agents, including capabilities for orchestration, context, memory, tool calling, and security control.

In the past, enterprise discussions about AI focused more on what models could answer; now, NVIDIA needs to solve how Agents can safely access tools, data, and business processes, and operate continuously in real work environments.

Then there is Cosmos 3, officially released as the third generation of the Cosmos series and marking a significant architectural overhaul.

Cosmos 3 is a world foundation model for physical AI, providing the underlying ability to "understand the physical world, predict what will happen, and decide what to do."

Previous Cosmos versions were aimed primarily at robotics and autonomous driving developers and focused on video generation and physical world simulation, which made them essentially single-modal generative frameworks. Cosmos 3 uses a different architecture, a Hybrid Transformer, and for the first time unifies visual reasoning, world generation, and action prediction in a single system.

It can natively understand and generate text, images, video, ambient sounds, and actions, achieving a leading level of physical accuracy. It is the world's first fully open, all-capable model of its kind. NVIDIA claims it has the potential to compress the training and evaluation cycle for physical AI from months to days.

Jensen Huang predicted that, thanks to breakthroughs in multimodal reasoning, language, vision, and world models, the big bang of physical AI is imminent.

The open frontier all-capable models in the Cosmos 3 series provide developers with a generational leap in capability to build robots, autonomous vehicles, and vision AI capable of perceiving, reasoning, planning, and acting in the physical world.

Lowering the Barrier to Physical AI

NVIDIA, in partnership with Unitree, released the H2 Plus — a humanoid robot reference platform for researchers and developers.

"Reference platform" means Unitree provides the robot hardware body, while NVIDIA provides the software and computing platform. The two sides pre-integrate the hardware and software, allowing development teams to start skill development directly without spending time solving underlying interface issues. It is also the world's first open humanoid robot built on the NVIDIA Isaac GR00T development platform.

This reference platform targets a long-standing pain point in humanoid robot development: hardware integration, data collection, simulation, training, evaluation, and deployment are all separate silos, making the entire process highly fragmented.

NVIDIA stated that research teams often spend a significant amount of time patching together low-level components upon receiving a robot body, pushing actual skill development further back. What the H2 Plus attempts to do is streamline this path, allowing research teams to bypass underlying integration and proceed directly to skill development and real-world scenario validation.

In Jensen Huang's view, humanoid robots will bring physical AI to the world's largest industries, unlocking trillions of dollars in economic opportunities. The H2 Plus is the starting point for moving cutting-edge research into real-world scenarios like factories, warehouses, and logistics systems.

Furthermore, NVIDIA announced the official open-sourcing of a set of Physical AI Skills toolkits, covering core scenarios like robotics, autonomous driving, vision AI, and industrial digital twins.

"Skills" can be understood as NVIDIA standardizing how to use its platforms like Cosmos, Omniverse, Isaac, and Metropolis, then writing them into operation instructions that agents can directly read and execute. These packaged and open-sourced instructions constitute the released toolkit.

When an agent receives a task, such as generating training data for detecting defects, it knows which model to call, what format to output, and how to validate the results. The entire process runs automatically, without requiring manual steps for each stage.
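One way to picture such a skill is as a small machine-readable record that names the model, the output format, and the validation rule. The Python sketch below is purely illustrative: the `Skill` structure, the model name, the format label, and the stub model call are hypothetical and do not reflect the format of NVIDIA's released toolkit.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Skill:
    name: str
    model: str                          # which model to call
    output_format: str                  # what format to produce
    validate: Callable[[dict], bool]    # how to check the result


def run_skill(skill: Skill, call_model: Callable[[str, str], dict],
              request: str) -> dict:
    """Run one skill end to end: call the named model, then validate its output."""
    result = call_model(skill.model, request)
    if result.get("format") != skill.output_format or not skill.validate(result):
        raise ValueError(f"skill {skill.name!r} produced invalid output")
    return result


def fake_call_model(model: str, request: str) -> dict:
    """Stub standing in for a real model endpoint (hypothetical)."""
    return {"model": model, "format": "coco-json", "images": 100}


defect_data_skill = Skill(
    name="defect-training-data",
    model="world-model-placeholder",            # hypothetical model identifier
    output_format="coco-json",                  # hypothetical format label
    validate=lambda result: result["images"] > 0,
)

result = run_skill(defect_data_skill, fake_call_model,
                   "generate images of scratched metal parts")
print(result["images"])  # 100
```

Because the model choice, output format, and validation rule live in the skill itself, the agent can run the whole pipeline without a person specifying each stage.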