NVIDIA Launches Nemotron 3 Nano Omni Model, Boosting Multimodal Inference Throughput by Up to 9x
2026-04-28 16:07
NVIDIA announced on X today that it has launched Nemotron 3 Nano Omni, an open-source multimodal model. The model uses a 30B-A3B mixture-of-experts (MoE) architecture (roughly 30 billion total parameters, of which about 3 billion are active per token), supports a 256K context window, and handles video, audio, image, and text inputs within a single model. Compared with other open-source omni-modal models at a comparable level of interactivity, it delivers up to 9x higher throughput, which lowers inference costs and improves scalability. Nemotron 3 Nano Omni is available now on Hugging Face, OpenRouter, and NVIDIA NIM, and has already been adopted by enterprises including Aible, Applied Scientific Intelligence, and H Company.
