NVIDIA released Nemotron 3.5 Content Safety, a 4B-parameter model that combines multimodal input evaluation, multilingual reach across 140 languages, custom enterprise policy enforcement, and auditable reasoning traces in one inference call. The model addresses critical gaps in production AI safety pipelines.
NVIDIA released Nemotron 3 Ultra, a 550-billion-parameter open-weight model optimized for long-running agents. While it's the fastest among U.S. open-weight models and promises 30% cost savings, it still lags behind Chinese competitors and GPT-5.5 on core benchmarks.
NVIDIA's research shows that synthetic training data structured around task families, rather than raw scale, drives targeted capability gains. The approach improved scientific reasoning by 11 points while keeping math and code performance stable.
NVIDIA ships Cosmos 3, an omni-model that unifies world generation, physical reasoning, and action prediction in a single architecture. The release marks a shift from separate specialized models to one foundation model for robotics, autonomous vehicles, and smart spaces.
NVIDIA's new diffusion language models generate multiple tokens in parallel, hitting 865 tokens/second on B200 hardware — roughly 6× faster than traditional autoregressive models. Unlike GPT-style generation that produces one token at a time, these models draft and refine text blocks simultaneously while maintaining accuracy.