News

Together AI
together.ai > models > wan-27

Wan 2.7 API

7+ hour, 25+ min ago  (255+ words) FlashAttention-4: up to 1.3× faster than cuDNN on NVIDIA Blackwell | Introducing Together AI's new look | ATLAS: runtime-learning accelerators delivering up to 4x faster LLM inference | Together GPU Clusters: self-service NVIDIA GPUs, now generally available | Batch Inference API: Process billions of tokens at…

Together AI
together.ai > models > parakeet-tdt-0-6b-v3

NVIDIA Parakeet TDT 0.6B v3 API

1+ day, 10+ hour ago  (173+ words)

Together AI
together.ai > ai-engineer-europe-2026

AI Engineer Europe 2026

2+ day, 22+ hour ago  (208+ words)

Together AI
together.ai > models > aura-2

Deepgram Aura-2 API

1+ week, 1+ day ago  (228+ words)

Together AI
together.ai > research-blog

Research Blog

1+ week, 1+ day ago  (201+ words)

Together AI
together.ai > models > gemini-31-flash-image

Gemini 3.1 Flash Image (Nano Banana 2) API

1+ week, 1+ day ago  (256+ words)

Together AI
together.ai > blog > together-ai-at-nvidia-gtc-2026

Together AI at NVIDIA GTC 2026: Explore our latest innovations across research and products

2+ week, 2+ day ago  (660+ words)

Together AI
together.ai > blog > flashattention-4

FlashAttention-4: Algorithm and Kernel Pipelining Co-Design for Asymmetric Hardware Scaling

4+ week, 19+ hour ago  (1319+ words) Introducing Together AI's new look | For founders and builders defining the AI-native era. Register now | ATLAS: runtime-learning accelerators delivering up to 4x faster LLM inference | Together Instant Clusters: self-service NVIDIA GPUs, now generally available | Batch Inference API: Process billions of tokens…

Together AI
together.ai > models

Build with leading AI models

4+ week, 2+ day ago  (275+ words)

Together AI
together.ai > blog

Blog

4+ week, 2+ day ago  (1542+ words)