Coverage of the tools, platforms, and developer ecosystems shaping how AI is built and deployed.
Tenstorrent is releasing the QuietBox 2, a desktop AI workstation priced at **$9,999** that packs four custom Blackhole AI accelerators and **128 GB of GDDR6 memory** into a PC-sized case drawing only **1,400 W** — safely within a standard home circuit. Slated for Q2 2026, the machine runs Meta's Llama 3.1 70B at nearly 500 tokens per second and targets developers who want local AI inference without relying on cloud servers, competing directly with Nvidia's DGX Station, which starts at roughly **$85,000**.


Amazon Web Services has published a technical guide detailing how to build a text-to-SQL pipeline using Amazon Bedrock, the company's managed generative AI service. The solution translates plain-language business questions into executable database queries, aiming to reduce the technical barrier between non-technical users and enterprise data. The guide targets developers building internal data tools and analytics interfaces on AWS infrastructure.
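The guide's exact pipeline isn't reproduced in the blurb, but the core step of any text-to-SQL system is assembling a schema-aware prompt for the model. A minimal sketch of that step — table names, columns, and prompt wording are illustrative, not taken from the AWS guide:

```python
# Hypothetical sketch: building a schema-aware text-to-SQL prompt.
# The schema and wording below are examples, not from the AWS guide.

def build_text_to_sql_prompt(question: str, schema: dict[str, list[str]]) -> str:
    """Give the model the table definitions plus the user's
    natural-language question, and ask for a single SQL query."""
    schema_block = "\n".join(
        f"CREATE TABLE {table} ({', '.join(columns)});"
        for table, columns in schema.items()
    )
    return (
        "You are a SQL assistant. Given the schema below, answer the "
        "question with one valid SQL query and nothing else.\n\n"
        f"Schema:\n{schema_block}\n\n"
        f"Question: {question}\nSQL:"
    )

prompt = build_text_to_sql_prompt(
    "How many orders were placed last month?",
    {"orders": ["id INT", "customer_id INT", "placed_at TIMESTAMP"]},
)
```

In the full pipeline, a prompt like this would be sent to a Bedrock-hosted model (for example via boto3's `bedrock-runtime` client), and the returned SQL would be validated before being run against the database.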

Amazon Web Services has published a technical walkthrough demonstrating how developers can use **Amazon Nova Sonic** to generate automated, real-time conversational podcasts featuring two distinct AI hosts. The guide covers Nova Sonic's streaming audio capabilities, stage-aware content filtering, and live audio generation — offering a practical blueprint for teams looking to automate audio content production at scale.
Hugging Face has introduced a new server mode for Gradio that decouples the backend API from its default frontend, allowing developers to connect any custom interface to Gradio's underlying AI infrastructure. The update, announced on the Hugging Face Blog, gives teams greater flexibility to build polished, production-grade applications without being locked into Gradio's built-in UI components.

Amazon Web Services has published a technical blueprint for building a hybrid retrieval-augmented generation (RAG) system using Amazon Bedrock, Amazon Bedrock AgentCore, Strands Agents, and Amazon OpenSearch. The architecture combines semantic vector search with traditional keyword-based search to improve answer accuracy in generative AI assistants. The guidance targets developers building enterprise search and question-answering applications on AWS infrastructure.
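Hybrid retrieval needs a way to merge the two ranked result lists. One common technique is reciprocal rank fusion (RRF); a standalone sketch, with document IDs and the `k` constant chosen purely for illustration:

```python
# Illustrative sketch of reciprocal rank fusion (RRF), one common way
# to merge semantic (vector) and keyword result lists in a hybrid RAG
# system. Document IDs and k are examples, not from the AWS blueprint.

def reciprocal_rank_fusion(result_lists: list[list[str]], k: int = 60) -> list[str]:
    """Score each document as the sum of 1/(k + rank) over every list
    it appears in, then return documents ordered best-first."""
    scores: dict[str, float] = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["doc3", "doc1", "doc7"]   # semantic-search order
keyword_hits = ["doc1", "doc7", "doc2"]  # keyword (BM25) order
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
# doc1 ranks high in both lists, so it comes first after fusion
```

Documents that score well under both retrieval modes rise to the top, which is why hybrid search tends to improve answer grounding over either mode alone.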

Amazon Web Services has published a technical walkthrough detailing how to fine-tune **Qwen 2.5 7B Instruct** for agentic tool calling using Reinforcement Learning from Verifiable Rewards (RLVR) inside **Amazon SageMaker AI**'s serverless model customization pipeline. The guide covers dataset preparation across three agent behaviors, tiered reward function design, training configuration, and deployment — offering developers a structured path to building more reliable AI agents without managing dedicated training infrastructure.
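The blurb doesn't spell out the guide's reward design, but in RLVR the reward is computed by programmatically checking the model's output. A hypothetical tiered reward for tool calling — tier values and checks are illustrative, not from the AWS walkthrough — could look like:

```python
import json

# Hypothetical tiered reward for agentic tool calling under RLVR.
# Partial credit for parseable output, more for picking the right
# tool, full reward only when the arguments also match. The tier
# values are examples, not from the AWS guide.

def tool_call_reward(model_output: str, expected_tool: str,
                     expected_args: dict) -> float:
    try:
        call = json.loads(model_output)          # tier 1: valid JSON at all
    except json.JSONDecodeError:
        return 0.0
    if call.get("tool") != expected_tool:        # tier 2: correct tool chosen
        return 0.2
    if call.get("arguments") != expected_args:   # tier 3: correct arguments
        return 0.6
    return 1.0

reward = tool_call_reward(
    '{"tool": "get_weather", "arguments": {"city": "Paris"}}',
    expected_tool="get_weather",
    expected_args={"city": "Paris"},
)
# reward == 1.0
```

Because every tier is verifiable by code rather than by a learned reward model, this style of reward is what makes the "verifiable rewards" in RLVR tractable at training scale.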

Amazon Web Services has published a technical guide detailing how organizations can build AI-powered employee onboarding agents using **Amazon Quick**. The blueprint shows developers how to configure agents that connect to existing HR systems, answer new-hire questions, and track document completion — automating one of the most administratively intensive processes in enterprise HR.

Amazon Web Services has detailed how developers can connect OAuth-protected Model Context Protocol (MCP) servers to its Bedrock AgentCore Gateway using the Authorization Code flow. The feature gives organizations a centralized layer for managing how AI agents authenticate and connect to external tools and MCP servers, addressing a key security gap in enterprise agent deployments.
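The first step of the Authorization Code flow is redirecting the user to the provider's authorize endpoint. A generic stdlib sketch of constructing that URL — the endpoint, client ID, and scope are placeholders, not AgentCore Gateway specifics:

```python
import secrets
from urllib.parse import urlencode

# Generic sketch of step one of the OAuth 2.0 Authorization Code flow:
# building the authorization URL the user is redirected to. Endpoint,
# client_id, and scope are placeholders, not Bedrock AgentCore values.

def build_authorize_url(auth_endpoint: str, client_id: str,
                        redirect_uri: str, scope: str) -> tuple[str, str]:
    """Return the authorization URL plus the random state token that
    the callback handler must verify to prevent CSRF."""
    state = secrets.token_urlsafe(16)
    params = urlencode({
        "response_type": "code",   # marks this as the Authorization Code flow
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    })
    return f"{auth_endpoint}?{params}", state

url, state = build_authorize_url(
    "https://auth.example.com/oauth2/authorize",
    client_id="my-mcp-client",
    redirect_uri="https://gateway.example.com/callback",
    scope="mcp.read",
)
```

After the user approves, the provider redirects back with a one-time `code`, which the gateway exchanges server-side (with its client secret) for an access token it then presents when calling the protected MCP server.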
Cursor, the AI coding startup, has launched a next-generation agent experience for its platform, placing it in direct competition with Anthropic's Claude Code and OpenAI's Codex. The move marks a significant escalation in the agentic coding space, where the tools developers use to write software are increasingly built by the same companies that supply the underlying AI models.
OpenAI has extended its Responses API with a hosted computer environment — combining shell access, file handling, and stateful containers — enabling developers to build and deploy autonomous agents that can execute code, manage files, and persist state between steps. The update moves the Responses API from a model-querying interface toward a complete agent runtime, reducing the infrastructure burden on developers building multi-step AI workflows.

Anthropic has launched an auto mode for Claude Code, its agentic coding tool, designed to intercept potentially dangerous actions — such as deleting files or leaking sensitive data — before they execute. Anthropic positions the feature as a middle ground between micromanaging the AI and granting it unchecked autonomy, targeting developers who want hands-off assistance without sacrificing safety.

Google has released Gemini 3.1 Flash-Lite, described by the company as its fastest and most cost-efficient model in the Gemini 3 series. The release targets developers and enterprises running AI at scale, where inference speed and cost-per-token are primary constraints. Details on pricing and benchmark performance were not disclosed in the initial announcement.

Google has launched Canvas inside AI Mode in Google Search, making it available to all U.S. users. The feature lets users draft documents and build interactive tools — such as quizzes, calculators, or simple apps — without leaving the Search interface. The expansion marks a significant shift in how Google positions Search: less as a lookup tool, more as a creation environment.