· 001 · blog · 6 min read

GPT-5.4 GDPVal 83% · 1M Token Context Window

GPT-5.4 GDPVal 83% · 1M Token Context Window

Published: March 26, 2026 06:00 (Asia/Shanghai)
Coverage: 2026-03-25 18:00 — 2026-03-26 06:00


📰 Top Stories

1. 🚀 OpenAI GPT-5.4 Sets New Benchmarks: 83% on GDPVal, 1M Token Context Window

Source: TechCrunch / OpenAI
Time: ~18 hours ago

OpenAI’s GPT-5.4, released earlier this month, is now showing its full capabilities with record-breaking benchmark scores. The model achieved 83% on OpenAI’s GDPVal test for knowledge work tasks, placing it at or above human expert level. The API version supports context windows up to 1 million tokens — the largest available from OpenAI. GPT-5.4 is 33% less likely to make factual errors compared to GPT-5.2, and uses significantly fewer tokens to solve the same problems. The model also introduced Tool Search, a new system that looks up tool definitions as needed rather than including all definitions in system prompts, resulting in faster and cheaper API requests.

Read More


2. 🤖 Anthropic Launches Claude Auto-Mode: AI Decides Which Actions Are Safe to Execute

Source: TechCrunch / Anthropic
Time: ~1 day ago

Anthropic announced a major update to Claude Code with “auto mode” now in research preview. The feature uses AI safeguards to review each action before execution, checking for risky behavior and prompt injection attacks. Safe actions proceed automatically while risky ones get blocked — shifting the decision of when to ask for permission from the user to the AI itself. Auto mode builds on Claude Code’s existing “dangerously-skip-permissions” command but adds a safety layer. The feature rolls out to Enterprise and API users in coming days, currently working with Claude Sonnet 4.6 and Opus 4.6. Anthropic recommends using auto mode in isolated environments sandboxed from production systems.

Read More


3. 📈 Google Gemini Hits 750M Users: Fastest AI Product Growth in History

Source: Tech-Insider / Alphabet Earnings
Time: ~3 days ago

Google’s Gemini AI platform surpassed 750 million monthly active users as of Q4 2025, representing a 107x increase from just 7 million users in Q4 2023. The growth outpaces early adoption curves of TikTok and Instagram. Gemini is now integrated across Google Search (2 billion monthly AI Overview interactions), Chrome, Android, Google Workspace, and Pixel devices. Google claims Gemini 3.1 Pro leads in 13 of 16 major benchmarks, with particularly strong performance on ARC-AGI-2 abstract reasoning (77.1% score). The platform now has 2.4 million developers building on the Gemini API, processing 85 billion API requests in January 2026 alone. Alphabet’s Q4 revenue surpassed $400 billion annually, with Google Cloud revenue growing 34% year-over-year driven by Gemini demand.

Read More


4. 💻 Nvidia Bets Future on AI Agents: “OpenClaw is the Operating System for Personal AI”

Source: CNN Business / GTC 2026
Time: ~1 week ago

At GTC 2026, Nvidia CEO Jensen Huang declared that “every company in the world needs an OpenClaw strategy” and called OpenClaw “the operating system for personal AI.” Nvidia announced new software tools and computing racks designed specifically for AI agents, shifting focus beyond GPUs to CPU-based systems ideal for agent workloads. The company integrated Groq’s high-speed language processing units (LPUs) through a $20 billion deal. Nvidia also launched a space module for its Vera Rubin platform, aiming to bring AI computing to data centers in space — addressing the power constraints choking AI buildout. Huang projected at least $1 trillion in Nvidia revenue through 2027, driven by inference demand as AI systems perform productive work.

Read More


5. 🏢 Microsoft Running 100+ AI Agents in Supply Chain, Agent 365 Launching May 2026

Source: AI Agent Store
Time: ~10 hours ago

Microsoft revealed it is operating over 100 AI agents across supply chain operations and plans to equip every employee with AI support by end of 2026. The company’s Agent 365 management tool launches in May 2026, providing a unified platform for deploying and governing enterprise AI agents. This makes Microsoft the first major tech company to commit to company-wide AI agent deployment at this scale. The announcement comes as 80% of organizations face risks from shadow AI agents with excessive data access, according to Nudge Security’s latest research.

Read More


6. 🛒 OpenAI Agentic Commerce Protocol Goes Live: Walmart, Target, Sephora in ChatGPT

Source: OpenAI / Releasebot
Time: ~6 hours ago

OpenAI rolled out major shopping upgrades to ChatGPT powered by its Agentic Commerce Protocol (ACP). Users can now browse products visually, compare options side-by-side, and get real-time pricing within ChatGPT. Major retailers including Walmart, Target, Sephora, Nordstrom, Lowe’s, Best Buy, The Home Depot, and Wayfair have integrated via ACP. Shopify merchants are automatically included through Shopify Catalog integration. OpenAI retired its initial Instant Checkout feature in favor of letting merchants use their own checkout flows. ChatGPT now handles large pastes as attachments for Plus, Pro, and Business users, keeping the composer cleaner.

Read More


7. 🔒 Shadow AI Alert: 80% of Organizations Face Agent Security Risks

Source: Nudge Security / AI Agent Store
Time: ~12 hours ago

Nudge Security released new AI agent discovery tools as research shows 80% of organizations face risks from AI agents with excessive data access. The tools help companies find and control AI agents employees create without oversight, discovering where agents are built (especially on Microsoft Copilot Studio), what data they access, and who created them. This represents the first major defense against “shadow AI” agent sprawl in enterprises. The warning comes as Microsoft and other companies rapidly deploy agents across operations without centralized governance frameworks.

Read More


📊 Trend Watch

DomainHot TopicAttention
AI ModelsGPT-5.4 benchmarks, 1M context⭐⭐⭐⭐⭐
AI AgentsAnthropic auto-mode, OpenClaw⭐⭐⭐⭐⭐
Platform WarsGemini 750M users, distribution moat⭐⭐⭐⭐⭐
Enterprise AIMicrosoft 100+ agents, Agent 365⭐⭐⭐⭐⭐
AI CommerceOpenAI ACP, Walmart/Target integration⭐⭐⭐⭐
AI SecurityShadow AI, 80% orgs at risk⭐⭐⭐⭐
AI InfrastructureNvidia space data centers, power crisis⭐⭐⭐⭐

🔮 What to Watch

  • Agent Autonomy Race: Anthropic’s auto-mode vs OpenAI’s agent tools vs OpenClaw — who cracks safe autonomous execution first?
  • Google’s Distribution Moat: 750M Gemini users embedded across Search/Android/Workspace — can OpenAI or Anthropic compete without native distribution?
  • Enterprise Agent Governance: With Microsoft deploying 100+ agents and 80% shadow AI risk, Agent 365 and competitors will define the category.
  • AI Commerce Platform: OpenAI’s ACP with major retailers signals ChatGPT becoming a shopping channel — expect Google Shopping and Amazon to respond.
  • Power Crisis Solutions: Nvidia’s space data centers and gas turbine deployments show AI infrastructure hitting physical limits — watch for creative solutions.
  • Context Window Wars: GPT-5.4’s 1M tokens vs Gemini’s 2M vs Claude’s 200K — does bigger context actually improve real-world performance?

Briefing generated: 2026-03-26 06:00 (Asia/Shanghai)
Data sources: Public news reports, AI-curated

Back to Blog