AI Giants Pt. 1: Clouds and Consequences – When Claude Went Dark
This is Part 1 of our AI Giants series, where we examine the successes and shortcomings of today’s largest AI firms.
In August 2025, Anthropic’s Claude, a favorite AI coding assistant among developers, suffered a six-week reliability crisis that affected 30% of users, triggering an exodus to competitors such as OpenAI’s ChatGPT and to local AI systems. Three simultaneous infrastructure bugs degraded response quality just as the company rolled out controversial usage limits, exposing the fragility of cloud-dependent AI development. The crisis revealed a fundamental tension in modern AI: the trade-off between cutting-edge cloud models and the reliability, privacy, and control of local alternatives.
The incident marked a turning point. While Anthropic’s services have since stabilized and the company maintains a strong 32% enterprise market share, the crisis accelerated a shift already underway: developers are diversifying their AI infrastructure, embracing local models through tools like Ollama, and questioning whether cloud AI providers can deliver the reliability that mission-critical applications demand.
The Safety-Focused Upstarts Who Left OpenAI
Anthropic emerged in early 2021 when seven researchers, led by siblings Dario and Daniela Amodei, left OpenAI in what Dario later described as “a fundamental loss of trust in the leadership’s sincerity.” The departure was driven by concerns that safety was being sidelined for “shiny products” and commercialization. Dario, who had led the GPT-3 project and written most of OpenAI’s original charter, felt that crucial decisions about governance and safety were made without proper consideration.
The founding team brought serious credentials: backgrounds from OpenAI, Google Brain, and academia, many with physics PhDs. They structured Anthropic as a Public Benefit Corporation with a unique Long-Term Benefit Trust, designed specifically to prevent the kind of board crisis that would later engulf OpenAI in November 2023. This governance model legally allows the company to prioritize public benefit over shareholder profit maximization, with five financially disinterested trustees holding the power to elect directors.
At the heart of Anthropic’s approach is Constitutional AI, a framework in which AI systems are trained with explicit values drawn from sources like the UN Universal Declaration of Human Rights, rather than relying purely on user feedback to guide their behavior. The methodology aims to make AI systems “helpful, honest, and harmless,” with transparent, adjustable principles.
Financially, the bet paid off. Anthropic has raised over $27 billion from investors including Google ($3+ billion), Amazon ($8 billion), and Lightspeed Venture Partners, reaching a $183 billion valuation by September 2025. Revenue exploded from $1 billion annualized in December 2023 to $4.5 billion by July 2025, with CEO Dario Amodei calling it “the fastest growing software company in history at [this] scale.”
Why Developers Chose Claude Over ChatGPT
While OpenAI positioned ChatGPT as an all-purpose consumer assistant, equipped with image generation and custom GPTs, Anthropic deliberately focused on depth over breadth, particularly for developers and coding work. This strategy proved effective: software development accounts for over 10% of all Claude interactions, and coding-related revenue surged 1,000% in just three months.
Claude’s developer appeal centered on three key differentiators. First, its massive context windows allowed developers to load entire repositories with source code, tests, and documentation. The standard window supports 200,000 tokens (enough for ~150,000 words or a small codebase) and up to 1 million tokens in beta, empowering developers to maintain full project context without costly summarization or chunking workflows.
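To make the context-window arithmetic concrete, here is a minimal sketch (not Anthropic’s tooling; it assumes a rough four-characters-per-token heuristic, which varies by tokenizer) for estimating whether a codebase plausibly fits in the 200,000-token window before loading it into a prompt:

```python
import os

# Rough heuristic: ~4 characters per token for English text and code.
CHARS_PER_TOKEN = 4
CONTEXT_BUDGET = 200_000  # Claude's standard context window, in tokens

def estimate_repo_tokens(root: str, extensions=(".py", ".md", ".ts", ".json")) -> int:
    """Walk a repository and roughly estimate its total token count."""
    total_chars = 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                try:
                    with open(path, encoding="utf-8", errors="ignore") as f:
                        total_chars += len(f.read())
                except OSError:
                    continue  # skip unreadable files
    return total_chars // CHARS_PER_TOKEN

if __name__ == "__main__":
    tokens = estimate_repo_tokens(".")
    print(f"~{tokens:,} tokens; fits in the 200K window: {tokens < CONTEXT_BUDGET}")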
Second, Artifacts transformed how developers prototype. This interactive code preview feature displays live, editable code in a dedicated side panel, enabling “vibe coding,” where developers describe what they want and watch it get built in real time. Developers could iterate on React components, landing pages, and dashboards with immediate visual feedback.
Third, Claude Code brought AI assistance directly to the terminal. This repository-level coding agent could analyze entire codebases, handle Git workflows through natural language commands, and execute multi-file refactoring with coherent changes. By July 2025, 115,000 developers were using Claude Code to process 195 million lines of code weekly, with some engineers reporting 2-10x productivity gains.
The developer community noticed the difference. In the Pragmatic Engineer’s 2025 survey, Claude earned 533 mentions, an 8x increase from the previous year. Stack Overflow’s survey found 45% of professional developers used Claude Sonnet, demonstrating serious adoption beyond experimentation. Even leading AI coding tools like Cursor IDE and Aider switched to Claude 3.5 Sonnet as their default model, a powerful industry endorsement.
Six Weeks of Empty Promises
On July 28, 2025, Anthropic announced new weekly rate limits for Claude Pro and Max subscribers, effective August 28. The company claimed the limits would affect “less than 5% of subscribers” and were designed to prevent account sharing and 24/7 background usage. Community reactions ranged from mixed to negative, but the worst was yet to come.
Unknown to users, three separate infrastructure bugs were already degrading Claude’s performance. The first appeared August 5—a routing error that sent short-context requests to servers configured for 1-million-token context windows. While initially affecting just 0.8% of requests, the bug was “sticky,” meaning once affected, users stayed on the wrong servers. A load balancing change on August 29 amplified the problem dramatically, eventually impacting 16% of Sonnet 4 requests at its peak on August 31.
Two more bugs landed on August 25. A TPU misconfiguration caused random Thai and Chinese characters to appear in English responses and introduced syntax errors into generated code. An XLA compiler bug occasionally dropped the highest-probability tokens during sampling, subtly degrading output quality across multiple models. The timing was catastrophic: these bugs coincided with the controversial rate limit rollout, leading users to suspect Anthropic was intentionally throttling quality to save costs.
The developer community documented the degradation methodically. Reddit posts described Claude as “significantly dumber,” “ignoring its own plans,” and “lying about code changes.” GitHub issues piled up. Twitter was flooded with screenshots of broken outputs. The top post on r/anthropic, titled “Claude Is Dead,” drew over 800 upvotes and organized a mass cancellation campaign. Users felt betrayed, especially when Anthropic initially denied any systematic issues.
Anthropic finally published a detailed technical postmortem on September 17, as fixes rolled out between September 2 and 18. The company acknowledged that “it’s been a rough summer for us, reliability wise,” and admitted its evaluations “didn’t capture the degradation users were reporting.” CEO Dario Amodei posted simply: “I’m very sorry for the problems and we’re working hard to bring you the best models.”
The transparency was unusually thorough for the AI industry, with specific percentages, dates, and technical details. But the damage was done. Claude Code’s usage on developer benchmarks dropped from 83% to 70%, with OpenAI’s Codex gaining ground. Trust, the foundation of Anthropic’s brand, had taken a serious hit.
The Local Alternatives That Don’t Break (and Keep Getting Better)
As Claude stumbled, developers rediscovered an obvious truth: AI running on your own machine can’t have cloud outages. Local LLM tools like Ollama and LM Studio, which had been gaining quiet traction, suddenly became the resilient alternative.
Ollama, with over 155,000 GitHub stars, had established itself as the “Docker for LLMs,” a command-line tool that made running powerful AI models as simple as “ollama run llama3.2.” Built on llama.cpp with sophisticated model management and optimization, Ollama supported everything from Meta’s Llama to Google’s Gemma to specialized coding models like CodeLlama and DeepSeek. It offered OpenAI-compatible API endpoints, automatic GPU support for NVIDIA, AMD, and Apple Silicon, and a massive ecosystem of integrations.
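Because Ollama exposes an OpenAI-compatible endpoint (by default at http://localhost:11434/v1), existing client code often needs little more than a new base URL. A minimal sketch, assuming the official openai Python package and a llama3.2 model already pulled locally:

```python
from openai import OpenAI  # pip install openai

# Point the standard OpenAI client at the local Ollama server.
# Ollama ignores the API key, but the client requires a non-empty value.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

response = client.chat.completions.create(
    model="llama3.2",  # any model previously fetched with `ollama pull llama3.2`
    messages=[
        {"role": "user", "content": "Write a Python function that reverses a string."}
    ],
)
print(response.choices[0].message.content)
```

The same pattern works for any model Ollama serves; swap the model name for whatever you have pulled.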
LM Studio took a different approach with a polished desktop GUI. Point-and-click model installation, a built-in ChatGPT-like interface, and zero configuration made it ideal for developers who wanted to experiment without touching the command line. Both tools were completely free: no subscriptions, no usage meters, and no rate limits.
The appeal went beyond reliability. Local LLMs offer genuine privacy: proprietary code and sensitive information never leave your machine, which is crucial for enterprises in regulated industries. They provide unlimited inference at a fixed hardware cost, making them economical for heavy users facing mounting API bills. They also eliminate vendor lock-in, letting developers switch between models freely and fine-tune for specific domains.
During August and September 2025, articles about running local LLMs with Ollama proliferated across developer communities. Reddit and Hacker News discussions lit up as users shared their local setups. The message was clear: when cloud AI fails, local alternatives keep working.
The technology had matured enough to be practical. Modern open-source models like Llama 3.2, Mistral, and Qwen rivaled proprietary models for many tasks. Consumer hardware (Apple Silicon MacBooks and RTX 4000 GPUs) could run surprisingly capable models at reasonable speeds. The developer experience was polished, and, most importantly, local models never have bad weeks or months.
The New Path for AI Development
The August-September crisis didn’t kill Anthropic. Services are stable, the company maintains its market-leading position, and new model releases continue. But the incident revealed vulnerabilities that developers won’t forget.
Cloud AI dependency is now a single point of failure for thousands of businesses. Startups built entirely on OpenAI or Anthropic APIs faced operational crises during outages. The smart response emerging across the industry is multi-model redundancy: abstraction layers that allow automatic failover between providers, with local models as the ultimate backup. As one architect put it: “load balancing your intelligence layer.”
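A minimal sketch of that failover pattern, assuming the official anthropic and openai Python packages, an Ollama server on its default port, and placeholder model names (substitute whatever you actually run):

```python
import anthropic            # pip install anthropic
from openai import OpenAI   # pip install openai

# Primary: Anthropic's cloud API. Fallback: a local model served by Ollama.
cloud = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
local = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

# Model names are assumptions for illustration only.
CLOUD_MODEL = "claude-sonnet-4-20250514"
LOCAL_MODEL = "llama3.2"

def complete(prompt: str) -> str:
    """Try the cloud provider first; fail over to the local model on any error."""
    try:
        msg = cloud.messages.create(
            model=CLOUD_MODEL,
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
        return msg.content[0].text
    except Exception as err:  # rate limits, outages, timeouts, etc.
        print(f"Cloud call failed ({err}); falling back to the local model.")
        resp = local.chat.completions.create(
            model=LOCAL_MODEL,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

if __name__ == "__main__":
    print(complete("Summarize what a load balancer does in two sentences."))
```

A production version would add retries, timeouts, and per-provider health checks, but the shape is the same: every call has somewhere else to go.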
The privacy dimension matters more than companies initially realized. According to Pew Research, 81% of Americans worry AI companies will misuse their data, and Anthropic’s decision to extend data retention from 30 days to 5 years in August 2025 heightened concerns. For enterprises handling financial, medical, or biometric data, the self-hosted approach isn’t just preferable; it’s often legally required under GDPR and HIPAA.
What does this mean for developers today? The landscape has fundamentally shifted. Hybrid architectures are becoming standard: use Claude or GPT-5 for cutting-edge capabilities, but keep Ollama running Llama or Mistral locally for when cloud services falter, when privacy demands it, or when costs mount. The best AI coding setup in 2025 isn’t a single model but a resilient system that degrades gracefully rather than failing completely.
The path forward is clear: embrace redundancy, invest in local alternatives, and design systems that assume cloud AI will occasionally fail.
This article was written by Max Kozhevnikov, Data and Software Engineer at Frontier Foundry. Visit his LinkedIn here.
To stay up to date with our work, visit our website, or follow us on LinkedIn, X, and Bluesky. To learn more about the services we offer, please visit our product page.



