Apple rides AI coding wave with Xcode upgrade

Welcome back. Apple just made another big AI move. Instead of building its own coding assistant, it's integrating Anthropic and OpenAI agents directly into Xcode 26.3. In 2024, I got a demo of Apple's in-house AI inside Xcode; the integration unveiled today is significantly better, and it still gives developers the tools they need within the Apple ecosystem. Meanwhile, Anthropic's latest research suggests AI risks look less like a Terminator event and more like industrial accidents. And Perplexity's data reveals that the AI market has splintered: no single model accounted for more than 17% of usage, with teams selecting different tools for different tasks. It's more evidence that the future isn't one model to rule them all.

Jason Hiner

IN TODAY’S NEWSLETTER

BIG TECH

Apple rides AI coding wave with Xcode upgrade

Apple’s Xcode is home to millions of developers who contribute to its broad ecosystem of apps, and it just got a big AI upgrade.

On Tuesday, Apple launched Xcode 26.3, which integrates agentic coding capabilities powered by OpenAI’s Codex and Anthropic’s Claude Agent. According to Apple’s blog post, these agents can collaborate with developers throughout the development lifecycle, including searching documentation, updating project settings, exploring file structures, and more.

Another standout feature lets the agents visually verify code: a “Preview Screenshot” tool grabs an image of the running app or preview, and the agent then analyzes that screenshot to confirm the UI looks as intended, such as whether the Liquid Glass effect was implemented correctly.

Ultimately, Apple noted that pairing Xcode's native features with these powerful agentic tools is a killer combination for developers, adding significant capabilities beyond what the agents can do on their own.

Xcode 26.3 also brought support for the Model Context Protocol (MCP), Anthropic's open standard for connecting AI models to data sources and tools. This lets developers plug any compatible AI tool into Xcode, adding flexibility to their workflows.
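For a sense of what an MCP connection actually looks like, here is a minimal sketch of an MCP server built with the official Python SDK's FastMCP helper. The server name and the release-notes tool are hypothetical and purely illustrative; this is a generic example of the standard, not anything specific to how Xcode wires it up.

```python
# Minimal MCP server sketch (hypothetical example, not Xcode-specific).
# Requires the official Python SDK: pip install "mcp[cli]"
from mcp.server.fastmcp import FastMCP

# The server name is arbitrary; an MCP client (such as an agent inside an
# IDE) discovers the tools this server exposes and can call them mid-task.
mcp = FastMCP("release-notes")

@mcp.tool()
def latest_release_notes(project: str) -> str:
    """Return release notes for a project (stubbed for illustration)."""
    notes = {"xcode": "26.3: agentic coding with Codex and Claude, MCP support"}
    return notes.get(project.lower(), "No notes found.")

if __name__ == "__main__":
    # Runs over stdio by default, so a local client can spawn and query it.
    mcp.run()
```

Once a client is pointed at a server like this, the agent can call the exposed tool on demand, which is the flexibility the MCP support is meant to add.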

This can be seen as Apple drifting further from its original goal of building the most advanced AI models and tools itself and instead adopting advanced models already on the market. Ultimately, this is a better approach for the company: it can still leverage its competitive advantage, a loyal base of users and developers entrenched in the Apple ecosystem, while keeping developers inside its own tools, such as Xcode, rather than losing them to the wave of hot new AI coding companions.

Sabrina Ortiz, Senior Reporter

TOGETHER WITH AIRIA

Reinvent Your AI Journey with Airia

You want every employee—regardless of skill level—to confidently embrace AI, but that doesn’t mean sacrificing governance or innovation speed.

Airia is the enterprise AI platform built to unify innovation and security while optimizing your AI ecosystem.

  • Empower all employees with no-code, low-code, or pro-code tools for quicker AI adoption and productivity gains.

  • Test prompts, LLMs, and agent variants in safe, production-like environments to reduce development cycles.

  • Implement automated threat detection and governance tools to ensure compliance while eliminating risks.

  • Manage agents, data flows, and security protocols from a single hub for seamless control.

  • Future-proof your enterprise with AI built for complex and regulated environments.

RESEARCH

Study: AI is less Terminator, more 'hot mess'

Though human-level AI is hotly debated in the industry, if there’s one thing that AI has most in common with people, it’s the tendency to screw up. 

Research published by Anthropic on Tuesday found that AI isn’t intentionally doing things wrong; rather, it’s more likely to fail because it’s a “hot mess.” The research indicates that, as tasks become harder and more complex, a model is more likely to fail as a result of incoherence (random, inconsistent errors) rather than systematic misalignment or bias.

One of the biggest concerns in AI safety is models’ propensity to act contrary to their training. Broadly, there are two ways a model can step out of line: honest mistakes and intentional malice.

  • Many AI ethicists and safety advocates worry about the latter scenario, in which a superintelligent system “might coherently pursue misaligned goals,” Anthropic noted.

  • But the company’s research finds that AI often isn’t deliberately working against our goals for it, potentially changing the risks that we should be paying attention to. 

“This suggests that future AI failures may look more like industrial accidents than coherent pursuit of a goal we did not train them to pursue,” Anthropic said in its research blog. 
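To make that distinction concrete, here is a toy sketch (our illustration, not Anthropic's methodology or data): two fake "models" give the same kind of wrong answer, but one fails randomly while the other fails in a consistent direction, the pattern a coherently misaligned system would show.

```python
import random

# Toy illustration only (not Anthropic's methodology or data): the same bad
# outcome can come from incoherence (random, uncorrelated errors) or from a
# system that consistently pushes toward a goal it was never trained for.

def incoherent_estimate(true_value: int) -> int:
    # Fails at random: each run drifts in a different direction.
    return true_value + random.randint(-30, 30)

def misaligned_estimate(true_value: int) -> int:
    # Fails systematically: every run overshoots toward the same target.
    return true_value + 30

incoherent_runs = [incoherent_estimate(100) for _ in range(1_000)]
misaligned_runs = [misaligned_estimate(100) for _ in range(1_000)]

# Incoherent errors roughly cancel out across runs; misaligned errors all
# point the same way, which is what coherent pursuit of the wrong goal looks
# like in aggregate.
print(sum(incoherent_runs) / len(incoherent_runs))   # roughly 100 on average
print(sum(misaligned_runs) / len(misaligned_runs))   # consistently 130
```

Either failure mode can wreck a single answer; only the second adds up to something that looks like intent.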

The company’s research also called into question how much scaling these models helps in battling incoherence. The more complex the task, the more confused these models become. Although scaling makes models more coherent on easier tasks, on harder tasks incoherence either stays the same or worsens as model size increases. Scaling does, however, tend to reduce bias in outputs.

“This doesn't eliminate AI risk — but it changes what that risk looks like, particularly for problems that are currently hardest for models,” Anthropic notes.

Though this research might seem like a comforting rebuttal to the doomers who expect AI to ruin the world, it doesn’t change the fact that these models can still act against the best interests of humans, whether intentionally or not. While the Moltbook frenzy might make it appear that AI is poised to bring about the downfall of society, this research suggests that the more realistic risk lies in simple hallucination and incoherence. It’s the difference between a self-driving car speeding intentionally to get you to your destination and speeding because it hallucinated that it was on a highway instead of in a school zone: a crash can happen either way. Harm is still harm, no matter the intent.

Nat Rubio-Licht

TOGETHER WITH DECAGON

Your Customer’s Voice Matters More Than Ever

Whether via surveys and support conversations or product reviews and NPS scores, customer feedback has always been useful for brands, but very few are maximizing its value. That’s because feedback often comes after the fact, isn’t fully visible to your team, or lacks actionable insights and next steps.

The good news is, these are all problems that AI can help solve. In their latest ebook, “Why Voice Of Customer Matters More Than Ever”, Decagon explores how AI is turning customer convos into real-time intelligence for your brand, the ways it can empower your product, CX, and engineering teams with consumer feedback, and why it can even turn support from a cost center into a growth driver. 

Get your free copy of the ebook right here (and say thanks to Decagon!).

RESEARCH

Healthy trend: AI model competition intensifies

Not all AI models are created equal, and people are becoming more strategic in how they use them. 

On Tuesday, Perplexity published research indicating that, as AI models become increasingly sophisticated, how people use them has become fragmented. The research finds that no individual model has garnered more than a 17% share of overall usage on its platform.

Perplexity offers a unique perspective on this, as users can choose from a variety of models from different providers when running queries.

The market fragmentation deepened in 2025, according to the research. In January 2025, two models – Claude Sonnet 4 and GPT-4o – accounted for more than 90% of all AI usage on its platform. By December, the leading model captured 23% of queries, while four models each had a 10% share. 

Perplexity’s data shows that teams are leveraging different models for different tasks: 

  • Approximately 40% of visual arts users relied on Gemini Flash. Meanwhile, 31% of financial analysis tasks were done with Gemini 3.0 Pro Thinking. 

  • Nearly a third of debugging and software development tasks relied on Claude Sonnet 4.5, and 23% of legal and court case queries relied on Claude Thinking models. 

  • OpenAI’s GPT-5.1 Thinking was a common choice for medical research tasks, garnering 13% of queries.

“As new models launch, these preferences are likely to shift,” Perplexity wrote in the report. “What leads today may not lead next quarter.”

And the bigger the enterprise, the more likely it is to leverage a wider array of models. Perplexity’s top 50 enterprise accounts use 30 models on average, compared with seven for the average account.

Here's one conclusion you may not have considered based on this data: The fragmentation of these models by capability across tasks may be a case against AGI. Why should one model rule them all? If Anthropic is the preferred vendor for AI-powered software development, Claude doesn’t need to be able to write poetry. If Gemini is fantastic at graphic design, it doesn’t need to do medical research. The unique capabilities of each model might also be stronger selling points: If every model on the market has human-level performance on any given task, why choose one over another?

Nat Rubio-Licht

LINKS

  • Kilo CLI 1.0: A VS Code extension for agentic engineering. 

  • ElevenLabs Skills: Allows coding assistants like Claude Code, Cursor, and OpenCode to call on the API. 

  • CodeRabbit Plugin: With the plugin, users can create an autonomous AI development workflow in Claude Code.

  • Canvas in AI Mode: Google just shared ways you can use the tool for vacation planning.

GAMES

Which image is real?


POLL RESULTS

Does SpaceX buying xAI give you more confidence to use Grok?

Yes (26%)
No (71%)
Other (3%)

The Deep View is written by Nat Rubio-Licht, Sabrina Ortiz, Jason Hiner, Faris Kojok and The Deep View crew. Please reply with any feedback.

Thanks for reading today’s edition of The Deep View! We’ll see you in the next one.

“[This image] is practically all fog. AI would never provide such an image.”

“Tough call between the two: the wisps rising is a realistic feature of fog on a mountainside.”

“The two wonky trees don't look perfect enough to be AI-generated.”

“[Two] lone ‘Charlie Brown’ looking trees above the clouds was the giveaway - AI wouldn't do that.”

“Trees were unrealistically designed and symmetric in [this] image.”

“[This image] looked too neat and tidy.”

“Trees in [this] image were too perfect, like Bob Ross happy little trees, perfect.”

“Every tree in [this image] appeared to be a carbon copy of the next. Limited imperfections/uniqueness.”

Take The Deep View with you on the go! We’ve got exclusive, in-depth interviews for you on The Deep View: Conversations podcast every Tuesday morning.

If you want to get in front of an audience of 750,000+ developers, business leaders and tech enthusiasts, get in touch with us here.