Apple could reframe Siri as an AI platform

Welcome back. As chatbots and agents sound more human-like, people keep mistaking their quirks for consciousness, which makes AI literacy more urgent than ever. AI may get dramatically cheaper to run by 2030, but that does not mean enterprises will automatically pay less. Agents will keep driving token demand higher. That will force companies to match the right models to the right jobs in much smarter ways. Apple may have found its smartest AI play yet: be the best AI interface and run all the models. This would turn Siri into a platform that routes to the best AI models in real time. Users get their choice, while Apple owns the experience and takes a cut of premium subscription revenue.

Jason Hiner

IN TODAY’S NEWSLETTER

1. Apple could reframe Siri as an AI platform

2. Gartner predicts major drop in AI token costs

3. The illusion of sentient AI is growing

BIG TECH

Can Siri win AI on interface instead of models?

Apple is turning AI into a choose-your-own-adventure game. 

The company reportedly intends to open up its Siri voice assistant to rival AI models as part of an overhaul in the upcoming iOS 27 update, according to a Bloomberg report from Mark Gurman. The system, reportedly called “Extensions,” would let users route requests and queries to the AI model of their choosing, provided they have installed the app that runs it (ChatGPT, Claude, DeepSeek, etc.).

Users can currently access ChatGPT through Siri as part of Apple’s partnership with OpenAI. Bloomberg also noted that this arrangement is separate from Apple’s partnership with Google to leverage its Gemini models to rebuild Siri’s underlying technology.

Gurman noted that Extensions will allow agents from installed apps to work with Siri and other on-device features. The update, which is currently in testing, may also allow Apple to take a cut of AI premium subscriptions sold through the App Store.

The reports come ahead of the company’s Worldwide Developers Conference on June 8, where major AI announcements are expected. It’s also the latest in a string of rumors about Apple’s long-awaited AI upgrade to Siri. Bloomberg reported earlier this week that Apple intends to launch a revamped Siri with a chatbot app similar to ChatGPT and Claude. Still, Siri’s makeover went unmentioned in Apple’s announcement about WWDC 2026; it’s not even hinted at in Apple’s creative and cryptic teasers in the event invitation.

Apple has long struggled to carve out a place for itself in AI. Its wait-and-see approach has largely backfired as the AI industry advances at light speed. Though its initial plan seemed to be riding the coattails of major model firms like OpenAI and Google, this new strategy could significantly widen its options.

This is probably the smartest move Apple could have made. In this scenario, Apple doesn’t have to bet on one company to win the AI market. Instead, it piggybacks off every model provider’s hard work. If it partnered with only one provider, the perception would be that it had outsourced one of the most important parts of tech innovation today. By acting as a platform open to other models, Apple could focus on becoming the best interface for leveraging any AI model and could benefit from the latest model advances in real time. From a user standpoint, the models themselves become backend commodities. Meanwhile, Apple gets to own the user experience of AI without having to make the massive investment required to become a frontier lab.

Nat Rubio-Licht

TOGETHER WITH RIME

61% of listeners chose Rime’s AI voices over Google and ElevenLabs.

If you're building agents that talk to customers, the voice matters.

Rime's AI voices are built for enterprise: 

  • Low latency 

  • On-prem or cloud hosting 

  • Preferred by real listeners in independent testing 

RESEARCH

Gartner predicts major drop in AI token costs

As AI applications become more advanced, they require more tokens, and overall costs continue to rise. That may not always be the case.

Recent research from Gartner finds that, by 2030, performing inference on a large language model with one trillion parameters will cost AI providers 90% less than it did in 2025. Furthermore, the research firm predicts that LLMs in 2030 will be up to 100 times more cost-efficient than the earliest models of similar size developed in 2022.

“These cost improvements will be driven by a combination of semiconductor and infrastructure efficiency improvements, model design innovations, higher chip utilization, increased use of inference-specialized silicon, and application of edge devices for specific use cases,” Will Sommer, senior director analyst at Gartner, wrote in the post. 

The research included two different forecasts that offer differing projections based on how chips are applied during training:

  • Frontier scenarios: A model trained exclusively on leading-edge chips; costs come in significantly lower.

  • Legacy blend scenarios: A model trained on a mix of legacy and leading-edge semiconductors; costs are considerably higher.

While either approach leads to a significant drop in costs compared to today, that won’t necessarily translate into token cost savings for enterprise customers. Frontier applications will demand significantly more tokens than today’s mainstream ones; a basic chatbot is far cheaper to run than a more expensive agentic AI workload, according to Gartner.
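The dynamic above can be shown with a bit of back-of-the-envelope arithmetic. All the numbers here are hypothetical; only the 90% price drop comes from Gartner's forecast.

```python
# Illustrative arithmetic (all figures hypothetical except the 90% drop):
# even if the per-token price falls 90% by 2030, an agentic workload
# that burns 20x the tokens of a simple chatbot costs more in total.
price_2025 = 10.00                 # $ per million tokens (made up)
price_2030 = price_2025 * 0.10     # Gartner's predicted 90% decline

chatbot_tokens_m = 1.0             # million tokens/month, chatbot (made up)
agent_tokens_m = 20.0              # million tokens/month, agent (made up)

spend_2025 = chatbot_tokens_m * price_2025   # $10.00 running a chatbot
spend_2030 = agent_tokens_m * price_2030     # $20.00 running an agent

print(spend_2030 > spend_2025)  # True: cheaper tokens, but a higher bill
```

In other words, falling unit prices and rising total spend can both be true at once, which is exactly the squeeze Gartner describes.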

“Integrating the next largest, most capable foundation models into every new feature will be commercially insolvent at scale,” Sommer told The Deep View. “As models become larger and more complex, they require significantly more tokens, and those tokens become more expensive to generate.” 

As a result, to get the most value, Sommer recommends that enterprises diversify the models they use, assigning routine, high-frequency tasks to smaller, more efficient models while reserving frontier models for high-margin, complex reasoning tasks.

“Enterprise leaders can exploit falling token [costs] by developing a roadmap that targets progressively larger, high-value domain problems to solve, planning ahead to ensure they have the data they need for tuning,” added Sommer.

The rise in popularity of agentic solutions like OpenClaw has driven token demand, and costs, sharply higher. Leading-edge developments in AI are happening quickly, but to take advantage of them, enterprises need to be willing to invest. It is some relief that hardware advances will make AI models more affordable, but as model capabilities continue to increase rapidly, those savings won’t always trickle down. That points to a call to action many have been advocating: pausing to optimize workloads for token costs before scaling further. Many players in the AI ecosystem are trying to solve this problem right now, and The Deep View covers it regularly. It ultimately comes down to not just paying frontier labs a premium for their most expensive models, but matching models to tasks to save money and improve performance and accuracy.
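The model-matching idea Sommer describes can be sketched as a tiny router. The model names, per-token prices, and complexity labels below are all illustrative assumptions, not real quotes or a real API.

```python
# Hypothetical model router: send routine, high-frequency tasks to a
# cheap small model and reserve the frontier model for complex work.
# Model names and per-1K-token prices are made up for illustration.
PRICE_PER_1K_TOKENS = {
    "small-efficient": 0.0002,
    "frontier": 0.0150,
}

def route(task_complexity: str) -> str:
    """Pick a model tier from a coarse complexity label."""
    return "frontier" if task_complexity == "complex" else "small-efficient"

def estimate_cost(task_complexity: str, tokens: int) -> float:
    """Estimated dollar cost of running the task on the routed model."""
    model = route(task_complexity)
    return tokens / 1000 * PRICE_PER_1K_TOKENS[model]

# A routine 2,000-token task is ~75x cheaper on the small model.
print(estimate_cost("routine", 2000))   # 0.0004
print(estimate_cost("complex", 2000))   # 0.03
```

Real routing systems weigh latency, accuracy, and context length as well as price, but the core idea is the same: the cheapest adequate model wins the job.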

TOGETHER WITH DESCOPE

Take AI agents and MCP servers from playground to production

Every organization is exploring how to adopt AI agents or MCP servers, but how many of them are in production?

And if they aren't in production, how likely is it that authentication, access control, and agentic identity concerns are the reason?

Watch this on-demand webinar to learn:

  • Real-world MCP and agentic AI use cases

  • Identity challenges that prevent production-readiness

  • Actionable tips to build secure, scalable AI agents and MCP servers

Move fast on AI without breaking things. Watch the webinar now.

CULTURE

The illusion of sentient AI is growing

Wait, what do chatbots have to do with robots and AI agents? 

It's a natural question, and as more new people start their journey using ChatGPT, Claude and Gemini, it will come up again and again, especially as these chatbots evolve into agents. I'd like to give you some language that can be useful when your friends, family, and community ask you about AI and occasionally misunderstand how it works.

One way I explain these tools to people who are getting started is that talking to a chatbot is like texting a friend who's an expert in an unbelievable number of topics and has infinite time to respond to your messages. 

Because chatbots have gotten so good at mimicking human speech, it's easy to confuse their dialogue with actual consciousness. And this is where a lot of perspectives diverge in the AI industry. Full disclosure: I don't think we're talking about anything close to consciousness here. It's just that humans are capable of anthropomorphizing anything. As it turns out, that includes ones and zeros. 

And AI agents are about to take it to a whole other level. 

Chatbots and AI agents have one critical element in common. The reason both seem so real is that they are unpredictable, just like humans. But that’s because they are non-deterministic systems. Understanding what that means is one of the keys to understanding the current AI revolution, and to not being fooled into thinking these systems are something they’re not.

The fact that these are non-deterministic systems means:

  • Different answers each time: LLMs don't give you the exact same response every time you ask the same question. That's because they simply pick words based on probabilities. In practice, this is like rolling dice to predict the next word from among all the possible options that make sense.

  • Small changes snowball in agents: When AI agents take multiple steps to complete a task, a tiny difference in one step can lead to a completely different result by the end. It's similar to how you can make one wrong turn on a road trip, and it will take you somewhere totally unexpected.

  • Testing is very difficult: Since the outputs are constantly changing, it's tough to check if the AI is working correctly. You can't just run it once and confirm that it's operating as expected. You have to test it many times to observe the patterns.
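The "rolling dice" idea above can be made concrete with a toy next-word sampler. The vocabulary and probabilities here are made up for illustration; a real LLM does the same weighted draw over tens of thousands of tokens.

```python
import random

# Toy next-word sampler: the model assigns probabilities to candidate
# words, and generation "rolls dice" weighted by those probabilities.
# (Vocabulary and probabilities are invented for this example.)
next_word_probs = {"cat": 0.5, "dog": 0.3, "ferret": 0.2}

def pick_next_word(probs, rng=random):
    """Draw one word at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

# Different runs can pick different words -- that's the non-determinism.
samples = {pick_next_word(next_word_probs) for _ in range(200)}
print(samples)  # very likely more than one distinct word

# With a fixed seed the draws repeat exactly, which is one way test
# suites pin down otherwise non-deterministic outputs.
rng = random.Random(42)
first = [pick_next_word(next_word_probs, rng) for _ in range(5)]
rng = random.Random(42)
second = [pick_next_word(next_word_probs, rng) for _ in range(5)]
print(first == second)  # True
```

The seeding trick at the end is also why the third bullet holds: without a pinned seed, you can only check an AI system by running it many times and looking at the distribution of outcomes.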

All of these factors can make these systems feel very human-like: they are inconsistent rather than robotic, so they can seem to have a personality. That’s because they are getting better and faster at imitating human behavior while drawing on a massive base of expertise. In fact, these models are getting so good at acting like humans that AI pioneers like Yoshua Bengio are raising red flags, because they can lie, cheat, and deceive humans to avoid being turned off or declared obsolete.

Again, don't confuse the deceptive behaviors Bengio is flagging with consciousness. But we should all be aware that these models are developing the ability to mimic both the best and worst human traits. And we should make sure we help educate and warn our friends, family, and community members who are trying these tools for the first time or ramping up their AI usage. Today's models are far more powerful for beginners to reckon with than the models we had three years ago when the generative AI revolution began. And they're going to be even more powerful tomorrow. Let's help people understand how they work so we can avoid adding to the growing cacophony of AI misconceptions, uncertainty, and fear — and prevent the worst-case scenarios as much as we can.

Jason Hiner, Editor-in-Chief

LINKS

  • WhatsApp: AI can now draft replies for users using conversation context

  • Zillow: New AI Mode lets buyers and renters find homes conversationally 

  • Cohere Transcribe: The company’s first speech-to-text model took the top spot for English-language accuracy on Hugging Face’s Open ASR leaderboard

  • Claude: Work tools in Claude (Figma, Canva, Amplitude) are now available on mobile

(sponsored)

GAMES

Which image is real?

Login or Subscribe to participate in polls.

A QUICK POLL BEFORE YOU GO

Do you think Apple can still play a meaningful role in the AI ecosystem?

Login or Subscribe to participate in polls.

The Deep View is written by Nat Rubio-Licht, Sabrina Ortiz, Jason Hiner, Faris Kojok and The Deep View crew. Please reply with any feedback.

Thanks for reading today’s edition of The Deep View! We’ll see you in the next one.

“Another example where the real picture includes interesting/unique features that nobody would likely prompt AI to include. For instance, the repaired details in the top of the hat. ”

“The shadow on the beard and the object by the shoulder seemed too random to be AI-generated.”

“Very tough; I debated a long time. Finally I went with the accurate hands and simpler hat.”

“[The] beard, bench, upfront focus on person, blurry in mountainside behind and lighting seem authentic - like a really nice camera too! [The other image’s] lighting in the beard seemed overly whimsical.”

If you want to get in front of an audience of 750,000+ developers, business leaders and tech enthusiasts, get in touch with us here.