AI giants race to build lighter models

Welcome back. Apple is doubling down on the machines powering the AI boom, with new M5 Pro and M5 Max chips in the MacBook Pro. These machines are built to run large models locally and handle heavier workloads for AI builders. At MWC, Alibaba quietly entered the smartglasses race with its Qwen AI glasses, lightweight frames with in-lens displays and swappable batteries, a rarity in the category. And in the frontier labs race, OpenAI and Google both rolled out faster, lighter models aimed at cutting costs and speeding up everyday AI tasks. Their latest releases show a clear shift: AI leaders are optimizing for speed, lower inference costs, and higher-volume workloads. —Jason Hiner
1. AI giants race to build lighter models
2. Alibaba makes surprise leap into AI glasses
3. Apple’s M5 MacBooks double down on AI builders
PRODUCTS
AI giants race to build lighter models
AI firms are racing to enable users to do more with faster, lighter models.
On Tuesday, both OpenAI and Google released new iterations of their flagship models. Each promises comparable output quality at faster speeds and lower cost.
Let’s check out the specs.
OpenAI’s model, GPT-5.3 Instant, declines fewer questions and takes a less defensive tone, OpenAI said in its announcement. Conversation flow is more consistent, and information synthesis for web searches is more relevant. This update is aimed primarily at everyday ChatGPT users, with updates to the heavier Thinking and Pro models coming soon.
Google’s model, Gemini 3.1 Flash-Lite, targets high-volume developer workloads with quicker, lower-latency responses. Google said in its announcement that the model is a good fit for tasks like translation and content moderation where “cost is a priority,” but it can also handle complex, reasoning-heavy workloads, such as generating dashboards or creating simulations.
OpenAI’s latest addition is available today to all users in ChatGPT, as well as to developers in the API. GPT-5.2 Instant will remain available for three months until it is retired in June. Google’s latest model, meanwhile, is available in preview to developers through the Gemini API in Google AI Studio and to enterprises in Vertex AI.
These models come amid lightweight releases from Chinese open source competitors like Alibaba’s Qwen, which unveiled its Small Model Series, ranging from 800 million to 9 billion parameters, earlier this week.

Though these models court different audiences, the objective is the same: to offer cost-effective, faster alternatives to heavier reasoning models. OpenAI’s latest offering, targeting consumer audiences, could more quickly answer the kinds of queries users might otherwise take to a search engine, saving OpenAI money and keeping its user base steady as it rolls out ads. Google, meanwhile, is saving developers from burning through their token budgets on tedious tasks at a time when inference costs are mounting. These models could signal a broader trend: AI firms are starting to realize that less is often more.
TOGETHER WITH ASAPP
Build Less. Prioritize Better.
Most enterprises are choosing AI use cases backwards.
They start with what’s flashy. Or trendy. Or easy to demo.
In the contact center, that’s how pilots stall and trust in AI erodes before real outcomes get delivered.
ASAPP’s eBook, “Finding the right AI agent use cases for customer service,” lays out a data-driven approach for selecting high-impact, production-ready use cases. It breaks down how to assess impact, complexity, data readiness, and operational fit.
Because what you choose to build determines what you’re able to scale.
CONSUMER
Alibaba makes surprise leap into AI glasses
Forget phones. Smartglasses have taken over Mobile World Congress, and Chinese tech giant Alibaba just made its move.
Alibaba's Qwen has made a name for itself as one of the world's leading AI models, with its appeal rooted in high performance and open-source availability. At MWC, the company unveiled its new AI smartglasses: the Qwen Glasses S1 and the Qwen Glasses G1.
The key distinction between the two is that the S1 features dual in-lens displays, while the G1 does not, making the S1 more like the Even Realities G2 and the G1 more like the Meta Ray-Bans.
The feature set covers what most smartglasses now offer as table stakes: AI assistance, photo and video capture, real-time translation, and notifications. Users control the glasses by tapping or swiping along the sides, pressing a button near the camera to shoot, and managing media playback near the end of the stem.

Alibaba announced Qwen AI glasses at MWC 2026. Photo: Sabrina Ortiz
Qwen has yet to issue a press release, but specs gathered from a demo and booth tour include:
Audio: Bone conduction mic and 5-mic array
Battery: Dual-battery system with swappable 272 mAh packs
Chip: Snapdragon AR1 (the same Qualcomm chipset powering the Meta Ray-Bans) plus a coprocessor
Form factor: 8mm ultra-slim temples with custom lenses
Camera: 12MP POV camera, 3K video, 109-degree ultra-wide FOV, HDR, IMX681, 5P lens
Chatting with Qweenie, the onboard Qwen assistant, testing real-time translation, and previewing photos directly in-lens were all positive experiences. But none of it was new, and at this point, other smartglasses do it in color. What stood out most was the hardware itself.
The glasses were surprisingly light, particularly impressive given the built-in displays. To be fair, their monochromatic green-tinted HUDs, similar to those on the Even Realities frames, allow for a lighter design than the full-color displays on Meta's frames or the forthcoming Google glasses.
The swappable battery is a genuine differentiator. Battery life remains one of the biggest unsolved problems in smartglasses, since people wear glasses all day and expect them to keep up. Being able to pop out a depleted pack and snap in a fresh one sidesteps that problem.

While these glasses are undoubtedly capable and practical, they did highlight just how quickly the space is moving. Had I seen them at CES last year, I would have been incredibly impressed and eager to try them in the real world. But a little over a year later, having now tried smartglasses with full-color in-lens displays, it was harder to be impressed. The addition of color may seem like a minor upgrade, but in my experience, it makes the display feel far more familiar and intuitive, closer to the phones and laptops we already use daily. If smartglasses are going to truly bridge the physical and digital worlds, the experience has to be as compelling as the devices they're meant to displace.
TOGETHER WITH UNFRAME
Process any documents, from any source
Sure, AI models can extract text, identify entities, classify documents, and parse tables with accuracy. The problem is that most organizations treat AI document processing as a data entry replacement (get the information out of the document and into a database). That framing misses the real value entirely.
Extraction without abstraction leaves enterprises with cleaner data but no better decisions.
Unframe's document AI solution doesn't just digitize documents; it understands them. Abstract insights across millions of records. Surface patterns before they become problems.
HARDWARE
Apple M5 MacBooks double down on AI builders
Apple may be playing catch-up when it comes to deploying AI features across its product line, but it's still the frontrunner when it comes to hardware for AI builders.
On Tuesday, Apple unveiled its upgraded lineup of MacBook Pro laptops running on new M5 Pro and M5 Max Apple silicon chips. While these machines are still great for Apple's traditional crowd of video editors, graphic designers, and other creatives, it's clear that Apple is now architecting these machines primarily for developers and AI builders.
The big upgrades are almost all tailored to improving performance on AI workloads and running AI models locally. These machines also got a price increase, likely driven in part by the RAMmageddon memory crisis gripping the tech industry right now.
Here's a quick recap of the most important upgrades:
The new MacBook Pros can comfortably run local models up to about 40B parameters and can stretch to about 90B for many models, with some limitations.
Apple has added Neural Accelerators to every GPU core and reports over 4x peak GPU performance for AI workloads compared to the last-generation M4 Pro/Max chips, a big leap for a single generation.
Apple launched its new Fusion Architecture, which combines two silicon dies into a single system-on-chip, integrating the CPU, GPU, Neural Engine, memory controller, and Thunderbolt 5 for higher performance and efficiency. The architecture is optimized for sustained AI workloads, like model training, that researchers previously had to run on desktops.
The memory performance upgrade has also drawn praise in the AI ecosystem.
The base prices of MacBook Pros have jumped by $200 for the M5 Pro and $400 for the M5 Max. The M5 Pro starts at $2,199 for the 14-inch and $2,699 for the 16-inch. The M5 Max starts at $3,599 for the 14-inch and $3,899 for the 16-inch. Pre-orders start March 4 and March 11.
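The 40B and 90B figures above line up with simple back-of-the-envelope math: a model's weight footprint is roughly its parameter count times bytes per parameter, plus runtime overhead. A minimal sketch, assuming 4-bit quantized weights and about 20% overhead for the KV cache and buffers (my assumptions, not Apple's numbers):

```python
def model_memory_gb(params_billion: float,
                    bits_per_param: float = 4.0,
                    overhead: float = 1.2) -> float:
    """Rough estimate of memory needed to run an LLM locally.

    params_billion: model size in billions of parameters
    bits_per_param: quantization level (16 = fp16, 8 = int8, 4 = 4-bit)
    overhead: multiplier for KV cache and runtime buffers (assumed ~20%)
    """
    weight_bytes = params_billion * 1e9 * (bits_per_param / 8)
    return weight_bytes * overhead / 1e9  # decimal GB

print(round(model_memory_gb(40), 1))  # ≈ 24.0 GB for a 40B model at 4-bit
print(round(model_memory_gb(90), 1))  # ≈ 54.0 GB for a 90B model at 4-bit
```

By this rough estimate, a 40B model at 4-bit needs about 24 GB of unified memory and a 90B model about 54 GB, which is why memory capacity, not just GPU speed, gates which models fit on a laptop.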

Apple's MacBook Pro machines are the laptops of choice among nearly all the teams I know who are building in the AI space. This is evidenced by the fact that products like ChatGPT Atlas and Claude Cowork launched as Mac-only at first, even though Windows has a much larger user base. And Apple doubling down on features that make its laptops even more powerful for running AI workloads, running models on-device, and training models locally will only make them more popular with AI builders. Of course, the question remains: When will Apple move beyond being content to make the hardware that powers the AI revolution and start launching its own AI features and products, vertically integrated in a uniquely Apple way?
LINKS

US cybersecurity agency stretched thin amid Iran hacking threat
Meta signs $50 million, multiyear licensing deal with News Corp
OpenAI amends terms of deal with the Department of War amid backlash
Anthropic nears $20 billion annual recurring revenue
Junyang Lin, Alibaba’s Qwen tech lead, steps down
OpenAI is reportedly developing a competitor to GitHub

NotebookLM: Google’s AI-first notebook made custom styles for Infographics available to users.
Codex: OpenAI’s agentic coding solution is rolling out Spark, “the fastest model [OpenAI] ever made” to the “heaviest Codex users” on Plus.
Gemini: Google’s AI chatbot is retiring Gemini 3 Pro next Monday, March 9, in exchange for an upgrade to 3.1 Pro Preview.
Krisp: Voice AI firm Krisp now has an accent conversion tool, which converts accented English into “neutral American English.”
Claude Code: Anthropic’s AI-powered coding tool now has voice mode. Live today for 5% of users and rolling out to others in the coming weeks.

Nvidia: Senior Data Scientist – Enterprise AI Systems
Apple: Developer Publications - Content Engineer
Google: Research Scientist, Google Research, GenAI, Experiences
Salesforce: Sr. Product Security Engineer
POLL RESULTS
Do you think AI will create more jobs than it displaces over the next 5 years?
More jobs created (17%)
More jobs lost (46%)
Roughly balanced (23%)
Hard to predict (14%)
The Deep View is written by Nat Rubio-Licht, Sabrina Ortiz, Jason Hiner, Faris Kojok and The Deep View crew. Please reply with any feedback.

Thanks for reading today’s edition of The Deep View! We’ll see you in the next one.


Take The Deep View with you on the go! We’ve got exclusive, in-depth interviews for you on The Deep View: Conversations podcast every Tuesday morning.

If you want to get in front of an audience of 750,000+ developers, business leaders and tech enthusiasts, get in touch with us here.
