Google Lyria 3 brings musical powers to texting

Welcome back. Google could turn texting into a jukebox with Lyria 3, a tool that can spin up 30-second custom songs, complete with lyrics and album art, in 15 seconds or less. It’s fun and frictionless, but it deepens the copyright fight simmering across the music industry. Fei-Fei Li’s World Labs raised $1B, signaling that VCs are again betting big on world models and spatial intelligence as the next thing beyond LLMs. Apple is prepping three AI devices: glasses, AI-powered AirPods, and a pendant. But trust, execution, and the effectiveness of the Siri-Gemini integration will be the real tests of whether Apple can become a player in AI.
Jason Hiner

IN TODAY’S NEWSLETTER

1. Google Lyria 3 brings musical powers to texting

2. World Labs raises $1B as VCs look beyond LLMs

3. Three Apple AI devices, two big questions

CONSUMER

Google Lyria 3 brings musical powers to texting

Google has released an upgraded AI tool that could become the next trend in making your text messages a lot more entertaining. While the tool won't make you the next Taylor Swift, it raises the copyright red flags that are unavoidable when bringing AI into creativity.

On Wednesday, Google announced Lyria 3, its latest tool for creating music tracks from AI prompts. These 30-second songs are meant for short, customized ditties to send to your friends and loved ones for fun — not for making professional music.

Google gave the technology several upgrades aimed at making it much better to use:

  • In addition to text prompts, you can now upload images and video clips to use for making the songs and for the cover art.

  • You previously had to provide the lyrics, but the tool will now create lyrics for you.

  • You can now choose musical genre, vocal styles, and emotional mood.

  • The audio quality has increased from 16-bit / 44.1kHz to 24-bit / 48kHz.

  • The average time to create a song has decreased from 15-30 seconds to 5-10 seconds.
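For a sense of what that audio bump means, here's a quick back-of-the-envelope calculation of the raw data involved. This assumes uncompressed stereo PCM; real delivered files will be compressed, so the numbers only illustrate the relative jump in fidelity, not actual download sizes.

```python
# Back-of-the-envelope: raw (uncompressed) PCM size of a 30-second stereo track
# at the old vs. new Lyria 3 audio specs.

def pcm_bytes(sample_rate_hz: int, bit_depth: int, channels: int, seconds: int) -> int:
    """Bytes of raw PCM audio: samples/sec * bytes/sample * channels * duration."""
    return sample_rate_hz * (bit_depth // 8) * channels * seconds

old = pcm_bytes(44_100, 16, channels=2, seconds=30)  # 16-bit / 44.1 kHz
new = pcm_bytes(48_000, 24, channels=2, seconds=30)  # 24-bit / 48 kHz

print(f"old spec: {old / 1e6:.1f} MB")  # ~5.3 MB
print(f"new spec: {new / 1e6:.1f} MB")  # ~8.6 MB
print(f"ratio: {new / old:.2f}x")       # ~1.63x
```

In other words, the new spec carries roughly 60% more raw audio data per second — a meaningful fidelity upgrade for a casual, prompt-to-song toy.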

I used it to quickly make this 30-second track in K-Pop style celebrating the relaunch of our podcast, The Deep View: Conversations. It's far from an instant classic, but it would certainly provide a quick laugh if I sent it around to our team in Slack or a group text. 

Of course, the elephant in the room is copyright and borrowing from commercial artists without their permission. In its statement announcing Lyria 3, Google acknowledged, "Music generation with Lyria 3 is designed for original expression, not for mimicking existing artists. If your prompt names a specific artist, Gemini will take this as broad creative inspiration and create a track that shares a similar style or mood. We also have filters in place to check outputs against existing content. We recognize that our approach might not be foolproof, so you can report content that may violate your rights or the rights of others." 

Lyria 3 is rolling out on desktop first for users 18 and over in English, Spanish, French, German, Hindi, Japanese, Korean and Portuguese. It will roll out on the mobile app over the next week and to additional languages soon.

The customizability of tools like Lyria 3 is what can make them super fun. You can now make a custom song with unique album art in less than 30 seconds using just a prompt and a photo. I'm sure this is going to be a fun way for people to make their loved ones feel special and punctuate important life moments — and it creates another avenue for consumers to bring AI into their daily lives. But we also can't deny the uncomfortable truth that popular commercial artists are going to be mimicked in technologies like this, without their permission and without compensation. If Lyria 3 gets popular, it will add another layer to the ongoing fight that is quite literally tearing the music industry apart at the seams.

Jason Hiner, Editor-in-Chief

TOGETHER WITH AIRIA

Reinvent Your AI Journey with Airia

You want every employee—regardless of skill level—to confidently embrace AI, but that doesn’t mean sacrificing governance or innovation speed.

Airia is the enterprise AI platform built to unify innovation and security while optimizing your AI ecosystem.

  • Empower all employees with no-code, low-code, or pro-code tools for quicker AI adoption and productivity gains.

  • Test prompts, LLMs, and agent variants in safe, production-like environments to reduce development cycles.

  • Implement automated threat detection and governance tools to ensure compliance while eliminating risks.

  • Manage agents, data flows, and security protocols from a single hub for seamless control.

  • Future-proof your enterprise with AI built for complex and regulated environments.

STARTUPS

World Labs raises $1B as VCs look beyond LLMs

The world of AI is moving well beyond language. 

On Wednesday, World Labs, a startup founded by AI pioneer Fei-Fei Li, announced a $1 billion funding round. The round’s investors included AMD, Autodesk, Emerson Collective, Fidelity, Nvidia and Sea, the company said in its announcement. 

Though World Labs didn’t disclose its valuation, Bloomberg previously reported that the company sought funding at a $5 billion valuation.

“We are focused on accelerating our mission to advance spatial intelligence by building world models that revolutionize storytelling, creativity, robotics, scientific discovery, and beyond,” World Labs said in its press release. 

World Labs’ success is the latest sign that researchers are looking for breakthroughs beyond what large language models can provide. Investors, meanwhile, may be eyeing this development as their next big bet. 

  • Runway, an AI video startup, announced a $315 million Series E funding round that shot its valuation to $5.3 billion, a source told The Deep View. The company intends to use the funding to bolster its world model research, calling it the “most transformative technology of our time.”

  • AMI Labs, a world model startup founded earlier this year by fellow AI godparent Yann LeCun, is also reportedly in talks for funding at a multibillion-dollar valuation.

Given their capabilities in real-world perception and action, some developers see these models as a catalyst for massive progress in visual and physical AI, including fields such as robotics, self-driving cars and game development. But creating these models is no easy feat.

“Simulating reality is simulating a dynamic world,” Anastasis Germanidis, co-founder and CTO of Runway, previously told The Deep View. “The static environment problem is much easier to solve than the dynamic world when you want to simulate and understand … the effects of different actions that you can take.”

While world models carry massive promise, they are also far more difficult to build and train than their large language model predecessors. Along with eating up more compute resources and data, creating a machine that can see the world and act on it as humans do is challenging: These machines don’t have millions of years of built-in evolutionary biology to fall back on the way that humans do. And given that the goal is to put these models in charge of physical actions, their mistakes have more dire physical consequences than, say, an LLM hallucinating a pizza recipe that calls for mixing Elmer's glue into cheese.

Nat Rubio-Licht

TOGETHER WITH CLOUDTALK

Speak, Sell, And Support 24/7 With CloudTalk

If you’ve been watching the Winter Olympics, you know humans can do some pretty amazing stuff… but we do have our limitations. Like having to eat, for example. Or needing to sleep. Or the many various activities and responsibilities outside of work that we enjoy. 

But you know who doesn’t care about all of that? CeTe, CloudTalk’s AI Voice Agent. See, CeTe doesn’t have a bed, or hobbies, or dinner plans. This tireless digital sidekick only cares about one thing: helping your team and your customers. That’s why CeTe speaks multiple languages, never sleeps, and handles both inbound and outbound calls with care. 

BIG TECH

Three Apple AI devices, two big questions

Apple is preparing to make its biggest move in AI by launching three AI-powered hardware devices. And that leaves me with a couple of burning questions.

According to a Bloomberg report, the three AI products that Apple is accelerating for launch later this year or in 2027 are:

  • AI smart glasses: These would be Apple's flagship AI accessory, offering the most capabilities and carrying the highest price tag. The pair in development does not include a display, instead looking to differentiate based on design and camera quality. This is consistent with a report from October claiming that Apple had shifted resources from the Vision Pro team to a team working on AI glasses.

  • AirPods with AI powers: This would be a simpler AI device with cameras added to the earbuds and the ability to use the combination of cameras, microphones, and speakers in tandem. This would allow for simple multi-modal AI queries and enhance existing features, such as the recent AI translation capabilities added to AirPods Pro.  

  • An AI pendant: Perhaps the most controversial idea is an AI pin that would be shaped similar to an AirTag and could be worn as a necklace or clipped onto your clothing. This device could offer features similar to the AI earbuds, but at a more affordable price tag. Given the limited interest in other AI pendants and jewelry to date, this is the most surprising product in the report. 

Historically, Apple has entered markets late, perfected its devices, broadened the appeal, and rapidly gained market share. But there are two big questions in my mind about whether this time is different:

  1. Trust: Apple hasn't shown much leadership in integrating generative AI features into its products yet, so these devices would need to be a major turning point. Ahead of the devices' launch, we'll need to see the new Gemini-powered Siri bring a wave of AI capabilities that Apple owners start using frequently to build trust that Apple can deliver high-quality AI.

  2. Execution: The Apple Intelligence launch is still fresh in our minds, and the company made some big promises about AI that it couldn't deliver. That has naturally called into question whether the company has the right talent and DNA to deliver powerful AI experiences that will lead the way in the years ahead.

Earlier this month, The Deep View polled its newsletter audience and asked, "If Apple dramatically improves Siri, would you switch to it as your primary chatbot?" Only 23% said "yes," 69% said "no," and 8% had other thoughts. The success or failure of Apple AI devices will, of course, also be heavily dependent on an effective overhaul of Siri. There isn't currently much in the product to give us hope that it will improve. Everything is riding on the partnership with Google Gemini and Apple's ability to cohesively integrate Gemini capabilities into Apple hardware. This type of partnership is not a standard play from Apple's vertical integration playbook, and that's another factor that raises the stakes.

Jason Hiner, Editor-in-Chief

LINKS

  • Phoenix-4: A real-time human rendering model by Tavus, allowing users to generate and control emotional states and create renderings that actively listen. 

  • Dreamer: A platform in beta that allows users to discover and build agentic apps for personal intelligence. 

  • ZUNA: A 380-million-parameter AI model by Zyphra that can decode EEG signals more accurately for improved brain-computer interfaces. 

  • Fury: An autonomous AI system by startup Scout AI, built for commanding mixed fleets of ground vehicles and drones using natural language.

  • Nvidia: Senior System Software Engineer - Robotics

  • Apple: Software Engineer - Applied Machine Learning & Localization

  • TikTok: Senior Research Engineer - Foundation Models, Ads Integrity 

  • Anthropic: Engineering Manager, Inference Developer Productivity

GAMES

Which image is real?


A QUICK POLL BEFORE YOU GO

Would you consider wearing an AI pin, AI earbud, or AI glasses?


The Deep View is written by Nat Rubio-Licht, Sabrina Ortiz, Jason Hiner, Faris Kojok and The Deep View crew. Please reply with any feedback.

Thanks for reading today’s edition of The Deep View! We’ll see you in the next one.

“[This image] looks more compliant with modern safety standards.”

“Aside from one strut that appeared not to be straight, the details of [this] image - including shadows - were more complex and consistent than those of the [other] image.”

“Touches of rust on struts. [It’s] apparent [there’s] more safety and support in the cars because of struts. ”

“[This one] is too dangerous to be real.”

“AI still struggles with distant shadow casting.”

“I can't imagine such a large modern wheel with open carriages.”

“There are some perspective issues with the alignment of the bars in [this] image.”

Take The Deep View with you on the go! We’ve got exclusive, in-depth interviews for you on The Deep View: Conversations podcast every Tuesday morning.

If you want to get in front of an audience of 750,000+ developers, business leaders and tech enthusiasts, get in touch with us here.