Apple Finally Admits Siri Has Grown Old

区块律动BlockBeats

Guest Columnist

2026-06-09 13:00

At tonight's WWDC, Apple borrowed models from Google, compute from Nvidia, and one more year of patience from its users.

AI Summary

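Key takeaway: At WWDC 2026, Apple unveiled an Apple Intelligence that leans heavily on Google Gemini and Nvidia GPUs, shifting from a fully in-house AI strategy to one of "being reborn on borrowed bones." By using its control over hardware and its privacy framework to fold outside technology into a system-level intelligent experience, Apple is responding to intensifying AI competition, but the challenge of localizing for the Chinese market remains.
Key points:
1. Strategic shift: Apple is partnering deeply with Google, paying roughly $1 billion a year to use a 1.2-trillion-parameter Gemini model, and training on-device models (the smallest at 3 billion parameters) through distillation. Core AI inference relies on Nvidia GPUs in Google Cloud.
2. Product launches: Siri has been upgraded into a standalone app (Siri AI) with memory and cross-device sync. iOS/iPadOS implement system-level AI features such as notification summaries, email drafts, and camera recognition, and require iPhone 15 Pro or later hardware.
3. Technology and control: Private Cloud Compute extends to Google Cloud and Nvidia GPUs for the first time. Apple still retains cryptographic control over the PCC software, but it concedes that it is ceding technical sovereignty and relying on an external "skeleton" (models and compute).
4. Difficulties in China: Because of regulatory filing requirements, Apple Intelligence must be adapted locally and may ship with reduced features. High-frequency scenarios such as WeChat and Alipay are missing, and competition from domestic phone makers' on-device AI leaves its prospects uncertain.
5. Historical arc: Since launching Siri in 2011, Apple long insisted on "closed development" in AI, building up on-device capability through acquisitions (e.g., Workflow) and chips (Neural Engine). The arrival of ChatGPT forced it to abandon the fully in-house route and fill the gaps through partnerships.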

Original Author: Sleepy

In the early hours of June 9, 2026, Beijing time, Apple’s WWDC 2026 arrived as scheduled.

During the keynote, Apple renamed Siri to Siri AI, announced a deep collaboration with Google to train its own next-generation foundation models using Gemini’s capabilities, and extended Private Cloud Compute to Google Cloud and Nvidia GPUs for the first time.

Apple launched five Apple Foundation Models; the smallest, an on-device model, has 3 billion parameters, and the largest cloud model is specifically optimized for Nvidia GPUs. Nearly every everyday app was rewritten. Siri also got its own standalone app that can save conversations, sync across devices, and retain memory.

This was Apple’s most information-packed keynote in years.

Taming a Future

Apple’s AI story can be traced back to the fall of 2011, to the iPhone 4S launch event, where Siri first took the stage.

At that time, Steve Jobs was gravely ill, and Apple stood at the crossroads of an era. Siri felt like something that had stepped out of a sci-fi movie. You asked about the weather, inquired about restaurants, told it to set an alarm, and it would answer in a slightly mechanical voice. For the first time, a phone felt like more than a piece of cold glass.

Siri originated from SRI International's CALO project, a military-grade AI assistant funded by DARPA. Apple acquired the company behind it in 2010; TechCrunch reported the price at over $200 million. A year later, Siri debuted with the iPhone 4S, and Apple claimed it could understand natural language and act like a personal assistant to get things done.

In that moment, Apple had secured the world's best entry point for personal intelligence. Then it squandered that lead for over a decade.

Looking back, what Siri changed first was the way humans spoke to machines. In 2011, the iPhone was transforming the phone from a communication tool into a personal computing device, the App Store was redefining software distribution, and the internet was moving from the PC desktop into the palm of your hand. Siri appeared at the crest of a rising wave. But once inside Apple, it quickly shrank from an ambitious personal assistant into a compliant voice remote control.

Apple fundamentally believes in closed systems and control. But a true personal assistant must connect to more services, understand more context, and tolerate more ambiguity. Ambiguity means mistakes, means privacy risks, means the kind of disorder Apple is least equipped to handle.

So Siri was only allowed to perform deterministic tasks, like a tamed version of the future. It had a name, a voice, a personality wrapper, but lacked the initiative and memory essential for a genuine persona. Users were initially amazed by it, then started making jokes about it, and eventually stopped using it much.

Apple was the first to put a "personal assistant" into a phone, and also the first to lock it away.

In retrospect, the Siri of 2011 was almost a prototype of the agent-based systems the entire industry is building today. It could be said that Apple was the first company to create an agent prototype, yet ended up being the last to complete it.

AI That Doesn't Look Like AI

Did Apple's AI stagnate during all those years Siri didn't grow up?

The answer is quite the opposite. Apple did a lot of AI; it just did it in a way that didn't look like AI at all.

If you go by keynote hype, it seems like Apple only suddenly started talking seriously about AI in 2024. But if you trace the technology path, Apple has been making moves for a decade.

In 2015, it acquired two companies, one to bolster natural language conversation and another to explore running deep learning directly on phones. That same year at WWDC, it introduced Proactive Assistant, attempting to have the system offer suggestions before the user even spoke. The idea was very advanced, but at the time felt more like a slogan given the technological constraints.

The next year brought SiriKit, which opened a limited crack for developers to integrate with Siri. Apple also publicly introduced Differential Privacy that year, stating its commitment to learning from large-scale data while protecting individual privacy. In 2017, the iPhone X brought the Neural Engine and Face ID, and the camera began relying on on-device machine learning. Apple also launched Core ML for developers to run models on Apple devices and acquired Workflow, which later became Shortcuts.

These were very Apple-like answers. It wanted AI, but didn't want to bet everything on the cloud and massive personal data like Google. It wanted developers, but didn't want Siri to become a chaotic mix. So Apple chose the hardest and slowest path: focusing on on-device processing, privacy, and system integration.

Around 2020, Apple acquired several more companies specializing in low-power edge AI and voice understanding. In the same year, the M1 chip was released, bringing a 16-core Neural Engine to the Mac, pushing on-device AI compute from the pocket-sized phone all the way to the computer. The following year, Live Text and Visual Look Up were introduced, allowing text in photos to be copied directly and the camera to identify plants, with more voice requests processed locally without leaving the device.

Apple hadn't launched a standalone AI app in those dozen-plus years, but it had indeed made the phone smarter.

There were good reasons for choosing this path. AI on a phone isn't just a question-answering machine; it needs to see photos, hear voice, understand contacts, invoke apps, and sense battery level, location, and time. It's better if it can do some things offline, and it's preferable not to package up and upload every aspect of a user's life to the cloud for every request. Apple's hardware control gave it the privilege to pursue this route.

But there's a deep chasm between being locally smart and being holistically intelligent. Apple excels at breaking technology down into reliable components, but generative AI requires it to assemble those components back into a cohesive whole.

These components lay quietly embedded in the system, waiting for a catalyst.

But the catalyst didn't come first. ChatGPT came first.

When ChatGPT appeared in late 2022, Apple wasn't entirely unprepared. Tim Cook emphasized on many occasions that AI and machine learning had been core technologies in Apple products for years. Bloomberg reported in 2023 that Apple had internal projects, such as the Ajax large model framework and an internal chatbot.

But the problem wasn't whether Apple had cards to play; the problem was that the rules of the game had changed.

ChatGPT shifted user attention from "features" to "capability." Users started simply expecting AI on their phones and then comparing who had the better one. When ChatGPT could already organize a jumble of thoughts into a coherent email, Siri was still saying, "I found these on the web."

At WWDC 2024, Apple presented Apple Intelligence: writing tools, notification summaries, photo search, personalized Siri understanding, and ChatGPT integration. Apple finally acknowledged that relying solely on its own models couldn't meet user expectations, at least in 2024. But the promises it made were not delivered on the announced timeline.

Hiring Google as a Tutor

Behind the delay of Apple Intelligence wasn't just technology falling short, but the entire Siri team's structure failing to keep pace with the current wave of AI.

Multiple media outlets confirmed that Apple’s former head of AI, John Giannandrea, stepped aside, Craig Federighi took over the AI direction, and Vision Pro head Mike Rockwell was brought in to lead the Siri team, with a significant number of Siri engineers being sent to learn AI programming tools. This wasn't a routine reshuffle; Apple had realized internally that with the old team and the old pace, it couldn't catch up.

In January 2026, Apple and Google issued a joint statement saying that Apple would leverage Gemini technology to customize Apple Intelligence features for iPhones and other products. Reports indicated Apple planned to pay Google approximately $1 billion annually to use a custom Gemini model with 1.2 trillion parameters to support the Siri overhaul. Apple had also tested models from OpenAI and Anthropic but ultimately chose Google.

This was completely different from the ChatGPT integration in 2024. Back then, ChatGPT was more like a backup activated by the user when Siri couldn't answer, branded by OpenAI and appearing in a pop-up interface. This time, Gemini is integrated into the foundation, becoming part of Apple's next-generation base model.

The key action is distillation. Google gave Apple full access to Gemini. Apple uses the large model within Google data centers to generate high-quality answers and reasoning processes, then uses these results to train smaller, cheaper models that can run on iPhones.
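As a rough illustration of the distillation idea described above, the Python sketch below has a stand-in "teacher" label a handful of inputs and then fits a much smaller "student" to those labels. Everything here is an assumption made for illustration: the teacher is a fixed toy function rather than Gemini, the student is a tiny linear classifier, and the function names (`teacher`, `distill`, `mean_loss`) are hypothetical. The article does not describe Apple's actual training recipe.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def teacher(x):
    """Toy stand-in for the large model: returns softened class probabilities."""
    logits = [2.0 * x[0] - x[1], x[1] - x[0], 0.5 * (x[0] + x[1])]
    return softmax(logits, temperature=2.0)


def student_probs(weights, x):
    """Small linear student: one weight pair per class, no hidden layers."""
    logits = [w[0] * x[0] + w[1] * x[1] for w in weights]
    return softmax(logits)


def cross_entropy(target, predicted):
    """Cross-entropy between the teacher's soft target and the student's output."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


def mean_loss(weights, dataset):
    """Average cross-entropy of the student over the teacher-labelled dataset."""
    losses = [cross_entropy(t, student_probs(weights, x)) for x, t in dataset]
    return sum(losses) / len(losses)


def distill(prompts, steps=200, learning_rate=0.5):
    """Step 1: the teacher labels the prompts. Step 2: the student imitates them."""
    dataset = [(x, teacher(x)) for x in prompts]
    weights = [[0.0, 0.0] for _ in range(3)]
    for _ in range(steps):
        for x, target in dataset:
            probs = student_probs(weights, x)
            for k in range(3):
                # For softmax + cross-entropy, d(loss)/d(logit_k) = p_k - t_k.
                grad = probs[k] - target[k]
                weights[k][0] -= learning_rate * grad * x[0]
                weights[k][1] -= learning_rate * grad * x[1]
    return weights, dataset


if __name__ == "__main__":
    prompts = [(0.5, -0.2), (-0.7, 0.9), (0.1, 0.4), (-0.3, -0.8)]
    weights, dataset = distill(prompts)
    print("student loss after distillation:", mean_loss(weights, dataset))
```

The point of the sketch is only the division of labor: the expensive model is queried once to produce training targets, and the cheap model is the one that ships.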

A technical paper Apple published the day before WWDC presented this collaboration as the third generation of Apple Foundation Models, developed with Google. It describes five models: the 3-billion-parameter AFM 3 Core for on-device use; a 20-billion-parameter sparse model, AFM 3 Core Advanced, which activates only part of its parameters for any given request; and, on the cloud side, AFM 3 Cloud, the image model ADM 3 Cloud, and the most powerful, AFM 3 Cloud Pro.
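One common way to build a sparse model that activates only part of itself per request is mixture-of-experts routing, where a gate scores a set of sub-networks and only the top few run. The Python sketch below shows that general technique with toy scalar "experts". It is an assumption that anything like this applies to AFM 3 Core Advanced; the article does not say how that model achieves sparsity, and the names `route_top_k` and `sparse_forward` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Pick the k highest-scoring experts and normalize their weights."""
    ranked = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([gate_scores[i] for i in chosen])
    return dict(zip(chosen, weights))


def sparse_forward(x, experts, gate, k=2):
    """Run only the selected experts and blend their outputs by gate weight."""
    routing = route_top_k(gate(x), k)
    output = sum(weight * experts[i](x) for i, weight in routing.items())
    return output, sorted(routing)


if __name__ == "__main__":
    # Four toy experts; only two of them run for any given input.
    experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: -x, lambda x: x * x]
    gate = lambda x: [0.1, 2.0, -1.0, 2.0]
    output, active = sparse_forward(3.0, experts, gate, k=2)
    print("active experts:", active, "output:", output)
```

The total parameter count covers every expert, but the compute per request covers only the ones the gate selects, which is why a 20-billion-parameter sparse model can be far cheaper to run than a dense model of the same size.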

A more tangible change lies in computing power. No matter how capable on-device models become, they can't do everything, and Apple's Private Cloud Compute infrastructure alone couldn't handle full Gemini-level inference, so some requests are processed on Nvidia GPUs within Google Cloud. Apple subsequently confirmed that PCC has been extended beyond Apple's own data centers for the first time, with a tech stack incorporating Nvidia Confidential Computing, Intel TDX, and Google Titan chips. Apple emphasized that it still controls the PCC software, that devices trust only software Apple has cryptographically approved, and that the related binaries will be open for security researchers to inspect.
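The trust rule in that last sentence can be pictured as an allowlist check: before a device releases a request, it compares a fingerprint of the software the server claims to run against a published list of approved fingerprints. The Python sketch below is a heavily simplified assumption, not Apple's protocol. Real confidential-computing attestation involves hardware-signed reports and certificate chains, all omitted here, and the names `measure`, `is_approved`, and `send_request` are hypothetical.

```python
import hashlib
import hmac


def measure(software_image: bytes) -> str:
    """Return the SHA-256 digest of a software image as a hex string."""
    return hashlib.sha256(software_image).hexdigest()


def is_approved(measurement: str, approved_measurements: frozenset) -> bool:
    """Check a reported measurement against the published allowlist."""
    # compare_digest keeps each comparison constant-time.
    return any(hmac.compare_digest(measurement, known) for known in approved_measurements)


def send_request(payload: str, node_image: bytes, approved_measurements: frozenset):
    """Release the payload only to a node whose software is on the allowlist."""
    if not is_approved(measure(node_image), approved_measurements):
        return None  # refuse: this build was never approved
    return f"sent {len(payload)} bytes"


if __name__ == "__main__":
    # Hypothetical release image; in practice the list would be published for inspection.
    approved = frozenset({measure(b"pcc-release-1")})
    print(send_request("hello", b"pcc-release-1", approved))
    print(send_request("hello", b"tampered-build", approved))
```

Publishing the binaries matters for the same reason: researchers can recompute the fingerprints themselves and confirm that what runs in the data center is what was inspected.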

Apple hasn't truly given up control, but it has given up the pretense of being fully in-house.

Bones Are Borrowed

To understand Apple's position in the AI era, you must first see clearly what its core asset is.

It's not the chips, it's not the models. It's the devices. Devices containing photo libraries, mail, calendars, maps, and payments, carrying the fragments of many ordinary people's lives. Whichever AI can mobilize these fragments isn't just a chatbot; it becomes a true personal intelligence hub.

Apple started paving the way for this hub long ago. Workflow, acquired in 2017, became Shortcuts, deeply integrated with Siri and system automation. App Intents, launched in 2022, allowed third-party apps to expose their capabilities to system entry points. By the Apple Intelligence era, these interfaces had become the hands and feet with which AI takes real-world actions.

With these interfaces, OpenAI can come in, Gemini has come in, and in the Chinese market, local partners can be found in the future. But their method of entry isn't directly taking over the iPhone; it's being integrated into Apple's permission framework and privacy rules.

Apple's biggest fear isn't someone having a better model. Its fear is users bypassing the system and handing their lives over to another entry point. If one day, instead of opening an app, users open an AI assistant that can manage everything for them, Apple becomes just a well-made shell.

So going forward, the "Apple" in "Apple Intelligence" represents product control, but no longer complete technical sovereignty. The skin is its own, the clothes are its own, but the bones are borrowed. Google provides the skeleton, Nvidia provides the joints, and Apple's job is to let this body walk out dressed in its own clothes.

Google gets a massive endorsement from this deal – even Apple acknowledges that Gemini's underlying capabilities are more reliable. Nvidia gets another proof point: even with the best consumer-grade chips and ambitions for self-designed servers, when it comes to frontier inference and complex agent tasks, you still can't bypass GPU clouds.

But the more bones you borrow, the less the body is entirely your own. Behind every borrowed bone lies a supplier's commercial agenda, regulatory considerations, and technology pace. If one day someone wants to pull the bones back, can Apple stand on its own? It doesn't need to answer that question just yet, but it will eventually have to.

A New Tenant Living in the System

Ordinary people don't care about model parameters. What they care about is whether their phone can bother them less.

On the WWDC26 stage, Apple said: "There are times when you expect more from Siri."

For Apple, this was almost an apology.

Then it tried to show you a different morning.

You wake up to a screen piled with twenty notifications. In the past, you'd have to swipe them away one by one. Now, the system has already prioritized them for you – the boss's messages are at the top, ads and promotions are collapsed into a single line of gray text. You open an email; a long work email has been summarized into three sentences. You decide to reply, and Siri drafts a response based on your usual tone with this person. You remember you need to call a merchant in the afternoon about a return, and before you even dial, the system has already found the order number from your emails from a couple of days ago and pasted it onto the call screen.

This is the story Apple wants to tell: a layer of intelligence underlying the system, reducing the daily repetitive cognitive overhead. Read less nonsense, search less for files, get interrupted by notifications less often.

To tell this story, Apple almost completely redid Siri's entry point. On the iPhone, it's placed in the Dynamic Island; pull down to talk. On iPad and Mac, it's unified with Spotlight. It has its own app, capable of saving and continuing past conversations, syncing across devices via iCloud. Apple wants Siri to become an AI assistant living within the system, with memory and context, but tries hard not to make it look like ChatGPT.

Vision is also a significant direction. The camera has a new Siri mode; point it at food to get nutritional info, point it at something unfamiliar to identify and search. System-wide dictation isn't just speech-to-text anymore; it automatically adds punctuation and adjusts formatting, turning spoken language into text ready to send.

The groundwork for developers is also being laid. Apple opened up the Core AI framework, allowing third parties to load their own models on devices. After the App Intents update, Siri can better understand third-party apps. The Foundation Models Framework now supports not only Apple's own on-device models but also external providers like Claude and Gemini. Apple is laying a path for the entire ecosystem: for Siri to perform actions across apps in the future, developers must expose their content and actions in a form the system can understand.
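The requirement that apps expose their actions to the system can be pictured as a registry: each app declares what it can do, and the assistant can only invoke what has been declared. The Python sketch below is a language-neutral illustration of that pattern under stated assumptions. It is not the App Intents API, which is a Swift framework, and the `Intent` and `IntentRegistry` names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Intent:
    """One action an app chooses to expose to the system."""
    app: str
    name: str
    description: str
    handler: Callable[..., str]


class IntentRegistry:
    """System-side catalog of the actions that apps have declared."""

    def __init__(self) -> None:
        self._intents: Dict[str, Intent] = {}

    def register(self, intent: Intent) -> None:
        """Record an exposed action under an 'app.name' key."""
        self._intents[f"{intent.app}.{intent.name}"] = intent

    def describe(self) -> List[str]:
        """List the exposed actions so an assistant knows what it may call."""
        return [f"{key}: {item.description}" for key, item in sorted(self._intents.items())]

    def invoke(self, key: str, **params) -> str:
        """Run an exposed action; anything not registered is unreachable."""
        if key not in self._intents:
            raise KeyError(f"no app has exposed the action {key!r}")
        return self._intents[key].handler(**params)


if __name__ == "__main__":
    registry = IntentRegistry()
    registry.register(
        Intent("notes", "create", "Create a note", lambda text: f"note saved: {text}")
    )
    print(registry.describe())
    print(registry.invoke("notes.create", text="buy milk"))
```

The design choice this highlights is that the assistant never reaches into an app directly; it sees only what the developer has declared, which keeps the system's permission framework in the loop.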

If these plans materialize, Apple AI will no longer just be "a chatty Siri."

But this time, Apple is much more cautious than in the past. Siri AI won't be available to users until later this year, initially in beta and only in English. And the same Apple Intelligence, when it reaches China, might not be the same product at all.

For Chinese users, Apple's AI is basically something to watch rather than use. The keynote is exciting, the features look good, but the label reads "currently not supported in China."

The Chinese market has a full set of regulations for generative AI, including filing requirements, content security, and data localization. Apple needs to find a local model partner and pass regulatory approval. The issue for Apple Intelligence in China isn't just about launching a few months late; its underlying structure might be completely different.

US users see the combination of Apple's own models plus Gemini. Chinese users might see a version mixed from Apple's system permissions, local cloud services, local models, and regulatory requirements. They are all called "Apple Intelligence," but their actual capabilities and reachable boundaries could be vastly different.

iCloud services in mainland China are operated by Guizhou Cloud Big Data. The cloud stores files, AI needs to understand files; the cloud stores photos, AI needs to interpret photos; the cloud syncs notes, AI needs to extract your plans, habits, and relationships from notes. This data has entirely new uses in the AI era and naturally faces different levels of scrutiny.

A more realistic threat comes from competition. Domestic phone manufacturers are moving very fast in on-device large models, Chinese-language assistants, and imaging AI. A Chinese user who spends a lot on a new iPhone only to find its core AI features unusable might well switch brands.

The daily scenarios in the Chinese market are particularly tricky for Apple. WeChat, Alipay, Meituan, Douyin, ride-hailing apps, government services, hospital appointment systems – these are what many people use their phones for every day. If an AI assistant can't enter these scenarios, can't understand group chats, receipts, verification codes, and various expressions that only locals instantly grasp, it can hardly be called "intelligent."

Understanding a Person

Apple Intelligence also has another problem: it doesn't cover all iPhones.

iOS 27 might be compatible down to the iPhone 11 and second-generation iPhone SE, but Apple Intelligence requires at least an iPhone 15 Pro or later, or M-series iPads and Macs. The most powerful on-device models demand even more: iPhone 17 Pro, iPhone Air, or M4 iPads/M3 Macs with at least
