Apple Finally Admits Siri Is Outdated

Guest Columnist

2026-06-09 13:00

At tonight's WWDC, Apple borrowed models from Google, computing power from Nvidia, and one more year of patience from its users.

AI Summary

Core view: At WWDC 2026, Apple announced a version of Apple Intelligence that depends heavily on Google Gemini and Nvidia GPUs, marking a shift from a fully in-house AI strategy to one of "rebirth on borrowed bones." By controlling the hardware and the security framework, Apple folds external technology into a system-level intelligent experience to keep up with increasingly fierce AI competition, but localization challenges in the Chinese market remain.
Key points:
1. Strategic turning point: Apple has struck a deep partnership with Google, paying roughly $1 billion a year to use a 1.2-trillion-parameter Gemini model while training on-device models (the smallest at 3 billion parameters) through distillation. Core AI inference depends on Nvidia GPUs in Google Cloud.
2. Product launches: Siri becomes a standalone app (Siri AI) with memory and cross-device sync; iOS and iPadOS gain system-level AI features such as notification summaries, email drafting, and camera recognition, which require an iPhone 15 Pro or later.
3. Technology and control: Private Cloud Compute extends to Google Cloud and Nvidia GPUs for the first time. Apple keeps cryptographic control over the PCC software but concedes ground on technological sovereignty, relying on an external "skeleton" of models and computing power.
4. Difficulties in the Chinese market: Apple Intelligence must be localized because of regulatory filing requirements and may ship with reduced features. Missing high-frequency scenarios such as WeChat and Alipay, together with on-device AI competition from domestic phone makers, leaves its prospects uncertain.
5. Historical background: Since launching Siri in 2011, Apple spent years building AI behind closed doors, accumulating on-device capability through acquisitions (such as Workflow) and chips (the Neural Engine). The arrival of ChatGPT forced it to abandon the fully in-house route and turn to partnerships to catch up.

Original author: Sleepy

In the early hours of June 9, 2026, Beijing time, Apple's WWDC 2026 arrived as scheduled.

At the event, Apple renamed Siri to Siri AI, announced a deep collaboration with Google to train its own new generation of foundation models using Gemini's capabilities, and extended Private Cloud Compute to Google Cloud and Nvidia GPUs for the first time.

It released five Apple Foundation Models, ranging from a 3-billion-parameter on-device model to a cloud-based giant specifically optimized for Nvidia GPUs. Almost every everyday app was rewritten. Siri also got its own standalone app, which saves conversations, syncs across devices, and has memory.

This was Apple's most information-packed keynote in years.

Taming a Future

Apple's AI story can be traced back to the fall of 2011, at the iPhone 4S launch, when Siri first took the stage.

At that time, Steve Jobs was gravely ill, and Apple stood at the crossroads of an era. Siri seemed like a little thing that had wandered out of a sci-fi movie. You'd ask about the weather, find a restaurant, or set an alarm, and it would respond in a slightly mechanical tone. For the first time, your phone felt like more than just a cold piece of glass.

Siri originated from SRI International's CALO project, initially a military-grade AI assistant funded by DARPA. Apple acquired it in 2010 for more than $200 million, according to TechCrunch. A year later, Siri debuted with the iPhone 4S, with Apple claiming it could understand natural language and act as a personal assistant.

At that moment, Apple held the world's best entry point for personal intelligence. And then it let it languish for over a decade.

Looking back, the first significant thing Siri changed was the way humans interacted with machines. In 2011, the iPhone was transforming the phone from a communication tool into a personal computing device. The App Store redefined software distribution, and the mobile internet moved from the PC desktop into the palm of your hand. Siri appeared right at the crest of this rising wave. But once inside Apple, it quickly turned from an ambitious personal assistant into a compliant voice remote control.

Apple's core philosophy embraces closure and control. But a truly personal assistant must connect with more services, understand more context, and tolerate more uncertainty. Uncertainty implies errors and privacy risks, the kind of disorder Apple is least adept at handling.

So Siri was limited to deterministic tasks, like a tamed future. It had a name, a voice, and a personality package, yet it lacked the initiative and memory essential for a genuine personality. Users were initially amazed, then started joking about it, and eventually, they just stopped using it much.

Apple was the first to put a "personal assistant" into a phone, and also the first to lock it away.

Today, the entire industry is focused on Agents. In retrospect, the Siri of 2011 was almost their prototype. You could say Apple was the first company to create an Agent prototype, but ended up being the last to finish building it.

AI That Doesn't Look Like AI

During the years Siri didn't grow up, did Apple's AI stand still?

The answer is quite the opposite. Apple did a lot of AI; it just made it look very un-AI-like.

By keynote volume, it might seem Apple suddenly started talking about AI seriously only in 2024. But if you trace the technological path backwards, Apple has been active for a decade.

In 2015, it acquired two companies in quick succession: one to bolster natural language dialogue, and another to explore running deep learning directly on a phone. At that year's WWDC, it introduced the Proactive Assistant, aiming to offer suggestions before the user even asked. The idea was ahead of its time but felt more like a slogan given the technological limitations of the day.

The following year, SiriKit was launched, opening a limited crack for developers. Apple also publicly discussed Differential Privacy, committing to learning from large-scale data while protecting individual privacy. In 2017, the iPhone X brought the Neural Engine, and Face ID and the camera began relying on on-device machine learning. Apple simultaneously introduced Core ML for developers to run models on Apple devices, and acquired Workflow, which later became Shortcuts.

This was a very Apple-like set of answers. It wanted AI, but didn't want to bet everything on the cloud and massive personal data like Google. It wanted developers, but didn't want Siri to become a chaotic mess. So Apple chose the most difficult and slowest path: focusing on on-device processing, privacy, and system integration.

Around 2020, Apple acquired several more companies specializing in low-power edge AI and voice understanding. That same year, the M1 chip was released, featuring a 16-core Neural Engine on Mac, spreading on-device AI power from the pocket to the computer. The next year saw the arrival of Live Text and Visual Look Up, allowing text in photos to be copied and the camera to identify objects, with more voice requests processed locally.

So, Apple didn't launch a standalone AI app for a decade, but it undoubtedly made the phone smarter.

Choosing this path made sense. AI on a phone isn't just a question-answering machine; it needs to see photos, hear voice, understand contacts, invoke apps, and perceive battery, location, and time. It's better if it can do some tasks offline, and ideally, it shouldn't upload every aspect of your life to the cloud. Apple's hardware control allows it to pursue this path.

However, there's a deep chasm between local intelligence and holistic intelligence. Apple excels at breaking technology down into reliable components, but generative AI requires it to reassemble those components into a cohesive whole.

These components lay quietly embedded in the system, waiting for an opportunity.

That opportunity didn't come first. ChatGPT came first.

When ChatGPT emerged in late 2022, Apple wasn't completely unprepared. Tim Cook repeatedly emphasized that AI and machine learning had been core technologies in Apple products for years. Bloomberg also reported in 2023 that Apple had an internal large-model framework called Ajax and a chatbot project.

But the problem wasn't whether Apple had cards to play; it was that the rules of the game had changed.

ChatGPT shifted user attention from "features" to "capabilities." Users began to assume their phones must have AI and then compared which one was better. When ChatGPT could already organize a jumble of thoughts into a coherent email, Siri was still saying, "I found these on the web."

At WWDC 2024, Apple unveiled Apple Intelligence: writing tools, notification summaries, photo search, personalized Siri understanding, and ChatGPT integration. Apple finally acknowledged that relying solely on its own models, at least in 2024, couldn't meet user expectations. However, the promised roadmap ultimately failed to materialize on schedule.

Hiring Google as a Tutor

The delay of Apple Intelligence reflected more than a technical gap: the structure of the entire Siri team had failed to keep pace with this AI wave.

Multiple media outlets confirmed that Apple's original AI head, John Giannandrea, stepped back, with Craig Federighi taking over AI direction. Vision Pro head Mike Rockwell was brought in to lead the Siri team, and many Siri engineers were sent to learn AI programming tools. This wasn't a graceful rotation; internally, Apple realized that with the existing team and pace, it couldn't catch up.

In January 2026, Apple and Google issued a joint statement, announcing that Apple would leverage Gemini technology to customize Apple Intelligence features for iPhones and other products. Reports indicated Apple planned to pay Google approximately $1 billion annually to use a customized 1.2 trillion-parameter Gemini model to power Siri's transformation. Apple had also tested models from OpenAI and Anthropic but ultimately chose Google.

This was entirely different from the ChatGPT integration in 2024. That time, ChatGPT was more like a backup dancer invited by the user when Siri failed. The branding was OpenAI's, and the interface was a pop-up. This time, Gemini went directly into the foundation, becoming part of Apple's new generation of base models.

The key action was distillation. Google gave Apple full access to Gemini. Apple used the large model in Google's data center to generate high-quality answers and reasoning processes, then used these results to train smaller, cheaper models capable of running on an iPhone.
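As a rough illustration of the idea, and not a description of Apple's or Google's actual pipeline, the sketch below shows the classic soft-label form of distillation: the student is penalized by how far its output distribution sits from the teacher's. The article describes training on teacher-generated answers and reasoning; the KL-divergence loss, the temperature value, and the function names here are assumptions chosen for brevity.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities, softened by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)  # student's current guess
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical logits for a three-way choice.
teacher = [2.0, 1.0, 0.1]
matched = distillation_loss(teacher, teacher)             # student agrees with teacher
mismatched = distillation_loss(teacher, [0.1, 1.0, 2.0])  # student disagrees
print(matched, mismatched)
```

Training drives this loss toward zero, which is what lets a small model inherit behavior from a much larger one without copying its parameters.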

The day before WWDC, Apple published a technical article packaging this collaboration as the third-generation Apple Foundation Models, developed in partnership with Google, resulting in five models. On-device, there's the 3-billion-parameter AFM 3 Core, and a 20-billion-parameter sparse model, AFM 3 Core Advanced, which activates only parts of its parameters per request. Cloud-based models include the AFM 3 Cloud, the image model ADM 3 Cloud, and the most powerful AFM 3 Cloud Pro.
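The article does not say how AFM 3 Core Advanced decides which parameters to activate. One common way to build a sparse model is mixture-of-experts routing, where a router scores several sub-networks and only the top few run for a given input. The sketch below assumes that mechanism; the expert functions, scores, and names are all hypothetical.

```python
import math

def top_k_route(router_scores, k=2):
    """Pick the k highest-scoring experts and normalize their gate weights."""
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    chosen = ranked[:k]
    best = router_scores[chosen[0]]  # subtract the top score for stability
    exps = [math.exp(router_scores[i] - best) for i in chosen]
    total = sum(exps)
    return {i: e / total for i, e in zip(chosen, exps)}

def sparse_forward(x, experts, router_scores, k=2):
    """Run only the selected experts and blend their outputs by gate weight."""
    gates = top_k_route(router_scores, k)
    return sum(weight * experts[i](x) for i, weight in gates.items())

# Four toy "experts"; only two of them run for this input.
experts = [lambda x: x, lambda x: 2 * x, lambda x: 3 * x, lambda x: 4 * x]
scores = [0.0, 5.0, 5.0, -1.0]
print(top_k_route(scores))
print(sparse_forward(10, experts, scores))
```

The total parameter count can be large while the per-request compute stays close to that of a much smaller model, which is the usual reason for putting a sparse design on a phone.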

A more practical change lies in computing power. On-device models, however smart, cannot complete all tasks, and Apple's Private Cloud Compute infrastructure alone can't handle full Gemini-level inference. Some requests will run on Nvidia GPUs in Google Cloud. Apple subsequently confirmed the expansion of PCC beyond its own data centers for the first time, with the tech stack covering Nvidia Confidential Computing, Intel TDX, and Google Titan chips. Apple emphasized it still controls the PCC software; devices only trust programs encrypted and approved by Apple, and relevant binaries will be open for inspection by security researchers.
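The trust rule in the last sentence can be pictured as an allowlist check. The sketch below is a heavily simplified assumption, not Apple's protocol: it treats a SHA-256 digest as the "measurement" of a server build and has the device send data only when that measurement appears in a published set. Real attestation also involves hardware-backed signatures, which are omitted here, and every name is hypothetical.

```python
import hashlib

def measure(binary: bytes) -> str:
    """Return the SHA-256 digest used here as a software measurement."""
    return hashlib.sha256(binary).hexdigest()

def device_will_send(attested_measurement: str, published: set) -> bool:
    """Send a request only if the node reports a publicly listed build."""
    return attested_measurement in published

# Hypothetical builds: one published for inspection, one never published.
approved_build = b"pcc-node-release"
unknown_build = b"pcc-node-modified"
published = {measure(approved_build)}

print(device_will_send(measure(approved_build), published))  # True
print(device_will_send(measure(unknown_build), published))   # False
```

Under this model, moving inference onto someone else's hardware does not change what the device is willing to talk to, as long as the set of published builds stays under Apple's control.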

Apple hasn't truly abandoned control, but it has abandoned the pretense of full in-house development.

Bones are Borrowed

To understand Apple's position in the AI era, one must first recognize its core asset.

It's not the chip, not the model; it's the devices. These devices hold photos, emails, calendars, maps, and payments, carrying the fragments of everyday life for billions of people. Any AI that can access these fragments becomes more than just a chatbot; it can become a true personal intelligence hub.

Apple started paving the way for this hub long ago. Workflow, acquired in 2017, evolved into Shortcuts, deeply integrated with Siri and system automation. App Intents, launched in 2022, allowed third-party apps to expose their capabilities to the system. In the Apple Intelligence era, these interfaces serve as the hands and feet for AI to perform real-world actions.

With these interfaces, OpenAI can come in, Gemini has come in, and the Chinese market could find local partners in the future. But they don't enter by taking over the iPhone directly; they are inserted into Apple's permission framework and privacy rules.

Apple isn't primarily afraid of a better model. It's afraid of users bypassing the system and handing their entire digital life to another entry point. If, one day, users open not an app but an AI assistant capable of orchestrating everything for them, Apple would be relegated to a nice-looking shell.

So, from now on, "Apple" in "Apple Intelligence" represents more of a product control layer, rather than complete technological sovereignty. The skin is grown in-house, the clothes are tailored in-house, but the bones are borrowed. Google provides the skeleton, Nvidia provides the joints, and Apple's job is to dress this body in its own clothes and walk it out.

Google gets a massive endorsement from this deal: even Apple admits Gemini's underlying capabilities are more reliable. Nvidia receives another proof point: even with the best consumer-grade chips and ambitions for custom servers, when it comes to frontier inference and complex agent tasks, GPU clouds are unavoidable.

But the more bones you borrow, the less the body is fully your own. Behind every borrowed bone are the commercial interests, regulatory landscapes, and technological cadences of suppliers. If one day the bones need to be pulled back, can Apple stand on its own? It doesn't need to answer this question yet, but it will eventually have to.

A New Tenant Living in the System

Regular people don't care about model parameters. They care if their phone can be less bothersome.

On the WWDC26 stage, Apple said: "There are times when you expect more from Siri."

For Apple, this was practically an apology.

Then it tried to show you a different morning.

You wake up to twenty notifications on your screen. In the past, you'd swipe through them all. Now, the system has prioritized them for you. The boss's message is at the top; ads and promotions are collapsed into a single line of grey text. You open your email; a long work email has been summarized into three sentences. You decide to reply, and Siri drafts a response based on your usual tone with that person. You remember needing to call a merchant about a return in the afternoon; before you even dial, the system has found the order number from an email a few days ago and pasted it onto the call interface.

This is the story Apple wants to tell: a layer of intelligence beneath the system, handling the repetitive cognitive overhead of daily life. Less time reading fluff, less time searching for files, fewer interruptions from notifications.

To tell this story, Apple almost completely reworked Siri's entry point. On the iPhone, it's integrated into the Dynamic Island; swipe down to talk. On iPad and Mac, it merges with Spotlight. It has a standalone app that saves and resumes past conversations, syncing across devices via iCloud. Apple wants Siri to be an AI assistant living within the system, with memory and context, but trying its best not to look like ChatGPT.

Vision is also a crucial direction. The camera has a new Siri mode: point it at food to get nutritional info, point it at anything you don't recognize to identify and search. System-wide dictation isn't just speech-to-text anymore; it automatically adds punctuation, adjusts formatting, and turns spoken words into message-ready text.

The developer side is also being paved. Apple has opened up the Core AI framework, allowing third parties to load their own models on devices. With upgraded App Intents, Siri can better understand third-party apps. The Foundation Models Framework no longer calls only Apple's own on-device models but also supports external providers like Claude and Gemini. Apple is laying a path for the entire ecosystem: if Siri is to perform actions across multiple apps, developers must make their content and actions understandable to the system.

If these plans materialize, Apple AI will be more than just a "chatty Siri."

However, Apple is much more cautious this time. Siri AI will be available to users in beta form later this year, starting with English. And Apple Intelligence might not be the same product in China.

For Chinese users, Apple AI is largely something to watch for entertainment. The keynote is exciting, the features look good, but "not available in China" is the reality.

The Chinese market has a complete set of regulations for generative AI, including security assessments, content safety, and data localization. Apple needs to find local model partners and pass regulatory approvals. Apple Intelligence in China isn't just a matter of being a few months late; the underlying system could be fundamentally different.

US users will see a combination of Apple's own models and Gemini. Chinese users might see a version woven together from Apple's system permissions, local cloud services, domestic models, and regulatory requirements. They are both called Apple Intelligence, but their actual capabilities and reach could be vastly different.

iCloud services in mainland China are operated by GCBD (Guizhou-Cloud Big Data). The cloud stores files, and AI needs to understand them; the cloud stores photos, and AI needs to see them; the cloud syncs notes, and AI needs to extract your plans, habits, and relationships from them. This data has entirely new uses in the AI era, and naturally faces different levels of oversight.

A more immediate threat comes from competition. Domestic phone manufacturers are very fast with on-device large models, Chinese assistants, and imaging AI. For Chinese users, buying a new iPhone for a significant sum, only to find the core AI features unusable, might make them switch brands.

Everyday scenarios in China are particularly tricky for Apple. WeChat, Alipay, Meituan, Douyin, ride-hailing, government services, hospital appointments – these are what many people actually use their phones for every day. An AI assistant that can't enter these scenarios and can't understand group chats, receipts, verification codes, and local colloquialisms can hardly be called "smart."

Understanding a Person

There's another issue with Apple Intelligence: it doesn't cover all iPhones.

iOS 27 supports devices as old as the iPhone 11 and the second-generation iPhone SE, but Apple Intelligence requires at least an iPhone 15 Pro, or an M-series iPad or Mac. The most powerful on-device models have even higher requirements: an iPhone 17 Pro, an iPhone Air, or an M4 iPad or M3 Mac with at least 12GB of unified memory.

Upgrade cycles have been lengthening over the past few years. Screens are good enough, cameras are adequate, and many people don't upgrade their phones annually. AI might be the catalyst Apple needs to stimulate upgrades again. On-device AI requires more powerful chips and memory, making hardware thresholds inevitable. A personal capability packaged as "understanding you better" ultimately becomes a price barrier.

For over a decade, Apple has been asking, "What
