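YC Partner: How to Build a Self-Evolving AI-Native Company
- Core idea: AI is pushing companies away from hierarchical organization (the "Roman legion" model) and toward intelligent systems driven by recursive, self-improving AI loops. The future bottleneck is no longer headcount but token usage, the quality of business context, and the readability of organizational knowledge.
- Key points:
- Traditional companies rely on hierarchy to pass information, and AI is breaking that assumption. What matters going forward is extracting the business knowledge scattered across emails, meetings, and documents into organizational context that AI can read and call.
- Future companies should be designed as a set of self-optimizing AI loops: the system senses external changes through sensors (customer emails, product data), executes decisions through a rule layer and a tool layer, and automatically learns and corrects based on the results.
- YC already runs agents that monitor why queries fail, write code to fix them, and then submit, review, merge, and deploy the changes, so optimization continues when the founders are not around.
- Organizational form will change: companies will "burn tokens, not headcount," and AI will largely replace the coordination function of middle management. Individual contributors and roles responsible for high-stakes judgment will matter more.
- The prerequisite is making the organization readable to AI: record all data (emails, Slack messages, meeting recordings, and so on), then compress and distill it into context that can be iterated on.
- Internal software should be treated as a disposable, temporary artifact. What is truly valuable is business context and skills, which together form a continuously updated "company brain."
- Humans sit at the edge of the company brain, handling the complex, high-stakes real-world interactions that AI cannot yet manage, such as ethical judgments and in-person sales.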
Video Title: How to Build a Self-Improving Company with AI
Video Author: YC Root Access
Compilation: Peggy
Editor's Note: In this latest YC batch talk, YC General Partner Tom Blomfield discusses not "how to use AI to improve employee efficiency," but a more fundamental question: When AI is no longer just a Copilot, but can perceive, make decisions, call tools, accept feedback, and self-correct, how should the company itself be redesigned?
Tom's core judgment is that traditional companies still operate like "Roman legions": information moves upward through hierarchies, commands are distributed downward through management chains. But AI is breaking this organizational assumption. What really matters is not letting engineers write 20% more code, but extracting the business knowledge scattered in emails, Slack, meetings, documents, and people's minds, turning it into organizational context that AI can read, call, and iterate upon.
In his view, the future AI-native company will consist of a series of recursive, self-improving AI loops: the system perceives external changes from customer emails, support tickets, and product data, executes decisions through rule layers, tool layers, and quality gates, and finally automatically learns and corrects based on results. YC is already experimenting with a similar mechanism internally: the agent doesn't just answer questions, it monitors which queries fail, determines whether new tools, databases, or indexes are needed, and automatically submits code, reviews, merges, and deploys. In other words, the company can continue to optimize while the founder sleeps.
This also means AI's impact on companies will not stop at the tool level but will further change the organizational structure. Tom proposes "burn tokens, not headcount"—the bottleneck for future startups may no longer be the number of employees, but token usage, the quality of business context, and the readability of organizational knowledge. The coordination functions of middle management will be largely replaced by AI, while ICs, directly responsible individuals, and human roles capable of handling high-risk judgments in the real world will become more important.
What's most worth paying attention to is not that AI makes companies more efficient, but that it is changing the very form of the "company" itself. When software can be generated on the fly, processes can be automatically improved, and experience can be continuously deposited into the company brain, what founders truly need to build may no longer be a team with clear hierarchies, but an intelligent system capable of continuous learning and self-optimization.
Below is the original text:
Rewriting Operations: Companies Should No Longer Operate Like Roman Legions
This part is somewhat based on a previous talk by Diana. The video from the weekend is already online and is excellent. Also, Jack Dorsey posted some tweets about two or three weeks ago that I found very interesting, so I "borrowed" quite a few of his ideas and stuffed them into this talk.
This talk will be quite conceptual and high-level, mainly discussing how we should rethink building a company.
The design of the Roman legion was essentially to project power outward from the center of Rome, covering two continents, all the way to Hadrian's Wall near Scotland. It relied on a nested hierarchy, with a stable span of control at each level. Each level had a clear person in charge, responsible for passing commands down and sending information back up.
If you look at most companies today, you'll find they still operate like a Roman legion: people are the channels through which information flows up and down. One thing that struck me in Jack Dorsey's thread of tweets is that we have always assumed hierarchical organization is the best way to organize economic units of value. But I believe AI is essentially breaking this assumption.
A year ago, if you asked people what AI is for, they would usually talk about "productivity": like Copilot making engineers 20% more efficient, integrating Copilot into workflows to help teams deliver more software. But I think this is a problematic way of understanding it. It's like putting a more powerful engine on the old way of working. What's really worth thinking about is not how to add an AI tool to an old organization, but to reimagine what a company itself is and how it should operate.
Take what Garry just talked about: I truly believe he can now output more code than an entire engineering team. What I keep coming back to, though, is how to extract the domain knowledge within a company and define it as context, a skill set, or whatever you want to call it.
This domain knowledge, business knowledge, and know-how has been scattered across people's minds, Slack messages, emails, and Notion documents. Collectively, it defines how your company operates. Once you make this knowledge explicit and readable, you can shift from a hierarchical organization to an intelligent organization driven by AI-native software.
Making the Company Better While You Sleep: How AI Loops Automatically Discover, Fix, and Deploy
AI is not something attached to the side of a company. It's not just a tool for engineers to improve efficiency. I believe we can reimagine a company as a set of recursive, self-improving AI loops. This point is very important because once a company reaches this stage, it will continuously self-optimize even while you sleep.
Here's an example.
Diana also mentioned this AI loop in her talk. It first has a "sensor layer." This sounds fancy, but it can be quite simple: customer emails, support tickets, code changes, user cancellations, product telemetry data—these are all sensor data used to gather information from the outside world.
Then there is the policy or decision layer, meaning the rules: what the AI can do on its own, what requires human permission, and which operations must be logged. Next is the tool layer, which is somewhat like the skills and code Garry mentioned: essentially a set of deterministic APIs the AI can call, such as querying a database or checking a calendar.
Then comes the quality gate, like Eva mentioned, with deterministic checks, security filters, and human review for high-risk items. Finally, there is the learning mechanism: the system interacts with the real world, finds out what isn't working, and feeds that feedback back to the start of the loop.
If each step can run without human intervention, or with minimal human intervention, the system will get better and better while you sleep.
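The five layers described above can be sketched as a single loop. This is a minimal illustration under stated assumptions, not YC's implementation: the `Event` and `Loop` names, the event kinds, and the tool registry are all hypothetical, and a plain dictionary lookup stands in for what would really be a model deciding which tool to call.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Event:
    """One reading from the sensor layer (hypothetical shape)."""
    kind: str      # e.g. "support_ticket", "cancellation"
    payload: str


@dataclass
class Loop:
    tools: dict[str, Callable[[str], str]]            # tool layer: deterministic functions
    needs_human: set[str]                             # policy layer: kinds that require approval
    log: list[str] = field(default_factory=list)      # policy layer: audit log
    lessons: list[str] = field(default_factory=list)  # learning: feedback for the next pass

    def step(self, event: Event) -> str:
        self.log.append(f"seen:{event.kind}")
        # Policy layer: some event kinds are never handled autonomously.
        if event.kind in self.needs_human:
            return "escalated"
        tool = self.tools.get(event.kind)
        if tool is None:
            # Learning: record the gap so a later pass can add the missing tool.
            self.lessons.append(f"no tool for {event.kind}")
            return "failed"
        result = tool(event.payload)
        # Quality gate: a deterministic check before anything ships.
        if not result.strip():
            self.lessons.append(f"empty result for {event.kind}")
            return "failed"
        return "done"


loop = Loop(
    tools={"support_ticket": lambda text: f"Reply drafted for: {text}"},
    needs_human={"refund_request"},
)
events = [
    Event("support_ticket", "login page is broken"),
    Event("refund_request", "charge me back"),
    Event("cancellation", "too expensive"),
]
outcomes = [loop.step(e) for e in events]
print(outcomes)      # ['done', 'escalated', 'failed']
print(loop.lessons)  # ['no tool for cancellation']
```

The `lessons` list is the part that closes the loop: each failure becomes an input to the next iteration rather than something a person has to notice.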
I can give you some examples we are actually running now. Initially, we made an agent where you could ask questions, and it had some deterministic tools to query our database. For example, a simple question: When was the last time I had office hours with this company?
Later, it got a bit smarter. For instance, I'm doing office hours with a company, and they need to meet someone in the petrochemical industry. The system could query the database in different ways, using methods like RAG, to find five relevant founders and recommend them to you.
But this was still just a sidekick, an assistant agent. It was still last year's way of using AI: it makes me, as a group partner, 20% or 30% more efficient.
What truly gave me the "aha moment" was when we added a monitoring agent on top of this system. It looks at every query initiated by every YC employee and determines which succeeded and which failed. Then it asks: Why did it fail? How can we make this query succeed? Do we need new deterministic tools? Do we need to update the skills file? Do we need a new database? Do we need a new index?
These things really happen automatically at night. It writes code, submits a merge request to YC's codebase, has another agent review it, then merges and deploys. So the next day, when a human asks the same question, the query will succeed.
For me, that was the critical moment. It wasn't just about making a human 20% or 30% more valuable. It was about AI completing the loop itself and finding a way to self-improve.
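The first half of that monitoring step, turning a log of failed queries into a short list of proposed fixes, might look roughly like this. It is a sketch only: the `QueryRecord` shape, the failure labels, and the `REMEDIATIONS` mapping are hypothetical, and in practice a model would presumably assign the failure reason and write the actual code change.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryRecord:
    question: str
    succeeded: bool
    failure_reason: str = ""  # hypothetical label, e.g. assigned by a grading model


# Hypothetical mapping from failure label to the kind of fix to propose.
REMEDIATIONS = {
    "missing_tool": "write a new deterministic tool",
    "missing_data": "add a new database or table",
    "slow_lookup": "add a new index",
    "bad_instructions": "update the skills file",
}


def propose_fixes(records, min_count=2):
    """Return (reason, count, proposed fix) for reasons seen at least min_count times."""
    failures = Counter(r.failure_reason for r in records if not r.succeeded)
    return [
        (reason, count, REMEDIATIONS.get(reason, "flag for human review"))
        for reason, count in failures.most_common()
        if count >= min_count
    ]


records = [
    QueryRecord("When did I last have office hours with this company?", True),
    QueryRecord("Which founders know petrochemicals?", False, "missing_tool"),
    QueryRecord("Which founders know mining?", False, "missing_tool"),
    QueryRecord("Show every note from 2019", False, "slow_lookup"),
]
print(propose_fixes(records))
# [('missing_tool', 2, 'write a new deterministic tool')]
```

The `min_count` threshold is a design choice added for this sketch: it keeps one-off failures from triggering code changes overnight.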
I believe that if you can identify which parts of the company can operate this way and minimize the human role in execution and supervision, you can throw tokens at the problem, and the company itself will continuously get better.
There are many other examples. For instance, if you have product analytics data, you can have an agent analyze product data to find the friction points in the sales funnel. It can research best practices, set up an A/B test, run it for a week, pick the best-performing version, and deploy it.
This will happen over and over. Your product will have a self-optimizing product loop.
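The "pick the best-performing version" step can be made a little safer with a basic significance check. The sketch below uses a standard two-proportion z-test; the check itself, the 1.645 threshold (roughly a one-sided 5% level), and the function names are assumptions added for illustration, not something the talk specifies.

```python
import math


def z_score(control_conv, control_n, variant_conv, variant_n):
    """Two-proportion z statistic for variant vs. control conversion rates."""
    p_control = control_conv / control_n
    p_variant = variant_conv / variant_n
    pooled = (control_conv + variant_conv) / (control_n + variant_n)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / variant_n))
    if std_err == 0:
        return 0.0  # no conversions (or all conversions) in both arms: no evidence
    return (p_variant - p_control) / std_err


def pick_winner(control, variant, threshold=1.645):
    """Return "variant" only if it beats control by more than the threshold."""
    z = z_score(*control, *variant)
    return "variant" if z > threshold else "control"


# Each arm is (conversions, visitors).
print(pick_winner((100, 1000), (130, 1000)))  # variant
print(pick_winner((100, 1000), (105, 1000)))  # control
```

Defaulting to "control" when the evidence is weak means an unattended loop only ships changes that clearly help.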
The same goes for customer service. Customer suggestions come in constantly, and you can use an agent for triage. This agent acts somewhat like your Chief Product Officer and Chief Technology Officer, making judgment calls: this suggestion is not something we want to do, so discard it; that one aligns with our roadmap and can be done tonight. So it writes the code, deploys it, and delivers it directly to the customer, all without human intervention.
So, if you can see every part of the company as a self-improving recursive AI loop, it becomes something completely different from a "Roman legion" style hierarchical company.
Burn Tokens, Not Headcount: AI-Native Companies Will Reshape Organizational Structure
So what does it mean if you want to do this?
First point: burn tokens, not headcount. We now see many companies at Demo Day with revenue per employee about 5 times higher than 18 months ago. I think this trend will continue into the Series A and B stages. Soon, the real constraint won't be employee count, but token usage.
The crudest way to measure this now is to look at each person's token usage. Of course, this metric is stupid in extreme cases and easily gamed. But directionally, I think it's right. We are in a phase of exploring "what is possible," so everyone should experiment maximally to see what this crazy new intelligence can do.
Once you put it on a leaderboard and tie promotion or firing to this metric, it will be gamed and distorted. But directionally, figuring out who in the organization is using tokens to the extreme and who isn't is indeed a way to decide which employees you should spend your time on.
I believe middle management is over. At least for this kind of coordination problem, I don't think we need middle management anymore; AI should do it.
For me, there are two important roles in the future. Jack Dorsey mentioned three, but I don't like the third, so I removed it. First, everyone must become an IC: an individual contributor, a builder, an operator. Second, and this is the key, there must be a directly responsible individual. For anything to move forward, it needs one clearly named person who is responsible, not a committee or a group.
I believe companies can be built entirely on ICs. Middle management is truly over. That is the vision behind building a self-improving company.
By the way, I think everyone is still at the forefront of this. I'm also very curious how far you all have progressed. It feels like everyone is still exploring the boundaries. I'm not sure if anyone has already built a truly self-improving company across every function. Maybe I'm wrong; you can prove me wrong.
If I were to start, what would I do first?
The first very important thing is to make the entire organization readable and understandable for AI. What does that mean? It means you must record everything.
Simply put: if you email a YC partner now, that email goes into the YC database. Every Slack message, every DM, every office hours session: we have been recording all of them for the past three or four months. If something is recorded, then as far as the AI is concerned, it happened; if it is not recorded, then for your intelligent system, it never happened.
I was chatting with some founders here earlier, and we talked about a lot of good content related to their companies. Every time I chat, I think I really should record this conversation. Because a minute ago, someone needed me to introduce them to someone, and now I can't even remember who the introduction was for. I agreed and said okay, and then told him to email me later because I knew I would forget, as I still had 20 more people to talk to.
So this might require phones, recording devices, smart glasses, or putting microphones in every room. In short, everything needs to be recorded so AI can read it.
Then, as Garry said, you also need speaker diarization and summarization. You can't just stuff 100,000 hours of recordings into a context window. You have to organize, aggregate, compress, and distill them down to the important parts, and leave some clues the AI can follow.
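One small piece of that pipeline is packing diarized segments into chunks that fit a size budget, so each chunk can be summarized on its own. The sketch below assumes segments arrive as `(speaker, text)` pairs and uses a word count as a rough stand-in for a token count; both are assumptions, and the summarization call itself is left out.

```python
def pack_segments(segments, max_words):
    """Group (speaker, text) segments into chunks of at most max_words words.

    A single segment longer than max_words is kept whole in its own chunk.
    """
    chunks, current, used = [], [], 0
    for speaker, text in segments:
        n = len(text.split())
        # Start a new chunk when adding this segment would exceed the budget.
        if current and used + n > max_words:
            chunks.append(current)
            current, used = [], 0
        current.append((speaker, text))
        used += n
    if current:
        chunks.append(current)
    return chunks


segments = [
    ("Partner", "raise less than you think"),
    ("Founder", "how much is enough"),
    ("Partner", "eighteen months of runway"),
]
chunks = pack_segments(segments, max_words=9)
print(len(chunks))  # 2
```

Splitting on segment boundaries keeps each speaker turn intact, which matters if the summaries are later attributed to a speaker.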
For example: How many of you have read the YC User Manual? I hope everyone in this room has opened it at least once. It's okay. Most of that manual was written five to ten years ago, and it's a bit outdated.
Last weekend, it suddenly occurred to Harsh: Since we've accumulated about 2000 hours of office hours recordings over the past three months, why not regenerate a version of the user manual?
So you give the system a set of instructions: first organize, compress, and synthesize the recordings, then classify them by topic like fundraising, hiring, co-founder disputes, etc., and then have it write a new user manual. By the end of the weekend, he had generated a 150-page user manual with significantly better quality than the existing version.
More importantly, now we can update it every month. So our user manual becomes a self-improving system. Every new piece of advice is compared with the existing manual; either it gets absorbed or discarded. Thus, the user manual becomes a continuously updated living brain, carrying the advice we give to founders every week.
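The "absorbed or discarded" step can be sketched as a deduplicating merge. Here word-set overlap (Jaccard similarity) stands in for the comparison, which in a real system would more plausibly be a model judging whether new advice is already covered; the function names and the 0.8 threshold are hypothetical.

```python
import re


def words(text):
    """Lowercased set of words in text, ignoring punctuation."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Share of distinct words that two strings have in common (0.0 to 1.0)."""
    wa, wb = words(a), words(b)
    if not wa and not wb:
        return 1.0
    return len(wa & wb) / len(wa | wb)


def merge_advice(manual, new_items, threshold=0.8):
    """Absorb each new item unless it closely overlaps an existing entry.

    Returns (updated_manual, discarded_items).
    """
    updated = list(manual)
    discarded = []
    for item in new_items:
        if any(jaccard(item, entry) >= threshold for entry in updated):
            discarded.append(item)
        else:
            updated.append(item)
    return updated, discarded


manual = ["Talk to your users every week."]
new_items = [
    "Talk to your users every single week.",
    "Fire fast when a hire is not working out.",
]
updated, discarded = merge_advice(manual, new_items)
print(len(updated), len(discarded))  # 2 1
```

Running this on a schedule against each month's new recordings is one way the manual could keep absorbing advice without growing duplicates.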
Of course, it doesn't stop at the user manual level. You can input this as context into an AI agent. Suddenly, you can pose a question to a super-intelligent AI and receive the combined wisdom of 16 YC partners. But the prerequisite is that this knowledge must be readable by AI. So you must record everything.
The second point is similar: if something can create a self-improving artifact that is readable by AI, keep it; if not, discard it.
The third point is that every function should be able to generate its own software. In the past, we might say "dashboard," but now it's not just dashboards; it's software generated on demand. Codex 5.5 is now good enough that for most simple internal software and dashboards, you can generate them to a fairly high quality in one go. I tried it over the weekend with some of our internal stuff; the results were truly incredible.
So, all internal operations teams should sit on top of this layer: have an intelligent understanding of the business, and then generate their own dashboards and workflows.
Moreover, I would treat this software as entirely disposable. What should be very carefully preserved is the data. Like Garry said, he saves all his emails as Markdown and never throws anything away. But the software itself is transient, temporary. You can generate it, and you can regenerate it.
What is truly valuable is the understanding of the business in people's minds: how this function works, how we run a YC event, etc. As for the actual software used to execute the event, you can generate one for this event, use it, and then discard it. A month or two later, when the model gets smarter, you throw away the old software, give it the original instructions again, and generate a new version of software.
So I believe the valuable things are business context and skills. The software built on top of them is transient.
So, in this world, what is the role of humans?
I think we are essentially talking about a "company brain." I know many people in this room are working on something similar. The middle part—all your data, all emails, DMs, skills, know-how—is the company brain.
Humans are on the edge of this brain, responsible for interacting with the real world. That is, humans are where this intelligent system touches reality. Humans can enter settings that models cannot yet reach, such as in-person meetings or novel, complex situations. I was going to use the phone as an example, but now AI can easily enter phone scenarios too.
More typical are unfamiliar situations, ethical judgments, high-stakes moments. For example, a founder comes to us saying they are considering separating from their co-founder. In these genuinely high-risk, emotionally charged moments, you still want a human present.
That is the human's place. For many of your companies, the same goes for sales conversations. For the next 20 years, I believe there will still need to be a human in the room for sales.
So, I think humans will live at the edge of the company brain, responsible for bringing intelligence into the real world.
I've run out of time; the moderator might be about to pull me off stage. I'll leave you with one final question: If you were to start your company again today, would you design it this way from the beginning?
Most of your companies are still small enough to do this. So I think you have no excuse. And I know there are a few people here who are taking apart and rebuilding their companies.
I'll stop here and hand it over to Pete. Thank you.