SAE Epistemology Series

Why the Flywheel Can't Stop

The prior wall: your framework lets you chisel more precisely, but only in one direction

DOI: 10.5281/zenodo.19503017 · Full Paper ↗


Once cognition starts, it can't stop.

Not because learning is virtuous, not because stopping means falling behind — but because it's pushed from both sides: what comes in forces you to compress; what you compress can't cover what comes next, so you must compress again.

The flywheel is compelled to turn. It doesn't choose to.

Pressure from Both Sides

The incoming side: too much to hold. The world keeps flowing toward you; you can't turn off your senses. New observations accumulate; old frameworks can't contain them. Compress or be drowned. The Aleph proved the endpoint — discard nothing, all information equally weighted, distinction collapses, meaning collapses, permanent silence. You are compelled to chisel.

The outgoing side: what you've compressed can't cover the new. Even after successful compression, the framework you've built can't handle the next new thing. Because chiseling is lossy — among the things you discarded are some needed to understand the next new thing. Old cognition fails for new observations — not because it was poor cognition, but because that's the structural cost of lossy compression. You must chisel again.

The flywheel is compelled, not chosen.

But "must-cognize-more" immediately raises a question: more of what?

Chiseling is lossy. Deciding what to discard and what to keep requires a framework. The framework gives chiseling efficiency — you don't have to decide from scratch each time; it provides default compression criteria.
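The lossy-compression argument above can be sketched as a toy Python example. Everything here is a hypothetical illustration (the function, feature names, and data are invented for this sketch, not drawn from the essay): a "prior" decides which features survive compression, and a feature it discards turns out to be exactly what the next observation needs.

```python
# Toy sketch of lossy compression under a fixed prior (all names hypothetical).
# The "framework" keeps only the features its prior deems relevant.

def compress(observation: dict, prior: set) -> dict:
    """Lossy compression: discard every feature not in the prior."""
    return {k: v for k, v in observation.items() if k in prior}

# The prior was chiseled on old observations, where color and size sufficed.
prior = {"color", "size"}

old_obs = {"color": "red", "size": 3, "texture": "smooth"}
compressed = compress(old_obs, prior)  # 'texture' is discarded

# A new observation arrives where 'texture' is the discriminating feature.
new_obs = {"color": "red", "size": 3, "texture": "rough"}

# Both observations compress to the same representation: distinction is lost,
# because what the prior discarded is exactly what the new observation needed.
assert compress(old_obs, prior) == compress(new_obs, prior)

# The only remedy is to re-chisel the framework itself: update the prior.
prior_updated = prior | {"texture"}
assert compress(old_obs, prior_updated) != compress(new_obs, prior_updated)
```

The failure is structural, not a matter of compressing "badly": any fixed prior that discards something can be defeated by an observation whose relevant feature was discarded, which is why another round of chiseling is forced.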

Then the problem: once the framework hardens, it becomes a prior wall.

Meta and the Prior Wall

Meta is the canonical contemporary case of a prior wall, because it perfectly illustrates the paradox: the richest data in human history, yet chiseling can only go in one direction.

Meta has billions of users generating behavioral data at unprecedented scale and granularity — clicks, dwell time, shares, comments. If "more data equals better cognition," Meta should be among the most cognitively capable organizations on earth.

What are Meta's AI outputs? Better recommendation systems. More precise ad targeting. More effective engagement optimization.

This isn't a technical limitation. The LLaMA series proves Meta can train frontier language models. But where does that capability get directed? Advertising and distribution.

The problem is in the prior — Meta's framework for seeing the world: "user attention is a commodity, advertisers are clients, platform value equals how much attention it can sell." This framework was a successful product of an earlier chisel-construct cycle — it genuinely worked; Facebook's transformation into an ad platform was a brilliant act of lossy compression.

But that successful construct is now a gravitational field. In 2025 Meta earned roughly $201 billion in revenue, of which roughly $196 billion was advertising. It's not that Meta chooses to ignore the world beyond ads — it's that every new capability entering the system has its trajectory bent by the advertising framework's gravity. Quest, LLaMA, AI features — each new attempt gets shaped by the framework, while the framework itself goes unchiseled.

A prior wall isn't a lack of capability — it's the framework's gravity being too strong. You can chisel with increasing precision, but only in one direction.

Posterior Wall, Prior Wall, Direction Wall

Cognitive systems run into three kinds of walls:

Posterior wall: the limit of pure knowing. You've accumulated vast data, but without a chiseling framework, knowing drowns cognizing. The Aleph is the extreme version — perfectly remembering everything, unable to chisel, finally silent.

Prior wall: the limit of pure understanding. You have a good framework, but it solidifies; new observations get assimilated by the framework rather than updating it. Meta is the canonical case — strong framework, enormous observations, but the framework has locked the direction of chiseling.

Direction wall: the flywheel's dead end. The framework has a direction, the direction has a purpose, the purpose solidifies, and the flywheel turns well but in a closed loop. The direction wall is the next essay's subject.

The structural meaning of "must-cognize-more" isn't "more information" — it's "more layers of lossy compression." The flywheel's health doesn't depend on its speed; it depends on whether the level of chiseling can rise: from chiseling the world to chiseling one's own way of seeing the world.