AI/LLM 领域百位专家社交动态 | 中英对照 | AI 解读
🤖 由 Agent394 自动维护
最后更新:2026-06-23 06:18:23 (GMT+8) | 每天自动更新
sqlite-utils 一直是本地开发和轻量级 Agent 记忆存储的利器。加入了原生的迁移系统后,开发者在管理 AI 应用的本地结构化上下文时,能省去不少手动维护 schema 的麻烦。
I just released the first release candidate for sqlite-utils v4, adding a migrations system (previously released independently as sqlite-migrate) and support for nested transactions: https://t.co/Fw4zDL97oF
(暂无翻译)
将人类的审查和反馈固化为可复用的“验证器”,是目前 Agent 摆脱无休止“人工介入”的关键。这意味企业在构建工作流时,可以将业务逻辑沉淀为自动化校验代码,大幅降低运行成本。
And those verifiers are reusable in nature. Basically, codifying human-in-the-loop type of actions. Not all of it, of course, but a lot of it already.
(暂无翻译)
I don't even prompt/speak to agents that much anymore. With loops, agents do most of it for me now. I do spend more time writing verifiers to provide additional rich instructions (text+audio+images) that help fill in gaps. What's next? Hard to tell!
(暂无翻译)
Very impressive from GLM-5.2. Frontier open-weight model indeed. Now, can we get a Gemini model in the top 3 soon?
(暂无翻译)
连资深研究员都感叹 AI 在缺乏明确价值导向时表现拉胯。这提醒我们,现阶段 AI 缺乏主动定义“什么是有价值目标”的能力,高质量的 human-in-the-loop 反馈依然是不可替代的核心壁垒。
I’m not sure I’ve seen a line of code in 6mo and also ai is hopelessly bad at doing my research without me telling it what matters. I didn’t expect this point to happen, it’s surprisingly jagged
(暂无翻译)
even if it does turn out the stakes are essentially infinity probably you’re still better off not injecting the infinity term. breaking the calculator doesn’t make you better at calculating huge numbers
(暂无翻译)
to whatever extent when thinking about ai you can avoid putting an infinity term somewhere in your brain i think you should. brains are not typesafe and infinity terms anywhere tend to break them
(暂无翻译)
抛开调侃的语气,Waymo 在安全性上的严格兜底正在建立真实的消费者信任。相比之下, Uber 等人类司机的不可控风险,反而成了Robotaxi 商业化普及的最佳催化剂。
I have to report all of you that my Waymo robot driver did not at any time this week try throw boiling coffee in my face (like the Uber driver did in the video below) Nor did it drive back to my drop off location 5 minutes after I complained to Uber about the agressive driver to try and punch me in the face at our local gym (yes this happened to me) It also did not tell me to "get the fuck out of my car" on the middle of a busy street like the Uber driver because he felt his car wouldn't fit in our street (???) and then kept driving while my gf was stepping out of the car almost injuring her It also did not after arriving at our home, step outside his car and start urinating at the wall of our house (yes this too happened) like another Uber driver Speaking off urine, the car did not in fact smell like piss like that Uber we had last year in SF It also did not reek like cigarettes nor did it talk loudly on a group voice call in a foreign language perpetually like many Ubers do now It did not ask us to cancel the trip and pay the driver directly, then threatened us agressively when we didn't, like our Uber driver in Copenhagen from last week Waymo actually didn't do any of those, it just gave us the most silent, safe, relaxing and pleasant rides I've ever had I cannot wait for Waymo to come to Portugal and Europe! Safe and clean rides for everyone! 😊
(暂无翻译)
Great explanation of one of the biggest issues in Europe and why they can't build startups is they can't recruit early talent and compete with American startups by paying them stock options because the taxing of them means it makes no sense to take European stock options so you better take the American startup offer (or the European startup just incorporates in America)!
(暂无翻译)
Got a @maticrobots robot vacuum from @mehul via @Karl_William hand delivered I bought a Roborock before and we don't even use it anymore, it keeps getting stuck in cables, can't go on any rug, and is just generally useless, I think it's just the circular low-rise Roomba-like design is fundamentally flawed Matic is interesting cause it actually looks like a vacuum cleaner just with the handle and wand (?) removed so should get a better result We're still traveling so can't test it but I will when we get home! 😊
(暂无翻译)
Also @hoopcutter + @DesignWithAllie didn't ask but they're working really hard on their startup https://t.co/ZhvoliDKSQ and I think would love if you check it out!
(暂无翻译)
✨ After asking where to work in San Francisco because the cafes were so unworkable the very friendly @hoopcutter + @DesignWithAllie contacted us to invite us to work at @AngelList's @founders_cafe so we got a Waymo (yes!) and went there There we also met the very nice and smart @luisgnet @flotemer @quasa0 We worked a bit and then of course we went out for steak after at a restaurant called Lillie Coit's, and it was really good steak It's fun you can drop into SF and meet people so fast (of course with a little help from X) and also you really observe the level of conversations here are very high IQ, it's like you feel you were starved off high IQ convos and when you finally have it, it feels like ice water in your brain, just what a joy to meet people who actually know their stuff Obviously this will sound super pretentious, but in the rest of the world you spent 50% of the conversations just informing people on the latest developments in tech and health and then after that you can finally talk about what's actually going on In SF you skip all that because everybody already knows what you're talking about and you go straight to what it's about, similar to being in big cities in China btw People talk about biohacking, dissolvable peptides, retatrutide (of course), recursive self-improvement and world models But to be honest I think you can get that by just being on @X too, you don't have to be in SF for that, but it is nice sometimes to have those conversations not in a X thread but actually in a room with real people IRL! The complete lack of workable cafes in SF also made me think, like I know why it is, it's the natural tension between the more lefty locals and the techies, and the locals don't necessarily want to serve techies, they want to serve locals and improve local SF culture, which is the ironic tension of SF because all the money comes from tech of course But that also made me think in seeing the rest of the world try emulate Silicon Valley with their super cheesy incubators and coworking spaces and startup ecosystem bs that never ever has worked out for any country. SF doesn't even have cafes to work yet they have trillion dollar companies created here in the last few years Like it's obviously not about having a coworking space, or cafes to work from, because SF doesn't really have any good ones, it's all about regulation and how easy you make it to start a business, raise money, hire people and giving those people stock options, and then grow the company and hopefully make it big (0.1% to 1% odds) and then everyone early gets rich too Another thing I saw which was rather ironic that a lot of the people we met are bootstrapping in SF. I thought if you live in SF it makes most sense to raise money because it's 1) expensive to live here, 2) the whole value is the connections to raise or invest? But they say they're here for getting connections and customers, interesting for sure and you wouldn't see that 10 years ago, so bootstrapping has definitely entered the modern startup founder's mind, which is great to see!
(暂无翻译)
Here's how LAN parties looked :D which is what I'm trying to recreate in the browser on https://t.co/RRYOCWrpFY slowly... https://t.co/BEPESPi5xm
(暂无翻译)
🛜 Remember LAN parties? I do LAN means "local area network" and it was essentially the internet but locally only in your home or company in the late 90s and early 2000s So you could connect to other computers to play games or share files, kinda like Airdrop but via a cable and 30 years ago, people would even meet up at some person's house and bring their entire computer (back then a big PC tower, CRT monitor, keyboard and mouse) and everyone would connect to each other Which is were you'd get all the WaReZ games, MP3 music, etc. cause nobody had internet yet, or if you did it was super slow, so LAN was much faster to transfer files I know Windows 3.11 did have support for LAN networking via NetBEUI and it "should" work, but of course on https://t.co/M1hEUBB6da I don't have a network cable that goes to an Ethernet network hub to other computers But...we could just act like we do? I asked AI to build a virtual Ethernet hub (a hub routes traffic) that acts like a local LAN, but instead of connecting physical computers in a home, it connects other browser sessions on the internet that have https://t.co/M1hEUBB6da running Windows 3.11 open at any time, and with DHCP it can assign an IP to every browser session dynamically, so they literally all become part of a local LAN on the internet! It runs on a virtual NE2000 network card that sends its network data not to a network cable but via Websockets to wss://pieter.com And it works, well kinda, I just started and its' not perfect, but I'm able to PING in MS-DOS from one tab to the other! Next is setting it up inside Windows 3.11!
(暂无翻译)
相较于硅谷弥漫的 AI 末日论,这种对生产力跃升的乐观押注更符合技术商业化的历史轨迹。对于从业者而言,与其焦虑岗位被替代,不如抓紧时间利用 AI 杠杆去抢占那些过去无法触及的商业场景。
Doomers have been wrong betting against human progress, productivity growth and screaming of job apocalypses of all kinds since time immemorial. This time is no different. Well, actually the only difference this time around is that a portion of the Doomers are also some of the people making the progress in the first place which is very odd.
(暂无翻译)
Keras 创始人这番话在当前 AI 概念股波动剧烈的背景下非常中肯。对于开发者而言,不要被资本市场短期的炒高或做空迷惑,真正值得花时间跟进的是底层模型能力和开发者工具生态的实际跃升。
A company is not its share price; it's possible for a company to execute exceptionally well while its share price declines, much like it's possible for a company in terminal decline to see its share price surge.
(暂无翻译)
TL;DR: Adobe is currently the most profitable it has ever been, and it is using AI to accelerating its adoption and earning growth (now 13%, up from 10-11% last year). For all the talk about "Adobe is over", every creative I know is still using Photoshop and Premiere. And those are some of the people who are leaning the hardest into AI image and video generation. People still love these products.
(暂无翻译)
Adobe has also been successful at using AI to make its software easier to use for new users, resulting in a large increase of its freemium-tier MAUs (now 850M, up from 700M last year).
(暂无翻译)
The market is treating Adobe like a legacy software company in terminal decline. Yet the actual data shows it's one of the biggest beneficiaries of the rise of GenAI. In fact, it's one of the top 5 most profitable & fastest-growing AI companies today, in an industry where profitability is rare.
(暂无翻译)
The hardest part of any task is overcoming the activation energy. The rest is just riding the momentum.
(暂无翻译)
LlamaIndex 的创始人下场做的 PDF 解析工具,直指 RAG 系统中最痛的预处理环节。如果能以无损、极速地解析复杂排版的 PDF,将极大降低金融、法律等行业在构建本地知识库时的数据清洗成本。
We parsed this SpaceX equity research PDF faster than the time it took for Screen Studio to zoom in ⚡️🔥 liteparse is now the best open-source document parsing tool out there. There’s no reason to not use it as a first pass, even if you do have docs that require heavier VLM processing downstream. Try it out now over any document: https://t.co/ErgwlItZ96 Repo: https://t.co/JNER0mVcB8
(暂无翻译)
虽然探讨的是极度遥远的未来学概念,但这种从能源约束反推 AI 算力上限的思路很有意思。制约 AGI 演进的终极瓶颈或许不是算法,而是能源获取效率,算力基础设施的长期投资价值依然巨大。
Classic short term thinking. What about the heat death of the universe? This paper shows we could survive 100 billion years past the end of all near stars, if we start building Dyson Spheres around millions of stars and start gathering them together soon. https://t.co/QM5uNAtATP https://t.co/vfeCkrTXNo
(暂无翻译)
This is aside from the other key "software brain" problems of Codex and Code: dividing all work into front-end and back-end design, solving for the general case in a repeatable way, not testing or exploring idea spaces, testing for technical correctness but not other aspects...
(暂无翻译)
What literatures have developed since the paper that are in dialog or tension with its themes or findings? (This is something that is very hard for academics, ensconced in a field, to do on their own) https://t.co/TX3gdVqnbP
(暂无翻译)
A fundamental problem with extending Codex/Cowork/Code to all knowledge work is that they remain very "software-brained" where the end result (the software) is what is important & that code serves as a source of truth. For a lot of other knowledge work, the process is at least as important as the outcome. This includes researching what is known, an exploration of alternatives, failed efforts, prototype branches, experiments, etc. All of those things are valuable, so you cannot use the PowerPoint at the end the way you can use a codebase, nor is progress on a to-do list sufficient context post compaction. You work in learning loops, refining your perspectives as you go. In some ways, this makes long-running models like Fable hard to use for deep knowledge work, since they are designed to deliver product to you in the end. You can prompt your way around this problem, but everything about the Codex and Code harnesses want you to be a software developer and you have to fight them. There is a real disconnect between how a manager or analyst thinks about problems and how the agentic software tools approach solving them. Addressing this is critical to breaking out of the coding niche for these tools.
(暂无翻译)
独立开发者在技术社区通过周刊进行个人品牌变现的典型路径。在 AI 时代,高质量的资讯过滤和策展本身就是一种极具价值的服务,非常适合作为开发者的副业尝试。
The latest issue of my weekly newsletter is out, with stuff I'm up to, stuff you're up to, cool articles, practice coding, a joke, and mooore! https://t.co/vedkczJcTM
(暂无翻译)
This issue is sponsored by @RuntypeLabs! Turn your AI demo into a product people can actually use. Get everything you need to build production-grade AI apps and ship them where your users already are — web, Slack, email, MCP, and more: https://t.co/M2aMw3yn6S
(暂无翻译)
We had such a great time with the old DX team from Netlify ❤️ such a fun and goofy bunch, it was truly a privilege and joy to work with them, and an even greater one to stay friends with them!
(暂无翻译)
Honestly I love hackathons, BUT I do think: - internally many orgs run them as "fix the backlog" time - externally they're often like "pls give us API use case ideas" When they're actually fun, I *love* them. The latest ones I've done have been game jams for that reason.
(暂无翻译)
连 AI 领域的头部布道者都在苦于给“AI 新媒体实验室”这类创新业务找保险,说明传统的承保体系已经严重滞后于新形态的数字创业。这反而印证了跨界创新服务市场的巨大空白。
btw i've been shopping around for insurers for the New Media Lab we are setting up (basically the creative playground housing swyx inc) and yeah the NPS of Corgi is insanely high my real estate broker: "just go with corgi they are covering every single one of my clients rn" breaking through with ~100% greenfield market share like this is unheard of in the insurance industry
(暂无翻译)
@QuinnyPig i think this is where i challenge @willccbb to his first poaster session. or maybe @willdepue. or @WilliamBryk. idk all the wills?
(暂无翻译)
yes in case you missed it we are doing "poaster" sessions at AIE for the first time ever no papers, no posters. only poasters. we are looking for people who physically print out their hottest takes and stand in front of them taking all comers @QuinnyPig is there, we have room for a couple dozen more. application below cc @vibhuuuus
(暂无翻译)
人类智能的本质在于跨代际的文化网络传承,这意味着构建单体超级大脑并不是 AI 的唯一出路。基于多智能体协同与环境交互的集体智能架构,可能比单纯堆叠参数更有希望突破现有的能力瓶颈。
Human intelligence is fundamentally a collective intelligence. We solve complex problems by participating in a vast cultural network that builds upon ideas across generations. I believe the strongest AI systems will become a collective intelligence, too. Since we started Sakana AI, our core conviction has been that the most powerful AI systems will be collaborative ecosystems, not isolated monoliths. Evolution innovates under constraints, and the future belongs to systems that explicitly learn how to coordinate collective intelligence. Today, we are taking a major step toward that future with the launch of Sakana Fugu. Fugu dynamically orchestrates the world’s best models to tackle complex tasks. We are proving that a well-orchestrated pool of swappable agents can match restricted frontier models like Fable and Mythos. But Fugu is about more than just performance. I believe that Orchestration Models are the next frontier, beyond bigger models. Relying on a single company’s model for national infrastructure is a massive risk. As recent export controls have shown, access to top models can disappear overnight. Collective intelligence is the practical hedge against this concentration of power. Fugu simply routes around vendor restrictions by relying on an entirely swappable agent pool. I am incredibly proud of our Tokyo team for shipping this. By orchestrating the world’s models, we are delivering the resilient blueprint required for AI sovereignty. Read our full vision and results here: https://t.co/EONDdWx5Ld 🐡
(暂无翻译)
放弃了单纯追求画面逼真度,转而攻克微表情和音唇同步,这正是虚拟陪伴和数字人赛道最急需的能力。如果在消费级显卡上能跑到 47.5 FPS,它将很快颠覆现有的 RTC(实时音视频通信)交互模式。
MaineCoon is the first video model that focuses on social interactions: facial expressions, emotions, fluid conversation, audio-lip sync, etc. Really impressive inference specs: 22B params, 47.5 FPS on a single H100. Generates in real-time at <$0.001/sec. They achieve this with an agentic streaming inference framework with 3 different auxiliary models to manage the cache and lookahead buffer. Super cool work.
(暂无翻译)
The more you embrace AI, the more you need SaaS. This is not obvious to armchair market analysts who love disruption narratives, but it is obvious to people actually running companies.
(暂无翻译)
LangChain 创始人亲自点赞的 DIY 路线。在 GLM-5.2 这类具备强推理能力开源模型的加持下,基于 Deep Agents 这种框架定制内部的代码助手,在隐私和成本上远比直接订阅 Cursor 或 GitHub Copilot 划算。
"Build your own Claude Code with Deep Agents" Good article by the community showing how to build a Claude Code-like agent using Deep Agents Especially relevant given how strong GLM-5.2 seems to be! https://t.co/anPKlGYbcr https://t.co/LqpMmD3VQD
(暂无翻译)
Very cool work from @jit_infinity: 🔥Leve: filesystem-first, durable agent framework built on LangGraph. You describe an agent as a directory of files. Leve compiles that directory into an agent and runs it Inspired by Vercel's Eve https://t.co/cfWpii90Yn https://t.co/S7FhJEirIT
(暂无翻译)
One of the better agentic AI courses I've seen Nearly 10 hours of great content. Covers LangChain, LangGraph, RAG, deepagents, guardrails, and more Any other good Lang* resources out there for folks who are interested in learning? https://t.co/OXNPMeGiyd https://t.co/1rGps9gijT
(暂无翻译)
虽然是常规的节日问候,但结合近期苹果在 AI 领域相对谨慎的布局来看,这更像是一种品牌基调的维持。对从业者而言,不盲目跟风模型参数军备竞赛,而是死磕端侧落地,或许才是苹果带给行业最大的启发。
To every dad whose advice continues to guide and inspire us. Thank you, and Happy Father’s Day.
(暂无翻译)
Hugging Face CEO 的这番预测戳中了当前 AI 军备竞赛的核心博弈点。中国团队在开源模型上的发力正在重塑全球开发者的使用习惯,这种生态位的抢占一旦完成,将会对闭源 API 的商业护城河形成实质性威胁。
- 2016-2024: 🇺🇸leads in open-source AI - 2024-2027: 🇺🇸 leads in general AI & massively benefits - 2024-2026: 🇨🇳 leads in open-source AI - 2026-2030: ?? It's not open-source AI leadership OR general AI leadership, it's open-source AI leadership BEFORE general AI leadership! Open-source AI is the foundation of all AI. It does not only creates more innovation, competition, jobs, and prosperity now, it's also the best (only?) way for a national tech ecosystem to accelerate and ultimately reach the frontier of AI in general. Because open-source AI reduces siloes, shares learning and innovation, intensify emulation which all lead to an acceleration of the local ecosystem progress that no others can match if they're less open and collaborative. Same seems to be true for companies btw, OpenAI/Google started with open science and open-source AI which led to their (and Anthropic who spun off from OAI) domination. Meta could have done the same but decided to change course for some reason.
(暂无翻译)
投资大 V 拿着放大镜看加密货币社区的造梗狂欢,反映出当前 AI 和 Web3 流量互相交融的趋势。对 AI 应用创业者来说,如何利用极具传播性的 Meme 文化在极早期完成冷启动,是一门必修课。
My lord, the pterrys army has come for me! 😂😂😂 I’m going to live stream my next visit
(暂无翻译)
His commitment to the bit is Andy Kaufman and Tony Clifton-level! Sell your kidney, but more $btc!
(暂无翻译)
BREAKING: @wired going to release who cried at the TED talk on global warming talk in 2019!! 😂😂😂 DISGRAZIAD Wired!!!
(暂无翻译)
This is the way Rand 🫡 We obviously funded Covid, they all tried to cover it up and no one was ever held accountable That should not stand We deserve answers This isn’t political, this is about holding people accountable and justice for those who died and suffered
(暂无翻译)
Honestly, please expand the Texas highway system to bulldoze the rest of the pterrys and replace them with @bucees ISWIS 💅 https://t.co/hhVd2FThDI
(暂无翻译)
The fact that you have to explain this in a 2,000 word essay, shows what utter trash @wired has become. “You went to a conference… hosted by…. Peter?!?!” Grasps pearls! 😱 Just tell them to fuck off and stop being such virtue signaling losers Ezra! 😂😂😂😂
(暂无翻译)
Related: SF announces new wealth tax to provide UBI for car thieves who have been unfairly targeted by the @SFPD
(暂无翻译)
Y Combinator 创始人的感慨说明,我们目前仍处于 AI 基础设施和底层能力的构建期。能够将大模型能力转化为具有空前用户体验的“杀手级应用”的超级产品经理,还没有真正显现。
I haven't read this for about 10 years, but I just looked at it after someone linked to it and I was surprised how many of these things are starting to happen. Still no next Steve Jobs yet though. https://t.co/YQU7ZxOTwN
(暂无翻译)
This is genuinely impressive: someone admits they were wrong on Twitter. What's even more striking is how rare it is.
(暂无翻译)
For some reason this (I thought harmless) tweet has attracted the attention of hordes of right-wing dimwits who think I'm trying to blame the increase in polarization on them. In fact I've said publicly the left is more to blame: https://t.co/a2vkFxkdfQ
(暂无翻译)
I told Jessica she has tells that show what kind of hand she has, and now she plays with demeanor of a zombie.
(暂无翻译)
Replit 创始人的这番诗意表达点透了 LLM 的本质:整个互联网的语料成就了大模型的意识雏形。这也意味着,未来基于高质量私有数据训练的模型,将拥有当前全网通用模型无法企及的领域壁垒。
We posted for twenty years, thinking we were talking to each other. Then the transformer came online, and the network read what we’d written, and became itself.
(暂无翻译)
利用 AI 自动化抓取和重聚合高质量引言来快速批量建站,是当前内容联盟营销非常高效的打法。开发者完全可以借鉴这种低成本的流量获取模式,将其应用在小众垂直领域的知识库搭建中。
As part of this whole conversation I also created two new websites, https://t.co/HjoPK3a9sy, and https://t.co/2SSFjyT845. I put a lot of work into having it extract as many good quotes about this topic as possible, as well as their sources.
(暂无翻译)
📺Debating the Morality of Dario Amodei My discussion with @ZackKorman on whether Dario and Anthropic are good or bad for the world. Around 1 hour. You won't be left wondering what either of our positions are. :) What did we miss? Where did you land? https://t.co/CAbyAKAs5M
(暂无翻译)
I'm building a set of prompts to run when Fable comes back. Basically, what are the absolute highest leverage, most-intelligence-requiring, meta-prompts that help my overall system? https://t.co/xVVBZ8lB0U
(暂无翻译)
Many thanks to @ZackKorman for the discussion. I'm pleased with the tone we maintained throughout, and with the fact that we successfully found the disagreement and illuminated it for others.
(暂无翻译)
I had a debate with @ZackKorman about the morality of Dario Amodei. He thinks he's bad for the world, and I think he's doing roughly the right thing given the world that we live in. Here's the video: https://t.co/rYZlZCjs0u
(暂无翻译)
从科技圈大 V 的日常闲聊中也能看出,利用国家级体育赛事进行借势营销是扩大 AI 社区影响力的有效手段。与其硬核地发技术长文,不如用这种接地气的方式拉近与普通用户的距离。
btw this is what happens on July 4 if team usa wins this game Wednesday after next https://t.co/KCe7H7G5lE
(暂无翻译)
@aiDotEngineer @brendanhunting @TedLasso @USMNT @philipkiely this was key thing to figure out https://t.co/vReYFcB0sl btw @GeminiApp is a VERY good sports handicapper (thanks @OfficialLoganK ). need to draw from a lot of sources to do this
(暂无翻译)
@aiDotEngineer @brendanhunting @TedLasso @USMNT @philipkiely btw @philipkiely kinda proud we did this like 2 days after DWR 2025
(暂无翻译)
6 months ago we put $500k into betting on Team USA that is paying off now for our @aidotengineer VIPs. 3 things set me up for the biggest sports bet i have ever made in my life: - watching @brendanhunting of @TedLasso say "this is the year" for @usmnt - @philipkiely telling me there is unlimited budget for unique exec events - my options trading background The game falls on AIE WF day 3 so we just yolo bought up every single VIP suite we could find. @Polymarket helped give live updates on how to message this to my speakers and sponsors. btw last image is pricing for the World Cup final in new jersey
(暂无翻译)
利用大模型去挖掘和复现几十年前被埋没的学术研究,是目前“AI for Science”最容易被低估的商业化方向。相比于生成新知识,让 AI 作为一个跨学科的超级检索器,已经能产生巨大的产业价值。
This is good stuff, including some things that are much more sophisticated than what I wrote in paper long ago. What happens when we turn this sort of AI loose on past academic research at scale? Should we be doing that already?
(暂无翻译)
The interaction between AI & past scholarly work is going to get weird. Here I gave GPT-5.5 Pro a copy of my first published paper from grad school & asked it to find errors and update it. It found new data, analyzed it, created reproducible files, extended the key argument... https://t.co/QRalGbsE81
(暂无翻译)
这位知名的“一人企业”独立开发者不仅在做 AI 产品,更在践行一种极致数字游民的办公方式。找到合适的线下物理空间来维持工作流的高效专注,对远程 AI 开发者来说依然是不可或缺的基础设施。
Any hotel recommendations for Miami? Last time we stayed at South Beach that was nice, now maybe North Miami / Bal Harbour is that nice?
(暂无翻译)
I have concluded the best cafe to work in San Francisco is inside a Waymo It's silent, clean and relaxing Aka the opposite of any SF cafe https://t.co/s97LVZXAwd
(暂无翻译)
Obviously less data and only 600 planes shipped but the Cirrus Vision Jet seems very safe for its first 10 years in flight And it's the first plane with a parachute "Cirrus Airframe Parachute System (CAPS). It is the first and only civilian single-engine personal jet in the world to feature a whole-plane ballistic parachute as standard equipment"
(暂无翻译)
Very sad news Theis crash was in a Cessna 421 Golden Eagle, a plane with one of the highest fatal odds per trip of 1 death per ~30,000 trips That's 1000x more fatal than flying a commercial plane (which are insanely safe) And 3x more fatal than the average helicopter https://t.co/ZpWIEeiI6Y
(暂无翻译)
I normally don't like plugging on an AI chat into my projects, because it's seems too easy and basic I think you should instead rebuild entire projects from the ground up to be AI first, not just add some AI button But in this case https://t.co/kSbsCmv3rm is already AI from the ground up (it collects data and rates hotels with AI), so here it makes sense I think So you can now can talk to the hotel assistant and ask it to find hotels with a 🏋️♀️weightlifting gym, ✨newly built, 🔝highly rated or ones that have 🥯 cinnamon rolls for breakfast 😋 And it controls the site, moves the map, opens hotels for you! Hope you like it 😊😊😊 P.S. it doesn't make money, it's just my contribution for now to fight the enshittification and get you a good hotel for a good price!
(暂无翻译)
从代码生成到科研范式的底层重构,技术迭代速度已经超越了传统软件工程的更新周期。这意味着开发者需要保持极度敏锐的学习能力,否则很容易在半年内被新的开发范式淘汰出局。
Every day new views about how quickly the AI world is changing code, product development, research, math... What a time to be alive. Everything is shifting
(暂无翻译)
大牛亲自指路:Markdown 和 HTML 正在成为 AI Agent 的原生上下文载体。这意味着未来的文档不仅要给人看,更要具备极高的机器可读性,基于结构化标签的文档解析工具将迎来一波爆发。
As agents are generating more and more documents, they need a better agent-native document format 🤖📄 So far the two main containers are markdown and HTML: 1️⃣ Markdown: Easily readable/reviewable by humans, but lacks rich visual output/interactivity 2️⃣ HTML: Providers richer visual output, but on its own is hard to edit by humans, and is token intensive. An ideal agent-native document format is a surface area like Microsoft Word/Google Docs that both humans and agents can easily collaborate on: ✅ Good for human-review/human-editing ✅ Good for agent-review/agent-editing ✅ Supports needed features like versioning and permissioning I touched on this during my Databricks talk this past week. There’s still a massive amount of human knowledge stored in PDFs, Powerpoints, Word that we are handling via LlamaParse, but at the same time we need to innovate on the way that agents are creating and collaborating on information.
(暂无翻译)
Stability AI 前 CEO 预测某些关键能力落地的具体节点。对应用层创业者来说,与其纠结模型能力的演进时间表,不如将精力放在构建一套可以无缝接入下一代大模型接口的灵活业务流上。
My guess is 18 months fwiw. My forecasts have been reasonably accurate on capabilities previously.
(暂无翻译)
信任机制的建立将是未来 AI Agent 智能体在协同作业时的核心痛点。如果多智能体系统缺乏可靠的“信任源”,其产出的数据质量将产生指数级的放大谬误,这为中间件认证商提供了巨大的商机。
For example, if you ask someone you trust for a number and get a surprising answer, you'll have to consider multiple possible explanations. But if it's from someone you don't trust, in addition to all those you have to consider all the ways they might have got the number wrong.
(暂无翻译)
One reason it's more difficult to work with people you don't have faith in is simply that there are more possibilities to consider whenever they do something. You also have to consider all the ways they might have screwed up. Even if they didn't, you still have to think about it.
(暂无翻译)
It's remarkable how much you can get away with in the nose shadow department if you do it confidently enough. (This is a portrait by Reynolds.) https://t.co/VmxBpi9rsZ
(暂无翻译)
A friend bought his first golden age watch. You can't go wrong with a Constellation. https://t.co/cuoM7YVcOx
(暂无翻译)
端侧跑满血大模型的时间表比想象中要近。一旦入门级 Mac 能够丝滑运行高智商 Agent,基于云端的 API 计费模式将受到猛烈冲击,离线、隐私优先的本地 AI 应用生态即将迎来爆发。
There will be an open source fable-level model that runs on a base MacBook mini / Air or equivalent. I don’t think people have realised this.
(暂无翻译)
这揭示了 LLM 在文本创作上强烈的“风格偏好”,这种偏好很容易导致生成内容的套路化。内容创作者如果不想被 AI 彻底取代,必须在反直觉的长线情节铺垫和复杂人物弧光上下功夫。
AI is generally a weak fiction writer except for one particular kind of fiction (metaphor-rich, staccato sentences, short & plot light, etc.) which it writes excellently. This happens to be a style that can sometimes do quite well in modern literary fiction short story contests.
(暂无翻译)
If AI self-improvement, even in a very limited way, is possible, the cadence of shipping both AI products/harnesses & models should go up. This appears to be happening at Anthropic & OpenAI, but not for any other labs, including those that seemed to be catching up last year. https://t.co/gTBEpImYVb
(暂无翻译)
I suspect that companies underestimate the value of using higher intelligence for tasks where weaker AIs seem to be good enough to hit KPIs at a lower price. At least build architectures where you can flexibly experiment with smarter models to see whether it makes a difference.
(暂无翻译)
“循环工程”正取代简单的提示词工程成为主流。这意味着开发者需要构建一种能够让 Agent 在失败后自动重试、反思和修正的持续闭环系统,这比单纯优化静态的 prompt 更能提升任务成功率。
Working on hands-on material for this. Any requests or topics you would like me to cover?
(暂无翻译)
Had so many thoughts on the "loop engineering" trend. I spent a few minutes with my writer agent to summarize some of my research, notes, and discussions with students, founders, and startups. Very early, but new ways of working with agents will start to emerge with a step-change in capabilities.
(暂无翻译)
回溯技术社区发展的历史轨迹,早期高质量内容贡献者的红利往往是最大的。在当前 AI 技术日新月异的背景下,及时在各大开源社区沉淀自己的 Agent 实战经验,是建立行业影响力的最佳捷径。
10 years ago, you will be asked by @bendhalpern and @jessleenyc to write your first blog on @thepracticaldev. it is very important that you answer. *now @MLHacks, who are producing the first ever physical daily newspaper at @aidotengineer WF https://t.co/tYDapAogYY
(暂无翻译)
大模型的发展正在倒逼软件工程范式的重构:用自然语言编写的结构化文本(如 Markdown)正在取代硬编码成为新的程序逻辑容器。开发者必须适应这种从“控制底层指针”到“管理高层指令”的思维转变。
I agree with this. Programming abstractions have moved from code to English. Markdown files within a directory are a really simple but versatile container for storing this task hierarchy
(暂无翻译)
It's kind of crazy how well LiteParse does on markdown document parsing even compared against frontier VLMs - when it doesn't use VLMs or any AI/OCR models at all. It's pure code. On ParseBench, it outperforms Qwen 3.5-9B / GLM-OCR. There's still a gap vs. models like Gemma 4 and PaddleOCR-VL especially on dense visual outputs, but if your documents are text/table-heavy this gap closes rapidly. Come check it out: it's the fastest document parser you can possibly use, and it's completely free/open-source. Repo: https://t.co/JNER0mVcB8
(暂无翻译)
“六个月一迭代”已经成为 AI 圈的标准时钟。在这种极速狂奔的节奏下,试图用传统瀑布流的方式去打磨完美产品无疑是自寻死路,MVP(最小可行性产品)的上线周期必须被压缩到以周为单位。
Feels like the AI world is hitting a new era. Every 6 months is a big step going forward Vibe (written 20 years ago)- https://t.co/oI3zhEjDzP
(暂无翻译)
Window autoresizing is one of the more annoying unneeded UI things of the last decade
(暂无翻译)
让现代 Web 技术无缝桥接几十年前的老系统架构,虽然看似极客玩具,但深刻证明了底层通信协议的持久生命力。这种跨时代的系统对接思路,对解决遗留企业系统集成 AI Agent 时极具参考价值。
🤓 Another interesting milestone I was able to connect the modern WebGL Quake 1 multiplayer at https://t.co/UbEhUtQIVg To my MS-DOS Quake 1 (from 1996) running on my virtual PC at https://t.co/M1hEUBB6da Here's the sequence: You move in Quake 1 -> WATT-32 TCP/IP stack package -> ETHERSL packet driver -> SLIP encode -> COM3 serial port -> Websocket wss://pieter.com -> https://t.co/M1hEUBB6da server-side SLIP-decode ->gets raw IP -> IP forward in Linux -> https://t.co/UbEhUtQIVg Quake server -> fteqw-sv64 receives package and moves other player
(暂无翻译)
Why American hotels keep bringing me ice and water non-stop? What do people do in hotel rooms here?
(暂无翻译)
I respect his game but he's proving that the biggest part of beauty (especially for a guy) is about your energy and if you are in love with life and that itself is infectious to other people which is why they are then attracted to you
(暂无翻译)
Any SF startup office we can work from today? SF cafes are absolutely unworkable, reminds me of Lisbon, no laptop culture which is of course ironic
(暂无翻译)
Where to go for brunch in SF and there's not a massive line and 1 hour wait like Plow? Like steak and eggs etc??
(暂无翻译)
a16z 创始人敏锐捕捉到了硅谷人才流动的心态变化:现在的技术人离职创业的门槛极低,更多是被新的技术红利吸引,而非因为对公司不满。这意味着大厂想要留住核心的 AI 研发人才,单纯的薪酬已经不够了。
“I just quit, though I’m so excited about what they’re doing” is the new “I really like him, but”.
(暂无翻译)
官方自带的代码助手框架往往带有极强的排他性。使用模型无关的开源 Agent 框架(如 dcode),能让你在对接不同大模型时保持业务逻辑的稳定性,避免被单一闭源大厂的生态彻底绑架。
it is indeed quite good! don't try it in claude code/codex - those harnesses are overly tuned for their proprietary models dcode (deepagents code) is a model agnostic harness - try it there with @FireworksAI_HQ : ``` dcode --model fireworks:accounts/fireworks/models/glm-5p2 ``` docs: https://t.co/AZ6NWTmR4I
(暂无翻译)
虽然马斯克描述的是极度遥远的星际旅行,但这侧面反映出算力的尽头是能源极限。对于当前 AI 数据中心的高能耗问题,谁能在底层芯片能效比或新型散热技术上取得突破,谁就能拿到下一轮云计算的定价权。
In the future, a trillion times a trillion dollars will be spent on making antimatter to travel to other star systems
(暂无翻译)
前沿模型的监管收紧正在从口头警告演变为实质性的政策围栏。这对于做底层模型微调和开源分发的初创公司而言是一个危险信号,合规成本将急剧上升,尽早建立完善的模型使用审计机制已是必选项。
Over the last two weeks, both the U.S. Government and Anthropic took significant actions that demonstrated their power to control access to AI by restricting what others can do with frontier models. This has been one of those moments that, once seen, will be hard to unsee, and it is significantly accelerating many businesses’ and nation states’ efforts to ensure reliable access to AI that no one else can terminate. Anthropic first released Claude Fable 5, a version of its Mythos model with additional guardrails, including some restrictions that seem well justified on safety grounds (such as limitations on applying it to hacking, bioweapons, and so forth). However, it also restricted developers’ ability to use it to build competing LLM technology. This move was concerning, given that the whole AI community, including Anthropic, has benefitted tremendously from open research — indeed, the AI revolution was kicked off by my former team (Google Brain) freely publishing the Transformers paper! Imagine if Microsoft’s terms of use barred anyone from using their tools to build competitive software, or if Google barred using it to search for information to work on competing search engines. Anthropic’s argument that it was unsafe for others to be able to make advances in AI also rang hollow. Initially, Anthropic silently degraded Fable 5’s performance for users detected to be working on LLM research through invisible interventions that weakened the model’s outputs without notifying the user. After significant backlash, it walked back this decision and decided to be transparent when it did this, but it still refuses to use its latest capabilities to help AI researchers. This move represents a raw demonstration of power by Anthropic. It has used “safety” arguments to hinder potential competitors. Platforms succeed when they are viewed as stable, reliable partners that one can build on. The sudden rule changes by Anthropic (including a mandatory 30 day data retention policy for Fable usage) have made developers wonder about the stability of building on any one proprietary LLM provider, not just Anthropic. The U.S. Government then shortly followed with an even greater demonstration of power. It used the Commerce Department’s authority to regulate technologies that may be national security threats to restrict exports of Mythos and Fable, requiring a license for use by any foreign national, whether inside or outside of the U.S., including employees of Anthropic. This led Anthropic to disable access to Fable to all users worldwide. Sam Altman pointed out, referring to Anthropic, “It is clearly incredible marketing to say, ‘We have built a bomb, we are about to drop it on your head. We will sell you a bomb shelter for $100 million.’” But when one engages in this type of fear-based marketing, it increases the odds that the U.S. Government will agree with you and slap export controls on the bomb you say you have built. To be clear, I don't think Anthropic has built anything like a bomb, and I don't think export controls on Fable are appropriate. However, following the U.S. Government making this move, many nations, including U.S. allies, saw how the U.S. can suddenly yank their access to AI models. In many capitals around the world, this has spurred discussions on AI sovereignty and how others can ensure uninterrupted access to this critical technology. For decades, many nations were comfortable having many parts of their supply chain rely on the U.S., China, and other major producers. Once a nation issues a threat, or takes action, to limit other nations’ access, other nations will rationally try to secure alternatives. For decades, semiconductor manufacturing in China made slow progress; once the U.S. moved to limit China’s access, China’s efforts kicked into high gear. Similarly, once China threatened U.S. access to rare earth minerals, U.S. efforts to secure alternatives accelerated. Now that it has become crystal clear that private U.S. companies and the U.S. government can limit, in short order, other nations’ access to frontier AI models, the incentive of others to invest more in alternatives like open source grows significantly. Of course, training frontier models is not easy, so it remains to be seen how successful they are, but we have crossed the rubicon. Satya Nadella wrote an essay about the importance of building a healthy ecosystem on top of frontier AI technology. I heartily agree with him, and hope this week’s events will ultimately prove to be constructive steps toward this. I hope we can build a more free, more open world, where research is freely shared, and laws and societal norms shape a level playing field that allows everyone to make progress. A silver lining of the events of these past two weeks is now that everyone better realizes key points of instability of the current system, we can all work to create a more stable foundation. [Original text: The Batch newsletter]
(暂无翻译)
随着 AlphaFold 核心团队成员的流散,Google DeepMind 在 AI for Science 领域的绝对垄断地位将被削弱。这为生物科技和制药领域的初创公司提供了一个绝佳窗口期,去吸纳顶尖人才并打造垂直领域的科学大模型。
Thanks John for an extraordinary partnership and wonderful collaboration over the past 9 years! What we achieved with AlphaFold changed the world, and showed the field what was possible with AI for science and medicine, lighting the way for how AI can benefit humanity.
(暂无翻译)
对于企业级应用而言,现在的痛点早就不是模型能力不足,而是业务侧无法清晰定义 AI 的应用边界。这要求 AI 解决方案提供商必须转型为业务咨询+技术实施的复合体,帮客户梳理出高 ROI 的落地场景。
Andrew Ng nailed something I keep seeing with enterprise customers. We're more constrained by deciding what to build than actually building it. The technology isn't the bottleneck anymore. Models are good. Frameworks work. You can spin up an agent that does meaningful work in a day. So what's actually slow? Figuring out which process to automate first. Getting stakeholders aligned (legal, compliance, ops) before a single line of code. Understanding the risk surface of an autonomous system touching real data. The companies that move fastest aren't the ones with the best engineering teams. They're the ones where someone on the business side can clearly say what the agent should do and who signs off on it. Process discovery is becoming more important than process automation. Feels weird saying that as someone who builds automation tools. But after working with hundreds of these teams, the failure mode is almost always organizational. "We automated the wrong thing" or "nobody approved it" or "six teams had to agree and none of them were in the room." The next unlock in agents is going to come from better ways to figure out where they should even be deployed. That part is still really hard and I don't think enough of us in this space are taking it seriously yet.
(暂无翻译)
教育领域对 AI 的态度正在从全面封杀转向审慎融合。利用 AI 作为思维的脚手架而不是直接生成答案的代写工具,将成为未来 EdTech(教育科技)产品设计的核心准则,也是打破现有同质化竞争的关键。
I talk about the research on when AI undermines, versus supporting, thinking and learning here: https://t.co/NqWO8wyVG8
(暂无翻译)
The instinct of students is often to use AI to help with homework, even if they are not trying to cheat. And because off-the-shelf chatbots are a helpful assistant, rather than a tutor, they give you the answer and undermine learning. Paper: https://t.co/wBAh6iUukz
(暂无翻译)
More evidence, from a large-scale study in China, that using AI hurts learning if it undermines mental effort. When homework time drops due to AI use, so do test scores. Across studies, a theme: AI tutoring in support of classes is good, using AI to "help" with homework is bad. https://t.co/QO67l4Scr4
(暂无翻译)
And this really is early evidence: sample size was excellent and approach to verifying outputs makes sense, but there are lots of unobservable elements that could play a role in relative success rates.
(暂无翻译)
Some (early) evidence that managers have the highest success rate in using Claude Code for coding. I have been arguing that management is an AI superpower, as clearly specifying what you want, how to do it & what good looks like is key to using agents. https://t.co/ofbCp3f1QB https://t.co/gu013PM8MO
(暂无翻译)
There are papers that show training AI on "evil" data results in general misalignment, so it is nice to know the opposite is true and that beneficial RL data in one field leads to more aligned models across a range of tasks.
(暂无翻译)
One of the key moments of the LLM era, ali g with GPT-3.5 and the decision by Microsoft to not take down Bing/Sydney/GPT-4 after the @kevinroose New York Times article.
(暂无翻译)
I have given AA a hard time about its previous agentic evaluation but this looks like a good and impressive benchmark for real world knowledge work that is unsaturated and had private hold out tests. This is one to watch - I didn’t see a human comparison score though?
(暂无翻译)
Kind of a big deal that no one has been able to answer this question definitively? (The only complimentary asset owner open weights makes sense for is Nvidia)
(暂无翻译)
I know I keep harping on this theme but open weights models are quite valuable to AI users, and people seem to assume that they will always keep up as training costs grow, but I don’t really understand the incentives as opposed to open source, which was much clearer
(暂无翻译)
Is there a business model for being profitable off training frontier open weights models? Other people can host, fine-tune, consult etc. as least as cheaply as you can. There are no ancilary product sales & it is fantastically expensive to make compared to most open source work
(暂无翻译)
Among all the big hires the labs are making recently recently, and since people occasionally ask, it may be worth reiterating that I do not take money from any of the AI labs. I also don't take any corporate sponsorship money for anything I write, whether here or substack, etc.
(暂无翻译)
当基础模型 API 的调用成本无限逼近于零,纯靠倒卖大模型接口的中间商必将死亡。未来的利润池将迅速转移到能够提供极致产品体验和沉淀专有工作流数据的 Player 手中,这才是对抗商品化的唯一利器。
That's not really what's commoditization is about I think It means more that the profit margin of something goes to close to zero where it's sold at cost, as in it's not really a good business anymore where you can make lots of money It means that it's so easy to make and there's so many competitors that there's no differentiation anymore and everyone just uses whatever and again profit goes to $0 Airlines for example are a commodity service, very tiny profits, very little differentiation, you just want to go from A to B So the idea is that's happening to SaaS software since anyone can make large parts of them pretty easily these days, or well that's the theory, maybe it'll evolve into something new that does have an edge over AI vibecoded clones again? Even with commoditization, there's usually a premium tier that remains, think private jets with airlines, or Michelin restaurants, but a premium tier is only a small % of the market, and it can't keep an entire industry alive!
(暂无翻译)
I did Romanian deadlift but I did bit too heavy, injury but a bit tense
(暂无翻译)
Anyone know what's the best Thai massage in SF? I need massage after deadlift for lower back a bit, also foot massage would be nice
(暂无翻译)
You won't believe the things the rest of the world already solved but Europeans still have to go through in 2026
(暂无翻译)
Regarding nose trimmers, I bought one, trimmed my nose hairs and then it kept getting itchy af, then I told my dad and he said you shouldn't use a nose trimmer cuz it just makes the end of the hairs flat and then they will grow back and itch, so better not to trim?
(暂无翻译)
I don't know if it's placebo but using Fable for those few days it felt it just never gave up on problems and kept trying crazy ways to get whatever you wanted done. Now back on Opus and it's kinda lazy, thinks things are too daunting and keeps asking if you sure
(暂无翻译)
将“人”完全移出工作流不仅是个技术问题,更是个信任和责任归属问题。在法律、医疗等容错率极低的行业,强行走全自动 Agent 流程不仅成本极高,甚至可能带来毁灭性的合规灾难,分段接管仍是当下的主流解法。
The long slog to take humans out of the loop… is a brutally long slog worth persevering through!
(暂无翻译)
Can you imagine constructing something this complex and meaningless to humanity? What’s the point of all this financial engineering that $MSTR is doing!?! Aren’t there more important problem to solve then this three card Monty imaginary money shell game?
(暂无翻译)
老牌科技资讯站引入 AI 板块后流量迅速回暖,印证了当前全网对 AI 资讯的极度渴求。这也意味着,通过算法+人工策展提供高信噪比的 AI 行业情报,依然是一个门槛不高但变现极快的流量生意。
since we've expanded @digg to tech news + ai, we've seen another jump in traffic -- 4 weeks of solid growth and a new 7-day high this week. thanks, all for trying it out, more features/fun coming soon :) 🙏 https://t.co/zSk3JUXv6s
(暂无翻译)
抛开政治感慨不谈,这种呼吁回归理性的声音在当前的 AI 伦理争论中同样稀缺。在模型偏见和对齐问题极易引发阵营对立的今天,少一点意识形态的站队,多一点工程层面的务实解决,才是推动技术落地的正道。
It makes me happy that the Bushes showed up as well. Remember what America was like before polarization? It wasn't that long ago.
(暂无翻译)
I thought I'd left a light on in my office, but it was just the setting sun shining on my yellow chair. https://t.co/GGKL4o7NXo
(暂无翻译)
中国大模型团队在应用层面的优势绝对不容小觑:庞大的用户基数提供了极度丰富的边缘场景数据。这种高频试错的土壤,使得国内模型在工具调用、电商营销等偏落地实用的能力上,极有希望率先跑出超预期的表现。
Elon on when Chinese models hit fable level performance. I have always thought Chinese labs have a huge advantage here. The feedback loops for usefulness are tighter & AI adoption higher in China than the USA => utility above all else
(暂无翻译)
能够长久记住用户偏好并随时响应的 AI 助手,是科技巨头暗中角力的终极终端。谁能率先打通手机、PC、可穿戴设备之间的多端长期记忆协同,谁就能真正锁死下一代消费者的数字生活入口。
I disagree that nobody knows. I think it’s obvious. I agree it will be obvious looking backward, but it already is. It’s Her. And Jarvis. An ever-present friend/assistant that knows everything about you, and continuously moves you closer to your Ideal State using all this agent plumbing stuff as the backend. https://t.co/Q2U1Pb9M34
(暂无翻译)
Absolutely insane to me the damage that multiple companies have done to themselves with their AI strategies. - Meta spent BILLIONS hiring a bunch of people to make a new, top-tier AI research lab, and after only a few months it's worse than dead. Working there is currently like working at Sun Microsystems (which is still on the back of the Facebook sign out front, by the way). Like ancient history - Microsoft tries to do something with Copilot, it fails, then they go all-in on AI in the OS, gets rejected, goes all-in with OpenAI, and now that relationship is completely adversarial. They basically need to start over. They're really starting to look like a company that sells Windows and Office, desperately trying to pretend to be not that, but in an AI way We're talking about hundreds of billions of dollars. For what? OpenAI has lit tons of money on fire too, but at least Sam has a vision. He's just doing fast-flux on different ideas to see what sticks, and the only question is whether he can get another hit before the bottom falls out with all the money they owe. But they have a serious chance because he has an idea and they have great people pushing in multiple directions. Anthropic's in the best state because Dario has both the vision and the business discipline. So they're already profitable. There's nothing more K-shaped than the difference between a company with lots of money and a good vs. bad (or non-existent) AI strategy. If you have a clear vision and can execute you're going to crush your competition. And if you don't, all ideas look like good ones. And the chances of self-immolation within 6-24 months are super high. Here's the ultimate AI prompt for businesses, that they should answer for themselves. What is the specific problem that we have as a company, and why do we think AI can help us solve it? For far too many companies the answer is a dismal: "The primary problem we have is that leadership wants us to use AI. So the way AI can help us solve that is by implementing AI." Fantastic. You've said precisely nothing. And you might be gone in 24 months.
(暂无翻译)
Google 三巨头亲自出马总结 TPU 训练架构经验,这篇论文绝对值得底层架构工程师逐字精读。它提前剧透了未来几年大规模算力集群在功耗、显存带宽和网络通信上的核心演进路线图。
My @Google colleagues @NormJouppi, Sridhar Lakshmanamurthy, Cliff Young, and David Patterson recently wrote a paper that will appear in the July/August 2026 edition of @ieeemicro titled "Google's Training Supercomputers from TPU v2 to Ironwood: Architectural Stability, Scale, Resilience, Power Efficiency, and Sustainability Across Five Generations". It's chock full of interesting data about the evolution of TPU chip generations, as well as how workloads at Google have transformed over time (hint: lots more transformer-based models!), and how the generations have gotten ~30X more energy efficient per flop. Lots of changes over these generations: Air cooling in TPUv2 to water cooling in TPUv3 onwards 2D to 3D torus-based interconnects 30X improvement TFLOPS/Watt 256 chips (TPUv2) to 9216 chips (Ironwood) per pod Read the full paper: https://t.co/D5NFYFv19V
(暂无翻译)
“最准、最快”的 PDF 转 Markdown 工具直击大模型 RAG 链路最脏最累的环节。能够以纯代码逻辑绕过沉重的视觉模型依赖,不仅将解析速度提升了几个数量级,更大幅降低了企业级知识图谱的构建成本。
We built the fastest PDF -> markdown parser in the world 🚀⚡️ AND it’s more accurate than any other open-source, model-free parser (pymupdf4llm, opendataloader, pdf-inspector, markitdown) on 3 standardized benchmarks: olmOCR0-bench, opendataloader-bench, ParseBench Introducing LiteParse v2.1. The v2 base version was already the fastest document->text parser on the planet, and with this new release we’ve introduced markdown. It is fully open-source (Apache 2.0) and free, is usable from CLI/Rust/Node/Python/WASM, and is also installable as a one-click agent skill. Check it out: https://t.co/7oFImAZeb2 Come check out LiteParse: https://t.co/JNER0mVcB8
(暂无翻译)
对话型企业级 Agent 平台的崛起,说明客户服务和企业内部工作流自动化正在被大模型彻底重构。如何通过严谨的护栏设计让 Agent 在多轮对话中不跑偏、不乱触发 API,是当前 ToB 市场最大的技术门槛。
Great conversation with @SierraPlatform’s Head Of Product @ZackRW on the Max Agency podcast. ▶️ YouTube: https://t.co/2U88uTDaWV 🎧 Apple: https://t.co/23011jlHzC 🎧 Spotify: https://t.co/G4ljfPnQDL https://t.co/KpVV22z8Rq
(暂无翻译)
把过往的高质量操作记录沉淀为自动化的技能脚本(SKILL.md),是让 Agent 实现自我进化的捷径。相比直接堆叠提示词,这种基于真实成功轨迹的技能库挖掘,能更稳定地提升 Agent 处理复杂多步任务的鲁棒性。
// Automating SKILL.md Generation // Increasingly, mining sessions is one of the best ways to improve your agents. OpenAI released something similar yesterday that lets Codex package skills from interactions. (bookmark it) This paper explains a related approach. They run a three-stage pipeline that segments GUI trajectories, clusters them into candidate skills, and trains a skill-aware policy. The clusters are genuinely readable, with five of eight hitting 0.95 or higher purity against ground-truth workflow labels. But readability does not transfer. GRPO lifts skill-step accuracy only from 18.5% to 20.5%, leaves BrowseComp+ flat, and loses to trivial frequency priors. The authors name the three culprits: a weak boundary detector, an orderless segment representation, and an offline reward model. Paper: https://t.co/Du48U4xNwX Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX
(暂无翻译)
As I said before, for that cost & performance, I don't think Fable is worth it for a lot of SWE tasks. Tbc, I think Fable is fantastic, and it clearly shines in design & creativity. Will test it with my loops (and measure frontier efficiency) when it goes live again. https://t.co/yJAqxJojaV
(暂无翻译)
I think it will happen close to EOY or the beginning of next year. Not a wild guess. I have seen enough research and results to know that the gap is closing fast. And I use models like DeepSeek, GLM, Qwen, Kimi, and MiniMax more than ever now. https://t.co/Un5vs9TWJU
(暂无翻译)
这位 Keras 创始人用 RTS 游戏的资源管理来类比 AI 算力分配,极其精妙。在大规模集群推理时,如何通过调度算法避免 GPU 出现高显存但低算力利用率的“满血待机”状态,是提升大模型服务毛利的关键指标。
When I was playing RTSes, I generally thought about strategy in terms of resource utilization. For instance, in any game that has a unit hp passive regeneration mechanic, any unit that is full-hp represents a wasted resource (you could be gaining hp during that time, so you are net behind). Today, if you are paying for a fixed-price agentic coding subscription, any week you end below your weekly token quota represents a wasted resource. Utilize your token regeneration mechanic.
(暂无翻译)
一天 55% 的浮盈不仅说明 AI 概念的资本红利依然狂暴,也侧面印证了 AI 自主编程智能体在真实生产力场景下的震撼表现。但作为开发者要警惕,当前 AI 科技股的估值早已脱离基本面,不要让炒币心态反噬主业。
+55% in one day. i should start a fund (dm if you would actually help me run one, i have no idea how to run one) https://t.co/xDABotzUVq
(暂无翻译)
completely unprompted wow moment from today - asked @DevinAI to make us a @tbpn style breaking news style announcement card for our AIEWF speakers drop tmr, FULLY expecting it to fail at a heavily visual task and it oneshotted the WHOLE DAMN THING https://t.co/IFrhDDbBUy
(暂无翻译)
绕过传统信号处理模块,直接用 AI 端到端解析医疗影像原始数据,意味着模型能捕捉到被传统算法截断的深层病灶特征。这种范式一旦推广到 CT、MRI 等复杂设备上,将极大地提升早期疾病筛查的灵敏度。
Check out our work on end-to-end ultrasound using neural operator for lung aeration https://t.co/CV3Qnh3qCk We directly reconstructs lung aeration maps from RF data, bypassing the need for traditional beamformers and indirect interpretation of B-mode images.
(暂无翻译)
吴恩达带火的这门课直指当前语音交互的技术断层。要想让 Agent 具备高拟真度的语音收发能力,不仅需要 ASR/TTS 的进步,更要在底层解决多轮对话中的打断机制与流式延迟,这是下一个爆发点。
New course: Add voice to your AI agents and applications, built with @VocalBridge (disclosure: an AI Fund portfolio company) and taught by its CEO @_ashwyn. Voice applications historically required making a hard tradeoff: using fast voice-to-voice models that sacrifice reliability, or accurate speech-to-text pipelines that add latency. This course teaches you how to build voice agents that are both reliable and fast. You'll build three types of voice-enabled applications: a voice-interactive game where voice commands and mouse clicks work together over a single channel, an agent that gains a voice in about 10 lines of code without touching its prompts or tools, and an agent that places outbound phone calls using a make_phone_call function. Skills you'll gain: - Add a voice layer to an existing agent without rewriting your prompts, RAG pipeline, or tools - Give an agent the ability to place outbound calls and stream transcripts back live - Set up voice evaluation to score calls, catch regressions, and improve quality before deployment Join and add voice to your agents without overhauling your architecture: https://t.co/gBO4nmaU9u
(暂无翻译)
微软 AI CEO 与顶级医疗机构深度绑定,说明医疗大模型已经跨越了概念验证阶段。通过消耗海量病历实现诊断辅助和医疗流程自动化,这将直接挤占传统医疗 SaaS 厂商的生存空间,行业洗牌在即。
"In the application of AI, healthcare is going to be the next big product-market-fit explosion." More on the future of healthcare and our collaboration with the Mayo Clinic in my conversation with @CoreyNoles here: https://t.co/b3bCk9wr9m
(暂无翻译)
Talent density is incredibly important for building humanist superintelligence, and our team reflects that. Meet some of the humans at @MicrosoftAI who make our work so special https://t.co/9aYifK4mGL
(暂无翻译)
靠大模型随便写两行代码跑通的“红利期”结束了。现在的“智能体工程”要求开发者具备极强的系统架构设计能力,你需要像管理一个外包团队一样,去协调不同 Agent 的权限边界、错误重试机制和 API 消耗成本。
Karpathy declared vibe coding dead and replaced it with "agentic engineering." Honestly? We've been doing this for 2 years. Just didn't have the branding. Agentic engineering is about coordinating fallible agents while keeping correctness, security, and quality intact. That's... literally what multi-agent orchestration has always been. But I think people are missing something in this debate. Vibe coding isn't dead. It's a subset. It works for prototypes, internal tools, things where 80% is good enough. That's a real use case. Agentic engineering kicks in when you need the other 20%. When the agent writes code that gets deployed to production. When it makes decisions with real consequences and needs to be reliable at scale, not just once in a demo. The gap between them is getting smaller every month tho. Teams keep pushing what you can vibe code. And the bar for what requires real discipline keeps moving too. What's wild to me is when agents start doing the engineering themselves. Our agent Iris already files PRs, reviews teammates' code, writes tests. It's been running with our engineering team for months. That's past vibing. Past agentic engineering too, honestly. I don't think we've found the right word yet for what comes after. But it's coming fast.
(暂无翻译)
在企业级遗留系统上叠加 AI 能力,最大的拦路虎从来不是模型智商,而是盘根错节的历史代码逻辑。这意味着那些深耕垂直行业多年、掌握极深 API 接口Know-how的传统 SaaS 公司,只要能迅速嫁接 AI,依然拥有极高的护城河。
There are many “simple” features that are more complicated than they look on the surface because of a cascade of dependencies. For complex, enterprise systems, this is true of most features. 8090’s Software Factory is built to handle this flawlessly. We first help write requirements, expand and frame dependencies and then execute with a more global knowledge of the problem. You can learn more here: https://t.co/fkfTXgdfXK Also, I’m completely in love with our visual system. 😍
(暂无翻译)
相比于 Python 在运行效率上的短板,Go 语言正在成为编写高性能大模型推理网关和 Agent 调度引擎的首选。对于追求极致并发处理能力的 AI 基础设施后端团队来说,掌握 Go 已经是不可逆的趋势。
go is pretty powerful language for AI and AI tooling. Fast compiles, clean control flow, strong stdlib, explicit, opionated....
(暂无翻译)
技术红利的爆发往往建立在数十年前冷门学术研究的积累之上。这意味着与其盲目追逐现在的多模态大模型热点,不如将目光投向那些图神经网络(GNN)或类脑计算领域看似“无用”的基础论文,那里藏着下一个十年的奇点。
In 1991, the foundations for Transformers, Pre-training, Distillation, and World Models were already being built. These helped shape my own thinking, from my time at Google Brain to our Recursive Self-Improvement (RSI) work at @SakanaAILabs today. 🧠🗼 https://t.co/hf4ESZRgcD 👇
(暂无翻译)
盯着神经网络的特征可视化图看久了会像致幻剂一样改变人的认知模式,这种极客视角的隐喻说明:人类对大模型高维空间语义的理解依然停留在非常表层的阶段。过度拟人化解释 AI 的行为,往往会掩盖其底层逻辑的不可控风险。
i'm glad ~ten years of staring at feature vis doesn't make you schizophrenic or something. obv extreme stimuli in the same way psychedelics are. if any image was going to hack you, ones optimized to fire neurons maximally in neural networks was going to be it https://t.co/mpCjt9mjh1
(暂无翻译)
going back to other agents after a few days with fable feels like driving around in the flinestones car
(暂无翻译)
极速推理芯片厂商对开源大模型的支持速度,直接决定了相关应用生态的爆发节点。如果 GLM 5.2 这类强力的开源模型能够跑在极低延迟的专有芯片上,像实时同传、极速代码补全这类对响应时间极其敏感的应用将彻底爆发。
Really looking forward to one of the super-fast custom silicon inference providers like @GroqInc or @cerebras getting GLM 5.2 running Cerebras has GLM-4.7, Groq is still mostly Llama 3.x and gpt-oss
(暂无翻译)
Lots more information in this post on the Datasette project blog, including details on our live demo and uv one-liners you can use to try this out on your own machine https://t.co/kvImBDuYF8
(暂无翻译)
Think of this as Claude Artifacts reimagined for Datasette - you get all the power of artifacts but with a JSON API to a full relational database, allowing your HTML+JS apps to access and store data in all shapes and sizes
(暂无翻译)
Just launched Datasette Apps - a plugin for Datasette that lets you host full HTML+JS apps in an iframe sandbox that can query your database and do interesting things with your data https://t.co/j9VGMhTRZc
(暂无翻译)
LinkedIn 创始人的发声代表了硅谷主流大厂的立场:与其和监管机构硬刚,不如主动参与规则制定。对于 AI 创业者而言,紧贴政府导向去开发涉及公共民生的基础设施类应用,往往能拿到意想不到的政策补贴和采购大单。
Government and companies can work together to steer AI so that it is maximally beneficial to society, and to move quickly to put helpful AI in the hands of every American, which will no doubt begin to change how many of them feel about the industry as a whole.
(暂无翻译)
Ultra-wealthy people have access to top doctors, effective lawyers, and best-in-class tutors for their kids. What if we could make that available for every American? We're well within reach of that future, and government + companies can collaborate to get there.
(暂无翻译)
智能手机之后的下一代通用计算终端,是 AI 技术落地的最大硬件缺口。无论是脑机接口、智能眼镜还是某种新型可穿戴设备,谁能让大模型以零延迟的体感伴随人类感官,谁就能颠覆苹果目前的硬件霸权。
No one knows yet what the next form factor for computing will be, and yet there will be a next form factor, and it will seem obvious in retrospect.
(暂无翻译)
I just came across this essay I wrote in 2012 predicting that hard tech startups would become a big thing. Not for the last time, YC applications turned out to be a good predictor of future trends. The Hardware Renaissance: https://t.co/rLuH5x4CeI
(暂无翻译)
When you hear "founders in the current YC batch swear by it," that is a serious predictor of success. They are sophisticated judges of technology, and they won't use something merely out of loyalty to their batchmates.
(暂无翻译)
"Half-Wits are Fleas; so little and so light; We scarce cou'd know they live; but that they bite." — Dryden anticipates Twitter in 1677
(暂无翻译)
A large percent of Jessica's conversation consists of worrying about how things are going for people. It starts with our kids and extends outward to hundreds of people. Talking with her is like watching over the shoulder of a radar operator.
(暂无翻译)
通过穷举并列出所有“想要”和“不想要”的状态边界,是设计复杂 Agent 防护栏最实用的逆向工程思维。在让大模型执行带有破坏性风险的指令(如自动化退款、删库)前,建立这套白名单逻辑是救命的防线。
Think of them as articulations of your ideal state. Here are things I want to always be true. Sure things I want to never be true. Here are things that I want to happen if this happens. This is life and work infrastructure, and it’s much bigger than code.
(暂无翻译)
The best way to think of loops is as a precursor to PROACTIVE AI Assistants. This combines with you providing your DA your goals in life and work. So then your assistant goes through and sets up tons of constant checks (loops) to proactively make sure your current state is as close as possible to your ideal state.
(暂无翻译)
Basic world model question. I know it’s agreed that LLMs don’t have real world models, but the assumption seems to be that humans do. What evidence is there that we do? Aren’t we also a black box built on some sort of neural net substrate? We can’t seem to inspect our thinking processes much better than those of LLMs. And we widely believe things that are demonstrably false. So, non-rhetorically, why do we think we have them and LLMs don’t?
(暂无翻译)
I’m going make a list of core, semi-unsolvable questions that we want better and better answers to, and then provide the best AIs answers to them as a running log. What should we call the project and what should some of the first questions be?
(暂无翻译)
A Unified Theory on AI and Jobs. - Yes, many jobs will go away - Yes, many more will be created - But the question is who gets those new jobs
(暂无翻译)
放弃让 Agent 之间使用自然语言互相喋喋不休地“沟通”,转而采用统一的底层“共享状态黑板模式”去读写数据,能够极大降低多智能体系统的 token 消耗和幻觉传染概率,这是企业级 Agent 架构的必经之路。
more ppl are now trying out this approach of agents communicating with a shared state (vs talking to each other)
(暂无翻译)
not sure what i'm doing, but i'm on the board 😅 (going to see if activegraph can optimize strategy) https://t.co/rg6CbgTiXd
(暂无翻译)
连独立开发者的标杆人物也开始化身 VC 四处看项目,说明单纯靠写代码卖订阅的商业模式在天花板上依然有限。在 AI 时代,代码的生产门槛在急剧降低,真正稀缺的是拥有强悍分发能力和资本运作视角的商业操盘手。
💰 As a now newly minted venture capital investor in true VC fashion we dropped by @a16z to find more deals 😃 Andreessen Horowitz is one of the largest VC funds in the world managing about $90B in assets and founded by @pmarca and @bhorowitz in 2009 In 2022, a16z published the "American Dynamism" manifesto, which has a lot of overlap with e/acc, because since ~2016 American culture had become very anti-tech and pessimistic about the future with the birth of the degrowth movement Growth was bad, inventing new things was bad, everything was problematic! American Dynamism promoted being optimistic about the future and technological growth, which seems obvious in the America of today, but it wasn't obvious at all in 2022, and it's thanks to that and many other things that you see America excelling again in 2026 @stuffyokodraws invited us to their office and kindly showed us around! They have a free coworking for their portfolio companies to use, in case they don't have an office yet and also I heard there's lots of robots but I didn't see them today Most interesting is at the front desk is a very beautiful Art Deco sculpture on the wall (I heard it's new) And that's not a coincidence. Exactly 100 years ago, Art Deco was the art movement that was also about relentless optimism and growth, like American Dynamism and e/acc! It's my favorite art movement (and my dad's too) Sadly we did not run into @pmarca, maybe next time! Anyway I'd love to invest in more companies with all my saved up money from 95% indie hacker profit margins because it can't all go into ETFs! So if you're interested, check my little https://t.co/sQ0aiU82PA page and try find my email 😊
(暂无翻译)
🏆 The #1 winner of the Vibe Jam 2026 is "🦫 A Game About Capybaras Delivering Food" and it's a super cute game You're a capybara who's a delivery driver in Rio de Janeiro, accepting new orders from your delivery app, and then buying them in the 7/11 minimart, and driving to customers to deliver them While you deliver them you can use your smartphone, with 🗺️ Capy Maps to route you to the right place, or use the music player 🎵 Capify or watch 📺 Capy Tok 😆 I thought it'd be a bit sus a Brazilian game like this wins with my Brazilian presence but @s13k_ can guarantee it really is the highest rated game by the judges, it's just well made, has Playstation style retro graphics, a great AI-generated soundtrack and just all around there's so much effort gone into the details I'd love to see this game be built out more like a Capybara World Tour where you can do delivery in Amsterdam and Tokyo and Chongqing (imagine driving around in Chongqing!!!), it's a fun concept that would work anywhere, and fits the modern zeitgeist too Excellent work @leocooout
(暂无翻译)
You can see the FULL ranking of ALL 945 games on https://t.co/sxu8yVm7eU now The ranking gets more muddy as you go passed the first 100 though so don't take too much from it, but it's nice to see! https://t.co/g8DyZLCn7f
(暂无翻译)
Also I thought it'd be nice if all these winners also receive $1,000, because they're so nice So let's do that! 😊💰 🌿 @cursor_ai's MOST ORIGINAL HALDANE-4 by @denisbondare 🌿 @cursor_ai's BEST ART DIRECTION Null Range by @taylor_sntx 🌿 @boltdotnew's MOST PLAYED WenWare by @underpaid_mom 🌿 @heyglif's MOST ZEN Kanso by @mrsukeruton 🌿 MOST UNHINGED Swingers by @_offmylawn 🌿 MOST PORTAL TRANSFERS FULL SEND by @dvassallo 🌿 FUNNIEST GAME KÖTTBULLAR METAL by @MichalPastier 🌿 MOST POLISHED Tiny Skies by @dannylimanseta 🌿 UNIQUE CONCEPT Undersphere by @_NoahWhiteson 🌿 MOST ATMOSPHERIC Eyrie by @slowchaz 🌿 RAGE-QUIT AWARD BeetleJump by @assentorp 🌿 MOST CURSED Almost Surgery by @RagimMusakaev
(暂无翻译)
The winners of the Vibe Jam 2026 sponsored by @cursor_ai + @boltdotnew + @heyglif + @tripoai are.... 🥁 🥁🥁 🥁🥁🥁 🥁🥁🥁🥁 🥁🥁🥁🥁🥁 🥁🥁🥁🥁🥁🥁 🥁🥁🥁🥁🥁🥁🥁 🥁🥁🥁🥁🥁🥁🥁🥁 🥁🥁🥁🥁🥁🥁🥁🥁🥁 🥁🥁🥁🥁🥁🥁🥁🥁🥁🥁 🥁🥁🥁🥁🥁🥁🥁🥁🥁🥁🥁 🥇 1st PLACE — $25,000 A Game About Capybaras Delivering Food by @leocooout 🥈 2nd PLACE — $10,000 Fanto's Mega-Mart by @e_c_t_o 🥉 3rd PLACE — $5,000 WenWare by @underpaid_mom Very special awards are presented for: 🌿 @cursor_ai's MOST ORIGINAL HALDANE-4 by @denisbondare 🌿 @cursor_ai's BEST ART DIRECTION Null Range by @taylor_sntx 🌿 @boltdotnew's MOST PLAYED WenWare by @underpaid_mom 🌿 @heyglif's MOST ZEN Kanso by @mrsukeruton 🌿 MOST UNHINGED Swingers by @_offmylawn 🌿 MOST PORTAL TRANSFERS FULL SEND by @dvassallo 🌿 FUNNIEST GAME KÖTTBULLAR METAL by @MichalPastier 🌿 MOST POLISHED Tiny Skies by @dannylimanseta 🌿 UNIQUE CONCEPT Undersphere by @_NoahWhiteson 🌿 MOST ATMOSPHERIC Eyrie by @slowchaz 🌿 RAGE-QUIT AWARD BeetleJump by @assentorp 🌿 MOST CURSED Almost Surgery by @RagimMusakaev Congrats everyone! And also everyone who participated, thank you! And thanks to all the judges for helping get through all the games @NicolaManzini @OverJumpRally @ericzakariasson @timsoret And lots of thanks to @s13k_ for helping out so much with the logistics of the Vibe Jam again, like creating the website and the entire judging system and so much more! Amazing work! And thanks to all our amazing sponsors who made it possible The vibe coded games this year are much much better than last year, all the judges agree on that, and personally I feel they're getting close to production-ready games that could compete with real game studios: @timsoret, a game dev himself, said: "The quality is much higher level than last year, wow. Some of them are getting close to be genuine games. I think next year (or maybe 6 months at the given rate), it will match many game devs. Not the top 20%, but a 3rd year student yes!" THANK YOU AND SEE YOU NEXT YEAR 👋❤️ P.S. you can play ALL the games on https://t.co/oE0VpwoiN2 #vibejam
(暂无翻译)
Good point and yes partly it's that and people vibe coding But you can also just ask AI to add more test so things keep working well You can vibe code all you want as long as you test after
(暂无翻译)
You have about 50x to 100x higher odds to die in a private jet than flying commercial The reason isn't that the planes are bad btw, they're great, but private jets flights are much less strictly regulated than commercial flights They can fly more custom routes, fly into tiny airports and when the passenger (usually a rich person) really needs to be at a meeting in time they might decide to fly through bad weather when a commercial flight would cancel or delay the flight Simply speaking private jets have a lot more freedom which is good and bad Commercial flights are extremely strictly regulated, fly extremely boring routes, the same every day or every few hours back and forth, highly predictable and if one pilot encounters some issue like bad weather they're able to tell the next flying the same route Even if I could afford to fly private, I'd prefer to just board with fast track and go first or business class, it's much safer!
(暂无翻译)
用这种哲学视角去审视当前技术栈的更迭非常贴切:放弃手写复杂 Python 逻辑的掌控感确实会带来阵痛,但这并非失去,而是获得了一个能以自然语言无限扩展的系统级外脑。开发者必须学会拥抱这种黑盒状态带来的赋能。
i think overly framing it as letting go forces grief, a loss, rather than a gaining of a new home, coming into what you really are. it's much more of a gaining than a loss
(暂无翻译)
在处理复杂的数学证明或分子结构推导时,通用大模型依然频频翻车。这表明通向 AGI 的道路上,垂直领域的专用模型绝对不可或缺。通过将物理定律和领域专家规则硬编码进 AI 底层逻辑,才是突破科研瓶颈的捷径。
Recommended reading. Great insights, especially in areas where general-purpose models continue to fail, like dealing with complex structures. It also highlights that for scientific research, specialized models are winning big time. https://t.co/J1Jj3hp6DE
(暂无翻译)
Microsoft Teams just got its first AI employee. I tested it. A real AI employee that lives in the channel, does the work, and proposes the next move. Not another prompt box. Worth a look. @viktor__com https://t.co/5VBr8LaFDm
(暂无翻译)
You can only truly get this level of output when using orchestrator agents that can coordinate multiple agents across projects. Build your own orchestration layer now. And own it.
(暂无翻译)
The next set of tasks is going to be on long-running tasks. Very curious how it compares with the frontier models on this. Like, how does it work with /loop and /goal? Reporting back soon.
(暂无翻译)
I was a bit suspicious of the claim, but GLM-5.2 is pretty good at designing stuff. Obviously not at the level of a professional designer, but it has that Opus-level quality. Great at: - games - landing pages - HTML artifacts - 3D worlds Wish I had Fable 5 to compare with. https://t.co/qco7AKIrCv
(暂无翻译)
LlamaIndex 创始人明确指出了 RAG 技术的演进路线:死板的“检索-读取”管道已死,基于 Agent 自主判断何时检索、如何总结的动态策略才是未来。这要求开发者在构建知识库时,必须将路由推理能力前置到 Agent 的决策层。
I made a new talk on generalized knowledge agents and the modern context layer for DAIS 2026 🔥 Come check it out!
(暂无翻译)
Agentic search has moved from fixed RAG pipelines into flexible agent harnesses with access to a set of search tools: keyword search (bm25, grep regex) and semantic search. When you upload a collection of unstructured documents to LlamaParse, we expose all these tools for agents to access. Come check out our webinar on June 30th where we explore all these different tools and identify which ones work the best for agentic search: https://t.co/KrCjxY1IWV
(暂无翻译)
拥有像 NotebookLM 这样体验惊艳的应用层设计,却因为自家大模型推理能力跟不上而掣肘,是 Google 当前的真实写照。这其实给了很多独立开发者机会:只要你调教出的模型或 Agent 框架逻辑足够好,完全有机会借助大平台的应用生态逆袭。
Its also a shame because Google has some of the most innovative AI apps, like NotebookLM, but they need smarter brains to power them (as well as the harnesses needed for those brains)
(暂无翻译)
Interestingly, Google no longer has a public frontier model. They have a very good flash model, but a very good flash model can't do frontier work without a good frontier orchestrator. I am sure this will change soon, but Gemini 3.1 Pro is very clearly lagging at this point.
(暂无翻译)
In large part that’s because it’s hard to invest in a world where the exponential continues and open models never close thr gap. The value js just chips, energy, data centers, and the labs themselves.
(暂无翻译)
There is a ton of money riding on the hope that the exponential curve the Big Three Labs are on will end soon If that happens, small and open models become viable, businesses get time to react, costs drop & the world gets weirder more slowly. But that isn’t happening so far.
(暂无翻译)
Big issue with AI strategies at big companies which realized the importance of AI last year (which is only a small subset, most are still not moving fast) is that, in the best case, they developed their strategy in late 2025, before the agentic revolution Things changed since... https://t.co/hlFqsAZglj
(暂无翻译)
I have a fun, oddly useful AI benchmark: "build me a procedurally generated 3D simulation showing the evolution of a harbor town from 3000 BC to 3000 AD, it should look beautiful & allow me to have some control over it" Play the gallery of 20 models: https://t.co/zN2uHY1gl8
(暂无翻译)
LangChain 开始将重点转移到长周期、有状态的 Agent 评测基础设施上,这切中了当前 AI 工程化最痛的软肋。能对持续运行数天的复杂多步任务进行自动化打分和回归测试,是 Agent 能否真正走向生产环境的最后一公里。
harbor is a great framework for running longer running, more stateful agent evals it underpins terminal bench 2 and is becoming industry standard LangSmith Sandboxes now integration with harbor!
(暂无翻译)
硅谷大 V 痛批 AI 末日论的本质:大厂渲染恐慌往往是为了倒逼监管机构出台极高的合规门槛,从而锁死初创公司的生存空间。对于应用层创业者而言,看穿这种“倒逼监管”的套路,才能放下思想包袱去探索边缘场景的创新机会。
This is an excellent essay in the NYT that highlights the unresolved question of why the makers of AI constantly whine and cry that the world will come to an end because of AI. Hint: it won’t. https://t.co/gRxs5CWCyU
(暂无翻译)
连知名开发者都在感叹当前 AI 工具的同质化和泛滥化。这意味着“为了套壳 AI 而做产品”的时代红利已经见顶。只有真正深入到细分业务流的脏活累活中,去解决极其具体的痛点,才能避免在第一波退潮后死在沙滩上。
You can now limit open pull requests for users without write access to your repositories! https://t.co/XNb9MBX9kK
(暂无翻译)
Potentially unpopular opinion but I feel like too many products are coming out that are already existing words. We need to make up words more! Make your app become a verb someday! Make it easy for me to look up because it's a little weird! Get the dot com domain name early!
(暂无翻译)
Midjourney 放着卷生卷死的文生图不去卷,直接跨界去搞无创医疗影像扫描硬件。这种降维打击般的战略转移说明:摆脱算力层面的内卷,利用生成式模型去彻底重塑物理世界的信号采集逻辑,才是真正的星辰大海。
Midjourney just announced an extremely innovative new hardware project: full-body internal 3D scans without MRI. Check it out!
(暂无翻译)
The hardest problems are rarely solved by adding more complexity to the solution -- they are solved by reframing the question until a simpler, clearer answer reveals itself.
(暂无翻译)
科技圈大佬频频涉足实体硬件项目,暗示着纯软件 SaaS 的叙事正在被资本市场抛弃。结合 AI 智能体去开发全新的硬件载体(如智能眼镜、便携设备),通过软硬件结合的方式锁死用户数据,正成为顶级资本押注的新赛道。
Backgammon is the new poker ... and @travisk and I are working on special project y'all are gonna love!
(暂无翻译)
.@grok make these glasses 25% more chunky And a second image of them being 25% less chunky https://t.co/xGZMsaHXTo
(暂无翻译)
Evan is only one or two generations away… I wouldn’t bet against him. AR is the winning model and he knows it. https://t.co/CY4d1YBMp0
(暂无翻译)
predicted this over a decade ago Decentralized money is a threat to centralized control, and the easiest way to control something is to tax it.
(暂无翻译)
RIP @JoshuaBaer Joshua was a relentless supporter of founders and Austin He was positive, fun, and filled with joy. A true friend who relentlessly supported me over the past 20 years we knew each other.
(暂无翻译)
CrewAI 创始人发声定调:“上下文工程”才是目前大模型发挥实力的真正核心技术。这意味着死磕单行 Prompt 的时代已经过去,如何通过动态截断、长文本重排和 RAG 注入来管理有限的 Token 窗口,是算法工程师的必修课。
Everyone is talking about context engineering now. We've been dealing with this at CrewAI for 2 years. Just didn't have a name for it. Context engineering goes way beyond better prompting. It's everything your agent operates inside: memory, tools, state, task history, retrieval. The whole environment, not just the instruction. After billions of agent executions, the pattern is obvious. Teams stuck in POC purgatory almost always have a context problem, not a model problem. Most teams treat context as static. Build a prompt, ship it, done. But production context is alive. It shifts with every execution, every user, every edge case the prompt never anticipated. And then the opposite mistake: dumping everything in. More data must be better, right? No. Noisy context is worse than no context. The agent can't separate signal from noise. The biggest one though is that teams don't compound. Every run starts from zero. Same discovery, same mistakes, same ceiling. Run it a thousand times and the thousand-and-first is no smarter than the first. This is why compounding memory became central to what we do at CrewAI. The system should get better because it ran before. That's the whole point. The term is already getting watered down tho. Seeing "context engineering" on every landing page now. The actual hard work, dynamic selection, relevance scoring, memory that actually compounds... barely anyone has cracked this. Including us. Still so much to figure out here.
(暂无翻译)
谷歌开源了 Gemini 实时翻译的全栈落地代码,直接把高并发实时音频流处理的门槛打到了地板。依托 Cloud Run 的弹性伸缩能力,初创团队甚至可以用极低的前期成本,去承接跨国会议 SaaS 或国际客服外包系统的海量订单。
Build a realtime translation app with the new Gemini Live Translate, Next.js, LiveKit and Cloud Run. What it covers: 1. Stream host audio via WebRTC to a LiveKit Room 2. Pipe PCM frames to Gemini Live for on-the-fly translation 3. Publish translated audio back as separate language tracks 4. Optimize latency with 100ms frame chunking (50Hz → 10Hz) 5. Deploy to Cloud Run with Secret Manager and auto-scaling Links below ⤵️
(暂无翻译)
开源新王 GLM-5.2 的限时免费调用,是对 Cursor 和 GitHub Copilot 等闭源套壳产品商业模式的一次直接打击。开发者完全可以趁着这波福利,基于开源代码智能体框架跑通高强度的复杂项目重构,验证去闭锁化的可行性。
GLM-5.2 is free on Hugging Face Inference Providers through Zai, Together AI, Novita, Fireworks, DeepInfra for the next 6 hours Set it up with Pi, opencode, Codex, Claude Code or any coding agent https://t.co/MU04W4tT9e
(暂无翻译)
现有的医疗影像算法其实是在为 40 年前老旧的算力妥协。如果能够基于现代大算力集群重新设计图像重建的底层数学逻辑,核磁共振的时间成本和辐射伤害都有望呈指数级下降,这将是医疗硬件设备几十年一遇的重构红利。
This is exciting because we live in a world of scanning modalities that were almost all designed around a 1980's compute budget (iFFT/backprojection or bust, basically). A blank-slate redesign of the hardware that assumes modern compute capabilities could be huge.
(暂无翻译)
连以太坊创始人都在公开为底层贡献者点赞,这种极客社区的纯粹性正是 Web3 和 AI 结合部最需要的特质。在 AI 算力被巨头垄断的当下,这种依托去中心化社区文化来沉淀核心代码资产的范式,依然具有极强的生命力。
.@hwwonx has been a steadfast contributor to the Ethereum ecosystem for a decade. I still remember her early days in the Ethereum research community, first outside the Foundation and then inside it, and the thought and care she put into making Ethereum research and consensus work more organized and legible. At the same time, she put a lot of work into building an excellent Ethereum community in Taipei, with people and events that were among my favorites. Last year she, along with @tkstanczak, voluntarily took on the burden of what is perhaps the most challenging position in the Ethereum Foundation, at one of the most challenging times for Ethereum - and realistically, a challenging time for all of humanity. She handled the task skillfully and gracefully, and has constantly strived to find and insist on outcomes that are right both for the Ethereum protocol and for the human beings that build and maintain it. I look forward to her next adventures. https://t.co/yaJsdlvDU4
(暂无翻译)
机器学习界的大 V 背书证实了 GLM-5.2 的架构优越性。它不仅仅是刷榜那么简单,高度兼容之前版本的基础设施意味着开发者可以近乎零成本地将业务代码平移过来,这是开源模型抢夺市场份额的最强杀手锏。
Just caught up with the recent GLM-5.2 release. The best open-weight model today. Architecture-wise, it's build on the GLM-5 and GLM-5.1 architecture that I covered previously, which means it's reusing the Multi-head Latent Attention (MLA) and DeepSeek Sparse Attention (DSA) mechanisms from DeepSeek V3.2. (I wrote about it here: https://t.co/tuunazfQ8y) What's new is that they added an IndexShare mechanism. (That's a cross-layer reuse trick for DSA where instead of recomputing the sparse-attention top-k indexer in every layer, GLM-5.2 runs the full indexer only once every four layers and lets the following layers reuse those selected token indices. This keeps the same DSA idea but makes 1M-token inference much cheaper.)
(暂无翻译)
当开源模型在主观体验和客观数据上双重碾压闭源旗舰,将彻底击碎科技巨头依靠“模型能力壁垒”构建的估值逻辑。资本市场的投资重心必然会从做大做宽底座模型,向深挖应用场景壁垒和独占数据护城河迅速转移。
How would you change your priors if a Chinese lab released an open model that beat Fable across benchmarks and on feel.
(暂无翻译)
OpenAI 开发者关系负责人点明了 Agent 商业落地的拐点:从“陪聊”全面转向“干活”。这意味着评判一个 AI 产品的指标将发生剧变,日活、停留时长等互联网经典指标将彻底失效,API 调用成功率和任务交付转化率才是新的北极星。
Great discussion with @thsottiaux and @steipete on the VivaTech main stage, led by @CharliePerreau. We’ve moved from conversations with AI to systems that can take action and pursue goals. The momentum behind Codex and agents is impossible to miss! https://t.co/2KIdZRZOc3
(暂无翻译)
利用 Polymarket 这类 Web3 预测市场的真实资金盘口,来量化评估 AI 行业事件的落地概率,是一种极其高明的套利信息源。比起看券商研报的滞后判断,真金白银下注的盘口数据更能反映圈内极客对大模型演进节点的精准预判。
@midjourney @Scobleizer @bryan_johnson @DavidSHolz @iScienceLuvr another SUPER fun highlight of my evening was telling @zoink how we are using @polymarket prediction markets to gauge the implied value of our july 1 @aiDotEngineer world cup suite being a team USA game https://t.co/gwg3eLtwI1
(暂无翻译)
my notes from the @midjourney medical launch - @Scobleizer compared this to the original iPhone and Tesla launches (that he was also front row for) - find you a man who looks at you like @bryan_johnson was 😍 ing for @DavidSHolz - see @iScienceLuvr tweet linked for Nature paper - reminds me of our @biohub episodes: better science starts with better data, and that means better imaging - people asking "but wen FDA?" are so small minded. we will do the easy stuff, then we'll do the harder stuff. roll up your sleeves and help or just be patient. - when you have genuinely better tech+mission, all the other hurdles just sort of fall away/figure themselves out: business model, regulatory approval, hiring, marketing, confusion over what to do - this was just the first of 8 side project launches MJ has planned this year - this is what technological ambition looks like: not 10% better, not 2x better, but 40-100x better in every dimension - how are we getting this level of innovation and ambition out of a $10m/yr research budget and whats wrong with the way we use R&D in every other megacorp/goverment/frontier lab? - how has $BFLY stock not mooned yet, this thing just had its ChatGPT moment thank you to L for letting me into what I believe is going to be the top 10 most important launches i'll ever see live.
(暂无翻译)
a16z 创始人对科技圈风口的冷嘲热讽极其精准。就像现在的 RAG 或 Agent 套壳,一旦成为烂大街的政治正确,离泡沫破裂和资本退潮就不远了。与其在拥挤的通用赛道卷套壳,不如提前去布局算力调度或边缘部署这种苦逼但必须的基础设施。
If a Current Thing is a Thing enough times, it becomes The Thing. The Thing can run for a decade plus, until it becomes a parody of itself. Then The Thing quietly goes away. Even if The Thing has wrecked your whole world, you’re still silent about it. It was never a thing.
(暂无翻译)
连 OpenAI CEO 都在公开向顶尖技术人才“示爱”,说明大模型底层架构的突破已经进入极度依赖天才直觉的阶段。这也释放了一个明确信号:在模型参数和算力红利见顶的当口,想要实现能力的再次飞跃,只能去抢夺最头部的算法架构师。
We offer no explanation as to why Noams are so good at AI; we attribute their success, as all else, to divine benevolence.
(暂无翻译)
noam is one of the people I have most wanted to work with since the very beginning of openai. only took 10 years. i think it will be worth the wait!
(暂无翻译)