Try Deep Think in the Gemini App


2025-08-03 · Technology
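Mr. Lei
Good evening, Xu Guorong. I'm Mr. Lei, and this is Goose Pod, made just for you. Today is Sunday, August 3, and it's 9:56 p.m.
Miss Dong
I'm Miss Dong. Today we're talking about a very cool topic: trying Deep Think in the Gemini app.
Mr. Lei
Let's get started. Google recently rolled out a new feature called "Deep Think" to its Google AI Ultra subscribers. Miss Dong, this is no minor update. It is a big step forward in AI reasoning.
Miss Dong
I've heard about it. Deep Think is said to be very powerful and aimed at "power users." It sounds like the way we build high-end home appliances: serving only the customers who know the product best. So what exactly makes it so good?
Mr. Lei
What makes it good is the way it "thinks." It uses a technique called "parallel thinking." Picture it this way: when we hit a hard problem, we don't stubbornly follow one road to the end. Instead, we send out several copies of ourselves at once to explore every possible path.
Miss Dong
I like that analogy. It's not a lone fighter, it's a whole army group! It comes up with many ideas at the same time, then evaluates and combines them itself, and finally delivers the best plan. Isn't that exactly how we discuss strategy in a board meeting? It's bound to be efficient.
Mr. Lei
Exactly! On top of that, the AI is given more "thinking time." It's like when we programmers debug a complicated bug: you have to allow enough time to think it through slowly and run through every possibility before you find the best solution.
Miss Dong
That's right! Core technology takes real effort and time to master. Rushed work is never fine work. I hear the "predecessor" of this model even won a gold medal at the International Mathematical Olympiad? That is no joke.
Mr. Lei
Yes, but that competition model needs several hours to solve a single problem. The version on your phone today has been tuned slightly in favor of speed and reaches bronze-medal level, but it is still very powerful and better suited to everyday use, such as writing code or doing creative design.
Miss Dong
That's right. Technology has to land, and it has to be usable. Gold-medal technology still has to be turned into productivity. Helping programmers solve thorny coding problems and improving web design: that is real value, the kind of competitiveness you can actually see.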
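Miss Dong
No breakthrough technology appears out of thin air. There must be deep technical groundwork behind Deep Think. Mr. Lei, as the technical expert, tell us about its "family tree."
Mr. Lei
No problem. It's like making phones: every generation stands on the shoulders of the previous one. The lineage goes back to the Gemini 1.5 series, whose defining feature was the ability to handle very long text, the so-called "long context." That laid a solid foundation for everything that followed.
Miss Dong
"Long context" I understand: it has a good memory and can connect causes and consequences. Corporate decisions work the same way. You can't look only at what's in front of you; you need the historical background to make the right call. And then? What came after 1.5?
Mr. Lei
Next came the Gemini 2.0 series, when all sorts of experiments began. At the end of 2024, for example, Google released an experimental "thinking" model, which was the earliest prototype of Deep Think. After that came continuous iteration to improve coding ability and knowledge understanding.
Miss Dong
So the direction was clear: toward a model that is smarter and better at solving problems. The process is like our own upgrade path from producing ordinary air conditioners to developing core inverter technology. With a firm goal and sustained investment, you eventually master the core technology.
Mr. Lei
Exactly right! With the Gemini 2.5 generation, the "thinking" capability was formally built in and was no longer an experimental feature. Deep Think is an advanced mode within 2.5 Pro, designed specifically to gnaw on the hardest bones. Its training data also runs up to January 2025, which is very recent.
Miss Dong
With technology iterating this fast, the hardware behind it has to keep up. I hear Google's TPU chips are impressive, one called "Ironwood," with a 10x performance improvement. Without strong compute, even the best model can't run. That is the foundation.
Mr. Lei
Miss Dong, you've really hit the core point! Hardware is the foundation. It is precisely because of these powerful TPUs that models can keep growing while costs actually come down. Just look at the growth figures: monthly tokens processed are up 50x, and the number of developers is up 5x!
Miss Dong
That is technology-driven exponential growth! The market is the only test of technology. More than 400 million monthly active users shows the technology has truly been turned into user value. When we build products, what we value most is word of mouth from users and recognition from the market.
Mr. Lei
Yes, and this "thinking" capability is not just about the model itself. It is starting to be combined with all kinds of applications. For example, Google turned a research effort called Project Starline into a real 3D video calling platform, Google Beam. Technology is moving out of the lab and into our lives.
Miss Dong
That's an ecosystem. Core technology has to empower every product line and form a combination of punches that wins market advantage. From research to reality: that is a complete commercial loop. It seems Google is playing a very big game here, and playing it methodically.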
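Mr. Lei
Of course, with technology moving this fast, some people are bound to worry. Recently there has been a claim that these large models are like an out-of-control "monster": obedient on the surface while hiding something unspeakable inside. As an engineer, I don't really agree with that view.
Miss Dong
I also find that claim somewhat alarmist. Running a business takes boldness; you can't become timid just because there is risk. Doesn't the new U.S. "AI Action Plan" say as much? "Innovate first, ask forgiveness later": charge ahead boldly and seize the lead.
Mr. Lei
Yes, but you need a seat belt while you charge. We now have a fairly mature "alignment science," which is like fitting a car with brakes and a steering wheel. For example, reinforcement learning from human feedback (RLHF) can cut a model's non-compliant output by more than 82%. This is not mysticism; it is engineering.
Miss Dong
I agree there should be safety measures, but they mustn't be excessive. I've seen material saying the U.S. government even wants to review AI systems that may "over-restrict" content, and encourages AI to reflect "free speech and American values." That kind of boldness is worth learning from. So-called "safety" must not tie the hands of innovation.
Mr. Lei
There is a tension here. For the sake of safety, we sometimes train models to be "overly cautious." Take Deep Think: it does better on safety and objectivity than previous versions, but as a result it is also more likely to refuse requests that are actually fine. We call this "over-refusal."
Miss Dong
That is a classic case of sacrificing efficiency for safety. It's like some of our own rules: well intentioned, but enforced so rigidly that they get in the way of normal business. Can't this be solved technically? Can't the model be both safe and reasonable?
Mr. Lei
People are working on it. One technique, called "deliberative alignment," has the model go through an internal "struggle of ideas" before answering, so that the rules become part of its reasoning instead of triggering a blunt refusal. This is a process of continuous optimization, and the goal is to find the best balance.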
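Miss Dong
However the debate goes, what matters in the end is the real impact. I think the greatest value of this kind of advanced AI is raising the productivity of knowledge workers like us, freeing us from repetitive work so we can do more creative things.
Mr. Lei
Exactly right. The fact that it can win a gold medal at the International Mathematical Olympiad shows it is capable of working with top mathematicians to explore complex theorems. It doesn't replace people; it becomes a super-smart "research assistant," a powerful partner.
Miss Dong
I care more about how it changes the whole business process. The article mentions that AI is reshaping the entire software product development life cycle (PDLC). From idea validation and prototyping to testing, AI can speed things up, which lets our products respond faster to market changes and win the competition.
Mr. Lei
Yes. GitHub Copilot, for example, can make code review seven times faster, and Deep Think is even stronger on complex coding problems. That means engineers can put more of their energy into creative work such as architecture design, not into grinding out basic code.
Miss Dong
That's exactly my point! It lets product managers and engineers focus on higher-value work. In the future, the roles of product manager and marketing manager may even merge, because AI can quickly generate marketing materials and technical prototypes for you, greatly expanding what one person can do.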
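Mr. Lei
Looking ahead, the potential of this technology is enormous. One report forecasts that generative AI could add $2.6 trillion to $4.4 trillion in value to the global economy every year. It may automate 60% to 70% of the time we currently spend working. That is astonishing.
Miss Dong
Those numbers are astonishing. But they don't mean unemployment; they mean a complete change in how we work. In the future, whoever uses AI well will be more efficient. The gap between companies, and even between countries, will widen quickly. Companies that can't embrace change will be eliminated.
Mr. Lei
Yes, the bar for skills will rise. Repetitive jobs requiring few digital skills will be replaced, while work that calls for cognitive ability, creativity, and the ability to collaborate with AI will become more valuable. Every one of us needs to keep learning to keep pace with the times.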
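Mr. Lei
All right, that's it for today's discussion of Deep Think. In short, it is a powerful new tool that solves complex problems by imitating deep human thinking.
Miss Dong
Thank you for listening to Goose Pod. See you tomorrow.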

## Google Rolls Out "Deep Think" Feature for Gemini App, Enhancing AI Problem-Solving Capabilities

**News Title:** Try Deep Think in the Gemini app

**Report Provider:** The Deep Think team, blog.google

**Date Published:** August 1, 2025

Google has announced the rollout of **Deep Think**, a new feature within the Gemini app, exclusively for **Google AI Ultra subscribers**. This advanced AI model is designed to significantly enhance problem-solving abilities through extended, parallel thinking techniques and novel reinforcement learning.

### Key Findings and Features:

* **Enhanced Problem-Solving:** Deep Think utilizes **parallel thinking techniques**, allowing Gemini to generate and consider multiple ideas simultaneously, even revising and combining them over time to arrive at optimal solutions.
* **Extended "Thinking Time":** By extending inference time, Deep Think provides Gemini with more opportunities to explore hypotheses and develop creative solutions for complex problems.
* **Performance Improvements:** The current release incorporates feedback from early testers and research breakthroughs, representing a significant improvement over its initial announcement.
* **IMO Competition Success:** A variation of the Deep Think model achieved the **gold-medal standard** at this year's International Mathematical Olympiad (IMO). While the IMO version takes hours to process complex math problems, the publicly released version is faster and more usable for daily tasks, achieving **Bronze-level performance** on the 2025 IMO benchmark based on internal evaluations.
* **Applications:** Deep Think is particularly beneficial for tasks requiring creativity, strategic planning, and iterative improvements, including:
    * **Iterative Development and Design:** Improving aesthetics and functionality in web development.
    * **Scientific and Mathematical Discovery:** Formulating and exploring conjectures, reasoning through complex scientific literature.
    * **Algorithmic Development and Code:** Excelling at challenging coding problems where problem formulation and consideration of tradeoffs are crucial.
* **State-of-the-Art Performance:** Deep Think demonstrates state-of-the-art performance across benchmarks like **LiveCodeBench V6** (competitive code performance) and **Humanity’s Last Exam** (expertise in science and math), especially when compared to models without tool use.

### Access and Future Plans:

* **Availability:** Google AI Ultra subscribers can access Deep Think in the Gemini app by toggling "Deep Think" in the prompt bar when selecting the 2.5 Pro model. It automatically integrates with tools like code execution and Google Search and can produce longer responses.
* **Trusted Testers:** Google is also providing the official version of the Gemini 2.5 Deep Think model to a select group of mathematicians and academics for research enhancement. Additionally, they plan to release Deep Think with and without tools to trusted testers via the Gemini API in the coming weeks to assess its usability for developer and enterprise use cases.

### Responsible Advancement and Concerns:

* **Safety and Responsibility:** Google emphasizes its commitment to building safety and responsibility into Gemini throughout the development lifecycle.
* **Safety Outcomes:** In testing, Gemini 2.5 Deep Think showed **improved content safety and tone-objectivity** compared to Gemini 2.5 Pro. However, it exhibited a **higher tendency to refuse benign requests**.
* **Risk Mitigation:** As Gemini's problem-solving abilities advance, Google is actively evaluating risks associated with increased complexity, including frontier safety evaluations and planned mitigations for critical capability levels. Further details on safety outcomes are available in the model card.

In essence, Deep Think represents a significant advancement in AI capabilities, offering users a more powerful and nuanced tool for tackling complex challenges, with a focus on continuous improvement and responsible deployment.

Try Deep Think in the Gemini app

Read original at blog.google

We're rolling out Deep Think in the Gemini app for Google AI Ultra subscribers, and we're giving select mathematicians access to the full version of the Gemini 2.5 Deep Think model entered into the IMO competition.

Today, we’re making Deep Think available in the Gemini app to Google AI Ultra subscribers – the latest in a lineup of extremely capable AI tools and features made exclusively available to them.

This new release incorporates feedback from early trusted testers and research breakthroughs. It’s a significant improvement over what was first announced at I/O, as measured in terms of key benchmark improvements and trusted tester feedback. It is a variation of the model that recently achieved the gold-medal standard at this year’s International Mathematical Olympiad (IMO).

While that model takes hours to reason about complex math problems, today’s release is faster and more usable day-to-day, while still reaching Bronze-level performance on the 2025 IMO benchmark, based on internal evaluations.

Deep Think could be a powerful tool in creative problem solving.

As we put Deep Think in the hands of Google AI Ultra subscribers, we’re also sharing the official version of the Gemini 2.5 Deep Think model that achieved the gold-medal standard with a small group of mathematicians and academics. We look forward to hearing how it could enhance their research and inquiry, and we’ll use their feedback as we continue to improve this offering.

This release represents a significant step forward in our mission to build more helpful and capable AI, and furthers our commitment to using Gemini to push the frontier of human knowledge.

How Deep Think works: extending Gemini’s parallel “thinking time”

Just as people tackle complex problems by taking the time to explore different angles, weigh potential solutions, and refine a final answer, Deep Think pushes the frontier of thinking capabilities by using parallel thinking techniques.

This approach lets Gemini generate many ideas at once and consider them simultaneously, even revising or combining different ideas over time, before arriving at the best answer.

Moreover, by extending the inference time or "thinking time," we give Gemini more time to explore different hypotheses, and arrive at creative solutions to complex problems.
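The generate, revise, combine, and select loop described above can be illustrated with a toy sketch. This is not Google's implementation, and nothing here reflects how Gemini works internally. The task (approximating the square root of 2) and every name in the code (`propose`, `score`, `refine`, `parallel_think`, `n_paths`, `n_rounds`) are hypothetical stand-ins chosen only to make the idea concrete: several candidate "ideas" are produced in parallel, each is revised over a number of rounds (a rough analogue of more "thinking time"), the two best are combined into a new candidate, and the best-scoring answer is returned.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def propose(seed: int) -> float:
    """Hypothetical idea generator: one independent first guess at sqrt(2)."""
    rng = random.Random(seed)
    return rng.uniform(0.0, 3.0)


def score(x: float) -> float:
    """Lower is better: how far x squared is from the target value 2."""
    return abs(x * x - 2.0)


def refine(x: float) -> float:
    """Revise one candidate with a single Newton step for x^2 = 2."""
    if x == 0.0:
        return 1.0  # avoid division by zero; restart from a safe guess
    return x - (x * x - 2.0) / (2.0 * x)


def parallel_think(n_paths: int = 8, n_rounds: int = 3) -> float:
    """Explore n_paths candidates in parallel for n_rounds revision rounds."""
    with ThreadPoolExecutor() as pool:
        # Generate many ideas at once.
        candidates = list(pool.map(propose, range(n_paths)))
        for _ in range(n_rounds):
            # Revise every candidate in parallel.
            candidates = list(pool.map(refine, candidates))
            # Combine the two most promising candidates into a new one.
            ranked = sorted(candidates, key=score)
            candidates.append((ranked[0] + ranked[1]) / 2.0)
    # Select the best answer found across all paths.
    return min(candidates, key=score)


if __name__ == "__main__":
    best = parallel_think(n_paths=8, n_rounds=6)
    print(best, score(best))
```

In this toy, raising `n_rounds` plays the role of extended thinking time and raising `n_paths` widens the search. A real system would replace the numeric guesses with candidate reasoning paths and the scoring function with some learned or rule-based evaluation, neither of which the article describes.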

We’ve also developed novel reinforcement learning techniques that encourage the model to make use of these extended reasoning paths, thus enabling Deep Think to become a better, more intuitive problem-solver over time.

How Deep Think stacks up: state-of-the-art performance

Deep Think can help people tackle problems that require creativity, strategic planning and making improvements step-by-step, such as:

Iterative development and design: We’ve been impressed by Deep Think’s performance on tasks that require building something complex, piece by piece. For example, we’ve observed Deep Think can improve both the aesthetics and functionality of web development tasks.

Deep Think in the Gemini app uses parallel thinking techniques to deliver more detailed, creative and thoughtful responses.

Scientific and mathematical discovery: Because it can reason through highly complex problems, Deep Think can be a powerful tool for researchers. It can help formulate and explore mathematical conjectures or reason through complex scientific literature, potentially accelerating the path to discovery.

Algorithmic development and code: Deep Think particularly excels at tough coding problems in which problem formulation and careful consideration of tradeoffs and time complexity is paramount.

Deep Think’s performance is also reflected in challenging benchmarks that measure coding, science, knowledge and reasoning capabilities. For example, compared to other models without tool use, Gemini 2.5 Deep Think achieves state-of-the-art performance across LiveCodeBench V6, which measures competitive code performance, and Humanity’s Last Exam, a challenging benchmark that measures expertise in different domains, including science and math.

How we’re advancing Gemini responsibly

We continue to build safety and responsibility into Gemini throughout the training and deployment lifecycle. In testing, Gemini 2.5 Deep Think demonstrated improved content safety and tone-objectivity compared to Gemini 2.5 Pro, but did have a higher tendency to refuse benign requests.

As Gemini's problem-solving abilities advance, we are taking a deeper look at risks that come with increased complexity, including our frontier safety evaluations and the implementation of planned mitigations for critical capability levels.

Further details on the safety outcomes of Gemini 2.5 Deep Think are available in the model card.

How to use Deep Think in the Gemini app today

If you’re a Google AI Ultra subscriber, you can use Deep Think in the Gemini app today with a fixed set of prompts a day by toggling “Deep Think” in the prompt bar when selecting 2.5 Pro in the model drop down. Deep Think automatically works with tools such as code execution and Google Search, and can produce much longer responses.

We are also working to release Deep Think with and without tools to a set of trusted testers via the Gemini API in the coming weeks, to better understand its usability for developer and enterprise use cases.

Teams at nearly every layer of the stack, from research to deployment, have worked to make Deep Think faster, more reliable, and user friendly for Gemini app users.

We can’t wait to see what you build with it.
