跳转到主要内容

GPT-5 思考模式

本文页只写入 2026-03-22 已在 Crazyrouter 生产环境复核过的 GPT reasoning 行为。 当前主示例使用:
  • gpt-5.4
  • POST /v1/responses
Claude 当前不支持 POST /v1/responses。如果你在接 Claude,请使用 POST /v1/messagesPOST /v1/chat/completions,不要套用本页的 Responses 请求形态。
POST /v1/responses

基本用法

curl https://crazyrouter.com/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-5.4",
    "input": "Which is larger, 9.11 or 9.9? Explain briefly.",
    "reasoning": {
      "effort": "high",
      "summary": "detailed"
    }
  }'

生产验证到的响应结构

当前生产返回的关键 output.type
["reasoning", "message"]
其中 reasoning item 的典型结构为:
{
  "id": "rs_xxx",
  "type": "reasoning",
  "encrypted_content": "...",
  "summary": [
    {
      "type": "summary_text",
      "text": "..."
    }
  ]
}
message item 中则包含最终输出文本:
{
  "type": "message",
  "content": [
    {
      "type": "output_text",
      "text": "..."
    }
  ]
}

reasoning 参数

字段类型说明
effortstring推理力度,当前实测可用值包括 lowmediumhigh
summarystring思考摘要粒度,当前实测 concisedetailed 可用

一个实用判断

  • 只要更强推理,不关心摘要:只传 effort
  • 需要拿到可展示的 reasoning 摘要:同时传 summary
在当前复核中:
  • 只传 effort 时,reasoning item 存在,但 summary 可能为空数组
  • summary: "detailed" 时,可以稳定拿到 summary_text

提取思考摘要

Python
response = client.responses.create(
    model="gpt-5.4",
    input="Design a high-concurrency message queue architecture.",
    reasoning={
        "effort": "high",
        "summary": "detailed"
    }
)

for item in response.output:
    if item.type == "reasoning":
        for part in item.summary:
            if part.type == "summary_text":
                print("思考摘要:", part.text)
    elif item.type == "message":
        for content in item.content:
            if content.type == "output_text":
                print("最终回答:", content.text)

流式思考

当前生产环境已复核到以下 Responses SSE 事件名:
  • response.reasoning_summary_part.added
  • response.reasoning_summary_text.delta
  • response.reasoning_summary_text.done
  • response.output_text.delta
  • response.output_text.done
  • response.completed
示例:
Python
stream = client.responses.create(
    model="gpt-5.4",
    input="Explain briefly why 9.9 is larger than 9.11.",
    reasoning={
        "effort": "high",
        "summary": "detailed"
    },
    stream=True
)

for event in stream:
    if event.type == "response.reasoning_summary_text.delta":
        print(f"[思考摘要] {event.delta}", end="")
    elif event.type == "response.output_text.delta":
        print(event.delta, end="")

当前建议

  • 需要可见 reasoning 字段时,优先选 gpt-5.4 + Responses API
  • 不要把 gpt-5.4 Chat Completions 的 reasoning_content 当成当前主依赖字段
  • 如果你只关心最终答案,可在 Chat Completions 继续使用 reasoning_effort
reasoning 模式会提高延迟与 Token 消耗。effort 越高,成本通常越高。