From bbfe5d929f3dad83655cb2a337d1054274d4bd87 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E6=9C=B1=E6=BD=AE?= <zhuchaowe@users.noreply.github.com>
Date: Wed, 11 Feb 2026 12:14:48 +0800
Subject: [PATCH 1/5] feat: add skill feature memory
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

添加 skill 功能的 feature memory，记录技能包管理服务和 Hook 系统的核心信息。

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 .features/skill/MEMORY.md | 121 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 121 insertions(+)
 create mode 100644 .features/skill/MEMORY.md

diff --git a/.features/skill/MEMORY.md b/.features/skill/MEMORY.md
new file mode 100644
index 0000000..7d6fc1a
--- /dev/null
+++ b/.features/skill/MEMORY.md
@@ -0,0 +1,121 @@
+# Skill 功能
+
+> 负责范围：技能包管理服务 - 核心实现
+> 最后更新：2025-02-11
+
+## 当前状态
+
+Skill 系统支持两种来源：官方 skills (`./skills/`) 和用户 skills (`projects/uploads/{bot_id}/skills/`)。支持 Hook 系统和 MCP 服务器配置，通过 SKILL.md 或 plugin.json 定义元数据。
+
+## 核心文件
+
+- `routes/skill_manager.py` - Skill 上传/删除/列表 API
+- `agent/plugin_hook_loader.py` - Hook 系统实现
+- `agent/deep_assistant.py` - `CustomSkillsMiddleware`
+- `agent/prompt_loader.py` - PrePrompt hooks + MCP 配置合并
+- `skills/` - 官方 skills 目录
+- `skills_developing/` - 开发中 skills
+
+## 最近重要事项
+
+- 2025-02-11: 初始化 skill 功能 memory
+
+## Gotchas（开发必读）
+
+- ⚠️ 执行脚本必须使用绝对路径
+- ⚠️ MCP 配置优先级：Skill MCP > 默认 MCP > 用户参数
+- ⚠️ 上传大小限制：50MB（ZIP），解压后最大 500MB
+- ⚠️ 压缩比例检查：最大 100:1（防止 zip 炸弹）
+- ⚠️ 符号链接检查：禁止解压包含符号链接的文件
+
+## Skill 目录结构
+
+```
+skill-name/
+├── SKILL.md                    # 核心指令文档（必需）
+├── skill.yaml                  # 元数据配置（可选）
+├── .claude-plugin/
+│   └── plugin.json            # Hook 和 MCP 配置（可选）
+└── scripts/                    # 可执行脚本（可选）
+    └── script.py
+```
+
+## Hook 系统
+
+| Hook 类型 | 执行时机 | 用途 |
+|-----------|---------|------|
+| `PrePrompt` | system_prompt 加载时 | 动态注入用户上下文 |
+| `PostAgent` | agent 执行后 | 处理响应结果 |
+| `PreSave` | 保存消息前 | 内容过滤/修改 |
+
+## API 接口
+
+| 端点 | 方法 | 功能 |
+|------|------|------|
+| `GET /api/v1/skill/list` | - | 返回官方 + 用户 skills |
+| `POST /api/v1/skill/upload` | - | ZIP 上传，解压到用户目录 |
+| `DELETE /api/v1/skill/remove` | - | 删除用户 skill |
+
+## 内置 Skills
+
+| Skill 名称 | 功能描述 |
+|-----------|---------|
+| `excel-analysis` | Excel 数据分析、透视表、图表 |
+| `managing-scripts` | 管理可复用脚本库 |
+| `rag-retrieve` | RAG 知识库检索 |
+| `jina-ai` | Jina AI Reader/Search |
+| `user-context-loader` | Hook 机制示例 |
+
+## plugin.json 格式
+
+```json
+{
+  "name": "skill-name",
+  "description": "描述",
+  "hooks": {
+    "PrePrompt": [{"type": "command", "command": "python hooks/pre_prompt.py"}],
+    "PostAgent": [...],
+    "PreSave": [...]
+  },
+  "mcpServers": {
+    "server-name": {
+      "command": "...",
+      "args": [...]
+    }
+  }
+}
+```
+
+## Skill 加载优先级
+
+1. Skill MCP 配置（最高）
+2. 默认 MCP 配置 (`mcp/mcp_settings.json`)
+3. 用户传入参数（覆盖所有）
+
+## 安全措施
+
+- ZipSlip 防护：检查解压路径
+- 路径遍历防护：验证 `bot_id` 和 `skill_name` 格式
+- 大小限制：上传 50MB，解压后 500MB
+- 压缩比限制：最大 100:1
+
+## 设计原则
+
+- **渐进式加载**：按需加载，避免一次性读取所有
+- **绝对路径优先**：执行脚本必须使用绝对路径
+- **通用化设计**：脚本应参数化，解决一类问题
+- **安全优先**：完整的上传验证链
+
+## 配置项
+
+```bash
+SKILLS_DIR=./skills           # 官方 skills 目录
+BACKEND_HOST=xxx             # RAG API 主机
+MASTERKEY=xxx                # 认证密钥
+```
+
+## 索引
+
+- 设计决策：`decisions/`
+- 变更历史：`changelog/`
+- 相关文档：`docs/`

From 63d17d355b07a533d068d0a2541d33bda1865017 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E6=9C=B1=E6=BD=AE?= <zhuchaowe@users.noreply.github.com>
Date: Wed, 11 Feb 2026 12:30:56 +0800
Subject: [PATCH 2/5] feat(skill): add feature memory with changelog and
 decisions
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

添加 skill 功能的完整记忆文档：

Changelog:
- 2025-Q4: 初始实现 (GRPC 层 + 内置 skills)
- 2026-Q1: API 完善 (REST API + Hook 系统)

Design Decisions:
- 001: Skill 架构设计 (目录结构、Hook 系统)
- 002: 上传安全措施 (ZipSlip、路径遍历防护)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 .features/skill/changelog/2025-Q4.md          | 38 +++++++++++++++
 .features/skill/changelog/2026-Q1.md          | 36 +++++++++++++++
 .features/skill/decisions/001-architecture.md | 46 +++++++++++++++++++
 .features/skill/decisions/002-security.md     | 35 ++++++++++++++
 4 files changed, 155 insertions(+)
 create mode 100644 .features/skill/changelog/2025-Q4.md
 create mode 100644 .features/skill/changelog/2026-Q1.md
 create mode 100644 .features/skill/decisions/001-architecture.md
 create mode 100644 .features/skill/decisions/002-security.md

diff --git a/.features/skill/changelog/2025-Q4.md b/.features/skill/changelog/2025-Q4.md
new file mode 100644
index 0000000..b43c00a
--- /dev/null
+++ b/.features/skill/changelog/2025-Q4.md
@@ -0,0 +1,38 @@
+# 2025-Q4 Skill Changelog
+
+## 版本 0.1.0 - 初始实现
+
+### 2025-10-31
+- **新增**: agent skills 支持，测试阶段代码
+- **文件**: `chat_handler.py`, `knowledge_chat_cc_service.py`
+- **作者**: Alex
+
+### 2025-11-03
+- **新增**: 内置 skills (pptx, docx, pdf, xlsx)
+- **新增**: jina skill - 规范 jina 网络搜索
+- **解决**: "prompt too long" 问题
+
+### 2025-11-13
+- **新增**: cc agent task 任务添加默认 skills
+- **文件**: `task_handler.py`, `knowledge_task_cc_service.py`
+
+### 2025-11-19
+- **新增**: skill-creator 内置技能
+
+### 2025-11-20
+- **新增**: EFS 类型接口，新增上传 skill
+- **功能**: 支持 skill 包上传
+
+### 2025-11-21
+- **新增**: EFS 删除 skill 接口
+- **移除**: skill 查询接口（暂存）
+
+### 2025-11-22
+- **新增**: GRPC chat 接口，skills 参数支持
+
+### 2025-11-26
+- **新增**: skill 上传支持 `.skill` 后缀（测试）
+
+### 2025-11-28
+- **优化**: 默认挂载的 skill 改为合并逻辑
+- **优化**: 代码结构优化
diff --git a/.features/skill/changelog/2026-Q1.md b/.features/skill/changelog/2026-Q1.md
new file mode 100644
index 0000000..9c0ea7e
--- /dev/null
+++ b/.features/skill/changelog/2026-Q1.md
@@ -0,0 +1,36 @@
+# 2026-Q1 Skill Changelog
+
+## 版本 0.2.0 - API 完善
+
+### 2026-01-07
+- **新增**: Skills 列表查询 API（能力管理页面）
+- **新增**: 技能管理 API with authentication
+- **文件**: `routes/skill_manager.py`
+- **作者**: claude[bot], 朱潮
+
+### 2026-01-09
+- **重构**: 移除 catalog agent，合并到 general agent
+- **说明**: 简化架构，统一使用 general_agent
+- **作者**: 朱潮
+
+### 2026-01-10
+- **修复**: SKILL.md 的 name 字段解析逻辑
+- **新增**: 支持非标准 YAML 格式
+- **新增**: 目录名称不匹配时自动重命名
+- **作者**: Alex
+
+### 2026-01-13
+- **修复**: multipart form data format for catalog service
+- **作者**: 朱潮
+
+### 2026-01-28
+- **新增**: enable_thinking, enable_memory, skills to agent_bot_config
+- **作者**: 朱潮
+
+### 2026-01-30
+- **修复**: skill router 正确注册
+- **作者**: 朱潮
+
+### 2026-02-11
+- **新增**: 初始化 skill feature memory
+- **作者**: 朱潮
diff --git a/.features/skill/decisions/001-architecture.md b/.features/skill/decisions/001-architecture.md
new file mode 100644
index 0000000..e899950
--- /dev/null
+++ b/.features/skill/decisions/001-architecture.md
@@ -0,0 +1,46 @@
+# 001: Skill 架构设计
+
+## 状态
+已采纳 (Accepted)
+
+## 上下文
+需要为 QWEN_AGENT 模式的机器人提供可扩展的技能（插件/工具）支持，允许动态加载自定义功能。
+
+## 决策
+
+### 目录结构设计
+```
+skill-name/
+├── SKILL.md                    # 核心指令文档（必需）
+├── skill.yaml                  # 元数据配置（可选）
+├── .claude-plugin/
+│   └── plugin.json            # Hook 和 MCP 配置（可选）
+└── scripts/                    # 可执行脚本（可选）
+```
+
+### Hook 系统
+| Hook 类型 | 执行时机 | 用途 |
+|-----------|---------|------|
+| `PrePrompt` | system_prompt 加载时 | 动态注入用户上下文 |
+| `PostAgent` | agent 执行后 | 处理响应结果 |
+| `PreSave` | 保存消息前 | 内容过滤/修改 |
+
+### 技能来源
+1. **官方 skills**: `./skills/` 目录
+2. **用户 skills**: `projects/uploads/{bot_id}/skills/`
+
+## 结果
+
+### 正面影响
+- 渐进式加载，按需读取
+- 支持多种元数据格式（优先级: plugin.json > SKILL.md）
+- 完整的 Hook 扩展机制
+- MCP 服���器配置支持
+
+### 负面影响
+- 需要管理文件系统权限
+- 技能包格式验证复杂度增加
+
+## 替代方案
+1. 使用数据库存储（拒绝：文件更灵活）
+2. 仅支持单一格式（拒绝：用户多样性需求）
diff --git a/.features/skill/decisions/002-security.md b/.features/skill/decisions/002-security.md
new file mode 100644
index 0000000..dc671c5
--- /dev/null
+++ b/.features/skill/decisions/002-security.md
@@ -0,0 +1,35 @@
+# 002: Skill 上传安全措施
+
+## 状态
+已采纳 (Accepted)
+
+## 上下文
+用户可以上传 ZIP 格式的技能包，需要防范常见的安全攻击。
+
+## 决策
+
+### 安全防护措施
+
+| 威胁 | 防护措施 |
+|------|---------|
+| ZipSlip 攻击 | 检查每个文件的解压路径 |
+| 路径遍历 | 验证 `bot_id` 和 `skill_name` 格式 |
+| Zip 炸弹 | 压缩比检查（最大 100:1） |
+| 磁盘空间滥用 | 上传 50MB，解压后最大 500MB |
+| 符号链接攻击 | 禁止解压包含符号链接的文件 |
+
+### 限制规则
+```python
+MAX_UPLOAD_SIZE = 50 * 1024 * 1024      # 50MB
+MAX_EXTRACTED_SIZE = 500 * 1024 * 1024  # 500MB
+MAX_COMPRESSION_RATIO = 100              # 100:1
+```
+
+## 结果
+- 完整的上传验证链
+- 防止恶意文件攻击
+- 资源使用可控
+
+## 替代方案
+1. 使用沙箱容器解压（拒绝：复杂度高）
+2. 仅允许预定义技能（拒绝：限制用户自定义能力）

From deb78a76253a3a3855965858443e92b9f0b2da8a Mon Sep 17 00:00:00 2001
From: autobee-sparticle <support@sparticle.com>
Date: Tue, 17 Mar 2026 10:37:49 +0900
Subject: [PATCH 3/5] fix: improve memory extraction for colloquial/informal
 speech (#16)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* chore: add .worktrees/ to .gitignore

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(CI): 添加 onprem-dev 环境的构建和部署配置

在 CircleCI 配置中新增 onprem-dev 环境的 build-and-push 和 deploy 任务，部署到 cluster-for-B 的 onprem-dev 命名空间

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: improve memory extraction for colloquial/informal speech

Add semantic completeness rules and multilingual few-shot examples
to FACT_RETRIEVAL_PROMPT to prevent truncated or semantically incorrect
memory extraction. Specifically addresses Japanese casual speech where
particles (が, を, に) are often omitted.

Closes sparticleinc/mygpt-frontend#2125

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: zhuchao <zhuchaowe@163.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: shuirong <shuirong1997@icloud.com>
---
 .circleci/config.yml            | 27 +++++++++++++++++++++++++++
 prompt/FACT_RETRIEVAL_PROMPT.md | 20 ++++++++++++++++++++
 2 files changed, 47 insertions(+)

diff --git a/.circleci/config.yml b/.circleci/config.yml
index b0a1532..d0b7173 100644
--- a/.circleci/config.yml
+++ b/.circleci/config.yml
@@ -193,3 +193,30 @@ workflows:
             branches:
               only:
                 - onprem
+      # 为 onprem-dev 环境部署
+      - build-and-push:
+          name: build-for-onprem-dev
+          context:
+            - ecr-new
+          path: .
+          dockerfile: Dockerfile
+          repo: catalog-agent
+          docker-tag: ''
+          filters:
+            branches:
+              only:
+                - onprem
+      - deploy:
+          name: deploy-for-onprem-dev
+          docker-tag: ''
+          path: '/home/ubuntu/cluster-for-B/onprem-dev/catalog-agent/deploy.yaml'
+          deploy-name: catalog-agent
+          deploy-namespace: onprem-dev
+          context:
+            - ecr-new
+          filters:
+            branches:
+              only:
+                - onprem
+          requires:
+            - build-for-onprem-dev
diff --git a/prompt/FACT_RETRIEVAL_PROMPT.md b/prompt/FACT_RETRIEVAL_PROMPT.md
index 27777e6..1c2f93e 100644
--- a/prompt/FACT_RETRIEVAL_PROMPT.md
+++ b/prompt/FACT_RETRIEVAL_PROMPT.md
@@ -83,6 +83,21 @@ Output: {{"facts" : ["Mike Smith helped with bug fix", "Contact: Mike Smith (col
 Input: Mike is coming to the meeting tomorrow.
 Output: {{"facts" : ["Mike Smith is coming to the meeting tomorrow", "Contact: Mike Smith (colleague, also referred as Mike) - DEFAULT when user says 'Mike'"]}}
 
+Input: 私は林檎好きです
+Output: {{"facts" : ["林檎が好き"]}}
+
+Input: コーヒー飲みたい、毎朝
+Output: {{"facts" : ["毎朝コーヒーを飲みたい"]}}
+
+Input: 昨日映画見た、すごくよかった
+Output: {{"facts" : ["昨日映画を見た", "映画がすごくよかった"]}}
+
+Input: 我喜欢吃苹果
+Output: {{"facts" : ["喜欢吃苹果"]}}
+
+Input: 나는 사과를 좋아해
+Output: {{"facts" : ["사과를 좋아함"]}}
+
 Return the facts and preferences in a json format as shown above.
 
 Remember the following:
@@ -93,6 +108,11 @@ Remember the following:
 - If you do not find anything relevant in the below conversation, you can return an empty list corresponding to the "facts" key.
 - Create the facts based on the user and assistant messages only. Do not pick anything from the system messages.
 - Make sure to return the response in the format mentioned in the examples. The response should be in json with a key as "facts" and corresponding value will be a list of strings.
+- **CRITICAL for Semantic Completeness**:
+  - Each extracted fact MUST preserve the complete semantic meaning. Never truncate or drop key parts of the meaning.
+  - For colloquial or grammatically informal expressions (common in spoken Japanese, Chinese, Korean, etc.), understand the full intended meaning and record it in a clear, semantically complete form.
+  - In Japanese, spoken language often omits particles (e.g., が, を, に). When extracting facts, include the necessary particles to make the meaning unambiguous. For example: "私は林檎好きです" should be understood as "林檎が好き" (likes apples), not literally "私は林檎好き".
+  - When the user expresses a preference or opinion in casual speech, record the core preference/opinion clearly. Remove the subject pronoun (私は/I) since facts are about the user by default, but keep all other semantic components intact.
 - **CRITICAL for Contact/Relationship Tracking**:
   - ALWAYS use the "Contact: [name] (relationship/context)" format when recording people
   - When you see a short name that matches a known full name, record as "Contact: [Full Name] (relationship, also referred as [Short Name])"

From a161e43421bcd6e66e725525a02300bd94153569 Mon Sep 17 00:00:00 2001
From: autobee-sparticle <support@sparticle.com>
Date: Tue, 17 Mar 2026 11:14:02 +0900
Subject: [PATCH 4/5] feat: add POST /api/v1/memory endpoint for realtime
 conversation memory (#17)

* feat: add POST /api/v1/memory endpoint for realtime conversation memory

Add memory extraction API that accepts conversation messages and
stores them via Mem0. This enables realtime voice sessions to save
memories through the same pipeline as chat conversations.

Fixes: sparticleinc/mygpt-frontend#2126

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address code review findings for memory API

- Use Literal["user","assistant"] for role field validation
- Add Field constraints (min_length, max_length=200)
- Track and report pairs_failed in response
- Hide internal exception details from HTTP response
- Remove unused authorization parameter (internal API)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: zhuchao <zhuchaowe@163.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
---
 routes/memory.py | 112 +++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 109 insertions(+), 3 deletions(-)

diff --git a/routes/memory.py b/routes/memory.py
index a0b6a95..c29ae53 100644
--- a/routes/memory.py
+++ b/routes/memory.py
@@ -1,13 +1,13 @@
 """
 Memory 管理 API 路由
-提供记忆查看和删除功能
+提供记忆查看、添加和删除功能
 """
 
 import logging
-from typing import Optional, List, Dict, Any
+from typing import Literal, Optional, List, Dict, Any
 from fastapi import APIRouter, HTTPException, Header, Query
 from fastapi.responses import JSONResponse
-from pydantic import BaseModel
+from pydantic import BaseModel, Field
 
 logger = logging.getLogger('app')
 
@@ -33,6 +33,26 @@ class DeleteAllResponse(BaseModel):
     deleted_count: int
 
 
+class ConversationMessage(BaseModel):
+    """对话消息"""
+    role: Literal["user", "assistant"]
+    content: str = Field(..., min_length=1)
+
+
+class AddMemoryRequest(BaseModel):
+    """添加记忆的请求体"""
+    bot_id: str = Field(..., min_length=1)
+    user_id: str = Field(..., min_length=1)
+    messages: List[ConversationMessage] = Field(..., max_length=200)
+
+
+class AddMemoryResponse(BaseModel):
+    """添加记忆的响应"""
+    success: bool
+    pairs_processed: int
+    pairs_failed: int = 0
+
+
 async def get_user_identifier_from_request(
     authorization: Optional[str],
     user_id: Optional[str] = None
@@ -63,6 +83,92 @@ async def get_user_identifier_from_request(
     )
 
 
+@router.post("/memory", response_model=AddMemoryResponse)
+async def add_memory_from_conversation(data: AddMemoryRequest):
+    """
+    从对话消息中提取并保存记忆
+
+    将用户和助手的对话配对，通过 Mem0 提取关键事实并存储。
+    用于 realtime 语音对话等不经过 Agent 中间件的场景。
+    此端点供内部服务调用（如 felo-mygpt），不暴露给外部用户。
+    """
+    try:
+        from agent.mem0_manager import get_mem0_manager
+        from utils.settings import MEM0_ENABLED
+
+        if not MEM0_ENABLED:
+            raise HTTPException(
+                status_code=503,
+                detail="Memory feature is not enabled"
+            )
+
+        if not data.messages:
+            return AddMemoryResponse(success=True, pairs_processed=0)
+
+        manager = get_mem0_manager()
+
+        # 将消息配对为 user-assistant 对，然后调用 add_memory
+        pairs_processed = 0
+        pairs_failed = 0
+        i = 0
+        while i < len(data.messages):
+            msg = data.messages[i]
+            if msg.role == 'user':
+                # 收集连续的 user 消息
+                user_contents = [msg.content]
+                j = i + 1
+                while j < len(data.messages) and data.messages[j].role == 'user':
+                    user_contents.append(data.messages[j].content)
+                    j += 1
+
+                user_text = '\n'.join(user_contents)
+
+                # 检查是否有对应的 assistant 回复
+                assistant_text = ""
+                if j < len(data.messages) and data.messages[j].role == 'assistant':
+                    assistant_text = data.messages[j].content or ""
+                    j += 1
+
+                if user_text and assistant_text:
+                    conversation_text = f"User: {user_text}\nAssistant: {assistant_text}"
+                    try:
+                        await manager.add_memory(
+                            text=conversation_text,
+                            user_id=data.user_id,
+                            agent_id=data.bot_id,
+                            metadata={"type": "realtime_conversation"},
+                        )
+                        pairs_processed += 1
+                    except Exception as pair_error:
+                        pairs_failed += 1
+                        logger.error(
+                            f"Failed to add memory for pair: {pair_error}"
+                        )
+
+                i = j
+            else:
+                i += 1
+
+        logger.info(
+            f"Added {pairs_processed} memory pairs (failed={pairs_failed}) "
+            f"for user={data.user_id}, bot={data.bot_id}"
+        )
+        return AddMemoryResponse(
+            success=pairs_failed == 0,
+            pairs_processed=pairs_processed,
+            pairs_failed=pairs_failed,
+        )
+
+    except HTTPException:
+        raise
+    except Exception as e:
+        logger.error(f"Failed to add memory from conversation: {e}")
+        raise HTTPException(
+            status_code=500,
+            detail="Failed to add memory from conversation"
+        )
+
+
 @router.get("/memory", response_model=MemoryListResponse)
 async def get_memories(
     bot_id: str = Query(..., description="Bot ID (对应 agent_id)"),

From 513dda8bbb7640d7e8c974fc99e810bb803149e0 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E6=9C=B1=E6=BD=AE?= <zhuchaowe@users.noreply.github.com>
Date: Thu, 9 Apr 2026 15:02:28 +0800
Subject: [PATCH 5/5] =?UTF-8?q?=F0=9F=90=9B=20fix:=20=E4=BF=AE=E5=A4=8D=20?=
 =?UTF-8?q?GuidelineMiddleware=20=E5=AF=BC=E8=87=B4=20assistant=20message?=
 =?UTF-8?q?=20prefill=20=E6=8A=A5=E9=94=99?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

enable_thinking 开启时，thinking 中间件将 AIMessage 追加到 messages 末尾，
导致不支持 assistant prefill 的模型返回 400 错误。
修复方式：在 AIMessage 后追加多语言 HumanMessage，确保消息以 user 结尾。

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---
 agent/guideline_middleware.py | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/agent/guideline_middleware.py b/agent/guideline_middleware.py
index 9ddfb17..a9b2911 100644
--- a/agent/guideline_middleware.py
+++ b/agent/guideline_middleware.py
@@ -6,7 +6,7 @@ from utils.fastapi_utils import (extract_block_from_system_prompt, format_messag
 from langchain.chat_models import BaseChatModel
 from langgraph.runtime import Runtime
 
-from langchain_core.messages import SystemMessage
+from langchain_core.messages import SystemMessage, HumanMessage
 from typing import Any, Callable
 from langchain_core.callbacks import BaseCallbackHandler
 from langchain_core.outputs import LLMResult
@@ -124,10 +124,11 @@ Action: Provide concise, friendly, and personified natural responses.
         response.additional_kwargs["message_tag"] = "THINK"
         response.content = f"<think>{response.content}</think>"
 
-        # 将响应添加到原始消息列表
-        state['messages'] = state['messages'] + [response]
+        # 将响应添加到原始消息列表，并追加 HumanMessage 确保消息以 user 结尾
+        # 某些模型不支持 assistant message prefill，要求最后一条消息必须是 user
+        state['messages'] = state['messages'] + [response, HumanMessage(content=self._get_follow_up_prompt())]
         return state
-        
+
     async def abefore_agent(self, state: AgentState, runtime: Runtime) -> dict[str, Any] | None:
         if not self.guidelines:
             return None
@@ -148,10 +149,23 @@ Action: Provide concise, friendly, and personified natural responses.
         response.additional_kwargs["message_tag"] = "THINK"
         response.content = f"<think>{response.content}</think>"
 
-        # 将响应添加到原始消息列表
-        state['messages'] = state['messages'] + [response]
+        # 将响应添加到原始消息列表，并追加 HumanMessage 确保消息以 user 结尾
+        # 某些模型不支持 assistant message prefill，要求最后一条消息必须是 user
+        state['messages'] = state['messages'] + [response, HumanMessage(content=self._get_follow_up_prompt())]
         return state
 
+    def _get_follow_up_prompt(self) -> str:
+        """根据语言返回引导主 agent 回复的提示"""
+        prompts = {
+            "ja": "以上の分析に基づいて、ユーザーに返信してください。",
+            "jp": "以上の分析に基づいて、ユーザーに返信してください。",
+            "zh": "请根据以上分析，回复用户。",
+            "zh-TW": "請根據以上分析，回覆用戶。",
+            "ko": "위 분석을 바탕으로 사용자에게 답변해 주세요.",
+            "en": "Based on the above analysis, please respond to the user.",
+        }
+        return prompts.get(self.language, prompts["en"])
+
     def wrap_model_call(
         self,
         request: ModelRequest,