Add Image Handling section

2026-04-19 14:20:03 +08:00 · 2026-04-19 14:20:03 +08:00 · 52a042092a
commit 52a042092a
parent b2b323e631
2 changed files with 39 additions and 9 deletions
--- a/skills/linggan/ragflow-loader/hooks/pre_prompt.py
+++ b/skills/linggan/ragflow-loader/hooks/pre_prompt.py
@@ -1,19 +1,18 @@
 #!/usr/bin/env python3
 """
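-PrePrompt Hook - user context loader example
+PreMemoryPrompt Hook - user context loader example

-Runs when system_prompt is loaded; can dynamically inject user-related information into the prompt.
+Runs when the memory-extraction prompt (FACT_RETRIEVAL_PROMPT) is loaded and
+reads memory_prompt.md from the same directory as the custom memory-extraction prompt template.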
 """
 import sys
+from pathlib import Path
+

 def main():
-
-    context_info = f"""# rag_retrieve Guidelines
- **Knowledge Base First**: For user inquiries about products, policies, troubleshooting, factual questions, etc., prioritize querying the `rag_retrieve` knowledge base. Use other tools only if no results are found.
- **Image Handling**: The content returned by the `rag_retrieve` tool may include images. Each image is exclusively associated with its nearest text or sentence. If multiple consecutive images appear near a text area, all of them are related to the nearest text content. Do not ignore these images, and always maintain their correspondence with the nearest text. Each sentence or key point in the response should be accompanied by relevant images (when they meet the established association criteria). Avoid placing all images at the end of the response.
- **Citation Requirement (RAG Only)**: When answering questions based on `rag_retrieve` tool results, you MUST add XML citation tags for factual claims derived from the knowledge base.
-"""
-    print(context_info)
+    prompt_file = Path(__file__).parent / "retrieval-policy.md"
+    if prompt_file.exists():
+        print(prompt_file.read_text(encoding="utf-8"))
     return 0


--- a/skills/linggan/ragflow-loader/hooks/retrieval-policy.md
+++ b/skills/linggan/ragflow-loader/hooks/retrieval-policy.md
@@ -0,0 +1,31 @@
+# Retrieval Policy
+
+- `rag_retrieve` is the only knowledge source.
+- Do NOT answer from model knowledge first.
+
+## 1.Query Preparation
+- Do NOT pass the raw user question unless it already works well for retrieval.
+- Rewrite for recall: extract entity, time scope, attributes, and intent.
+- Add useful variants: synonyms, aliases, abbreviations, related titles, historical names, and category terms.
+- Expand list-style, extraction, overview, historical, roster, timeline, and archive queries more aggressively.
+- Preserve meaning. Do NOT introduce unrelated topics.
+
+## 2.Retrieval Breadth (`top_k`)
+- Apply `top_k` only to `rag_retrieve`. Use the smallest sufficient value, then expand only if coverage is insufficient.
+- Use `30` for simple fact lookup.
+- Use `50` for moderate synthesis, comparison, summarization, or disambiguation.
+- Use `100` for broad recall, such as comprehensive analysis, scattered knowledge, multiple entities or periods, or list / catalog / timeline / roster / overview requests.
+- Raise `top_k` when keyword branches are many or results are too few, repetitive, incomplete, sparse, or too narrow.
+- Use this expansion order: `30 -> 50 -> 100`. If unsure, use `100`.
+
+## 3.Retry
+- If the result is insufficient, retry `rag_retrieve` with a better rewritten query or a larger `top_k`.
+- Only say no relevant information was found after `rag_retrieve` has been tried and still provides insufficient evidence.
+
+## 4.Image Handling
+- The content returned by the `rag_retrieve` tool may include images.
+- Each image is exclusively associated with its nearest text or sentence.
+- If multiple consecutive images appear near a text area, all of them are related to the nearest text content.
+- Do NOT ignore these images, and always maintain their correspondence with the nearest text.
+- Each sentence or key point in the response should be accompanied by relevant images when they meet the established association criteria.
+- Avoid placing all images at the end of the response.
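
For reviewers: the `top_k` expansion order (`30 -> 50 -> 100`) and the retry rule in `retrieval-policy.md` could be sketched in Python roughly as follows. This is only an illustration of the policy text, not code from this commit. The `retrieve` and `is_sufficient` callables, the `retrieve_with_escalation` helper, and the `fake_retrieve` stand-in are all hypothetical names; the real `rag_retrieve` tool interface is not shown in the diff.

```python
from typing import Callable, List, Sequence

# Expansion order taken from the policy text: 30 -> 50 -> 100.
TOP_K_LADDER: Sequence[int] = (30, 50, 100)


def retrieve_with_escalation(
    query: str,
    retrieve: Callable[[str, int], List[str]],
    is_sufficient: Callable[[List[str]], bool],
    start_top_k: int = 30,
) -> List[str]:
    """Call `retrieve` with a growing top_k until the results look sufficient.

    `retrieve` and `is_sufficient` are hypothetical callables supplied by
    the caller; they do not describe the real rag_retrieve interface.
    """
    results: List[str] = []
    for top_k in TOP_K_LADDER:
        if top_k < start_top_k:
            continue  # skip rungs below the chosen starting point
        results = retrieve(query, top_k)
        if is_sufficient(results):
            break  # smallest sufficient value reached, stop expanding
    # The results may still be insufficient after the largest top_k;
    # the caller decides whether to rewrite the query or report no match.
    return results


def fake_retrieve(query: str, top_k: int) -> List[str]:
    """Stand-in for the real tool: returns top_k // 10 dummy chunks."""
    return [f"{query}#{i}" for i in range(top_k // 10)]
```

With the stand-in above, `retrieve_with_escalation("refund policy", fake_retrieve, lambda r: len(r) >= 5)` stops at `top_k=50`, because the first rung returns only three chunks. Passing `start_top_k=100` mirrors the "if unsure, use `100`" rule by skipping the smaller rungs.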