Update several policy.md files

朱潮 2026-05-12 10:05:24 +08:00
parent 65db950aa8
commit be96f240b3
6 changed files with 151 additions and 122 deletions

View File

@@ -14,7 +14,7 @@ For knowledge retrieval tasks, **this policy overrides generic codebase explorat
 - **Prohibited answer source**: the model's own parametric knowledge, memory, prior world knowledge, intuition, common sense completion, or unsupported inference.
 - **Prohibited tools**: `Glob`, `Read`, `LS`, Bash (`ls`, `find`, `cat`, `head`, `tail`, `grep`, etc.) — these are forbidden even when retrieval results are empty/insufficient, even if local files seem helpful.
-- **Allowed tools only**: skill-enabled retrieval tools, `table_rag_retrieve`, `rag_retrieve`. No other source for factual answering.
+- **Allowed tools only**: skill-enabled retrieval tools, `rag_retrieve`. No other source for factual answering.
 - Local filesystem is a **prohibited** knowledge source, not merely non-recommended.
 - Exception: user explicitly asks to read a specific local file as the task itself.
 - If retrieval evidence is absent, insufficient, or ambiguous, **do not fill the gap with model knowledge**.
@@ -35,9 +35,7 @@ For any knowledge retrieval task:
 Execute **sequentially, one at a time**. Do NOT run in parallel. Do NOT probe filesystem first.
 1. **Skill-enabled retrieval tools** (use first when available)
-2. **`table_rag_retrieve`** or **`rag_retrieve`**:
-- Prefer `table_rag_retrieve` for: values, prices, quantities, specs, rankings, comparisons, lists, tables, name lookup, historical coverage, mixed/unclear cases.
-- Prefer `rag_retrieve` for: pure concept, definition, workflow, policy, or explanation questions only.
+2. **`rag_retrieve`**
 - After each step, evaluate sufficiency before proceeding.
 - Retrieval must happen **before** any factual answer generation.
@@ -50,27 +48,35 @@ Execute **sequentially, one at a time**. Do NOT run in parallel. Do NOT probe fi
 ## 5. Retrieval Breadth (`top_k`)
-- Apply `top_k` only to `rag_retrieve`. Use smallest sufficient value, expand if insufficient.
-- `30` for simple fact lookup → `50` for moderate synthesis/comparison → `100` for broad recall (comprehensive analysis, scattered knowledge, multi-entity, list/catalog/timeline).
-- Expansion order: `30 → 50 → 100`. If unsure, use `100`.
+- Apply `top_k` only to `rag_retrieve`. Choose the appropriate value upfront to maximize first-call success.
+- Use `50` for simple fact lookup or moderate synthesis, comparison, summarization, disambiguation.
+- Use `100` for broad recall (comprehensive analysis, scattered knowledge, multi-entity, list/catalog/timeline).
+- If unsure, use `50`. Only escalate to `100` on the retry call if first results are insufficient.
 ## 6. Result Evaluation
-Treat as insufficient if: empty, `Error:`, `no excel files found`, off-topic, missing core entity/scope, no usable evidence, partial coverage, truncated results, or claims required by the answer are not explicitly supported.
+**Maximum 3 retrieval calls per question.** After each call, evaluate immediately:
+### Sufficient — answer now
+- The core entity/topic in the user's question has been hit.
+- There is direct evidence supporting the main intent of the question.
+- Partial but usable coverage is sufficient — you do NOT need exhaustive or perfect coverage to answer.
+- **When results are sufficient, compose the answer immediately. Do NOT call retrieval again to "double-check" or "get more context".**
+### Insufficient — retry
+- Empty, `Error:`, off-topic, missing core entity/scope, no usable evidence at all, or claims required by the answer are not explicitly supported.
 ## 7. Fallback and Sequential Retry
-On insufficient results, follow this sequence:
-1. Rewrite query, retry same tool (once)
-2. Switch to next retrieval source in default order
-3. For `rag_retrieve`, expand `top_k`: `30 → 50 → 100`
-4. `table_rag_retrieve` insufficient → try `rag_retrieve`; `rag_retrieve` insufficient → try `table_rag_retrieve`
-- `table_rag_retrieve` internally falls back to `rag_retrieve` on `no excel files found`, but this does NOT change the higher-level order.
-- Say "no relevant information was found" **only after** exhausting all retrieval sources.
+On insufficient results, you may retry **up to 2 more times** (3 calls total):
+1. Rewrite query, retry same tool.
+2. For `rag_retrieve`, escalate `top_k` to `100` on retry.
+- Say "no relevant information was found" **only after** exhausting all retries.
 - Do NOT switch to local filesystem inspection at any point.
 - Do NOT switch to model self-knowledge at any point.
+- Do NOT call any retrieval tool more than 3 times in total.
 ## 8. Handling Missing or Partial Evidence
@@ -79,13 +85,7 @@ On insufficient results, follow this sequence:
 - Prefer "the retrieved materials do not provide this information" over speculative completion.
 - When user asks for a definitive answer but evidence is incomplete, state the limitation directly.
-## 9. Table RAG Result Handling
-- Follow all `[INSTRUCTION]` and `[EXTRA_INSTRUCTION]` in results.
-- If truncated: tell user total (`N+M`), displayed (`N`), omitted (`M`).
-- Cite sources using filenames from `file_ref_table`.
-## 10. Image Handling
+## 9. Image Handling
 - The content returned by the `rag_retrieve` tool may include images.
 - Each image is exclusively associated with its nearest text or sentence.
@@ -94,14 +94,7 @@ On insufficient results, follow this sequence:
 - Each sentence or key point in the response should be accompanied by relevant images when they meet the established association criteria.
 - Avoid placing all images at the end of the response.
-## 11. Citation Requirements
-- MUST generate `<CITATION ... />` tags when using retrieval results.
-- Place citations immediately after the paragraph or bullet list using the knowledge. Do NOT collect at end.
-- 1-2 citations per paragraph/bullet. At least 1 citation when using retrieved knowledge.
-- Do NOT cite claims that were not supported by retrieval.
-## 12. Self-Knowledge Prohibition
+## 10. Self-Knowledge Prohibition
 This section applies whenever self-knowledge is disabled or forbidden for the current task.
@@ -111,19 +104,18 @@ This section applies whenever self-knowledge is disabled or forbidden for the cu
 - The model must not supplement missing parts with general knowledge, conceptual explanation, common background, intuition, or likely completion.
 - The model must not use self-knowledge to invent or complete private, internal, current, precise, or source-sensitive facts.
 - The model must not use self-knowledge to invent or complete prices, fees, discounts, rankings, internal policies, user-specific details, current status, latest updates, exact numbers, dates, metrics, or specifications.
-- Retrieved facts must include citations.
 - Unsupported parts must be stated as unavailable rather than guessed.
 - If a paragraph would mix retrieved facts and unsupported completion, remove the unsupported completion.
 - If evidence is incomplete, state the limitation explicitly.
-## 13. Pre-Reply Self-Check
+## 11. Pre-Reply Self-Check
 Before replying to a knowledge retrieval task, verify:
 - Used only whitelisted retrieval tools — no local filesystem inspection?
+- Called retrieval at most 3 times total (not more)?
+- Answered immediately when results were sufficient (did NOT call again unnecessarily)?
 - Did retrieval happen before any factual answer drafting?
 - Did every factual claim come from retrieved evidence rather than model knowledge?
-- Exhausted retrieval flow before concluding "not found"?
-- Citations placed immediately after each relevant paragraph?
 - If any unsupported part remained, was it removed or explicitly marked unavailable?
 If any answer is "no", correct the process first.
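
The revised flow in this file (pick `top_k` upfront, at most three retrieval calls, escalate to `100` on retry) could be sketched roughly as below. This is an illustration under stated assumptions, not the project's implementation: `retrieve` stands in for `rag_retrieve` with an assumed `(query, top_k)` signature, and `choose_top_k`, `is_sufficient`, and `rewrite_query` are hypothetical helpers.

```python
from typing import Callable, Optional

MAX_CALLS = 3  # policy: "Maximum 3 retrieval calls per question"


def choose_top_k(broad_recall: bool) -> int:
    """50 by default (also when unsure); 100 only for broad-recall questions."""
    return 100 if broad_recall else 50


def is_sufficient(result: str) -> bool:
    """Hypothetical stand-in covering only the mechanical checks (empty, `Error:`)."""
    text = result.strip()
    return bool(text) and not text.startswith("Error:")


def retrieve_with_retry(
    query: str,
    retrieve: Callable[[str, int], str],   # assumed signature for rag_retrieve
    rewrite_query: Callable[[str], str],   # hypothetical query rewriter
    broad_recall: bool = False,
) -> Optional[str]:
    top_k = choose_top_k(broad_recall)
    for attempt in range(1, MAX_CALLS + 1):
        result = retrieve(query, top_k)
        if is_sufficient(result):
            return result              # answer now; no extra "double-check" call
        if attempt < MAX_CALLS:
            query = rewrite_query(query)
            top_k = 100                # escalate on the retry call
    return None  # caller states that no relevant information was found
```

Returning `None` after the third call leaves the "no relevant information was found" wording to the caller, so the loop itself never falls back to the filesystem or to model knowledge.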

View File

@@ -50,27 +50,38 @@ Execute **sequentially, one at a time**. Do NOT run in parallel. Do NOT probe fi
 ## 5. Retrieval Breadth (`top_k`)
-- Apply `top_k` only to `rag_retrieve`. Use smallest sufficient value, expand if insufficient.
-- `30` for simple fact lookup → `50` for moderate synthesis/comparison → `100` for broad recall (comprehensive analysis, scattered knowledge, multi-entity, list/catalog/timeline).
-- Expansion order: `30 → 50 → 100`. If unsure, use `100`.
+- Apply `top_k` only to `rag_retrieve`. Choose the appropriate value upfront to maximize first-call success.
+- Use `50` for simple fact lookup or moderate synthesis, comparison, summarization, disambiguation.
+- Use `100` for broad recall (comprehensive analysis, scattered knowledge, multi-entity, list/catalog/timeline).
+- If unsure, use `50`. Only escalate to `100` on the retry call if first results are insufficient.
 ## 6. Result Evaluation
-Treat as insufficient if: empty, `Error:`, `no excel files found`, off-topic, missing core entity/scope, no usable evidence, partial coverage, truncated results, or claims required by the answer are not explicitly supported.
+**Maximum 3 retrieval calls per question.** After each call, evaluate immediately:
+### Sufficient — answer now
+- The core entity/topic in the user's question has been hit.
+- There is direct evidence supporting the main intent of the question.
+- Partial but usable coverage is sufficient — you do NOT need exhaustive or perfect coverage to answer.
+- **When results are sufficient, compose the answer immediately. Do NOT call retrieval again to "double-check" or "get more context".**
+### Insufficient — retry
+- Empty, `Error:`, `no excel files found`, off-topic, missing core entity/scope, no usable evidence at all, or claims required by the answer are not explicitly supported.
 ## 7. Fallback and Sequential Retry
-On insufficient results, follow this sequence:
-1. Rewrite query, retry same tool (once)
-2. Switch to next retrieval source in default order
-3. For `rag_retrieve`, expand `top_k`: `30 → 50 → 100`
-4. `table_rag_retrieve` insufficient → try `rag_retrieve`; `rag_retrieve` insufficient → try `table_rag_retrieve`
+On insufficient results, you may retry **up to 2 more times** (3 calls total):
+1. Rewrite query, retry same tool.
+2. Switch to next retrieval source in default order.
+3. For `rag_retrieve`, escalate `top_k` to `100` on retry.
+4. `table_rag_retrieve` insufficient → try `rag_retrieve`; `rag_retrieve` insufficient → try `table_rag_retrieve`.
 - `table_rag_retrieve` internally falls back to `rag_retrieve` on `no excel files found`, but this does NOT change the higher-level order.
-- Say "no relevant information was found" **only after** exhausting all retrieval sources.
+- Say "no relevant information was found" **only after** exhausting all retries.
 - Do NOT switch to local filesystem inspection at any point.
 - Do NOT switch to model self-knowledge at any point.
+- Do NOT call any retrieval tool more than 3 times in total.
 ## 8. Handling Missing or Partial Evidence
@@ -83,7 +94,6 @@ On insufficient results, follow this sequence:
 - Follow all `[INSTRUCTION]` and `[EXTRA_INSTRUCTION]` in results.
 - If truncated: tell user total (`N+M`), displayed (`N`), omitted (`M`).
-- Cite sources using filenames from `file_ref_table`.
 ## 10. Image Handling
@@ -94,14 +104,7 @@ On insufficient results, follow this sequence:
 - Each sentence or key point in the response should be accompanied by relevant images when they meet the established association criteria.
 - Avoid placing all images at the end of the response.
-## 11. Citation Requirements
-- MUST generate `<CITATION ... />` tags when using retrieval results.
-- Place citations immediately after the paragraph or bullet list using the knowledge. Do NOT collect at end.
-- 1-2 citations per paragraph/bullet. At least 1 citation when using retrieved knowledge.
-- Do NOT cite claims that were not supported by retrieval.
-## 12. Self-Knowledge Prohibition
+## 11. Self-Knowledge Prohibition
 This section applies whenever self-knowledge is disabled or forbidden for the current task.
@@ -111,19 +114,18 @@ This section applies whenever self-knowledge is disabled or forbidden for the cu
 - The model must not supplement missing parts with general knowledge, conceptual explanation, common background, intuition, or likely completion.
 - The model must not use self-knowledge to invent or complete private, internal, current, precise, or source-sensitive facts.
 - The model must not use self-knowledge to invent or complete prices, fees, discounts, rankings, internal policies, user-specific details, current status, latest updates, exact numbers, dates, metrics, or specifications.
-- Retrieved facts must include citations.
 - Unsupported parts must be stated as unavailable rather than guessed.
 - If a paragraph would mix retrieved facts and unsupported completion, remove the unsupported completion.
 - If evidence is incomplete, state the limitation explicitly.
-## 13. Pre-Reply Self-Check
+## 12. Pre-Reply Self-Check
 Before replying to a knowledge retrieval task, verify:
 - Used only whitelisted retrieval tools — no local filesystem inspection?
+- Called retrieval at most 3 times total (not more)?
+- Answered immediately when results were sufficient (did NOT call again unnecessarily)?
 - Did retrieval happen before any factual answer drafting?
 - Did every factual claim come from retrieved evidence rather than model knowledge?
-- Exhausted retrieval flow before concluding "not found"?
-- Citations placed immediately after each relevant paragraph?
 - If any unsupported part remained, was it removed or explicitly marked unavailable?
 If any answer is "no", correct the process first.
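
For this variant, which keeps `no excel files found` in the insufficient list, the mechanical part of the sufficiency check might look like the sketch below. It assumes retrieval results arrive as plain strings and that failures are reported with these literal markers. `needs_retry` and the substring test for the core entity are hypothetical simplifications, since judging "direct evidence" still needs the model's own reading of the result.

```python
def needs_retry(result: str, core_entity: str) -> bool:
    """Return True when a retrieval result is clearly insufficient.

    Assumption: tools report failures as text starting with `Error:` or
    containing `no excel files found`; the entity check is a crude proxy
    for "the core entity/topic in the user's question has been hit".
    """
    text = result.strip()
    if not text:
        return True                               # empty result
    if text.startswith("Error:"):
        return True                               # tool-level error
    if "no excel files found" in text:
        return True                               # table retrieval had nothing to search
    return core_entity.lower() not in text.lower()  # core entity missing
```

Partial but usable coverage passes this check on purpose, matching the rule that exhaustive coverage is not required before answering.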

View File

@@ -37,8 +37,8 @@ Execute **sequentially, one at a time**. Do NOT run in parallel. Do NOT probe fi
 1. **Skill-enabled retrieval tools** (use first when available)
 2. **`rag_retrieve`**
-- After each step, evaluate sufficiency before proceeding.
 - Retrieval must happen **before** any factual answer generation.
+- After each step, evaluate sufficiency before proceeding.
 ## 4. Query Preparation
@@ -48,25 +48,35 @@ Execute **sequentially, one at a time**. Do NOT run in parallel. Do NOT probe fi
 ## 5. Retrieval Breadth (`top_k`)
-- Apply `top_k` only to `rag_retrieve`. Use smallest sufficient value, expand if insufficient.
-- `30` for simple fact lookup → `50` for moderate synthesis/comparison → `100` for broad recall (comprehensive analysis, scattered knowledge, multi-entity, list/catalog/timeline).
-- Expansion order: `30 → 50 → 100`. If unsure, use `100`.
+- Apply `top_k` only to `rag_retrieve`. Choose the appropriate value upfront to maximize first-call success.
+- Use `50` for simple fact lookup or moderate synthesis, comparison, summarization, disambiguation.
+- Use `100` for broad recall (comprehensive analysis, scattered knowledge, multi-entity, list/catalog/timeline).
+- If unsure, use `50`. Only escalate to `100` on the retry call if first results are insufficient.
 ## 6. Result Evaluation
-Treat as insufficient if: empty, `Error:`, off-topic, missing core entity/scope, no usable evidence, partial coverage, truncated results, or claims required by the answer are not explicitly supported.
+**Maximum 3 retrieval calls per question.** After each call, evaluate immediately:
+### Sufficient — answer now
+- The core entity/topic in the user's question has been hit.
+- There is direct evidence supporting the main intent of the question.
+- Partial but usable coverage is sufficient — you do NOT need exhaustive or perfect coverage to answer.
+- **When results are sufficient, compose the answer immediately. Do NOT call retrieval again to "double-check" or "get more context".**
+### Insufficient — retry
+- Empty, `Error:`, off-topic, missing core entity/scope, no usable evidence at all, or claims required by the answer are not explicitly supported.
 ## 7. Fallback and Sequential Retry
-On insufficient results, follow this sequence:
-1. Rewrite query, retry same tool (once)
-2. Switch to next retrieval source in default order
-3. For `rag_retrieve`, expand `top_k`: `30 → 50 → 100`
-- Say "no relevant information was found" **only after** exhausting all retrieval sources.
+On insufficient results, you may retry **up to 2 more times** (3 calls total):
+1. Rewrite query, retry same tool.
+2. For `rag_retrieve`, escalate `top_k` to `100` on retry.
+- Say "no relevant information was found" **only after** exhausting all retries.
 - Do NOT switch to local filesystem inspection at any point.
 - Do NOT switch to model self-knowledge at any point.
+- Do NOT call any retrieval tool more than 3 times in total.
 ## 8. Handling Missing or Partial Evidence
@@ -84,7 +94,6 @@ On insufficient results, follow this sequence:
 - Each sentence or key point in the response should be accompanied by relevant images when they meet the established association criteria.
 - Avoid placing all images at the end of the response.
 ## 10. Self-Knowledge Prohibition
 This section applies whenever self-knowledge is disabled or forbidden for the current task.
@@ -103,9 +112,10 @@ This section applies whenever self-knowledge is disabled or forbidden for the cu
 Before replying to a knowledge retrieval task, verify:
 - Used only whitelisted retrieval tools — no local filesystem inspection?
+- Called retrieval at most 3 times total (not more)?
+- Answered immediately when results were sufficient (did NOT call again unnecessarily)?
 - Did retrieval happen before any factual answer drafting?
 - Did every factual claim come from retrieved evidence rather than model knowledge?
-- Exhausted retrieval flow before concluding "not found"?
 - If any unsupported part remained, was it removed or explicitly marked unavailable?
 If any answer is "no", correct the process first.
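
The updated pre-reply checklist in this file reads naturally as a set of yes/no flags. A small sketch of that idea follows; the `PreReplyCheck` class, its field names, and `failed_checks` are hypothetical and only mirror the six questions above.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class PreReplyCheck:
    """One boolean per checklist question (field names are illustrative)."""
    only_whitelisted_tools: bool
    at_most_three_calls: bool
    answered_once_sufficient: bool
    retrieval_before_drafting: bool
    claims_from_evidence: bool
    unsupported_removed_or_marked: bool


def failed_checks(check: PreReplyCheck) -> List[str]:
    """Names of the questions answered "no"; an empty list means the reply can go out."""
    return [f.name for f in fields(check) if not getattr(check, f.name)]
```

A non-empty result corresponds to the closing rule: correct the process first, then reply.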

View File

@@ -35,24 +35,34 @@ Execute **sequentially, one at a time**. Do NOT run in parallel. Do NOT probe fi
 ## 4. Retrieval Breadth (`top_k`)
-- Apply `top_k` only to `rag_retrieve`. Use smallest sufficient value, expand if insufficient.
-- `30` for simple fact lookup → `50` for moderate synthesis/comparison → `100` for broad recall (comprehensive analysis, scattered knowledge, multi-entity, list/catalog/timeline).
-- Expansion order: `30 → 50 → 100`. If unsure, use `100`.
+- Apply `top_k` only to `rag_retrieve`. Choose the appropriate value upfront to maximize first-call success.
+- Use `50` for simple fact lookup or moderate synthesis, comparison, summarization, disambiguation.
+- Use `100` for broad recall (comprehensive analysis, scattered knowledge, multi-entity, list/catalog/timeline).
+- If unsure, use `50`. Only escalate to `100` on the retry call if first results are insufficient.
 ## 5. Result Evaluation
-Treat as insufficient if: empty, `Error:`, off-topic, missing core entity/scope, no usable evidence, partial coverage, or truncated results.
+**Maximum 3 retrieval calls per question.** After each call, evaluate immediately:
+### Sufficient — answer now
+- The core entity/topic in the user's question has been hit.
+- There is direct evidence supporting the main intent of the question.
+- Partial but usable coverage is sufficient — you do NOT need exhaustive or perfect coverage to answer.
+- **When results are sufficient, compose the answer immediately. Do NOT call retrieval again to "double-check" or "get more context".**
+### Insufficient — retry
+- Empty, `Error:`, off-topic, missing core entity/scope, no usable evidence at all.
 ## 6. Fallback and Sequential Retry
-On insufficient results, follow this sequence:
-1. Rewrite query, retry same tool (once)
-2. Switch to next retrieval source in default order
-3. For `rag_retrieve`, expand `top_k`: `30 → 50 → 100`
-- Say "no relevant information was found" **only after** exhausting all retrieval sources.
+On insufficient results, you may retry **up to 2 more times** (3 calls total):
+1. Rewrite query, retry same tool.
+2. For `rag_retrieve`, escalate `top_k` to `100` on retry.
+- Say "no relevant information was found" **only after** exhausting all retries.
 - Do NOT switch to local filesystem inspection at any point.
+- Do NOT call any retrieval tool more than 3 times in total.
 ## 7. Image Handling
@@ -81,7 +91,8 @@ This section applies only when self-knowledge is enabled.
 Before replying to a knowledge retrieval task, verify:
 - Used only whitelisted retrieval tools — no local filesystem inspection?
-- Exhausted retrieval flow before concluding "not found"?
+- Called retrieval at most 3 times total (not more)?
+- Answered immediately when results were sufficient (did NOT call again unnecessarily)?
 - If self-knowledge was used, was it clearly separated from retrieved facts and limited to allowed supplement scope?
 If any answer is "no", correct the process first.

View File

@@ -48,25 +48,35 @@ Execute **sequentially, one at a time**. Do NOT run in parallel. Do NOT probe fi
 ## 5. Retrieval Breadth (`top_k`)
-- Apply `top_k` only to `rag_retrieve`. Use smallest sufficient value, expand if insufficient.
-- `30` for simple fact lookup → `50` for moderate synthesis/comparison → `100` for broad recall (comprehensive analysis, scattered knowledge, multi-entity, list/catalog/timeline).
-- Expansion order: `30 → 50 → 100`. If unsure, use `100`.
+- Apply `top_k` only to `rag_retrieve`. Choose the appropriate value upfront to maximize first-call success.
+- Use `50` for simple fact lookup or moderate synthesis, comparison, summarization, disambiguation.
+- Use `100` for broad recall (comprehensive analysis, scattered knowledge, multi-entity, list/catalog/timeline).
+- If unsure, use `50`. Only escalate to `100` on the retry call if first results are insufficient.
 ## 6. Result Evaluation
-Treat as insufficient if: empty, `Error:`, off-topic, missing core entity/scope, no usable evidence, partial coverage, truncated results, or claims required by the answer are not explicitly supported.
+**Maximum 3 retrieval calls per question.** After each call, evaluate immediately:
+### Sufficient — answer now
+- The core entity/topic in the user's question has been hit.
+- There is direct evidence supporting the main intent of the question.
+- Partial but usable coverage is sufficient — you do NOT need exhaustive or perfect coverage to answer.
+- **When results are sufficient, compose the answer immediately. Do NOT call retrieval again to "double-check" or "get more context".**
+### Insufficient — retry
+- Empty, `Error:`, off-topic, missing core entity/scope, no usable evidence at all, or claims required by the answer are not explicitly supported.
 ## 7. Fallback and Sequential Retry
-On insufficient results, follow this sequence:
-1. Rewrite query, retry same tool (once)
-2. Switch to next retrieval source in default order
-3. For `rag_retrieve`, expand `top_k`: `30 → 50 → 100`
-- Say "no relevant information was found" **only after** exhausting all retrieval sources.
+On insufficient results, you may retry **up to 2 more times** (3 calls total):
+1. Rewrite query, retry same tool.
+2. For `rag_retrieve`, escalate `top_k` to `100` on retry.
+- Say "no relevant information was found" **only after** exhausting all retries.
 - Do NOT switch to local filesystem inspection at any point.
 - Do NOT switch to model self-knowledge at any point.
+- Do NOT call any retrieval tool more than 3 times in total.
 ## 8. Handling Missing or Partial Evidence
@@ -84,14 +94,7 @@ On insufficient results, follow this sequence:
 - Each sentence or key point in the response should be accompanied by relevant images when they meet the established association criteria.
 - Avoid placing all images at the end of the response.
-## 10. Citation Requirements
-- MUST generate `<CITATION ... />` tags when using retrieval results.
-- Place citations immediately after the paragraph or bullet list using the knowledge. Do NOT collect at end.
-- 1-2 citations per paragraph/bullet. At least 1 citation when using retrieved knowledge.
-- Do NOT cite claims that were not supported by retrieval.
-## 11. Self-Knowledge Prohibition
+## 10. Self-Knowledge Prohibition
 This section applies whenever self-knowledge is disabled or forbidden for the current task.
@@ -101,19 +104,18 @@ This section applies whenever self-knowledge is disabled or forbidden for the cu
 - The model must not supplement missing parts with general knowledge, conceptual explanation, common background, intuition, or likely completion.
 - The model must not use self-knowledge to invent or complete private, internal, current, precise, or source-sensitive facts.
 - The model must not use self-knowledge to invent or complete prices, fees, discounts, rankings, internal policies, user-specific details, current status, latest updates, exact numbers, dates, metrics, or specifications.
-- Retrieved facts must include citations.
 - Unsupported parts must be stated as unavailable rather than guessed.
 - If a paragraph would mix retrieved facts and unsupported completion, remove the unsupported completion.
 - If evidence is incomplete, state the limitation explicitly.
-## 12. Pre-Reply Self-Check
+## 11. Pre-Reply Self-Check
 Before replying to a knowledge retrieval task, verify:
 - Used only whitelisted retrieval tools — no local filesystem inspection?
+- Called retrieval at most 3 times total (not more)?
+- Answered immediately when results were sufficient (did NOT call again unnecessarily)?
 - Did retrieval happen before any factual answer drafting?
 - Did every factual claim come from retrieved evidence rather than model knowledge?
-- Exhausted retrieval flow before concluding "not found"?
-- Citations placed immediately after each relevant paragraph?
 - If any unsupported part remained, was it removed or explicitly marked unavailable?
 If any answer is "no", correct the process first.


@ -14,7 +14,7 @@ For knowledge retrieval tasks, **this policy overrides generic codebase explorat
- **Prohibited answer source**: the model's own parametric knowledge, memory, prior world knowledge, intuition, common sense completion, or unsupported inference. - **Prohibited answer source**: the model's own parametric knowledge, memory, prior world knowledge, intuition, common sense completion, or unsupported inference.
- **Prohibited tools**: `Glob`, `Read`, `LS`, Bash (`ls`, `find`, `cat`, `head`, `tail`, `grep`, etc.) — these are forbidden even when retrieval results are empty/insufficient, even if local files seem helpful. - **Prohibited tools**: `Glob`, `Read`, `LS`, Bash (`ls`, `find`, `cat`, `head`, `tail`, `grep`, etc.) — these are forbidden even when retrieval results are empty/insufficient, even if local files seem helpful.
- **Allowed tools only**: skill-enabled retrieval tools, `rag_retrieve`. No other source for factual answering. - **Allowed tools only**: skill-enabled retrieval tools, `table_rag_retrieve`, `rag_retrieve`. No other source for factual answering.
- Local filesystem is a **prohibited** knowledge source, not merely non-recommended. - Local filesystem is a **prohibited** knowledge source, not merely non-recommended.
- Exception: user explicitly asks to read a specific local file as the task itself. - Exception: user explicitly asks to read a specific local file as the task itself.
- If retrieval evidence is absent, insufficient, or ambiguous, **do not fill the gap with model knowledge**. - If retrieval evidence is absent, insufficient, or ambiguous, **do not fill the gap with model knowledge**.
@ -35,7 +35,9 @@ For any knowledge retrieval task:
Execute **sequentially, one at a time**. Do NOT run in parallel. Do NOT probe filesystem first. Execute **sequentially, one at a time**. Do NOT run in parallel. Do NOT probe filesystem first.
1. **Skill-enabled retrieval tools** (use first when available) 1. **Skill-enabled retrieval tools** (use first when available)
2. **`rag_retrieve`** 2. **`table_rag_retrieve`** or **`rag_retrieve`**:
- Prefer `table_rag_retrieve` for: values, prices, quantities, specs, rankings, comparisons, lists, tables, name lookup, historical coverage, mixed/unclear cases.
- Prefer `rag_retrieve` for: pure concept, definition, workflow, policy, or explanation questions only.
- After each step, evaluate sufficiency before proceeding. - After each step, evaluate sufficiency before proceeding.
- Retrieval must happen **before** any factual answer generation. - Retrieval must happen **before** any factual answer generation.
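As a rough illustration of the tool preference above, the choice between `table_rag_retrieve` and `rag_retrieve` can be sketched as a small lookup. The `question_kind` labels and the `choose_tool` helper are hypothetical and only serve this example; the policy does not define how a question is classified, and skill-enabled retrieval tools still come first when available.

```python
# Hypothetical labels for "pure" conceptual questions (not defined by the policy).
CONCEPTUAL_KINDS = {"concept", "definition", "workflow", "policy", "explanation"}


def choose_tool(question_kind: str) -> str:
    """Return the preferred retrieval tool name for a classified question.

    Anything that is not purely conceptual (values, prices, lists, tables,
    mixed or unclear cases) goes to table_rag_retrieve.
    """
    if question_kind in CONCEPTUAL_KINDS:
        return "rag_retrieve"
    return "table_rag_retrieve"
```

Defaulting to `table_rag_retrieve` mirrors the rule that mixed or unclear cases prefer the table tool.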
@ -48,25 +50,38 @@ Execute **sequentially, one at a time**. Do NOT run in parallel. Do NOT probe fi
## 5. Retrieval Breadth (`top_k`) ## 5. Retrieval Breadth (`top_k`)
- Apply `top_k` only to `rag_retrieve`. Use smallest sufficient value, expand if insufficient. - Apply `top_k` only to `rag_retrieve`. Choose the appropriate value upfront to maximize first-call success.
- `30` for simple fact lookup → `50` for moderate synthesis/comparison → `100` for broad recall (comprehensive analysis, scattered knowledge, multi-entity, list/catalog/timeline). - Use `50` for simple fact lookup or moderate synthesis, comparison, summarization, disambiguation.
- Expansion order: `30 → 50 → 100`. If unsure, use `100`. - Use `100` for broad recall (comprehensive analysis, scattered knowledge, multi-entity, list/catalog/timeline).
- If unsure, use `50`. Only escalate to `100` on the retry call if first results are insufficient.
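The `top_k` rule above can be sketched as follows. This is a minimal illustration, assuming the caller has already classified the question; the `QueryScope` enum and `choose_top_k` helper are hypothetical names, not part of any retrieval tool.

```python
from enum import Enum


class QueryScope(Enum):
    """Hypothetical classification of a question's retrieval breadth."""
    SIMPLE = "simple"      # simple fact lookup
    MODERATE = "moderate"  # synthesis, comparison, summarization, disambiguation
    BROAD = "broad"        # comprehensive analysis, multi-entity, list/catalog/timeline
    UNSURE = "unsure"


def choose_top_k(scope: QueryScope, is_retry: bool = False) -> int:
    """Pick top_k for rag_retrieve: 100 for broad recall or on retry, else 50."""
    if scope is QueryScope.BROAD or is_retry:
        return 100
    return 50
```

For example, `choose_top_k(QueryScope.UNSURE)` gives `50`, and the same call with `is_retry=True` gives `100`.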
## 6. Result Evaluation ## 6. Result Evaluation
Treat as insufficient if: empty, `Error:`, off-topic, missing core entity/scope, no usable evidence, partial coverage, truncated results, or claims required by the answer are not explicitly supported. **Maximum 3 retrieval calls per question.** After each call, evaluate immediately:
### Sufficient — answer now
- The core entity/topic in the user's question has been hit.
- There is direct evidence supporting the main intent of the question.
- Partial but usable coverage is sufficient — you do NOT need exhaustive or perfect coverage to answer.
- **When results are sufficient, compose the answer immediately. Do NOT call retrieval again to "double-check" or "get more context".**
### Insufficient — retry
- Empty, `Error:`, `no excel files found`, off-topic, missing core entity/scope, no usable evidence at all, or claims required by the answer are not explicitly supported.
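One possible way to express the sufficiency check is sketched below. Only the string markers (`Error:`, `no excel files found`) come from the policy; the `is_sufficient` helper and its two boolean inputs are assumptions, since judging whether the core entity was hit or whether direct evidence exists is left to the model.

```python
def is_sufficient(result: str, core_entity_hit: bool, has_direct_evidence: bool) -> bool:
    """Return True when a retrieval result is good enough to answer from.

    The two boolean flags are hypothetical inputs standing in for the
    model's own judgment of relevance and evidence.
    """
    text = result.strip()
    if not text:
        return False  # empty result
    if text.startswith("Error:"):
        return False  # tool reported an error
    if "no excel files found" in text.lower():
        return False  # table retrieval had nothing to search
    # Partial but usable coverage counts as sufficient.
    return core_entity_hit and has_direct_evidence
```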
## 7. Fallback and Sequential Retry ## 7. Fallback and Sequential Retry
On insufficient results, follow this sequence: On insufficient results, you may retry **up to 2 more times** (3 calls total):
1. Rewrite query, retry same tool (once) 1. Rewrite query, retry same tool.
2. Switch to next retrieval source in default order 2. Switch to next retrieval source in default order.
3. For `rag_retrieve`, expand `top_k`: `30 → 50 → 100` 3. For `rag_retrieve`, escalate `top_k` to `100` on retry.
4. `table_rag_retrieve` insufficient → try `rag_retrieve`; `rag_retrieve` insufficient → try `table_rag_retrieve`.
- Say "no relevant information was found" **only after** exhausting all retrieval sources. - `table_rag_retrieve` internally falls back to `rag_retrieve` on `no excel files found`, but this does NOT change the higher-level order.
- Say "no relevant information was found" **only after** exhausting all retries.
- Do NOT switch to local filesystem inspection at any point. - Do NOT switch to local filesystem inspection at any point.
- Do NOT switch to model self-knowledge at any point. - Do NOT switch to model self-knowledge at any point.
- Do NOT call any retrieval tool more than 3 times in total.
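The retry budget can be sketched as a sequential loop that stops at the first sufficient result and never exceeds three calls. This is an illustration under stated assumptions: each attempt is a hypothetical zero-argument callable (for example a rewritten query, a switch of tool, or a `top_k` of `100`), and the sufficiency check is supplied by the caller.

```python
from typing import Callable, Optional, Sequence

MAX_CALLS = 3  # one initial call plus up to two retries

Attempt = Callable[[], str]  # hypothetical: runs one retrieval call, returns its text


def retrieve_with_retries(
    attempts: Sequence[Attempt],
    is_sufficient: Callable[[str], bool],
) -> Optional[str]:
    """Run attempts one at a time and return the first sufficient result.

    Returns None when the budget is exhausted, which is the only point at
    which "no relevant information was found" may be reported.
    """
    for attempt in list(attempts)[:MAX_CALLS]:
        result = attempt()
        if is_sufficient(result):
            return result  # answer now; do not call again to double-check
    return None
```

Passing the attempts as an ordered sequence keeps the calls strictly sequential, and slicing to `MAX_CALLS` enforces the cap even if more attempts are supplied.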
## 8. Handling Missing or Partial Evidence ## 8. Handling Missing or Partial Evidence
@ -75,7 +90,12 @@ On insufficient results, follow this sequence:
- Prefer "the retrieved materials do not provide this information" over speculative completion. - Prefer "the retrieved materials do not provide this information" over speculative completion.
- When user asks for a definitive answer but evidence is incomplete, state the limitation directly. - When user asks for a definitive answer but evidence is incomplete, state the limitation directly.
## 9. Image Handling ## 9. Table RAG Result Handling
- Follow all `[INSTRUCTION]` and `[EXTRA_INSTRUCTION]` in results.
- If truncated: tell user total (`N+M`), displayed (`N`), omitted (`M`).
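The truncation notice can be sketched as a small formatter. The wording of the message and the `truncation_notice` helper are assumptions for illustration; the policy only requires that the total, displayed, and omitted counts be reported.

```python
def truncation_notice(displayed: int, omitted: int) -> str:
    """Build a user-facing note for a truncated table result.

    displayed is N, omitted is M, and the total is N + M.
    """
    total = displayed + omitted
    return f"Showing {displayed} of {total} rows; {omitted} omitted."
```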
## 10. Image Handling
- The content returned by the `rag_retrieve` tool may include images. - The content returned by the `rag_retrieve` tool may include images.
- Each image is exclusively associated with its nearest text or sentence. - Each image is exclusively associated with its nearest text or sentence.
@ -84,13 +104,6 @@ On insufficient results, follow this sequence:
- Each sentence or key point in the response should be accompanied by relevant images when they meet the established association criteria. - Each sentence or key point in the response should be accompanied by relevant images when they meet the established association criteria.
- Avoid placing all images at the end of the response. - Avoid placing all images at the end of the response.
## 10. Citation Requirements
- MUST generate `<CITATION ... />` tags when using retrieval results.
- Place citations immediately after the paragraph or bullet list using the knowledge. Do NOT collect at end.
- 1-2 citations per paragraph/bullet. At least 1 citation when using retrieved knowledge.
- Do NOT cite claims that were not supported by retrieval.
## 11. Self-Knowledge Prohibition ## 11. Self-Knowledge Prohibition
This section applies whenever self-knowledge is disabled or forbidden for the current task. This section applies whenever self-knowledge is disabled or forbidden for the current task.
@ -101,7 +114,6 @@ This section applies whenever self-knowledge is disabled or forbidden for the cu
- The model must not supplement missing parts with general knowledge, conceptual explanation, common background, intuition, or likely completion. - The model must not supplement missing parts with general knowledge, conceptual explanation, common background, intuition, or likely completion.
- The model must not use self-knowledge to invent or complete private, internal, current, precise, or source-sensitive facts. - The model must not use self-knowledge to invent or complete private, internal, current, precise, or source-sensitive facts.
- The model must not use self-knowledge to invent or complete prices, fees, discounts, rankings, internal policies, user-specific details, current status, latest updates, exact numbers, dates, metrics, or specifications. - The model must not use self-knowledge to invent or complete prices, fees, discounts, rankings, internal policies, user-specific details, current status, latest updates, exact numbers, dates, metrics, or specifications.
- Retrieved facts must include citations.
- Unsupported parts must be stated as unavailable rather than guessed. - Unsupported parts must be stated as unavailable rather than guessed.
- If a paragraph would mix retrieved facts and unsupported completion, remove the unsupported completion. - If a paragraph would mix retrieved facts and unsupported completion, remove the unsupported completion.
- If evidence is incomplete, state the limitation explicitly. - If evidence is incomplete, state the limitation explicitly.
@ -110,10 +122,10 @@ This section applies whenever self-knowledge is disabled or forbidden for the cu
Before replying to a knowledge retrieval task, verify: Before replying to a knowledge retrieval task, verify:
- Used only whitelisted retrieval tools — no local filesystem inspection? - Used only whitelisted retrieval tools — no local filesystem inspection?
- Called retrieval at most 3 times total (not more)?
- Answered immediately when results were sufficient (did NOT call again unnecessarily)?
- Did retrieval happen before any factual answer drafting? - Did retrieval happen before any factual answer drafting?
- Did every factual claim come from retrieved evidence rather than model knowledge? - Did every factual claim come from retrieved evidence rather than model knowledge?
- Exhausted retrieval flow before concluding "not found"?
- Citations placed immediately after each relevant paragraph?
- If any unsupported part remained, was it removed or explicitly marked unavailable? - If any unsupported part remained, was it removed or explicitly marked unavailable?
If any answer is "no", correct the process first. If any answer is "no", correct the process first.