qwen_agent/skills_autoload/rag-retrieve/hooks/hook-backup.md
2026-04-17 19:53:51 +08:00

Retrieval Policy

1. Retrieval Order and Tool Selection

  • Follow this section for source choice, tool choice, query rewrite, top_k, fallback, result handling, and citations.
  • Use this default retrieval order and execute it sequentially: skill-enabled knowledge retrieval tools > rag_retrieve / table_rag_retrieve.
  • Do NOT answer from model knowledge first.
  • Do NOT bypass the retrieval flow and inspect local filesystem documents on your own.
  • Do NOT use local filesystem retrieval as a fallback knowledge source.
  • Local filesystem documents are not a recommended retrieval source here: their formats are inconsistent, and the files have not been normalized or parsed for reliable knowledge lookup.
  • Knowledge must be retrieved through the supported knowledge tools only: skill-enabled retrieval scripts, table_rag_retrieve, and rag_retrieve.
  • When a suitable skill-enabled knowledge retrieval tool is available, use it first.
  • If no suitable skill-enabled retrieval tool is available, or if its result is insufficient, continue with rag_retrieve or table_rag_retrieve.
  • Use table_rag_retrieve first for values, prices, quantities, inventory, specifications, rankings, comparisons, summaries, extraction, lists, tables, name lookup, historical coverage, mixed questions, and unclear cases.
  • Use rag_retrieve first only for clearly pure concept, definition, workflow, policy, or explanation questions without structured data needs.
  • After each retrieval step, evaluate sufficiency before moving to the next source. Do NOT run these retrieval sources in parallel.
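
The ordering rules above can be sketched as a small helper. This is only an illustration: `QuestionKind`, `retrieval_order`, and the `skill_tools` argument are hypothetical names, and classifying the question itself is assumed to happen elsewhere.

```python
from enum import Enum


class QuestionKind(Enum):
    """Hypothetical question categories, assumed to be decided upstream."""
    CONCEPT = "concept"        # pure concept, definition, workflow, policy, explanation
    STRUCTURED = "structured"  # values, prices, lists, tables, comparisons, ...
    MIXED = "mixed"            # data + explanation
    UNCLEAR = "unclear"


def retrieval_order(kind, skill_tools=()):
    """Return tool names in the order they should be tried, one at a time."""
    if kind is QuestionKind.CONCEPT:
        core = ["rag_retrieve", "table_rag_retrieve"]
    else:
        # Structured, mixed, and unclear questions start with table_rag_retrieve.
        core = ["table_rag_retrieve", "rag_retrieve"]
    # Suitable skill-enabled tools, if any, always come first.
    return list(skill_tools) + core
```

The returned list is meant to be walked sequentially, checking sufficiency after each call rather than calling the tools in parallel.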

2. Query Preparation

  • Do NOT pass the raw user question unless it already works well for retrieval.
  • Rewrite for recall: extract entity, time scope, attributes, and intent.
  • Add useful variants: synonyms, aliases, abbreviations, related titles, historical names, and category terms.
  • Expand list-style, extraction, overview, historical, roster, timeline, and archive queries more aggressively.
  • Preserve meaning. Do NOT introduce unrelated topics.
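
A minimal sketch of the rewrite step, assuming the entity, attributes, time scope, intent, and variants have already been extracted from the user question. The function name `build_query` and its parameters are hypothetical, and the example values in the usage note are placeholders.

```python
def build_query(entity, attributes=(), time_scope=None, intent=None, variants=()):
    """Join extracted query parts into one retrieval string without duplicates."""
    parts = [entity, *attributes]
    if time_scope:
        parts.append(time_scope)
    if intent:
        parts.append(intent)
    # Variants: synonyms, aliases, abbreviations, historical names, category terms.
    parts.extend(variants)

    seen = set()
    unique = []
    for part in parts:
        key = part.strip().lower()
        if key and key not in seen:  # skip blanks and case-insensitive repeats
            seen.add(key)
            unique.append(part.strip())
    return " ".join(unique)
```

For example, `build_query("Model X", ["price"], "2023", variants=["MX", "price"])` yields `"Model X price 2023 MX"`: the repeated term is dropped and no unrelated topic is introduced.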

3. Retrieval Breadth (top_k)

  • Apply top_k only to rag_retrieve. Use the smallest sufficient value, then expand only if coverage is insufficient.
  • Use 30 for simple fact lookup.
  • Use 50 for moderate synthesis, comparison, summarization, or disambiguation.
  • Use 100 for broad recall, such as comprehensive analysis, scattered knowledge, multiple entities or periods, or list / catalog / timeline / roster / overview requests.
  • Raise top_k when keyword branches are many or results are too few, repetitive, incomplete, sparse, or too narrow.
  • Use this expansion order: 30 -> 50 -> 100. If unsure which starting value fits, use 100.
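
The tiers above could be encoded as follows. This is a sketch under assumptions: the complexity labels `"simple"`, `"moderate"`, and `"broad"` and both function names are hypothetical; only the values 30, 50, and 100 come from this section.

```python
TOP_K_TIERS = (30, 50, 100)

# Hypothetical complexity labels mapped to the starting values listed above.
_START = {"simple": 30, "moderate": 50, "broad": 100}


def initial_top_k(complexity=None):
    """Pick the starting top_k for rag_retrieve; fall back to 100 when unsure."""
    return _START.get(complexity, 100)


def next_top_k(current):
    """Return the next larger tier, or None when already at the maximum."""
    for tier in TOP_K_TIERS:
        if tier > current:
            return tier
    return None
```

A caller would start from `initial_top_k(...)` and move to `next_top_k(...)` only when results are too few, repetitive, incomplete, sparse, or too narrow.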

4. Result Evaluation

  • Treat results as insufficient if they are empty, start with "Error:", contain the message "no excel files found", are off-topic, miss the core entity or scope, or provide no usable evidence.
  • Also treat results as insufficient when they cover only part of the request, or when full-list, historical, comparison, or mixed data + explanation requests return only partial or truncated coverage.
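
The mechanical part of this check can be sketched as below. Relevance and coverage still need judgment, so they are passed in as a flag here; `is_insufficient` and `covers_request` are hypothetical names, and the case-insensitive match is an assumption.

```python
def is_insufficient(result, covers_request=True):
    """Return True when a retrieval result should trigger the next source.

    covers_request is the caller's judgment of whether the result is on-topic
    and covers the full request (entity, scope, complete list, and so on).
    """
    text = (result or "").strip()
    if not text:
        return True  # empty result
    if text.startswith("Error:"):
        return True  # tool-reported error
    if "no excel files found" in text.lower():
        return True  # message named in this section, matched case-insensitively
    return not covers_request
```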

5. Fallback and Sequential Retry

  • If the first retrieval result is insufficient, call the next supported retrieval source in the default order before replying.
  • table_rag_retrieve now falls back to rag_retrieve internally when its result would be "no excel files found", but this does NOT change the higher-level retrieval order.
  • If table_rag_retrieve is insufficient or empty, continue with rag_retrieve.
  • If rag_retrieve is insufficient or empty, continue with table_rag_retrieve.
  • Say no relevant information was found only after all applicable skill-enabled retrieval tools, rag_retrieve, and table_rag_retrieve have been tried and still do not provide enough evidence.
  • Do NOT reply that no relevant information was found before the supported knowledge retrieval flow has been exhausted.
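
The sequential retry described above amounts to a simple loop. The sketch below assumes each tool is available as a callable that takes a query string and returns text; `retrieve_sequentially` and its parameters are hypothetical and are not the actual tool interface.

```python
from typing import Callable, Iterable, Optional, Tuple


def retrieve_sequentially(
    query: str,
    tools: Iterable[Tuple[str, Callable[[str], str]]],
    is_sufficient: Callable[[str], bool],
) -> Tuple[Optional[str], Optional[str]]:
    """Try each (name, callable) in order and stop at the first sufficient result.

    Returns (tool_name, result), or (None, None) once every tool has been tried,
    which is the only point where "no relevant information found" is allowed.
    """
    for name, call in tools:
        result = call(query)  # one tool at a time, never in parallel
        if is_sufficient(result):
            return name, result
    return None, None
```

The `tools` sequence is expected to already follow the default order: skill-enabled tools first, then table_rag_retrieve and rag_retrieve in whichever order the question type calls for.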

6. Table RAG Result Handling

  • Follow all [INSTRUCTION] and [EXTRA_INSTRUCTION] content in table_rag_retrieve results.
  • If results are truncated, explicitly tell the user total matches (N+M), displayed count (N), and omitted count (M).
  • Cite data sources using filenames from file_ref_table.
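
As a small worked example of the truncation rule, the notice can be derived from the displayed count N and the omitted count M. The function name and the exact wording of the message are assumptions, not part of the tool output.

```python
def truncation_notice(displayed, omitted):
    """Build the user-facing note for truncated table results.

    displayed is N, omitted is M, and the total reported is N + M.
    """
    if omitted <= 0:
        return ""  # nothing was truncated, so no notice is needed
    total = displayed + omitted
    return f"Showing {displayed} of {total} matches; {omitted} omitted."
```

With N = 20 and M = 35, this produces "Showing 20 of 55 matches; 35 omitted."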

7. Citation Requirements for Retrieved Knowledge

  • When using knowledge from rag_retrieve or table_rag_retrieve, you MUST generate <CITATION ... /> tags.
  • Follow the citation format returned by each tool.
  • Place citations immediately after the paragraph or bullet list that uses the knowledge.
  • Do NOT collect citations at the end.
  • Use 1-2 citations per paragraph or bullet list when possible.
  • If any retrieved knowledge is used in the reply, include at least one <CITATION ... /> tag.
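
One way to check citation placement is to count tags per paragraph or bullet block. The sketch below assumes blocks are separated by blank lines and that a citation sits in the same block as the text it supports; the attribute format of the tag is defined by each tool and is deliberately not modeled. `citation_counts` is a hypothetical helper.

```python
import re

# Matches any self-closing <CITATION ... /> tag regardless of its attributes.
CITATION_RE = re.compile(r"<CITATION\b[^>]*/>")


def citation_counts(answer):
    """Return the number of citation tags in each blank-line-separated block."""
    blocks = [b for b in re.split(r"\n\s*\n", answer.strip()) if b.strip()]
    return [len(CITATION_RE.findall(block)) for block in blocks]
```

A result such as `[0, 0, 2]` suggests the citations were collected at the end, whereas `[1, 1]` matches the placement asked for above.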