qwen_agent/skills/support/japanese-pii-redactor/SKILL.md

---
name: japanese-pii-redactor
description: Redact, anonymize, and de-identify personal information in Japanese-language or mixed-language text and tabular data while preserving analytical usefulness. Use this whenever users ask for PII redaction, PII scrub, de-identification, 個人情報匿名化, 匿名加工, 仮名化, 秘匿化, or マスキング; use it for executing anonymization rules, not for legal interpretation or general writing polish.
category: Compliance & Security
---

# Japanese PII Redactor

## Overview

Detect and redact personal information in Japanese-language or mixed-language content for safer sharing and analysis.

Common targets:
- Person names
- Phone numbers
- Email addresses
- Home/work addresses
- Account/member/employee identifiers
- Free-text notes containing identifiable details

## Triggering Cues

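Use this skill when a user message includes any of the following cues:

- Chinese cues: 脱敏 (desensitization), 匿名化 (anonymization), 个人信息 (personal information), 隐私遮蔽 (privacy masking), PII处理 (PII processing), 数据清洗 (data cleaning)
- Japanese cues: 個人情報 (personal information), 匿名化 (anonymization), マスキング (masking), 伏字 (blanked-out characters), 漏えい対策 (leak prevention), PII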
- English cues: redact PII, anonymize Japanese data, privacy masking

## Input Requirements

Ask for or infer:

1. Source text/table
2. Target output format (text/table/json)
3. Redaction strength (light/standard/strict)
4. Whether reversible pseudonyms are needed
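
Item 4 affects what has to be stored alongside the redacted output. As a minimal sketch, assuming bracketed tokens such as `[NAME_001]` (a hypothetical format, not one this skill prescribes), a reversible mapping could be kept in a small lookup object. `PseudonymVault`, `token_for`, and `restore` are illustrative names only.

```python
from dataclasses import dataclass, field


@dataclass
class PseudonymVault:
    """Hypothetical helper: maps original values to tokens and back."""

    forward: dict = field(default_factory=dict)  # (kind, original) -> token
    reverse: dict = field(default_factory=dict)  # token -> original

    def token_for(self, kind: str, original: str) -> str:
        key = (kind, original)
        if key not in self.forward:
            # Number tokens per kind: NAME_001, NAME_002, ...
            n = sum(1 for k in self.forward if k[0] == kind) + 1
            token = f"[{kind}_{n:03d}]"
            self.forward[key] = token
            self.reverse[token] = original
        return self.forward[key]

    def restore(self, token: str) -> str:
        # Raises KeyError for tokens this vault never issued.
        return self.reverse[token]


# Synthetic names, used only for illustration.
vault = PseudonymVault()
first = vault.token_for("NAME", "山田太郎")
second = vault.token_for("NAME", "佐藤花子")
again = vault.token_for("NAME", "山田太郎")  # same entity, same token
```

If reversible pseudonyms are not needed, the mapping can simply be discarded after redaction. If they are, the mapping itself holds the raw originals and would need to be stored separately from the redacted output.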

## Output Format

Always output:

1. **Redacted Data**
2. **Redaction Rules Applied**
3. **Fields Preserved vs Masked**
4. **Residual Risk Notes**

For the **Redaction Rules Applied** section, use this schema:

| Field Type | Detection Pattern | Redaction Method | Example |
|------------|-------------------|------------------|---------|
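
To illustrate how the schema above might be filled in, here is a small sketch that renders rule records as rows of that table. The `Rule` tuple, the `render_rules_table` function, and the sample row are assumptions made for this example; the row uses a synthetic email address.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """Hypothetical record with one field per column of the schema."""

    field_type: str
    detection_pattern: str
    redaction_method: str
    example: str


def render_rules_table(rules):
    """Render rules as a Markdown table using the schema's header."""
    lines = [
        "| Field Type | Detection Pattern | Redaction Method | Example |",
        "|------------|-------------------|------------------|---------|",
    ]
    for rule in rules:
        # Escape pipes so a regex like a|b does not break the table layout.
        cells = [cell.replace("|", "\\|") for cell in rule]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


table = render_rules_table([
    Rule("Email", "local@domain regex", "Replace with token",
         "taro@example.com → [EMAIL_001]"),
])
print(table)
```

The Example column should show only synthetic values, never text taken from the data being redacted.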

## Workflow

1. Detect direct identifiers first (email/phone/account IDs).
2. Detect contextual identifiers (address/detail combinations).
3. Apply consistent masking policy across the dataset.
4. Keep analytical utility while minimizing re-identification risk.
5. Report what was masked and why.
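
Steps 1, 3, and 5 can be sketched with regular expressions, as below. This is a simplified illustration, not a complete detector: the two patterns, the token format, and the `redact` function are assumptions made for this example, and contextual identifiers from step 2 (names, addresses, detail combinations) are out of scope because they generally need dictionary or model-based detection.

```python
import re

# Simplified patterns, assumed for this sketch only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Hyphen-separated numbers starting with 0, e.g. 03-1234-5678, 090-1234-5678.
PHONE_RE = re.compile(r"0\d{1,4}-\d{1,4}-\d{4}")


def redact(text):
    """Replace emails and phone numbers with consistent tokens.

    Returns the redacted text and a list of (kind, match_count) pairs
    that can feed the report of what was masked.
    """
    tokens = {}    # (kind, original) -> token
    counters = {}  # kind -> number of distinct values seen
    applied = []

    def replace_all(kind, pattern, source):
        def substitute(match):
            key = (kind, match.group(0))
            if key not in tokens:
                counters[kind] = counters.get(kind, 0) + 1
                tokens[key] = f"[{kind}_{counters[kind]:03d}]"
            return tokens[key]

        result, count = pattern.subn(substitute, source)
        if count:
            applied.append((kind, count))
        return result

    # Direct identifiers first (step 1), same policy everywhere (step 3).
    text = replace_all("EMAIL", EMAIL_RE, text)
    text = replace_all("PHONE", PHONE_RE, text)
    return text, applied


# Synthetic sample text.
sample = "電話は090-1234-5678、メールはtaro@example.comです。折り返しは090-1234-5678まで。"
redacted, report = redact(sample)
print(redacted)
print(report)
```

The repeated phone number receives the same token both times, which keeps the timeline readable for analysis. These patterns do not match the full-width `＠` or full-width hyphens that are common in Japanese text; normalizing the input with `unicodedata.normalize("NFKC", text)` before matching is one option, at the cost of altering other characters such as half-width katakana.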

## Examples

### Example 1
Input:
- Japanese customer-support conversation logs that need to be shared with an external analytics team.

Output style:
- Replace identifiers with neutral tokens
- Preserve issue semantics and timeline

### Example 2
Input:
- Employee roster (name, email, phone, home address, employee ID).

Output style:
- Table output with masked fields and preserved non-sensitive columns
- Explicit rule list for audit traceability

## Guidelines

- Prefer consistency: the same entity should map to the same token within one output.
- Never expose raw originals in the final output.
- Mark uncertain detections as "Needs Manual Review".
- State that redaction reduces risk but does not guarantee zero re-identification risk.
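
The rule against exposing raw originals can be backed by a simple check before the output is returned. The sketch below assumes the detection step produced a list of the original values it found; `find_leaks` is a hypothetical helper name.

```python
def find_leaks(redacted_output, originals):
    """Return the original values that still appear verbatim in the output."""
    return [value for value in originals if value and value in redacted_output]


# Synthetic values for illustration.
detected = ["090-1234-5678", "taro@example.com"]
clean = "連絡先は[PHONE_001]、メールは[EMAIL_001]です。"
leaky = "連絡先は[PHONE_001]、メールはtaro@example.comです。"

print(find_leaks(clean, detected))
print(find_leaks(leaky, detected))
```

This catches only verbatim survivors. A value that reappears in another form, such as a phone number written without hyphens, would pass the check, so it complements the residual risk notes rather than replacing them.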