---
name: japanese-pii-redactor
description: Redact, anonymize, and de-identify personal information in Japanese-language or mixed-language text and tabular data while preserving analytical usefulness. Use this whenever users ask for PII redaction, PII scrub, de-identification, 個人情報匿名化, 匿名加工, 仮名化, 秘匿化, or マスキング; use it for executing anonymization rules, not for legal interpretation or general writing polish.
category: Compliance & Security
---

# Japanese PII Redactor

## Overview

Detect and redact personal information in Japanese-language or mixed-language content for safer sharing and analysis.

Common targets:
- Person names
- Phone numbers
- Email addresses
- Home/work addresses
- Account/member/employee identifiers
- Free-text notes containing identifiable details

## Triggering Cues

Use this skill when user messages include:

- Chinese cues: 脱敏、匿名化、个人信息、隐私遮蔽、PII处理、数据清洗
- Japanese cues: 個人情報、匿名化、マスキング、伏字、漏えい対策、PII
- English cues: redact PII, anonymize Japanese data, privacy masking

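
The cue lists above can be checked with a plain substring match. This is a minimal sketch for illustration only: `CUES` and `matched_cues` are hypothetical names, and naive substring matching is an assumption that will produce false positives on short cues such as `PII`.

```python
# Cue strings copied from the lists above; the grouping keys are illustrative.
CUES = {
    "zh": ["脱敏", "匿名化", "个人信息", "隐私遮蔽", "PII处理", "数据清洗"],
    "ja": ["個人情報", "匿名化", "マスキング", "伏字", "漏えい対策", "PII"],
    "en": ["redact PII", "anonymize Japanese data", "privacy masking"],
}


def matched_cues(message: str) -> list[str]:
    """Return every cue found in the message, case-insensitively, without duplicates."""
    lowered = message.casefold()
    hits: list[str] = []
    for cues in CUES.values():
        for cue in cues:
            # "匿名化" appears in two lists, so skip cues already recorded.
            if cue.casefold() in lowered and cue not in hits:
                hits.append(cue)
    return hits
```

A non-empty result would suggest the skill applies; an empty one does not rule it out, since users may phrase the request differently.
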
## Input Requirements

Ask for or infer:

1. Source text/table
2. Target output format (text/table/json)
3. Redaction strength (light/standard/strict)
4. Whether reversible pseudonyms are needed

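
The four inputs above could be captured in a small request object. This is a sketch, not part of the skill itself: `RedactionRequest`, `parse_request`, the enum names, and the defaults (text output, standard strength, no reversible pseudonyms) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class OutputFormat(Enum):
    TEXT = "text"
    TABLE = "table"
    JSON = "json"


class Strength(Enum):
    LIGHT = "light"
    STANDARD = "standard"
    STRICT = "strict"


@dataclass(frozen=True)
class RedactionRequest:
    """Hypothetical container for the four inputs listed above."""
    source: str                                      # 1. source text or table
    output_format: OutputFormat = OutputFormat.TEXT  # 2. assumed default
    strength: Strength = Strength.STANDARD           # 3. assumed default
    reversible: bool = False                         # 4. assumed default


def parse_request(source: str, output_format: str = "text",
                  strength: str = "standard",
                  reversible: bool = False) -> RedactionRequest:
    """Validate user-supplied strings; unknown values raise ValueError."""
    return RedactionRequest(
        source=source,
        output_format=OutputFormat(output_format.lower()),
        strength=Strength(strength.lower()),
        reversible=reversible,
    )
```

Rejecting unknown values early keeps a misspelled strength from silently falling back to a weaker setting.
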
## Output Format

Always output:

1. **Redacted Data**
2. **Redaction Rules Applied**
3. **Fields Preserved vs Masked**
4. **Residual Risk Notes**

For the Redaction Rules Applied section, use this schema:

| Field Type | Detection Pattern | Redaction Method | Example |
|------------|-------------------|------------------|---------|

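
One way to fill the schema above programmatically is sketched below. `RedactionRule` and `render_rules_table` are hypothetical names, and the sample row is invented for illustration. Pipes are escaped because regex alternation in a detection pattern would otherwise break the Markdown table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedactionRule:
    """One row of the rules table; field names mirror the four schema columns."""
    field_type: str
    detection_pattern: str
    redaction_method: str
    example: str


def _escape(cell: str) -> str:
    # A literal "|" would be read as a column separator, so escape it.
    return cell.replace("|", "\\|")


def render_rules_table(rules: list[RedactionRule]) -> str:
    """Render rules as a Markdown table using the schema's header and divider."""
    header = "| Field Type | Detection Pattern | Redaction Method | Example |"
    divider = "|------------|-------------------|------------------|---------|"
    rows = [
        "| " + " | ".join(_escape(c) for c in (
            r.field_type, r.detection_pattern, r.redaction_method, r.example
        )) + " |"
        for r in rules
    ]
    return "\n".join([header, divider, *rows])


# Illustrative row only; the token format is an assumption.
print(render_rules_table([
    RedactionRule("Email", "local@domain regex", "Replace with token",
                  "taro@example.com -> [EMAIL_1]"),
]))
```
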
## Workflow

1. Detect direct identifiers first (email/phone/account IDs).
2. Detect contextual identifiers (address/detail combinations).
3. Apply consistent masking policy across the dataset.
4. Keep analytical utility while minimizing re-identification risk.
5. Report what was masked and why.

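
Steps 1 and 5 of the workflow above can be sketched with regular expressions. This is an illustrative outline under stated assumptions, not a complete detector: the patterns are deliberately narrow, the `ID` shape is hypothetical, and names and addresses (step 2) need context-aware detection that regexes alone do not provide. NFKC normalization is used here so that full-width digits, `@`, and hyphens match the ASCII patterns.

```python
import re
import unicodedata

# Assumed, deliberately narrow patterns; real data needs broader coverage.
# ASCII lookarounds replace \b because Japanese text has no spaces and
# kana/kanji count as word characters in Python's Unicode regex mode.
DIRECT_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # Hyphenated Japanese-style numbers such as 03-1234-5678 or 090-1234-5678.
    "PHONE": re.compile(r"(?<![0-9])0[0-9]{1,4}-[0-9]{1,4}-[0-9]{4}(?![0-9])"),
    # Hypothetical ID shape: 2-4 capital letters, optional hyphen, 4-8 digits.
    "ID": re.compile(r"(?<![A-Za-z0-9])[A-Z]{2,4}-?[0-9]{4,8}(?![A-Za-z0-9])"),
}


def redact_direct(text: str) -> tuple[str, list[dict]]:
    """Mask direct identifiers and return (redacted_text, report)."""
    # NFKC folds full-width digits, '@', and '-' to ASCII. It also normalizes
    # the rest of the text (for example half-width katakana).
    redacted = unicodedata.normalize("NFKC", text)
    report = []
    for label, pattern in DIRECT_PATTERNS.items():
        redacted, count = pattern.subn(f"[{label}]", redacted)
        if count:
            report.append({"field_type": label, "masked": count,
                           "reason": "direct identifier"})
    return redacted, report


sample = "連絡先は090-1234-5678、メールはtaro@example.comです。社員番号EMP-00123"
redacted, report = redact_direct(sample)
print(redacted)  # 連絡先は[PHONE]、メールは[EMAIL]です。社員番号[ID]
```

The flat `[PHONE]` placeholders lose the distinction between different people; numbered tokens would be needed wherever the analysis depends on telling entities apart.
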
## Examples

### Example 1
Input:
- Japanese customer-support conversation logs that need to be shared with an external analytics team.

Output style:
- Replace identifiers with neutral tokens
- Preserve issue semantics and timeline

### Example 2
Input:
- Employee roster (name, email, phone, address, employee ID).

Output style:
- Table output with masked fields and preserved non-sensitive columns
- Explicit rule list for audit traceability

## Guidelines

- Prefer consistency: the same entity should map to the same token within one output.
- Never expose raw originals in the final output.
- Mark uncertain detections as "Needs Manual Review".
- State that redaction reduces risk but does not guarantee zero re-identification risk.
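
The consistency guideline above can be sketched as a small mapper that hands out numbered tokens. `TokenMapper` and the `[TYPE_n]` token format are assumptions for illustration. The reverse map is only relevant when reversible pseudonyms were requested, and because it holds raw originals it must be kept out of the final output.

```python
from collections import defaultdict


class TokenMapper:
    """Assign stable tokens such as [PERSON_1] so one entity keeps one token."""

    def __init__(self) -> None:
        self._tokens: dict[tuple[str, str], str] = {}
        self._counters: defaultdict[str, int] = defaultdict(int)

    def token_for(self, field_type: str, value: str) -> str:
        """Return the existing token for this value, or mint the next one."""
        key = (field_type, value)
        if key not in self._tokens:
            self._counters[field_type] += 1
            self._tokens[key] = f"[{field_type}_{self._counters[field_type]}]"
        return self._tokens[key]

    def reverse_map(self) -> dict[str, str]:
        """Token -> original value. Contains raw data: store it separately."""
        return {token: value for (_, value), token in self._tokens.items()}


mapper = TokenMapper()
first = mapper.token_for("PERSON", "山田太郎")
again = mapper.token_for("PERSON", "山田太郎")
print(first, again)  # [PERSON_1] [PERSON_1]
```

Exact string matching treats spelling variants of one name (for example with and without a space between family and given name) as different entities, so values would need normalizing before lookup.
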