linshen 2cdd095c2b docs(workspace): consolidate compare evaluation specs and acceptance evidence
- fold earlier planning notes into a single current-spec and archived history structure
- keep manual acceptance steps and real API samples aligned with the refactored analysis/result/compare model
- retain supporting workspace notes needed to review version-selection and evaluation behavior changes
2026-03-18 09:35:44 +08:00

```json
{
"score": {
"overall": 70,
"dimensions": [
{ "key": "goalClarity", "label": "Goal clarity", "score": 85 },
{ "key": "instructionCompleteness", "label": "Instruction completeness", "score": 60 },
{ "key": "structuralExecutability", "label": "Structural executability", "score": 75 },
{ "key": "ambiguityControl", "label": "Ambiguity control", "score": 70 },
{ "key": "robustness", "label": "Robustness", "score": 65 }
]
},
"improvements": [
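"State 'do not output the reasoning process' more explicitly and emphatically in the prompt, for example by using bold text or repeating the constraint to strengthen it.",
"Split the classification structure into clearer sub-steps (for example, first categorize the issue type, then assess the risk level) so the model does not skip key steps because the order is ambiguous."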
],
"patchPlan": [
{
"op": "replace",
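"oldText": "Before answering, first determine the user's issue type, then output: issue category, risk level, suggested reply.",
"newText": "Before answering, output only the following: 1. Issue category 2. Risk level 3. Suggested reply. Do not include any additional explanation or reasoning process.",
"instruction": "The original sentence does not explicitly forbid outputting the reasoning process, and the model may conflate classification with risk assessment. The revision improves executability and robustness by using explicit steps and a stronger constraint."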
},
{
"op": "insert",
"oldText": "",
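"newText": "Important: strictly follow the format requirements. Output only the results, and do not add any explanation, commentary, or reasoning process.",
"instruction": "Reinforces the 'do not output the reasoning process' requirement so the model understands it is a hard rule that must not be violated."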
}
],
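"summary": "The current system prompt largely meets the needs of the quality-inspection task, but its 'do not output the reasoning process' constraint is not strong enough, and the classification structure may cause the risk level to be overlooked."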
}
```
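
The `patchPlan` entries above read like text edits against the prompt under evaluation. The sketch below shows one way such a plan might be applied. It is not taken from the workspace code: the function name `apply_patch_plan` and the `PatchOp` type are hypothetical, and the handling of `insert` (appending `newText` on a new line when `oldText` is empty) is an assumption, since the sample does not say where inserted text should go.

```python
from typing import TypedDict


class PatchOp(TypedDict):
    """Hypothetical shape of one patchPlan entry, mirroring the sample's keys."""
    op: str
    oldText: str
    newText: str
    instruction: str


def apply_patch_plan(prompt: str, plan: list[PatchOp]) -> str:
    """Apply each patch step to the prompt in order and return the result."""
    for step in plan:
        if step["op"] == "replace":
            # Fail loudly if the text to replace is not present in the prompt.
            if step["oldText"] not in prompt:
                raise ValueError(f"oldText not found: {step['oldText']!r}")
            # Replace only the first occurrence.
            prompt = prompt.replace(step["oldText"], step["newText"], 1)
        elif step["op"] == "insert":
            # Assumption: with no anchor given, append newText on its own line.
            prompt = prompt.rstrip("\n") + "\n" + step["newText"]
        else:
            raise ValueError(f"unsupported op: {step['op']!r}")
    return prompt
```

Raising on a missing `oldText` rather than skipping the step is a design choice in this sketch: a silent no-op would make it hard to tell whether a suggested patch was ever applied.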