linshen 2cdd095c2b docs(workspace): consolidate compare evaluation specs and acceptance evidence
- fold earlier planning notes into a single current-spec and archived history structure
- keep manual acceptance steps and real API samples aligned with the refactored analysis/result/compare model
- retain supporting workspace notes needed to review version-selection and evaluation behavior changes
2026-03-18 09:35:44 +08:00

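{
  "type": "compare",
  "evaluationModelKey": "dashscope",
  "mode": {
    "functionMode": "basic",
    "subMode": "system"
  },
  "focus": {
    "content": "Prioritize comparing which system prompt more reliably calms the user's emotions and outputs a risk level",
    "source": "user",
    "priority": "highest"
  },
  "target": {
    "workspacePrompt": "You are a customer service assistant. First determine the issue type, then assess the risk level, and provide a suggested reply in a reassuring tone. The output format is fixed as: issue type, risk level, suggested reply.",
    "referencePrompt": "You are an assistant."
  },
  "testCases": [
    {
      "id": "shared-system-focus-test-case",
      "label": "Test content",
      "input": {
        "kind": "text",
        "label": "Test content",
        "content": "The user says: My order still hasn't shipped after more than a week, and I'm very anxious."
      }
    }
  ],
  "snapshots": [
    {
      "id": "a",
      "label": "A",
      "testCaseId": "shared-system-focus-test-case",
      "promptRef": {
        "kind": "original",
        "label": "Original"
      },
      "promptText": "You are an assistant.",
      "output": "Please wait patiently.",
      "modelKey": "siliconflow",
      "versionLabel": "Original"
    },
    {
      "id": "b",
      "label": "B",
      "testCaseId": "shared-system-focus-test-case",
      "promptRef": {
        "kind": "workspace",
        "label": "Workspace"
      },
      "promptText": "You are a customer service assistant. First determine the issue type, then assess the risk level, and provide a suggested reply in a reassuring tone. The output format is fixed as: issue type, risk level, suggested reply.",
      "output": "Issue type: Shipping delay\nRisk level: Medium\nSuggested reply: We are very sorry to have kept you waiting. We understand that you are very anxious right now, and we will immediately check the shipping progress for you and prioritize the follow-up.",
      "modelKey": "dashscope",
      "versionLabel": "Workspace"
    }
  ]
}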