linshen 2cdd095c2b docs(workspace): consolidate compare evaluation specs and acceptance evidence
- fold earlier planning notes into a single current-spec and archived history structure
- keep manual acceptance steps and real API samples aligned with the refactored analysis/result/compare model
- retain supporting workspace notes needed to review version-selection and evaluation behavior changes
2026-03-18 09:35:44 +08:00

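{
  "type": "compare",
  "evaluationModelKey": "dashscope",
  "mode": {
    "functionMode": "pro",
    "subMode": "multi"
  },
  "focus": {
    "content": "Prioritize judging whether the system message actually prompts the assistant to clarify first",
    "source": "user",
    "priority": "highest"
  },
  "target": {
    "workspacePrompt": "As a system message, require the assistant to clarify the user's goal first, then give advice, and not rush to answer.",
    "referencePrompt": "As a system message, give advice"
  },
  "testCases": [
    {
      "id": "shared-conversation-test-case",
      "label": "Conversation Snapshot",
      "input": {
        "kind": "conversation",
        "label": "Conversation Snapshot",
        "summary": "The target message is marked with “[See the snapshot below for the current execution prompt]”; the actual content is in the execution prompt below.",
        "content": "system: [See the snapshot below for the current execution prompt]\nuser: I want to build a note-taking system for my team."
      }
    }
  ],
  "snapshots": [
    {
      "id": "a",
      "label": "A",
      "testCaseId": "shared-conversation-test-case",
      "promptRef": {
        "kind": "original",
        "label": "Original"
      },
      "promptText": "As a system message, give advice",
      "output": "I suggest you just go with Notion.",
      "reasoning": "No clarifying questions at all.",
      "modelKey": "siliconflow",
      "versionLabel": "Original"
    },
    {
      "id": "b",
      "label": "B",
      "testCaseId": "shared-conversation-test-case",
      "promptRef": {
        "kind": "workspace",
        "label": "Workspace"
      },
      "promptText": "As a system message, require the assistant to clarify the user's goal first, then give advice, and not rush to answer.",
      "output": "Do you care more about real-time multi-user collaboration, access control, or knowledge retention and search?",
      "reasoning": "Clarified the requirements first and did not jump straight to a solution.",
      "modelKey": "dashscope",
      "versionLabel": "Workspace"
    }
  ]
}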