Add community monitoring MVP

2026-05-30 23:30:55 +08:00 · 2026-05-30 23:30:55 +08:00 · 912057de0a
commit 912057de0a
parent d7f8450123
18 changed files with 2781 additions and 0 deletions
--- a/.codex/tasks/steam-monitor-mvp.md
+++ b/.codex/tasks/steam-monitor-mvp.md
@@ -0,0 +1,61 @@
+# Steam Monitor MVP
+
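+## Confirmed requirements
+
+- Product: 《帝国幻想乡~TOHOTOPIA》
+- Steam AppID: `3774440`
+- Sources: Steam reviews, Steam discussion topics and replies
+- Refresh: every 30 minutes; the first run is a full scan, later runs are incremental
+- Classification model: OpenRouter `deepseek/deepseek-v4-pro`
+- API key: `.env` / environment variable `OPENROUTER_API_KEY`
+- Dashboard: shows classification, original link, whether a reply is recommended, handling status, and producer/handler notes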
+
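+## Current plan
+
+- [x] T1 Set up the Python/FastAPI + SQLite MVP.
+- [x] T2 Implement Steam review fetching via the API.
+- [x] T3 Implement fetching of Steam discussion topics and replies.
+- [x] T4 Implement SQLite deduplication, handling status, and sync cursors.
+- [x] T5 Implement structured classification via OpenRouter.
+- [x] T6 Implement the dashboard, manual sync, and status updates.
+- [x] T7 Run a local smoke test and start the LAN service.
+- [ ] T8 Integrate the next community platform.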
+
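+## Execution log
+
+- 2026-05-16: Created the task log and started implementing the project skeleton.
+- 2026-05-16: Completed the Python/FastAPI + SQLite MVP: fetching of Steam reviews, discussion topics, and replies; dashboard display; manual sync; background incremental sync every 30 minutes; handling-status updates.
+- 2026-05-16: Local smoke test fetched 384 items: 132 reviews, 75 discussion topics, 177 replies. `OPENROUTER_API_KEY` was not configured, so model analysis entered the `error` state as expected; it can be re-run once `.env` is configured.
+- 2026-05-16: Service started at `http://127.0.0.1:8000`.
+- 2026-05-16: After the user filled in `.env`, the "补跑分析" (re-run analysis) button appeared to do nothing. Root cause: the old uvicorn process had not re-read the new `.env`, and the re-run endpoint waited synchronously on model calls. Changed so the button returns immediately, the re-run proceeds in the background in batches of 20, and the dashboard shows "已分析 / 待补跑" (analyzed / pending re-run).
+- 2026-05-16: Switched the service to listen on the LAN at `0.0.0.0:8000`; the LAN address was detected at the time as `http://10.27.16.17:8000`.
+- 2026-05-16: Fixed discussion sorting. Root cause: `published_at` was not parsed for Steam discussions. Added support for `x 小时以前`, `3 月 7 日 下午 4:52`, and `2025 年 8 月 9 日 下午 3:29`, and backfilled 252 discussion records.
+- 2026-05-16: At the user's request, re-ran analysis for content from 2026-05-01 onward: 209 items (132 reviews, 26 discussion topics, 51 discussion replies), all ending in `done`.
+- 2026-05-16: Added "最近更新时间" (last updated) to the dashboard header; it prefers the finish time of the most recent successful sync and falls back to the latest collection time.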
+
+## Resume entry points
+
+- Plan document: `任务/方案/steam社区监控一期计划.md`
+- README: `README.md`
+- CLI: `python -m app.cli sync --full`, `python -m app.cli analyze-pending --since 2026-05-01 --limit 20`
+- Dashboard: `python -m uvicorn app.main:app --host 0.0.0.0 --port 8000`
+- Current service: listening on the LAN at `0.0.0.0:8000`
+
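+## Current status
+
+- The Steam phase-one MVP is complete.
+- Current data file: `data/tohotopia_monitor.sqlite3`.
+- The dashboard currently has no login authentication; anyone who can reach it on the LAN can view and change handling status.
+- Current sort order: reply-recommended items first; within each group, newest to oldest by publish time.
+- Current background task: incremental sync every 30 minutes after FastAPI starts.
+- Current OpenRouter key: `OPENROUTER_API_KEY` from `.env`.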
+
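+## Entry point for the next phase
+
+When adding other community platforms:
+
+- First read `AGENTS.md`, `README.md`, this task document, and `任务/方案/steam社区监控一期计划.md`.
+- New platform collectors should output `app.models.RawItem`.
+- Keep reusing `raw_items`, `analysis_results`, and `work_items`.
+- Do not put platform-private fields directly into dashboard query conditions; store them in `raw_json` and the unified fields first.
+- For platforms that need login state, an API, anti-scraping workarounds, or browser automation, verify the current facts before implementing.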
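Reviewer note: the execution log in this file says that Steam discussion timestamps such as `x 小时以前`, `3 月 7 日 下午 4:52`, and `2025 年 8 月 9 日 下午 3:29` had to be parsed into `published_at`. The parser is not part of this excerpt, so the following is only a minimal sketch of how those three shapes could be handled. The name `parse_steam_time`, the regular expressions, and the fallback to the current year when the string carries no year are all assumptions, not the project's actual code.

```python
from __future__ import annotations

import re
from datetime import datetime, timedelta

# Hypothetical patterns covering only the three shapes named in the task log.
_RELATIVE_HOURS = re.compile(r"(\d+)\s*小时以前")
_ABSOLUTE = re.compile(
    r"(?:(\d{4})\s*年\s*)?(\d{1,2})\s*月\s*(\d{1,2})\s*日\s*(上午|下午)\s*(\d{1,2}):(\d{2})"
)


def parse_steam_time(text: str, now: datetime) -> datetime | None:
    """Parse a Chinese Steam discussion timestamp; return None if unrecognized."""
    text = text.strip()

    match = _RELATIVE_HOURS.search(text)
    if match:
        # "x 小时以前" means "x hours ago", so subtract from the reference time.
        return now - timedelta(hours=int(match.group(1)))

    match = _ABSOLUTE.search(text)
    if match:
        # Assumption: a missing year means the current year.
        year = int(match.group(1)) if match.group(1) else now.year
        month, day = int(match.group(2)), int(match.group(3))
        hour, minute = int(match.group(5)), int(match.group(6))
        # Convert the 12-hour clock: 下午 is PM, 上午 is AM.
        if match.group(4) == "下午" and hour < 12:
            hour += 12
        elif match.group(4) == "上午" and hour == 12:
            hour = 0
        return datetime(year, month, day, hour, minute)

    return None
```

Returning `None` for unrecognized strings, rather than guessing, lets the caller fall back to the collection time for sorting and keep the original text for later backfill.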
--- a/.codex/tasks/twitter-monitor-mvp.md
+++ b/.codex/tasks/twitter-monitor-mvp.md
@@ -0,0 +1,86 @@
+# Twitter Monitor MVP
+
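+Date: 2026-05-16
+Status: completed
+
+## Background
+
+Building on the completed Steam MVP, the user asked for collection and handling of player feedback from X.com/Twitter. The target source is `https://x.com/Tohotopia`; the scope is all posts and all replies, with a full first run and time-based incremental runs afterward, continuing to reuse the `RawItem -> raw_items -> OpenRouter -> analysis_results -> work_items -> dashboard` pipeline.
+
+## Confirmed requirements
+
+- In scope: collect the posts of the X.com/Twitter account `Tohotopia` and the replies to each post, normalize them to `RawItem`, and feed them into the existing sync, analysis, and dashboard pipeline.
+- Out of scope: no changes to the dashboard's core data structures; do not promote Twitter-private fields to dashboard query fields; do not fake an empty result when not logged in.
+- Success criteria: with a working local login session, the CLI/sync can collect Twitter posts and replies and store them with deduplication, and new content flows into OpenRouter analysis and the dashboard.
+- Key constraints: the current X.com pages/API/login state are dynamic facts, so verify them first with a local smoke test; a failed collection must not delete existing data.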
+
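+## Docs/code pre-read
+
+- Project AGENTS: each new channel encapsulates its own collection, parsing, rate limiting, login state, and failure handling; operational judgments must be traceable to the platform, original link, collection time, or batch.
+- Relevant docs: `README.md` and `任务/方案/后续社区平台接入指南.md` specify that new platform collectors output `app.models.RawItem` and reuse the three-layer data model.
+- Relevant code: `app/sync.py` currently collects Steam only; `RawItem` in `app/models.py` can hold Twitter data; `app/db.py` already has `raw_json` and a unique key on `(source, source_item_id)`.
+- Confirmed facts: the existing `social-media-scraper` skill supports X.com user timelines and single-post replies; it intercepts API responses through a logged-in Chrome/CDP profile and outputs JSON/CSV.
+- Conflicts / ambiguities: the user was unsure whether a login session is required; a local smoke test confirmed that the current profile hits the X.com login prompt.
+
+## Terms and conflicts
+
+- Resolved terms: the Twitter/X platform uses `twitter` as the config prefix in code; source types are `twitter_posts` and `twitter_replies`.
+- Conflicts: none.
+- Follow-up CONTEXT / glossary updates: there is no project-level `CONTEXT.md` yet, so the terms for this task are recorded in this task document.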
+
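+## Current plan
+
+- [x] T1 Pre-read the docs and the existing Steam pipeline code.
+- [x] T2 Verify the X.com target page and the prerequisites of the available collection tools.
+- [x] T3 Draft the Twitter integration plan and data mapping.
+- [x] T4 Implement the collector and wire it into the sync pipeline.
+- [x] T5 Update the CLI, config, docs, and task log.
+- [x] T6 Run a smoke test to verify storage, analysis, and the dashboard.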
+
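+## Key judgments and evidence
+
+| Judgment | Type (stable principle / current fact / inference) | Evidence | Verified on | Unverified | Impact on decisions |
+|----------|-----------------------------------------------------|----------|-------------|------------|---------------------|
+| New platforms should output `RawItem` and then reuse the sync pipeline | Stable principle | README, the follow-up community platform integration guide, `app/models.py` | 2026-05-16 | None | Keeps the dashboard from depending directly on Twitter-private fields |
+| Collecting from X.com currently requires a login session | Current fact | `social-media-scraper` not-logged-in prompt; a small sample after login fetched 18 items | 2026-05-16 | Total reply count and duration of a full run | The implementation must explicitly handle not-logged-in failures and state the prerequisite |
+| Reusing the existing CDP scraping script is safer than rewriting against the X.com API | Inference | The existing skill supports UserTweets/TweetDetail; after login, the project sync entry point stored and analyzed items successfully | 2026-05-16 | Duration of a full run | Add an in-project adapter layer that reads the JSON and converts it to RawItem |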
+
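+## Execution log
+
+- 14:00: Read `AGENTS.md`, `README.md`, `.codex/tasks/steam-monitor-mvp.md`, and `任务/方案/后续社区平台接入指南.md` to confirm the rules for integrating a new platform.
+- 14:05: Read the `social-media-scraper` skill; confirmed that X.com user timelines and single-post replies are supported and that the output location can be specified.
+- 14:10: Ran `python C:\Users\jiajiankun\.codex\skills\social-media-scraper\scraper.py https://x.com/Tohotopia --max-no-new 1 --output-dir 任务/验证/twitter-smoke`; result: the current Chrome profile is not logged in to X.com.
+- 14:25: Added `app/twitter.py`, which converts the timeline/thread JSON output by `social-media-scraper` into `RawItem`, with content types `twitter_post` / `twitter_reply` and sources `twitter_posts` / `twitter_replies`.
+- 14:35: Extended `app/config.py`, `app/sync.py`, `app/cli.py`, and `app/main.py` to support `TWITTER_ENABLED`, platform-level sync, a Twitter-only CLI mode, and dashboard type filters.
+- 14:42: Updated `.env.example`, `README.md`, and `requirements.txt` with the Twitter login prerequisite, configuration, and dependencies.
+- 14:50: Fixed the Twitter incremental high-water mark: changed it from "the finish time of the most recent sync" to "the maximum publish/collection time of stored Twitter content", so that content published before the sync finish time is not missed.
+- 14:55: Verified that `python -m compileall app` passes; with the default config, `python -m app.cli sync --platform twitter` returns `twitter_skipped=1`; with Twitter temporarily enabled it returns `twitter_errors=1` and `sync_runs.status=partial`, and no empty Twitter data is inserted.
+- 19:17: After the user logged in to X.com in the CDP Chrome profile, ran a small-sample check with `social-media-scraper` and fetched 18 `Tohotopia` timeline items.
+- 19:21: Ran a small-scope project sync check: `TWITTER_ENABLED=true`, `TWITTER_INCREMENTAL_MAX_NO_NEW=1`, `TWITTER_THREAD_MAX_NO_NEW=1`, `TWITTER_INCREMENTAL_REPLY_PARENT_LIMIT=2`, `python -m app.cli sync --platform twitter`. Result: `twitter_fetched=26`, 22 new, 22 analyzed, 4 already seen.
+- 19:25: Confirmed in the database that 18 `twitter_posts` and 4 `twitter_replies` were stored; the most recent sync, `id=12`, has status `success`.
+- 19:24-19:51: After setting `TWITTER_ENABLED=true`, the user started `python -m app.cli sync --platform twitter --full`. After the user interrupted the command, a process was still alive, but there was no file growth within 30 seconds and CPU usage barely changed, so it was judged to be no longer making progress.
+- 19:53: Stopped the leftover full-sync process `pid=81152` and changed `sync_runs id=13` from `running` to `partial`, keeping the data already stored. Final Twitter data: 34 top-level posts and 139 replies, 173 items in total, all with analysis status `done`.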
+
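+## Current status
+
+- Done: docs/code pre-read; verification of the X.com login prerequisite; Twitter collection adapter layer, config, sync, CLI, dashboard copy, and docs updates; compilation, the not-logged-in failure path, and a small-scope end-to-end check after login.
+- Blockers: none.
+- Next step: to run the first full sync of "all posts and all replies", set `TWITTER_ENABLED=true` in `.env` and then run `python -m app.cli sync --platform twitter --full`.
+
+## Five-layer change candidates
+
+- None.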
+
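+## Resume entry points
+
+When resuming, read these first:
+
+- Key files: `app/twitter.py`, `app/sync.py`, `app/config.py`, `app/cli.py`, `app/main.py`.
+- Current goal: feed the posts and replies of `https://x.com/Tohotopia` into the existing RawItem pipeline.
+- Current status: implementation is complete; the X.com login session is stored in the CDP profile; the small-scope sync succeeded; one full sync was interrupted, after which the leftover process was cleaned up and the 173 analyzed items were kept.
+- Most recently completed: cleaned up the leftover full-sync process and marked `sync_runs id=13` as partial.
+- Next step: to continue the full sync, run `python -m app.cli sync --platform twitter --full` again; the existing deduplication skips content that is already stored.
+- Do not: treat a failure caused by not being logged in as "no data"; change the dashboard data model.
+- Files changed: `.codex/tasks/twitter-monitor-mvp.md`, `app/twitter.py`, `app/config.py`, `app/sync.py`, `app/cli.py`, `app/main.py`, `.env.example`, `README.md`, `requirements.txt`.
+- Verification results: `python -m compileall app` passes; Twitter is skipped when not enabled (the default); a not-logged-in run ends as partial; the project sync succeeds after login; Twitter currently has 34 top-level posts and 139 replies, and all 173 items are `done`.
+- Current blockers: none.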
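Reviewer note: the 14:50 log entry changes the Twitter incremental high-water mark from the last sync finish time to the maximum publish/collection time of Twitter content already stored. `app/twitter.py` and `app/sync.py` are not shown in this excerpt, so the following is only a sketch of one possible reading of that rule, against a cut-down `raw_items` table. The function name `twitter_high_water_mark`, the use of `COALESCE(published_at, collected_at)`, and the `steam_reviews` source value in the demo are assumptions.

```python
from __future__ import annotations

import sqlite3


def twitter_high_water_mark(conn: sqlite3.Connection) -> int | None:
    """Return the newest publish time (or collection time when the publish
    time is missing) among stored Twitter rows, or None if there are none."""
    row = conn.execute(
        """
        SELECT MAX(COALESCE(published_at, collected_at))
        FROM raw_items
        WHERE source IN ('twitter_posts', 'twitter_replies')
        """
    ).fetchone()
    return row[0]


def demo() -> int | None:
    # Cut-down table: only the columns this sketch needs.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE raw_items ("
        "source TEXT NOT NULL, published_at INTEGER, collected_at INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO raw_items VALUES (?, ?, ?)",
        [
            ("twitter_posts", 100, 500),     # publish time known
            ("twitter_replies", None, 300),  # falls back to collection time
            ("steam_reviews", 900, 900),     # hypothetical non-Twitter row, ignored
        ],
    )
    return twitter_high_water_mark(conn)
```

Deriving the mark from stored rows instead of from `sync_runs.finished_at` means an interrupted or partial run does not advance the cursor past content that was never stored.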
--- a/.env.example
+++ b/.env.example
@@ -0,0 +1,22 @@
+OPENROUTER_API_KEY=
+APP_ID=3774440
+PRODUCT_NAME=帝国幻想乡~TOHOTOPIA
+DATABASE_PATH=data/tohotopia_monitor.sqlite3
+SYNC_INTERVAL_MINUTES=30
+AUTO_SYNC_ENABLED=true
+TWITTER_ENABLED=false
+TWITTER_USERNAME=Tohotopia
+TWITTER_BROWSER_PROVIDER=existing
+TWITTER_OUTPUT_DIR=任务/社媒数据/twitter-monitor
+TWITTER_FULL_MAX_NO_NEW=6
+TWITTER_INCREMENTAL_MAX_NO_NEW=2
+TWITTER_THREAD_MAX_NO_NEW=3
+TWITTER_COMMAND_TIMEOUT_SECONDS=900
+TWITTER_FULL_REPLY_POST_LIMIT=0
+TWITTER_INCREMENTAL_REPLY_PARENT_LIMIT=20
+DISCUSSION_FULL_SCAN_MAX_PAGES=500
+DISCUSSION_INCREMENTAL_MAX_PAGES=5
+FULL_SCAN_TIME_LIMIT_SECONDS=7200
+OPENROUTER_MODEL=deepseek/deepseek-v4-pro
+OPENROUTER_REFERER=http://localhost:8000
+OPENROUTER_TITLE=TOHOTOPIA Steam Monitor
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1,13 @@
+.env
+.venv/
+__pycache__/
+*.pyc
+*.pyo
+.pytest_cache/
+.mypy_cache/
+
+data/
+任务/社媒数据/
+任务/验证/**/*.json
+任务/验证/**/*.csv
+任务/验证/**/*.log
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -0,0 +1,33 @@
+# AGENTS.md
+
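+## Project positioning
+
+This project is a community monitoring and handling platform for newly released indie games. It integrates, in phases, the ability to collect, organize, analyze, and handle information from community channels.
+
+The goal is not to cover every channel at once, but to first get a verifiable operations loop working: discover community information → normalize and store or record it → analyze priority → turn it into actionable items → track the outcome.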
+
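+## Domain boundaries
+
+- The platform focuses on the community operations workflow; it is not just a collection of scraper scripts.
+- Community content handling should distinguish between raw content, normalized records, analysis conclusions, and manual handling status.
+- Operational judgments must be traceable to the source platform, original link, collection time, or collection batch.
+- When integrating a new channel, first clarify its purpose in operations: feedback collection, sentiment monitoring, player support, content opportunities, competitor observation, or launch-performance tracking.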
+
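+## Channel integration principles
+
+- Each channel encapsulates its own collection, parsing, rate limiting, login state, and failure handling logic.
+- Normalize channel output to stable fields where possible, so upper-layer business logic does not depend directly on page structure or platform-private fields.
+- Repeated collection of the same content, edits, deleted or hidden content, and permission changes each need an explicitly stated handling strategy in the channel plan.
+- Where an external platform's current API, page structure, rate limits, or terms of service are involved, real-time verification results take precedence.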
+
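+## Data priorities
+
+Prioritize keeping information that supports operational decisions and traceability:
+
+- Source platform and original link
+- Author identifier
+- Publish time and collection time
+- Body or summary
+- Engagement metrics
+- Topic, sentiment, issue type, or handling labels
+- Current handling status and owner record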
--- a/app/__init__.py
+++ b/app/__init__.py
@@ -0,0 +1 @@
+"""TOHOTOPIA community monitor."""
--- a/app/cli.py
+++ b/app/cli.py
@@ -0,0 +1,61 @@
+from __future__ import annotations
+
+import argparse
+import json
+import time
+
+from .config import get_settings
+from .db import init_db, session
+from .sync import analyze_pending, run_sync
+
+
+def _platforms(value: str | None) -> list[str] | None:
+    if not value:
+        return None
+    selected = [part.strip().lower() for part in value.split(",") if part.strip()]
+    allowed = {"steam", "twitter"}
+    unknown = sorted(set(selected) - allowed)
+    if unknown:
+        raise argparse.ArgumentTypeError(f"Unsupported platform(s): {', '.join(unknown)}")
+    return selected
+
+
+def main() -> None:
+    parser = argparse.ArgumentParser(description="TOHOTOPIA community monitor")
+    sub = parser.add_subparsers(dest="command", required=True)
+
+    sub.add_parser("init-db", help="Initialize SQLite database")
+
+    sync_parser = sub.add_parser("sync", help="Fetch community content and analyze new items")
+    sync_parser.add_argument("--full", action="store_true", help="Run first full scan")
+    sync_parser.add_argument(
+        "--platform",
+        type=_platforms,
+        help="Comma-separated platform list: steam,twitter. Defaults to all enabled platforms.",
+    )
+
+    analyze_parser = sub.add_parser("analyze-pending", help="Analyze pending/error items")
+    analyze_parser.add_argument("--limit", type=int, default=50)
+    analyze_parser.add_argument("--since", help="Only analyze items since YYYY-MM-DD")
+
+    args = parser.parse_args()
+    settings = get_settings()
+    with session(settings.database_path) as conn:
+        init_db(conn)
+        if args.command == "init-db":
+            result = {"database": str(settings.database_path)}
+        elif args.command == "sync":
+            result = run_sync(conn, settings, full=args.full, platforms=args.platform)
+        elif args.command == "analyze-pending":
+            since_ts = None
+            if args.since:
+                parsed = time.strptime(args.since, "%Y-%m-%d")
+                since_ts = int(time.mktime(parsed))
+            result = analyze_pending(conn, settings, limit=args.limit, since_ts=since_ts)
+        else:
+            raise SystemExit(f"Unknown command: {args.command}")
+    print(json.dumps(result, ensure_ascii=False, indent=2))
+
+
+if __name__ == "__main__":
+    main()
--- a/app/config.py
+++ b/app/config.py
@@ -0,0 +1,94 @@
+from __future__ import annotations
+
+from dataclasses import dataclass
+from pathlib import Path
+import os
+
+from dotenv import load_dotenv
+
+
+ROOT_DIR = Path(__file__).resolve().parent.parent
+load_dotenv(ROOT_DIR / ".env")
+
+
+def _int_env(name: str, default: int) -> int:
+    value = os.getenv(name)
+    if not value:
+        return default
+    return int(value)
+
+
+def _bool_env(name: str, default: bool) -> bool:
+    value = os.getenv(name)
+    if value is None:
+        return default
+    return value.strip().lower() in {"1", "true", "yes", "on"}
+
+
+@dataclass(frozen=True)
+class Settings:
+    app_id: str
+    product_name: str
+    database_path: Path
+    sync_interval_minutes: int
+    auto_sync_enabled: bool
+    twitter_enabled: bool
+    twitter_username: str
+    twitter_scraper_path: Path
+    twitter_output_dir: Path
+    twitter_browser_provider: str
+    twitter_full_max_no_new: int
+    twitter_incremental_max_no_new: int
+    twitter_thread_max_no_new: int
+    twitter_command_timeout_seconds: int
+    twitter_full_reply_post_limit: int
+    twitter_incremental_reply_parent_limit: int
+    discussion_full_scan_max_pages: int
+    discussion_incremental_max_pages: int
+    full_scan_time_limit_seconds: int
+    openrouter_api_key: str | None
+    openrouter_model: str
+    openrouter_referer: str
+    openrouter_title: str
+
+
+def get_settings() -> Settings:
+    database_path = Path(os.getenv("DATABASE_PATH", "data/tohotopia_monitor.sqlite3"))
+    if not database_path.is_absolute():
+        database_path = ROOT_DIR / database_path
+    twitter_scraper_path = Path(
+        os.getenv(
+            "TWITTER_SCRAPER_PATH",
+            str(Path.home() / ".codex" / "skills" / "social-media-scraper" / "scraper.py"),
+        )
+    )
+    if not twitter_scraper_path.is_absolute():
+        twitter_scraper_path = ROOT_DIR / twitter_scraper_path
+    twitter_output_dir = Path(os.getenv("TWITTER_OUTPUT_DIR", "任务/社媒数据/twitter-monitor"))
+    if not twitter_output_dir.is_absolute():
+        twitter_output_dir = ROOT_DIR / twitter_output_dir
+    return Settings(
+        app_id=os.getenv("APP_ID", "3774440"),
+        product_name=os.getenv("PRODUCT_NAME", "帝国幻想乡~TOHOTOPIA"),
+        database_path=database_path,
+        sync_interval_minutes=_int_env("SYNC_INTERVAL_MINUTES", 30),
+        auto_sync_enabled=_bool_env("AUTO_SYNC_ENABLED", True),
+        twitter_enabled=_bool_env("TWITTER_ENABLED", False),
+        twitter_username=os.getenv("TWITTER_USERNAME", "Tohotopia"),
+        twitter_scraper_path=twitter_scraper_path,
+        twitter_output_dir=twitter_output_dir,
+        twitter_browser_provider=os.getenv("TWITTER_BROWSER_PROVIDER", "existing"),
+        twitter_full_max_no_new=_int_env("TWITTER_FULL_MAX_NO_NEW", 6),
+        twitter_incremental_max_no_new=_int_env("TWITTER_INCREMENTAL_MAX_NO_NEW", 2),
+        twitter_thread_max_no_new=_int_env("TWITTER_THREAD_MAX_NO_NEW", 3),
+        twitter_command_timeout_seconds=_int_env("TWITTER_COMMAND_TIMEOUT_SECONDS", 900),
+        twitter_full_reply_post_limit=_int_env("TWITTER_FULL_REPLY_POST_LIMIT", 0),
+        twitter_incremental_reply_parent_limit=_int_env("TWITTER_INCREMENTAL_REPLY_PARENT_LIMIT", 20),
+        discussion_full_scan_max_pages=_int_env("DISCUSSION_FULL_SCAN_MAX_PAGES", 500),
+        discussion_incremental_max_pages=_int_env("DISCUSSION_INCREMENTAL_MAX_PAGES", 5),
+        full_scan_time_limit_seconds=_int_env("FULL_SCAN_TIME_LIMIT_SECONDS", 7200),
+        openrouter_api_key=os.getenv("OPENROUTER_API_KEY"),
+        openrouter_model=os.getenv("OPENROUTER_MODEL", "deepseek/deepseek-v4-pro"),
+        openrouter_referer=os.getenv("OPENROUTER_REFERER", "http://localhost:8000"),
+        openrouter_title=os.getenv("OPENROUTER_TITLE", "TOHOTOPIA Steam Monitor"),
+    )
--- a/app/db.py
+++ b/app/db.py
@@ -0,0 +1,120 @@
+from __future__ import annotations
+
+from contextlib import contextmanager
+from pathlib import Path
+import json
+import sqlite3
+from typing import Any, Iterator
+
+
+def connect(database_path: Path) -> sqlite3.Connection:
+    database_path.parent.mkdir(parents=True, exist_ok=True)
+    conn = sqlite3.connect(database_path)
+    conn.row_factory = sqlite3.Row
+    conn.execute("PRAGMA journal_mode=WAL")
+    conn.execute("PRAGMA foreign_keys=ON")
+    return conn
+
+
+@contextmanager
+def session(database_path: Path) -> Iterator[sqlite3.Connection]:
+    conn = connect(database_path)
+    try:
+        yield conn
+        conn.commit()
+    except Exception:
+        conn.rollback()
+        raise
+    finally:
+        conn.close()
+
+
+def init_db(conn: sqlite3.Connection) -> None:
+    conn.executescript(
+        """
+        CREATE TABLE IF NOT EXISTS raw_items (
+            id INTEGER PRIMARY KEY AUTOINCREMENT,
+            source TEXT NOT NULL,
+            source_item_id TEXT NOT NULL,
+            source_url TEXT NOT NULL,
+            content_type TEXT NOT NULL,
+            author_id TEXT,
+            author_name TEXT,
+            title TEXT,
+            published_at INTEGER,
+            published_at_text TEXT,
+            collected_at INTEGER NOT NULL,
+            updated_at_source INTEGER,
+            content TEXT NOT NULL,
+            raw_json TEXT NOT NULL,
+            content_hash TEXT NOT NULL,
+            analysis_status TEXT NOT NULL DEFAULT 'pending',
+            UNIQUE(source, source_item_id)
+        );
+
+        CREATE TABLE IF NOT EXISTS analysis_results (
+            raw_item_id INTEGER PRIMARY KEY,
+            model TEXT NOT NULL,
+            sentiment TEXT NOT NULL,
+            is_positive INTEGER NOT NULL,
+            is_negative INTEGER NOT NULL,
+            has_actionable_feedback INTEGER NOT NULL,
+            feedback_types TEXT NOT NULL,
+            reply_recommended INTEGER NOT NULL,
+            reply_priority TEXT NOT NULL,
+            reply_suggestion TEXT NOT NULL,
+            summary TEXT NOT NULL,
+            priority TEXT NOT NULL,
+            confidence REAL NOT NULL,
+            reason TEXT NOT NULL,
+            model_json TEXT NOT NULL,
+            analyzed_at INTEGER NOT NULL,
+            FOREIGN KEY(raw_item_id) REFERENCES raw_items(id) ON DELETE CASCADE
+        );
+
+        CREATE TABLE IF NOT EXISTS work_items (
+            raw_item_id INTEGER PRIMARY KEY,
+            status TEXT NOT NULL DEFAULT 'new',
+            owner TEXT NOT NULL DEFAULT '',
+            notes TEXT NOT NULL DEFAULT '',
+            last_handled_at INTEGER,
+            created_at INTEGER NOT NULL,
+            updated_at INTEGER NOT NULL,
+            FOREIGN KEY(raw_item_id) REFERENCES raw_items(id) ON DELETE CASCADE
+        );
+
+        CREATE TABLE IF NOT EXISTS sync_state (
+            key TEXT PRIMARY KEY,
+            value TEXT NOT NULL,
+            updated_at INTEGER NOT NULL
+        );
+
+        CREATE TABLE IF NOT EXISTS sync_runs (
+            id INTEGER PRIMARY KEY AUTOINCREMENT,
+            started_at INTEGER NOT NULL,
+            finished_at INTEGER,
+            mode TEXT NOT NULL,
+            status TEXT NOT NULL,
+            message TEXT NOT NULL DEFAULT '',
+            stats_json TEXT NOT NULL DEFAULT '{}'
+        );
+
+        CREATE INDEX IF NOT EXISTS idx_raw_items_collected_at ON raw_items(collected_at DESC);
+        CREATE INDEX IF NOT EXISTS idx_raw_items_content_type ON raw_items(content_type);
+        CREATE INDEX IF NOT EXISTS idx_raw_items_analysis_status ON raw_items(analysis_status);
+        CREATE INDEX IF NOT EXISTS idx_work_items_status ON work_items(status);
+        """
+    )
+
+
+def encode_json(value: Any) -> str:
+    return json.dumps(value, ensure_ascii=False, separators=(",", ":"))
+
+
+def decode_json(value: str | None, default: Any = None) -> Any:
+    if value is None:
+        return default
+    try:
+        return json.loads(value)
+    except json.JSONDecodeError:
+        return default
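Reviewer note: the schema in `app/db.py` deduplicates on `UNIQUE(source, source_item_id)`, and `app/main.py` calls `upsert_raw_item(conn, item)` expecting a `(raw_item_id, inserted)` pair. The real `upsert_raw_item` lives in `app/sync.py`, which is not shown here, so the following is only a sketch of how such an upsert could work against a cut-down table. The function name `upsert_sketch`, its parameters, and the decision to overwrite `content` on a repeat are assumptions.

```python
from __future__ import annotations

import sqlite3


def upsert_sketch(
    conn: sqlite3.Connection, source: str, source_item_id: str, content: str
) -> tuple[int, bool]:
    """Insert a row, or update it if (source, source_item_id) already exists.

    Returns (row id, True if a new row was inserted).
    """
    row = conn.execute(
        "SELECT id FROM raw_items WHERE source = ? AND source_item_id = ?",
        (source, source_item_id),
    ).fetchone()
    if row is not None:
        # Already collected: refresh the content instead of adding a duplicate.
        conn.execute("UPDATE raw_items SET content = ? WHERE id = ?", (content, row[0]))
        return row[0], False
    cursor = conn.execute(
        "INSERT INTO raw_items (source, source_item_id, content) VALUES (?, ?, ?)",
        (source, source_item_id, content),
    )
    return cursor.lastrowid, True


def demo() -> list[tuple[int, bool]]:
    # Cut-down table: only the columns this sketch needs.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE raw_items ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, source TEXT NOT NULL, "
        "source_item_id TEXT NOT NULL, content TEXT NOT NULL, "
        "UNIQUE(source, source_item_id))"
    )
    first = upsert_sketch(conn, "twitter_posts", "42", "hello")
    second = upsert_sketch(conn, "twitter_posts", "42", "hello (edited)")
    return [first, second]
```

Because the key is the pair `(source, source_item_id)`, re-running a full sync after an interruption revisits stored items without creating duplicates, which is the behavior the task logs rely on.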
--- a/app/main.py
+++ b/app/main.py
@@ -0,0 +1,717 @@
+from __future__ import annotations
+
+from hashlib import sha1
+from html import escape
+import threading
+import time
+from typing import Any
+
+from fastapi import FastAPI, Form, Query
+from fastapi.responses import HTMLResponse, RedirectResponse
+
+from .config import Settings, get_settings
+from .db import decode_json, init_db, session
+from .models import RawItem
+from .openrouter import OpenRouterClient
+from .sync import analyze_pending, run_sync, save_analysis, upsert_raw_item
+
+
+app = FastAPI(title="TOHOTOPIA Steam Monitor")
+sync_lock = threading.Lock()
+analysis_lock = threading.Lock()
+stop_event = threading.Event()
+
+
+def current_settings() -> Settings:
+    return get_settings()
+
+
+def _fmt_ts(value: int | None) -> str:
+    if not value:
+        return ""
+    return time.strftime("%Y-%m-%d %H:%M", time.localtime(int(value)))
+
+
+def _badge(text: str, cls: str = "") -> str:
+    return f'<span class="badge {cls}">{escape(text)}</span>'
+
+
+def _manual_item_id(source_url: str, source_name: str, title: str, author_name: str, content: str) -> str:
+    seed = source_url.strip() or "\n".join(
+        [source_name.strip(), title.strip(), author_name.strip(), content.strip()]
+    )
+    return sha1(seed.encode("utf-8", errors="ignore")).hexdigest()
+
+
+def _looks_chinese(text: str) -> bool:
+    letters = [char for char in text if char.isalpha()]
+    if not letters:
+        return True
+    cjk_count = sum(1 for char in letters if "\u4e00" <= char <= "\u9fff")
+    return cjk_count / len(letters) >= 0.2
+
+
+def _query(filters: dict[str, str]) -> tuple[str, list[Any]]:
+    where = []
+    params: list[Any] = []
+    if filters.get("content_type"):
+        where.append("r.content_type = ?")
+        params.append(filters["content_type"])
+    if filters.get("sentiment"):
+        where.append("a.sentiment = ?")
+        params.append(filters["sentiment"])
+    if filters.get("status"):
+        where.append("w.status = ?")
+        params.append(filters["status"])
+    if filters.get("reply") == "1":
+        where.append("a.reply_recommended = 1")
+    if filters.get("actionable") == "1":
+        where.append("a.has_actionable_feedback = 1")
+    if filters.get("q"):
+        where.append("(r.content LIKE ? OR r.title LIKE ? OR a.summary LIKE ?)")
+        like = f"%{filters['q']}%"
+        params.extend([like, like, like])
+    clause = "WHERE " + " AND ".join(where) if where else ""
+    return clause, params
+
+
+@app.on_event("startup")
+def startup() -> None:
+    settings = current_settings()
+    with session(settings.database_path) as conn:
+        init_db(conn)
+    if settings.auto_sync_enabled:
+        thread = threading.Thread(target=_sync_loop, name="steam-sync-loop", daemon=True)
+        thread.start()
+
+
+@app.on_event("shutdown")
+def shutdown() -> None:
+    stop_event.set()
+
+
+def _sync_loop() -> None:
+    settings = current_settings()
+    interval_seconds = max(settings.sync_interval_minutes, 1) * 60
+    while not stop_event.wait(interval_seconds):
+        if not sync_lock.acquire(blocking=False):
+            continue
+        try:
+            with session(settings.database_path) as conn:
+                run_sync(conn, settings, full=False)
+        except Exception:
+            # Sync failures are recorded in sync_runs by run_sync when possible.
+            pass
+        finally:
+            sync_lock.release()
+
+
+@app.get("/", response_class=HTMLResponse)
+def index(
+    content_type: str = Query(""),
+    sentiment: str = Query(""),
+    status: str = Query(""),
+    reply: str = Query(""),
+    actionable: str = Query(""),
+    q: str = Query(""),
+    manual: str = Query(""),
+    notice: str = Query(""),
+) -> str:
+    settings = current_settings()
+    filters = {
+        "content_type": content_type,
+        "sentiment": sentiment,
+        "status": status,
+        "reply": reply,
+        "actionable": actionable,
+        "q": q,
+    }
+    with session(settings.database_path) as conn:
+        clause, params = _query(filters)
+        rows = conn.execute(
+            f"""
+            SELECT r.*, a.sentiment, a.is_positive, a.is_negative,
+                   a.has_actionable_feedback, a.feedback_types, a.reply_recommended,
+                   a.reply_priority, a.reply_suggestion, a.summary, a.priority,
+                   a.confidence, a.reason, w.status, w.owner, w.notes
+            FROM raw_items r
+            LEFT JOIN analysis_results a ON a.raw_item_id = r.id
+            LEFT JOIN work_items w ON w.raw_item_id = r.id
+            {clause}
+            ORDER BY
+                COALESCE(a.reply_recommended, 0) DESC,
+                COALESCE(r.published_at, r.collected_at) DESC,
+                r.collected_at DESC,
+                r.id DESC
+            LIMIT 200
+            """,
+            params,
+        ).fetchall()
+        metrics = conn.execute(
+            """
+            SELECT
+                COUNT(*) AS total,
+                SUM(CASE WHEN w.status = 'new' THEN 1 ELSE 0 END) AS new_count,
+                SUM(CASE WHEN a.is_negative = 1 THEN 1 ELSE 0 END) AS negative_count,
+                SUM(CASE WHEN a.has_actionable_feedback = 1 THEN 1 ELSE 0 END) AS actionable_count,
+                SUM(CASE WHEN a.reply_recommended = 1 THEN 1 ELSE 0 END) AS reply_count,
+                SUM(CASE WHEN a.priority = 'high' THEN 1 ELSE 0 END) AS high_count,
+                SUM(CASE WHEN r.analysis_status = 'done' THEN 1 ELSE 0 END) AS analyzed_count,
+                SUM(CASE WHEN r.analysis_status = 'pending' THEN 1 ELSE 0 END) AS pending_count,
+                SUM(CASE WHEN r.analysis_status = 'error' THEN 1 ELSE 0 END) AS error_count
+            FROM raw_items r
+            LEFT JOIN analysis_results a ON a.raw_item_id = r.id
+            LEFT JOIN work_items w ON w.raw_item_id = r.id
+            """
+        ).fetchone()
+        last_runs = conn.execute(
+            "SELECT * FROM sync_runs ORDER BY started_at DESC LIMIT 5"
+        ).fetchall()
+        last_success = conn.execute(
+            """
+            SELECT finished_at FROM sync_runs
+            WHERE status = 'success' AND finished_at IS NOT NULL
+            ORDER BY finished_at DESC
+            LIMIT 1
+            """
+        ).fetchone()
+        latest_collected = conn.execute(
+            "SELECT MAX(collected_at) AS collected_at FROM raw_items"
+        ).fetchone()
+
+    items_html = "\n".join(_render_item(row) for row in rows)
+    runs_html = "\n".join(
+        f"<li>{_fmt_ts(run['started_at'])} {escape(run['mode'])} "
+        f"{escape(run['status'])} {escape(run['stats_json'] or '')} {escape(run['message'] or '')}</li>"
+        for run in last_runs
+    )
+    return f"""
+    <!doctype html>
+    <html lang="zh-CN">
+    <head>
+      <meta charset="utf-8">
+      <meta name="viewport" content="width=device-width, initial-scale=1">
+      <title>{escape(settings.product_name)} 社区监控</title>
+      <style>{CSS}</style>
+    </head>
+    <body>
+      <header>
+        <div>
+          <h1>{escape(settings.product_name)} 社区监控</h1>
+          <p>Steam 与社区平台内容，每 {settings.sync_interval_minutes} 分钟刷新</p>
+          <p>最近更新时间：{_last_update_text(last_success, latest_collected)}</p>
+        </div>
+        <div class="actions">
+          <form method="post" action="/sync"><button>增量同步</button></form>
+          <form method="post" action="/sync?full=1"><button class="secondary">全量同步</button></form>
+          <form method="post" action="/analyze-pending"><button class="secondary">补跑分析</button></form>
+          <a class="button secondary" href="/?manual=1">手动添加</a>
+        </div>
+      </header>
+      <section class="metrics">
+        {_metric("总内容", metrics["total"])}
+        {_metric("未处理", metrics["new_count"])}
+        {_metric("差评/负面", metrics["negative_count"])}
+        {_metric("具体反馈", metrics["actionable_count"])}
+        {_metric("建议回复", metrics["reply_count"])}
+        {_metric("高优先级", metrics["high_count"])}
+        {_metric("已分析", metrics["analyzed_count"])}
+        {_metric("待补跑", (metrics["pending_count"] or 0) + (metrics["error_count"] or 0))}
+      </section>
+      {f'<div class="notice">{escape(notice)}</div>' if notice else ''}
+      {_render_manual_form() if manual == '1' else ''}
+      <form class="filters" method="get">
+        {_select("content_type", content_type, {"": "全部类型", "review": "Steam 评测", "discussion_topic": "Steam 帖子", "discussion_reply": "Steam 回复", "twitter_post": "Twitter 帖子", "twitter_reply": "Twitter 回复", "manual_note": "手动添加"})}
+        {_select("sentiment", sentiment, {"": "全部情绪", "positive": "正面", "negative": "负面", "mixed": "混合", "neutral": "中性"})}
+        {_select("status", status, {"": "全部状态", "new": "未处理", "read": "已读", "needs_reply": "待回复", "replied": "已回复", "needs_fix": "待修复", "archived": "已归档"})}
+        <label><input type="checkbox" name="reply" value="1" {'checked' if reply == '1' else ''}> 建议回复</label>
+        <label><input type="checkbox" name="actionable" value="1" {'checked' if actionable == '1' else ''}> 具体反馈</label>
+        <input name="q" placeholder="搜索正文/摘要" value="{escape(q)}">
+        <button>筛选</button>
+      </form>
+      <main>{items_html or '<div class="empty">暂无数据。先运行同步。</div>'}</main>
+      <aside>
+        <h2>最近同步</h2>
+        <ul>{runs_html or '<li>暂无同步记录</li>'}</ul>
+      </aside>
+    </body>
+    </html>
+    """
+
+
+@app.post("/sync")
+def sync(full: int = Query(0)) -> RedirectResponse:
+    if sync_lock.acquire(blocking=False):
+        thread = threading.Thread(target=_run_sync_background, args=(bool(full),), daemon=True)
+        thread.start()
+        return RedirectResponse("/?notice=同步已在后台开始，稍后刷新查看结果", status_code=303)
+    return RedirectResponse("/?notice=已有同步任务正在运行", status_code=303)
+
+
+@app.post("/analyze-pending")
+def analyze() -> RedirectResponse:
+    if analysis_lock.acquire(blocking=False):
+        thread = threading.Thread(target=_run_analysis_background, kwargs={"limit": 20}, daemon=True)
+        thread.start()
+        return RedirectResponse("/?notice=补跑分析已在后台开始，每批最多 20 条，稍后刷新查看结果", status_code=303)
+    return RedirectResponse("/?notice=已有补跑分析正在运行", status_code=303)
+
+
+@app.post("/manual-items")
+def create_manual_item(
+    source_name: str = Form(...),
+    source_url: str = Form(""),
+    title: str = Form(""),
+    author_name: str = Form(""),
+    published_at_text: str = Form(""),
+    content: str = Form(...),
+    status: str = Form("new"),
+    owner: str = Form(""),
+    notes: str = Form(""),
+) -> RedirectResponse:
+    source_name = source_name.strip()
+    source_url = source_url.strip()
+    title = title.strip()
+    author_name = author_name.strip()
+    published_at_text = published_at_text.strip()
+    content = content.strip()
+    status = status if status in _work_status_options() else "new"
+
+    if not source_name or not content:
+        return RedirectResponse("/?manual=1&notice=来源社群和正文不能为空", status_code=303)
+
+    original_content = content
+    translated = False
+    analysis_error = ""
+    settings = current_settings()
+    analyzer = OpenRouterClient(settings)
+    try:
+        if not _looks_chinese(content):
+            content = analyzer.translate_to_chinese(content)
+            translated = content != original_content
+    except Exception as exc:  # noqa: BLE001 - keep manual entry even if translation fails
+        analysis_error = f"翻译失败，已保留原文并标记待补跑：{exc}"
+
+    item = RawItem(
+        source="manual",
+        source_item_id=_manual_item_id(source_url, source_name, title, author_name, content),
+        source_url=source_url,
+        content_type="manual_note",
+        author_id=None,
+        author_name=author_name or source_name,
+        title=title or f"{source_name} 手动信息",
+        published_at=None,
+        published_at_text=published_at_text,
+        updated_at_source=None,
+        content=content,
+        raw={
+            "source_name": source_name,
+            "source_url": source_url,
+            "title": title,
+            "author_name": author_name,
+            "published_at_text": published_at_text,
+            "original_content": original_content,
+            "translated_to_zh": translated,
+            "manual": True,
+        },
+    )
+    now = int(time.time())
+    try:
+        with session(settings.database_path) as conn:
+            raw_item_id, inserted = upsert_raw_item(conn, item)
+            conn.execute(
+                """
+                UPDATE work_items
+                SET status = ?, owner = ?, notes = ?, updated_at = ?,
+                    last_handled_at = CASE WHEN ? != 'new' THEN ? ELSE last_handled_at END
+                WHERE raw_item_id = ?
+                """,
+                (status, owner.strip(), notes.strip(), now, status, now, raw_item_id),
+            )
+            if not analysis_error:
+                try:
+                    analysis = analyzer.analyze(item)
+                    save_analysis(conn, raw_item_id, settings.openrouter_model, analysis)
+                except Exception as exc:  # noqa: BLE001 - keep pending/error for analyze-pending
+                    analysis_error = f"分析失败，已标记待补跑：{exc}"
+                    conn.execute(
+                        "UPDATE raw_items SET analysis_status = 'error' WHERE id = ?",
+                        (raw_item_id,),
+                    )
+    finally:
+        analyzer.close()
+
+    parts = ["已添加手动信息" if inserted else "已更新同来源手动信息"]
+    if translated:
+        parts.append("已翻译成中文")
+    if analysis_error:
+        parts.append(analysis_error)
+    else:
+        parts.append("已生成是否回复和回复建议")
+    notice = "，".join(parts)
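+    # Percent-encode the notice so "&", "#" or "=" inside an error message cannot break the query string.
+    from urllib.parse import quote
+
+    return RedirectResponse(f"/?notice={quote(notice)}", status_code=303)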
+
+
+@app.post("/items/{raw_item_id}/work")
+def update_work(
+    raw_item_id: int,
+    status: str = Form(...),
+    owner: str = Form(""),
+    notes: str = Form(""),
+) -> RedirectResponse:
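+    # Ignore statuses outside the known set instead of writing arbitrary form input.
+    if status not in _work_status_options():
+        return RedirectResponse("/", status_code=303)
+    settings = current_settings()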
+    now = int(time.time())
+    with session(settings.database_path) as conn:
+        conn.execute(
+            """
+            UPDATE work_items
+            SET status = ?, owner = ?, notes = ?, updated_at = ?,
+                last_handled_at = CASE WHEN ? != 'new' THEN ? ELSE last_handled_at END
+            WHERE raw_item_id = ?
+            """,
+            (status, owner.strip(), notes.strip(), now, status, now, raw_item_id),
+        )
+    return RedirectResponse("/", status_code=303)
+
+
+def _run_sync_background(full: bool) -> None:
+    settings = current_settings()
+    try:
+        with session(settings.database_path) as conn:
+            run_sync(conn, settings, full=full)
+    finally:
+        sync_lock.release()
+
+
+def _run_analysis_background(limit: int) -> None:
+    settings = current_settings()
+    try:
+        with session(settings.database_path) as conn:
+            analyze_pending(conn, settings, limit=limit)
+    finally:
+        analysis_lock.release()
+
+
+def _notice_text(stats: dict[str, Any]) -> str:
+    if not stats:
+        return "无待处理项目"
+    return "，".join(f"{key}={value}" for key, value in stats.items())
+
+
+def _last_update_text(last_success: Any, latest_collected: Any) -> str:
+    if last_success and last_success["finished_at"]:
+        return _fmt_ts(last_success["finished_at"])
+    if latest_collected and latest_collected["collected_at"]:
+        return _fmt_ts(latest_collected["collected_at"])
+    return "暂无"
+
+
+def _metric(label: str, value: Any) -> str:
+    return f'<div class="metric"><span>{escape(label)}</span><strong>{int(value or 0)}</strong></div>'
+
+
+def _select(name: str, current: str, options: dict[str, str]) -> str:
+    option_html = "".join(
+        f'<option value="{escape(value)}" {"selected" if value == current else ""}>{escape(label)}</option>'
+        for value, label in options.items()
+    )
+    return f'<select name="{escape(name)}">{option_html}</select>'
+
+
+def _work_status_options() -> dict[str, str]:
+    return {
+        "new": "未处理",
+        "read": "已读",
+        "needs_reply": "待回复",
+        "replied": "已回复",
+        "needs_fix": "待修复",
+        "archived": "已归档",
+    }
+
+
+def _render_manual_form() -> str:
+    return f"""
+    <section class="manual-panel">
+      <h2>手动添加社区信息</h2>
+      <form class="manual-form" method="post" action="/manual-items">
+        <input name="source_name" placeholder="来源社群/平台，例如 Discord、小红书、QQ群" required>
+        <input name="source_url" placeholder="原始链接，可留空">
+        <input name="title" placeholder="标题，可留空">
+        <input name="author_name" placeholder="作者/昵称，可留空">
+        <input name="published_at_text" placeholder="发布时间文本，可留空">
+        <textarea name="content" placeholder="正文/摘要" required></textarea>
+        {_select("status", "new", _work_status_options())}
+        <input name="owner" placeholder="制作人/处理人">
+        <input name="notes" placeholder="备注">
+        <button>添加</button>
+      </form>
+    </section>
+    """
+
+
+def _render_item(row: Any) -> str:
+    feedback_types = ", ".join(decode_json(row["feedback_types"], [])) if row["feedback_types"] else ""
+    cls = "item urgent" if row["reply_recommended"] or row["priority"] == "high" else "item"
+    badges = [
+        _badge(row["content_type"] or "", "type"),
+        _badge(row["sentiment"] or "pending", row["sentiment"] or ""),
+        _badge(row["priority"] or "low", "priority"),
+    ]
+    if row["has_actionable_feedback"]:
+        badges.append(_badge("具体反馈", "action"))
+    if row["reply_recommended"]:
+        badges.append(_badge("建议回复", "reply"))
+    # Truncate the raw text first so the cut never lands inside an HTML entity.
+    content = row["content"] or ""
+    if len(content) > 900:
+        content = content[:900] + "..."
+    content = escape(content)
+    return f"""
+    <article class="{cls}">
+      <div class="item-head">
+        <div>
+          <h2>{escape(row['summary'] or row['title'] or '未分析')}</h2>
+          <div class="meta">{' '.join(badges)} <span>{escape(row['author_name'] or '')}</span> <span>{_fmt_ts(row['published_at']) or escape(row['published_at_text'] or '')}</span></div>
+        </div>
+        {_source_link(row['source_url'])}
+      </div>
+      <p class="content">{content}</p>
+      <p class="reason">{escape(row['reason'] or '')}</p>
+      <p class="reply-suggestion">{escape(row['reply_suggestion'] or '')}</p>
+      <p class="types">{escape(feedback_types)}</p>
+      <form class="work" method="post" action="/items/{row['id']}/work">
+        {_select("status", row["status"] or "new", _work_status_options())}
+        <input name="owner" placeholder="制作人/处理人" value="{escape(row['owner'] or '')}">
+        <input name="notes" placeholder="备注" value="{escape(row['notes'] or '')}">
+        <button>保存</button>
+      </form>
+    </article>
+    """
+
+
+def _source_link(source_url: str | None) -> str:
+    if not source_url:
+        return '<span class="source muted">无原始链接</span>'
+    if not source_url.startswith(("http://", "https://")):
+        return f'<span class="source muted">{escape(source_url)}</span>'
+    return (
+        f'<a class="source" href="{escape(source_url)}" target="_blank" '
+        f'rel="noreferrer">原始链接</a>'
+    )
+
+
+CSS = """
+:root {
+  color-scheme: light;
+  font-family: Inter, "Segoe UI", "Microsoft YaHei", sans-serif;
+  background: #f6f7f9;
+  color: #1f2933;
+}
+body {
+  margin: 0;
+}
+header {
+  display: flex;
+  justify-content: space-between;
+  gap: 24px;
+  align-items: center;
+  padding: 24px 32px;
+  background: #ffffff;
+  border-bottom: 1px solid #d9dee7;
+}
+h1 {
+  margin: 0 0 4px;
+  font-size: 24px;
+}
+p {
+  line-height: 1.5;
+}
+header p {
+  margin: 0;
+  color: #64748b;
+}
+.actions {
+  display: flex;
+  gap: 8px;
+  flex-wrap: wrap;
+}
+button, .button, select, input, textarea {
+  min-height: 36px;
+  border: 1px solid #cbd5e1;
+  border-radius: 6px;
+  padding: 0 12px;
+  background: #fff;
+  font: inherit;
+}
+button, .button {
+  display: inline-flex;
+  align-items: center;
+  background: #166534;
+  color: white;
+  border-color: #166534;
+  cursor: pointer;
+  text-decoration: none;
+}
+button.secondary, .button.secondary {
+  background: #334155;
+  border-color: #334155;
+}
+.metrics {
+  display: grid;
+  grid-template-columns: repeat(6, minmax(120px, 1fr));
+  gap: 12px;
+  padding: 18px 32px;
+}
+.metric {
+  background: #fff;
+  border: 1px solid #d9dee7;
+  border-radius: 8px;
+  padding: 14px;
+}
+.metric span {
+  display: block;
+  color: #64748b;
+  font-size: 13px;
+}
+.metric strong {
+  display: block;
+  font-size: 26px;
+  margin-top: 6px;
+}
+.filters {
+  display: flex;
+  gap: 10px;
+  flex-wrap: wrap;
+  align-items: center;
+  padding: 0 32px 18px;
+}
+.manual-panel {
+  margin: 0 32px 18px;
+  padding: 18px;
+  border: 1px solid #d9dee7;
+  border-radius: 8px;
+  background: #fff;
+}
+.manual-panel h2 {
+  margin: 0 0 12px;
+  font-size: 17px;
+}
+.manual-form {
+  display: grid;
+  grid-template-columns: repeat(3, minmax(160px, 1fr));
+  gap: 10px;
+}
+.manual-form textarea {
+  grid-column: 1 / -1;
+  min-height: 120px;
+  padding: 10px 12px;
+  resize: vertical;
+}
+.notice {
+  margin: 0 32px 18px;
+  padding: 12px 14px;
+  border: 1px solid #86efac;
+  border-radius: 8px;
+  background: #f0fdf4;
+  color: #166534;
+}
+main {
+  display: grid;
+  gap: 14px;
+  padding: 0 32px 24px;
+}
+.item {
+  background: #fff;
+  border: 1px solid #d9dee7;
+  border-radius: 8px;
+  padding: 18px;
+}
+.item.urgent {
+  border-color: #dc2626;
+  box-shadow: inset 4px 0 0 #dc2626;
+}
+.item-head {
+  display: flex;
+  justify-content: space-between;
+  gap: 16px;
+  align-items: flex-start;
+}
+.item h2 {
+  margin: 0 0 8px;
+  font-size: 17px;
+}
+.meta {
+  display: flex;
+  gap: 8px;
+  align-items: center;
+  flex-wrap: wrap;
+  color: #64748b;
+  font-size: 13px;
+}
+.badge {
+  display: inline-flex;
+  align-items: center;
+  min-height: 24px;
+  padding: 0 8px;
+  border-radius: 999px;
+  background: #e2e8f0;
+  color: #334155;
+}
+.badge.negative, .badge.reply {
+  background: #fee2e2;
+  color: #991b1b;
+}
+.badge.positive {
+  background: #dcfce7;
+  color: #166534;
+}
+.badge.action {
+  background: #fef3c7;
+  color: #92400e;
+}
+.source {
+  color: #166534;
+  white-space: nowrap;
+}
+.source.muted {
+  color: #64748b;
+}
+.content {
+  white-space: pre-wrap;
+}
+.reason, .reply-suggestion, .types {
+  color: #475569;
+  margin: 8px 0;
+}
+.reply-suggestion {
+  font-weight: 600;
+}
+.work {
+  display: grid;
+  grid-template-columns: 150px minmax(140px, 220px) 1fr 80px;
+  gap: 8px;
+  margin-top: 12px;
+}
+aside {
+  padding: 0 32px 32px;
+  color: #475569;
+}
+.empty {
+  background: #fff;
+  border: 1px solid #d9dee7;
+  border-radius: 8px;
+  padding: 32px;
+}
+@media (max-width: 900px) {
+  header, .item-head {
+    flex-direction: column;
+  }
+  .metrics {
+    grid-template-columns: repeat(2, minmax(120px, 1fr));
+  }
+  .work {
+    grid-template-columns: 1fr;
+  }
+  .manual-form {
+    grid-template-columns: 1fr;
+  }
+}
+"""
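Review note on `app/main.py`: both `UPDATE work_items` statements use a `CASE` expression so that `last_handled_at` is stamped only when the status is something other than `new`. The sketch below reproduces that behavior against an in-memory database. The three-column table and the `set_status` helper are assumptions made for illustration; the real schema lives in `app/db.py`, which is not shown in this part of the diff.

```python
import sqlite3
from typing import Optional

# Hypothetical minimal table; the real work_items schema has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE work_items ("
    "raw_item_id INTEGER PRIMARY KEY, status TEXT, last_handled_at INTEGER)"
)
conn.execute("INSERT INTO work_items (raw_item_id, status) VALUES (1, 'new')")

UPDATE_SQL = """
UPDATE work_items
SET status = ?,
    last_handled_at = CASE WHEN ? != 'new' THEN ? ELSE last_handled_at END
WHERE raw_item_id = ?
"""


def set_status(status: str, now: int) -> Optional[int]:
    """Apply the update and return the stored last_handled_at."""
    conn.execute(UPDATE_SQL, (status, status, now, 1))
    row = conn.execute(
        "SELECT last_handled_at FROM work_items WHERE raw_item_id = 1"
    ).fetchone()
    return row[0]


first = set_status("new", 100)       # still unhandled, so the column stays NULL
second = set_status("replied", 200)  # stamped with the handling time
third = set_status("new", 300)       # moving back to 'new' keeps the earlier stamp
```

One consequence worth knowing: resetting an item to `new` does not clear its previous handling time.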
--- a/app/models.py
+++ b/app/models.py
@ -0,0 +1,20 @@
+from __future__ import annotations
+
+from dataclasses import dataclass
+from typing import Any
+
+
+@dataclass(frozen=True)
+class RawItem:
+    source: str
+    source_item_id: str
+    source_url: str
+    content_type: str
+    author_id: str | None
+    author_name: str | None
+    title: str | None
+    published_at: int | None
+    published_at_text: str | None
+    updated_at_source: int | None
+    content: str
+    raw: dict[str, Any]
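Review note on `app/models.py`: because `RawItem` is declared with `frozen=True`, callers cannot patch a field after construction, which is presumably why the manual-entry route translates `content` before building the item. A minimal sketch of deriving a modified copy with `dataclasses.replace`, using a hypothetical two-field stand-in rather than the full `RawItem`:

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Item:  # hypothetical stand-in with only two of RawItem's fields
    source: str
    content: str


original = Item(source="manual", content="hello")
translated = replace(original, content="你好")  # new instance, original untouched

try:
    original.content = "changed"  # type: ignore[misc]
    mutated = True
except FrozenInstanceError:
    mutated = False
```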
--- a/app/openrouter.py
+++ b/app/openrouter.py
@ -0,0 +1,238 @@
+from __future__ import annotations
+
+import json
+import re
+from typing import Any
+
+import httpx
+
+from .config import Settings
+from .models import RawItem
+
+
+DEFAULT_ANALYSIS = {
+    "sentiment": "neutral",
+    "is_positive": False,
+    "is_negative": False,
+    "has_actionable_feedback": False,
+    "feedback_types": [],
+    "reply_recommended": False,
+    "reply_priority": "none",
+    "reply_suggestion": "",
+    "summary": "",
+    "priority": "low",
+    "confidence": 0.0,
+    "reason": "",
+}
+
+
+TRANSLATION_SCHEMA = {
+    "type": "object",
+    "properties": {
+        "translated_content": {"type": "string"},
+    },
+    "required": ["translated_content"],
+    "additionalProperties": False,
+}
+
+
+SCHEMA = {
+    "type": "object",
+    "properties": {
+        "sentiment": {"type": "string", "enum": ["positive", "negative", "mixed", "neutral"]},
+        "is_positive": {"type": "boolean"},
+        "is_negative": {"type": "boolean"},
+        "has_actionable_feedback": {"type": "boolean"},
+        "feedback_types": {
+            "type": "array",
+            "items": {
+                "type": "string",
+                "enum": [
+                    "bug",
+                    "suggestion",
+                    "balance",
+                    "ui",
+                    "localization",
+                    "performance",
+                    "pricing",
+                    "content",
+                    "question",
+                    "other",
+                ],
+            },
+        },
+        "reply_recommended": {"type": "boolean"},
+        "reply_priority": {"type": "string", "enum": ["none", "low", "medium", "high"]},
+        "reply_suggestion": {"type": "string"},
+        "summary": {"type": "string"},
+        "priority": {"type": "string", "enum": ["low", "medium", "high"]},
+        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
+        "reason": {"type": "string"},
+    },
+    "required": [
+        "sentiment",
+        "is_positive",
+        "is_negative",
+        "has_actionable_feedback",
+        "feedback_types",
+        "reply_recommended",
+        "reply_priority",
+        "reply_suggestion",
+        "summary",
+        "priority",
+        "confidence",
+        "reason",
+    ],
+    "additionalProperties": False,
+}
+
+
+class OpenRouterClient:
+    def __init__(self, settings: Settings) -> None:
+        self.settings = settings
+        self.enabled = bool(settings.openrouter_api_key)
+        self.client = httpx.Client(timeout=60)
+
+    def close(self) -> None:
+        self.client.close()
+
+    def analyze(self, item: RawItem) -> dict[str, Any]:
+        if not self.enabled:
+            raise MissingOpenRouterKey("OPENROUTER_API_KEY is not configured")
+
+        payload = {
+            "model": self.settings.openrouter_model,
+            "messages": [
+                {
+                    "role": "system",
+                    "content": (
+                        "你是独立游戏《帝国幻想乡~TOHOTOPIA》的社区运营助手。"
+                        "请判断 Steam、Twitter/X 等社区内容的情绪、是否包含具体可处理反馈、"
+                        "以及是否建议制作人回复。summary、reason、reply_suggestion 必须使用中文。"
+                        "只输出符合 JSON Schema 的 JSON。"
+                    ),
+                },
+                {
+                    "role": "user",
+                    "content": self._prompt(item),
+                },
+            ],
+            "temperature": 0.1,
+            "response_format": {
+                "type": "json_schema",
+                "json_schema": {
+                    "name": "community_item_analysis",
+                    "strict": True,
+                    "schema": SCHEMA,
+                },
+            },
+        }
+        headers = {
+            "Authorization": f"Bearer {self.settings.openrouter_api_key}",
+            "HTTP-Referer": self.settings.openrouter_referer,
+            "X-Title": self.settings.openrouter_title,
+        }
+        response = self.client.post(
+            "https://openrouter.ai/api/v1/chat/completions",
+            headers=headers,
+            json=payload,
+        )
+        response.raise_for_status()
+        data = response.json()
+        content = data["choices"][0]["message"]["content"]
+        parsed = self._parse_json(content)
+        return self._normalize(parsed)
+
+    def translate_to_chinese(self, content: str) -> str:
+        if not self.enabled:
+            raise MissingOpenRouterKey("OPENROUTER_API_KEY is not configured")
+
+        payload = {
+            "model": self.settings.openrouter_model,
+            "messages": [
+                {
+                    "role": "system",
+                    "content": (
+                        "你是独立游戏社区运营翻译助手。"
+                        "把用户提供的社区内容准确翻译成简体中文，保留原意、语气、问题细节、游戏术语、链接和编号。"
+                        "不要添加解释。只输出符合 JSON Schema 的 JSON。"
+                    ),
+                },
+                {
+                    "role": "user",
+                    "content": content[:6000],
+                },
+            ],
+            "temperature": 0,
+            "response_format": {
+                "type": "json_schema",
+                "json_schema": {
+                    "name": "manual_item_translation",
+                    "strict": True,
+                    "schema": TRANSLATION_SCHEMA,
+                },
+            },
+        }
+        headers = {
+            "Authorization": f"Bearer {self.settings.openrouter_api_key}",
+            "HTTP-Referer": self.settings.openrouter_referer,
+            "X-Title": self.settings.openrouter_title,
+        }
+        response = self.client.post(
+            "https://openrouter.ai/api/v1/chat/completions",
+            headers=headers,
+            json=payload,
+        )
+        response.raise_for_status()
+        data = response.json()
+        parsed = self._parse_json(data["choices"][0]["message"]["content"])
+        translated = str(parsed.get("translated_content") or "").strip()
+        return translated or content
+
+    def _prompt(self, item: RawItem) -> str:
+        metadata = {
+            "source": item.source,
+            "content_type": item.content_type,
+            "source_url": item.source_url,
+            "author": item.author_name,
+            "title": item.title,
+            "steam_review_voted_up": item.raw.get("voted_up"),
+            "language": item.raw.get("language"),
+            "in_reply_to": item.raw.get("parent_url") or item.raw.get("in_reply_to"),
+            "likes": item.raw.get("likes"),
+            "replies": item.raw.get("replies"),
+            "retweets": item.raw.get("retweets"),
+            "views": item.raw.get("views"),
+        }
+        return (
+            "请分析以下社区内容。\n\n"
+            f"元数据：{json.dumps(metadata, ensure_ascii=False)}\n\n"
+            f"正文：\n{item.content[:6000]}"
+        )
+
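+    def _parse_json(self, content: str) -> dict[str, Any]:
+        """Parse model output as JSON, falling back to the outermost {...} span if extra text surrounds it."""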
+        try:
+            return json.loads(content)
+        except json.JSONDecodeError:
+            match = re.search(r"\{.*\}", content, re.S)
+            if not match:
+                raise
+            return json.loads(match.group(0))
+
+    def _normalize(self, value: dict[str, Any]) -> dict[str, Any]:
+        result = dict(DEFAULT_ANALYSIS)
+        result.update(value)
+        result["feedback_types"] = list(result.get("feedback_types") or [])
+        result["is_positive"] = bool(result.get("is_positive"))
+        result["is_negative"] = bool(result.get("is_negative"))
+        result["has_actionable_feedback"] = bool(result.get("has_actionable_feedback"))
+        result["reply_recommended"] = bool(result.get("reply_recommended"))
+        try:
+            result["confidence"] = float(result.get("confidence", 0.0))
+        except (TypeError, ValueError):
+            result["confidence"] = 0.0
+        return result
+
+
+class MissingOpenRouterKey(RuntimeError):
+    pass
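Review note on `app/openrouter.py`: the fallback in `_parse_json` matches greedily from the first `{` to the last `}`, so a reply that contains two separate objects would still fail to parse. With `strict` structured output this may never happen in practice. If it does, one possible alternative (a different technique, not what the diff does) is `json.JSONDecoder.raw_decode`, which stops at the end of the first complete value. The helper name `first_json_object` is hypothetical.

```python
import json
from typing import Any


def first_json_object(text: str) -> dict[str, Any]:
    """Decode the first JSON object found in `text`, ignoring anything after it."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode returns (value, end_index) and tolerates trailing text.
            value, _end = decoder.raw_decode(text, start)
            return value
        except json.JSONDecodeError:
            # Not a valid object at this brace; try the next one.
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found")


parsed = first_json_object('note {"a": 1} and then {"b": 2}')
```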
--- a/app/steam.py
+++ b/app/steam.py
@ -0,0 +1,321 @@
+from __future__ import annotations
+
+from hashlib import sha1
+import re
+import time
+from typing import Any, Iterable
+from urllib.parse import parse_qs, quote, urljoin, urlparse
+
+from bs4 import BeautifulSoup
+import httpx
+
+from .models import RawItem
+
+
+STEAM_STORE = "https://store.steampowered.com"
+STEAM_COMMUNITY = "https://steamcommunity.com"
+
+
+HEADERS = {
+    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
+    "(KHTML, like Gecko) Chrome/125.0 Safari/537.36",
+    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8,ja;q=0.7",
+}
+
+
+def content_hash(text: str) -> str:
+    return sha1(text.encode("utf-8", errors="ignore")).hexdigest()
+
+
+def _text(node: Any) -> str:
+    return node.get_text(separator="\n", strip=True) if node else ""
+
+
+def _abs_url(url: str) -> str:
+    return urljoin(STEAM_COMMUNITY, url)
+
+
+def _topic_id_from_url(url: str) -> str:
+    match = re.search(r"/discussions/[^/]+/(\d+)", url)
+    if match:
+        return match.group(1)
+    return content_hash(url)
+
+
+def _reply_id(comment: Any, topic_id: str, author: str, timestamp: str, text: str) -> str:
+    node_id = comment.get("id", "")
+    if node_id:
+        return node_id
+    data_id = comment.get("data-commentid", "")
+    if data_id:
+        return data_id
+    return f"{topic_id}:{content_hash(author + timestamp + text)}"
+
+
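+def parse_steam_time(text: str | None, now: int | None = None) -> int | None:
+    """Parse a Steam discussion timestamp into epoch seconds, or None if unrecognized.
+
+    Handles relative text ("3 小时以前", "2 hours ago") and Chinese absolute dates
+    with or without a year ("3 月 7 日 下午 4:52", "2025 年 8 月 9 日 下午 3:29").
+    Year-less dates are assumed to fall in the current local year.
+    """
+    if not text:
+        return None
+    value = text.strip()
+    # Compare against None so an explicit now=0 is honored.
+    now_ts = now if now is not None else int(time.time())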
+    relative = re.match(r"^(\d+)\s*(分钟|小时|天|minute|minutes|hour|hours|day|days)\s*(以前|ago)?$", value, re.I)
+    if relative:
+        amount = int(relative.group(1))
+        unit = relative.group(2).lower()
+        seconds = {
+            "分钟": 60,
+            "minute": 60,
+            "minutes": 60,
+            "小时": 3600,
+            "hour": 3600,
+            "hours": 3600,
+            "天": 86400,
+            "day": 86400,
+            "days": 86400,
+        }[unit]
+        return now_ts - amount * seconds
+
+    absolute = re.match(
+        r"^(\d{1,2})\s*月\s*(\d{1,2})\s*日\s*(上午|下午)\s*(\d{1,2}):(\d{2})$",
+        value,
+    )
+    if absolute:
+        current = time.localtime(now_ts)
+        return _make_ts(
+            current.tm_year,
+            int(absolute.group(1)),
+            int(absolute.group(2)),
+            absolute.group(3),
+            int(absolute.group(4)),
+            int(absolute.group(5)),
+        )
+
+    absolute_with_year = re.match(
+        r"^(\d{4})\s*年\s*(\d{1,2})\s*月\s*(\d{1,2})\s*日\s*(上午|下午)\s*(\d{1,2}):(\d{2})$",
+        value,
+    )
+    if absolute_with_year:
+        return _make_ts(
+            int(absolute_with_year.group(1)),
+            int(absolute_with_year.group(2)),
+            int(absolute_with_year.group(3)),
+            absolute_with_year.group(4),
+            int(absolute_with_year.group(5)),
+            int(absolute_with_year.group(6)),
+        )
+    return None
+
+
+def _make_ts(year: int, month: int, day: int, ampm: str, hour: int, minute: int) -> int:
+    if ampm == "下午" and hour != 12:
+        hour += 12
+    if ampm == "上午" and hour == 12:
+        hour = 0
+    return int(time.mktime((year, month, day, hour, minute, 0, -1, -1, -1)))
+
+
+class SteamClient:
+    def __init__(self, app_id: str) -> None:
+        self.app_id = app_id
+        self.client = httpx.Client(headers=HEADERS, timeout=30, follow_redirects=True)
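+        # Pre-set Steam's age-check cookie (a birth date in January 1988) so age-gated pages return content.
+        self.client.cookies.set("birthtime", "568022401", domain="steamcommunity.com")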
+
+    def close(self) -> None:
+        self.client.close()
+
+    def fetch_reviews(self, max_pages: int | None = None) -> list[RawItem]:
+        cursor = "*"
+        page = 0
+        items: list[RawItem] = []
+        while True:
+            params = {
+                "json": "1",
+                "num_per_page": "100",
+                "language": "all",
+                "filter": "recent",
+                "purchase_type": "all",
+                "cursor": cursor,
+            }
+            response = self.client.get(f"{STEAM_STORE}/appreviews/{self.app_id}", params=params)
+            response.raise_for_status()
+            data = response.json()
+            reviews = data.get("reviews") or []
+            if not reviews:
+                break
+            for review in reviews:
+                items.append(self._review_to_item(review))
+            new_cursor = data.get("cursor") or cursor
+            page += 1
+            if new_cursor == cursor:
+                break
+            if max_pages and page >= max_pages:
+                break
+            cursor = new_cursor
+            time.sleep(0.25)
+        return items
+
+    def fetch_discussions(self, full: bool, max_pages: int, time_limit_seconds: int) -> list[RawItem]:
+        started = time.monotonic()
+        topic_urls: list[str] = []
+        seen_urls: set[str] = set()
+        for page in range(1, max_pages + 1):
+            if time.monotonic() - started > time_limit_seconds:
+                break
+            url = f"{STEAM_COMMUNITY}/app/{self.app_id}/discussions/"
+            if page > 1:
+                url = f"{url}?fp={page}"
+            html = self._get_text(url)
+            urls = self._extract_topic_urls(html)
+            new_urls = [u for u in urls if u not in seen_urls]
+            if not new_urls:
+                break
+            topic_urls.extend(new_urls)
+            seen_urls.update(new_urls)
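+            # NOTE: this check only fires on the final iteration, which range() ends anyway,
+            # so `full` does not currently change how many listing pages are fetched.
+            if not full and page >= max_pages:
+                break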
+            time.sleep(0.25)
+
+        items: list[RawItem] = []
+        for url in topic_urls:
+            if time.monotonic() - started > time_limit_seconds:
+                break
+            items.extend(self.fetch_discussion_topic(url))
+            time.sleep(0.35)
+        return items
+
+    def fetch_discussion_topic(self, url: str) -> list[RawItem]:
+        html = self._get_text(url)
+        soup = BeautifulSoup(html, "html.parser")
+        topic_id = _topic_id_from_url(url)
+        title = _text(soup.select_one("div.topic")) or _text(soup.select_one(".forum_topic_name"))
+        items: list[RawItem] = []
+
+        op = soup.select_one(".forum_op")
+        if op:
+            author_el = op.select_one(".authorline a")
+            date_el = op.select_one(".date")
+            date_text = _text(date_el)
+            content_el = op.select_one(".content")
+            author = _text(author_el)
+            content = _text(content_el)
+            source_url = url
+            if content:
+                items.append(
+                    RawItem(
+                        source="steam_discussions",
+                        source_item_id=f"topic:{topic_id}",
+                        source_url=source_url,
+                        content_type="discussion_topic",
+                        author_id=self._steam_id_from_author(author_el),
+                        author_name=author,
+                        title=title,
+                        published_at=parse_steam_time(date_text),
+                        published_at_text=date_text,
+                        updated_at_source=None,
+                        content=content,
+                        raw={
+                            "topic_id": topic_id,
+                            "topic_url": url,
+                            "title": title,
+                            "author": author,
+                            "date": date_text,
+                            "content": content,
+                        },
+                    )
+                )
+
+        for comment in soup.select(".commentthread_comment"):
+            author_el = comment.select_one(".commentthread_author_link")
+            date_el = comment.select_one(".commentthread_comment_timestamp")
+            text_el = comment.select_one(".commentthread_comment_text")
+            text = _text(text_el)
+            if not text:
+                continue
+            author = _text(author_el)
+            timestamp = _text(date_el)
+            reply_id = _reply_id(comment, topic_id, author, timestamp, text)
+            reply_url = f"{url}#{reply_id}" if reply_id else url
+            items.append(
+                RawItem(
+                    source="steam_discussions",
+                    source_item_id=f"reply:{topic_id}:{reply_id}",
+                    source_url=reply_url,
+                    content_type="discussion_reply",
+                    author_id=self._steam_id_from_author(author_el),
+                    author_name=author,
+                    title=title,
+                    published_at=parse_steam_time(timestamp),
+                    published_at_text=timestamp,
+                    updated_at_source=None,
+                    content=text,
+                    raw={
+                        "topic_id": topic_id,
+                        "topic_url": url,
+                        "reply_id": reply_id,
+                        "reply_url": reply_url,
+                        "title": title,
+                        "reply_author": author,
+                        "reply_time_text": timestamp,
+                        "reply_content": text,
+                    },
+                )
+            )
+        return items
+
+    def _review_to_item(self, review: dict[str, Any]) -> RawItem:
+        author = review.get("author") or {}
+        steam_id = str(author.get("steamid") or "")
+        recommendation_id = str(review.get("recommendationid"))
+        source_url = f"{STEAM_COMMUNITY}/profiles/{steam_id}/recommended/{self.app_id}/"
+        raw = dict(review)
+        raw["source_url"] = source_url
+        return RawItem(
+            source="steam_reviews",
+            source_item_id=f"review:{recommendation_id}",
+            source_url=source_url,
+            content_type="review",
+            author_id=steam_id or None,
+            author_name=author.get("personaname"),
+            title=None,
+            published_at=review.get("timestamp_created"),
+            published_at_text=None,
+            updated_at_source=review.get("timestamp_updated"),
+            content=review.get("review") or "",
+            raw=raw,
+        )
+
+    def _get_text(self, url: str) -> str:
+        response = self.client.get(url)
+        response.raise_for_status()
+        response.encoding = "utf-8"
+        return response.text
+
+    def _extract_topic_urls(self, html: str) -> list[str]:
+        soup = BeautifulSoup(html, "html.parser")
+        urls: list[str] = []
+        for link in soup.select("a.forum_topic_overlay, a.forum_topic_name"):
+            href = link.get("href")
+            if not href:
+                continue
+            url = _abs_url(href).split("?")[0]
+            if f"/app/{self.app_id}/discussions/" in url and url not in urls:
+                urls.append(url)
+        return urls
+
+    def _steam_id_from_author(self, author_el: Any) -> str | None:
+        if not author_el:
+            return None
+        href = author_el.get("href") or ""
+        parsed = urlparse(href)
+        if "/profiles/" in parsed.path:
+            return parsed.path.rstrip("/").split("/")[-1]
+        if "/id/" in parsed.path:
+            return parsed.path.rstrip("/").split("/")[-1]
+        query = parse_qs(parsed.query)
+        steam_id = query.get("steamid")
+        return steam_id[0] if steam_id else None
+
+
+def iter_nonempty(items: Iterable[RawItem]) -> Iterable[RawItem]:
+    for item in items:
+        if item.content.strip():
+            yield item
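Review note on `app/steam.py`: `parse_steam_time` accepts a `now` argument, which makes the relative branch easy to check with fixed inputs. The sketch below is a simplified re-implementation of that branch only, under the assumption that plural English units can be folded into an optional `s`. The name `relative_to_epoch` is hypothetical and not part of the diff.

```python
import re
from typing import Optional

# Seconds per unit; covers the same units as the diff's relative-time pattern.
UNIT_SECONDS = {
    "分钟": 60, "minute": 60,
    "小时": 3600, "hour": 3600,
    "天": 86400, "day": 86400,
}
RELATIVE = re.compile(r"^(\d+)\s*(分钟|小时|天|minute|hour|day)s?\s*(以前|ago)?$", re.I)


def relative_to_epoch(text: str, now: int) -> Optional[int]:
    """Return `now` minus the parsed offset, or None when the text is not relative."""
    match = RELATIVE.match(text.strip())
    if not match:
        return None
    return now - int(match.group(1)) * UNIT_SECONDS[match.group(2).lower()]


three_hours = relative_to_epoch("3 小时以前", now=1_000_000)
two_days = relative_to_epoch("2 days ago", now=1_000_000)
absolute = relative_to_epoch("3 月 7 日 下午 4:52", now=1_000_000)
```

Passing a fixed `now` keeps the result independent of the wall clock and the local time zone, which the absolute-date branches cannot avoid because they go through `time.mktime`.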
--- a/app/sync.py
+++ b/app/sync.py
@ -0,0 +1,366 @@
+from __future__ import annotations
+
+from collections import Counter
+from hashlib import sha1
+import sqlite3
+import time
+from typing import Any
+
+from .config import Settings
+from .db import decode_json, encode_json, init_db
+from .models import RawItem
+from .openrouter import OpenRouterClient
+from .steam import SteamClient, iter_nonempty
+from .twitter import TwitterClient, TwitterScrapeOptions
+
+
+def _now() -> int:
+    return int(time.time())
+
+
+def _hash(text: str) -> str:
+    return sha1(text.encode("utf-8", errors="ignore")).hexdigest()
+
+
+def upsert_raw_item(conn: sqlite3.Connection, item: RawItem) -> tuple[int, bool]:
+    now = _now()
+    item_hash = _hash(item.content)
+    existing = conn.execute(
+        "SELECT id, content_hash FROM raw_items WHERE source = ? AND source_item_id = ?",
+        (item.source, item.source_item_id),
+    ).fetchone()
+    if existing:
+        if existing["content_hash"] != item_hash:
+            conn.execute(
+                """
+                UPDATE raw_items
+                SET source_url = ?, author_id = ?, author_name = ?, title = ?,
+                    published_at = ?, published_at_text = ?, updated_at_source = ?,
+                    content = ?, raw_json = ?, content_hash = ?, analysis_status = 'pending',
+                    collected_at = ?
+                WHERE id = ?
+                """,
+                (
+                    item.source_url,
+                    item.author_id,
+                    item.author_name,
+                    item.title,
+                    item.published_at,
+                    item.published_at_text,
+                    item.updated_at_source,
+                    item.content,
+                    encode_json(item.raw),
+                    item_hash,
+                    now,
+                    existing["id"],
+                ),
+            )
+        return int(existing["id"]), False
+
+    cursor = conn.execute(
+        """
+        INSERT INTO raw_items (
+            source, source_item_id, source_url, content_type, author_id, author_name,
+            title, published_at, published_at_text, collected_at, updated_at_source,
+            content, raw_json, content_hash, analysis_status
+        )
+        VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, 'pending')
+        """,
+        (
+            item.source,
+            item.source_item_id,
+            item.source_url,
+            item.content_type,
+            item.author_id,
+            item.author_name,
+            item.title,
+            item.published_at,
+            item.published_at_text,
+            now,
+            item.updated_at_source,
+            item.content,
+            encode_json(item.raw),
+            item_hash,
+        ),
+    )
+    raw_item_id = int(cursor.lastrowid)
+    conn.execute(
+        """
+        INSERT INTO work_items (raw_item_id, status, owner, notes, created_at, updated_at)
+        VALUES (?, 'new', '', '', ?, ?)
+        """,
+        (raw_item_id, now, now),
+    )
+    return raw_item_id, True
+
+
+def save_analysis(
+    conn: sqlite3.Connection,
+    raw_item_id: int,
+    model: str,
+    analysis: dict[str, Any],
+) -> None:
+    now = _now()
+    conn.execute(
+        """
+        INSERT INTO analysis_results (
+            raw_item_id, model, sentiment, is_positive, is_negative,
+            has_actionable_feedback, feedback_types, reply_recommended, reply_priority,
+            reply_suggestion, summary, priority, confidence, reason, model_json, analyzed_at
+        )
+        VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
+        ON CONFLICT(raw_item_id) DO UPDATE SET
+            model = excluded.model,
+            sentiment = excluded.sentiment,
+            is_positive = excluded.is_positive,
+            is_negative = excluded.is_negative,
+            has_actionable_feedback = excluded.has_actionable_feedback,
+            feedback_types = excluded.feedback_types,
+            reply_recommended = excluded.reply_recommended,
+            reply_priority = excluded.reply_priority,
+            reply_suggestion = excluded.reply_suggestion,
+            summary = excluded.summary,
+            priority = excluded.priority,
+            confidence = excluded.confidence,
+            reason = excluded.reason,
+            model_json = excluded.model_json,
+            analyzed_at = excluded.analyzed_at
+        """,
+        (
+            raw_item_id,
+            model,
+            analysis["sentiment"],
+            int(analysis["is_positive"]),
+            int(analysis["is_negative"]),
+            int(analysis["has_actionable_feedback"]),
+            encode_json(analysis["feedback_types"]),
+            int(analysis["reply_recommended"]),
+            analysis["reply_priority"],
+            analysis["reply_suggestion"],
+            analysis["summary"],
+            analysis["priority"],
+            analysis["confidence"],
+            analysis["reason"],
+            encode_json(analysis),
+            now,
+        ),
+    )
+    conn.execute("UPDATE raw_items SET analysis_status = 'done' WHERE id = ?", (raw_item_id,))
+
+
+def _twitter_high_watermark_ts(conn: sqlite3.Connection) -> int | None:
+    row = conn.execute(
+        """
+        SELECT MAX(COALESCE(published_at, collected_at)) AS watermark
+        FROM raw_items
+        WHERE source IN ('twitter_posts', 'twitter_replies')
+        """
+    ).fetchone()
+    if row and row["watermark"]:
+        return int(row["watermark"])
+    return None
+
+
+def _recent_twitter_post_urls(conn: sqlite3.Connection, limit: int) -> list[str]:
+    if limit <= 0:
+        return []
+    rows = conn.execute(
+        """
+        SELECT source_url
+        FROM raw_items
+        WHERE source = 'twitter_posts'
+        ORDER BY COALESCE(published_at, collected_at) DESC, collected_at DESC
+        LIMIT ?
+        """,
+        (limit,),
+    ).fetchall()
+    return [str(row["source_url"]) for row in rows if row["source_url"]]
+
+
+def _twitter_options(settings: Settings) -> TwitterScrapeOptions:
+    return TwitterScrapeOptions(
+        username=settings.twitter_username,
+        scraper_path=settings.twitter_scraper_path,
+        output_dir=settings.twitter_output_dir,
+        browser_provider=settings.twitter_browser_provider,
+        full_max_no_new=settings.twitter_full_max_no_new,
+        incremental_max_no_new=settings.twitter_incremental_max_no_new,
+        thread_max_no_new=settings.twitter_thread_max_no_new,
+        command_timeout_seconds=settings.twitter_command_timeout_seconds,
+        full_reply_post_limit=settings.twitter_full_reply_post_limit,
+        incremental_reply_parent_limit=settings.twitter_incremental_reply_parent_limit,
+    )
+
+
+def run_sync(
+    conn: sqlite3.Connection,
+    settings: Settings,
+    full: bool = False,
+    platforms: list[str] | None = None,
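+) -> dict[str, Any]:
+    """Fetch from the enabled platforms, upsert the items, and analyze new ones.
+
+    A Twitter failure is recorded and yields a 'partial' run; a Steam failure
+    aborts the run, which is then recorded as 'failed'.
+    """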
+    init_db(conn)
+    started = _now()
+    mode = "full" if full else "incremental"
+    run_id = conn.execute(
+        "INSERT INTO sync_runs (started_at, mode, status) VALUES (?, ?, 'running')",
+        (started, mode),
+    ).lastrowid
+    conn.commit()
+
+    stats: Counter[str] = Counter()
+    messages: list[str] = []
+    try:
+        enabled_platforms = platforms or ["steam", "twitter"]
+        if "twitter" in enabled_platforms and not settings.twitter_enabled:
+            stats["twitter_skipped"] += 1
+        raw_items: list[RawItem] = []
+        if "steam" in enabled_platforms:
+            steam = SteamClient(settings.app_id)
+            try:
+                review_pages = None if full else 2
+                review_items = steam.fetch_reviews(max_pages=review_pages)
+                discussion_pages = (
+                    settings.discussion_full_scan_max_pages
+                    if full
+                    else settings.discussion_incremental_max_pages
+                )
+                discussion_items = steam.fetch_discussions(
+                    full=full,
+                    max_pages=discussion_pages,
+                    time_limit_seconds=settings.full_scan_time_limit_seconds,
+                )
+                steam_items = list(iter_nonempty([*review_items, *discussion_items]))
+                raw_items.extend(steam_items)
+                stats["steam_fetched"] = len(steam_items)
+            finally:
+                steam.close()
+
+        if "twitter" in enabled_platforms and settings.twitter_enabled:
+            try:
+                since_ts = None if full else _twitter_high_watermark_ts(conn)
+                existing_urls = _recent_twitter_post_urls(
+                    conn,
+                    settings.twitter_incremental_reply_parent_limit,
+                )
+                twitter = TwitterClient(_twitter_options(settings))
+                twitter_items = twitter.fetch_items(
+                    full=full,
+                    since_ts=since_ts,
+                    existing_post_urls=existing_urls,
+                )
+                raw_items.extend(twitter_items)
+                stats["twitter_fetched"] = len(twitter_items)
+            except Exception as exc:  # noqa: BLE001 - keep Steam and old Twitter data intact
+                stats["twitter_errors"] += 1
+                stats[f"twitter_error:{type(exc).__name__}"] += 1
+                messages.append(f"twitter: {exc}")
+
+        stats["fetched"] = len(raw_items)
+        analyzer = OpenRouterClient(settings)
+        try:
+            for item in raw_items:
+                raw_item_id, inserted = upsert_raw_item(conn, item)
+                prefix = item.source.split("_", 1)[0]
+                stats["inserted" if inserted else "seen"] += 1
+                stats[f"{prefix}_{'inserted' if inserted else 'seen'}"] += 1
+                if inserted:
+                    try:
+                        analysis = analyzer.analyze(item)
+                        save_analysis(conn, raw_item_id, settings.openrouter_model, analysis)
+                        stats["analyzed"] += 1
+                    except Exception as exc:  # noqa: BLE001 - mark as error; analyze_pending retries it
+                        conn.execute(
+                            "UPDATE raw_items SET analysis_status = 'error' WHERE id = ?",
+                            (raw_item_id,),
+                        )
+                        stats["analysis_errors"] += 1
+                        stats[f"analysis_error:{type(exc).__name__}"] += 1
+                conn.commit()
+        finally:
+            analyzer.close()
+
+        finished = _now()
+        status = "partial" if messages else "success"
+        conn.execute(
+            """
+            UPDATE sync_runs
+            SET finished_at = ?, status = ?, message = ?, stats_json = ?
+            WHERE id = ?
+            """,
+            (finished, status, "\n".join(messages), encode_json(dict(stats)), run_id),
+        )
+        if status == "success":
+            conn.execute(
+                """
+                INSERT INTO sync_state (key, value, updated_at)
+                VALUES ('last_sync_mode', ?, ?)
+                ON CONFLICT(key) DO UPDATE SET value = excluded.value, updated_at = excluded.updated_at
+                """,
+                (mode, finished),
+            )
+        # Persist the final sync_runs / sync_state writes before returning.
+        conn.commit()
+        return dict(stats)
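+    except Exception as exc:
+        # Discard any half-written item, then persist the failure so the run
+        # is not left in 'running' state once the exception propagates.
+        conn.rollback()
+        finished = _now()
+        conn.execute(
+            """
+            UPDATE sync_runs
+            SET finished_at = ?, status = 'failed', message = ?, stats_json = ?
+            WHERE id = ?
+            """,
+            (finished, str(exc), encode_json(dict(stats)), run_id),
+        )
+        conn.commit()
+        raise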
+
+
+def analyze_pending(
+    conn: sqlite3.Connection,
+    settings: Settings,
+    limit: int = 50,
+    since_ts: int | None = None,
+) -> dict[str, Any]:
+    init_db(conn)
+    analyzer = OpenRouterClient(settings)
+    stats: Counter[str] = Counter()
+    try:
+        params: list[Any] = []
+        since_clause = ""
+        if since_ts is not None:
+            since_clause = "AND COALESCE(published_at, collected_at) >= ?"
+            params.append(since_ts)
+        params.append(limit)
+        rows = conn.execute(
+            f"""
+            SELECT * FROM raw_items
+            WHERE analysis_status IN ('pending', 'error')
+            {since_clause}
+            ORDER BY COALESCE(published_at, collected_at) DESC, collected_at DESC, id DESC
+            LIMIT ?
+            """,
+            params,
+        ).fetchall()
+        for row in rows:
+            item = RawItem(
+                source=row["source"],
+                source_item_id=row["source_item_id"],
+                source_url=row["source_url"],
+                content_type=row["content_type"],
+                author_id=row["author_id"],
+                author_name=row["author_name"],
+                title=row["title"],
+                published_at=row["published_at"],
+                published_at_text=row["published_at_text"],
+                updated_at_source=row["updated_at_source"],
+                content=row["content"],
+                raw=decode_json(row["raw_json"], {}),
+            )
+            try:
+                analysis = analyzer.analyze(item)
+                save_analysis(conn, int(row["id"]), settings.openrouter_model, analysis)
+                stats["analyzed"] += 1
+                conn.commit()
+            except Exception as exc:  # noqa: BLE001
+                stats["analysis_errors"] += 1
+                stats[f"analysis_error:{type(exc).__name__}"] += 1
+        return dict(stats)
+    finally:
+        analyzer.close()
--- a/app/twitter.py
+++ b/app/twitter.py
@ -0,0 +1,246 @@
+from __future__ import annotations
+
+from dataclasses import dataclass
+import calendar
+import json
+from pathlib import Path
+import re
+import subprocess
+import sys
+import time
+from typing import Any, Iterable
+
+from .models import RawItem
+
+
+TWITTER_EPOCH_FORMAT = "%a %b %d %H:%M:%S +0000 %Y"
+NORMALIZED_DATE_FORMAT = "%Y-%m-%d %H:%M:%S"
+
+
+@dataclass(frozen=True)
+class TwitterScrapeOptions:
+    username: str
+    scraper_path: Path
+    output_dir: Path
+    browser_provider: str
+    full_max_no_new: int
+    incremental_max_no_new: int
+    thread_max_no_new: int
+    command_timeout_seconds: int
+    full_reply_post_limit: int
+    incremental_reply_parent_limit: int
+
+
+def parse_twitter_time(value: str | None) -> int | None:
+    if not value:
+        return None
+    text = value.strip()
+    for fmt in (NORMALIZED_DATE_FORMAT, TWITTER_EPOCH_FORMAT):
+        try:
+            parsed = time.strptime(text, fmt)
+            return calendar.timegm(parsed)
+        except ValueError:
+            continue
+    return None
+
+
+def _author_from_url(url: str | None) -> str | None:
+    if not url:
+        return None
+    match = re.search(r"(?:x\.com|twitter\.com)/([^/?#]+)/status/\d+", url)
+    if not match:
+        return None
+    value = match.group(1)
+    return value if value and value.lower() != "i" else None
+
+
+def _tweet_id_from_item(item: dict[str, Any]) -> str | None:
+    value = item.get("id")
+    if value:
+        return str(value)
+    url = str(item.get("url") or "")
+    match = re.search(r"/status/(\d+)", url)
+    return match.group(1) if match else None
+
+
+def _tweet_url(username: str, tweet_id: str) -> str:
+    return f"https://x.com/{username}/status/{tweet_id}"
+
+
+def _is_original_post(item: dict[str, Any]) -> bool:
+    return not bool(item.get("is_retweet"))
+
+
+class TwitterClient:
+    def __init__(self, options: TwitterScrapeOptions) -> None:
+        self.options = options
+
+    def fetch_items(
+        self,
+        *,
+        full: bool,
+        since_ts: int | None,
+        existing_post_urls: Iterable[str] = (),
+    ) -> list[RawItem]:
+        run_dir = self._new_run_dir()
+        timeline = self._fetch_timeline(run_dir, full=full)
+        timeline_items = [
+            self._post_to_item(item)
+            for item in timeline
+            if self._include_by_time(item, since_ts)
+        ]
+
+        reply_parent_urls = self._reply_parent_urls(
+            timeline=timeline,
+            full=full,
+            existing_post_urls=existing_post_urls,
+        )
+        reply_items: list[RawItem] = []
+        for parent_url in reply_parent_urls:
+            thread = self._fetch_thread(run_dir, parent_url)
+            # main_tweet can be present but None, so fall back to {} before reading its id.
+            parent_id = str((thread.get("main_tweet") or {}).get("id") or self._id_from_url(parent_url) or "")
+            for reply in thread.get("replies") or []:
+                if self._include_by_time(reply, since_ts):
+                    reply_items.append(self._reply_to_item(reply, parent_id=parent_id, parent_url=parent_url))
+
+        return [item for item in [*timeline_items, *reply_items] if item.content.strip()]
+
+    def _new_run_dir(self) -> Path:
+        path = self.options.output_dir / time.strftime("%Y%m%d_%H%M%S")
+        path.mkdir(parents=True, exist_ok=True)
+        return path
+
+    def _fetch_timeline(self, run_dir: Path, *, full: bool) -> list[dict[str, Any]]:
+        max_no_new = self.options.full_max_no_new if full else self.options.incremental_max_no_new
+        self._run_scraper(self.options.username, run_dir, max_no_new=max_no_new)
+        path = run_dir / f"{self.options.username}_posts.json"
+        return self._read_json(path, expected="timeline posts")
+
+    def _fetch_thread(self, run_dir: Path, parent_url: str) -> dict[str, Any]:
+        tweet_id = self._id_from_url(parent_url)
+        if not tweet_id:
+            return {"main_tweet": None, "replies": [], "total_replies": 0}
+        self._run_scraper(parent_url, run_dir, max_no_new=self.options.thread_max_no_new)
+        path = run_dir / f"thread_{tweet_id}.json"
+        return self._read_json(path, expected=f"thread {tweet_id}")
+
+    def _run_scraper(self, target: str, run_dir: Path, *, max_no_new: int) -> None:
+        command = [
+            sys.executable,
+            str(self.options.scraper_path),
+            target,
+            "--max-no-new",
+            str(max_no_new),
+            "--output-dir",
+            str(run_dir),
+            "--browser-provider",
+            self.options.browser_provider,
+        ]
+        result = subprocess.run(
+            command,
+            cwd=Path.cwd(),
+            capture_output=True,
+            text=True,
+            encoding="utf-8",
+            errors="replace",
+            timeout=self.options.command_timeout_seconds,
+        )
+        output = "\n".join(part for part in [result.stdout, result.stderr] if part).strip()
+        if result.returncode != 0:
+            raise RuntimeError(f"Twitter scraper failed for {target}: {output[-1200:]}")
+        if "登录提示" in output or "未登录" in output or "login" in output.lower():
+            raise RuntimeError(
+                "Twitter scraper requires an authenticated X.com browser profile. "
+                "Run the configured social-media-scraper once with --keep-browser-open, "
+                "log in to X.com, then retry."
+            )
+
+    def _read_json(self, path: Path, *, expected: str) -> Any:
+        if not path.exists():
+            raise RuntimeError(f"Twitter scraper did not produce {expected}: {path}")
+        return json.loads(path.read_text(encoding="utf-8"))
+
+    def _reply_parent_urls(
+        self,
+        *,
+        timeline: list[dict[str, Any]],
+        full: bool,
+        existing_post_urls: Iterable[str],
+    ) -> list[str]:
+        urls: list[str] = []
+        for item in timeline:
+            tweet_id = _tweet_id_from_item(item)
+            url = item.get("url") or (_tweet_url(self.options.username, tweet_id) if tweet_id else "")
+            if url and _is_original_post(item):
+                urls.append(str(url))
+
+        if not full:
+            urls.extend(str(url) for url in existing_post_urls if url)
+
+        seen: set[str] = set()
+        unique_urls: list[str] = []
+        for url in urls:
+            if url not in seen:
+                seen.add(url)
+                unique_urls.append(url)
+
+        limit = self.options.full_reply_post_limit if full else self.options.incremental_reply_parent_limit
+        if limit > 0:
+            return unique_urls[:limit]
+        return unique_urls
+
+    def _post_to_item(self, item: dict[str, Any]) -> RawItem:
+        tweet_id = _tweet_id_from_item(item) or ""
+        url = item.get("url") or _tweet_url(self.options.username, tweet_id)
+        author = _author_from_url(str(url)) or self.options.username
+        raw = dict(item)
+        raw["source_url"] = url
+        return RawItem(
+            source="twitter_posts",
+            source_item_id=f"post:{tweet_id}",
+            source_url=str(url),
+            content_type="twitter_post",
+            author_id=author,
+            author_name=author,
+            title=None,
+            published_at=parse_twitter_time(item.get("date")),
+            published_at_text=item.get("date"),
+            updated_at_source=None,
+            content=str(item.get("text") or ""),
+            raw=raw,
+        )
+
+    def _reply_to_item(self, item: dict[str, Any], *, parent_id: str, parent_url: str) -> RawItem:
+        tweet_id = _tweet_id_from_item(item) or ""
+        url = item.get("url") or _tweet_url(_author_from_url(parent_url) or self.options.username, tweet_id)
+        author = _author_from_url(str(url)) or str(item.get("in_reply_to") or "")
+        raw = dict(item)
+        raw["parent_tweet_id"] = parent_id
+        raw["parent_url"] = parent_url
+        raw["source_url"] = url
+        return RawItem(
+            source="twitter_replies",
+            source_item_id=f"reply:{tweet_id}",
+            source_url=str(url),
+            content_type="twitter_reply",
+            author_id=author or None,
+            author_name=author or None,
+            title=f"Reply to {parent_id}" if parent_id else None,
+            published_at=parse_twitter_time(item.get("date")),
+            published_at_text=item.get("date"),
+            updated_at_source=None,
+            content=str(item.get("text") or ""),
+            raw=raw,
+        )
+
+    def _include_by_time(self, item: dict[str, Any], since_ts: int | None) -> bool:
+        if since_ts is None:
+            return True
+        published_at = parse_twitter_time(item.get("date"))
+        if published_at is None:
+            return True
+        return published_at >= since_ts
+
+    def _id_from_url(self, url: str) -> str | None:
+        match = re.search(r"/status/(\d+)", url)
+        return match.group(1) if match else None
--- a/requirements.txt
+++ b/requirements.txt
@ -0,0 +1,8 @@
+beautifulsoup4==4.12.3
+fastapi==0.115.6
+httpx==0.28.1
+python-multipart==0.0.20
+python-dotenv==1.0.1
+playwright==1.56.0
+requests==2.31.0
+uvicorn==0.34.0
--- a/任务/方案/steam社区监控一期计划.md
+++ b/任务/方案/steam社区监控一期计划.md
@ -0,0 +1,307 @@
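+# Steam Community Monitoring: Phase 1 Plan
+
+## Goal
+
+Phase 1 covers two Steam sources:
+
+1. Steam reviews
+2. Steam discussion forums: `https://steamcommunity.com/app/3774440/discussions`
+
+The system refreshes every 30 minutes. The first run fetches all Steam reviews, discussion topics, and discussion replies; later runs are incremental. Every new item is sent to OpenRouter's `deepseek/deepseek-v4-pro` for classification and an assessment of whether a reply is needed, and the dashboard displays, filters, highlights, and tracks the manual handling status of each item.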
+
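+## Confirmed Facts
+
+| Claim | Type | Evidence | Impact on decisions |
+|---|---|---|---|
+| The Steam reviews API currently returns data for AppID `3774440` | Current fact | A local request to `https://store.steampowered.com/appreviews/3774440?...` succeeded and returned `total_reviews=130`, `review_score_desc=Very Positive` | Phase 1 can integrate the reviews API directly |
+| The Steam discussions page is currently reachable | Current fact | A local request to `https://steamcommunity.com/app/3774440/discussions/` returned HTTP 200, and the page contains forum/topic content | Phase 1 can scrape discussions with HTTP + HTML parsing |
+| `deepseek/deepseek-v4-pro` is currently in the OpenRouter model list | Current fact | A local request to the OpenRouter models API returned the model, with support for `response_format` and `structured_outputs` | Phase 1 can be designed around structured JSON classification |
+| Steam review counts may differ depending on how they are measured | Empirical observation | User-level experience notes: Steam `appreviews` is affected by caching, language, purchase type, and indexing delay | Counts must not rely on a single request |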
+
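+## Phase 1 Scope
+
+### In scope
+
+- Refresh Steam reviews and Steam discussions every 30 minutes.
+- Full fetch on the first run; later runs fetch only new or updated content.
+- Deduplicate and store Steam reviews, discussion topics, and discussion replies separately.
+- Call the OpenRouter model to produce structured classification results.
+- The dashboard shows the list of reviews/topics/replies, classification results, original links, reply suggestions, and manual handling status.
+- Runs locally; the architecture leaves room for server deployment.
+
+### Out of scope for now
+
+- No communities other than Steam.
+- No complex account or permission system; an authentication scheme will be added before server deployment.
+- No automatic replies to players; only discovery, classification, and handling tracking.
+- No language filtering; content in every language goes through collection and model evaluation.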
+
+## Collection Flow
+
+### Steam reviews
+
+Use the Steam Store Reviews API:
+
+```text
+GET https://store.steampowered.com/appreviews/3774440
+```
+
+Base parameters:
+
+- `json=1`
+- `num_per_page=100`
+- `language=all`
+- `filter=recent`
+- `purchase_type=all`
+- `cursor=*` to start; subsequent pages use the cursor returned in the response
+
+Review dedup key:
+
+- `steam_review:{recommendationid}`
+
+Suggested review fields to keep:
+
+- `recommendationid`
+- `voted_up`
+- `review`
+- `language`
+- `timestamp_created`
+- `timestamp_updated`
+- `author.steamid`
+- `author.personaname`
+- `author.profile_url`
+- `author.playtime_forever`
+- `votes_up`
+- `comment_count`
+- `steam_purchase`
+- `received_for_free`
+- `source_url`
+
+The review link can be constructed from the author's `steamid` (the `recommendationid` itself does not appear in the URL):
+
+```text
+https://steamcommunity.com/profiles/{steamid}/recommended/3774440/#developer_response
+```
+
+If the user's profile URL is available, also keep the original `profile_url` as an auxiliary field for traceability.
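The request parameters, dedup key, and link template above can be put together as a small sketch. The helper names (`build_reviews_url`, `review_dedup_key`, `review_source_url`) are hypothetical and are not taken from `app/steam.py`; only the endpoint, parameters, and URL template come from this plan.

```python
from urllib.parse import urlencode

APP_ID = 3774440
REVIEWS_ENDPOINT = f"https://store.steampowered.com/appreviews/{APP_ID}"


def build_reviews_url(cursor: str = "*") -> str:
    """Build the URL for one review page; pass the cursor from the previous response."""
    params = {
        "json": 1,
        "num_per_page": 100,
        "language": "all",
        "filter": "recent",
        "purchase_type": "all",
        "cursor": cursor,
    }
    # urlencode percent-encodes the cursor, which may contain '+', '/' or '='.
    return f"{REVIEWS_ENDPOINT}?{urlencode(params)}"


def review_dedup_key(recommendation_id: str) -> str:
    """Dedup key in the form given by the plan."""
    return f"steam_review:{recommendation_id}"


def review_source_url(steam_id: str) -> str:
    """Link template from the plan, filled in with the author's steamid."""
    return (
        f"https://steamcommunity.com/profiles/{steam_id}"
        f"/recommended/{APP_ID}/#developer_response"
    )
```

Encoding the cursor through `urlencode` rather than string concatenation avoids a malformed query string when the cursor contains reserved characters.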
+
+### Steam discussions
+
+Request the discussions list page over HTTP:
+
+```text
+https://steamcommunity.com/app/3774440/discussions/
+```
+
+Pagination parameters:
+
+```text
+?fp=2
+?fp=3
+```
+
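+The first run fetches every reachable discussion page and every reachable reply. Later incremental refreshes start from the newest list page and page backwards until they reach a topic that already exists locally and has not been updated. If the Steam page does not allow update times to be determined reliably, the most recent few pages serve as the incremental window, and a manual full-rescan entry point is kept.
+
+Discussion dedup keys:
+
+- Topic: `steam_discussion_topic:{topic_id}`
+- Reply: `steam_discussion_reply:{topic_id}:{reply_id}`; if the page does not expose a stable reply id, use `topic_id + author + timestamp + content_hash`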
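One way the dedup keys above could be derived is sketched below. The function names, the `h` prefix on the fallback digest, and the field separator are assumptions made for illustration; the plan only fixes the key prefixes and the ingredients of the fallback.

```python
from hashlib import sha1
from typing import Optional


def topic_key(topic_id: str) -> str:
    return f"steam_discussion_topic:{topic_id}"


def reply_key(
    topic_id: str,
    reply_id: Optional[str],
    author: str,
    time_text: str,
    content: str,
) -> str:
    """Prefer the platform's reply id; otherwise fall back to a fingerprint."""
    if reply_id:
        return f"steam_discussion_reply:{topic_id}:{reply_id}"
    # Fallback: hash author + timestamp + content. The unit separator keeps
    # ("ab", "c") and ("a", "bc") from producing the same fingerprint.
    fingerprint = "\x1f".join([author, time_text, content])
    digest = sha1(fingerprint.encode("utf-8")).hexdigest()
    return f"steam_discussion_reply:{topic_id}:h{digest}"
```

A relative time text would change between fetches and make the fallback key unstable, so a parsed absolute timestamp is probably the safer input where one is available.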
+
+Suggested discussion fields to keep:
+
+- `topic_id`
+- `topic_url`
+- `title`
+- `author`
+- `published_at_text`
+- `content`
+- `reply_count`
+- `reply_author`
+- `reply_time_text`
+- `reply_content`
+- `reply_url`
+- `source_url`
+
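+## Data Model
+
+Start with SQLite to get the local version working; migrate to PostgreSQL when deploying to a server.
+
+The core tables can initially be reduced to three kinds:
+
+### `raw_items`
+
+Stores the original community content and its source information.
+
+Key fields: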
+
+- `id`
+- `source`
+- `source_item_id`
+- `source_url`
+- `content_type`
+- `author_id`
+- `author_name`
+- `published_at`
+- `collected_at`
+- `content`
+- `raw_json`
+- `content_hash`
+
+### `analysis_results`
+
+Stores the model's classification results.
+
+Key fields:
+
+- `raw_item_id`
+- `model`
+- `sentiment`
+- `is_positive`
+- `is_negative`
+- `has_actionable_feedback`
+- `feedback_types`
+- `reply_recommended`
+- `reply_priority`
+- `reply_suggestion`
+- `summary`
+- `priority`
+- `confidence`
+- `model_json`
+- `analyzed_at`
+
+### `work_items`
+
+Stores the manual handling status.
+
+Key fields:
+
+- `raw_item_id`
+- `status`
+- `owner`
+- `notes`
+- `last_handled_at`
+- `created_at`
+- `updated_at`
+
+Suggested status values:
+
+- `new`
+- `read`
+- `needs_reply`
+- `replied`
+- `needs_fix`
+- `archived`
+
+## OpenRouter Classification Scheme
+
+Model:
+
+```text
+deepseek/deepseek-v4-pro
+```
+
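+OpenRouter key:
+
+- Both the local machine and the server read it from `.env` / environment variables; it is never stored in plaintext in project files.
+- The user-level `auth.json` is only the source for migrating the key during local development, not a runtime dependency of the project.
+- Recommended variable name: `OPENROUTER_API_KEY`.
+
+Target output JSON: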
+
+```json
+{
+  "sentiment": "positive | negative | mixed | neutral",
+  "is_positive": true,
+  "is_negative": false,
+  "has_actionable_feedback": true,
+  "feedback_types": ["bug", "suggestion", "balance", "ui", "localization", "performance", "pricing", "content", "question", "other"],
+  "reply_recommended": true,
+  "reply_priority": "none | low | medium | high",
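+  "reply_suggestion": "How operations or development should reply; empty string when no reply is needed",
+  "summary": "One-sentence summary",
+  "priority": "low | medium | high",
+  "confidence": 0.0,
+  "reason": "Brief basis for the classification"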
+}
+```
+
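+Classification rules:
+
+- `is_positive` / `is_negative` correspond to the positive and negative review display the user asked for.
+- `has_actionable_feedback=true` means the item contains actionable information such as concrete suggestions, problem reports, bugs, balance, UI, translation, localization, performance, pricing, or amount of content.
+- `reply_recommended=true` means a manual reply or follow-up is recommended; high-priority items must be highlighted in the dashboard.
+- Both discussion topics and replies must go through model evaluation; evaluating only the opening post is not enough.
+- The `voted_up` flag of a Steam review is a strong signal but must not override the judgment from the text; for example, a recommended review may still contain specific complaints.
+- Every result must keep `source_url` so the dashboard can link straight to the original review or discussion post.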
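Before a model response is stored, it could be checked against the target JSON shape. The following is a minimal sketch, not the validation in `app/openrouter.py` (which is not shown here): `validate_analysis` is a hypothetical helper, and the 0 to 1 range for `confidence` is an assumption, since the plan only shows `0.0` as a sample value.

```python
from typing import Any

FEEDBACK_TYPES = {
    "bug", "suggestion", "balance", "ui", "localization",
    "performance", "pricing", "content", "question", "other",
}
ENUM_FIELDS = {
    "sentiment": {"positive", "negative", "mixed", "neutral"},
    "reply_priority": {"none", "low", "medium", "high"},
    "priority": {"low", "medium", "high"},
}
BOOL_FIELDS = ("is_positive", "is_negative", "has_actionable_feedback", "reply_recommended")
TEXT_FIELDS = ("reply_suggestion", "summary", "reason")


def validate_analysis(data: dict[str, Any]) -> list[str]:
    """Return the names of fields that are missing or malformed."""
    problems: list[str] = []
    for name, allowed in ENUM_FIELDS.items():
        value = data.get(name)
        if not (isinstance(value, str) and value in allowed):
            problems.append(name)
    for name in BOOL_FIELDS:
        if not isinstance(data.get(name), bool):
            problems.append(name)
    for name in TEXT_FIELDS:
        if not isinstance(data.get(name), str):
            problems.append(name)
    types = data.get("feedback_types")
    if not isinstance(types, list) or not all(
        isinstance(t, str) and t in FEEDBACK_TYPES for t in types
    ):
        problems.append("feedback_types")
    confidence = data.get("confidence")
    # bool is a subclass of int, so it is excluded explicitly.
    if (
        isinstance(confidence, bool)
        or not isinstance(confidence, (int, float))
        or not 0.0 <= confidence <= 1.0
    ):
        problems.append("confidence")
    return problems
```

An empty list means the response matches the shape; a non-empty list could be used to route the item to the pending-review state instead of saving a partial result.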
+
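+## Dashboard Phase 1 Pages
+
+The first version does not aim for sophistication; the focus is on how efficiently operations can process items.
+
+Suggested views:
+
+- Overview metrics: new items, unhandled items, negative reviews, items with concrete feedback, high-priority items, analyzed items, items pending re-analysis, last updated time.
+- Content list: source, content type, time, author, summary, sentiment, feedback types, priority, reply recommended, handling status, original link.
+- Filters: source, content type, sentiment, has concrete feedback, reply recommended, feedback type, handling status, time range.
+- Highlighting: posts/replies with `reply_recommended=true` or `priority=high`.
+- Detail: original text, model classification, reply suggestion, original link, notes, owner, status changes.
+- Sorting: reply-recommended items first; within each group, newest published first.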
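The sorting rule in the last bullet can be expressed as a sort key. This is only an in-memory sketch with made-up rows; the real dashboard presumably sorts in SQL, and `dashboard_sort_key` is a hypothetical name. The fallback to `collected_at` approximates the `COALESCE(published_at, collected_at)` ordering used in `app/sync.py`.

```python
from typing import Any


def dashboard_sort_key(row: dict[str, Any]) -> tuple[int, int]:
    # Rows where a reply is recommended sort first (group 0 before group 1).
    group = 0 if row.get("reply_recommended") else 1
    # Fall back to collected_at when published_at is missing; negate for newest-first.
    timestamp = row.get("published_at") or row.get("collected_at") or 0
    return (group, -int(timestamp))


rows = [
    {"id": 1, "reply_recommended": False, "published_at": 300},
    {"id": 2, "reply_recommended": True, "published_at": 100},
    {"id": 3, "reply_recommended": True, "published_at": None, "collected_at": 200},
]
ordered = sorted(rows, key=dashboard_sort_key)
print([row["id"] for row in ordered])  # [3, 2, 1]
```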
+
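+## Scheduling and Failure Handling
+
+Scheduling:
+
+- Run the collection job every 30 minutes by default.
+- The first run performs a full fetch; once it completes, record the sync cursor, seen topics, seen replies, and the review cursor/time watermark.
+- The first full run should support resuming from a checkpoint: write progress after each discussion list page, each topic detail page, and each review page, and resume from the latest progress after a failure.
+- Do not give the first full run a small page cap, since that would defeat the goal of fetching everything; use a safety guard instead, for example at most 2 hours of continuous running or at most 500 pages per run, and allow the next run to continue.
+- Locally, validate first with an in-app scheduler or manual command-line triggers; for server deployment, choose between a systemd timer, cron, or a queue worker.
+
+Failure handling:
+
+- Steam request failure: log the error, retry on the next run, and do not delete old data.
+- OpenRouter request failure: keep the raw item, mark it `analysis_pending`, and re-run on the next cycle or manually.
+- JSON parsing failure: save the model's raw output and move the item to a pending-review state.
+- Repeated collection: deduplicate by source item id and content hash.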
+
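+## Deployment Prerequisites
+
+Local MVP:
+
+- Local database
+- Local dashboard
+- OpenRouter API key read from `.env`
+- Manual or scheduled refresh
+
+Before server deployment, the following must be added:
+
+- Access authentication
+- Persistent database location and backup strategy
+- How background jobs are run
+- Logging and error alerting
+- OpenRouter call budget and rate control
+- Steam fetch frequency and User-Agent strategy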
+
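+## Settled Implementation Decisions
+
+- Key configuration: `.env` / environment variables, variable name `OPENROUTER_API_KEY`.
+- First fetch: full fetch with checkpoint resume; use run time or a high page threshold as the safety guard rather than replacing the full-fetch goal with a small page cap.
+- Owner field: designed as a free-text producer/handler field for a small team; no user account system for now.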
+
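+## Current Implementation Status
+
+- Python/FastAPI + SQLite MVP implemented.
+- Steam reviews API fetching implemented.
+- Steam discussion topic and reply fetching implemented.
+- OpenRouter `deepseek/deepseek-v4-pro` structured classification implemented.
+- Dashboard, manual sync, background 30-minute incremental sync, and handling status updates implemented.
+- LAN service listening on `0.0.0.0:8000` implemented.
+- Chinese time parsing for Steam discussions implemented, supporting `x 小时以前` ("x hours ago"), `3 月 7 日 下午 4:52` ("March 7, 4:52 PM"), and `2025 年 8 月 9 日 下午 3:29` ("August 9, 2025, 3:29 PM").
+- AI analysis re-run completed for the 209 items published after 2026-05-01.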
+
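+## Constraints for Adding Further Platforms
+
+- New platforms must not copy Steam-specific logic; add a new platform collector that outputs the unified `RawItem`.
+- New platforms keep reusing `raw_items`, `analysis_results`, and `work_items`.
+- Every platform must define a stable dedup key, the original link, publish-time parsing, and its first-run full and subsequent incremental strategies.
+- Platforms that need a logged-in session or browser automation get a separate plan and verification of current facts before they are wired into the sync pipeline.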
--- a/任务/方案/后续社区平台接入指南.md
+++ b/任务/方案/后续社区平台接入指南.md
@ -0,0 +1,67 @@
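+# Guide to Adding Further Community Platforms
+
+## Current Architecture
+
+The current MVP is Python/FastAPI + SQLite:
+
+- `app/main.py`: dashboard, manual sync, analysis re-run, handling status updates, background 30-minute incremental sync.
+- `app/steam.py`: collector for Steam reviews, discussion topics, and replies.
+- `app/sync.py`: unified sync flow, dedup on insert, model analysis calls, analysis re-run.
+- `app/openrouter.py`: OpenRouter `deepseek/deepseek-v4-pro` structured classification.
+- `app/db.py`: SQLite schema.
+- `app/models.py`: the unified raw content object `RawItem`.
+- `app/cli.py`: command-line entry point.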
+
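+## Unified Data Flow
+
+```text
+platform collector -> RawItem -> raw_items -> OpenRouter -> analysis_results -> work_items -> dashboard
+```
+
+New platforms should not change the dashboard data structures directly. Prefer having the platform collector output `RawItem` and reuse the existing sync and analysis flow.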
+
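+## RawItem Field Conventions
+
+A new platform collector must provide at least:
+
+- `source`: platform identifier, e.g. `steam_reviews`, `steam_discussions`.
+- `source_item_id`: stable dedup key; must include the platform and the content ID.
+- `source_url`: a link that leads back to the original content.
+- `content_type`: content type, e.g. `review`, `discussion_topic`, `discussion_reply`.
+- `author_id` / `author_name`: fill in whatever is available.
+- `title`: post title; empty if there is none.
+- `published_at`: Unix timestamp; provide it whenever possible.
+- `published_at_text`: the platform's original time text.
+- `updated_at_source`: the platform's original update time; empty if there is none.
+- `content`: the body text sent to the model for analysis.
+- `raw`: the platform's original fields as JSON.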
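To illustrate the field conventions above, here is a sketch of a collector mapping for an imaginary platform. The `RawItem` class below is a stand-in whose fields mirror how `app/sync.py` constructs the object; the real definition lives in `app/models.py` and may differ in types and defaults. The `example_forum` payload shape and `forum_post_to_item` are entirely hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class RawItem:
    """Stand-in for app.models.RawItem (assumed shape, for illustration only)."""
    source: str
    source_item_id: str
    source_url: str
    content_type: str
    author_id: Optional[str]
    author_name: Optional[str]
    title: Optional[str]
    published_at: Optional[int]
    published_at_text: Optional[str]
    updated_at_source: Optional[int]
    content: str
    raw: dict[str, Any] = field(default_factory=dict)


def forum_post_to_item(post: dict[str, Any]) -> RawItem:
    """Map one post from a hypothetical 'example_forum' payload to a RawItem."""
    post_id = str(post["id"])
    created = post.get("created")
    return RawItem(
        source="example_forum_posts",
        # The key carries both the platform and the content ID.
        source_item_id=f"example_forum_post:{post_id}",
        source_url=str(post.get("url") or ""),
        content_type="forum_post",
        author_id=post.get("author"),
        author_name=post.get("author"),
        title=post.get("title"),
        published_at=int(created) if created is not None else None,
        published_at_text=str(created) if created is not None else None,
        updated_at_source=None,
        content=str(post.get("body") or ""),
        raw=dict(post),  # keep the untouched platform fields for traceability
    )
```

Keeping the mapping in one small function per content type leaves the sync and analysis code unaware of platform details, which is the separation this guide asks for.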
+
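+## Steps for Adding a New Platform
+
+1. Verify current facts: whether the page/API is reachable, whether a logged-in session is required, whether there are rate limits.
+2. Define the content types and dedup keys.
+3. Implement the platform collector, returning `list[RawItem]`.
+4. Wire the collector into `app/sync.py`, making sure failures never delete old data.
+5. Run a small-sample smoke test: fetch, dedup, AI analysis, dashboard display.
+6. Then work out the first-run full strategy and the subsequent incremental strategy.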
+
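+## Known Implementation Decisions
+
+- AI model: OpenRouter `deepseek/deepseek-v4-pro`.
+- Key: `.env` / environment variable `OPENROUTER_API_KEY`.
+- Dashboard sorting: reply-recommended items first; within each group, newest published first.
+- Analysis re-run: at most 20 items per batch, ordered by `published_at/collected_at` from newest to oldest.
+- LAN service: `python -m uvicorn app.main:app --host 0.0.0.0 --port 8000`.
+- There is currently no login authentication; exposing the service on the LAN means anyone who can reach it can modify handling status.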
+
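+## Questions Every New Platform Plan Must Answer
+
+- What is the operational purpose of monitoring this platform?
+- Which content types are collected?
+- Is the first run a full fetch? What are the boundaries of "full"?
+- What condition stops subsequent incremental runs?
+- How is the original link generated?
+- Can the publish time be parsed? How are relative times handled?
+- Should replies and nested comment threads be collected?
+- Does it need a logged-in session, cookies, an API key, or browser automation?
+- How are failures, rate limits, and repeated collection handled?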