Optimize reconnect cooldown
parent e2d4231bd3
commit 25d89aae0b
@@ -24,8 +24,10 @@ Do not let one peer advance local game state unless the matching network send/re
- `Unity/Assets/Scripts/TH1_Logic/Core/Main.cs`
- `Unity/Assets/Scripts/TH1_Logic/Action/ActionLogic.cs`
- `Unity/Assets/Scripts/TH1_Instance/Timer.cs`
- `Unity/Assets/Scripts/TH1_Logic/Editor/NetworkStressEditorWindow.cs` when changing or interpreting network stress tests

For the current network contract summary, read `references/network-contract.md`.
For the one-click Steam P2P stress tool, report fields, and current healthy baseline, read `references/network-stress.md`.

## Workflow

@@ -67,6 +69,13 @@ For the current network contract summary, read `references/network-contract.md`.
   - If incoming `GameStart` or `ForceUpdate` validation fails, do not hide room UI and do not leave the game in `ForceUpdating`.
   - Restore previous game state if `NetResumeMatch` fails.

8. Preserve the stress-test path.
   - `NetworkStressMessage` is a diagnostics message and must not mutate gameplay state.
   - `GameNetReceiver` should ignore `P2PMsgType.NetworkStress`; the editor tool listens through `SimpleP2P.OnMessageReceivedEvent`.
   - Stress probes must still use `Lobby.BroadcastMessage` / `SendMessageToPeer` so they cover the real ordered queue and large-message chunking path.
   - Keep per-run `RunId` isolation so stale packets or ACKs from a previous test cannot pollute a new report.
   - The host should export after collecting client reports or after the configured report wait timeout; clients should retry report sends briefly after test completion.

## Checks Before Finishing

Run:
@@ -75,6 +84,12 @@ Run:
dotnet build Unity/Assembly-CSharp.csproj --no-restore
```

If editor tooling changed, also run:

```powershell
dotnet build Unity/Assembly-CSharp-Editor.csproj --no-restore
```

For network-heavy changes, inspect these risks explicitly:

- Could a critical broadcast partially enqueue?
@@ -82,6 +97,7 @@ For network-heavy changes, inspect these risks explicitly:
- Could `MapData` deserialize with missing core fields and still be used?
- Could a timer callback fire after the target UI/object state is gone?
- Could a retry loop or ordered gap wait forever?
- If stress tooling or queue throughput changed, does the one-click two-machine report meet the baseline in `references/network-stress.md`?

## What Not To Do

@@ -9,6 +9,7 @@ This reference summarizes the multiplayer contract after the May 2026 pre-releas
- Ordered envelopes are used for game payloads; do not bypass them for gameplay sync.
- Outgoing sequence is committed only after enqueue succeeds.
- Large payloads are chunked after ordered wrapping.
- Outgoing queue processing may send multiple messages per update within a per-frame message/byte budget, but must preserve FIFO per peer.
- Incoming large chunks must validate magic, version, message id, chunk index, chunk count, total length, and payload length.
- Large incoming messages and outgoing queues have per-peer and global byte budgets.
- Ordered gaps and large-message receives must time out and disconnect/clean state rather than wait forever.
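
The chunk validation rule in this list can be sketched as follows. This is only an illustration: `ChunkHeader`, its field types, the `ExpectedMagic` / `ExpectedVersion` values, and the two size limits are hypothetical, since the real wire format is not shown in this commit.

```csharp
using System;

// Hypothetical chunk header; the real field layout is not part of this commit.
public readonly struct ChunkHeader
{
    public readonly uint Magic;
    public readonly byte Version;
    public readonly uint MessageId;
    public readonly int ChunkIndex;
    public readonly int ChunkCount;
    public readonly int TotalLength;
    public readonly int PayloadLength;

    public ChunkHeader(uint magic, byte version, uint messageId,
        int chunkIndex, int chunkCount, int totalLength, int payloadLength)
    {
        Magic = magic;
        Version = version;
        MessageId = messageId;
        ChunkIndex = chunkIndex;
        ChunkCount = chunkCount;
        TotalLength = totalLength;
        PayloadLength = payloadLength;
    }
}

public static class ChunkValidator
{
    // All four constants are assumptions made for this sketch.
    public const uint ExpectedMagic = 0x54483143;
    public const byte ExpectedVersion = 1;
    public const int MaxChunkPayload = 32 * 1024;
    public const int MaxTotalLength = 4 * 1024 * 1024;

    // Returns false with a reason as soon as any header field is implausible.
    public static bool TryValidate(in ChunkHeader h, int actualPayloadLength, out string error)
    {
        error = string.Empty;
        if (h.Magic != ExpectedMagic) { error = "bad magic"; return false; }
        if (h.Version != ExpectedVersion) { error = "unsupported version"; return false; }
        if (h.MessageId == 0) { error = "missing message id"; return false; }
        if (h.ChunkCount <= 0) { error = "bad chunk count"; return false; }
        if (h.ChunkIndex < 0 || h.ChunkIndex >= h.ChunkCount) { error = "chunk index out of range"; return false; }
        if (h.TotalLength <= 0 || h.TotalLength > MaxTotalLength) { error = "bad total length"; return false; }
        if (h.PayloadLength <= 0 || h.PayloadLength > MaxChunkPayload) { error = "bad payload length"; return false; }
        if (h.PayloadLength != actualPayloadLength) { error = "payload length mismatch"; return false; }
        // The declared chunks must be able to hold the declared total.
        if ((long)h.ChunkCount * MaxChunkPayload < h.TotalLength) { error = "chunk count too small"; return false; }
        return true;
    }
}
```

Rejecting on the first failed field keeps the receiver from allocating reassembly buffers for a header it does not trust.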
@@ -33,11 +34,22 @@ This reference summarizes the multiplayer contract after the May 2026 pre-releas
## GameNetReceiver

- Wrap deserialization and dispatch in try/catch.
- Ignore `P2PMsgType.NetworkStress`; the editor stress tool consumes diagnostics packets through `SimpleP2P.OnMessageReceivedEvent` before gameplay dispatch.
- Validate incoming `GameStart` and `ForceUpdate` maps before applying.
- `NetStartGame` and `NetResumeMatch` return `bool`; UI should only close/hide after success.
- `ForceUpdate` should restore previous game state if resume fails.
- `MapConfirm` must guard null maps, missing actions, and null action payloads.
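
The "return `bool`, hide UI only on success, restore state on failure" rules above can be sketched like this. `GameSession`, `TryForceUpdate`, the `applyMap` callback, and the `Lobby` / `Playing` enum members are hypothetical stand-ins; only `GameState.ForceUpdating` is a name taken from the project.

```csharp
using System;

public enum GameState { Lobby, Playing, ForceUpdating }

// Hypothetical stand-in for the game logic; only state and UI visibility matter here.
public sealed class GameSession
{
    public GameState State { get; private set; } = GameState.Lobby;
    public bool RoomUiVisible { get; private set; } = true;

    // applyMap is assumed to validate and apply the incoming map,
    // returning false (or throwing) when the map cannot be used.
    public bool TryForceUpdate(Func<bool> applyMap)
    {
        var previous = State;
        State = GameState.ForceUpdating;

        bool ok;
        try { ok = applyMap(); }
        catch (Exception) { ok = false; }

        if (!ok)
        {
            State = previous; // never stay stuck in ForceUpdating
            return false;     // room UI stays visible
        }

        State = GameState.Playing;
        RoomUiVisible = false; // hide only after success
        return true;
    }
}
```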

## Network Stress Tool

- Tool path: `Unity/Assets/Scripts/TH1_Logic/Editor/NetworkStressEditorWindow.cs`.
- Diagnostics message: `NetworkStressMessage` / `P2PMsgType.NetworkStress`.
- The tool must send probes through `Lobby.BroadcastMessage` and `Lobby.SendMessageToPeer`, not through a raw side channel.
- The host starts one-click tests; clients auto-start on `ControlStart`.
- Reports are keyed by `RunId`; stale packets, ACKs, and reports from previous runs must be ignored.
- Host export should contain one report per lobby member when clients are reachable. If a report is late, host may update the same export file after receipt.
- Current default test is a one-minute flow: 50 seconds of traffic and up to 10 seconds of report collection.
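
A minimal sketch of the `RunId` isolation rule above, assuming a GUID run id. `StressPacket`, `StressKind`, and `StressRun` are hypothetical types; the real `NetworkStressMessage` layout is not shown in this commit.

```csharp
using System;

public enum StressKind { ControlStart, Probe, Ack, Report }

// Hypothetical diagnostics packet; the Guid run id is an assumption.
public sealed class StressPacket
{
    public Guid RunId;
    public StressKind Kind;
    public int Sequence;
}

public sealed class StressRun
{
    public Guid RunId { get; private set; }
    public int AcksReceived { get; private set; }
    public int StaleIgnored { get; private set; }

    // Starting a run resets every counter so nothing leaks across runs.
    public void Begin(Guid runId)
    {
        RunId = runId;
        AcksReceived = 0;
        StaleIgnored = 0;
    }

    // Returns false for packets that belong to a previous run.
    public bool Accept(StressPacket packet)
    {
        if (packet.RunId != RunId)
        {
            StaleIgnored++;
            return false;
        }
        if (packet.Kind == StressKind.Ack) AcksReceived++;
        return true;
    }
}
```

Counting the ignored packets, rather than dropping them silently, makes it easier to tell a clean run from one that overlapped with a previous test.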

## MapData And NetData

- `MapData.DeserializedMissingCriticalData` means the save/network map must not be used.

107
.codex/skills/th1-network-sync/references/network-stress.md
Normal file
@@ -0,0 +1,107 @@
# TH1 Network Stress Tool

This reference documents the Steam P2P stress-test tool and the current healthy baseline after the May 15, 2026 tuning pass.

## Tool

- Editor window: `Tools/Steam 网络压测工具` (Steam network stress test tool).
- Source: `Unity/Assets/Scripts/TH1_Logic/Editor/NetworkStressEditorWindow.cs`.
- Message type: `P2PMsgType.NetworkStress` / `NetworkStressMessage`.
- Runtime receiver: `GameNetReceiver` must ignore stress messages so diagnostics never mutate game state.
- Diagnostics listener: the editor window consumes stress messages from `SimpleP2P.OnMessageReceivedEvent`.

## One-Click Flow

1. Start Unity Play Mode on every machine.
2. Join every machine to the same Steam lobby.
3. Open `Tools/Steam 网络压测工具` on every machine.
4. Host clicks `开始 60 秒一键压测` (start the 60-second one-click stress test).
5. Clients auto-start when receiving `ControlStart`.
6. Traffic runs for 50 seconds.
7. Host waits up to 10 seconds for client reports.
8. Host exports `Unity/NetworkStressReports/TH1NetworkStress_*.json`.

Clients also export local reports and retry sending their report to host for a short period. Host reports should normally contain one `reports[]` entry per lobby member.

## Default Load

- `sendSeconds`: 50
- `totalSeconds`: 60
- `messagesPerSecond`: 10
- `smallPayloadBytes`: 512
- `largePayloadBytes`: 131072
- `largeEveryMessages`: 120
- `DropPercent`: 1
- `UnreliablePercent`: 2
- `JitterMaxMs`: 80

This is intentionally a sustainable smoke/stress mix, not a queue-destruction test. Earlier `30 msg/s` with `256 KB` large packets caused queue backlog and P95 latencies around 45 seconds; do not use that as the default acceptance load.
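
For a rough sense of what the default load amounts to per sender, the arithmetic can be sketched as below. It assumes that every `largeEveryMessages`-th message is a large one, and it ignores injected drops, envelope and chunk overhead, and ACK traffic; `LoadEstimate` is a hypothetical helper, not part of the tool.

```csharp
using System;

public static class LoadEstimate
{
    // Returns the nominal message counts and payload bytes for one sender.
    public static (int totalMessages, int largeMessages, long payloadBytes) Estimate(
        int sendSeconds, int messagesPerSecond,
        int smallPayloadBytes, int largePayloadBytes, int largeEveryMessages)
    {
        int total = sendSeconds * messagesPerSecond;
        int large = total / largeEveryMessages; // assumption: every Nth message is large
        long bytes = (long)(total - large) * smallPayloadBytes
                   + (long)large * largePayloadBytes;
        return (total, large, bytes);
    }
}
```

With the defaults listed above this gives 500 messages, 4 of them large, and 778,240 payload bytes, or about 15.6 KB/s averaged over the 50-second send window. Actual report counters can differ slightly from these nominal figures.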

## Report Interpretation

Healthy signs:

- `reports.length` equals current lobby member count.
- `sendFailed == 0`
- `sendFailureEvents == 0`
- `connectionErrors == 0`
- `payloadErrors == 0`
- `duplicates == 0`
- `outOfOrder == 0`
- ACK success rate should be about 99% or better under the default two-machine load.

Expected or acceptable signs:

- `sequenceGaps` may be non-zero because the tool injects application-level drops.
- A very small `ackTimeouts` count at the send/stop boundary can be acceptable if all failure/error counters are zero.
- `state` in embedded reports may reflect the local state at report generation time; prefer counters over display state when diagnosing transport health.

Risk signs:

- Any `payloadErrors` means data corruption or message reconstruction/deserialization mismatch.
- Any `duplicates` means duplicate delivery reached the diagnostics layer.
- Any unexpected `outOfOrder` under the ordered path needs investigation.
- Sustained `ackTimeouts` above roughly 1% of `sentEnqueued` means queue backlog, report timing, or connection degradation.
- `sendFailed`, `sendFailureEvents`, or `connectionErrors` should be treated as transport failures unless they were intentionally caused by leaving the lobby.
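
The healthy and risk signs above could be checked mechanically along these lines. `StressReport` is a hypothetical mirror of one `reports[]` entry, and defining the ACK success rate as `ackReceived / sentEnqueued` is an assumption; the exported JSON may compute it differently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mirror of one reports[] entry, using the counters named above.
public sealed class StressReport
{
    public int SentEnqueued, SendFailed, SendFailureEvents, ConnectionErrors;
    public int PayloadErrors, Duplicates, OutOfOrder, AckReceived, AckTimeouts;
}

public static class StressReportCheck
{
    // Returns one line per violated rule; an empty list means the report looks healthy.
    public static List<string> FindProblems(StressReport r)
    {
        var problems = new List<string>();

        void RequireZero(string name, int value)
        {
            if (value != 0) problems.Add($"{name} = {value}, expected 0");
        }

        RequireZero("sendFailed", r.SendFailed);
        RequireZero("sendFailureEvents", r.SendFailureEvents);
        RequireZero("connectionErrors", r.ConnectionErrors);
        RequireZero("payloadErrors", r.PayloadErrors);
        RequireZero("duplicates", r.Duplicates);
        RequireZero("outOfOrder", r.OutOfOrder);

        if (r.SentEnqueued > 0)
        {
            // Assumption: rates are measured against sentEnqueued.
            double ackRate = (double)r.AckReceived / r.SentEnqueued;
            double timeoutRate = (double)r.AckTimeouts / r.SentEnqueued;
            if (ackRate < 0.99) problems.Add("ACK success rate below 99%");
            if (timeoutRate > 0.01) problems.Add("ackTimeouts above 1% of sentEnqueued");
        }
        return problems;
    }
}
```

`sequenceGaps` is deliberately left out of the check because the tool injects application-level drops.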

## May 15, 2026 Two-Machine Baseline

Latest healthy report: `Unity/NetworkStressReports/TH1NetworkStress_20260515_174728.json`.

Host-side report:

- `sentAttempted`: 494
- `sentEnqueued`: 494
- `sendFailed`: 0
- `connectionErrors`: 0
- `payloadErrors`: 0
- `duplicates`: 0
- `outOfOrder`: 0
- `ackReceived`: 493
- `ackTimeouts`: 1
- `latencyAvgMs`: about 942
- `latencyP95Ms`: about 6202

Client-side report:

- `sentAttempted`: 494
- `sentEnqueued`: 494
- `sendFailed`: 0
- `connectionErrors`: 0
- `payloadErrors`: 0
- `duplicates`: 0
- `outOfOrder`: 0
- `ackReceived`: 493
- `ackTimeouts`: 1
- `latencyAvgMs`: about 938
- `latencyP95Ms`: about 6183

Conclusion for that run: the transport path was healthy under the default one-click load. There was no evidence of corruption, duplicate delivery, ordered-path reordering, send failure, or connection failure. The single ACK timeout per side is acceptable as a boundary artifact unless it grows in later runs.

## Tuning Notes

- Stress probes must use `Lobby.BroadcastMessage` / `SendMessageToPeer` so they cover real lobby preflight, ordered wrapping, large chunking, queue budgets, and failure events.
- Do not use raw Steam send paths for stress probes unless explicitly testing an invite/non-game side channel.
- `SimpleP2P.ProcessOutgoingMessageQueue` may batch messages per update, but preserve per-peer FIFO and budget limits.
- Reports share the same P2P path as normal diagnostics; keep the post-traffic report wait and client report retry behavior.
- Use `RunId` on every control/probe/ACK/report message; ignore packets from stale runs.
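
The batching note above ("batch per update, preserve per-peer FIFO and budget limits") can be illustrated with a small queue. This is not the real `SimpleP2P.ProcessOutgoingMessageQueue`; `OutgoingQueue`, `Drain`, and both budget parameters are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public sealed class OutgoingQueue
{
    private readonly Dictionary<ulong, Queue<byte[]>> _perPeer =
        new Dictionary<ulong, Queue<byte[]>>();

    public void Enqueue(ulong peerId, byte[] payload)
    {
        if (!_perPeer.TryGetValue(peerId, out var queue))
        {
            queue = new Queue<byte[]>();
            _perPeer[peerId] = queue;
        }
        queue.Enqueue(payload);
    }

    // Sends messages until the per-update message or byte budget is used up.
    // Only the head of each peer queue is ever sent, so per-peer order is kept.
    public int Drain(int maxMessages, int maxBytes, Action<ulong, byte[]> send)
    {
        int sent = 0;
        int bytes = 0;
        foreach (var pair in _perPeer)
        {
            var queue = pair.Value;
            while (queue.Count > 0)
            {
                var next = queue.Peek();
                // The first message of an update always goes out, so one
                // oversized payload cannot stall the queue forever.
                bool overBudget = sent >= maxMessages
                    || (sent > 0 && bytes + next.Length > maxBytes);
                if (overBudget) return sent;

                send(pair.Key, queue.Dequeue());
                sent++;
                bytes += next.Length;
            }
        }
        return sent;
    }
}
```

The sketch always visits peers in the same order, which a real implementation would likely replace with some rotation so that one busy peer cannot starve the others.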
@@ -22,6 +22,8 @@ namespace TH1_Logic.Steam
    {
        public static GameNetSender Instance { get; } = new GameNetSender();
        private static float RecordTime;
        private const float RequestForceUpdateCooldown = 5f;
        private float _lastRequestForceUpdateTime = -RequestForceUpdateCooldown;

        // Send a message to the host
        public bool SendMessage(BaseMessage message)
@@ -271,6 +273,14 @@ namespace TH1_Logic.Steam
        public void SendRequestForceUpdate()
        {
            if (Main.Instance.GameLogic.GetCurState() == GameState.ForceUpdating) return;

            var now = UnityEngine.Time.time;
            if (now - _lastRequestForceUpdateTime < RequestForceUpdateCooldown)
            {
                LogSystem.LogWarning($"客户端请求重连冷却中: SendRequestForceUpdate, remain={RequestForceUpdateCooldown - (now - _lastRequestForceUpdateTime):F1}s");
                return;
            }
            _lastRequestForceUpdateTime = now;

            LogSystem.LogError($"客户端请求重连: SendRequestForceUpdate");
            var data = new RequestForceUpdateMessage();
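
The cooldown added in `SendRequestForceUpdate` can be isolated into a small gate that takes the current time as a parameter, which makes the 5-second rule testable outside Unity. `CooldownGate` and `TryAcquire` are hypothetical names; the commit itself keeps the logic inline and reads `UnityEngine.Time.time`.

```csharp
using System;

public sealed class CooldownGate
{
    private readonly float _cooldownSeconds;
    private float _lastAcceptedTime;

    public CooldownGate(float cooldownSeconds)
    {
        _cooldownSeconds = cooldownSeconds;
        // Same trick as the commit: start one cooldown in the past
        // so the very first request at time 0 is allowed.
        _lastAcceptedTime = -cooldownSeconds;
    }

    // Returns true and records the time if the cooldown has elapsed;
    // otherwise reports how many seconds are left.
    public bool TryAcquire(float now, out float remainingSeconds)
    {
        float elapsed = now - _lastAcceptedTime;
        if (elapsed < _cooldownSeconds)
        {
            remainingSeconds = _cooldownSeconds - elapsed;
            return false;
        }
        _lastAcceptedTime = now;
        remainingSeconds = 0f;
        return true;
    }
}
```

A caller in Unity would pass a clock value such as `UnityEngine.Time.time`. Since `Time.time` is scaled game time and does not advance while `Time.timeScale` is 0, `Time.unscaledTime` may be worth considering if the cooldown is meant to track real elapsed time.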