Per-function matching workflow

This is the loop a contributor (human or agent) runs when picking up one row of config/ffxivgame.yaml.

0. Prerequisites

  • make split has run (asm + symbol map generated, Phase 1).
  • tools/cl-wine.sh is configured and the Rosetta Stone function matches (Phase 2).
  • objdiff is installed and configured.

1. Claim the function

Reserve the VA before you start so two contributors don't match the same function. Post a comment on the coordination issue:

/claim FUN_004a1230 ffxivgame

Wait for the bot to reply granted (or extended, if it's already yours). The VA token may be FUN_<va>, 0x<va>, or bare hex; the binary stem defaults to ffxivgame. The full lifecycle — 60h lease, heartbeat, PR pinning, sweeper expiry, merge-time auto-release — is in claim-protocol.md.

Do not claim a function by editing YAML. config/<bin>.yaml is gitignored and regenerated by make split, so a status: wip edit there is not a durable claim and is lost on the next split. The /claim comment is the only claim mechanism.
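
The three token spellings accepted by /claim all name the same address. A minimal sketch of that normalisation, assuming hypothetical helpers parse_va_token and parse_claim (these are illustrations, not the bot's actual parser) and assuming a VA is at most 8 hex digits:

```python
import re

# Hypothetical pattern: FUN_<va>, 0x<va>, or bare hex, as described above.
_VA_TOKEN = re.compile(r"^(?:FUN_|0x)?([0-9a-fA-F]{1,8})$")


def parse_va_token(token: str) -> int:
    """Return the virtual address named by a /claim VA token."""
    match = _VA_TOKEN.match(token.strip())
    if match is None:
        raise ValueError(f"not a VA token: {token!r}")
    return int(match.group(1), 16)


def parse_claim(comment: str, default_stem: str = "ffxivgame"):
    """Split '/claim <va> [stem]' into (va, stem); the stem is optional."""
    parts = comment.split()
    if len(parts) not in (2, 3) or parts[0] != "/claim":
        raise ValueError(f"not a /claim comment: {comment!r}")
    stem = parts[2] if len(parts) == 3 else default_stem
    return parse_va_token(parts[1]), stem
```

With this sketch, /claim FUN_004a1230 ffxivgame, /claim 0x004a1230 and /claim 004a1230 all resolve to the same (address, stem) pair.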

Each row of the (locally regenerated) config/ffxivgame.yaml work pool still tells you what to match:

- rva: 0x004a1230
  end: 0x004a12a0
  size: 0x70
  module: net/blowfish
  symbol: Blowfish::Init
  type: matching                # matching | functional | middleware-crt | ...
  status: unmatched              # unmatched | matched | functional
  owner: null
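
The address fields in a row are redundant, which makes a quick sanity check possible before you start. A small sketch, assuming the row has been loaded into a dict with integer address fields and that end is exclusive (both are assumptions; check_row is a hypothetical helper, not part of the repo):

```python
# The example row above, as it might look once loaded from YAML
# (assumption: hex scalars arrive as ints, null as None).
row = {
    "rva": 0x004A1230,
    "end": 0x004A12A0,
    "size": 0x70,
    "module": "net/blowfish",
    "symbol": "Blowfish::Init",
    "type": "matching",
    "status": "unmatched",
    "owner": None,
}


def check_row(row: dict) -> None:
    """Raise ValueError if the row's address fields disagree."""
    if row["end"] <= row["rva"]:
        raise ValueError("end must be greater than rva")
    if row["end"] - row["rva"] != row["size"]:
        raise ValueError("size does not equal end - rva")


check_row(row)  # passes: 0x004a12a0 - 0x004a1230 == 0x70
```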

2. Read the disassembly

asm/ffxivgame/004a1230_Blowfish__Init.s is the per-function dump written by tools/ghidra_scripts/dump_functions.py. It's already RVA-rebased and labeled with whatever symbols the seed sources have provided.
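
To locate the dump for a given row, the path can be derived from the row's address and symbol. This sketch infers the naming pattern from the single example above (8 lowercase hex digits, :: replaced by __); asm_path is a hypothetical helper, and the real dump script may sanitise other characters differently:

```python
def asm_path(stem: str, rva: int, symbol: str) -> str:
    """Guess the per-function dump path from the binary stem, address and symbol."""
    safe_symbol = symbol.replace("::", "__")  # assumption: only '::' is rewritten
    return f"asm/{stem}/{rva:08x}_{safe_symbol}.s"
```

For the example row, asm_path("ffxivgame", 0x004a1230, "Blowfish::Init") yields asm/ffxivgame/004a1230_Blowfish__Init.s.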

Quick reads:

  • Calling convention: __cdecl (caller cleans), __stdcall (callee cleans, ret 0xN), __fastcall (ecx/edx pre-loaded), or member function __thiscall (this in ecx).
  • Local stack frame size: from the sub esp, 0xN in the prologue.
  • Return type guess: from how eax is used at ret.
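
The first two reads can be pulled mechanically from the listing text. A rough sketch, assuming Intel-syntax mnemonics like those quoted above; quick_reads is a hypothetical helper and the listing is invented for illustration, not the real Blowfish::Init:

```python
import re

_NUM = r"(0x[0-9a-f]+|\d+)"
_SUB_ESP = re.compile(rf"\bsub\s+esp\s*,\s*{_NUM}", re.IGNORECASE)
_RET = re.compile(rf"\bretn?\b[ \t]*{_NUM}?", re.IGNORECASE)


def _to_int(text: str) -> int:
    """Parse a hex (0x...) or decimal immediate."""
    return int(text, 16) if text.lower().startswith("0x") else int(text, 10)


def quick_reads(asm_text: str) -> dict:
    """Extract the local frame size and the bytes popped by ret, if any."""
    sub = _SUB_ESP.search(asm_text)
    frame_size = _to_int(sub.group(1)) if sub else 0
    # A function may contain several ret instructions; take the largest operand.
    pops = {_to_int(m.group(1)) if m.group(1) else 0 for m in _RET.finditer(asm_text)}
    callee_pops = max(pops, default=0)
    hint = "callee cleans the stack" if callee_pops else "__cdecl, or no stack arguments"
    return {"frame_size": frame_size, "callee_pops": callee_pops, "hint": hint}


# Invented listing: 0x10 bytes of locals, 8 bytes of arguments popped by the callee.
listing = """\
push ebp
mov ebp, esp
sub esp, 0x10
mov esp, ebp
pop ebp
ret 0x8
"""
result = quick_reads(listing)  # frame_size 16, callee_pops 8
```

A plain ret is only a hint: a __thiscall or __fastcall function whose arguments all fit in registers also ends with ret and no operand, so check how ecx and edx are used before settling on __cdecl.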

3. Pull Ghidra's pseudo-C

Open the Ghidra project (build/ghidra/ffxivgame.gpr) and copy the function's decompiled view into a scratch buffer. Ghidra's output is a strong starting point, but it never compiles to matching bytes as-is; expect to rewrite it.

4. Write the C/C++

Create src/ffxivgame/<module>/<symbol>.cpp, one function per file in the early days for clean per-function diffing. Add the AGPL header and the #includes. Replace iVar1 / local_4 with meaningful names. Replace integer constants with named enums where possible (e.g. opcodes defined in include/net/opcodes.h).

For matching modules, pick the simplest plausible C. Small surface differences such as if (x) { y; } vs if (x) y; can affect codegen, but more importantly the structure (loop type, guard pattern, return style) maps to specific codegen idioms. When in doubt, mirror the Ghidra structure literally; refactor for readability after it matches.

5. Build the function

make src/ffxivgame/net/blowfish/Init.obj

The Makefile invokes tools/cl-wine.sh with the locked MSVC_FLAGS and produces a single .obj per .cpp.

6. Diff

For matching functions:

make diff FUNC=Blowfish::Init

tools/compare.py invokes objdiff with the original .text slice for that RVA range against the new .obj. Output: per-line diff + a one-line OK/PARTIAL/MISMATCH verdict.

For functional functions:

make test FUNC=Blowfish::Init

Runs tests/net/blowfish/init_test.cpp (a small main that loads a known input and asserts the output bytes). The behavioural fixture either comes from a packet capture (captures/) or is hand-written from a Project Meteor reference.

7. Iterate

If matching fails, work through the canonical bag of tricks (in rough order of how often each is the cause):

| Symptom | Fix |
| --- | --- |
| Wrong register allocation | Reorder local declarations; MSVC allocates in source order. |
| Off-by-one stack frame | Add a dead local of the right type; sometimes the original has a temp that the optimiser leaves materialised. |
| Branch direction flipped | Negate the condition: if (x) A; else B; → if (!x) B; else A;. MSVC emits the first arm's branch unconditionally and the second arm with a forward jump. |
| Missing __stdcall / __cdecl mismatch | Check the calling convention against the epilogue's ret N. |
| Member fn looks __cdecl | Should be __thiscall. Use a class member declaration. |
| FP code mismatched | MSVC 2005 uses x87, not SSE2, by default. Don't pass /arch:SSE2. |
| if (a && b) vs if (a) if (b) | Both are valid; MSVC's order-of-evaluation lowering can pick either. Try the alternative. |
| for vs while | Same loop body, different prologue. Try both. |
| Switch jump table | MSVC builds a jump table at >=4 cases, dense by default. Add cases / reorder until the table layout matches. |
| String literal positions | If .rdata strings are coming out at different offsets, pool them with __declspec(selectany) or check /GF (string pooling). |
| __security_cookie dropouts | The function was built with /GS but your version has no buffer big enough to trigger the cookie. Add a char buf[5] local; /GS triggers cookies for any local array of size 5+ bytes. |
| Tail call missing | MSVC 2005 doesn't tail-call by default; use __forceinline on the callee or the if(...) return f(); form. |

8. Commit

One commit per function in the early phases. Subject:

decomp: match Blowfish::Init @0x004a1230

Body: brief notes on which compiler flag combo / refactor was needed, any unusual idioms (helps the next contributor recognise the same shape).

For functional decomps:

decomp: functional ComputeDamage @0x008c5a40

Behavioural fixture: tests/battle/compute_damage_test.cpp asserts
against three damage samples drawn from ffxiv_youtube_atlas_context.md
(Plumage 47-51, Cure +210-+240, Chaos Thrust 89-104).
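
Both subject lines follow one pattern, so they can be generated from the row rather than typed by hand. A minimal sketch, assuming a hypothetical commit_subject helper keyed on the row's type field (only the two types shown above are handled):

```python
def commit_subject(kind: str, symbol: str, va: int) -> str:
    """Build the commit subject for a matching or functional decomp."""
    verbs = {"matching": "match", "functional": "functional"}
    if kind not in verbs:
        raise ValueError(f"no subject convention known for type {kind!r}")
    return f"decomp: {verbs[kind]} {symbol} @0x{va:08x}"
```

For example, commit_subject("matching", "Blowfish::Init", 0x004a1230) gives decomp: match Blowfish::Init @0x004a1230.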

9. Open the PR and ship

Open a PR from your fork into develop (title as in §8). The PR adds your single src/<bin>/_rosetta/FUN_<va>.cpp file. You do not hand-edit any status or progress file:

  • Opening the PR pins your claim (the lease stops expiring) via claim-pr.yml.
  • Merging it auto-releases the claim (reconcile.yml → tools/claim.py release-solved) and regenerates the progress regions in README.md / docs/decomp-status.md / PLAN.md from the committed _rosetta tree (make reconcile + make update-docs).
  • Closing it un-merged frees the VA back to the pool.

The committed _rosetta tree is the solved set — there is no status: matched to flip. The work pool's status: column is no longer the source of truth; see claim-protocol.md and the README's "Headline numbers" note.

