Execution systems

Production trading systems are made from boring interfaces under unusual stress: gateways, logs, replayability, engine equivalence, failure modes, and decisions that need to be auditable later.

System design

Prefer explicit contracts.

Gateway API

Strategies should express intent through a small, regular surface.

Event log

Replay should show what the strategy requested and what the simulated venue accepted.

Engine equivalence

Compiled speed is useful only when it preserves observable behavior.

Implementation trail

Every systems claim should point to a contract.

The bridge is explicit: strategy intent goes through the gateway API, replay records the execution path, alternative engines must pass the equivalence suite, and releases should prove the wheel tells the same story as the source tree.

Concrete references

Docs, tests, release gates.

Architecture

System contracts

Replay and gateway architecture
Strategy gateway API

Equivalence

Engine behavior

Execution engine design
Execution equivalence tests

Release gates

Shipping discipline

Release checklist
CI smoke install

In depth

The gateway is the recording boundary.

Production trading systems rarely look the way strategy writeups assume. Strategies arrive as research code. Engineers arrive expecting microservices. The actual artifact is somewhere in between, and the design choices that matter most are the unglamorous ones: what the gateway API looks like, what the event log records, what gets versioned, and what is reproducible from a single commit.

These choices matter because the system has to survive two things at once: live trading, where mistakes are expensive and immediate, and post-hoc analysis, where the question is some version of "why did the strategy do that on Tuesday?" The same system has to answer both, and the only way to do that is to be deliberate about what gets recorded and what is allowed to differ.

A small, regular API is the lever. If strategies express every action — place, modify, cancel, observe — through one well-defined surface, then the surface itself can be the recording boundary. Everything inside it is testable. Everything outside it is the venue's responsibility. The simulation lives on one side; the live gateway lives on the other; the strategy does not know which one is on the line.
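The recording boundary described above can be sketched in a few lines. This is a minimal illustration, not ordersim's actual API: the names Gateway, RecordingGateway, ToyVenue, and Event, and the method signatures, are all assumptions made for the example. The wrapper logs each request before forwarding it and each response after, so the log shows both what the strategy asked for and what the venue accepted.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Event:
    seq: int        # position in the log
    kind: str       # "request" (strategy intent) or "response" (venue answer)
    action: str     # "place", "modify", "cancel", or "observe"
    payload: tuple  # sorted (key, value) pairs, so ordering is stable


class Gateway(Protocol):
    """Hypothetical four-verb surface; a real gateway may differ."""

    def place(self, order_id: str, side: str, price: int, qty: int) -> dict: ...
    def modify(self, order_id: str, price: int, qty: int) -> dict: ...
    def cancel(self, order_id: str) -> dict: ...
    def observe(self) -> dict: ...


class ToyVenue:
    """Stand-in for a simulated venue: tracks open orders, nothing else."""

    def __init__(self):
        self.open_orders = {}

    def place(self, order_id, side, price, qty):
        if qty <= 0:
            return {"status": "rejected", "reason": "non-positive qty"}
        self.open_orders[order_id] = {"side": side, "price": price, "qty": qty}
        return {"status": "accepted"}

    def modify(self, order_id, price, qty):
        if order_id not in self.open_orders:
            return {"status": "rejected", "reason": "unknown order"}
        self.open_orders[order_id].update(price=price, qty=qty)
        return {"status": "accepted"}

    def cancel(self, order_id):
        if self.open_orders.pop(order_id, None) is None:
            return {"status": "rejected", "reason": "unknown order"}
        return {"status": "accepted"}

    def observe(self):
        return {"open_orders": len(self.open_orders)}


class RecordingGateway:
    """Wraps any Gateway and records every request and response."""

    def __init__(self, inner: Gateway):
        self._inner = inner
        self.log: list[Event] = []

    def _append(self, kind, action, payload):
        pairs = tuple(sorted(payload.items()))
        self.log.append(Event(len(self.log), kind, action, pairs))

    def _call(self, action, **kwargs):
        self._append("request", action, kwargs)            # intent first
        response = getattr(self._inner, action)(**kwargs)  # then the venue
        self._append("response", action, response)         # then the outcome
        return response

    def place(self, order_id, side, price, qty):
        return self._call("place", order_id=order_id, side=side, price=price, qty=qty)

    def modify(self, order_id, price, qty):
        return self._call("modify", order_id=order_id, price=price, qty=qty)

    def cancel(self, order_id):
        return self._call("cancel", order_id=order_id)

    def observe(self):
        return self._call("observe")
```

Because the strategy only ever holds a RecordingGateway, swapping ToyVenue for a different inner gateway changes nothing on the strategy side, and the log format stays the same.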

Engine equivalence is the same idea in another costume. If the fast engine and the slow engine produce identical event logs against the same replay, then "fast" is a non-event from a research perspective — just a deployment choice. It buys speed without buying a separate question about correctness. See ordersim for the artifact and market microstructure for the questions the recorded event stream is supposed to answer.

Frequently asked

Questions that come up.

What is the difference between an execution system and a strategy?

The strategy decides what to do; the execution system carries it out. In practice the boundary is the gateway API. Anything the strategy expresses through that API is intent; anything that happens to that intent — fills, cancels, latency, partial executions — is the system's job to record and replay later.

Why does deterministic replay matter?

Without deterministic replay there is no way to audit what the strategy did, or to ask "would a different version of the strategy have done better against the same market?" Replay is the unit of post-hoc analysis. Anything non-deterministic in the replay path leaks into the conclusion.
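One way to make determinism checkable is to reduce an event log to a digest and compare digests across runs. The sketch below is an assumption-laden toy, not ordersim code: run_replay and log_digest are hypothetical names, the "strategy" simply buys on a price dip, and simulated latency comes from a locally seeded random.Random so that no global state leaks into the replay path.

```python
import hashlib
import json
import random


def run_replay(market_ticks, seed):
    """Toy replay: place a buy whenever the price dips, with seeded latency."""
    rng = random.Random(seed)  # local RNG; the global random module is never touched
    events = []
    last_price = None
    for ts, price in market_ticks:
        if last_price is not None and price < last_price:
            latency_us = rng.randint(50, 150)  # simulated, but reproducible
            events.append(
                {"ts": ts + latency_us, "action": "place", "side": "buy", "price": price}
            )
        last_price = price
    return events


def log_digest(events):
    """SHA-256 over a canonical JSON encoding (sorted keys, fixed separators)."""
    canonical = json.dumps(events, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


ticks = [(1_000, 100), (2_000, 99), (3_000, 101), (4_000, 98)]
first = log_digest(run_replay(ticks, seed=7))
second = log_digest(run_replay(ticks, seed=7))
print("deterministic:", first == second)
```

Sources of non-determinism that a check like this tends to surface include wall-clock reads, unseeded randomness, and iteration over unordered containers; the canonical encoding matters because two equal logs must serialize to the same bytes.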

Does the system support both backtest and live execution?

No: ordersim is a simulator only. The Python-facing gateway API is meant to be the same shape a production gateway would expose, so that a strategy can target a single surface and the bridging code is small and reviewable. The production gateway itself remains the user's to build and operate.

What is the engine-equivalence test?

The Python and C++ engines must produce identical event logs against the same replay. Identical, not approximate. The equivalence test is part of CI; the C++ engine cannot ship unless its log matches. This makes performance a release decision, not a research variable.
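A sketch of what "identical, not approximate" can look like as a test helper follows. Everything here is hypothetical: the three toy engines stand in for real Python and C++ engines, and first_divergence and assert_equivalent are illustrative names, not part of any published suite. The comparison is exact equality event by event, and a failure reports the first index where the logs differ, including the case where one log is simply shorter.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel: one log ended before the other


def first_divergence(log_a, log_b):
    """Return (index, event_a, event_b) at the first mismatch, or None if identical."""
    pairs = zip_longest(log_a, log_b, fillvalue=_MISSING)
    for index, (a, b) in enumerate(pairs):
        if a != b:
            return (
                index,
                None if a is _MISSING else a,
                None if b is _MISSING else b,
            )
    return None


def assert_equivalent(reference_engine, candidate_engine, replay):
    """Run both engines on the same replay and require exact log equality."""
    diff = first_divergence(reference_engine(replay), candidate_engine(replay))
    if diff is not None:
        index, ref_event, cand_event = diff
        raise AssertionError(
            f"logs diverge at event {index}: "
            f"reference={ref_event!r} candidate={cand_event!r}"
        )


# Toy engines (assumptions for illustration). Prices are integer ticks so
# that equality is exact rather than subject to floating-point rounding.
def reference_engine(replay):
    log = []
    for order_id, price_ticks, qty in replay:
        log.append(("fill", order_id, price_ticks * qty))
    return log


def fast_engine(replay):
    return [("fill", oid, px * qty) for oid, px, qty in replay]


def lossy_engine(replay):
    return fast_engine(replay)[:-1]  # deliberately drops the final event


replay = [("a", 100, 2), ("b", 101, 1), ("c", 99, 3)]
assert_equivalent(reference_engine, fast_engine, replay)  # passes silently
print(first_divergence(reference_engine(replay), lossy_engine(replay)))
```

Reporting the first divergence rather than a bare pass or fail keeps the failure actionable: the mismatching event pinpoints where the candidate engine's behavior departs from the reference.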

How is a release verified before publishing?

A release checklist runs the equivalence tests, a smoke install from the built wheel, and a determinism check against a reference replay. The wheel has to produce the same event stream as the source tree on the same input. If it doesn't, the release fails — and that's the point.