Description
When building Stellar transactions, the current code loads the account sequence number from Horizon, builds the transaction, and submits it. When multiple transactions are built concurrently for the same signing account (e.g., multiple reward claims in the same second), they all load the same sequence number. All but the first will fail with txFailed: "bad_seq".
This is a well-known Stellar development challenge that requires a local sequence number cache with atomic increment and retry logic.
Problem Analysis
Current code (src/stellar/transactions.ts, lines 15-48)
async invokeContract(
  contractId: string,
  method: string,
  args: StellarSdk.xdr.ScVal[],
  source?: StellarSdk.Keypair
): Promise<InvokeResult> {
  const keypair = source ?? config.PLATFORM_KEYPAIR;
  const account = await stellarClient.getAccount(keypair.publicKey());
  // ^^^ Loads the current sequence number from Horizon on every call.
  // If two calls happen concurrently, both get the same sequence
  const tx = new StellarSdk.TransactionBuilder(account, {
    fee: StellarSdk.BASE_FEE,
    networkPassphrase: config.stellar.passphrase,
  })
    .addOperation(contract.call(method, ...args))
    .setTimeout(60)
    .build();
  tx.sign(keypair);
  const result = await stellarClient.submitTransaction(tx);
  return { hash: result.hash!, ... };
}
The race condition timeline
T=0ms: Request A loads account (seq=42)
T=1ms: Request B loads account (seq=42) ← Same!
T=2ms: Request A builds tx with seq=42, signs, submits
T=3ms: Request B builds tx with seq=42, signs, submits
T=50ms: Request A succeeds (seq incremented to 43 on-chain)
T=51ms: Request B fails with "bad_seq" (expected seq=43, got 42)
Impact
Under moderate load (e.g., 10 reward claims in 1 second), up to 9 of the 10 will fail with bad_seq. The client receives an error even though the operation is valid, and there is no retry mechanism.
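The timeline above can be reproduced without a network. The following is a minimal sketch, assuming a hypothetical in-memory `FakeLedger` that stands in for Horizon and the ledger (it is not part of the Stellar SDK). It only models the rule that a transaction is accepted when its sequence equals the current sequence plus one.

```typescript
// Hypothetical in-memory stand-in for Horizon plus the ledger; not the real SDK.
class FakeLedger {
  private seq = 42n;

  // Mimics loading the account: returns the current on-chain sequence.
  async loadSequence(): Promise<bigint> {
    await Promise.resolve(); // yield so concurrent callers interleave
    return this.seq;
  }

  // Accepts a transaction only if its sequence is exactly current + 1.
  async submit(txSeq: bigint): Promise<string> {
    await Promise.resolve();
    if (txSeq !== this.seq + 1n) {
      throw new Error("bad_seq");
    }
    this.seq = txSeq;
    return `ok:${txSeq}`;
  }
}

// Load, build, submit: the same order of steps as the current invokeContract.
async function naiveInvoke(ledger: FakeLedger): Promise<string> {
  const current = await ledger.loadSequence();
  return ledger.submit(current + 1n);
}

// Fires `requests` calls concurrently and counts successes and bad_seq failures.
async function demoRace(requests: number): Promise<{ ok: number; badSeq: number }> {
  const ledger = new FakeLedger();
  const outcomes = await Promise.all(
    Array.from({ length: requests }, () =>
      naiveInvoke(ledger).then(
        () => true,
        () => false
      )
    )
  );
  const ok = outcomes.filter((succeeded) => succeeded).length;
  return { ok, badSeq: outcomes.length - ok };
}
```

In this model, `demoRace(10)` resolves to one success and nine `bad_seq` failures, because every request reads the sequence before any submission is applied.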
Required Implementation
A. Local Sequence Number Cache
// New file: src/stellar/sequence-cache.ts
import { redis } from "../config/redis.js";
import { stellarClient } from "./client.js";
import { logger } from "../utils/logger.js";

const LOCK_TTL_MS = 5_000;
const CACHE_KEY_PREFIX = "stellar:seq:";

export class SequenceCache {
  // Last sequence number handed out per account
  private localSeq: Map<string, bigint> = new Map();

  async getNextSequence(accountId: string): Promise<bigint> {
    // 1. Check local in-memory cache first
    const local = this.localSeq.get(accountId);
    if (local !== undefined) {
      const next = local + 1n;
      this.localSeq.set(accountId, next);
      return next;
    }
    // 2. Cache miss: load the current sequence from Horizon and hand out current + 1
    const account = await stellarClient.getAccount(accountId);
    const next = BigInt(account.sequence) + 1n;
    // Store the value just issued; storing the on-chain value instead would
    // make the following call return the same number again
    this.localSeq.set(accountId, next);
    return next;
  }

  invalidate(accountId: string): void {
    this.localSeq.delete(accountId);
  }

  resetTo(accountId: string, seq: bigint): void {
    this.localSeq.set(accountId, seq);
  }
}

export const sequenceCache = new SequenceCache();
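One gap in the cache above is the cold start: if several callers hit an empty cache at once, each of them awaits its own Horizon load and they can end up with the same number. The sketch below shows one possible way to close that gap by sharing a single in-flight load per account. `DedupedSequenceCache` and the `SequenceLoader` type are hypothetical names introduced for illustration; the loader is injected so the example does not depend on `stellarClient`.

```typescript
// Hypothetical loader type: resolves to the account's current on-chain sequence.
type SequenceLoader = (accountId: string) => Promise<bigint>;

class DedupedSequenceCache {
  // Last sequence number handed out per account.
  private lastIssued = new Map<string, bigint>();
  // In-flight loads per account, so concurrent cache misses share one fetch.
  private pendingLoads = new Map<string, Promise<void>>();
  private load: SequenceLoader;

  constructor(load: SequenceLoader) {
    this.load = load;
  }

  async getNextSequence(accountId: string): Promise<bigint> {
    // Loop in case the entry is invalidated while we were waiting.
    while (!this.lastIssued.has(accountId)) {
      let pending = this.pendingLoads.get(accountId);
      if (pending === undefined) {
        pending = this.load(accountId).then(
          (seq) => {
            this.lastIssued.set(accountId, seq);
            this.pendingLoads.delete(accountId);
          },
          (err) => {
            this.pendingLoads.delete(accountId);
            throw err;
          }
        );
        this.pendingLoads.set(accountId, pending);
      }
      await pending;
    }
    // No await between the read and the write, so increments cannot interleave.
    const next = (this.lastIssued.get(accountId) as bigint) + 1n;
    this.lastIssued.set(accountId, next);
    return next;
  }

  invalidate(accountId: string): void {
    this.lastIssued.delete(accountId);
  }
}
```

Because JavaScript runs each synchronous section to completion, the read-increment-write at the end is effectively atomic within one process. It gives no protection across multiple processes or hosts, which would need shared state.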
B. Atomic Transaction Builder with Retry
// Updated src/stellar/transactions.ts
const MAX_SEQ_RETRIES = 3;

async function buildAndSubmitTx(
  method: string,
  args: StellarSdk.xdr.ScVal[],
  source: StellarSdk.Keypair
): Promise<InvokeResult> {
  const contract = new StellarSdk.Contract(config.contracts.learnToken);
  for (let attempt = 0; attempt < MAX_SEQ_RETRIES; attempt++) {
    try {
      const seqNum = await sequenceCache.getNextSequence(source.publicKey());
      // TransactionBuilder increments the account's sequence by one in build(),
      // so the Account must carry seqNum - 1 for the tx to use seqNum
      const account = new StellarSdk.Account(source.publicKey(), (seqNum - 1n).toString());
      const tx = new StellarSdk.TransactionBuilder(account, {
        fee: StellarSdk.BASE_FEE,
        networkPassphrase: config.stellar.passphrase,
      })
        .addOperation(contract.call(method, ...args))
        .setTimeout(60)
        .build();
      tx.sign(source);
      // Simulate first (read-only, doesn't consume sequence)
      const simResult = await stellarClient.getSorobanRpc().simulateTransaction(tx);
      if (StellarSdk.SorobanRpc.isSimulationError(simResult)) {
        throw new StellarError(`Simulation failed: ${simResult.error}`);
      }
      // Assemble and submit
      const assembledTx = StellarSdk.SorobanRpc.assembleTransaction(tx, simResult).build();
      assembledTx.sign(source);
      const result = await stellarClient.submitTransaction(assembledTx);
      return { hash: result.hash!, ... };
    } catch (err) {
      if (err instanceof Error && err.message.includes("bad_seq")) {
        // Sequence number was wrong: invalidate cache and retry
        sequenceCache.invalidate(source.publicKey());
        logger.warn({ attempt, err: err.message }, "bad_seq error, retrying with fresh sequence");
        continue;
      }
      throw err;
    }
  }
  throw new StellarError(`Failed after ${MAX_SEQ_RETRIES} attempts due to sequence conflicts`);
}
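The retry loop above can also be factored out so it can be tested without the SDK. This is a sketch under stated assumptions: `retryOnBadSeq` and `isBadSeq` are hypothetical helpers, the error is detected by message substring exactly as in the loop above, and the exponential backoff between attempts is an addition that the original loop does not have.

```typescript
// Hypothetical check: any Error whose message mentions "bad_seq" is retryable.
function isBadSeq(err: unknown): boolean {
  return err instanceof Error && err.message.includes("bad_seq");
}

// Runs attemptFn up to maxAttempts times, retrying only on bad_seq errors.
// onBadSeq is called after each bad_seq failure (e.g. to invalidate a cache).
async function retryOnBadSeq<T>(
  attemptFn: (attempt: number) => Promise<T>,
  onBadSeq: () => void,
  maxAttempts = 3,
  baseDelayMs = 50
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await attemptFn(attempt);
    } catch (err) {
      // Anything other than a sequence conflict is not retried.
      if (!isBadSeq(err)) throw err;
      onBadSeq();
      if (attempt < maxAttempts - 1) {
        // Backoff (assumption, not in the original loop): 50ms, 100ms, ...
        const delayMs = baseDelayMs * 2 ** attempt;
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw new Error(`Failed after ${maxAttempts} attempts due to sequence conflicts`);
}
```

A caller would pass the build-and-submit step as `attemptFn` and something like `() => sequenceCache.invalidate(accountId)` as `onBadSeq`.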
C. Account-Level Locking
For the same account, serialize transaction building:
// New file: src/utils/account-lock.ts
import AsyncLock from "async-lock";

const accountLock = new AsyncLock({
  timeout: 10_000,
  maxPending: 50,
});

export function withAccountLock<T>(
  accountId: string,
  fn: () => Promise<T>
): Promise<T> {
  return accountLock.acquire(`account:${accountId}`, fn);
}

// Usage in transactions.ts:
return withAccountLock(keypair.publicKey(), () =>
  buildAndSubmitTx(method, args, keypair)
);
Install: npm install async-lock
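To make clear what the lock buys, here is a dependency-free sketch of per-key serialization using a promise chain. It is not a replacement for async-lock: `withKeyLock` is a hypothetical helper, and it has no timeout and no `maxPending` limit. It only illustrates that tasks for the same key run strictly one after another.

```typescript
// Tail of the promise chain per key; each new task waits for the previous tail.
const tails = new Map<string, Promise<void>>();

function withKeyLock<T>(key: string, fn: () => Promise<T>): Promise<T> {
  const previous = tails.get(key) ?? Promise.resolve();
  // Start fn only after the previous task for this key has settled.
  const run = previous.then(() => fn());
  // The stored tail never rejects, so one failed task does not block later ones.
  const tail: Promise<void> = run.then(
    () => undefined,
    () => undefined
  );
  tails.set(key, tail);
  // Drop the entry once the chain is idle so the map does not grow unbounded.
  void tail.then(() => {
    if (tails.get(key) === tail) tails.delete(key);
  });
  return run;
}
```

Tasks submitted under different keys still run concurrently, which matches the intent of locking per account rather than globally.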
D. Handle bad_seq at the API Level
// In reward.service.ts, wrap the on-chain call:
try {
  const result = await invokeContract(...);
  txHash = result.hash;
} catch (err) {
  if (err instanceof StellarError && err.message.includes("bad_seq")) {
    // Don't fail the entire request; the tx might actually succeed on-chain
    // Check Horizon for the account's latest sequence
    const account = await stellarClient.getAccount(keypair.publicKey());
    logger.warn({
      submissionId,
      accountSeq: account.sequence,
    }, "bad_seq after invoke: tx may have landed");
    // Continue with DB update; the indexer will confirm
  } else {
    throw err;
  }
}
Dependencies to Add
Files to create/modify
- New: src/stellar/sequence-cache.ts
- New: src/utils/account-lock.ts
- Modify: src/stellar/transactions.ts (use cache + lock + retry)
- Modify: src/stellar/client.ts (add getAccount caching with short TTL)
Testing Requirements
- Simulate 10 concurrent transactions for the same account
- Verify all 10 succeed (no bad_seq errors)
- Verify sequence numbers are monotonic (42, 43, 44, ... 51)
- Mock Horizon to return stale sequence → verify retry with fresh sequence
- Test account lock serialization under high concurrency
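For the monotonicity requirement, a test could collect the sequence numbers used by each submitted transaction and check that they are consecutive. The helper below is a small sketch; `firstSequenceGap` is a hypothetical name, and how the sequences are collected from the test harness is left open.

```typescript
// Returns the index of the first element that is not exactly previous + 1,
// or -1 if the whole list is strictly consecutive (an empty list counts as valid).
function firstSequenceGap(seqs: readonly bigint[]): number {
  let previous: bigint | undefined;
  let index = 0;
  for (const seq of seqs) {
    if (previous !== undefined && seq !== previous + 1n) {
      return index;
    }
    previous = seq;
    index++;
  }
  return -1;
}
```

A duplicate such as `[42n, 43n, 43n]` yields index 2, which is the pattern a shared stale sequence would produce.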