[Expert] Implement Stellar sequence number management with local sequence cache and retry #8

@DeFiVC

Description

When building Stellar transactions, the current code loads the account sequence number from Horizon, builds the transaction, and submits it. When multiple transactions are built concurrently for the same signing account (e.g., multiple reward claims in the same second), they all load the same sequence number and therefore carry the same transaction sequence. Only the first one accepted succeeds; the rest are rejected with a bad_seq error (Horizon result code tx_bad_seq).

This is a well-known Stellar development challenge that requires a local sequence number cache with atomic increment and retry logic.

Problem Analysis

Current code (src/stellar/transactions.ts, lines 15-48)

async invokeContract(
  contractId: string,
  method: string,
  args: StellarSdk.xdr.ScVal[],
  source?: StellarSdk.Keypair
): Promise<InvokeResult> {
  const keypair = source ?? config.PLATFORM_KEYPAIR;
  const account = await stellarClient.getAccount(keypair.publicKey());
  // ^^^ Loads the current sequence number from Horizon on every call
  // If two calls happen concurrently, both get the same sequence

  const tx = new StellarSdk.TransactionBuilder(account, {
    fee: StellarSdk.BASE_FEE,
    networkPassphrase: config.stellar.passphrase,
  })
    .addOperation(contract.call(method, ...args))
    .setTimeout(60)
    .build();

  tx.sign(keypair);
  const result = await stellarClient.submitTransaction(tx);
  return { hash: result.hash!, ... };
}

The race condition timeline

T=0ms:  Request A loads account (seq=42)
T=1ms:  Request B loads account (seq=42)  ← Same!
T=2ms:  Request A builds tx with seq=43 (account seq + 1), signs, submits
T=3ms:  Request B builds tx with seq=43 (account seq + 1), signs, submits
T=50ms: Request A succeeds (account seq is now 43 on-chain)
T=51ms: Request B fails with "bad_seq" (expected seq=44, got 43)
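
The timeline can be reproduced without a network. The sketch below is a minimal simulation, assuming a hypothetical in-memory FakeLedger in place of Horizon and the real ledger. FakeLedger, naiveInvoke, and demoRace are illustrative names, not part of the codebase or the Stellar SDK.

```typescript
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical in-memory stand-in for Horizon and the ledger (not the real SDK).
class FakeLedger {
  private seq = 42n; // current on-chain sequence of the signing account

  // Mimics loading the account: returns the current sequence after a short delay.
  async loadSequence(): Promise<bigint> {
    await sleep(1);
    return this.seq;
  }

  // Mimics submission: a transaction is accepted only if it carries seq + 1.
  async submit(txSeq: bigint): Promise<void> {
    await sleep(5);
    if (txSeq !== this.seq + 1n) {
      throw new Error(`tx_bad_seq: expected ${this.seq + 1n}, got ${txSeq}`);
    }
    this.seq = txSeq;
  }
}

// The naive flow from the current code: load, build, submit, no coordination.
async function naiveInvoke(ledger: FakeLedger): Promise<void> {
  const loaded = await ledger.loadSequence();
  await ledger.submit(loaded + 1n);
}

// Fires `concurrency` naive calls at once and counts the outcomes.
async function demoRace(
  concurrency: number,
): Promise<{ ok: number; badSeq: number }> {
  const ledger = new FakeLedger();
  const results = await Promise.allSettled(
    Array.from({ length: concurrency }, () => naiveInvoke(ledger)),
  );
  const ok = results.filter((r) => r.status === "fulfilled").length;
  return { ok, badSeq: results.length - ok };
}
```

Against this fake, demoRace(10) resolves to one success and nine bad_seq rejections, because every call reads seq=42 before the first submission lands.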

Impact

Under moderate load (e.g., 10 reward claims in 1 second), as many as 9 of the 10 can fail with bad_seq. The client receives an error even though the operation itself is valid, and nothing retries the transaction.

Required Implementation

A. Local Sequence Number Cache

// New file: src/stellar/sequence-cache.ts
import { redis } from "../config/redis.js";
import { stellarClient } from "./client.js";
import { logger } from "../utils/logger.js";

const LOCK_TTL_MS = 5_000;
const CACHE_KEY_PREFIX = "stellar:seq:";

export class SequenceCache {
  private localSeq: Map<string, bigint> = new Map();

  async getNextSequence(accountId: string): Promise<bigint> {
    // 1. Check local in-memory cache first
    const local = this.localSeq.get(accountId);
    if (local !== undefined) {
      const next = local + 1n;
      this.localSeq.set(accountId, next);
      return next;
    }

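    // 2. Cache miss: load from Horizon and cache the sequence being handed out,
    //    so the next call returns seq + 2 instead of repeating seq + 1
    const account = await stellarClient.getAccount(accountId);
    const next = BigInt(account.sequence) + 1n;
    this.localSeq.set(accountId, next);
    return next;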
  }

  invalidate(accountId: string): void {
    this.localSeq.delete(accountId);
  }

  resetTo(accountId: string, seq: bigint): void {
    this.localSeq.set(accountId, seq);
  }
}

export const sequenceCache = new SequenceCache();
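
One gap the cache above leaves open when used without the account lock is that two callers can both miss the cache and both load from Horizon. The following is a minimal sketch of one way to close it, assuming a hypothetical injected loader function instead of stellarClient. DedupedSequenceCache and SequenceLoader are illustrative names only.

```typescript
// Hypothetical loader: returns the account's current on-chain sequence.
type SequenceLoader = (accountId: string) => Promise<bigint>;

class DedupedSequenceCache {
  private readonly load: SequenceLoader;
  // Last sequence number handed out per account.
  private readonly lastIssued = new Map<string, bigint>();
  // In-flight loads, so concurrent cache misses share one request.
  private readonly loading = new Map<string, Promise<void>>();

  constructor(load: SequenceLoader) {
    this.load = load;
  }

  async getNextSequence(accountId: string): Promise<bigint> {
    let last = this.lastIssued.get(accountId);
    while (last === undefined) {
      await this.ensureLoaded(accountId);
      last = this.lastIssued.get(accountId);
    }
    // No await between this read and the write below, so the increment
    // cannot interleave with another caller on the single-threaded event loop.
    const next = last + 1n;
    this.lastIssued.set(accountId, next);
    return next;
  }

  invalidate(accountId: string): void {
    this.lastIssued.delete(accountId);
  }

  // Starts a load if none is in flight; otherwise returns the pending one.
  private ensureLoaded(accountId: string): Promise<void> {
    let pending = this.loading.get(accountId);
    if (pending === undefined) {
      pending = this.load(accountId)
        .then((seq) => {
          this.lastIssued.set(accountId, seq);
        })
        .finally(() => {
          this.loading.delete(accountId);
        });
      this.loading.set(accountId, pending);
    }
    return pending;
  }
}
```

This only coordinates callers inside one Node.js process. Several service instances signing with the same account would still need shared state or a single submitter, which this sketch does not attempt.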

B. Atomic Transaction Builder with Retry

// Updated src/stellar/transactions.ts
const MAX_SEQ_RETRIES = 3;

async function buildAndSubmitTx(
  method: string,
  args: StellarSdk.xdr.ScVal[],
  source: StellarSdk.Keypair
): Promise<InvokeResult> {
  const contract = new StellarSdk.Contract(config.contracts.learnToken);

  for (let attempt = 0; attempt < MAX_SEQ_RETRIES; attempt++) {
    try {
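      // getNextSequence returns the sequence this transaction must carry.
      // TransactionBuilder.build() increments the account's sequence by one,
      // so seed the Account with the value just below it.
      const seqNum = await sequenceCache.getNextSequence(source.publicKey());
      const account = new StellarSdk.Account(source.publicKey(), (seqNum - 1n).toString());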

      const tx = new StellarSdk.TransactionBuilder(account, {
        fee: StellarSdk.BASE_FEE,
        networkPassphrase: config.stellar.passphrase,
      })
        .addOperation(contract.call(method, ...args))
        .setTimeout(60)
        .build();

      tx.sign(source);

      // Simulate first (read-only, doesn't consume sequence)
      const simResult = await stellarClient.getSorobanRpc().simulateTransaction(tx);
      if (StellarSdk.SorobanRpc.Api.isSimulationError(simResult)) {
        throw new StellarError(`Simulation failed: ${simResult.error}`);
      }

      // Assemble and submit
      const assembledTx = StellarSdk.SorobanRpc.assembleTransaction(tx, simResult).build();
      assembledTx.sign(source);

      const result = await stellarClient.submitTransaction(assembledTx);
      return { hash: result.hash!, ... };

    } catch (err) {
      if (err instanceof Error && err.message.includes("bad_seq")) {
        // Sequence number was wrong — invalidate cache and retry
        sequenceCache.invalidate(source.publicKey());
        logger.warn({ attempt, err: err.message }, "bad_seq error, retrying with fresh sequence");
        continue;
      }
      throw err;
    }
  }

  throw new StellarError(`Failed after ${MAX_SEQ_RETRIES} attempts due to sequence conflicts`);
}
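
The retry loop above can also be factored into a reusable helper. This is a sketch under assumptions: withSeqRetry and isBadSeq are hypothetical names, the bad_seq check reuses the same message match as the loop above, and the exponential backoff with jitter is an addition that the issue's loop does not have.

```typescript
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Same detection rule as the loop above: match on the error message.
function isBadSeq(err: unknown): boolean {
  return err instanceof Error && err.message.includes("bad_seq");
}

// Runs attemptFn up to maxAttempts times, retrying only on bad_seq.
// onBadSeq is where the caller invalidates its sequence cache.
async function withSeqRetry<T>(
  attemptFn: (attempt: number) => Promise<T>,
  onBadSeq: () => void,
  maxAttempts = 3,
  baseDelayMs = 50,
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await attemptFn(attempt);
    } catch (err) {
      if (!isBadSeq(err)) throw err; // unrelated failures are not retried
      onBadSeq();
      if (attempt < maxAttempts - 1) {
        // Exponential backoff plus jitter so retries do not collide again.
        const backoff = baseDelayMs * 2 ** attempt;
        await sleep(backoff + Math.random() * baseDelayMs);
      }
    }
  }
  throw new Error(`Failed after ${maxAttempts} attempts due to sequence conflicts`);
}
```

With such a helper, the body of buildAndSubmitTx would become the attemptFn, and onBadSeq would call sequenceCache.invalidate(source.publicKey()). When every call already runs under the account lock, the backoff matters less and could be set to a small value.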

C. Account-Level Locking

For the same account, serialize transaction building:

// New file: src/utils/account-lock.ts
import AsyncLock from "async-lock";

const accountLock = new AsyncLock({
  timeout: 10_000,
  maxPending: 50,
});

export function withAccountLock<T>(
  accountId: string,
  fn: () => Promise<T>
): Promise<T> {
  return accountLock.acquire(`account:${accountId}`, fn);
}

// Usage in transactions.ts:
return withAccountLock(keypair.publicKey(), () =>
  buildAndSubmitTx(method, args, keypair)
);

Install: npm install async-lock
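
To make the serialization concrete, here is a dependency-free sketch of what a per-account lock does. It is not the async-lock implementation and has no timeout or maxPending limits. KeyedMutex is a hypothetical name used only for illustration.

```typescript
const noop = (): void => {};

// Minimal per-key mutex: each key maps to the tail of a promise chain,
// and a new task starts only after the previous task for that key settles.
class KeyedMutex {
  private readonly tails = new Map<string, Promise<void>>();

  run<T>(key: string, fn: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(key) ?? Promise.resolve();
    const result = previous.then(() => fn());

    // The stored tail never rejects, so one failed task cannot block the queue.
    const tail: Promise<void> = result.then(noop, noop);
    this.tails.set(key, tail);

    // Drop the entry once the queue for this key has drained.
    void tail.then(() => {
      if (this.tails.get(key) === tail) this.tails.delete(key);
    });

    return result;
  }
}
```

Tasks for the same account run one at a time in call order, while tasks for different accounts still run concurrently. The timeout and maxPending options in the async-lock configuration above are what keep a stuck submission from holding the queue indefinitely.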

D. Handle bad_seq at the API Level

// In reward.service.ts, wrap the on-chain call:
try {
  const result = await invokeContract(...);
  txHash = result.hash;
} catch (err) {
  if (err instanceof StellarError && err.message.includes("bad_seq")) {
    // Don't fail the entire request — the tx might actually succeed on-chain
    // Check Horizon for the account's latest sequence
    const account = await stellarClient.getAccount(keypair.publicKey());
    logger.warn({
      submissionId,
      accountSeq: account.sequence,
    }, "bad_seq after invoke — tx may have landed");
    // Continue with DB update; the indexer will confirm
  } else {
    throw err;
  }
}

Dependencies to Add

npm install async-lock
npm install --save-dev @types/async-lock

Files to create/modify

  • New: src/stellar/sequence-cache.ts
  • New: src/utils/account-lock.ts
  • Modify: src/stellar/transactions.ts — use cache + lock + retry
  • Modify: src/stellar/client.ts — add getAccount caching with short TTL

Testing Requirements

  • Simulate 10 concurrent transactions for the same account
  • Verify all 10 succeed (no bad_seq errors)
  • Verify sequence numbers are monotonic (42, 43, 44, ... 51)
  • Mock Horizon to return stale sequence → verify retry with fresh sequence
  • Test account lock serialization under high concurrency
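
The requirements above could be exercised without a live network along these lines. This is a sketch only: FakeLedger, withLock, submitWithRetry, and expect are hypothetical, simplified stand-ins for Horizon, the account lock, the cache plus retry loop, and a test framework's assertions. The account starts at sequence 42, so the ten transactions carry 43 through 52.

```typescript
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical fake of Horizon plus the ledger. staleReads makes the next
// N loads return an outdated sequence.
class FakeLedger {
  seq = 42n;
  staleReads = 0;
  accepted: bigint[] = [];

  async load(): Promise<bigint> {
    await sleep(1);
    if (this.staleReads > 0) {
      this.staleReads--;
      return this.seq - 1n;
    }
    return this.seq;
  }

  async submit(txSeq: bigint): Promise<void> {
    await sleep(1);
    if (txSeq !== this.seq + 1n) throw new Error("tx_bad_seq");
    this.seq = txSeq;
    this.accepted.push(txSeq);
  }
}

// Simplified stand-ins for the sequence cache and the account lock.
let lastIssued: bigint | undefined;
let tail: Promise<unknown> = Promise.resolve();

function withLock<T>(fn: () => Promise<T>): Promise<T> {
  const run = tail.then(() => fn());
  tail = run.catch(() => undefined);
  return run;
}

async function submitWithRetry(ledger: FakeLedger, maxAttempts = 3): Promise<bigint> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    if (lastIssued === undefined) lastIssued = await ledger.load();
    const txSeq = lastIssued + 1n;
    lastIssued = txSeq;
    try {
      await ledger.submit(txSeq);
      return txSeq;
    } catch (err) {
      if (!(err instanceof Error && err.message.includes("bad_seq"))) throw err;
      lastIssued = undefined; // invalidate and reload on the next attempt
    }
  }
  throw new Error(`Failed after ${maxAttempts} attempts due to sequence conflicts`);
}

function expect(condition: boolean, message: string): void {
  if (!condition) throw new Error(`Assertion failed: ${message}`);
}

async function runSequenceTests(): Promise<void> {
  const ledger = new FakeLedger();

  // 10 concurrent transactions for one account: all succeed, in order.
  const seqs = await Promise.all(
    Array.from({ length: 10 }, () => withLock(() => submitWithRetry(ledger))),
  );
  expect(seqs.length === 10, "all 10 succeed");
  expect(seqs.every((s, i) => s === 43n + BigInt(i)), "sequences are 43..52");

  // Stale read after invalidation: first attempt fails, retry succeeds.
  lastIssued = undefined;
  ledger.staleReads = 1;
  const seq = await withLock(() => submitWithRetry(ledger));
  expect(seq === 53n, "retry uses the fresh sequence");
  expect(ledger.accepted.length === 11, "exactly 11 transactions landed");
}
```

In the real test suite the fake would be replaced by a mock of stellarClient, and the checks would use the project's test framework rather than the expect helper shown here.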

Labels

GrantFox OSS, Maybe Rewarded, Official Campaign, advanced, enhancement, typescript
