Card Hand Identification #103

Description

@yushanwe

Feature Description
Assistive tool to identify and track playing cards in a hand

Problem It Solves
Identifying playing cards in a hand during a card game

Proposed Solution
Read and update a list of cards in the user's hand
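
One way to picture the "read and update" step is as a comparison between the last known hand and a fresh reading. The sketch below is only an illustration: the `Card` type, its `rank` and `suit` fields, and the `update_hand` helper are hypothetical names that do not come from this issue, and the sketch assumes each reading arrives as a complete list of visible cards.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    """Hypothetical card record; field names are assumptions."""
    rank: str  # e.g. "A", "10", "K"
    suit: str  # e.g. "hearts", "spades"


def update_hand(previous, observed):
    """Return (new_hand, added, removed) for a fresh reading of the hand.

    Counters are used instead of sets so that duplicate cards
    (multi-deck games) are tracked correctly.
    """
    before, after = Counter(previous), Counter(observed)
    added = list((after - before).elements())
    removed = list((before - after).elements())
    return list(observed), added, removed


# Example: the 10 of hearts was played and the king of clubs was drawn.
hand = [Card("A", "spades"), Card("10", "hearts")]
seen = [Card("A", "spades"), Card("K", "clubs")]
hand, added, removed = update_hand(hand, seen)
```

Reporting only `added` and `removed` keeps spoken updates short when the hand changes by a single card.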

Implementation details

Assume each generated tool implements one user-facing task. If this issue enumerates multiple stages, execute one ordered copilot_llm_call(...) per stage and explicitly pass useful structured artifacts to later calls with metadata={"previous_stage_artifact": ...}. Use the stage capability as capability. Choose only from these capabilities: general_reasoning, ocr, object_detection_localization, structured_visual_understanding, spatial_reasoning, navigation, camera_motion, or temporal_reasoning. Never use visual_reasoning. The backend may evaluate and escalate reasoning capabilities according to the execution policy. Generated tools must not choose implementations, models, providers, detector backends, fallback order, retries, or verification logic. Do not implement detection, OCR, VLM, LLM, model loading, or provider calls inside generated tool files. Generated tools must not create routers, capability registries, detector/OCR/LLM wrappers, new model-router clients, provider-specific DEFAULT_MODEL constants, COCO_CLASSES, .pt model loading/discovery logic, or direct provider calls.

Alternatives Considered

Example usage
Read the cards visible in my hand and tell me their suits and ranks. As my hand changes, update the card list accordingly.

Live Mode
no

Live Query

Additional Context
Card game context
Unless otherwise specified, in streaming mode, any verbal/text response should be limited to 15 words. No such limit applies to one-shot output.
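
The 15-word limit could be enforced with a small guard applied just before a response is spoken or displayed. This is a minimal sketch: `limit_words` and its `streaming` flag are hypothetical names, and plain truncation is only one possible approach (asking for a short answer in the prompt would avoid cutting a sentence off).

```python
def limit_words(text, streaming, max_words=15):
    """Trim a response to max_words in streaming mode; leave one-shot output as-is."""
    words = text.split()
    if not streaming or len(words) <= max_words:
        return text
    return " ".join(words[:max_words])


reply = ("You are holding the ace of spades, the king of clubs, the ten of hearts, "
         "the two of diamonds and the jack of spades")
short = limit_words(reply, streaming=True)
full = limit_words(reply, streaming=False)
```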

Task Stages

Stage 1

  • Goal: Identify the cards and their properties
  • Capability: structured_visual_understanding
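
The stage list above maps onto one call per stage. The sketch below shows that ordering under stated assumptions: the issue names only the `capability` and `metadata` arguments of `copilot_llm_call(...)`, so the `prompt` and `image` keyword names, the `run_stages` helper, and the shape of the returned artifact are all hypothetical. The call is passed in as a parameter, with a stand-in used here so the example runs on its own.

```python
# Stage 1 as listed in this issue: (goal, capability).
STAGES = [
    ("Identify the cards and their properties", "structured_visual_understanding"),
]


def run_stages(llm_call, stages, frame):
    """Make one ordered call per stage, handing each result to the next stage."""
    artifact = None
    for goal, capability in stages:
        # "prompt" and "image" are assumed keyword names, not confirmed by the issue.
        kwargs = {"prompt": goal, "image": frame, "capability": capability}
        if artifact is not None:
            # Later stages receive the previous result explicitly.
            kwargs["metadata"] = {"previous_stage_artifact": artifact}
        artifact = llm_call(**kwargs)
    return artifact


def fake_llm_call(**kwargs):
    """Stand-in for copilot_llm_call; it only echoes what it was asked."""
    return {"capability": kwargs["capability"], "received": kwargs.get("metadata")}


result = run_stages(fake_llm_call, STAGES, frame=b"")
```

The helper contains no model, provider, retry, or fallback choices; it only orders the calls and forwards the artifact.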

Write the code for this tool inside the tools folder. Assume the tool implements one user-facing task. For a Task Stages section, make one ordered copilot_llm_call(...) per stage and explicitly pass useful artifacts to later calls. Use only these capabilities: general_reasoning, ocr, object_detection_localization, structured_visual_understanding, spatial_reasoning, navigation, camera_motion, temporal_reasoning. Never use visual_reasoning. Do not import litellm, call litellm.completion(), create new model-router clients, create ModelRouter classes, resolve model names, resolve API keys, import detector libraries/provider SDKs/YOLO, define COCO_CLASSES, hardcode model names, load .pt files, or call YOLO(...). Do not select implementations, providers, fallback order, retries, or verification logic in generated tools.

Metadata

Labels

enhancement: New feature or request

