Feature Description
System to categorize mail as important or junk based on photos
Problem It Solves
Cannot easily distinguish important from junk mail
Proposed Solution
Mail categorization system
Implementation details
Assume each generated tool implements one user-facing task. If this issue enumerates multiple stages, execute one ordered copilot_llm_call(...) per stage and explicitly pass useful structured artifacts to later calls with metadata={"previous_stage_artifact": ...}. Use the stage capability as capability. Choose only from these capabilities: general_reasoning, ocr, object_detection_localization, structured_visual_understanding, spatial_reasoning, navigation, camera_motion, or temporal_reasoning. Never use visual_reasoning. The backend may evaluate and escalate reasoning capabilities according to the execution policy. Generated tools must not choose implementations, models, providers, detector backends, fallback order, retries, or verification logic. Do not implement detection, OCR, VLM, LLM, model loading, or provider calls inside generated tool files. Generated tools must not create routers, capability registries, detector/OCR/LLM wrappers, new model-router clients, provider-specific DEFAULT_MODEL constants, COCO_CLASSES, .pt model loading/discovery logic, or direct provider calls.
Alternatives Considered
Example usage
Take photos of mail and tell the user which pieces are likely important or junk
Live Mode
no
Live Query
Additional Context
Custom GPT: No
Unless otherwise specified, in streaming mode, any verbal/text response should be limited to 15 words. No such limit applies to one-shot output.
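The word limit above could be enforced with a small helper such as the following. This is a minimal sketch: the name `limit_words` and the `streaming` flag are hypothetical, and plain truncation is only one possible way to meet the limit (it may cut a sentence short).

```python
def limit_words(text: str, max_words: int = 15, streaming: bool = True) -> str:
    """Cap a response at max_words words in streaming mode.

    Hypothetical helper: the issue states the 15-word rule but does not
    say how it should be enforced.
    """
    if not streaming:
        # One-shot output has no word limit.
        return text
    words = text.split()
    if len(words) <= max_words:
        return text
    # Keep only the first max_words whitespace-separated words.
    return " ".join(words[:max_words])
```

Since Live Mode is "no" for this issue, the one-shot branch is the one that would normally apply.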
Task Stages
Stage 1
- Goal: Detect objects.
- Capability: object_detection_localization
Stage 2
- Goal: Read text.
- Capability: ocr
Stage 3
- Goal: Form a high-level semantic understanding of the detected objects.
- Capability: general_reasoning
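The three stages above could be chained roughly as follows. This is a sketch under stated assumptions, not the backend's actual API: the issue names `copilot_llm_call(...)`, `capability`, and `metadata={"previous_stage_artifact": ...}`, but the `prompt` and `image` parameter names, the return type, and the function name `categorize_mail` are hypothetical. The call is passed in as an argument so the sketch runs without the backend. Each later call receives only the immediately preceding result; whether earlier artifacts should also be carried forward is left open.

```python
from typing import Any, Callable

# (goal, capability) pairs taken from the Task Stages list, in order.
STAGES = [
    ("Detect objects", "object_detection_localization"),
    ("Read text", "ocr"),
    ("Semantic understanding of the detected objects", "general_reasoning"),
]


def categorize_mail(image: Any, llm_call: Callable[..., Any]) -> Any:
    """Run one ordered call per stage, handing each result to the next.

    llm_call stands in for the backend-provided copilot_llm_call; its
    keyword arguments `prompt` and `image` are assumptions.
    """
    artifact = None
    for goal, capability in STAGES:
        # The first stage has no predecessor, so its metadata is empty.
        metadata = {} if artifact is None else {"previous_stage_artifact": artifact}
        artifact = llm_call(
            prompt=goal,
            image=image,
            capability=capability,
            metadata=metadata,
        )
    # The final general_reasoning result is the tool's output.
    return artifact


def _echo_call(**kwargs: Any) -> dict:
    """Stand-in used only to exercise the sketch; it performs no inference."""
    return {"capability": kwargs["capability"], "received": kwargs["metadata"]}


if __name__ == "__main__":
    print(categorize_mail("mail_photo.jpg", _echo_call))
```

The tool itself only sequences the stages and forwards artifacts; it picks no model, provider, retry, or fallback behavior, which stays with the backend as the issue requires.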
Write the code for this tool inside the tools folder. Assume the tool implements one user-facing task. For a Task Stages section, make one ordered copilot_llm_call(...) per stage and explicitly pass useful artifacts to later calls. Use only these capabilities: general_reasoning, ocr, object_detection_localization, structured_visual_understanding, spatial_reasoning, navigation, camera_motion, temporal_reasoning. Never use visual_reasoning. Do not import litellm, call litellm.completion(), create new model-router clients, create ModelRouter classes, resolve model names, resolve API keys, import detector libraries/provider SDKs/YOLO, define COCO_CLASSES, hardcode model names, load .pt files, or call YOLO(...). Do not select implementations, providers, fallback order, retries, or verification logic in generated tools.