Feature Description
A tool that identifies grocery store items and their prices via camera and provides audio feedback. It also confirms the name and price of items when they are grabbed by the user.
Problem It Solves
Difficulty navigating grocery stores, specifically identifying the names and prices of items on shelves and confirming which item has been picked up.
Proposed Solution
A visual assistive tool that provides audio identifications and prices for grocery store items as they are seen on camera. When an item is grabbed, the tool uses the nearest price label to confirm the name and price of the selected item.
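The nearest-label confirmation step described above could be sketched as follows. This is a minimal illustration, not the tool's actual implementation. The `Box` and `PriceLabel` types and the `nearest_price_label` function are hypothetical names. The sketch assumes that labels have already been detected and parsed into a name and price, and that `item_box` is the item's last known position on the shelf before it was picked up.

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def center(self) -> Tuple[float, float]:
        return ((self.x1 + self.x2) / 2, (self.y1 + self.y2) / 2)


@dataclass(frozen=True)
class PriceLabel:
    """A shelf label already parsed into a name and price (hypothetical type)."""
    name: str
    price: float
    box: Box


def nearest_price_label(
    item_box: Box, labels: Sequence[PriceLabel]
) -> Optional[PriceLabel]:
    """Return the label whose center is closest to the item's center.

    Returns None when no labels are visible. Distance is plain Euclidean
    distance between box centers, which is an assumption of this sketch.
    """
    if not labels:
        return None
    ix, iy = item_box.center
    return min(
        labels,
        key=lambda label: hypot(label.box.center[0] - ix, label.box.center[1] - iy),
    )
```

Center-to-center distance is the simplest choice. A real shelf layout may call for a different rule, such as preferring labels directly below the item.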
Implementation Details
Unless otherwise specified above, object detection tools should use either Yolo11 with COCO or YoloWorld, depending on the conditions described in the copilot instructions. Tools that extract text should use the Google Cloud Vision API.
Alternatives Considered
Example Usage
As the user scans a cereal shelf, the tool says 'Cheerios, 3.50', 'Lucky Charms, 4 dollars', and 'Raisin Bran, 2.50'. When the user picks up a box of Cheerios, the tool announces 'You grabbed Cheerios, 3.50'.
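The spoken phrases in this example could be produced by a small formatter like the sketch below. The function names `format_price`, `announce_seen`, and `announce_grabbed` are hypothetical. The rule that whole-dollar prices are read as "N dollars" while other prices are read with two decimals is an assumption inferred from the example phrases, not a stated requirement.

```python
def format_price(price: float) -> str:
    """Format a price for speech, mirroring the example phrases.

    Assumption: whole-dollar amounts are spoken as "4 dollars" and all
    other amounts as two-decimal numbers such as "3.50".
    """
    value = float(price)
    if value.is_integer():
        unit = "dollar" if value == 1 else "dollars"
        return f"{int(value)} {unit}"
    return f"{value:.2f}"


def announce_seen(name: str, price: float) -> str:
    """Phrase spoken when an item comes into view on the shelf."""
    return f"{name}, {format_price(price)}"


def announce_grabbed(name: str, price: float) -> str:
    """Phrase spoken when the user picks an item up."""
    return f"You grabbed {name}, {format_price(price)}"
```

For example, `announce_seen("Lucky Charms", 4)` yields "Lucky Charms, 4 dollars" and `announce_grabbed("Cheerios", 3.50)` yields "You grabbed Cheerios, 3.50".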
Custom GPT
no
GPT Query
Additional Context
Unless otherwise specified, in streaming mode, any verbal/text response should be limited to 15 words. No such limit applies to one-shot output.
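One way to enforce this limit is sketched below. The function name `limit_words` and its parameters are hypothetical. Truncating to the first 15 words is only one possible strategy, and a real tool might prefer to rephrase the message instead of cutting it off.

```python
def limit_words(message: str, streaming: bool, max_words: int = 15) -> str:
    """Cap a response at max_words in streaming mode.

    One-shot output (streaming=False) is returned unchanged, as is any
    streaming message that is already within the limit.
    """
    words = message.split()
    if not streaming or len(words) <= max_words:
        return message
    # Keep only the first max_words words; the rest is dropped.
    return " ".join(words[:max_words])
```

A short announcement such as "You grabbed Cheerios, 3.50" passes through unchanged in either mode, since it is well under the limit.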
Video Summary
This video demonstrates a mockup of a tool for visual assistance in a grocery store. Using a shelf of miscellaneous items, the user simulates how the tool would provide audio feedback as someone navigates through an aisle.
In the first example, the user shows that as they move their camera up to a box of Cheerios, the tool would read the item and its price: "Cheerios, 3.50."
As they move past other items on the shelf, the tool would continue to identify what is being seen: "Lucky Charms, 4 dollars," then "Raisin Bran, 2.50."
Finally, the user demonstrates that when a shopper picks up an item, the tool would confirm it by saying "You grabbed Cheerios, 3.50."
In short, this tool would provide audio identifications and prices for grocery store items as they are seen or interacted with.
Write the code for this tool inside the tools folder