Thank you for open-sourcing this amazing project and your great work on DriveAgent-R1!
To help the community understand the benchmarks, reproduce your results, and build on them in further research, would it be possible to release the evaluation scripts mentioned in your paper? Specifically, we would appreciate:
- Full Evaluation Scripts: The complete scripts needed to reproduce the main benchmark results reported in the paper.
- GPT Evaluation Scripts: The specific scripts, prompts, or pipeline used for the GPT-based evaluation.