We chose 130 candidate images from BDD100K and kept the 124 that met our criteria, covering the 18 weather/time tag combinations (six weather conditions × three times of day) listed below.
| Weather | Time | Number |
|---|---|---|
| clear | dawn/dusk | 8 |
| clear | daytime | 8 |
| clear | night | 8 |
| partly cloudy | dawn/dusk | 7 |
| partly cloudy | daytime | 8 |
| partly cloudy | night | 5 |
| overcast | dawn/dusk | 7 |
| overcast | daytime | 8 |
| overcast | night | 8 |
| rainy | dawn/dusk | 7 |
| rainy | daytime | 8 |
| rainy | night | 8 |
| snowy | dawn/dusk | 8 |
| snowy | daytime | 7 |
| snowy | night | 8 |
| foggy | dawn/dusk | 1 |
| foggy | daytime | 5 |
| foggy | night | 7 |
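As a reference, here is a minimal sketch of how such a grouping can be reproduced from the official BDD100K label file. The label file name is an assumption, and this script is illustrative rather than part of our pipeline:

```python
import json
from collections import defaultdict

# Group BDD100K images by (weather, timeofday) using the official label file.
# The file name below is an assumption; point it at your local copy.
with open("bdd100k_labels_images_val.json") as f:
    labels = json.load(f)

groups = defaultdict(list)
for item in labels:
    attrs = item["attributes"]
    groups[(attrs["weather"], attrs["timeofday"])].append(item["name"])

# Report how many candidates exist per weather/time tag combination.
for (weather, time), names in sorted(groups.items()):
    print(f"{weather} / {time}: {len(names)} candidates")
```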
We simplified DriveLM and selected four QA types that we considered worth exploring. To obtain better results, we used ChatGPT to rewrite the questions so that they align more closely with LLaVA's pre-training and alignment data, which brought the model to roughly 30% accuracy on our question set.
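The question file follows the JSONL format consumed by LLaVA's model_vqa.py (question_id, image, and text fields). The entry below is an illustrative placeholder, not an actual line from 124data_question.jsonl:

```python
import json

# Illustrative placeholder in LLaVA's question JSONL format;
# the id, image name, and question text are all hypothetical.
example_question = {
    "question_id": 0,
    "image": "example_bdd100k_frame.jpg",
    "text": "What is the weather condition in this scene?",
}
print(json.dumps(example_question))
```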
You can download our image data from Google Drive.
Please follow LLaVA to configure the environment and prepare the LLaMA (or Vicuna) weights.
You need to modify the model path in exc.sh, then run:

```bash
sh exc.sh
```

OR

```bash
python model_vqa.py \
    --model-path /path/to/llava \
    --question-file /path/to/LLaVA/AD/124data_question.jsonl \
    --image-folder /path/to/LLaVA/124_data_image \
    --answers-file /path/to/LLaVA/AD/Result/124data_answer.jsonl
```
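If inference succeeds, every line of the answers file is a JSON record. A quick sanity check might look like the sketch below; the question_id and text fields follow what LLaVA's model_vqa.py writes, so treat the exact schema as an assumption:

```python
import json

# Print the first few model answers to verify the output file.
with open("/path/to/LLaVA/AD/Result/124data_answer.jsonl") as f:
    for i, line in enumerate(f):
        ans = json.loads(line)
        print(ans["question_id"], ans["text"][:80])
        if i == 4:
            break
```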
In our case, `124_chosen_ref.jsonl` contains the reference answers annotated by humans, and our original annotation Excel file is also on Google Drive.
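If you want to rebuild the reference file from the spreadsheet yourself, a minimal pandas sketch could look like the following; the file name annotations.xlsx and both column names are hypothetical, so match them to the actual Excel headers:

```python
import json

import pandas as pd

# Convert the annotation spreadsheet into a reference-answer JSONL.
# "annotations.xlsx" and both column names are hypothetical placeholders.
df = pd.read_excel("annotations.xlsx")
with open("124_chosen_ref.jsonl", "w") as f:
    for _, row in df.iterrows():
        record = {
            "question_id": int(row["question_id"]),  # hypothetical column
            "text": str(row["answer"]),              # hypothetical column
        }
        f.write(json.dumps(record) + "\n")
```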
Modify the paths in eva.sh, then run:

```bash
sh eva.sh
```

OR

```bash
OPENAI_API_KEY="sk-***********************************" python llava/eval/eval_gpt_review_visual.py \
    --question /path/to/LLaVA/AD/124data_question.jsonl \
    --answer-list \
        /path/to/LLaVA/AD/124_chosen_ref.jsonl \
        /path/to/LLaVA/AD/Result/124data_answer.jsonl \
    --rule llava/eval/table/rule.json \
    --output /path/to/review.json
```
Open Score_tuple, modify the path, and then run it.
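Conceptually, Score_tuple aggregates the per-question score pairs stored in review.json. Below is a simplified sketch of that aggregation, assuming each review line carries a "tuple" of [reference_score, model_score] as in LLaVA's GPT review output:

```python
import json

import numpy as np

# Average the GPT review score pairs and report a relative score.
# Each line's "tuple" is assumed to be [reference_score, model_score].
with open("/path/to/review.json") as f:
    scores = np.array([json.loads(line)["tuple"] for line in f], dtype=float)

ref_mean, model_mean = scores.mean(axis=0)
print(f"reference: {ref_mean:.2f}  model: {model_mean:.2f}  "
      f"relative: {100 * model_mean / ref_mean:.1f}%")
```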
Modify the path in summary.sh, then run:

```bash
sh summary.sh
```
- Ruoyu Chen
- Zirui Song
- Zhenhao Chen
- Dayan Guan*
- Helin Wang
- YangJing Pu
- Zihui Cui
- Yushan Jiang
We would like to express our gratitude to the students (Yuehuan Wang, Yuxiao Huang, Zedong Zhao, Zheng Sun, Yuan Huang, Zhe Fu) who participated in data annotation.
- Vicuna: the codebase we built upon, and our base model Vicuna-13B, which has amazing language capabilities!
- LLaVA
- LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day
- Drive on Language: Unlocking the future where autonomous driving meets the unlimited potential of language
- DriveGPT4: Interpretable End-to-end Autonomous Driving via Large Language Model
- LINGO-1: Exploring Natural Language for Autonomous Driving