Vision-LLM-Alignment/data at master · NiuTrans/Vision-LLM-Alignment

Name		Name	Last commit message	Last commit date
parent directory ..
prediction_sample		prediction_sample
README.md		README.md
convert_to_llava_ppo_dataset.py		convert_to_llava_ppo_dataset.py
convert_to_llava_reward_dataset.py		convert_to_llava_reward_dataset.py
ppo_samples.json		ppo_samples.json
reward_samples.json		reward_samples.json
sft_samples.json		sft_samples.json
sft_samples_multi_image.json		sft_samples_multi_image.json

README.md

The source of the dataset is described below:

Image Folder: Download 2017 Train images from https://cocodataset.org/#download.
SFT Dataset: Download from LLaVA-Instruct. Like LLaVA, we can also pre-train for feature alignment using SFT datasets, such as CC3M.
RM Dataset: Download from LLaVA-Human-Preference. We use the convert_to_llava_reward_dataset.py script to convert it into the format of our reward training dataset.
PPO Dataset: We use the same data format as SFT, but in this answer field we can leave it empty or give it a random string.