Parallelized Eva Evaluation #44
Thanks Anis, but can you clean this up so I can push to main? Why do you have both a slurm_run_evals.sbatch and a slurm_run_evals_packed.sbatch? I don't understand the difference in how these are meant to be used. Also, it generally looks like there is a lot of AI-generated code that is not necessary, e.g., lines like "This is the failure mode behind the CUDA OOMs in job 33308", which have no meaning to other people. There are also lots of new lines of code that I'm not sure are necessary and that I don't know how to interpret. A PR should strictly implement the intended feature and leave unchanged everything that can be left unchanged.
Hi Paul, to answer your question: slurm_run_evals.sbatch was added by accident. The packed script is the one that parallelizes the Eva evals. I will remove slurm_run_evals.sbatch so that there is no confusion. I will also clean up the packed script to get rid of the unnecessary code ASAP.
Purpose
This PR introduces a Slurm script that parallelizes the evaluation of OpenMidnight on the Eva evaluation suite. The default parameters are set for ViT Base.
Usage
Run the packed eval script as follows:
    sbatch OpenMidnight/slurm_run_evals_packed.sbatch \
      /path/to/sweep_or_eval \
      /admin/home/achihoub/openmidnight_eval_results
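To evaluate several sweeps, the same command can be wrapped in a small loop. The following is only a sketch, not part of this PR: the helper name `build_eval_cmd`, the paths `/path/to/sweep_a`, `/path/to/sweep_b`, and `/path/to/results` are hypothetical, and it assumes the first positional argument is the sweep or eval path and the second is the results directory. It prints the commands rather than submitting them, so the output can be checked before anything is queued.

```shell
#!/usr/bin/env bash
set -eu

# Path to the packed script, as given in the usage above.
SBATCH_SCRIPT="OpenMidnight/slurm_run_evals_packed.sbatch"

# Print (do not run) the sbatch command for one evaluation.
# $1: path to a sweep or eval (assumed meaning of the first argument)
# $2: results directory (assumed meaning of the second argument)
build_eval_cmd() {
  printf 'sbatch %s %s %s\n' "$SBATCH_SCRIPT" "$1" "$2"
}

# Hypothetical example: one submission per sweep directory.
for target in /path/to/sweep_a /path/to/sweep_b; do
  build_eval_cmd "$target" /path/to/results
done
```

Once the printed commands look right, they can be pasted into a shell to submit them. The helper does no quoting, so it is only suitable for paths without spaces or shell metacharacters.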
Optional parameters include: