Commit graph

17 commits

SHA1        Date                        Message
db295c545c  2025-06-25 00:30:31 -04:00  Remove qwen2.5vl:3b for now
e706d68e72  2025-06-25 00:29:55 -04:00  Remove under-performing prompts
dd16b71f54  2025-06-25 00:22:57 -04:00  Try to reduce mentions of people when there are none
12769edaf4  2025-06-25 00:20:37 -04:00  Improve system prompt, drop temp back down to 0.1
6807db8ad9  2025-06-24 23:51:31 -04:00  Add CLAUDE.md
81f4ac2396  2025-06-24 23:51:21 -04:00  Add results-take1 summary
437a4a3284  2025-06-24 23:05:19 -04:00  Simplify, focus on llava:7b and qwen2.5vl:3b and 768px and 1024px images
9c32f2d04c  2025-06-24 22:47:30 -04:00  Add first batch of results
554488d1c4  2025-06-24 21:55:21 -04:00  Add scripts to work in parallel and aggregate results separately
c9fbfc1b67  2025-06-24 11:14:53 -04:00  Tweak concurrency/parallelism per model
7b6a1e5479  2025-06-24 10:20:18 -04:00  Pull models properly when benchmarking
86a382a700  2025-06-24 10:09:36 -04:00  Make benchmarking a lot faster and more efficient, stop deleting models
0d0e7a7cb0  2025-06-24 10:01:18 -04:00  Add benchmark script
f2750ed0e2  2025-06-24 09:46:44 -04:00  Pull models as needed
02a430c60e  2025-06-24 09:43:22 -04:00  Add more default models
7e4036ff20  2025-06-24 09:30:03 -04:00  Rename original photos
c283e6fb4f  2025-06-24 09:22:33 -04:00  First commit