Pull requests: allenai/olmo-eval

Pull requests list

Add MMMU-Pro image-QA benchmark (Molmo2-4B)
#232 opened Jun 25, 2026 by uwGZQ
Update rich requirement from <15,>=13.7 to >=13.7,<16 (labels: dependencies, python)
#231 opened Jun 24, 2026 by dependabot Bot
Update openai-agents requirement from ~=0.7.0 to >=0.7,<0.18 (labels: dependencies, python)
#230 opened Jun 24, 2026 by dependabot Bot
Update click requirement from ~=8.3.2 to >=8.3.2,<8.5.0 (labels: dependencies, python)
#229 opened Jun 24, 2026 by dependabot Bot
Bump ty from 0.0.39 to 0.0.53 in the dev-dependencies group (labels: dependencies, python)
#228 opened Jun 24, 2026 by dependabot Bot
Bump the actions group with 3 updates (labels: dependencies, github_actions)
#227 opened Jun 24, 2026 by dependabot Bot
Roryd/science lit
#217 opened Jun 16, 2026 by donovanr Contributor Draft
13 tasks
Safety Suite Update
#208 opened Jun 12, 2026 by mgmorgan23 Contributor
7 of 13 tasks
Add general:posttrain:dev suite
#193 opened May 26, 2026 by finbarrtimbers Contributor
2 tasks done
Propagate suite-level CLI overrides to constituent task specs
#180 opened May 15, 2026 by finbarrtimbers Contributor
1 task
Report cumulative vLLM generation progress across batches
#179 opened May 15, 2026 by finbarrtimbers Contributor
1 task
Finbarr/prebuild sandbox
#158 opened Apr 27, 2026 by finbarrtimbers Contributor Draft
13 tasks
how to use add a serialized task walkthru
#121 opened Apr 8, 2026 by IanMagnusson Contributor
3 tasks done
DNM: Tweaks to support MSWEA + swe-bench
#118 opened Apr 3, 2026 by undfined Collaborator Draft
WIP: add swebench as external eval
#109 opened Apr 1, 2026 by aetting Draft
1 of 13 tasks
[WIP] Hard reasoning
#102 opened Mar 25, 2026 by rlebras Contributor
13 tasks
adds simple smoke tests
#76 opened Feb 27, 2026 by warmbowski Contributor
3 of 13 tasks