I've made several attempts to replicate the results of Qwen/Qwen2.5-Math-7B-Instruct on the College Math dataset, but I'm getting ~41.8, which is roughly 5 points below the reported 46.8 (despite using the ...