We present two comprehensive benchmarks to evaluate the performance of language models in coding assistance tasks, covering code writing, debugging, code review, and conceptual understanding. Our main ...