Currently, we consistently use list[int] to represent output_tokens in ModelRunnerOutput, which is very inefficient from a GC perspective. The default GC thresholds are (700, 10, 10), which means if ...
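
As a rough illustration of the GC cost, the sketch below checks which per-request containers CPython's cyclic collector tracks. The helper `tracked_container_count`, the request count, and the use of `bytes` as a flat, untracked stand-in are assumptions made for this example only; they are not the ModelRunnerOutput code or a proposed replacement type.

```python
import gc


def tracked_container_count(outputs):
    """Count how many of the given objects the cyclic GC tracks."""
    return sum(1 for out in outputs if gc.is_tracked(out))


# Hypothetical stand-in for one step's per-request output tokens.
num_requests = 1024

# One small list per request: every list is a GC-tracked container,
# so each allocation counts toward the generation-0 threshold.
as_lists = [[i] for i in range(num_requests)]

# A flat, non-container representation (bytes here, purely for
# illustration) is not tracked by the cyclic collector.
as_bytes = [i.to_bytes(4, "little") for i in range(num_requests)]

# Current collection thresholds for this interpreter.
print("thresholds:", gc.get_threshold())
print("tracked lists:", tracked_container_count(as_lists))
print("tracked bytes:", tracked_container_count(as_bytes))
```

Each tracked allocation brings generation 0 closer to its threshold, so producing one small list per request on every step triggers collections more often than a flat representation would.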