On the real-world problem I built this for, I'm seeing a ~60% speedup over standard MEX multi-threading, which saves me a couple of days of total simulation time. In addition, adding ...