You have a big data project. You understand the problem domain, you know what infrastructure to use, and maybe you’ve even decided on the framework you will use to process all that data, but one ...