News

This is a brief walkthrough of the mrjob codebase. It's aimed primarily at mrjob maintainers and contributors, but should also be useful for anyone trying to debug an issue with mrjob. mrjob lets you ...
mrjob is a Python package for running Hadoop Streaming jobs on Amazon's Elastic MapReduce service. It's offered under the Apache 2.0 license.
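To make "Hadoop Streaming job" concrete, here is a minimal word-count sketch in plain Python. It does not use mrjob. It only imitates the streaming contract that mrjob builds on: the mapper emits tab-separated key/value lines, the framework sorts them by key, and the reducer sees each key's lines together. The names `mapper`, `reducer`, and `run_local` are hypothetical helpers for illustration, and the in-memory `sorted()` call stands in for Hadoop's shuffle-and-sort step.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' line per word, as a streaming mapper would."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_lines):
    """Sum the counts for each word; input must already be sorted by key."""
    pairs = (line.split("\t", 1) for line in sorted_lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def run_local(lines):
    """Hypothetical local driver: map, sort (stand-in for the shuffle), reduce."""
    return list(reducer(sorted(mapper(lines))))


if __name__ == "__main__":
    for out_line in run_local(["the quick fox", "the lazy dog"]):
        print(out_line)
```

In a real streaming job the mapper and reducer would be separate processes reading from standard input and writing to standard output. Chaining them in one function here just keeps the sketch runnable on its own.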
MapReduce jobs can be written in Java, but for simplicity and readability we're going to stick with Python. Before we start, we need to install the open-source MapReduce library, mrjob, ...