mrjob.local - simulate Hadoop locally with subprocesses
class mrjob.local.LocalMRJobRunner(**kwargs)
Runs an MRJob locally, for testing purposes. Invoked when you run your job with -r local.

Unlike InlineMRJobRunner, this actually spawns multiple subprocesses for each task.

It’s rare to need to instantiate this class directly (see __init__() for details).

New in version 0.6.8: can run Spark steps as well, on the local-cluster Spark master.
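
To make the idea of running each task in its own subprocess concrete, here is a minimal standard-library sketch. It is not mrjob's implementation: MAPPER_SOURCE and run_task_in_subprocess() are hypothetical names invented for this illustration, and the real runner does considerably more (sorting, reducers, working directories, and so on).

```python
import subprocess
import sys

# Hypothetical mapper source; it stands in for one task of a real job.
MAPPER_SOURCE = """
import sys
for line in sys.stdin:
    for word in line.split():
        print(word + "\\t1")
"""


def run_task_in_subprocess(source, input_text):
    """Run `source` in a fresh Python interpreter, feeding input_text on stdin."""
    result = subprocess.run(
        [sys.executable, "-c", source],  # child uses the current interpreter
        input=input_text,
        capture_output=True,
        text=True,
        check=True,  # raise if the child process exits with an error
    )
    return result.stdout.splitlines()


# Two input splits, each handled by its own child process.
splits = ["a b\n", "b c\n"]
outputs = [run_task_in_subprocess(MAPPER_SOURCE, split) for split in splits]
```

Because every split goes through a separate interpreter, problems that only appear across process boundaries (for example, relying on state kept in module-level variables) tend to surface here even when single-process testing would hide them.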
LocalMRJobRunner.__init__(**kwargs)
Arguments to this constructor may also appear in mrjob.conf under runners/local.

LocalMRJobRunner’s constructor takes the same keyword args as MRJobRunner. However, please note:
- cmdenv is combined with combine_local_envs()
- python_bin defaults to sys.executable (the current python interpreter)
- hadoop_input_format, hadoop_output_format, and partitioner are ignored because they require Java. If you need to test these, consider starting up a standalone Hadoop instance and running your job with -r hadoop.
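
The note about cmdenv can be illustrated with a rough sketch of how such a merge might behave. This is an approximation under stated assumptions, not mrjob's code: it assumes that later dictionaries take priority, and that variables whose names end in PATH are prepended using the local path separator (os.pathsep) instead of being overwritten. The function name combine_local_envs_sketch() and the sample values are hypothetical.

```python
import os


def combine_local_envs_sketch(*envs):
    """Hypothetical stand-in for combine_local_envs(); not mrjob's own code."""
    combined = {}
    for env in envs:
        for name, value in (env or {}).items():
            if name.endswith("PATH") and combined.get(name):
                # Assumed behavior: later path entries go in front of earlier ones.
                combined[name] = value + os.pathsep + combined[name]
            else:
                # Ordinary variables: the later dictionary wins.
                combined[name] = value
    return combined


# Sample values, invented for the illustration.
base_env = {"PATH": "/usr/bin", "TZ": "UTC"}
job_env = {"PATH": "/opt/tools/bin", "TZ": "America/Los_Angeles"}
merged = combine_local_envs_sketch(base_env, job_env)
```

Under these assumptions, merged["PATH"] puts /opt/tools/bin ahead of /usr/bin, joined by the platform's separator, while merged["TZ"] takes the later value.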