mrjob.hadoop - run on your Hadoop cluster

class mrjob.hadoop.HadoopJobRunner(**kwargs)

Runs an MRJob on your Hadoop cluster. Invoked when you run your job with -r hadoop.

Input and support files can be either local or on HDFS; use hdfs://... URLs to refer to files on HDFS.
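As a rough illustration of the invocation described above, the sketch below assembles a command line that selects the Hadoop runner and mixes a local input with an HDFS input. The helper name build_hadoop_command, the script name my_job.py, and both input paths are hypothetical and used only for illustration; only the -r hadoop switch and the hdfs:// URL form come from this page.

```python
import sys


def build_hadoop_command(script, inputs, python=sys.executable):
    """Return an argv list that runs an MRJob script with the Hadoop runner.

    Hypothetical helper: it only assembles the command and does not run it.
    """
    # "-r hadoop" selects HadoopJobRunner; inputs may be local paths
    # or hdfs:// URLs, passed through unchanged.
    return [python, script, "-r", "hadoop", *inputs]


cmd = build_hadoop_command(
    "my_job.py",                                        # hypothetical job script
    ["local_input.txt", "hdfs:///data/more_input.txt"],  # hypothetical inputs
    python="python",
)
print(" ".join(cmd))
```

The resulting list could be handed to subprocess.run on a machine where Hadoop and mrjob are installed; the sketch stops short of that so it has no external dependencies.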

HadoopJobRunner.__init__(**kwargs)

HadoopJobRunner takes the same arguments as MRJobRunner, plus some additional options whose defaults can be set in mrjob.conf.

Utilities

mrjob.hadoop.fully_qualify_hdfs_path(path)

Return path as an hdfs:// URL, converting it if it isn’t one already.
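One plausible way to get this behavior is sketched below. It is a standalone illustration, not the library's code: the function name qualify_hdfs_path_sketch and its user parameter are hypothetical, and the handling of relative paths (placing them under /user/<name>/) and of non-HDFS URIs (left unchanged) are assumptions rather than documented guarantees.

```python
import getpass


def qualify_hdfs_path_sketch(path, user=None):
    """Illustrative stand-in for turning a path into an hdfs:// URL.

    Assumptions (not confirmed by this page):
    - anything that already looks like a URI is returned unchanged
    - relative paths are resolved under /user/<user>/
    """
    if "://" in path:
        # Already a URI (hdfs:// or otherwise): leave it alone.
        return path
    if path.startswith("/"):
        # Absolute path: prefix the scheme with an empty authority.
        return "hdfs://" + path
    # Relative path: assume it lives in the user's HDFS home directory.
    user = user or getpass.getuser()
    return "hdfs:///user/%s/%s" % (user, path)


print(qualify_hdfs_path_sketch("hdfs://namenode/logs/a.txt"))
print(qualify_hdfs_path_sketch("/tmp/output"))
print(qualify_hdfs_path_sketch("output", user="alice"))
```

The user parameter exists only so the relative-path case can be shown without depending on the login name of whoever runs the example.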