mrjob.spark.runner - run on any Spark cluster

Job Runner

class mrjob.spark.runner.SparkMRJobRunner(max_output_files=None, mrjob_cls=None, **kwargs)

Runs an MRJob on your Spark cluster (with or without Hadoop). Invoked when you run your job with -r spark.

See Running on your Spark cluster for more information.
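As a rough illustration of how this runner is selected, the sketch below builds the command line for a job script using only the standard library. The script name word_count.py, the input path, and the helper spark_runner_argv are hypothetical, and the --spark-master switch is assumed from mrjob's Spark options rather than stated on this page; only -r spark comes from the text above.

```python
import sys


def spark_runner_argv(script, input_paths, spark_master=None):
    """Build an argv list that runs an mrjob script with the Spark runner.

    `script` and `input_paths` are hypothetical placeholders. The
    --spark-master switch is an assumption, not confirmed by this page.
    """
    # "-r spark" is what selects SparkMRJobRunner.
    argv = [sys.executable, script, "-r", "spark"]
    if spark_master is not None:
        argv += ["--spark-master", spark_master]
    # Input paths go last, after all switches.
    argv += list(input_paths)
    return argv


if __name__ == "__main__":
    # Print the command rather than executing it, since no cluster is assumed.
    print(" ".join(spark_runner_argv("word_count.py", ["input.txt"])))
```

The resulting list could be passed to subprocess.run() on a machine where mrjob and Spark are installed; it is printed here so the sketch does not depend on either.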

The Spark runner can also run “classic” MRJobs directly on Spark, without using Hadoop streaming. See Running “classic” MRJobs on Spark.

New in version 0.6.8.