mrjob.conf - parse and write config files
“mrjob.conf” is the name of both this module and the global config file for mrjob.
Reading and writing mrjob.conf

mrjob.conf.find_mrjob_conf()
Look for mrjob.conf, and return its path. Places we look:
- The location specified by MRJOB_CONF
- ~/.mrjob.conf
- /etc/mrjob.conf
Return None if we can’t find it.
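
The lookup order above can be sketched in plain Python. This is a hypothetical re-creation for illustration, not mrjob's own code: the name `sketch_find_conf` and its `environ` and `exists` parameters are assumptions added so the example can run without touching the real filesystem. It also assumes that a `MRJOB_CONF` path that does not exist falls through to the next candidate.

```python
import os


def sketch_find_conf(environ=None, exists=os.path.exists):
    """Return the first existing candidate path, or None (illustrative only)."""
    environ = os.environ if environ is None else environ

    candidates = []
    # 1. The location named by the MRJOB_CONF environment variable, if set
    if environ.get('MRJOB_CONF'):
        candidates.append(environ['MRJOB_CONF'])
    # 2. The per-user config file
    candidates.append(os.path.expanduser('~/.mrjob.conf'))
    # 3. The system-wide config file
    candidates.append('/etc/mrjob.conf')

    for path in candidates:
        if exists(path):
            return path
    return None
```

In real code you would simply call `find_mrjob_conf()` and handle a `None` result.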

mrjob.conf.load_opts_from_mrjob_conf(runner_alias, conf_path=None, already_loaded=None)
Load a list of dictionaries representing the options in a given mrjob.conf for a specific runner, resolving includes. Returns [(path, values)]. If conf_path is not found, return [(None, {})].
Parameters:
- runner_alias (str) – String identifier of the runner type, e.g. emr, local, etc.
- conf_path (str) – location of the file to load
- already_loaded (list) – list of real (according to os.path.realpath()) conf paths that have already been loaded (used by load_opts_from_mrjob_confs()).
Relative include: paths are relative to the real (after resolving symlinks) path of the including conf file.
This will only load each config file once, even if it’s referenced from multiple paths due to symlinks.
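
The "load each file once" rule depends on comparing real paths rather than the paths as written. A minimal sketch of that bookkeeping follows; `sketch_should_load` is a hypothetical helper, not part of mrjob, and it only illustrates how a list like already_loaded could be used.

```python
import os


def sketch_should_load(conf_path, already_loaded):
    """Return True the first time a real path is seen, False afterwards.

    already_loaded is a list of real paths, mutated in place
    (an assumption about how such a list might be maintained).
    """
    real_path = os.path.realpath(conf_path)
    if real_path in already_loaded:
        return False
    already_loaded.append(real_path)
    return True
```

Two different spellings of the same file, such as `mrjob.conf` and `./mrjob.conf`, resolve to one real path, so the second call returns False.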

mrjob.conf.load_opts_from_mrjob_confs(runner_alias, conf_paths=None)
Load a list of dictionaries representing the options in a given list of mrjob config files for a specific runner. Returns [(path, values), ...]. If a path is not found, use (None, {}) as its value.
If conf_paths is None, look for a config file in the default locations (see find_mrjob_conf()).
Parameters:
- runner_alias (str) – String identifier of the runner type, e.g. emr, local, etc.
- conf_paths – locations of the files to load
This will only load each config file once, even if it’s referenced from multiple paths due to symlinks.
Combining options
Combiner functions take a list of values to combine, with later options taking precedence over earlier ones. None values are always ignored.

mrjob.conf.combine_cmds(*cmds)
Take zero or more commands to run on the command line, and return the last one that is not None. Each command should either be a list containing the command plus switches, or a string, which will be parsed with shlex.split(). The string must either be a byte string or a unicode string containing no non-ASCII characters.
Returns either None or a list containing the command plus arguments.
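
As a rough illustration of the behavior described above, the following sketch picks the last non-None command and splits it with `shlex.split()` if it is a string. `sketch_combine_cmds` is a hypothetical stand-in, not mrjob's implementation, and it leaves out the byte-string handling mentioned in the description.

```python
import shlex


def sketch_combine_cmds(*cmds):
    """Return the last non-None command as a list of args, or None."""
    chosen = None
    for cmd in cmds:
        if cmd is not None:
            chosen = cmd  # later commands override earlier ones

    if chosen is None:
        return None
    if isinstance(chosen, str):
        # strings are parsed the way a POSIX shell would split them
        return shlex.split(chosen)
    return list(chosen)
```

For instance, `sketch_combine_cmds(['sort', '-r'], 'grep "a b"')` yields `['grep', 'a b']`, because the quoted argument stays together.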

mrjob.conf.combine_dicts(*dicts)
Combine zero or more dictionaries. Values from dicts later in the list take precedence over values earlier in the list.
If you pass in None in place of a dictionary, it will be ignored.
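
The precedence rule can be shown with a few lines of plain Python. `sketch_combine_dicts` is a hypothetical simplification that covers only what is described above; the real function may handle further cases.

```python
def sketch_combine_dicts(*dicts):
    """Merge dictionaries left to right, skipping None (illustrative only)."""
    result = {}
    for d in dicts:
        if d is None:
            continue
        result.update(d)  # keys from later dicts overwrite earlier ones
    return result
```

For example, merging `{'a': 1, 'b': 2}`, `None`, and `{'b': 3}` gives `{'a': 1, 'b': 3}`.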

mrjob.conf.combine_envs(*envs)
Combine zero or more dictionaries containing environment variables. Environment variable values may be wrapped in ClearedValue.
Environment variables from dictionaries later in the list take priority over those earlier in the list.
For variables ending with PATH, we prepend (and add a colon) rather than overwriting. Wrapping a path value in ClearedValue disables this behavior.
Environment variables set to ClearedValue(None) will delete environment variables earlier in the list, rather than setting them to None.
If you pass in None in place of a dictionary in envs, it will be ignored.
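
The rules above interact, so a small sketch may help. Everything here is hypothetical: `Cleared` is a stand-in for ClearedValue, and `sketch_combine_envs` with its `sep` parameter is an illustrative re-creation of the described behavior, not mrjob's code.

```python
class Cleared:
    """Hypothetical stand-in for ClearedValue: wraps a value to override."""

    def __init__(self, value):
        self.value = value


def sketch_combine_envs(*envs, sep=':'):
    """Combine environment dicts following the rules described above."""
    result = {}
    for env in envs:
        if not env:
            continue  # None (or empty) dictionaries are ignored
        for key, value in env.items():
            if isinstance(value, Cleared):
                if value.value is None:
                    result.pop(key, None)  # delete earlier settings
                else:
                    result[key] = value.value  # overwrite, even for *PATH
            elif key.endswith('PATH') and key in result:
                # later value goes in front of the earlier one
                result[key] = value + sep + result[key]
            else:
                result[key] = value
    return result
```

Combining `{'PATH': '/usr/bin'}` with `{'PATH': '/opt/bin'}` gives `{'PATH': '/opt/bin:/usr/bin'}`, while wrapping the second value in `Cleared` replaces the path outright. Passing `sep=os.pathsep` would approximate the local-separator variant.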

mrjob.conf.combine_jobconfs(*jobconfs)
Like combine_dicts(), but non-string values are converted to Java-readable strings (e.g. True becomes ‘true’). Keys whose value is None are blanked out.
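
A sketch of the conversion, under two labeled assumptions: "blanked out" is read here as dropping the key from the result, and values other than booleans are converted with `str()`. The names `sketch_to_java_str` and `sketch_combine_jobconfs` are hypothetical.

```python
def sketch_to_java_str(value):
    """Render a Python value the way a Java property might expect it."""
    if isinstance(value, bool):
        return 'true' if value else 'false'
    return str(value)


def sketch_combine_jobconfs(*jobconfs):
    """Merge jobconf dicts, then stringify values (illustrative only)."""
    merged = {}
    for jobconf in jobconfs:
        if jobconf:
            merged.update(jobconf)
    # Assumption: a key whose final value is None is omitted entirely
    return {k: sketch_to_java_str(v) for k, v in merged.items()
            if v is not None}
```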

mrjob.conf.combine_lists(*seqs)
Concatenate the given sequences into a list. Ignore None values.
Generally this is used for a list of commands we want to run; the “default” commands get run before any commands specific to your job.
Strings, bytes, and non-sequence objects (e.g. numbers) are treated as single-item lists.
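
The single-item rule is the subtle part, so here is a minimal sketch of it. `sketch_combine_lists` is a hypothetical re-creation; it uses `collections.abc.Sequence` to decide what counts as a sequence, which is an assumption about how the check could be written.

```python
from collections.abc import Sequence


def sketch_combine_lists(*seqs):
    """Concatenate sequences, wrapping scalars and strings as one item."""
    result = []
    for seq in seqs:
        if seq is None:
            continue
        if isinstance(seq, Sequence) and not isinstance(seq, (str, bytes)):
            result.extend(seq)  # real sequences are concatenated
        else:
            result.append(seq)  # strings, bytes, numbers: one item each
    return result
```

With this sketch, `sketch_combine_lists(['a'], None, 'b', 5, ('c', 'd'))` returns `['a', 'b', 5, 'c', 'd']`.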

mrjob.conf.combine_local_envs(*envs)
Same as combine_envs(), except that paths are combined using the local path separator (e.g. ; on Windows rather than :).

mrjob.conf.combine_path_lists(*path_seqs)
Concatenate the given sequences into a list. Ignore None values. Resolve ~ (home dir) and environment variables, and expand globs that refer to the local filesystem.
Can take single strings as well as lists.
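
The following sketch shows one way the expansion steps could fit together using only the standard library. `sketch_combine_path_lists` is hypothetical, and two details are assumptions rather than documented behavior: glob matches are sorted, and a pattern that matches nothing is kept as the expanded string.

```python
import glob
import os


def sketch_combine_path_lists(*path_seqs):
    """Concatenate path lists, expanding ~, env vars, and local globs."""
    results = []
    for path_seq in path_seqs:
        if path_seq is None:
            continue
        if isinstance(path_seq, str):
            path_seq = [path_seq]  # a single string acts as a one-item list
        for path in path_seq:
            expanded = os.path.expanduser(os.path.expandvars(path))
            matches = sorted(glob.glob(expanded))
            # Assumption: keep the path itself when nothing matches
            results.extend(matches if matches else [expanded])
    return results
```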

mrjob.conf.combine_paths(*paths)
Returns the last value in paths that is not None. Resolve ~ (home dir) and environment variables.
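
A short sketch of the same idea with standard-library calls. `sketch_combine_paths` is a hypothetical name, and the use of `os.path.expandvars()` followed by `os.path.expanduser()` is an assumption about how the resolution could be done.

```python
import os


def sketch_combine_paths(*paths):
    """Return the last non-None path with ~ and env vars resolved."""
    chosen = None
    for path in paths:
        if path is not None:
            chosen = path
    if chosen is None:
        return None
    return os.path.expanduser(os.path.expandvars(chosen))
```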

mrjob.conf.combine_values(*values)
Return the last value in values that is not None.
The default combiner; good for simple values (booleans, strings, numbers).
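
A minimal sketch of this rule, with the hypothetical name `sketch_combine_values`. Note that only None is skipped: falsy values such as `0` or `False` still count as set.

```python
def sketch_combine_values(*values):
    """Return the last value that is not None, or None if there is none."""
    for value in reversed(values):
        if value is not None:
            return value
    return None
```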