Title: Batch Computing with R
Description: Provides Map, Reduce and Filter variants to generate jobs on batch computing systems like PBS/Torque, LSF, SLURM and Sun Grid Engine. Multicore and SSH systems are also supported. For further details see the project web page.
Author: Bernd Bischl <bernd_bischl@gmx.net>, Michel Lang <michellang@gmail.com>, Henrik Bengtsson <henrikb@braju.com>
Maintainer: Bernd Bischl <bernd_bischl@gmx.net>
URL: https://github.com/tudo-r/BatchJobs
BugReports: https://github.com/tudo-r/BatchJobs/issues
MailingList: batchjobs@googlegroups.com
License: BSD_2_clause + file LICENSE
Depends: R (≥ 3.0.0), BBmisc (≥ 1.9), methods
Imports: backports (≥ 1.1.1), brew, checkmate (≥ 1.8.0), data.table (≥ 1.9.6), DBI, digest, parallel, RSQLite (≥ 1.0.9011), sendmailR, stats, stringi (≥ 0.4-1), utils
Suggests: MASS, testthat
Version: 1.9
RoxygenNote: 7.1.2
Encoding: UTF-8
NeedsCompilation: no
Packaged: 2022-03-21 10:39:57 UTC; michel
Repository: CRAN
Date/Publication: 2022-03-21 11:30:02 UTC
The BatchJobs package
Description
Provides Map, Reduce and Filter variants to generate jobs on batch computing systems like PBS/Torque, LSF, SLURM and Sun Grid Engine. Multicore and SSH systems are also supported. For further details see the project web page.
Additional information
- Homepage:
- Wiki:
- FAQ:
- Configuration:
The package currently supports the following further R options, which you can set either in your R profile file or a script via options():
- BatchJobs.verbose: This boolean flag can be set to FALSE to reduce the console output of the package operations. Usually you want to see this output in interactive work, but when you use the package in e.g. knitr documents, it clutters the resulting document too much.
- BatchJobs.check.posix: If this boolean flag is enabled, the package checks your registry file dir (and related user-defined directories) quite strictly for POSIX compliance. Usually this is a good idea: you do not want strange characters in your file paths, as this might result in problems when these paths get passed to the scheduler or other command-line tools that the package interoperates with. But on some operating systems this check might be too strict and cause problems. Setting the flag to FALSE disables the check entirely. The default is FALSE on Windows systems and TRUE otherwise.
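A minimal sketch of setting these options, e.g. in your R profile file or at the top of a script (base R only):

```r
# silence package console output, e.g. for knitr documents
options(BatchJobs.verbose = FALSE)
# disable the strict POSIX check of file paths, e.g. on problematic systems
options(BatchJobs.check.posix = FALSE)

# getOption() reads the current values back
getOption("BatchJobs.verbose")      # FALSE
getOption("BatchJobs.check.posix")  # FALSE
```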
Add packages to registry.
Description
Mutator function for packages in makeRegistry.
Usage
addRegistryPackages(reg, packages)
Arguments
reg |
[ |
packages |
[ |
Value
[Registry]. Changed registry.
See Also
Other exports: addRegistrySourceDirs(), addRegistrySourceFiles(), batchExport(), batchUnexport(), loadExports(), removeRegistryPackages(), removeRegistrySourceDirs(), removeRegistrySourceFiles(), setRegistryPackages()
Add source dirs to registry.
Description
Mutator function for src.dirs in makeRegistry.
Usage
addRegistrySourceDirs(reg, src.dirs, src.now = TRUE)
Arguments
reg |
[ |
src.dirs |
[ |
src.now |
[ |
Value
[Registry]. Changed registry.
See Also
Other exports: addRegistryPackages(), addRegistrySourceFiles(), batchExport(), batchUnexport(), loadExports(), removeRegistryPackages(), removeRegistrySourceDirs(), removeRegistrySourceFiles(), setRegistryPackages()
Add source files to registry.
Description
Mutator function for src.files in makeRegistry.
Usage
addRegistrySourceFiles(reg, src.files, src.now = TRUE)
Arguments
reg |
[ |
src.files |
[ |
src.now |
[ |
Value
[Registry]. Changed registry.
See Also
Other exports: addRegistryPackages(), addRegistrySourceDirs(), batchExport(), batchUnexport(), loadExports(), removeRegistryPackages(), removeRegistrySourceDirs(), removeRegistrySourceFiles(), setRegistryPackages()
applyJobFunction ONLY FOR INTERNAL USAGE.
Description
applyJobFunction ONLY FOR INTERNAL USAGE.
Usage
applyJobFunction(reg, job, cache)
Arguments
reg |
[ |
job |
[ |
cache |
[ |
Value
[any]. Result of job.
Map function over all combinations.
Description
Maps an n-ary function over the list of all combinations of the given vectors. Internally, expand.grid is used to compute the combinations, then batchMap is called.
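What expand.grid contributes can be previewed without a registry; a base-R sketch with the same parameters as the example below:

```r
# batchExpandGrid(reg, f, x = 1:2, y = 1:3) creates one job per row of:
grid <- expand.grid(x = 1:2, y = 1:3)
grid
#   x y
# 1 1 1
# 2 2 1
# 3 1 2
# 4 2 2
# 5 1 3
# 6 2 3
nrow(grid)  # 6 combinations -> 6 jobs
```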
Usage
batchExpandGrid(reg, fun, ..., more.args = list())
Arguments
reg |
[ |
fun |
[ |
... |
[any] |
more.args |
[ |
Value
[data.frame]. Expanded grid of combinations produced by expand.grid.
Examples
reg = makeRegistry(id = "BatchJobsExample", file.dir = tempfile(), seed = 123)
f = function(x, y, z) x * y + z
# let's store the param grid
grid = batchExpandGrid(reg, f, x = 1:2, y = 1:3, more.args = list(z = 10))
submitJobs(reg)
waitForJobs(reg)
y = reduceResultsVector(reg)
# later, we can always access the param grid like this
grid = getJobParamDf(reg)
cbind(grid, y = y)
Export R object to be available on the slaves.
Description
Saves objects as RData files in the “exports” subdirectory of your file.dir to be later loaded on the slaves.
Usage
batchExport(reg, ..., li = list(), overwrite = FALSE)
Arguments
reg |
[ |
... |
[any] |
li |
[ |
overwrite |
[ |
Value
[character]. Invisibly returns a character vector of exported objects.
See Also
Other exports: addRegistryPackages(), addRegistrySourceDirs(), addRegistrySourceFiles(), batchUnexport(), loadExports(), removeRegistryPackages(), removeRegistrySourceDirs(), removeRegistrySourceFiles(), setRegistryPackages()
Maps a function over lists or vectors, adding jobs to a registry.
Description
You can then submit these jobs to the batch system.
Usage
batchMap(reg, fun, ..., more.args = list(), use.names = FALSE)
Arguments
reg |
[ |
fun |
[ |
... |
[any] |
more.args |
[ |
use.names |
[ |
Value
Vector of type integer with job ids.
Examples
reg = makeRegistry(id = "BatchJobsExample", file.dir = tempfile(), seed = 123)
f = function(x) x^2
batchMap(reg, f, 1:10)
print(reg)
Combination of makeRegistry, batchMap and submitJobs.
Description
Combination of makeRegistry, batchMap and submitJobs for quick computations on the cluster.
Should only be used by skilled users who know what they are doing.
Creates the file.dir, maps the function, potentially chunks jobs and submits them.
Usage
batchMapQuick(
fun,
...,
more.args = list(),
file.dir = NULL,
packages = character(0L),
chunk.size,
n.chunks,
chunks.as.arrayjobs = FALSE,
inds,
resources = list()
)
Arguments
fun |
[ |
... |
[any] |
more.args |
[ |
file.dir |
[ |
packages |
[ |
chunk.size |
[ |
n.chunks |
[ |
chunks.as.arrayjobs |
[ |
inds |
[ |
resources |
[ |
Value
[Registry].
Maps a function over the results of a registry by using batchMap.
Description
Maps a function over the results of a registry by using batchMap.
Usage
batchMapResults(
reg,
reg2,
fun,
...,
ids,
part = NA_character_,
more.args = list()
)
Arguments
reg |
[ |
reg2 |
[ |
fun |
[ |
... |
[any] |
ids |
[ |
part |
[ |
more.args |
[ |
Value
Vector of type integer with job ids.
Examples
reg1 = makeRegistry(id = "BatchJobsExample1", file.dir = tempfile(), seed = 123)
# square some numbers
f = function(x) x^2
batchMap(reg1, f, 1:10)
# submit jobs and wait for the jobs to finish
submitJobs(reg1)
waitForJobs(reg1)
# look at results
reduceResults(reg1, fun = function(aggr,job,res) c(aggr, res))
reg2 = makeRegistry(id = "BatchJobsExample2", file.dir = tempfile(), seed = 123)
# define function to transform results, we simply do the inverse of the squaring
g = function(job, res) sqrt(res)
batchMapResults(reg1, reg2, fun = g)
# submit jobs and wait for the jobs to finish
submitJobs(reg2)
waitForJobs(reg2)
# check results
reduceResults(reg2, fun = function(aggr,job,res) c(aggr, res))
Manually query the BatchJobs database
Description
Manually query the BatchJobs database
Usage
batchQuery(reg, query, flags = "ro")
Arguments
reg |
[ |
query |
[ |
flags |
[ |
Value
[data.frame]. Result of the query.
Examples
reg = makeRegistry("test", file.dir = tempfile())
batchMap(reg, identity, i = 1:10)
batchQuery(reg, "SELECT * FROM test_job_status")
Reduces via a binary function over a list adding jobs to a registry.
Description
Each job reduces a certain number of elements on one slave. You can then submit these jobs to the batch system.
Usage
batchReduce(reg, fun, xs, init, block.size, more.args = list())
Arguments
reg |
[ |
fun |
[ |
xs |
[ |
init |
[any] |
block.size |
[ |
more.args |
[ |
Value
Vector of type integer with job ids.
Examples
# define function to reduce on slave, we want to sum a vector
f = function(aggr, x) aggr + x
reg = makeRegistry(id = "BatchJobsExample", file.dir = tempfile(), seed = 123)
# sum 20 numbers on each slave process, i.e. 5 jobs
batchReduce(reg, fun = f, 1:100, init = 0, block.size = 20)
submitJobs(reg)
waitForJobs(reg)
# now reduce one final time on master
reduceResults(reg, fun = function(aggr,job,res) f(aggr, res))
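Conceptually, batchReduce splits xs into blocks of block.size elements and runs a reduction over each block in its own job; the chunking and two-stage reduction can be sketched in base R (no scheduling involved):

```r
f <- function(aggr, x) aggr + x
xs <- 1:100
block.size <- 20

# stage 1: what each slave job would compute on its own block
blocks <- split(xs, ceiling(seq_along(xs) / block.size))
partial <- vapply(blocks, function(b) Reduce(f, b, init = 0), numeric(1))
length(partial)  # 5 blocks -> 5 jobs

# stage 2: the final reduction on the master
Reduce(f, partial, init = 0)  # 5050, same as sum(1:100)
```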
Reduces results via a binary function and adds jobs for this to a registry.
Description
Each job reduces a certain number of results on one slave.
You can then submit these jobs to the batch system.
Later, you can do a final reduction with reduceResults on the master.
Usage
batchReduceResults(
reg,
reg2,
fun,
ids,
part = NA_character_,
init,
block.size,
more.args = list()
)
Arguments
reg |
[ |
reg2 |
[ |
fun |
[ |
ids |
[ |
part |
[ |
init |
[any] |
block.size |
[ |
more.args |
[ |
Value
Vector of type integer with job ids.
Examples
# generating example results:
reg1 = makeRegistry(id = "BatchJobsExample1", file.dir = tempfile(), seed = 123)
f = function(x) x^2
batchMap(reg1, f, 1:20)
submitJobs(reg1)
waitForJobs(reg1)
# define function to reduce on slave, we want to sum the squares
myreduce = function(aggr, job, res) aggr + res
# sum 5 results on each slave process, i.e. 4 jobs
reg2 = makeRegistry(id = "BatchJobsExample2", file.dir = tempfile(), seed = 123)
batchReduceResults(reg1, reg2, fun = myreduce, init = 0, block.size = 5)
submitJobs(reg2)
waitForJobs(reg2)
# now reduce one final time on master
reduceResults(reg2, fun = myreduce)
Unload exported R objects.
Description
Removes RData files from the “exports” subdirectory of your file.dir and thereby prevents loading on the slave.
Usage
batchUnexport(reg, what)
Arguments
reg |
[ |
what |
[ |
Value
[character]. Invisibly returns a character vector of unexported objects.
See Also
Other exports: addRegistryPackages(), addRegistrySourceDirs(), addRegistrySourceFiles(), batchExport(), loadExports(), removeRegistryPackages(), removeRegistrySourceDirs(), removeRegistrySourceFiles(), setRegistryPackages()
Call an arbitrary function on specified SSH workers.
Description
Calls can be made in parallel or consecutively; the function waits until all calls have finished and returns the call results. In consecutive mode the output on the workers can also be shown on the master during computation.
Please read and understand the comments for argument dir.
Note that this function should only be used for short administrative tasks or information gathering on the workers; the true work horse for real computation is submitJobs.
In makeSSHWorker various options for load management are possible. Note that these will be ignored for the current call to execute it immediately.
Usage
callFunctionOnSSHWorkers(
nodenames,
fun,
...,
consecutive = FALSE,
show.output = consecutive,
simplify = TRUE,
use.names = TRUE,
dir = getwd()
)
Arguments
nodenames |
[ |
fun |
[ |
... |
[any] |
consecutive |
[ |
show.output |
[ |
simplify |
[ |
use.names |
[ |
dir |
[ |
Value
Results of function calls, either a list or simplified.
Cluster functions helper: Brew your template into a job description file.
Description
This function is only intended for use in your own cluster functions implementation.
Calls brew silently on your template; any error will lead to an exception.
If debug mode is turned on in the configuration, the file is stored at the same place as the corresponding R script in the “jobs” subdirectory of your files directory, otherwise in the temp dir via tempfile.
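As an illustration of the templating idea only: a toy substitution via base R's gsub, not the brew package's actual engine, and the template fields (job.name, rscript) are made up for this sketch:

```r
# toy stand-in for a brew-style template: "<%= key %>" marks a value to fill in
template <- "#!/bin/bash\n#$ -N <%= job.name %>\nRscript <%= rscript %>"

fill <- function(tmpl, values) {
  # replace each "<%= key %>" placeholder with its value
  for (key in names(values)) {
    placeholder <- sprintf("<%%= %s %%>", key)
    tmpl <- gsub(placeholder, values[[key]], tmpl, fixed = TRUE)
  }
  tmpl
}

job.file <- fill(template, list(job.name = "job-1", rscript = "jobs/1.R"))
cat(job.file)
```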
Usage
cfBrewTemplate(conf, template, rscript, extension)
Arguments
conf |
[ |
template |
[ |
rscript |
[ |
extension |
[ |
Value
[character(1)]. File path of result.
Cluster functions helper: Handle an unknown error during job submission.
Description
This function is only intended for use in your own cluster functions implementation.
Simply constructs a SubmitJobResult object with status code 101, NA as batch job id and an informative error message containing the output of the OS command in output.
Usage
cfHandleUnknownSubmitError(cmd, exit.code, output)
Arguments
cmd |
[ |
exit.code |
[ |
output |
[ |
Value
[SubmitJobResult].
Cluster functions helper: Kill a batch job via OS command
Description
This function is only intended for use in your own cluster functions implementation.
Calls the OS command to kill a job via system like this: “cmd batch.job.id”.
If the command returns an exit code > 0, the command is repeated after a 1 second sleep max.tries-1 times.
If the command failed in all tries, an exception is generated.
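The retry behaviour can be sketched in base R; the kill command is mocked with an R function here (cfKillBatchJob itself shells out via system()):

```r
# retry a command up to max.tries times, sleeping 1 second between attempts
kill.with.retry <- function(run.cmd, max.tries = 3L) {
  for (i in seq_len(max.tries)) {
    if (run.cmd() == 0L)        # exit code 0 means the kill succeeded
      return(invisible(TRUE))
    if (i < max.tries) Sys.sleep(1)
  }
  stop("could not kill job in ", max.tries, " tries")
}

# mock command: fails twice, succeeds on the third attempt
attempts <- 0L
mock.cmd <- function() { attempts <<- attempts + 1L; if (attempts < 3L) 1L else 0L }
kill.with.retry(mock.cmd)
attempts  # 3
```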
Usage
cfKillBatchJob(cmd, batch.job.id, max.tries = 3L)
Arguments
cmd |
[ |
batch.job.id |
[ |
max.tries |
[ |
Value
Nothing.
Cluster functions helper: Read in your brew template file.
Description
This function is only intended for use in your own cluster functions implementation.
Simply reads your template and returns it as a character vector. If you do this in the constructor of your cluster functions once, you can avoid this repeated file access later on.
Usage
cfReadBrewTemplate(template.file)
Arguments
template.file |
[ |
Value
[character].
Check job ids.
Description
Simply checks if the provided vector of job ids is valid and throws an error if something is odd.
Usage
checkIds(reg, ids, check.present = TRUE, len = NULL)
Arguments
reg |
[ |
ids |
[ |
check.present |
[ |
len |
[ |
Value
Invisibly returns the vector of ids, converted to integer.
BatchJobs configuration.
Description
In order to understand how the package should be configured please read https://github.com/tudo-r/BatchJobs/wiki/Configuration.
See Also
Other conf: getConfig(), loadConfig(), setConfig()
ONLY FOR INTERNAL USAGE.
Description
ONLY FOR INTERNAL USAGE.
Usage
copyRequiredJobFiles(reg1, reg2, id)
Arguments
reg1 |
[ |
reg2 |
[ |
id |
[ |
Value
Nothing.
ONLY FOR INTERNAL USAGE.
Description
ONLY FOR INTERNAL USAGE.
Usage
dbCreateJobDefTable(reg)
Arguments
reg |
[ |
Value
Nothing.
ONLY FOR INTERNAL USAGE.
Description
ONLY FOR INTERNAL USAGE.
Usage
dbGetJobs(reg, ids)
Arguments
reg |
[ |
ids |
[ |
Value
[list of Job]. Retrieved jobs from DB.
Helper function to debug multicore mode.
Description
Useful in case of severe errors. Tries different operations of increasing difficulty and provides debug output on the console.
Usage
debugMulticore(r.options)
Arguments
r.options |
[ |
Value
Nothing.
See Also
Other debug: debugSSH(), getErrorMessages(), getJobInfo(), getLogFiles(), grepLogs(), killJobs(), resetJobs(), setJobFunction(), showLog(), testJob()
Helper function to debug SSH mode.
Description
Useful in case of configuration problems. Tries different operations of increasing difficulty and provides debug output on the console.
Note that this function does not access nor use information specified for your cluster functions in your configuration.
Usage
debugSSH(
nodename,
ssh.cmd = "ssh",
ssh.args = character(0L),
rhome = "",
r.options = c("--no-save", "--no-restore", "--no-init-file", "--no-site-file"),
dir = getwd()
)
Arguments
nodename |
[ |
ssh.cmd |
[ |
ssh.args |
[ |
rhome |
[ |
r.options |
[ |
dir |
[ |
Value
Nothing.
See Also
Other debug: debugMulticore(), getErrorMessages(), getJobInfo(), getLogFiles(), grepLogs(), killJobs(), resetJobs(), setJobFunction(), showLog(), testJob()
Find all results where a specific condition is true.
Description
Find all results where a specific condition is true.
Usage
filterResults(reg, ids, fun, ...)
Arguments
reg |
[ |
ids |
[ |
fun |
[ |
... |
[any] |
Value
[integer]. Ids of jobs where fun(job, result) returns TRUE.
Examples
reg = makeRegistry(id = "BatchJobsExample", file.dir = tempfile(), seed = 123)
f = function(x) x^2
batchMap(reg, f, 1:10)
submitJobs(reg)
waitForJobs(reg)
# which square numbers are even:
filterResults(reg, fun = function(job, res) res %% 2 == 0)
Find jobs depending on computational state.
Description
findDone: Find jobs which successfully terminated.
Usage
findDone(reg, ids, limit = NULL)
findNotDone(reg, ids, limit = NULL)
findMissingResults(reg, ids, limit = NULL)
findErrors(reg, ids, limit = NULL)
findNotErrors(reg, ids, limit = NULL)
findTerminated(reg, ids, limit = NULL)
findNotTerminated(reg, ids, limit = NULL)
findSubmitted(reg, ids, limit = NULL)
findNotSubmitted(reg, ids, limit = NULL)
findOnSystem(reg, ids, limit = NULL)
findNotOnSystem(reg, ids, limit = NULL)
findRunning(reg, ids, limit = NULL)
findNotRunning(reg, ids, limit = NULL)
findStarted(reg, ids, limit = NULL)
findNotStarted(reg, ids, limit = NULL)
findExpired(reg, ids, limit = NULL)
findDisappeared(reg, ids, limit = NULL)
Arguments
reg |
[ |
ids |
[ |
limit |
[ |
Value
[integer]. Ids of jobs.
Finds ids of jobs that match a query.
Description
Finds ids of jobs that match a query.
Usage
findJobs(reg, ids, pars, jobnames)
Arguments
reg |
[ |
ids |
[ |
pars |
[R expression] |
jobnames |
[ |
Value
[integer]. Ids for jobs which match the query.
Examples
reg = makeRegistry(id = "BatchJobsExample", file.dir = tempfile(), seed = 123)
f = function(x, y) x * y
batchExpandGrid(reg, f, x = 1:2, y = 1:3)
findJobs(reg, pars = (y > 2))
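The pars argument is an unevaluated R expression that gets evaluated in the environment of each job's parameters; the mechanism can be sketched in base R (hypothetical parameter table, not tied to a registry):

```r
params <- expand.grid(x = 1:2, y = 1:3)  # parameters of 6 hypothetical jobs
query <- quote(y > 2)                    # the kind of expression passed as pars

# evaluate the query once per job, in that job's parameter row
matches <- which(vapply(seq_len(nrow(params)),
                        function(i) isTRUE(eval(query, params[i, ])),
                        logical(1)))
matches  # rows 5 and 6, i.e. the jobs with y == 3
```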
Returns a list of BatchJobs configuration settings
Description
Returns a list of BatchJobs configuration settings
Usage
getConfig()
Value
list of current configuration variables with class “Config”.
See Also
Other conf: configuration, loadConfig(), setConfig()
Get error messages of jobs.
Description
Get error messages of jobs.
Usage
getErrorMessages(reg, ids)
Arguments
reg |
[ |
ids |
[ |
Value
[character]. Error messages for jobs as character vector; NA if a job has terminated successfully.
See Also
Other debug: debugMulticore(), debugSSH(), getJobInfo(), getLogFiles(), grepLogs(), killJobs(), resetJobs(), setJobFunction(), showLog(), testJob()
Get job from registry by id.
Description
Get job from registry by id.
Usage
getJob(reg, id, check.id = TRUE)
Arguments
reg |
[ |
id |
[ |
check.id |
[ |
Value
[Job].
Get ids of jobs in registry.
Description
Get ids of jobs in registry.
Usage
getJobIds(reg)
Arguments
reg |
[ |
Value
[character].
Get computational information of jobs.
Description
Returns time stamps (submitted, started, done, error), time running, approximate memory usage (in Mb), error messages (shortened, see showLog for detailed error messages), time in queue, hostname of the host the job was executed on, assigned batch ID, the R PID and the seed of the job.
To estimate memory usage the sum of the last column of gc is used.
Column “time.running” displays the time until either the job was done or an error occurred; it will be NA in case of time outs or hard R crashes.
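The memory estimate described above can be reproduced directly in base R (assuming the default gc() layout, where the last column is a Mb figure):

```r
# gc() returns a matrix of memory statistics; its last column is in Mb
g <- gc()
mem.mb <- sum(g[, ncol(g)])
mem.mb  # approximate memory usage of the current R session in Mb
```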
Usage
getJobInfo(
reg,
ids,
pars = FALSE,
prefix.pars = FALSE,
select,
unit = "seconds"
)
Arguments
reg |
[ |
ids |
[ |
pars |
[ |
prefix.pars |
[ |
select |
[ |
unit |
[ |
Value
[data.frame].
See Also
Other debug: debugMulticore(), debugSSH(), getErrorMessages(), getLogFiles(), grepLogs(), killJobs(), resetJobs(), setJobFunction(), showLog(), testJob()
Get the physical location of job files on the hard disk.
Description
Get the physical location of job files on the hard disk.
Usage
getJobLocation(reg, ids)
Arguments
reg |
[ |
ids |
[ |
Value
[character]. Vector of directories.
Get number of jobs in registry.
Description
Get number of jobs in registry.
Usage
getJobNr(reg)
Arguments
reg |
[ |
Value
[integer(1)].
Retrieve Job Parameters.
Description
Returns parameters for all jobs as the rows of a data.frame.
Usage
getJobParamDf(reg, ids)
Arguments
reg |
[ |
ids |
[ |
Value
[data.frame]. Rows are named with job ids.
Examples
# see batchExpandGrid
Function to get the resources that were submitted for some jobs.
Description
Throws an error if called for unsubmitted jobs.
Usage
getJobResources(reg, ids, as.list = TRUE)
Arguments
reg |
[ |
ids |
[ |
as.list |
[ |
Value
[list | data.frame]. List (or data.frame) of resource lists as passed to submitJobs.
Get jobs from registry by id.
Description
Get jobs from registry by id.
Usage
getJobs(reg, ids, check.ids = TRUE)
Arguments
reg |
[ |
ids |
[ |
check.ids |
[ |
Value
[list of Job].
Get log file paths for jobs.
Description
Get log file paths for jobs.
Usage
getLogFiles(reg, ids)
Arguments
reg |
[ |
ids |
[ |
Value
[character]. Vector of file paths to log files.
See Also
Other debug: debugMulticore(), debugSSH(), getErrorMessages(), getJobInfo(), grepLogs(), killJobs(), resetJobs(), setJobFunction(), showLog(), testJob()
Function to get job resources in job function.
Description
Return the list passed to submitJobs, e.g. nodes, walltime, etc.
Usage
getResources()
Details
Can only be called in job function during job execution on slave.
Value
[list].
Print and return R installation and other information for SSH workers.
Description
Workers are queried in parallel via callFunctionOnSSHWorkers.
The function will display a warning if the first lib path on the worker is not writable, as this indicates potential problems in the configuration and installPackagesOnSSHWorkers will not work.
Usage
getSSHWorkersInfo(nodenames)
Arguments
nodenames |
[ |
Value
[list]. Displayed information as a list named by nodenames.
Grep log files for a pattern.
Description
Searches for occurrence of pattern in log files.
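The matching corresponds to base R pattern matching; a sketch on hypothetical log lines, using the function's defaults (pattern "warn", ignore.case = TRUE):

```r
log.lines <- c("Loading required package: MASS",
               "Warning message:",
               "In sqrt(-1) : NaNs produced")

# case-insensitive search, as grepLogs does with its defaults
hits <- grep("warn", log.lines, ignore.case = TRUE)
hits             # only line 2 matches
log.lines[hits]  # "Warning message:"
```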
Usage
grepLogs(
reg,
ids,
pattern = "warn",
ignore.case = TRUE,
verbose = FALSE,
range = 2L
)
Arguments
reg |
[ |
ids |
[ |
pattern |
[ |
ignore.case |
[ |
verbose |
[ |
range |
[ |
Value
[integer
]. Ids of jobs where pattern was found in the log file.
See Also
Other debug: debugMulticore(), debugSSH(), getErrorMessages(), getJobInfo(), getLogFiles(), killJobs(), resetJobs(), setJobFunction(), showLog(), testJob()
Install packages on SSH workers.
Description
Installation is done via callFunctionOnSSHWorkers and install.packages.
Note that as usual the function tries to install the packages into the first path of .libPaths() of each worker.
Usage
installPackagesOnSSHWorkers(
nodenames,
pkgs,
repos = getOption("repos"),
consecutive = TRUE,
show.output = consecutive,
...
)
Arguments
nodenames |
[ |
pkgs |
[ |
repos |
[ |
consecutive |
[ |
show.output |
[ |
... |
[any] |
Value
Nothing.
Kill some jobs on the batch system.
Description
Kill jobs which have already been submitted to the batch system. If a job is killed, its internal state is reset as if it had not been submitted at all.
The function informs you if (a) the job you want to kill has not been submitted, (b) the job has already terminated, or (c) for some reason no batch job id is available. In all 3 cases above, nothing is changed for the state of this job and no call to the internal kill cluster function is generated.
In case of an error when killing, the function tries, after a short sleep, to kill the remaining batch jobs again. If this fails again for some jobs, the function gives up. Only jobs that could be killed are reset in the DB.
Usage
killJobs(reg, ids, progressbar = TRUE)
Arguments
reg |
[ |
ids |
[ |
progressbar |
[ |
Value
[integer]. Ids of killed jobs.
See Also
Other debug: debugMulticore(), debugSSH(), getErrorMessages(), getJobInfo(), getLogFiles(), grepLogs(), resetJobs(), setJobFunction(), showLog(), testJob()
Examples
## Not run:
reg = makeRegistry(id = "BatchJobsExample", file.dir = tempfile(), seed = 123)
f = function(x) Sys.sleep(x)
batchMap(reg, f, 1:10 + 5)
submitJobs(reg)
waitForJobs(reg)
# kill all jobs currently _running_
killJobs(reg, findRunning(reg))
# kill all jobs queued or running
killJobs(reg, findNotTerminated(reg))
## End(Not run)
Load a specific configuration file.
Description
Load a specific configuration file.
Usage
loadConfig(conffile = ".BatchJobs.R")
Arguments
conffile |
[ |
Value
Invisibly returns a list of configuration settings.
See Also
Other conf: configuration, getConfig(), setConfig()
Load exported R data objects.
Description
Loads exported RData object files in the “exports” subdirectory of your file.dir and assigns the objects to the global environment.
Usage
loadExports(reg, what = NULL)
Arguments
reg |
[ |
what |
[ |
Value
[character]. Invisibly returns a character vector of loaded objects.
See Also
Other exports: addRegistryPackages(), addRegistrySourceDirs(), addRegistrySourceFiles(), batchExport(), batchUnexport(), removeRegistryPackages(), removeRegistrySourceDirs(), removeRegistrySourceFiles(), setRegistryPackages()
Load a previously saved registry.
Description
Loads a previously created registry from the file system.
The file.dir is automatically updated upon load if adjust.paths is set to TRUE, so be careful if you use the registry on multiple machines simultaneously, e.g. via sshfs or a samba share.
There is a heuristic included which tries to detect if the location of the registry has changed and returns a read-only registry if necessary.
Usage
loadRegistry(file.dir, work.dir, adjust.paths = FALSE)
Arguments
file.dir |
[ |
work.dir |
[ |
adjust.paths |
[ |
Value
[Registry].
Loads a specific result file.
Description
Loads a specific result file.
Usage
loadResult(
reg,
id,
part = NA_character_,
missing.ok = FALSE,
impute.val = NULL
)
Arguments
reg |
[ |
id |
[ |
part |
[ |
missing.ok |
[ |
impute.val |
[any] |
Value
[any]. Result of job.
Loads result files for id vector.
Description
Loads result files for id vector.
Usage
loadResults(
reg,
ids,
part = NA_character_,
simplify = FALSE,
use.names = "ids",
missing.ok = FALSE,
impute.val = NULL
)
Arguments
reg |
[ |
ids |
[ |
part |
[ |
simplify |
[ |
use.names |
[ |
missing.ok |
[ |
impute.val |
[any] |
Value
[list]. Results of jobs as list, possibly named by ids.
Create a ClusterFunctions object.
Description
Use this function when you implement a backend for a batch system. You must define the functions specified in the arguments.
Usage
makeClusterFunctions(
name,
submitJob,
killJob,
listJobs,
getArrayEnvirName,
class = NULL,
...
)
Arguments
name |
[ |
submitJob |
[ |
killJob |
[ |
listJobs |
[ |
getArrayEnvirName |
[ |
class |
[ |
... |
[ |
See Also
Other clusterFunctions: makeClusterFunctionsInteractive(), makeClusterFunctionsLSF(), makeClusterFunctionsLocal(), makeClusterFunctionsMulticore(), makeClusterFunctionsOpenLava(), makeClusterFunctionsSGE(), makeClusterFunctionsSLURM(), makeClusterFunctionsSSH(), makeClusterFunctionsTorque()
Create cluster functions for sequential execution in same session.
Description
All jobs executed under these cluster functions are executed sequentially, in the same interactive R process that you are currently in. That is, submitJob does not return until the job has finished. The main use of this ClusterFunctions implementation is to test and debug programs on a local computer.
Listing jobs returns an empty vector (as no jobs can be running when you call this) and killJob returns at once (for the same reason).
Usage
makeClusterFunctionsInteractive(write.logs = TRUE)
Arguments
write.logs |
[ |
Value
[ClusterFunctions].
See Also
Other clusterFunctions: makeClusterFunctionsLSF(), makeClusterFunctionsLocal(), makeClusterFunctionsMulticore(), makeClusterFunctionsOpenLava(), makeClusterFunctionsSGE(), makeClusterFunctionsSLURM(), makeClusterFunctionsSSH(), makeClusterFunctionsTorque(), makeClusterFunctions()
Create cluster functions for LSF systems.
Description
Job files are created based on the brew template template.file. This file is processed with brew and then submitted to the queue using the bsub command. Jobs are killed using the bkill command and the list of running jobs is retrieved using bjobs -u $USER -w. The user must have the appropriate privileges to submit, delete and list jobs on the cluster (this is usually the case).
The template file can access all arguments passed to the submitJob function; see ClusterFunctions. It is the template file's job to choose a queue for the job and handle the desired resource allocations.
Examples can be found at https://github.com/tudo-r/BatchJobs/tree/master/examples/cfLSF.
Usage
makeClusterFunctionsLSF(
template.file,
list.jobs.cmd = c("bjobs", "-u $USER", "-w")
)
Arguments
template.file |
[ |
list.jobs.cmd |
[ |
Value
[ClusterFunctions].
See Also
Other clusterFunctions: makeClusterFunctionsInteractive(), makeClusterFunctionsLocal(), makeClusterFunctionsMulticore(), makeClusterFunctionsOpenLava(), makeClusterFunctionsSGE(), makeClusterFunctionsSLURM(), makeClusterFunctionsSSH(), makeClusterFunctionsTorque(), makeClusterFunctions()
Create cluster functions for sequential execution on local host.
Description
All jobs executed under these cluster functions are executed sequentially, but in an independent, new R session. That is, submitJob does not return until the job has finished. The main use of this ClusterFunctions implementation is to test and debug programs on a local computer.
Listing jobs returns an empty vector (as no jobs can be running when you call this) and killJob returns at once (for the same reason).
Usage
makeClusterFunctionsLocal()
Value
[ClusterFunctions].
See Also
Other clusterFunctions: makeClusterFunctionsInteractive(), makeClusterFunctionsLSF(), makeClusterFunctionsMulticore(), makeClusterFunctionsOpenLava(), makeClusterFunctionsSGE(), makeClusterFunctionsSLURM(), makeClusterFunctionsSSH(), makeClusterFunctionsTorque(), makeClusterFunctions()
Use multiple cores on local Linux machine to spawn parallel jobs.
Description
Jobs are spawned by starting multiple R sessions on the command line (similar to true batch systems).
Packages parallel or multicore are not used in any way.
Usage
makeClusterFunctionsMulticore(
ncpus = max(getOption("mc.cores", parallel::detectCores()) - 1L, 1L),
max.jobs,
max.load,
nice,
r.options = c("--no-save", "--no-restore", "--no-init-file", "--no-site-file"),
script
)
Arguments
ncpus |
[ |
max.jobs |
[ |
max.load |
[ |
nice |
[ |
r.options |
[ |
script |
[ |
Value
[ClusterFunctions].
See Also
Other clusterFunctions: makeClusterFunctionsInteractive(), makeClusterFunctionsLSF(), makeClusterFunctionsLocal(), makeClusterFunctionsOpenLava(), makeClusterFunctionsSGE(), makeClusterFunctionsSLURM(), makeClusterFunctionsSSH(), makeClusterFunctionsTorque(), makeClusterFunctions()
Create cluster functions for OpenLava systems.
Description
Job files are created based on the brew template template.file. This file is processed with brew and then submitted to the queue using the bsub command. Jobs are killed using the bkill command and the list of running jobs is retrieved using bjobs -u $USER -w. The user must have the appropriate privileges to submit, delete and list jobs on the cluster (this is usually the case).
The template file can access all arguments passed to the submitJob function; see ClusterFunctions. It is the template file's job to choose a queue for the job and handle the desired resource allocations.
Examples can be found at https://github.com/tudo-r/BatchJobs/tree/master/examples/cfOpenLava.
Usage
makeClusterFunctionsOpenLava(
template.file,
list.jobs.cmd = c("bjobs", "-u $USER", "-w")
)
Arguments
template.file |
[ |
list.jobs.cmd |
[ |
Value
[ClusterFunctions].
See Also
Other clusterFunctions: makeClusterFunctionsInteractive(), makeClusterFunctionsLSF(), makeClusterFunctionsLocal(), makeClusterFunctionsMulticore(), makeClusterFunctionsSGE(), makeClusterFunctionsSLURM(), makeClusterFunctionsSSH(), makeClusterFunctionsTorque(), makeClusterFunctions()
Create cluster functions for Sun Grid Engine systems.
Description
Job files are created based on the brew template template.file. This file is processed with brew and then submitted to the queue using the qsub command. Jobs are killed using the qdel command and the list of running jobs is retrieved using qselect. The user must have the appropriate privileges to submit, delete and list jobs on the cluster (this is usually the case).
The template file can access all arguments passed to the submitJob function; see ClusterFunctions. It is the template file's job to choose a queue for the job and handle the desired resource allocations.
Examples can be found at https://github.com/tudo-r/BatchJobs/tree/master/examples/cfSGE.
Usage
makeClusterFunctionsSGE(template.file, list.jobs.cmd = c("qstat", "-u $USER"))
Arguments
template.file, list.jobs.cmd
Value
See Also
Other clusterFunctions: makeClusterFunctionsInteractive(), makeClusterFunctionsLSF(), makeClusterFunctionsLocal(), makeClusterFunctionsMulticore(), makeClusterFunctionsOpenLava(), makeClusterFunctionsSLURM(), makeClusterFunctionsSSH(), makeClusterFunctionsTorque(), makeClusterFunctions()
Create cluster functions for SLURM-based systems.
Description
Job files are created based on the brew template template.file. This file is processed with brew and then submitted to the queue using the sbatch command. Jobs are killed using the scancel command and the list of running jobs is retrieved using squeue. The user must have the appropriate privileges to submit, delete and list jobs on the cluster (this is usually the case).
The template file can access all arguments passed to the submitJob function, see ClusterFunctions. It is the template file's job to choose a queue for the job and handle the desired resource allocations.
Examples can be found at https://github.com/tudo-r/BatchJobs/tree/master/examples/cfSLURM.
Usage
makeClusterFunctionsSLURM(
template.file,
list.jobs.cmd = c("squeue", "-h", "-o %i", "-u $USER")
)
Arguments
template.file, list.jobs.cmd
Value
See Also
Other clusterFunctions: makeClusterFunctionsInteractive(), makeClusterFunctionsLSF(), makeClusterFunctionsLocal(), makeClusterFunctionsMulticore(), makeClusterFunctionsOpenLava(), makeClusterFunctionsSGE(), makeClusterFunctionsSSH(), makeClusterFunctionsTorque(), makeClusterFunctions()
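As a concrete illustration, a minimal brew template for this function might look as follows. This is only a sketch: the job.name, log.file and rscript variables come from the submitJob interface described in ClusterFunctions, while the resources fields (walltime, memory) and the file name simple.tmpl are assumptions that must match whatever you pass to submitJobs.

```sh
#!/bin/bash
## simple.tmpl -- hypothetical minimal SLURM template (sketch, not shipped with the package)
#SBATCH --job-name=<%= job.name %>
#SBATCH --output=/dev/null
#SBATCH --time=<%= resources$walltime %>
#SBATCH --mem=<%= resources$memory %>
## run the generated R script; its output goes to the BatchJobs log file
R CMD BATCH --no-save --no-restore "<%= rscript %>" "<%= log.file %>"
```

The template would then be registered via cluster.functions = makeClusterFunctionsSLURM("simple.tmpl"), and the resource fields supplied per submit, e.g. submitJobs(reg, resources = list(walltime = "00:30:00", memory = "1G")).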
Create an SSH cluster to execute jobs.
Description
Worker nodes must share the same file system and be accessible via ssh without manually entering passwords (e.g. via ssh-agent or a passwordless public key). Note that you can also use this function to parallelize on multiple cores of your local machine, but you still have to run an ssh server and provide passwordless access to localhost.
Usage
makeClusterFunctionsSSH(..., workers)
Arguments
..., workers
Value
[ClusterFunctions].
See Also
Other clusterFunctions: makeClusterFunctionsInteractive(), makeClusterFunctionsLSF(), makeClusterFunctionsLocal(), makeClusterFunctionsMulticore(), makeClusterFunctionsOpenLava(), makeClusterFunctionsSGE(), makeClusterFunctionsSLURM(), makeClusterFunctionsTorque(), makeClusterFunctions()
Examples
## Not run:
# Assume you have three nodes larry, curley and moe. All have 6
# cpu cores. On curley and moe R is installed under
# "/opt/R/R-current" and on larry R is installed under
# "/usr/local/R/". larry should not be used extensively because
# somebody else wants to compute there as well.
# Then a call to 'makeClusterFunctionsSSH'
# might look like this:
cluster.functions = makeClusterFunctionsSSH(
makeSSHWorker(nodename = "larry", rhome = "/usr/local/R", max.jobs = 2),
makeSSHWorker(nodename = "curley", rhome = "/opt/R/R-current"),
makeSSHWorker(nodename = "moe", rhome = "/opt/R/R-current"))
## End(Not run)
Create cluster functions for Torque-based systems.
Description
Job files are created based on the brew template template.file. This file is processed with brew and then submitted to the queue using the qsub command. Jobs are killed using the qdel command and the list of running jobs is retrieved using qselect. The user must have the appropriate privileges to submit, delete and list jobs on the cluster (this is usually the case).
The template file can access all arguments passed to the submitJob function, see ClusterFunctions. It is the template file's job to choose a queue for the job and handle the desired resource allocations.
Examples can be found at https://github.com/tudo-r/BatchJobs/tree/master/examples/cfTorque.
Usage
makeClusterFunctionsTorque(
template.file,
list.jobs.cmd = c("qselect", "-u $USER", "-s EHQRTW")
)
Arguments
template.file, list.jobs.cmd
Value
See Also
Other clusterFunctions: makeClusterFunctionsInteractive(), makeClusterFunctionsLSF(), makeClusterFunctionsLocal(), makeClusterFunctionsMulticore(), makeClusterFunctionsOpenLava(), makeClusterFunctionsSGE(), makeClusterFunctionsSLURM(), makeClusterFunctionsSSH(), makeClusterFunctions()
Creates a job description.
Description
Usually you will not do this manually. Every object is a list that contains the passed arguments of the constructor.
Usage
makeJob(id = NA_integer_, fun, fun.id = digest(fun), pars, name, seed)
Arguments
id, fun, fun.id, pars, name, seed
Construct a registry object.
Description
Note that if you don't want links in your paths (file.dir, work.dir) to get resolved, and want complete control over the way the path is used internally, pass an absolute path which begins with “/”.
Usage
makeRegistry(
id,
file.dir,
sharding = TRUE,
work.dir,
multiple.result.files = FALSE,
seed,
packages = character(0L),
src.dirs = character(0L),
src.files = character(0L),
skip = TRUE
)
Arguments
id, file.dir, sharding, work.dir, multiple.result.files, seed, packages, src.dirs, src.files, skip
Details
Every object is a list that contains the passed arguments of the constructor.
Value
[Registry].
Examples
reg = makeRegistry(id = "BatchJobsExample", file.dir = tempfile(), seed = 123)
print(reg)
Create SSH worker for SSH cluster functions.
Description
Create SSH worker for SSH cluster functions.
Usage
makeSSHWorker(
nodename,
ssh.cmd = "ssh",
ssh.args = character(0L),
rhome = "",
ncpus,
max.jobs,
max.load,
nice,
r.options = c("--no-save", "--no-restore", "--no-init-file", "--no-site-file"),
script
)
Arguments
nodename, ssh.cmd, ssh.args, rhome, ncpus, max.jobs, max.load, nice, r.options, script
Value
[SSHWorker].
Create a SubmitJobResult object.
Description
Use this function in your implementation of makeClusterFunctions to create a return value for the submitJob function.
Usage
makeSubmitJobResult(status, batch.job.id, msg, ...)
Arguments
status, batch.job.id, msg, ...
Value
[SubmitJobResult]. A list containing status, batch.job.id and msg.
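For orientation, here is a sketch of how a custom backend's submitJob might use this constructor. The signature follows the submitJob interface described in ClusterFunctions; the mysub command is hypothetical, and treating any non-zero exit as a temporary error with status 42 is an illustrative choice, not the package's actual policy.

```r
submitJob = function(conf, reg, job.name, rscript, log.file, job.dir, resources, arrayjobs) {
  # run the hypothetical submit command, capturing its stdout
  output = suppressWarnings(system2("mysub", shQuote(rscript), stdout = TRUE))
  if (is.null(attr(output, "status"))) {
    # exit code 0: assume the scheduler printed the batch job id on stdout
    makeSubmitJobResult(status = 0L, batch.job.id = output[1L])
  } else {
    # non-zero exit: report a temporary error so the submit is retried
    makeSubmitJobResult(status = 42L, batch.job.id = NA_character_,
                        msg = "mysub failed, retrying")
  }
}
```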
Reduce results from result directory.
Description
The following functions provide ways to reduce result files into either specific R objects (like vectors, lists, matrices or data.frames) or to arbitrarily aggregate them, which is a more general operation.
Usage
reduceResults(
reg,
ids,
part = NA_character_,
fun,
init,
impute.val,
progressbar = TRUE,
...
)
reduceResultsList(
reg,
ids,
part = NA_character_,
fun,
...,
use.names = "ids",
impute.val,
progressbar = TRUE
)
reduceResultsVector(
reg,
ids,
part = NA_character_,
fun,
...,
use.names = "ids",
impute.val
)
reduceResultsMatrix(
reg,
ids,
part = NA_character_,
fun,
...,
rows = TRUE,
use.names = "ids",
impute.val
)
reduceResultsDataFrame(
reg,
ids,
part = NA_character_,
fun,
...,
use.names = "ids",
impute.val,
strings.as.factors = FALSE
)
reduceResultsDataTable(
reg,
ids,
part = NA_character_,
fun,
...,
use.names = "ids",
impute.val
)
Arguments
reg, ids, part, fun, init, impute.val, progressbar, ..., use.names, rows, strings.as.factors
Value
Aggregated results; the return type depends on the function. If ids is empty: reduceResults returns init (if available) or NULL, reduceResultsVector returns c(), reduceResultsList returns list(), reduceResultsMatrix returns matrix(0,0,0), reduceResultsDataFrame returns data.frame().
Examples
# generate results:
reg = makeRegistry(id = "BatchJobsExample", file.dir = tempfile(), seed = 123)
f = function(x) x^2
batchMap(reg, f, 1:5)
submitJobs(reg)
waitForJobs(reg)
# reduce results to a vector
reduceResultsVector(reg)
# reduce results to sum
reduceResults(reg, fun = function(aggr, job, res) aggr+res)
reg = makeRegistry(id = "BatchJobsExample", file.dir = tempfile(), seed = 123)
f = function(x) list(a = x, b = as.character(2*x), c = x^2)
batchMap(reg, f, 1:5)
submitJobs(reg)
waitForJobs(reg)
# reduce results to a vector
reduceResultsVector(reg, fun = function(job, res) res$a)
reduceResultsVector(reg, fun = function(job, res) res$b)
# reduce results to a list
reduceResultsList(reg)
# reduce results to a matrix
reduceResultsMatrix(reg, fun = function(job, res) res[c(1,3)])
reduceResultsMatrix(reg, fun = function(job, res) c(foo = res$a, bar = res$c), rows = TRUE)
reduceResultsMatrix(reg, fun = function(job, res) c(foo = res$a, bar = res$c), rows = FALSE)
# reduce results to a data.frame
print(str(reduceResultsDataFrame(reg)))
# reduce results to a sum
reduceResults(reg, fun = function(aggr, job, res) aggr+res$a, init = 0)
Remove a registry object.
Description
If there are no live/running jobs, the registry will be closed and all of its files will be removed from the file system. If there are live/running jobs, an informative error is generated. The default is to prompt the user for confirmation.
Usage
removeRegistry(reg, ask = c("yes", "no"))
Arguments
reg, ask
Value
[logical[1]]
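For example, to delete a throwaway registry non-interactively (the registry id here is arbitrary):

```r
reg = makeRegistry(id = "removeExample", file.dir = tempfile())
removeRegistry(reg, ask = "no")  # skip the interactive confirmation
```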
Remove packages from registry.
Description
Mutator function for packages in makeRegistry.
Usage
removeRegistryPackages(reg, packages)
Arguments
reg, packages
Value
[Registry]. Changed registry.
See Also
Other exports: addRegistryPackages(), addRegistrySourceDirs(), addRegistrySourceFiles(), batchExport(), batchUnexport(), loadExports(), removeRegistrySourceDirs(), removeRegistrySourceFiles(), setRegistryPackages()
Remove source directories from registry.
Description
Mutator function for src.dirs in makeRegistry.
Usage
removeRegistrySourceDirs(reg, src.dirs)
Arguments
reg, src.dirs
Value
[Registry]. Changed registry.
See Also
Other exports: addRegistryPackages(), addRegistrySourceDirs(), addRegistrySourceFiles(), batchExport(), batchUnexport(), loadExports(), removeRegistryPackages(), removeRegistrySourceFiles(), setRegistryPackages()
Remove source files from registry.
Description
Mutator function for src.files in makeRegistry.
Usage
removeRegistrySourceFiles(reg, src.files)
Arguments
reg, src.files
Value
[Registry]. Changed registry.
See Also
Other exports: addRegistryPackages(), addRegistrySourceDirs(), addRegistrySourceFiles(), batchExport(), batchUnexport(), loadExports(), removeRegistryPackages(), removeRegistrySourceDirs(), setRegistryPackages()
Reset computational state of jobs.
Description
Reset the state of jobs in the database. This is useful under two circumstances: either to re-submit jobs because of changes in, e.g., external data, or to resolve rare issues when jobs are killed in an unfortunate state and therefore block your registry.
The function internally lists all jobs on the batch system and, if those include some of the jobs you want to reset, informs you to kill them first by raising an exception. If you really know what you are doing, you may set force to TRUE to omit this sanity check. Note that this is a dangerous operation to perform which may harm the database integrity. In this case you HAVE to make sure externally that none of the jobs you want to reset are still running.
Usage
resetJobs(reg, ids, force = FALSE)
Arguments
reg, ids, force
Value
Vector of reset job ids.
See Also
Other debug: debugMulticore(), debugSSH(), getErrorMessages(), getJobInfo(), getLogFiles(), grepLogs(), killJobs(), setJobFunction(), showLog(), testJob()
Sanitize a path
Description
Replaces backward slashes with forward slashes and optionally normalizes the path.
Usage
sanitizePath(path, make.absolute = TRUE, normalize.absolute = FALSE)
Arguments
path, make.absolute, normalize.absolute
Value
character with sanitized paths.
Set and overwrite configuration settings
Description
Set and overwrite configuration settings
Usage
setConfig(conf = list(), ...)
Arguments
conf, ...
Value
Invisibly returns a list of configuration settings.
See Also
Other conf: configuration, getConfig(), loadConfig()
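For example, to switch the current session to a multicore backend and disable status mails (cluster.functions and the mail.* settings are standard BatchJobs configuration options):

```r
library(BatchJobs)
setConfig(cluster.functions = makeClusterFunctionsMulticore(ncpus = 2L),
          mail.start = "none", mail.done = "none", mail.error = "none")
getConfig()  # inspect the merged configuration
```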
Sets the job function for already existing jobs.
Description
Use this function only as a last resort when there is a bug in a part of your job function and you have already computed a large number of (unaffected) results. It allows you to fix the error and associate the jobs with the corrected function.
Note that by default the computational state of the affected jobs is also reset.
Usage
setJobFunction(reg, ids, fun, more.args = list(), reset = TRUE, force = FALSE)
Arguments
reg, ids, fun, more.args, reset, force
Value
Nothing.
See Also
Other debug: debugMulticore(), debugSSH(), getErrorMessages(), getJobInfo(), getLogFiles(), grepLogs(), killJobs(), resetJobs(), showLog(), testJob()
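A sketch of the intended workflow, assuming the jobs found by findErrors failed because of a bug in the original function (the corrected function here is hypothetical):

```r
ids = findErrors(reg)                 # jobs that terminated with an error
f.fixed = function(x) x^2 + 1         # hypothetical corrected job function
setJobFunction(reg, ids = ids, fun = f.fixed)  # also resets these jobs by default
submitJobs(reg, ids)                  # re-run only the affected jobs
```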
Set job names.
Description
Set job names.
Usage
setJobNames(reg, ids, jobnames)
Arguments
reg, ids, jobnames
Value
Named vector of job ids.
Set packages for a registry.
Description
Mutator function for packages in makeRegistry.
Usage
setRegistryPackages(reg, packages)
Arguments
reg, packages
Value
[Registry]. Changed registry.
See Also
Other exports: addRegistryPackages(), addRegistrySourceDirs(), addRegistrySourceFiles(), batchExport(), batchUnexport(), loadExports(), removeRegistryPackages(), removeRegistrySourceDirs(), removeRegistrySourceFiles()
Show information about available computational resources on cluster.
Description
Currently only supported for multicore and SSH mode.
Displays: name of the node, current load, number of running R processes, number of R processes with more than 50% CPU usage, and the number of BatchJobs jobs running. The latter counts either jobs belonging to reg or all BatchJobs jobs if reg was not passed.
Usage
showClusterStatus(reg)
Arguments
reg
Value
[data.frame].
Display the contents of a log file.
Description
Display the contents of a log file, useful in case of errors.
Note this rare special case: when you use chunking, submit some jobs, some jobs fail, and you then resubmit these jobs in different chunks, the log files will also contain the log of the old, failed job. showLog tries to jump to the correct part of the new log file with a supported pager.
Usage
showLog(reg, id, pager = getOption("pager"))
Arguments
reg, id, pager
Value
[character(1)]. Invisibly returns the path to the log file.
See Also
Other debug: debugMulticore(), debugSSH(), getErrorMessages(), getJobInfo(), getLogFiles(), grepLogs(), killJobs(), resetJobs(), setJobFunction(), testJob()
Retrieve or show status information about jobs.
Description
E.g.: how many jobs there are, how many are done, whether any errors occurred, etc. showStatus displays this information on the console; getStatus returns an informative result without console output.
Usage
showStatus(reg, ids, run.and.exp = TRUE, errors = 10L)
getStatus(reg, ids, run.and.exp = TRUE)
Arguments
reg, ids, run.and.exp, errors
Value
[list]. List of absolute job numbers. showStatus returns them invisibly.
Examples
reg = makeRegistry(id = "BatchJobsExample", file.dir = tempfile(), seed = 123)
f = function(x) x^2
batchMap(reg, f, 1:10)
submitJobs(reg)
waitForJobs(reg)
# should show 10 submitted jobs, which are all done.
showStatus(reg)
Source registry files
Description
Sources all files found in src.dirs and specified via src.files.
Usage
sourceRegistryFiles(reg, envir = .GlobalEnv)
Arguments
reg, envir
Value
Nothing.
Submit jobs or chunks of jobs to batch system via cluster function.
Description
If the internal submit cluster function completes successfully, the retries counter is reset to 0 and the next job or chunk is submitted. If it returns a fatal error, the submit process is stopped completely and an exception is thrown. If it returns a temporary error, the submit process waits for a certain time, determined by calling the user-defined wait function with the current retries counter; the counter is then increased by 1 and the same job is submitted again. If max.retries is reached, the function simply terminates.
Potential temporary submit warnings and errors are logged inside your file directory in the file “submit.log”. To keep track you can use tail -f [file.dir]/submit.log in another terminal.
Usage
submitJobs(
reg,
ids,
resources = list(),
wait,
max.retries = 10L,
chunks.as.arrayjobs = FALSE,
job.delay = FALSE,
progressbar = TRUE
)
Arguments
reg, ids, resources, wait, max.retries, chunks.as.arrayjobs, job.delay, progressbar
Value
[integer]. Vector of submitted job ids.
Examples
reg = makeRegistry(id = "BatchJobsExample", file.dir = tempfile(), seed = 123)
f = function(x) x^2
batchMap(reg, f, 1:10)
submitJobs(reg)
waitForJobs(reg)
# Submit the 10 jobs again, now randomized into 2 chunks:
chunked = chunk(getJobIds(reg), n.chunks = 2, shuffle = TRUE)
submitJobs(reg, chunked)
Sweep obsolete files from the file system.
Description
Removes R scripts, log files, resource information and temporarily stored configuration files from the registry's file directory. Assuming all your jobs completed successfully, none of these are needed for further work. This operation can release quite a lot of disk space, depending on the number of your jobs. BUT A HUGE WORD OF WARNING: IF you later notice something strange and need to determine the reason for it, you are at a huge disadvantage. Only do this at your own risk and when you are sure that you have successfully completed a project and only want to archive your produced experiments and results.
Usage
sweepRegistry(reg, sweep = c("scripts", "conf"))
Arguments
reg, sweep
Value
[logical]. Invisibly returns TRUE on success and FALSE if some files could not be removed.
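For example, after a project is finished and all results have been collected (the registry path is hypothetical; the sweep targets are the documented defaults):

```r
reg = loadRegistry("path/to/file.dir")  # hypothetical finished project registry
# keep results, but drop generated R scripts and stored configuration files
sweepRegistry(reg, sweep = c("scripts", "conf"))
```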
Synchronize staged queries into the registry.
Description
If the option “staged.queries” is enabled, all communication from the nodes to the master is done via files in the subdirectory “pending” of the file.dir. This function checks for such files and merges the information into the database. Usually you do not have to call this function yourself.
Usage
syncRegistry(reg)
Arguments
reg
Value
Invisibly returns TRUE on success.
Tests a job by running it with Rscript in a new process.
Description
Useful for debugging. Note that neither the registry, the database nor the file directory are changed.
Usage
testJob(reg, id, resources = list(), external = TRUE)
Arguments
reg, id, resources, external
Value
[any]. Result of job. If the job did not complete because of an error, NULL is returned.
See Also
Other debug: debugMulticore(), debugSSH(), getErrorMessages(), getJobInfo(), getLogFiles(), grepLogs(), killJobs(), resetJobs(), setJobFunction(), showLog()
Examples
reg = makeRegistry(id = "BatchJobsExample", file.dir = tempfile(), seed = 123)
f = function(x) if (x==1) stop("oops") else x
batchMap(reg, f, 1:2)
testJob(reg, 2)
ONLY FOR INTERNAL USAGE.
Description
ONLY FOR INTERNAL USAGE.
Usage
updateRegistry(reg)
Arguments
reg
Value
[any]. Updated Registry or FALSE if no updates were performed.
Wait for termination of jobs on the batch system.
Description
Waits for termination of jobs while displaying a progress bar containing summary information about the jobs. The following abbreviations are used in the progress bar: “S” for the number of jobs on the system, “D” for the number of jobs successfully terminated, “E” for the number of jobs terminated with an R exception and “R” for the number of jobs currently running on the system.
Usage
waitForJobs(
reg,
ids,
sleep = 10,
timeout = 604800,
stop.on.error = FALSE,
progressbar = TRUE
)
Arguments
reg, ids, sleep, timeout, stop.on.error, progressbar
Value
[logical(1)]. Returns TRUE if all jobs terminated successfully and FALSE if either an error occurred or the timeout was reached.
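For example, to block until a small set of jobs has finished, polling every 5 seconds and giving up after one hour (the timeout value is an arbitrary choice for illustration):

```r
reg = makeRegistry(id = "waitExample", file.dir = tempfile(), seed = 1)
batchMap(reg, function(x) x + 1, 1:3)
submitJobs(reg)
# abort early if any job throws an R exception
ok = waitForJobs(reg, sleep = 5, timeout = 3600, stop.on.error = TRUE)
```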