Automator.py

Main SyQADA Driver. This initializes and runs a series of BatchRunners.

It is invoked as:

>>> syqada auto

The Automator takes a configuration, a sample file, and a protocol, and initializes and then runs the steps of a workflow.

A pipeline is an instance of a protocol. A pipeline is configured as a directory that contains a control directory. The control directory contains a config file, a samples file, and a protocol file (or soft link to one), plus any individual task configurations.

The workflows directory of the SyQADA release contains protocol definitions that list the tasks of the protocol and where to find them. It also contains a protocol task definition with defaults for each task.
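The way a task's effective parameters could be derived from the release defaults plus a pipeline's own task configuration can be sketched as below. This is a minimal illustration, not SyQADA's actual code: the function name `resolve_parameters` and the example values are hypothetical, and the sketch assumes a simple "override wins" rule.

```python
def resolve_parameters(defaults, overrides):
    """Return task parameters, with overrides taking precedence over defaults."""
    resolved = dict(defaults)   # copy so the shared defaults are not mutated
    resolved.update(overrides)  # per-pipeline settings win
    return resolved


# Hypothetical values, for illustration only.
task_defaults = {"WALLTIME": "04:00:00", "PROCESSORS": 1}
pipeline_overrides = {"PROCESSORS": 8}

params = resolve_parameters(task_defaults, pipeline_overrides)
print(params)
```

Copying the defaults before updating keeps one task's overrides from leaking into the next task's defaults.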

syqada auto:

reads the protocol file and builds a protocol
for each task:
  constructs a working directory and METADATA file
then
for each task:
  determines whether it has been done
  if not,
     determines parameters from defaults and overrides
     constructs the appropriate BatchRunner
     fires it off
     waits for completion
  verifies results
  moves on to the next task and repeats
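
The second loop above can be sketched in Python as follows. This is only an outline under stated assumptions: `run_workflow`, its three callables, and the task names are hypothetical stand-ins, not the Automator's real interface.

```python
def run_workflow(tasks, is_done, run_task, verify):
    """Run tasks in order; stop at the first one whose results fail to verify.

    All four arguments are hypothetical stand-ins: `tasks` is an ordered
    list of task names, and the three callables each take a task name.
    Returns the list of tasks that were actually executed in this call.
    """
    executed = []
    for task in tasks:
        if not is_done(task):
            run_task(task)      # build the runner, fire it off, wait
            executed.append(task)
        if not verify(task):    # checked whether or not the task just ran
            raise RuntimeError(f"task {task} failed verification")
    return executed


# Toy usage: 0001 is already complete, so only 0002 and 0003 are run.
finished = {"0001-align"}
ran = run_workflow(
    ["0001-align", "0002-sort", "0003-call"],
    is_done=lambda t: t in finished,
    run_task=finished.add,
    verify=lambda t: t in finished,
)
print(ran)
```

Because completed tasks are skipped rather than rerun, invoking the loop again after a failure resumes at the first task that is not yet done.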

The full command-line synopsis is:

>>> Automator.py [--configuration configfile
                  --sample_file samplefile
                  --protocol controlfile
                  ]
                  [--project PROJECTNAME]
                  [--parameters parameters]
                  [--init]
                  [--ignore NNNN ...]

--protocol

file that names the tasks to be run

--configuration

file that provides definitions of terms used in templates, usually executable paths

--sample_file

file that names the samples on which tasks are to be run

--parameters

parameter file for various tasks. The intent is a standard parameters file that fills in certain parameters in the configs. Not yet used.

--ignore

optionally ignore an error in a step and proceed to the next step by listing its numeric prefix, e.g.:

>>> syqada auto --ignore 0004 0007

I have found this useful in cases where the source data had typos in embedded sample names, so that one sample out of 300 failed a step.
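
A rough sketch of how such an option could be parsed and matched against step names is shown here. It is not the Automator's actual parser: `build_parser`, `is_ignored`, and the `NNNN-name` step naming are assumptions made for illustration; only the `--ignore NNNN ...` form comes from the synopsis.

```python
import argparse


def build_parser():
    """Parser with only the --ignore option, as a stand-alone illustration."""
    parser = argparse.ArgumentParser(prog="syqada auto")
    parser.add_argument("--ignore", nargs="*", default=[], metavar="NNNN",
                        help="numeric prefixes of steps whose errors are ignored")
    return parser


def is_ignored(step_name, ignored_prefixes):
    """True if the step's numeric prefix (text before the first '-') is listed."""
    prefix = step_name.split("-", 1)[0]
    return prefix in ignored_prefixes


args = build_parser().parse_args(["--ignore", "0004", "0007"])
print(args.ignore)
print(is_ignored("0004-hypothetical-step", args.ignore))
print(is_ignored("0005-hypothetical-step", args.ignore))
```

Comparing the whole prefix as a string, rather than as an integer, keeps `0004` distinct from a step numbered `004` or `4`.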

Developer Documentation Only Below This Point:

Architecturally, a Protocol is a series of Tasks (both of these are defined in Protocol.py). All tasks have TASKDEF, NAME, TEMPLATE, WALLTIME, and PROCESSORS attributes, plus other task-specific attributes. The Automator constructs a Protocol from a protocol file, and then invokes a BatchRunner for each task of the Protocol until either a task fails or the entire Protocol completes successfully.
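
As a rough mental model of this architecture (not the real classes in Protocol.py), the data structure could look like the sketch below. The field types, the `extras` dictionary, the `run` method, and all example values are assumptions; only the five attribute names come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """The five attributes every task is documented to have, plus extras."""
    TASKDEF: str
    NAME: str
    TEMPLATE: str
    WALLTIME: str
    PROCESSORS: int
    extras: dict = field(default_factory=dict)  # task-specific attributes


@dataclass
class Protocol:
    """An ordered series of tasks."""
    tasks: list = field(default_factory=list)

    def run(self, run_one):
        """Call run_one(task) for each task in order; stop at the first failure.

        Returns the NAME of the failing task, or None if all succeeded.
        """
        for task in self.tasks:
            if not run_one(task):
                return task.NAME
        return None


# Hypothetical two-task protocol in which the second task fails.
protocol = Protocol([
    Task("a.task", "align", "a.template", "01:00:00", 4),
    Task("b.task", "sort", "b.template", "00:30:00", 1, {"memory": "8G"}),
])
failed = protocol.run(lambda task: task.NAME != "sort")
print(failed)
```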

class Automator.Automator(args, taskformat=None)

Automate a series of BatchRunners.

Accept a configuration file, sample file, a protocol, and numerous other input parameters, and sequentially build or reload a BatchRunner for each task in the protocol, delegating to the BatchRunner the building and running of the JobBatch for the task, checking its results, and continuing as long as there are no errors.

initialize(stderr=sys.stderr)

1.1 Delegate most of the work to the Protocol

process_batchroot(batchroot, stderr, stdin=sys.stdin, stdout=sys.stdout, replication=None, run_parallel=False)

Determine whether the batch is runnable. It may be complete or in error, and its state can change between invocations because of externally run commands such as syqada manage batchroot --fix. If it is runnable, run it, and then determine whether it completed successfully so that the workflow can continue.

batchroot: the task directory in question

replication: the individual replicate

run_parallel: whether this replication (must be a replication) has the parallel flag set (added to fix bug409: running a single task with parallel=True).

run(stderr=sys.stderr, stdin=sys.stdin)

Execute the tasks in order, stopping if one fails.

Automator.command_needs_no_config(args)

Isolated into its own function to give a logical explanation to a check that I think will end up cropping up in unexpected places.

Automator.parse(input=None, stream=sys.stderr, stdin=sys.stdin)

Create and return all of the information necessary to get the JobBatch running. Complain about semantic validation issues.