Before You Begin
Of course you have already read the Caveats. You can often use SyQADA for existing workflows without running into problems that demand much Unix knowledge, but please do not underestimate your need to learn Unix if you want to run bioinformatics workflows using SyQADA or any other system, especially workflows you construct yourself. Elsewhere in this manual is a Useful Shell Stuff Primer that may help you learn Unix. More help is as close as googling “learn unix the hard way.”
It is my goal that SyQADA should give a reasonably informative message any time it can detect a user error. Once SyQADA submits jobs to the system to run, it does little to help you understand any computation errors that result. However, it does put those errors in a standard place so that you can find them easily.
A workflow (see Pipeline definition) is a series of tasks requiring some degree of human intervention (at least starting it and examining its results) during its execution. SyQADA has a specific standard format for any and every task, and a specific way of defining a sequence of tasks and then creating named directories in which to run each task. It makes sense to discuss the structure of a task first, and then the structure of a workflow.
A quick word about word usage in this document
At various times in the history of SyQADA development the words batch and step were each used where the word task is used today. Because they are nearly interchangeable in use, little has been done to standardize older portions of this guide on the modern term, so you may expect to see any one of them at various points.
Another pair of nearly interchangeable words is protocol and workflow. Strictly speaking, a protocol specifies what will be done in a workflow, but by metonymy I often find myself using one word in place of the other.