Running the allele-specific expression (ASE) pipeline

This workflow will perform all of the steps required to identify genes exhibiting allelic imbalance.

It will retrieve read data from the SRA, or use simulated datasets produced by the simulate reads pipeline.

Note

While the software can also process arbitrary fastq files (as long as they are paired), it will not be able to construct a report in that case.

All log files will be found in the logs directory at the end of the run.
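
As a quick way to inspect the logs after a run, the sketch below lists the log files that mention the word "error". It assumes only that the logs are plain text files under the logs directory; the function name find_error_logs is illustrative and is not part of the pipeline.

```shell
#!/usr/bin/env bash
# Sketch: list files under a log directory that mention "error"
# (case-insensitive). find_error_logs is a hypothetical helper name.
find_error_logs() {
    # -r: recurse, -i: ignore case, -l: print matching file names only
    grep -ril -- "error" "${1:-logs}" 2>/dev/null | sort
}

# Prints nothing if the logs directory is absent or no file matches.
find_error_logs logs
```

A match does not necessarily indicate a failed step, so treat the list as a starting point for reading the individual files.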

Edit the file targets.txt

Manually edit the file targets.txt, placing one accession number per line, e.g.

ERR2352620
ERR2352630
ERR2352640
ERR2352650
ERR2352660
ERR2352670

Alternatively, if you have run the simulate reads pipeline, you can specify simulated data as follows:

simulation_1_x1
simulation_1_x2

Make sure the names correspond to the prefix of paired fastq files in the reads directory, e.g.

reads/simulation_1_x1.1.fastq.gz
reads/simulation_1_x1.2.fastq.gz
reads/simulation_1_x2.1.fastq.gz
reads/simulation_1_x2.2.fastq.gz
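
When using simulated data, the naming rule above can be checked before launching the workflow. The following is a minimal sketch, assuming the layout shown above (reads/NAME.1.fastq.gz and reads/NAME.2.fastq.gz); the TARGETS and READS_DIR variables and the check_targets function are illustrative names, not pipeline settings. It is not meaningful for SRA accessions whose reads have not yet been retrieved.

```shell
#!/usr/bin/env bash
# Sketch: report targets that lack a complete pair of fastq files.
# TARGETS, READS_DIR and check_targets are hypothetical names used
# for illustration only.
TARGETS="${TARGETS:-targets.txt}"
READS_DIR="${READS_DIR:-reads}"

check_targets() {
    local missing=0 name mate
    # Read one target name per line; also handle a last line without newline.
    while IFS= read -r name || [ -n "$name" ]; do
        [ -z "$name" ] && continue   # skip blank lines
        for mate in 1 2; do
            if [ ! -f "$READS_DIR/$name.$mate.fastq.gz" ]; then
                echo "missing: $READS_DIR/$name.$mate.fastq.gz" >&2
                missing=1
            fi
        done
    done < "$1"
    return "$missing"
}

if [ -f "$TARGETS" ]; then
    check_targets "$TARGETS" && echo "all targets have paired fastq files"
fi
```

The function returns a non-zero status if any file is missing, so it can also be used as a guard in front of the launch command.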

Modifying workflow parameters

If desired, manually edit the file aliseq_params.json.
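
Because a stray comma or quote in a hand-edited JSON file is easy to miss, it may help to confirm that the file still parses before launching. The sketch below assumes python3 is available on the PATH and uses Python's standard json.tool module; the function name validate_json is illustrative. It checks JSON syntax only and knows nothing about which parameters the workflow expects.

```shell
#!/usr/bin/env bash
# Sketch: check that a file is syntactically valid JSON.
# validate_json is a hypothetical helper; python3 is assumed to be installed.
validate_json() {
    # json.tool exits with a non-zero status and prints the parse error
    # to stderr if the file is not valid JSON.
    python3 -m json.tool "$1" > /dev/null
}

if [ -f aliseq_params.json ]; then
    validate_json aliseq_params.json && echo "aliseq_params.json is valid JSON"
fi
```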

Running the workflow

Launch the workflow with:

bash ../ALISEQ/aliseq.sh

Note

If you encounter the following error message (e.g. after a power failure or loss of network connection): Error: Directory cannot be locked..., run the following command before retrying:

snakemake --unlock -s ../ALISEQ/scripts/workflow.snk