Running the allele-specific expression (ASE) pipeline
This workflow performs all of the steps required to identify genes exhibiting allelic imbalance.
It retrieves read data from the SRA, or uses simulated datasets produced by the simulate reads pipeline.
Note
While the software can also process arbitrary FASTQ files (as long as they are paired), in that case it will not be able to construct a report.
All log files will be found in the logs directory at the end of the run.
Edit the file targets.txt
Manually edit the file targets.txt, placing one accession number per line, e.g.
ERR2352620
ERR2352630
ERR2352640
ERR2352650
ERR2352660
ERR2352670
Alternatively, if you have run the simulate reads pipeline, you can specify simulated data as follows:
simulation_1_x1
simulation_1_x2
Make sure the names correspond to the prefix of paired fastq files in the reads directory, e.g.
reads/simulation_1_x1.1.fastq.gz
reads/simulation_1_x1.2.fastq.gz
reads/simulation_1_x2.1.fastq.gz
reads/simulation_1_x2.2.fastq.gz
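The naming rule above can be checked before launching the workflow. The following is a minimal sketch, not part of ALISEQ: `check_targets` is a hypothetical helper, and it assumes only the `<name>.1.fastq.gz` / `<name>.2.fastq.gz` layout shown in the example. It is intended for simulated or locally supplied reads; accession entries whose reads have not yet been retrieved from the SRA would also be reported as missing.

```shell
# Sketch: report entries in a targets file that lack paired FASTQ files.
# check_targets is a hypothetical helper, not part of ALISEQ.
# Usage: check_targets targets.txt reads
check_targets() {
    targets="${1:-targets.txt}"     # one sample name per line
    reads_dir="${2:-reads}"         # directory holding the FASTQ files
    missing=0
    while IFS= read -r name || [ -n "$name" ]; do
        [ -z "$name" ] && continue  # skip blank lines
        for mate in 1 2; do
            f="$reads_dir/$name.$mate.fastq.gz"
            if [ ! -f "$f" ]; then
                echo "missing: $f"
                missing=1
            fi
        done
    done < "$targets"
    # Exit status 0 if every entry has both mates, 1 otherwise.
    return "$missing"
}
```

The function prints one line per missing file and returns a non-zero status if any file is absent, so it can be chained in front of the launch command.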
Modifying workflow parameters
If desired, manually edit the file aliseq_params.json.
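Because the parameter file is edited by hand, a stray comma or missing quote can leave it unparseable. The sketch below checks only that the file is still syntactically valid JSON; it does not know which parameters ALISEQ expects. `validate_params` is a hypothetical helper, and it assumes `python3` is available on the PATH (it uses the standard-library `json.tool` module).

```shell
# Sketch: confirm a hand-edited parameter file is still valid JSON.
# validate_params is a hypothetical helper, not part of ALISEQ; requires python3.
# Usage: validate_params aliseq_params.json
validate_params() {
    params="${1:-aliseq_params.json}"
    # json.tool exits with a non-zero status if the file cannot be parsed.
    if python3 -m json.tool "$params" > /dev/null 2>&1; then
        echo "OK: $params is valid JSON"
    else
        echo "ERROR: $params is not valid JSON" >&2
        return 1
    fi
}
```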
Running the workflow
Launch the workflow with:
bash ../ALISEQ/aliseq.sh
Note
If you encounter the error message Error: Directory cannot be locked... (e.g. after a power failure or loss of network connection), run the following command before retrying:
snakemake --unlock -s ../ALISEQ/scripts/workflow.snk