You must always ensure that the virtualenv that you installed into during the Installation is activated prior to running any portion of the pipeline.
You only have to do this if you have closed your terminal since the last time your activated.
You need to specify the full path to the activate script such as
$> . /path/to/puticr/puticr/bin/activate
You can read more about what virtualenv is here
puticr_cli -h
If your fastq file has a .fq.gz extension, make sure to rename to tissueName.*.fastq.gz extension and it should be gzipped. The name of the fastq file does matter. For example, Brain.21423B_S1_R1_001.fastq.gz, Liver.21423L_S3_R1_001.fastq.gz, oocyte.5423B_S1_R1_001.fastq.gz, Brain.21424B_S5_R1_001.fastq.gz
You must use TissueName.*.fastq.gz``
In this case, the goal is to find a region where the genome is methylated 50%. We will use the lower methylation cutoff of 0.3 and the upper cutoff 0.7. Don’t forget to store the tissueName.*.fastq.gz file under any directory name and pass to -f option. I suggest to store the raw data into separate folders. One for somatic tissue fastq.gz files and the other for gamet cells (oocyte and sperm). The pipleline only support single end reads.
puticr_cli -o outputDir1 -f testData/raw -l 6 -c 12 -L 0.3 -U 0.7 --indexed=no --refGenome=hs_chr6.fa --refGenomeDir=testData/genome2/
The goal is to find a region where the genome is methylated 100%
puticr_cli -o outputDir2 -f testData/raw -l 6 -c 12 -L 0.8 -U 1.0 --indexed=no --refGenome=hs_chr6.fa --refGenomeDir=testData/genome2/
Assuming you downlaoded and saved hs38.fa under mygenome/genome3/, you will turn on indexed=yes. Indexing a genome is a slower process, be petient.
puticr_cli -o outputDir1 -f testData/raw -l 6 -c 12 -L 0.3 -U 0.7 --indexed=yes --refGenome=hs38.fa --refGenomeDir=mygenome/genome3/
Then, the next time you run a different data or the same data. You don’t need to re-index. This is a one time process unless you used different version of the genome.
Don’t forget to change –indexed=no.
puticr_cli -o outputDir2 -f testData/raw -l 6 -c 12 -L 0.8 -U 1.0 --indexed=no --refGenome=hs38.fa --refGenomeDir=mygenome/genome3/
If it fails then an error is reported that generally suggest where it failed by checking the key files created at each stage. Most likely, the error occurs on the suggested stage or the stage before it. You will likely have to check the log files to get an idea what went wrong and go from there.