Merging of bams

Once you have sequenced all your libraries, you need to merge them into a single file. We merge to library level before removing PCR duplicates to retain as many reads as possible.


Merging to library level

Since fragment duplicates are created at the final library step (the PCR level), we merge the “raw” bams (located in /proj/snic2020-2-10/private/Data/Human/Ancient/hg19bams/mapped) before removing PCR duplicates. This is done with the script merge_libraries.sh.

sbatch -A $proj -p core -n 2 -t 2-0:00:00 -J merge_${libraryname} merge_libraries.sh ${OUTDIR} ${libraryname} ${ref} ${file1} ${file2} ...
      # libraryname = should be something similar to hej001-b1e1l1_AAAATTT.merged
      # ref = hs37d5.fa, GRCh38_full.fa or human_b36_male_nohaps.fa
      # fileX = full path to "raw" bam file /proj/sllstore2017020/private/hg19bams/mapped/..fa.bam
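
Because a merge job can sit in the queue for a while before failing on a mistyped path, it may help to check the inputs first. The following is a minimal sketch, not part of the pipeline: `check_merge_inputs` is a hypothetical helper that takes the same arguments as merge_libraries.sh and only verifies that every listed bam file exists and is not empty.

```shell
# Sketch: check the bam arguments intended for merge_libraries.sh.
# check_merge_inputs is a hypothetical helper, not one of the pipeline scripts.
# Usage: check_merge_inputs OUTDIR libraryname ref file1 [file2 ...]
check_merge_inputs() (
    if [ "$#" -lt 4 ]; then
        echo "need OUTDIR, libraryname, ref and at least one bam file" >&2
        exit 1
    fi
    shift 3                      # drop OUTDIR, libraryname, ref; only bams remain
    status=0
    for bam in "$@"; do
        # -s is true only if the file exists and is not empty
        if [ ! -s "$bam" ]; then
            echo "missing or empty bam: $bam" >&2
            status=1
        fi
    done
    exit "$status"               # 0 if all bams passed, 1 otherwise
)
```

It could be chained in front of the submission, for example `check_merge_inputs ${OUTDIR} ${libraryname} ${ref} ${file1} ${file2} && sbatch ...`, so that nothing is submitted when a file is missing.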
      

This outputs one “raw” bam file, the final 90perc bam file, an MT consensus sequence, a damage plot, and a read length plot, all in the folder from which you submitted the script.

sequence_stats.txt: filename, human sequences, mean RL, Clonality [%], Too short [%], Genome cov, MT cov, MT reads, X reads, Y reads, gender


Final merge

After you have merged to library level, you must combine the different cons.90perc.bam files into a single file. This is done with start_last_merge.sh, which also calculates some statistics. You need to prepare a list with the path to each library's 90perc.bam file, one per line. You then specify an output name and, lastly, the reference your files were mapped against (file name only).

sbatch -A $proj -p core -n 3 -t 5-0:00:00 -J final_map_$sample start_last_merge.sh list.txt hej001_90perc_libr_190101.merge $ref
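
The list file passed as the first argument can be written by hand, but a small helper avoids typos. This is a minimal sketch under two assumptions: all library-level outputs sit in one folder, and their names end in `90perc.bam`. `make_bam_list` is a hypothetical name, not one of the pipeline scripts.

```shell
# Sketch: write the absolute path of every 90perc bam in a folder to a list file.
# make_bam_list is a hypothetical helper, not one of the pipeline scripts.
# Usage: make_bam_list DIR LISTFILE
make_bam_list() (
    # Resolve DIR to an absolute path so the list works from any folder.
    dir=$(cd "$1" && pwd) || exit 1
    # Assumption: the library-level outputs end in "90perc.bam".
    find "$dir" -maxdepth 1 -type f -name '*90perc.bam' | sort > "$2"
    # Fail if nothing was found, so an empty list is never submitted.
    if [ ! -s "$2" ]; then
        echo "no 90perc.bam files found in $dir" >&2
        exit 1
    fi
)
```

Running `make_bam_list . list.txt` in the folder with the library-level outputs would produce a list.txt with one path per line, as start_last_merge.sh expects.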
    

This outputs a final bam file, a damage plot, a read length plot, and some statistics in a file called libr_cov.txt, all located in the folder where you started the script.

libr_cov.txt: filename, number of human sequences, avg. read length, genome coverage, MT coverage, MT reads, X reads, Y reads, gender