站长网 大数据 基因数据处理19之BWA匹配算法串产生、匹配、评价等整体流程

基因数据处理19之BWA匹配算法串产生、匹配、评价等整体流程

环境: ubunut BWA samtools wgsim bwa.kit工具 1.数据下载:需要在bwa.kit下 bwa.kit/run-gen-ref hs38DH 2.串产生: hadoop@Mcnode1:~/cloud/adam/xubo/data/hs38DH$ wgsim -N 1000 -1 10 hs38DH.fa span style=”font-family: Arial,Helvetica,sans-seri

环境:

ubunut

BWA

samtools

wgsim

bwa.kit工具

1.数据下载:需要在bwa.kit下

bwa.kit/run-gen-ref hs38DH  

2.串产生:

hadoop@Mcnode1:~/cloud/adam/xubo/data/hs38DH$ wgsim -N 1000 -1 10 hs38DH.fa <span style="font-family: Arial,Helvetica,sans-serif; font-size: 12px;">hs38DHSE1N10000L10F1.fq  </span> /dev/null 

将文件移到

<pre name="code" class="plain">~/cloud/adam/xubo/data/hs38DH/test20160415


3.匹配+评价

hadoop@Mcnode1:~/cloud/adam/xubo/data/hs38DH/test20160415$ bwa aln ../hs38DH.fa hs38DHSE1N10000L10F1.fq >hs38DHSE1N10000L10F1.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 0.10 sec
[bwa_aln_core] write to the disk... 0.00 sec
[bwa_aln_core] 9935 sequences have been processed.
[main] Version: 0.7.12-r1039
[main] CMD: bwa aln ../hs38DH.fa hs38DHSE1N10000L10F1.fq
[main] Real time: 0.195 sec; CPU: 0.119 sec
hadoop@Mcnode1:~/cloud/adam/xubo/data/hs38DH/test20160415$ bwa samse ../hs38DH.fa hs38DHSE1N10000L10F1.sai hs38DHSE1N10000L10F1.fq >hs38DHSE1N10000L10F1.sam
[bwa_aln_core] convert to sequence coordinate... 0.04 sec
[bwa_aln_core] refine gapped alignments... 0.00 sec
[bwa_aln_core] print alignments... 0.02 sec
[bwa_aln_core] 9935 sequences have been processed.
[main] Version: 0.7.12-r1039
[main] CMD: bwa samse ../hs38DH.fa hs38DHSE1N10000L10F1.sai hs38DHSE1N10000L10F1.fq
[main] Real time: 0.154 sec; CPU: 0.086 sec
hadoop@Mcnode1:~/cloud/adam/xubo/data/hs38DH/test20160415$ samtools flagstat hs38DHSE1N10000L10F1.sam
9935 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
9907 + 0 mapped (99.72% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
hadoop@Mcnode1:~/cloud/adam/xubo/data/hs38DH/test20160415$ bwa aln ../hs38DH.fa hs38DHSE1N10000L
hs38DHSE1N10000L10F1.fq   hs38DHSE1N10000L10F1.sai  hs38DHSE1N10000L10F1.sam  hs38DHSE1N10000L70F1.fq   
hadoop@Mcnode1:~/cloud/adam/xubo/data/hs38DH/test20160415$ bwa aln ../hs38DH.fa hs38DHSE1N10000L70F1.fq >hs38DHSE1
hs38DHSE1L10000F1.fq      hs38DHSE1N10000L10F1.fq   hs38DHSE1N10000L10F1.sai  hs38DHSE1N10000L10F1.sam  hs38DHSE1N10000L70F1.fq   
hadoop@Mcnode1:~/cloud/adam/xubo/data/hs38DH/test20160415$ bwa aln ../hs38DH.fa hs38DHSE1N10000L70F1.fq >hs38DHSE1N10000L70F1.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 0.72 sec
[bwa_aln_core] write to the disk... 0.00 sec
[bwa_aln_core] 9935 sequences have been processed.
[main] Version: 0.7.12-r1039
[main] CMD: bwa aln ../hs38DH.fa hs38DHSE1N10000L70F1.fq
[main] Real time: 0.779 sec; CPU: 0.734 sec
hadoop@Mcnode1:~/cloud/adam/xubo/data/hs38DH/test20160415$ bwa samse ../hs38DH.fa hs38DHSE1N10000L70F1.sai hs38DHSE1N10000L70F1.fq >hs38DHSE1N10000L70F1.sam
[bwa_aln_core] convert to sequence coordinate... 0.04 sec
[bwa_aln_core] refine gapped alignments... 0.01 sec
[bwa_aln_core] print alignments... 0.03 sec
[bwa_aln_core] 9935 sequences have been processed.
[main] Version: 0.7.12-r1039
[main] CMD: bwa samse ../hs38DH.fa hs38DHSE1N10000L70F1.sai hs38DHSE1N10000L70F1.fq
[main] Real time: 0.207 sec; CPU: 0.109 sec
hadoop@Mcnode1:~/cloud/adam/xubo/data/hs38DH/test20160415$ samtools flagstat hs38DHSE1N10000L70F1.sam 
9935 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
9567 + 0 mapped (96.30% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

最后得出匹配的准确率在96%左右

本文来自网络,不代表站长网立场,转载请注明出处:https://www.tzzz.com.cn/html/shuju/2021/0528/7181.html

作者: dawei

【声明】:站长网内容转载自互联网,其相关言论仅代表作者个人观点绝非权威,不代表本站立场。如您发现内容存在版权问题,请提交相关链接至邮箱:bqsm@foxmail.com,我们将及时予以处理。
联系我们

联系我们

0577-28828765

在线咨询: QQ交谈

邮箱: xwei067@foxmail.com

工作时间:周一至周五,9:00-17:30,节假日休息

返回顶部