【Posted】: 2014-03-23 12:27:47
【Problem description】:
I am trying to calculate the percentage of certain characters in the strings of a FASTA-format file. The file looks like this:
>label
sequence
>label
sequence
>label
sequence
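I want to calculate the percentage of a specific character (for example, G) in each "sequence" string. Once that is calculated (which I have already managed to do), I want to print a sentence such as: "The percentage of G in (for example) label 1 is (for example) 53%".
So my real question is: how do I run the calculation on each sequence string and then identify each result in the output by the label that precedes it?
So far my code works out the percentage, but I cannot get it to say which sequence the result belongs to.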
#!/usr/bin/perl
use strict;
# opens file
my $infile = "Lab1_seq.fasta.txt";
open INFILE, $infile or die "$infile: $!\n";
# reads each line
while (my $line = <INFILE>){
chomp $line;
#creates an array
my @seq = split (/>/, $line);
# Calculates percent
if ($line !~ />/){
my $G = ($line =~ tr/G//);
my $C = ($line =~ tr/C//);
my $total = $G + $C;
my $length = length($line);
my $percent = ($total / $length) * 100;
#prints the percentage of G's and C's for label is x%
print "The percentage of G's and C's for @seq[1] is $percent\n";
}
else{
}
}
close INFILE
When I try to make it also print the name of the label that corresponds to each sequence, it produces the output below:
The percentage of G's and C's for is 53.4868841970569
The percentage of G's and C's for is 52.5443110348771
The percentage of G's and C's for is 50.8746355685131
【Discussion】:
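The label comes out blank because `split />/` runs on the sequence line, which contains no `>`, so `$seq[1]` is undefined; the label line itself is skipped by the `if`. One possible approach, offered as a minimal sketch rather than a definitive fix, is to remember the most recent `>` line and attach the sequence lines that follow to it. The sub name `gc_percent_by_label`, the in-memory sample data, and the choice to count lowercase `g`/`c` as well are assumptions made for illustration; the sketch also assumes a sequence may span several lines.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reads FASTA text from a filehandle and returns a
# list of [label, percent] pairs, one per record, in file order.
sub gc_percent_by_label {
    my ($fh) = @_;
    my @records;    # each element is [label, concatenated sequence]

    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/\r$//;                 # tolerate Windows line endings
        if ($line =~ /^>\s*(.*)/) {
            push @records, [$1, ''];      # header line starts a new record
        }
        elsif (@records) {
            $records[-1][1] .= $line;     # sequence line joins the current record
        }
    }

    my @results;
    for my $rec (@records) {
        my ($label, $seq) = @$rec;
        my $length = length $seq;
        next unless $length;              # skip empty records, avoids division by zero
        # tr with an empty replacement list counts matches without changing $seq.
        # Assumption: lowercase g/c are counted too.
        my $gc = ($seq =~ tr/GCgc//);
        push @results, [$label, 100 * $gc / $length];
    }
    return @results;
}

# Usage with made-up in-memory data; for a real file, open it instead:
#   open my $fh, '<', 'Lab1_seq.fasta.txt' or die "Lab1_seq.fasta.txt: $!\n";
my $fasta = ">label1\nGGCC\nAATT\n>label2\nGCGA\n";
open my $fh, '<', \$fasta or die "cannot open in-memory data: $!\n";
for my $result (gc_percent_by_label($fh)) {
    printf "The percentage of G's and C's for %s is %.2f%%\n", @$result;
}
close $fh;
```

Collecting the records first and calculating afterwards keeps the label and its sequence together even when the sequence is wrapped over several lines, which a line-by-line calculation would treat as separate sequences.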