How can I create an HTML document from several files in Perl? Answers

【Question title】: How can I create an HTML document from several files in Perl?
【Posted】: 2010-12-11 06:47:12
【Question】:

I need some help with this script... so far I think it should look something like this (I use AWK a lot):

#!/usr/bin/perl -w
@filelist=(
file1,
file2,
file3
)
print ("<html>");
foreach $filelist {
  print ("<table>";
  print ("<td>"$filelist"</td>")
  foreach [line in the file]
    print ("<td>"$1, $2"</td>");
  }
  print ("</table>";
}
print ("</html>");

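So I want the script to go to each file, print the file name, and then, for each line, print two strings in a <td>.

Am I on the right track?

Also, I realized that the AWK I wrote needs a few IF statements. I believe it should look something like this: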

while(<F_IN>) {
if ($1>=2) {
    print "<td class=\"green\" title=\""$1,$2"\">"
}
if ($1>=1) {
    print "<td class=\"amber\" title=\""$1,$2"\">"
}
if ($1>=0) {
    print "<td class=\"red\" title=\""$1,$2"\">"
}

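【Question comments】:

CSS class names like "green" are a bad idea. Use class names that relate to the concept the class conveys. For example, if you are writing a script that displays system status information, green should be "status_ok", amber should be "status_warning", and red should be "status_alert". That way, when you get flak for building a system that color-blind people cannot use, you can switch the colors to white, light blue, and dark blue without renaming your classes in all your HTML, or worse, changing only the style sheet so that class red = background: dark-blue; color: white.
:D Thanks - this will only be used by a few people, but I see your point :)
@Soop, the effort of using good names from the start is small compared to the cost of getting it wrong. Pity the poor sod who has to maintain your code in 2 years. It could be you.
@Sinan, chill out. Different people learn in different ways. Granted, "you can't just make [stuff] up and expect the computer to know what you mean". But stop and remember how overwhelming the sheer volume of documentation is for a beginner. Not everyone is lucky enough to get hold of a copy when they start learning Perl.
@daotoad I am calm. The point is that I could post something that is clearly not Perl and pretend it is a Perl question. Oh, and by the way, luck has nothing to do with it. stackoverflow.com/questions/336715/…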
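
The naming advice in the first comment, combined with the thresholds from the question, could be sketched as follows. This is only an illustration: `status_class` is a hypothetical helper, and the class names are the ones suggested in that comment. The checks run from the highest threshold down and return early, so exactly one class is chosen per value (the three separate `if` blocks in the question would all fire for a value of 2 or more).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: map a numeric status value to a semantic CSS class.
# Thresholds follow the question: >= 2, >= 1, anything lower.
sub status_class {
    my ($value) = @_;
    return 'status_ok'      if $value >= 2;
    return 'status_warning' if $value >= 1;
    return 'status_alert';
}

# Print one table cell per sample value.
for my $value ( 0, 1, 2 ) {
    printf qq{<td class="%s">%s</td>\n}, status_class($value), $value;
}
```

Because the colors live only in the style sheet, changing the palette later does not require touching the script.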

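Tags: html perl

【Solution 1】:

Your code doesn't really make any sense. I suggest taking a look at a basic Perl intro before trying anything more complicated.

That said, what you probably want to do is something like this: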

#!/usr/bin/perl

use strict;      # never forget this line
use warnings;    # or this one, it's better than the -w flag

use CGI qw(:standard);   # this gives us convenient functions for outputting
                         # HTML, plus other stuff for processing CGI requests

my @filelist = ( 'file1', 'file2', 'file3' );

print header;
print start_html;
print start_table;

foreach my $file( @filelist ) { 
    print Tr( td( $file ) );
}

print end_table;
print end_html;

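For the (extensive) documentation on the CGI module, see CGI on CPAN.

Edit

If you want the contents of each file in the table, you need to read each file into a variable inside the loop. Here is an example of how you could do that: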

foreach my $file( @filelist ) { 
    open my $fh, '<', $file or die "Could not open $file: $!";

    local $/ = undef;   # this allows us to slurp the whole file at once
    my $filedata = <$fh>;

    print Tr( td( $filedata ) );
}

You should read some more documentation: the Perl open() tutorial, and everything about slurping.
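
As a minimal sketch of the slurping idiom used above, the read can be wrapped in a small sub so that the change to `$/` is undone as soon as the sub returns. The name `slurp` and the temporary demo file are assumptions made for illustration; they are not part of the original answer.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the whole contents of a file as one string.
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "Could not open $path: $!";
    local $/;            # undef record separator; restored when the sub returns
    my $data = <$fh>;
    close $fh;
    return $data;
}

# Demo with a throwaway file so the sketch runs on its own.
my ( $tmp_fh, $tmp_name ) = tempfile( UNLINK => 1 );
print {$tmp_fh} "first line\nsecond line\n";
close $tmp_fh;

my $contents = slurp($tmp_name);
print length($contents), " characters read\n";
```

Keeping `local $/` inside the sub means code elsewhere in the script still reads line by line.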

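【Comments】:

Wow, you really know Perl! Thank you. Four questions: 1 - Do I need quotes around the file names? 2 - If I want to open the files to parse them line by line, do I need to do the whole open(HANDLE, "$filename"); thing, and how do I refer to one element of the array? 3 - I've edited my original request, and I used the $1, $2 that I know from AWK - will those refer to the strings on each line, in order? 4 - Can I (as with AWK) then send the output to a file with something like "perl myfile > page.html"?
+1 @Friedo: I think he wants the contents of those files in the tds. @Soop You really have to try to learn a bit of the language before doing complicated things.
print goes to STDOUT by default, so you can do any normal redirection. You can also open a handle in your script and print to it: print $out "stuff to print". Sadly, you can also use select $out; print 'stuff';, but don't do that. That approach kills kittens. select can be used to set the default handle for print and for several special variables. Never use it that way. If you see it in the documentation, pretend you didn't. So either open a handle and print to it with print FILEHANDLE LIST, or redirect your output.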
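
Questions 3 and 4 above can be illustrated with a short sketch. In Perl, `$1` and `$2` hold regex captures rather than fields, so the usual equivalent of AWK's fields is `split ' '`, which splits on runs of whitespace the way AWK does by default. The sample data, the variable names, and the in-memory filehandles are assumptions chosen so the sketch runs without any files on disk; a real script would open the data files and, if needed, an output file such as page.html.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample input standing in for one data file (assumed format:
# whitespace-separated fields, as AWK would see them).
my $input = "2 server-a\n1 server-b\n0 server-c\n";

# Read from an in-memory filehandle.
open my $in, '<', \$input or die "Could not open in-memory input: $!";

# Collect the output in a string through a lexical output handle.
my $html = '';
open my $out, '>', \$html or die "Could not open in-memory output: $!";

while ( my $line = <$in> ) {
    chomp $line;
    # $f1 and $f2 play the role of AWK's $1 and $2.
    my ( $f1, $f2 ) = split ' ', $line;
    print {$out} "<tr><td>$f1</td><td>$f2</td></tr>\n";
}

close $out;
close $in;

print $html;    # plain print goes to STDOUT and can be redirected
```

Running it as `perl script.pl > page.html` redirects the final `print` exactly as it would for an AWK script.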

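【Solution 2】:

Use HTML::Template so that you keep full control over the HTML and your code stays readable. For an example of how to embed the template in your code, see my answer to another question. Of course, the same template can live outside the script, which is the benefit of separating presentation from logic.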

#!/usr/bin/perl

use strict; use warnings;
use autodie;

use HTML::Template;

my @files = qw(file1 file2);
my @files_loop;

for my $file (@files) {
    open my $fh, '<', $file;

    push @files_loop, {
        LINES => [
            map { chomp; length $_ ? {LINE => $_} : () } <$fh>
    ]};
}

my $tmpl = HTML::Template->new(filehandle => \*DATA);
$tmpl->param(FILES => \@files_loop );
print $tmpl->output;

__DATA__
<html>
<body>
<TMPL_LOOP FILES>
<table>
<TMPL_LOOP LINES><tr><td><TMPL_VAR LINE></td></tr></TMPL_LOOP>
</table>
</TMPL_LOOP>
</html>

Output:

C:\Temp> fish.pl

<html>
<body>

<table>
<tr><td>bye bye</td></tr><tr><td>hello</td></tr><tr><td>thank you</td></tr><tr><
td>no translation</td></tr>
</table>

<table>
<tr><td>chao</td></tr><tr><td>hola</td></tr><tr><td>gracias</td></tr>
</table>

</html>

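【Comments】:

【Solution 3】:

Run your code and read the error messages. You have multiple syntax errors.

Whatever tutorial you are reading is leaving out some important advice: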

#!/usr/bin/perl 

use strict;    # Always, until you know when to turn it off--and then in a limited scope.
use warnings;  # see perllexwarn for why this is better than -w

# you need to quote those bare words.  qw is a handy quoting operator.  See perlop.
my @filelist = qw(  
    file1 
    file2
    file3
);             # added semicolon 

# use a heredoc for big chunks of text.
# your html is seriously bogus.  Fixed somewhat.
print <<'ENDHTML';
<html>
   <head>
   </head>
   <body>
ENDHTML

# $filelist does not exist.
# fixed syntax of foreach
foreach my $file ( @filelist ) {

  # do you really want a new table for each file?
  print "<table>\n";

  # need to escape your quotes
  print "<tr><td>\"$file\"</td></tr>";

  # open a file handle.
  # use lexical handles instead of old style global handles.
  # use 3 argument open instead of 2 argument style
  open( my $fh, '<', $file);

  # $fh evaluates to true if we were able to open the file.
  if ( $fh ) {

    # iterate over the file.
    while( my $line = <$fh> ) {
      # where the hell did $1 and $2 come from?
      print "<tr><td>$line</td></tr>";
    }

  }
  else {
      # put the error message in the table instead of the file contents
      print "<tr><td>Error opening file: $!</td></tr>";          
  }

  print "</table>";
}
print "</body>\n</html>\n";

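Since you didn't test your code, I didn't either. But it should work.

Of course, inline HTML is best avoided for anything but the simplest throwaway scripts. If you are doing anything lasting or serious, use a templating system such as Template::Toolkit or HTML::Template.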

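【Comments】:

Ah, thanks. I wasn't using a tutorial, just some general knowledge I remembered from a long time ago. As I said, I'm used to AWK, where $1, $2 and so on refer to the strings.
@Soop Programming is not about general knowledge. Specifications and syntax come first. You cannot predict how a computer program works by intuition. It is much simpler than that. You get a book, write some toy programs, and learn at least a bit of the language before accepting a job.
When you are starting out, the most valuable part of perldoc is the Perl Functions by Category section of perlfunc. perldoc.perl.org/perlfunc.html#Perl-Functions-by-Category You can also get it through perldoc perlfunc and man perlfunc. You can look up functions on the CLI with perldoc -f funcname, and search the FAQ with perldoc -q someregexp. For more perldoc features, see man perldoc.
@Soop, I think Sinan is trying to tell you to read some basic tutorials and/or TFM rather than asking for help right away. Check out Picking Up Perl for a reasonable, if somewhat dated, start: ebb.org/PickingUpPerl/pickingUpPerl_toc.html
@Soop, when you start learning Perl, knowing where to find good tutorials is a big problem. In particular, the web still holds a lot of very old material dating back to the mid-90s, much of it of questionable quality. Ovid's CGI course is old but well written and suits your task; see jdporter.perlmonk.org/cgi_course If you learn best by example, check out PLEAC - pleac.sourceforge.net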

【Solution 4】:

There are some good answers above. Here is another twist on the theme, based on Soop's answer, this time using Markapl.

use strict; 
use warnings;
use CGI ();
use Markapl;

my @filelist = qw/file1.txt file2.txt file3/;

template 'web' => sub {
  html {
    head {
      html_link (rel => 'stylesheet', type => 'text/css', href => 'test.css') {}
    }
    body {
      table {
        for my $file (@filelist) {
          open my $fh, '<', $file or next;
          row { 
            for my $title (<$fh>) { 
                chomp $title; 
                cell { $file }
                cell ( class => title_class( $title ), title => $title ) {}
} } } } } } };

print CGI::header();
print main->render( 'web' );

sub title_class {
    my $first_char = substr $_[0], 0, 1;
    return 'green' if $first_char >= 2;
    return 'amber' if $first_char == 1;
    return 'red';
}

I do like these "builder" style approaches. For more, see this SO question: CL-WHO-like HTML templating for other languages?

/I3az/

【Comments】:

【Solution 5】:

Here is what I came up with after talking with you all - many thanks, everyone :D

#!/usr/bin/perl

use strict;      # never forget this line
use warnings;    # or this one, it's better than the -w flag

use CGI qw(:standard);   # this gives us convenient functions for outputting
                         # HTML, plus other stuff for processing CGI requests

my @filelist = (
    'file1',
    'file2',
    'file3',
    'file4',
    'file5',
    'file6',
);

print "<html>\n";
print "<head>\n";
print "<link rel=\"stylesheet\" type=\"text/css\" href=\"test.css\">\n";
print "</head>\n";
foreach my $file (@filelist) {
    print "<table>\n";
    print "<div>\n";
    print "<tr>\n";
    print td( $file );
    open( my $fh, '<', $file );
    if ($fh) {
        while ( my $line = <$fh> ) {
            chomp $line;
            if ( substr($line, 0, 1) >= "2" ) {
                print "<td class=\"green\" title=\"" . $line . "\">\n";
            }
            elsif ( substr($line, 0, 1) == "1" ) {
                print "<td class=\"amber\" title=\"" . $line . "\">\n";
            }
            elsif ( substr($line, 0, 1) == "0" ) {
                print "<td class=\"red\" title=\"" . $line . "\">\n";
            }
        }
    }
    print "</tr>\n";
    print "</div>\n";
    print "</table>\n";
}
print "</html>";

The HTML needs a little tweaking, and a lot of it is redundant, but it works well :)
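
One possible tweak, sketched under assumptions: the script above writes each line straight into a title attribute, so a line containing a double quote or an angle bracket would break the markup. The helpers `escape_html` and `status_cell` below are hypothetical and not part of the script; they use only core Perl. Unlike the script, this sketch closes each cell and treats any line that does not start with 1 to 9 as red, instead of printing nothing for it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in HTML
# text and attribute values.
sub escape_html {
    my ($text) = @_;
    my %entity = ( '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' );
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# Hypothetical helper: build one cell, choosing the class from the
# first character of the line as the script above does.
sub status_cell {
    my ($line) = @_;
    my $first = substr $line, 0, 1;
    my $class = $first =~ /^[2-9]$/ ? 'green'
              : $first eq '1'       ? 'amber'
              :                       'red';
    return sprintf qq{<td class="%s" title="%s"></td>\n}, $class, escape_html($line);
}

print status_cell($_) for '2 all "good"', '1 <slow>', '0 down';
```

Comparing the first character as a string also avoids the "isn't numeric" warnings that `>=` and `==` raise when a line starts with a letter.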

【Comments】:

OK, but you forgot to select an accepted answer. Please do so.