【Question title】: Perl DBI PostgreSQL: Returning undef for data that is there
【Posted】: 2015-03-28 05:03:16
【Question】:

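I'm getting very strange results, and I know it must be something small that I'm doing wrong. I'm trying to check whether a row exists in a PostgreSQL table. On the first pass through the loop I get an actual value; on the second and every later iteration I get undef. Why? Is there something I'm supposed to do that I'm not doing? I'm not using prepare for this query, so I shouldn't need to call finish or anything like that.

Any insight would greatly help me debug this.

Sorry the code is such a mess right now. I've been debugging, and it has gotten ugly.

Sorry for the ugly sample output as well. I don't know how to format it nicely on Stack Overflow.

Please note the "Select Name" lines in the sample output. Every iteration after the first returns undef. The SQL call in question is near the end of the file, on this line: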

my $selectSQL = "select name from crawler_url where url='http://www.maccosmetics.com$item->{'uri'}' ";

Perl code:

#!/usr/bin/perl

use LWP::Simple;                # From CPAN
use JSON qw( decode_json );     # From CPAN
use JSON::Parse 'parse_json';
use Data::Dumper;               # Perl core module
use HTML::TreeBuilder 5 -weak;
use Mojo::DOM;
use DBI;
use String::Util qw(trim);
use strict;                     # Good practice
use warnings;                   # Good practice

my $initialize = 0;
my $debug = 1;

&main;

sub main {
    my $dbh = connect2db();

    unless(defined($dbh)) {
        exit 1;
    }


    my $trendsurl;

    my $sth = $dbh->prepare("SELECT company_name from companies where active=1");
    $sth->execute;
    while( my $company = $sth->fetchrow_hashref() ) {
        #print Dumper($company)."\n";

        my $sth2 = $dbh->prepare("SELECT url from crawlers where company_name='$$company{'company_name'}' ");
        $sth2->execute;
        while( my $url = $sth2->fetchrow_hashref() ) {
            #print " NOW ON URL $$url{'url'} ##########\n";
            $trendsurl = $$url{'url'};
            chomp($trendsurl);
            $trendsurl = trim($trendsurl);
            print "URL: ".$trendsurl."\n";

            my $json = get( $trendsurl );
            die "Could not get $trendsurl!" unless defined $json;

            my $parsed_json = parse_json($json);
            my $items = $parsed_json->{'sections'}[0]->{'items'};

            foreach my $item_hash (@$items) {
                #print Dumper($item_hash)."\n";
                my $category = $item_hash->{'name'};
                print "Lip Product Category: $category\n";

                foreach my $item ( @{ $item_hash->{'items'} } ) {
                    print Dumper($item)."\n";

                    my $selectSQL = "select name from crawler_url where url='http://www.maccosmetics.com$item->{'uri'}' ";

                    print $selectSQL."\n" if($debug);

                    my ($productCount) = $dbh->selectrow_array($selectSQL);

                    my $date = localtime;
                    chomp($productCount);
                    trim($productCount);
                    chomp($item->{'name'});
                    trim($item->{'name'});

                    print "Select Name: '$productCount'\n";
                    print "Item Name: '$item->{'name'}'\n";
                    print "Do they equal: ", index($productCount, $item->{'name'}), " \n";

                    print Dumper($productCount);

                    if( index($productCount, $item->{'name'}) == -1 ) {
                        my $insertSQL = "insert into crawler_url (first_seen,url,name,category,last_checked) values ('$date','http://www.maccosmetics.com$item->{'uri'}','$item->{'name'}','$category','$date') ";
                        print $insertSQL."\n" if($debug);
                        my $retVal = $dbh->do($insertSQL);

                        $insertSQL = "insert into urls (company_name,url) values ('$$company{'company_name'}','http://www.maccosmetics.com$item->{'uri'}') ";
                        print $insertSQL."\n" if($debug);
                        $retVal = $dbh->do($insertSQL);
                    }
                    else {
                        #We have seen this before
                        my $updateSQL = "update crawler_url SET (url,name,category,last_checked) = ('http://www.maccosmetics.com$item->{'uri'}','$item->{'name'}','$category','$date' )";
                        print $updateSQL."\n" if($debug);
                        my $retVal = $dbh->do($updateSQL);
                    }
                }
            }
        }
    }
}

sub connect2db {
    return DBI->connect("dbi:Pg:dbname=xxxxxx", "xxxxx", "XXXXXX");
}

Sample output:

URL: http://www.maccosmetics.com/includes/panel_nav/catalog.js?CATEGORY_ID=CAT163&LOCALE=en_US

Lip Product Category: Lipstick

$VAR1 = {
  'uri' => '/product/shaded/168/310/Products/Lips/Lipstick/Lipstick/index.tmpl',
  'description' => 'Colour plus texture for the lips. Stands out on the runway...',
  'name' => 'Lipstick',
  'thumbnail' => '/images/products/56x56/M300.jpg',
  'header' => '/images/pnav/product/headers/pnav_M300_200x12_off.gif',
  'id' => 'CAT168PROD310'
};

select name from crawler_url where url='http://www.maccosmetics.com/product/shaded/168/310/Products/Lips/Lipstick/Lipstick/index.tmpl'

Select Name: 'Lipstick                                                                                                                        '

Item Name: 'Lipstick'


Do they equal: -1

$VAR1 = 'Lipstick                                                                                                                        ';
update crawler_url SET (url,name,category,last_checked) = ('http://www.maccosmetics.com/product/shaded/168/310/Products/Lips/Lipstick/Lipstick/index.tmpl','Lipstick','Lipstick','Wed Jan 28 21:15:40 2015' )

$VAR1 = {
      'id' => 'CAT168PROD34492',
      'thumbnail' => '/images/products/56x56/MX5G8N.jpg',
      'header' => '/images/pnav/product/headers/pnav_MX5G8N_200x12_off.gif',
      'description' => "Miley Cyrus\x{2019}s shade of VIVA GLAM Lipstick. Her super-sexy hot...",
      'name' => 'VIVA GLAM Miley Cyrus Lipstick',
      'uri' => '/product/shaded/168/34492/Products/Lips/Lipstick/VIVA-GLAM-Miley-Cyrus-Lipstick/index.tmpl'
    };

select name from crawler_url where url='http://www.maccosmetics.com/product/shaded/168/34492/Products/Lips/Lipstick/VIVA-GLAM-Miley-Cyrus-Lipstick/index.tmpl'

Select Name: ''

Item Name: 'VIVA GLAM Miley Cyrus Lipstick'


Do they equal: 0

$VAR1 = undef;

insert into crawler_url (first_seen,url,name,category,last_checked) values ('Wed Jan 28 21:15:40 2015','http://www.maccosmetics.com/product/shaded/168/34492/Products/Lips/Lipstick/VIVA-GLAM-Miley-Cyrus-Lipstick/index.tmpl','VIVA GLAM Miley Cyrus Lipstick','Lipstick','Wed Jan 28 21:15:40 2015')

insert into urls (company_name,url) values ('MAC                                                             ','http://www.maccosmetics.com/product/shaded/168/34492/Products/Lips/Lipstick/VIVA-GLAM-Miley-Cyrus-Lipstick/index.tmpl')

$VAR1 = {
      'uri' => '/product/shaded/168/34798/Products/Lips/Lipstick/Isabel-and-Ruben-Toledo-Lipstick/index.tmpl',
      'description' => 'Formulated to shade, define and showcase the lips in a rouge-y...',
      'name' => 'Isabel and Ruben Toledo Lipstick ',
      'header' => '/images/pnav/product/headers/pnav_MWWE1T_200x12_off.gif',
      'thumbnail' => '/images/products/56x56/MWWE1T.jpg',
      'id' => 'CAT168PROD34798'
    };

select name from crawler_url where url='http://www.maccosmetics.com/product/shaded/168/34798/Products/Lips/Lipstick/Isabel-and-Ruben-Toledo-Lipstick/index.tmpl'

Select Name: ''


Item Name: 'Isabel and Ruben Toledo Lipstick '

Do they equal: 0

$VAR1 = undef;

insert into crawler_url (first_seen,url,name,category,last_checked) values ('Wed Jan 28 21:15:40 2015','http://www.maccosmetics.com/product/shaded/168/34798/Products/Lips/Lipstick/Isabel-and-Ruben-Toledo-Lipstick/index.tmpl','Isabel and Ruben Toledo Lipstick ','Lipstick','Wed Jan 28 21:15:40 2015')

insert into urls (company_name,url) values ('MAC                                                             ','http://www.maccosmetics.com/product/shaded/168/34798/Products/Lips/Lipstick/Isabel-and-Ruben-Toledo-Lipstick/index.tmpl')

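Update: When I add a next before the $dbh->do calls, I get the results I expect. So this has something to do with calling $dbh->do($insertSQL) or $dbh->do($updateSQL). Is there another call I should make before using $dbh->selectrow_array($selectSQL) again on the second iteration? If so, why?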
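One possibility consistent with this observation, offered as an assumption rather than a confirmed diagnosis: the UPDATE statement in the posted code has no WHERE clause, and an UPDATE without WHERE applies to every row in the table. If so, the first update would rewrite the url of every row in crawler_url, and later lookups by any other URL would find nothing. The sketch below reproduces that effect in isolation. It uses an in-memory SQLite database through DBD::SQLite as a stand-in for PostgreSQL so that it runs without a server; the two-column table, the sample rows, and the lookup helper are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Assumption: in-memory SQLite (DBD::SQLite) stands in for PostgreSQL.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 });

# Hypothetical, simplified version of the crawler_url table.
$dbh->do('CREATE TABLE crawler_url (url TEXT, name TEXT)');
my $insert = $dbh->prepare('INSERT INTO crawler_url (url, name) VALUES (?, ?)');
$insert->execute('/a', 'Lipstick');
$insert->execute('/b', 'Lip Gloss');

# Hypothetical helper: return the name stored for a URL, or undef if no row matches.
sub lookup {
    my ($url) = @_;
    my ($name) = $dbh->selectrow_array(
        'SELECT name FROM crawler_url WHERE url = ?', undef, $url);
    return $name;
}

my $before = lookup('/b');

# No WHERE clause: this statement rewrites every row, not only the '/a' row.
my $rows_touched = $dbh->do(
    'UPDATE crawler_url SET url = ?, name = ?', undef, '/a', 'Lipstick');

my $after = lookup('/b');

printf "before: %s\n", defined $before ? $before : 'undef';
printf "rows touched by UPDATE: %d\n", $rows_touched;
printf "after:  %s\n", defined $after ? $after : 'undef';
```

Adding a condition such as WHERE url = ? to the UPDATE restricts it to the row that was just looked up, which would leave the other rows findable on later iterations.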

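【Comments】:

  • Yes. Having two statements shouldn't be a problem. I've used that in other small programs and never had any trouble. I think it has to do with transactions, specifically with the insert or update. According to the documentation, the AutoCommit flag is enabled by default, so I shouldn't have to call commit after do. Maybe I'm missing something about how DBI, DBD::Pg, and PostgreSQL relate.
  • I'm not sure what happens when you update or insert on the same table while a cursor is active on it. Also, you really, really should use bind parameters; otherwise you are asking for a SQL injection attack.
  • From my past use of DBI, I believe updates and inserts are fine. Thanks for the tip about bind parameters. From now on I'll change all my SQL calls to use them. The wiki page on SQL injection is surprisingly thorough: en.m.wikipedia.org/wiki/SQL_injection
  • To get a better answer to this question, you'll have to either provide a schema dump (with data) that reproduces the problem, or start stripping code out to get a simpler example.

Tags: perl postgresql undefined dbi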
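The bind-parameter advice in the comments can be sketched as follows. This is a minimal illustration, not the asker's code: it assumes an in-memory SQLite database via DBD::SQLite in place of PostgreSQL so it can run without a server, and the table layout, URL, and product name are made up. The `?` placeholders and the argument order of `do` and `selectrow_array` (statement, attribute hash or undef, then bind values) are standard DBI.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Assumption: in-memory SQLite (DBD::SQLite) stands in for PostgreSQL.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 });

# Hypothetical, simplified table.
$dbh->do('CREATE TABLE crawler_url (url TEXT PRIMARY KEY, name TEXT)');

# Hypothetical values. The single quote in the name would break a SQL string
# built by interpolation, but is harmless when passed as a bind value.
my $url  = 'http://www.maccosmetics.com/product/example';
my $name = "Editor's Pick Lipstick";

# Values travel separately from the SQL text, so no manual quoting is needed.
$dbh->do('INSERT INTO crawler_url (url, name) VALUES (?, ?)',
    undef, $url, $name);

# A matching row yields its name; no matching row yields an empty list,
# which leaves the variable undef.
my ($found) = $dbh->selectrow_array(
    'SELECT name FROM crawler_url WHERE url = ?', undef, $url);
my ($missing) = $dbh->selectrow_array(
    'SELECT name FROM crawler_url WHERE url = ?', undef, 'http://no.such/url');

printf "found:   %s\n", defined $found   ? $found   : 'undef';
printf "missing: %s\n", defined $missing ? $missing : 'undef';
```

Checking `defined` on the result before passing it to chomp or index also avoids the "uninitialized value" warnings that an undef lookup would otherwise trigger.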


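【Solution 1】:

You really should add a $sth2->finish(); at the end of your inner while loop, and a $sth->finish(); after your outer while loop. Not calling finish on the inner loop could cause the first iteration to work but every subsequent one to fail, just as you describe in your question.

Not calling finish on a statement handle is bad form, to say the least, though you can usually get away with it if you don't have nested fetches. Once you nest fetches without corresponding finish calls, you run into exactly the problem you describe.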

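【Discussion】:

  • Statement handles are finished automatically when they are destroyed, usually because the variable goes out of scope or is reassigned. $sth2 will be finished at the end of each iteration of the loop. $sth will be finished at the end of main.
  • @Schwern How does that affect the do statements? According to the DBI and DBD::Pg documentation, do only does a prepare and an execute, not a finish. Does do imply a finish as well? search.cpan.org/dist/DBD-Pg/Pg.pm#do
  • @Nick.D A do has no cursor to finish; it just executes the statement and tells you how many rows were affected. selectall_* will finish for you (either explicitly or by letting the internal statement handle go out of scope). Basically, only call finish when A) you know you're not going to fetch all the data and B) you know the handle will not be destroyed soon. Using prepare_cached complicates things a little, but it will warn if you reuse an active handle.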
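The rule of thumb in the last comment can be observed through the statement handle's Active attribute. The sketch below is an illustration under assumptions: it runs against an in-memory SQLite database via DBD::SQLite instead of PostgreSQL, the companies table is reduced to one column, and the sample rows are invented. It shows a handle that is fetched to the end, a handle that is abandoned after one row and then finished explicitly, and a do call that only reports a row count.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Assumption: in-memory SQLite (DBD::SQLite) stands in for PostgreSQL.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 });

# Hypothetical, simplified table and rows.
$dbh->do('CREATE TABLE companies (company_name TEXT)');
$dbh->do('INSERT INTO companies (company_name) VALUES (?)', undef, $_)
    for qw(MAC Acme Zeta);

my $sth = $dbh->prepare(
    'SELECT company_name FROM companies ORDER BY company_name');

# Case 1: fetch every row. Once the fetch loop runs dry there is nothing
# left to finish, so no explicit finish call is needed.
$sth->execute;
my @all;
while (my ($company) = $sth->fetchrow_array) {
    push @all, $company;
}
my $active_after_full_fetch = $sth->{Active} ? 1 : 0;

# Case 2: stop after the first row. Rows are still pending, so the handle
# stays active until finish is called (or the handle is destroyed).
$sth->execute;
my ($first) = $sth->fetchrow_array;
my $active_before_finish = $sth->{Active} ? 1 : 0;
$sth->finish;
my $active_after_finish = $sth->{Active} ? 1 : 0;

# do has no result set to finish; it returns the number of affected rows.
my $affected = $dbh->do(
    'UPDATE companies SET company_name = ? WHERE company_name = ?',
    undef, 'M.A.C', 'MAC');

print "rows fetched in full loop: ", scalar(@all), "\n";
print "active after full fetch:   $active_after_full_fetch\n";
print "active before finish:      $active_before_finish\n";
print "active after finish:       $active_after_finish\n";
print "rows affected by do:       $affected\n";
```

In the question's code both while loops fetch until the result set is exhausted, which corresponds to case 1 here.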