首先,您需要设计和创建要接收数据的表,并尝试一些手动数据输入来检查数据模型。
确保每个电子表格都有足够的标题信息来了解如何处理行。
完成后,我会将工作表保存到带有制表符分隔符的文本文件中。
接下来我会用 Perl 编写一个加载程序。它首先读取标题行并确定将行插入数据库的规则。然后将每一行转换为插入到数据库中。
这是我拥有的发票加载程序的示例(所有权利):
if ($first) {
$obj->_hdr2keys(0); # convert spreadhseet header into a lookup
my $hdr = $obj->_copystruct($obj->{ar}[0]);
my @Hhdr = ('invoice header id');
my @Hcols = ('invhid');
my @Htypes = ('serial');
my @Dhdr = ('invoice detail id');
my @Dcols = ('invdid','invhid');
my @Dtypes = ('serial','integer');
for (my $col=0; $col <= $#{$hdr}; $col++) {
my $colname = lc($obj->_pomp($hdr->[$col]));
if ($colname eq 'invoicenumber') {
push @Hhdr, $hdr->[$col];
push @Hcols, $colname;
push @Htypes, 'char(32)';
}
elsif ($colname eq 'buysell') {
push @Hhdr, $hdr->[$col];
push @Hcols, $colname;
push @Htypes, 'boolean';
}
elsif ($colname eq 'suppliercustomer') {
push @Hhdr, $hdr->[$col];
push @Hcols, $colname;
push @Htypes, 'char(64)';
}
elsif ($colname eq 'date') {
push @Hhdr, 'Transaction Date';
push @Hcols, 'transactiondate';
push @Htypes, 'date';
}
elsif ($colname eq 'article') {
push @Dhdr, 'Article id';
push @Dcols, 'artid';
push @Dtypes, 'integer';
push @Dhdr, 'Article Description';
push @Dcols, 'description';
push @Dtypes, 'char(64)';
}
elsif ($colname eq 'qty') {
push @Dhdr, $hdr->[$col];
push @Dcols, $colname;
push @Dtypes, 'integer';
}
elsif ($colname eq 'priceexclbtw') {
push @Dhdr, $hdr->[$col];
push @Dcols, $colname;
push @Dtypes, 'double precision';
}
elsif ($colname eq 'btw') {
push @Dhdr, $hdr->[$col];
push @Dcols, $colname;
push @Dtypes, 'real';
}
}
$obj->_getset('INVHar',
['invoiceheader',
['PK','invhid'],
['__COLUMNS__'],
\@Hcols,
\@Htypes,
\@Hhdr
]
);
$obj->_getset('INVDar',
['invoicedetail',
['PK','invdid'],
['FK','invhid','invoiceheader','invhid'],
['FK','artid','article','artid'],
['__COLUMNS__'],
\@Dcols,
\@Dtypes,
\@Dhdr
]
);
}
$first = 0;
SALESROW: for (my $i=1; $i <= $#{$obj->{ar}}; $i++) {
my @Hrow = ('');
my @Drow = ('');
my $date = $obj->_selectar('', $i, 'Date');
$date =~ s/\-/\//g;
if ($date) {
$obj->_validCSV('date', $date)
or die "CSV format error date |$date| in file $file";
}
my $invtotal = ($obj->_selectar('', $i, 'Invoice Total incl. BTW'));
my $article = $obj->_selectar('', $i, 'Article');
$date or $article or next SALESROW;
if ($date) {
push @Hrow, $obj->_selectar('', $i, 'Invoice Number');
my $buysell = $obj->_selectar('', $i, 'Buy/Sell');
push @Hrow, ($buysell eq 'S') ? 1 : 0;
push @Hrow, $obj->_selectar('', $i, 'Supplier/Customer');
push @Hrow, $date;
push @{$obj->_getset('INVHar')}, \@Hrow;
$invhid++;
}
push @Drow, $invhid;
if ($article eq 'E0154') {
push @Drow, 1;
}
elsif ($article eq 'C0154') {
push @Drow, 2;
}
elsif ($article eq 'C0500') {
push @Drow, 3;
}
elsif ($article eq 'C2000') {
push @Drow, 4;
}
elsif ($article eq 'C5000') {
push @Drow, 5;
}
else {
die "unrecognised sales article $article\n"
. Dumper($obj->{ar}[$i]);
}
push @Drow, undef; # description is in article table
push @Drow, $obj->_selectar('', $i, 'Qty.');
push @Drow, $obj->_selectar('', $i, 'Price excl. BTW');
push @Drow, $obj->_selectar('', $i, 'BTW %');
push @{$obj->_getset('INVDar')}, \@Drow;
}
这会在产品表已从另一个电子表格加载后创建发票的抬头和详细记录。
在上面的示例中,创建了两个数组,INVHar 和 INVDar。当它们准备好时,调用例程将它们加载到数据库中,如下所示。在下一个代码示例中,创建了表以及行,并且更新了元数据库以加载未来的表和管理现有表的外键。在前面的 sn-p 中创建的数组包含创建表和插入行所需的所有信息。还有一个简单的例程 _DBdatacnv,用于在电子表格中的格式和数据库中所需的格式之间进行转换。例如,电子表格中的货币符号需要在插入前去除。
sub _arr2db {
my ($obj) = @_;
my $ar = $obj->_copystruct($obj->_getset('ar'));
my $dbh = $obj->_getset('CDBh');
my $mdbh = $obj->_getset('MDBh');
my $table = shift @$ar;
$mdbh->{AutoCommit} = 0;
$dbh->{AutoCommit} = 0;
my @tables = $mdbh->selectrow_array(
"SELECT id FROM mtables
WHERE name = \'$table\'"
);
my $id = $tables[0] || '';
if ($id) {
$mdbh->do("DELETE FROM mcolumns where tblid=$id");
$mdbh->do("DELETE FROM mtables where id=$id");
}
# process constraints
my %constraint;
while ($#{$ar} >= 0
and $ar->[0][0] ne '__COLUMNS__') {
my $cts = shift @$ar;
my $type = shift @$cts;
if ($type eq 'PK') {
my $pk = shift @$cts;
$constraint{$pk} ||= '';
$constraint{$pk} .= ' PRIMARY KEY';
@$cts and die "unsupported compound key for $table";
}
elsif ($type eq 'FK') {
my ($col, $ft, $fk) = @$cts;
$ft && $fk or die "incomplete FK declaration in CSV for $table";
$constraint{$col} ||= '';
$constraint{$col} .=
sprintf( ' REFERENCES %s(%s)', $ft, $fk );
}
elsif ($type eq 'UNIQUE') {
while (my $uk = shift @$cts) {
$constraint{$uk} ||= '';
$constraint{$uk} .= ' UNIQUE';
}
}
elsif ($type eq 'NOT NULL') {
while (my $nk = shift @$cts) {
$constraint{$nk} ||= '';
$constraint{$nk} .= ' NOT NULL';
}
}
else {
die "unrecognised constraint |$type| for table $table";
}
}
shift @$ar;
unless ($mdbh->do("INSERT INTO mtables (name) values (\'$table\')")) {
warn $mdbh->errstr . ": mtables";
$mdbh->rollback;
die;
}
@tables = $mdbh->selectrow_array(
"SELECT id FROM mtables
WHERE name = \'$table\'"
);
$id = shift @tables;
$dbh->do("DROP TABLE IF EXISTS $table CASCADE")
or die $dbh->errstr;
my $create = "CREATE TABLE $table\n";
my $cols = shift @$ar;
my $types = shift @$ar;
my $desc = shift @$ar;
my $first = 1;
my $last = 0;
for (my $i=0; $i<=$#{$cols}; $i++) {
$last = 1;
if ($first) {
$first = 0;
$create .= "( "
}
else {
$create .= ",\n";
}
$create .= $cols->[$i]
. ' ' . $obj->_DBcnvtype($types->[$i]);
$constraint{$cols->[$i]}
and $create .= ' ' . $constraint{$cols->[$i]};
unless ($mdbh->do("INSERT INTO mcolumns (tblid,name,type,description)
values ($id,\'$cols->[$i]\',\'$types->[$i]\',\'$desc->[$i]\')"))
{
warn $mdbh->errstr;
$mdbh->rollback;
die;
}
}
$last and $create .= ')';
unless ($dbh->do($create)) {
warn $dbh->errstr;
$dbh->rollback;
die;
}
my $count = 0;
while (my $row = shift @$ar) {
$count++;
my $insert = "INSERT INTO $table (";
my $values = 'VALUES (';
my $first = 1;
for (my $i=0; $i<=$#{$cols}; $i++) {
my $colname = $cols->[$i];
unless (defined($constraint{$colname})
and $constraint{$colname} =~ /PRIMARY KEY/) {
if ($first) {
$first = 0;
}
else {
$insert .= ', ';
$values .= ', ';
}
$insert .= $colname;
my $val = $obj->_DBdatacnv('CSV', 'DB',
$types->[$i],$row->[$i]);
if ($val eq '%ABORT') {
$mdbh->rollback;
die;
}
$values .= $val;
}
}
$insert .= ')' . $values . ')';
unless ($dbh->do($insert)) {
warn $dbh->errstr;
warn $insert;
$mdbh->rollback;
die;
}
}
NOINSERT: $mdbh->commit;
$dbh->commit;
# warn "inserted $count rows into $table";
}
更新:好的,我将添加从 CSV 转换为数组的通用例程,为上面的 _arr2db 做好准备,适用于我对系统的所有其他情况:电子表格首先增加 PK FK 和其他约束,然后是列标题数据库的名称,一行数据库类型(名义上的,实际的在_DBcnvdatatype中处理)然后是进入元数据库的一行标签,最后是数据行之前的标记COLUMNS插入。
sub _csv2arr {
my ($obj, $csv ) = @_;
my $ar = [];
my $delim = $obj->_getset('csvdelim') || '\,';
my $basename = basename($csv);
$basename =~ s/\.txt$//;
$ar = [$basename];
open my $fh, $csv
or die "$!: $csv";
while (<$fh>) {
chomp;
my $sa = [];
@$sa = split /$delim/;
push @$ar, $sa;
}
close $fh;
$obj->{ar} = $ar;
}