【Question title】: Problems importing a CSV file into MySQL with PHP
【Posted】: 2020-10-10 17:33:02
【Question】:

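I am trying to import a large tab-delimited (\t) CSV file.

My steps to achieve this:

  1. Upload the CSV file
  2. Create a new table in my database (named after the file, without .csv)
  3. Because the file is large, split it into batches
  4. Send the batches to my database

The table gets created, but it is empty and I don't know why. Thanks for your help.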

upload.php

<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/1.11.1/jquery.js"></script>
<script>
//Declaration of function that will insert data into database
 function senddata(filename, table){
        var file = filename;
        $.ajax({
            type: "POST",
            url: "senddata.php",
            data: {file: file, table: table},
            async: true,
            success: function(html){
                //$("#result").html(html);
            }
        })
        }
 </script>
<?php
$csv = array();
$batchsize = 1000; //split huge CSV file by 1,000
$fileName = $_FILES['csv']['name'];
$table = basename($fileName, ".csv");
$table = strval( $table);
 echo "<script> console.log('File & Table-Name: $table') </script>";

// sql to create table
session_start();
error_reporting(E_ALL);
ini_set('display_errors', 1);

$DB_HOST = "XXXX"; 
$DB_NAME = "XXXX"; 
$DB_USER = "XXXX"; 
$DB_PASS = "XXXX"; 


$conn = new mysqli($DB_HOST, $DB_USER, $DB_PASS, $DB_NAME);
if($conn->connect_errno > 0) {
  die('Connection failed [' . $conn->connect_error . ']');
};
    $query = "SELECT ID FROM " . $table; // that should be id and not ID
    //$result = mysql_query($mysql_connexn, $query); 
    $result = mysqli_query($conn,$query);


if(empty($result)) {
    echo "<script> console.log('Table: $table created!') </script>";
    $query = mysqli_query($conn,"CREATE TABLE IF NOT EXISTS `$table` (
      id  VARCHAR(8),
      preferred_term  VARCHAR(217),
      synonyms  VARCHAR(217),
      PRIMARY KEY(synonyms)
    )");
    }
    else {
        echo "<script> console.log('Table: $table already exists!') </script>";
    } // else

if($_FILES['csv']['error'] == 0){
    $name = $_FILES['csv']['name'];

    $tmp = explode('.', $_FILES['csv']['name']);
    $endTmp = end($tmp);
    $ext = strtolower($endTmp);
    $tmpName = $_FILES['csv']['tmp_name'];
    if($ext === 'csv'){ //check if uploaded file is of CSV format
        if(($handle = fopen($tmpName, 'r')) !== FALSE) {
            set_time_limit(0);
            $row = 0;
            while(($data = fgetcsv($handle, $batchsize, "\t")) !== FALSE) {
                //echo "<script>console.log($data) </script>";
                $col_count = count($data);
                //splitting of CSV file :
                if ($row % $batchsize == 0):
                    $file = fopen("chunks$row.csv","w");
                endif;
                $csv[$row]['col1'] = $data[0];
                $csv[$row]['col2'] = $data[1];
                $csv[$row]['col3'] = $data[2];
                $id = $data[0];
                $preferred_term = $data[1];
                $synonyms = $data[2];
                $json = "'$id', '$preferred_term', '$synonyms'";
                fwrite($file,$json.PHP_EOL);
                //sending the splitted CSV files, batch by batch...
                if ($row % $batchsize == 0):
                    //echo "<script> console.log('chunks$row.csv', '$table'); </script>";
                    echo "<script> senddata('chunks$row.csv', '$table'); </script>";

                endif;
                $row++;
            }
            fclose($file);
            fclose($handle);
        }
    }
    else
    {
        echo "Only CSV files are allowed.";
    }
    //alert once done.
    echo "<script> console.log('CSV File imported!') </script>";
}
?>

senddata.php

<?php
include('connect.php');

$data = $_POST['file'];
$table = $_POST['table'];
$handle = fopen($data, "r");
    
if ($handle) {
    $counter = 0;
    //instead of executing query one by one,
    //prepare 1 SQL query that will insert all values from the batch
    $sql ="INSERT INTO `$table`(id,preferred_term,synonyms) VALUES ";
    while (($line = fgets($handle, "\t")) !== false) {
      $sql .= "($line),";
      $counter++;
    }
    $sql = substr($sql, 0, strlen($sql) - 1);
     if ($conn->query($sql) === TRUE) {
    } else {
     }
    fclose($handle);
} else {
}
//unlink CSV file once already imported to DB to clear directory
unlink($data);
?>

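【Comments】:

  • Take a look at bulk inserts: dev.mysql.com/doc/refman/8.0/en/… Splitting the file may be a good idea, but I would only try it as a second attempt, if MySQL fails to process the file directly.
  • I don't think you are using $batchsize correctly. I agree with nbk: don't split the file prematurely. I also agree with Lounis: LOAD DATA is probably the better approach.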
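
To illustrate the comment about $batchsize: the second argument of fgetcsv() is a maximum line length, not a row count, and the second argument of fgets() is likewise a length, not a delimiter. Below is a minimal sketch of batching by row count instead, assuming the three-column layout from the question. `readBatches` and `insertBatch` are hypothetical helper names that do not appear in the original post, and `$conn` is assumed to be an open mysqli connection.

```php
<?php
declare(strict_types=1);

/**
 * Yield the rows of a tab-delimited file in batches of $batchSize rows.
 * Hypothetical helper, not part of the original post.
 */
function readBatches(string $path, int $batchSize): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $batch = [];
    try {
        // Length 0 means "no line-length limit"; "\t" is the field delimiter.
        while (($row = fgetcsv($handle, 0, "\t", '"', '\\')) !== false) {
            if ($row === [null]) {
                continue; // fgetcsv() reports a blank line as [null]
            }
            $batch[] = $row;
            if (count($batch) === $batchSize) {
                yield $batch;
                $batch = [];
            }
        }
        if ($batch !== []) {
            yield $batch; // final, partially filled batch
        }
    } finally {
        fclose($handle);
    }
}

/**
 * Insert one batch through a prepared statement, so values are never
 * concatenated into the SQL text. Assumes the caller validated $table
 * and that every row has exactly three columns.
 */
function insertBatch(mysqli $conn, string $table, array $batch): void
{
    $stmt = $conn->prepare(
        "INSERT INTO `$table` (id, preferred_term, synonyms) VALUES (?, ?, ?)"
    );
    if ($stmt === false) {
        throw new RuntimeException($conn->error);
    }
    $conn->begin_transaction();
    foreach ($batch as [$id, $term, $synonyms]) {
        $stmt->bind_param('sss', $id, $term, $synonyms);
        $stmt->execute();
    }
    $conn->commit();
    $stmt->close();
}
```

With these helpers, the upload script could loop over `readBatches($_FILES['csv']['tmp_name'], 1000)` and pass each batch to `insertBatch`, without writing intermediate chunk files or issuing AJAX calls.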

Tags: php mysql parsing etl


【Solution 1】:

Consider using MySQL's LOAD DATA statement. It is very fast and was designed for exactly this purpose (bulk-loading large files).

https://dev.mysql.com/doc/refman/8.0/en/load-data.html

Example:

LOAD DATA INFILE 'data.csv' 
INTO TABLE my_table 
FIELDS TERMINATED BY ',' 
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

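This way you don't need to loop over every element; you only configure the field delimiter, the line terminator, and so on. For the tab-delimited file in the question, the field delimiter would be '\t' rather than ','.

It is far more efficient than doing the work in PHP with one big function.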
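
As a rough sketch of how this could be called from PHP for the tab-delimited file in the question: the column list is taken from the CREATE TABLE in the question, while `buildLoadDataSql` and `importTsv` are hypothetical names introduced here. The sketch assumes the file has no header row (so IGNORE 1 ROWS is left out) and uses "\n" line endings. Depending on configuration, `local_infile` on the server and `mysqli.allow_local_infile` in PHP may need to be enabled for LOAD DATA LOCAL to work.

```php
<?php
declare(strict_types=1);

/**
 * Build a LOAD DATA statement for a tab-delimited file.
 * $escapedPath must already be escaped, e.g. with mysqli::real_escape_string().
 * Hypothetical helper, not part of the original answer.
 */
function buildLoadDataSql(string $table, string $escapedPath): string
{
    // Table names cannot be bound as parameters, so accept only a safe subset.
    if (preg_match('/^[A-Za-z0-9_]+$/', $table) !== 1) {
        throw new InvalidArgumentException("Unsafe table name: $table");
    }
    // '\\t' and '\\n' reach MySQL as the escape sequences \t and \n.
    return "LOAD DATA LOCAL INFILE '$escapedPath'
            INTO TABLE `$table`
            FIELDS TERMINATED BY '\\t'
            LINES TERMINATED BY '\\n'
            (id, preferred_term, synonyms)";
}

/**
 * Connect with LOCAL INFILE enabled on the client side and run the import.
 */
function importTsv(
    string $host, string $user, string $pass, string $db,
    string $table, string $path
): void {
    $conn = mysqli_init();
    // Must be set before real_connect().
    $conn->options(MYSQLI_OPT_LOCAL_INFILE, 1);
    if (!$conn->real_connect($host, $user, $pass, $db)) {
        throw new RuntimeException('Connection failed: ' . mysqli_connect_error());
    }
    $sql = buildLoadDataSql($table, $conn->real_escape_string($path));
    if ($conn->query($sql) === false) {
        throw new RuntimeException('Import failed: ' . $conn->error);
    }
    $conn->close();
}
```

In the upload script, `importTsv` could be given `$_FILES['csv']['tmp_name']` as the path, which would replace the chunk files and the senddata.php round trips entirely.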

【Comments】:

  • I am using MySQL 5.7. Isn't this for version 8.0?
  • MySQL 5.7 is fine.