【Question Title】: Laravel 5.4 upgrade, converting to utf8mb4 from utf8
【Posted】: 2018-06-17 01:55:28
【Question】:

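I am busy upgrading one of my websites from Laravel 5.3 to 5.4. While browsing the current GitHub repository, I noticed that the default character set and collation have changed from utf8 to utf8mb4 in order to support emoji.

My current database (MariaDB 10.0.29) is set to use utf8, but I would like to upgrade it to use utf8mb4. Unfortunately, I have not been able to find any documentation on this process.

Perhaps I am overthinking this, but I would expect that changing a database's character set and collation requires some work, at the very least running some ALTER TABLE commands.

  • Is it necessary to change the character set and collation of the database and its tables? Or is it sufficient to change the settings in my config/database.php file?
  • If it is necessary, could anyone provide an example migration (or some MySQL code) showing how to do this, bearing in mind that preserving the existing data is essential?

Thanks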

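【Comments】:

  • I ran into problems upgrading to 5.4 while using a MariaDB instance hosted on AWS. Unfortunately, due to time constraints, I had to shut it down and re-import all of the data into a MySQL instance, because none of the suggestions I tried worked. TL;DR: I would love an answer to this myself.
  • Unfortunately, I still haven't come across anything that helps. I'm pretty sure that changing the character set and collation is necessary. I'm busy writing a migration to do this, with the ability to revert if necessary, but it's not without its challenges. If you're interested in using it, I'll post it for you when I'm done.

Tags: php mysql database laravel encoding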


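【Answer 1】:

Alright, I have written a migration to achieve this for my own system.

  • It allows you to optionally specify a connection name, to reference a connection other than the default one.

  • It gets the list of tables from the connection's database using a SHOW TABLES query.

  • It then loops through each table and updates all string/character type columns to the new character set and collation.

  • A callback must be provided to determine whether a column should have its length changed to the new length provided. In my implementation, VARCHAR and CHAR columns with a length greater than 191 are updated to a length of 191 during the up migration, and VARCHAR and CHAR columns with a length of exactly 191 are updated to a length of 255 during the reverse/down migration.

  • Once all the string/character columns have been updated, a couple of queries are run per table: one converts the table, and any columns still using another collation, to the new character set and collation, and one changes the table's default character set and collation.

  • Finally, the database's default character set and collation are changed.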

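Notes

  • Originally, I tried to simply convert the tables to the new encoding, but ran into problems with column lengths. With InnoDB on my version of MySQL/MariaDB, 191 characters is the maximum length of an indexed utf8mb4 column, and changing the table collation resulted in an error.

  • At first I wanted to just update the lengths to the new length, but I also wanted to provide a rollback, so that alone was not an option: the reverse method would have set columns that were still utf8mb4 to a length of 255, which is too long. So I opted to change the collation as well.

  • I then tried changing the length, character set and collation of only the VARCHAR and CHAR columns that were too long, but on my system this caused errors for multi-column indexes that included such columns. Apparently, all columns in a multi-column index must use the same collation.

  • An important note: the reverse/down migration is not going to be 100% perfect for everyone. I don't think it could be without storing extra information about the original columns when migrating. My current reverse/down implementation therefore assumes that columns with a length of 191 were originally 255.

  • A similarly important note: this blindly changes the collation of all string/character columns to the new collation, regardless of the original collation. If some columns use different collations, they will all be converted to the new one, and the reverse migration does the same, so the original collations are not preserved.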
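
The 191 and 255 figures in these notes follow from InnoDB's 767-byte index key prefix limit (quoted from the MySQL documentation in the comments below), divided by the maximum bytes per character: 3 for utf8 and 4 for utf8mb4. A minimal sketch of that arithmetic; the function name maxIndexableChars is made up for illustration:

```php
<?php

// Hypothetical helper (not part of the migration): how many characters fit
// in an index key prefix of the given size when every character may need
// up to $maxBytesPerChar bytes.
function maxIndexableChars($prefixLimitBytes, $maxBytesPerChar)
{
    return (int) floor($prefixLimitBytes / $maxBytesPerChar);
}

// 767 bytes is the default InnoDB prefix limit; 3072 bytes applies when
// innodb_large_prefix is enabled.
foreach ([767, 3072] as $limit) {
    echo $limit, ' bytes: utf8 = ', maxIndexableChars($limit, 3),
        ', utf8mb4 = ', maxIndexableChars($limit, 4), PHP_EOL;
}
```

This is why a VARCHAR(255) column that could be indexed under utf8 no longer fits once it is converted to utf8mb4, unless the larger prefix limit is available.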


<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Support\Facades\DB;

class UpgradeDatabaseToUtf8mb4 extends Migration
{
    /**
     * Run the migrations.
     *
     * @return void
     */
    public function up()
    {
        $this->changeDatabaseCharacterSetAndCollation('utf8mb4', 'utf8mb4_unicode_ci', 191, function ($column) {
            return $this->isStringTypeWithLength($column) && $column['type_brackets'] > 191;
        });
    }

    /**
     * Reverse the migrations.
     *
     * @return void
     */
    public function down()
    {
        $this->changeDatabaseCharacterSetAndCollation('utf8', 'utf8_unicode_ci', 255, function ($column) {
            return $this->isStringTypeWithLength($column) && $column['type_brackets'] == 191;
        });
    }

    /**
     * Change the database referred to by the connection (null is the default connection) to the provided character set
     * (e.g. utf8mb4) and collation (e.g. utf8mb4_unicode_ci). It may be necessary to change the length of some fixed
     * length columns such as char and varchar to work with the new encoding. In which case the new length of such
     * columns and a callback to determine whether or not that particular column should be altered may be provided. If a
     * connection other than the default connection is to be changed, the string referring to the connection may be
     * provided as the last parameter (This string will be passed to DB::connection(...) to retrieve an instance of that
     * connection).
     *
     * @param string       $charset
     * @param string       $collation
     * @param null|int     $newColumnLength
     * @param Closure|null $columnLengthCallback
     * @param string|null  $connection
     */
    protected function changeDatabaseCharacterSetAndCollation($charset, $collation, $newColumnLength = null, $columnLengthCallback = null, $connection = null)
    {
        $tables = $this->getTables($connection);

        foreach ($tables as $table) {
            $this->updateColumnsInTable($table, $charset, $collation, $newColumnLength, $columnLengthCallback, $connection);
            $this->convertTableCharacterSetAndCollation($table, $charset, $collation, $connection);
        }

        $this->alterDatabaseCharacterSetAndCollation($charset, $collation, $connection);
    }

    /**
     * Get an instance of the database connection provided with an optional string referring to the connection. This
     * should be null if referring to the default connection.
     *
     * @param string|null $connection
     *
     * @return \Illuminate\Database\Connection
     */
    protected function getDatabaseConnection($connection = null)
    {
        return DB::connection($connection);
    }

    /**
     * Get a list of tables on the provided connection.
     *
     * @param null $connection
     *
     * @return array
     */
    protected function getTables($connection = null)
    {
        $tables = [];

        $results = $this->getDatabaseConnection($connection)->select('SHOW TABLES');
        foreach ($results as $result) {
            foreach ($result as $key => $value) {
                $tables[] = $value;
                break;
            }
        }

        return $tables;
    }

    /**
     * Given a stdClass representing the column, extract the required information in a more accessible format. The array
     * returned will contain the field name, the type of field (Without the length), the length where applicable (or
     * null), true/false indicating the column allowing null values and the default value.
     *
     * @param stdClass $column
     *
     * @return array
     */
    protected function extractInformationFromColumn($column)
    {
        $type = $column->Type;
        $typeBrackets = null;
        $typeEnd = null;

        if (preg_match('/^([a-z]+)(?:\\(([^\\)]+?)\\))?(.*)/i', $type, $matches)) {
            $type = strtolower(trim($matches[1]));

            if (isset($matches[2])) {
                $typeBrackets = trim($matches[2]);
            }

            if (isset($matches[3])) {
                $typeEnd = trim($matches[3]);
            }
        }

        return [
            'field' => $column->Field,
            'type' => $type,
            'type_brackets' => $typeBrackets,
            'type_end' => $typeEnd,
            'null' => strtolower($column->Null) == 'yes',
            'default' => $column->Default,
            'charset' => is_string($column->Collation) && ($pos = strpos($column->Collation, '_')) !== false ? substr($column->Collation, 0, $pos) : null,
            'collation' => $column->Collation
        ];
    }

    /**
     * Tell if the provided column is a string/character type and needs to have its charset/collation changed.
     *
     * @param array $column The array returned by extractInformationFromColumn()
     *
     * @return bool
     */
    protected function isStringType($column)
    {
        return in_array(strtolower($column['type']), ['char', 'varchar', 'tinytext', 'text', 'mediumtext', 'longtext', 'enum', 'set']);
    }

    /**
     * Tell if the provided column is a string/character type with a length.
     *
     * @param array $column The array returned by extractInformationFromColumn()
     *
     * @return bool
     */
    protected function isStringTypeWithLength($column)
    {
        return in_array(strtolower($column['type']), ['char', 'varchar']);
    }

    /**
     * Update all of the string/character columns in the database to be the new collation. Additionally, modify the
     * lengths of those columns that have them to be the newLength provided, when the shouldUpdateLength callback passed
     * returns true.
     *
     * @param string        $table
     * @param string        $charset
     * @param string        $collation
     * @param int|null      $newLength
     * @param Closure|null  $shouldUpdateLength
     * @param string|null   $connection
     */
    protected function updateColumnsInTable($table, $charset, $collation, $newLength = null, Closure $shouldUpdateLength = null, $connection = null)
    {
        $columnsToChange = [];

        foreach ($this->getColumnsFromTable($table, $connection) as $column) {
            $column = $this->extractInformationFromColumn($column);

            if ($this->isStringType($column)) {
                $sql = "CHANGE `%field%` `%field%` %type%%brackets% CHARACTER SET %charset% COLLATE %collation% %null% %default%";
                $search = ['%field%', '%type%', '%brackets%', '%charset%', '%collation%', '%null%', '%default%'];
                $replace = [
                    $column['field'],
                    $column['type'],
                    $column['type_brackets'] ? '(' . $column['type_brackets'] . ')' : '',
                    $charset,
                    $collation,
                    $column['null'] ? 'NULL' : 'NOT NULL',
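                    // Let PDO quote the default so values containing quotes do not break the statement.
                    is_null($column['default']) ? ($column['null'] ? 'DEFAULT NULL' : '') : 'DEFAULT ' . $this->getDatabaseConnection($connection)->getPdo()->quote($column['default'])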
                ];

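                // Only call the callback when one was actually provided; it defaults to null.
                if ($this->isStringTypeWithLength($column) && $shouldUpdateLength !== null && is_int($newLength) && $newLength > 0 && $shouldUpdateLength($column)) {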
                    $replace[2] = '(' . $newLength . ')';
                }

                $columnsToChange[] = trim(str_replace($search, $replace, $sql));
            }
        }

        if (count($columnsToChange) > 0) {
            $query = "ALTER TABLE `{$table}` " . implode(', ', $columnsToChange);

            $this->getDatabaseConnection($connection)->update($query);
        }
    }

    /**
     * Get a list of all the columns for the provided table. Returns an array of stdClass objects.
     *
     * @param string      $table
     * @param string|null $connection
     *
     * @return array
     */
    protected function getColumnsFromTable($table, $connection = null)
    {
        // Backtick-quote the table name so reserved words and unusual names do not break the query.
        return $this->getDatabaseConnection($connection)->select("SHOW FULL COLUMNS FROM `{$table}`");
    }

    /**
     * Convert a table's character set and collation.
     *
     * @param string      $table
     * @param string      $charset
     * @param string      $collation
     * @param string|null $connection
     */
    protected function convertTableCharacterSetAndCollation($table, $charset, $collation, $connection = null)
    {
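        // Convert the table and any columns still using another character set or collation.
        $query = "ALTER TABLE `{$table}` CONVERT TO CHARACTER SET {$charset} COLLATE {$collation}";
        $this->getDatabaseConnection($connection)->update($query);

        // Set the default used for columns added to this table later.
        $query = "ALTER TABLE `{$table}` DEFAULT CHARACTER SET {$charset} COLLATE {$collation}";
        $this->getDatabaseConnection($connection)->update($query);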
    }

    /**
     * Change the entire database's (The database represented by the connection) character set and collation.
     *
     * # Note: This must be done with the unprepared method, as PDO complains that the ALTER DATABASE command is not yet
     *         supported as a prepared statement.
     *
     * @param string      $charset
     * @param string      $collation
     * @param string|null $connection
     */
    protected function alterDatabaseCharacterSetAndCollation($charset, $collation, $connection = null)
    {
        $database = $this->getDatabaseConnection($connection)->getDatabaseName();

        $query = "ALTER DATABASE `{$database}` CHARACTER SET {$charset} COLLATE {$collation}";

        $this->getDatabaseConnection($connection)->unprepared($query);
    }
}

Please, please, please back up your database before running this. Use at your own risk!
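
One way to check the result, before and after running the migration, might be to list the columns whose character set is not yet the target one. The sketch below only builds the query against information_schema.COLUMNS. The function name nonConvertedColumnsQuery and the database name my_app are placeholders, not part of the migration above.

```php
<?php

// Hypothetical helper: returns the SQL and its bindings for listing
// string columns that do not yet use the target character set.
// CHARACTER_SET_NAME is NULL for non-string columns, so those are skipped.
function nonConvertedColumnsQuery($database, $targetCharset = 'utf8mb4')
{
    $sql = 'SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME '
        . 'FROM information_schema.COLUMNS '
        . 'WHERE TABLE_SCHEMA = ? '
        . 'AND CHARACTER_SET_NAME IS NOT NULL '
        . 'AND CHARACTER_SET_NAME <> ?';

    return [$sql, [$database, $targetCharset]];
}

// 'my_app' is a placeholder database name.
list($sql, $bindings) = nonConvertedColumnsQuery('my_app');
echo $sql, PHP_EOL;
```

Inside a Laravel application the pair could be passed to DB::select($sql, $bindings), with the database name taken from getDatabaseName() as the migration does. An empty result after the up migration would suggest that every string column was converted.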

【Comments】:

  • Great answer! Thanks
  • This is a great answer. I tried running the migration to see whether I would get any errors, since I have columns of length 255, but it ran fine. The MySQL documentation says: "For InnoDB tables that use COMPRESSED or DYNAMIC row format, you can enable the innodb_large_prefix option to permit index key prefixes longer than 767 bytes (up to 3072 bytes). (Creating such tables also requires the option values innodb_file_format=barracuda and innodb_file_per_table=true.) In this case, enabling the innodb_large_prefix option enables you to index a maximum of 1024 or 768 characters for utf8 or utf8mb4 columns"
  • Thanks for providing this script! Note: if innodb_large_prefix is enabled, I think you can leave the character length at 255
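
Where innodb_large_prefix cannot be enabled, new migrations (as opposed to the existing columns converted above) can be given a shorter default string length. The fragment below is a sketch of the approach described in the Laravel 5.4 migration documentation. It assumes the stock App\Providers\AppServiceProvider and only runs inside a Laravel application.

```php
<?php

namespace App\Providers;

use Illuminate\Support\Facades\Schema;
use Illuminate\Support\ServiceProvider;

class AppServiceProvider extends ServiceProvider
{
    /**
     * Bootstrap any application services.
     *
     * @return void
     */
    public function boot()
    {
        // $table->string('name') in new migrations now creates VARCHAR(191)
        // instead of VARCHAR(255), so indexed utf8mb4 columns stay within
        // the 767-byte key prefix limit.
        Schema::defaultStringLength(191);
    }
}
```

This only affects columns created by later migrations. It does not alter tables that already exist.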
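【Answer 2】:

The database character set and collation are the defaults for newly created tables. The table settings are the defaults for columns.

Do this for each table: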

ALTER TABLE table_name CONVERT TO utf8mb4;

【Comments】:

  • That should be ALTER TABLE tablename CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
  • Better (through 5.7) is utf8mb4_unicode_520_ci. For 8.0, utf8mb4_0900_ai_ci is recommended.
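
Putting the corrected statement from the first comment into a loop, one statement per table could be generated roughly as follows. This is a sketch only: the function name buildConvertStatements and the example table names are hypothetical, and the collation default is the one from the comment rather than a recommendation for every server version.

```php
<?php

// Hypothetical helper: builds one "ALTER TABLE ... CONVERT TO" statement
// per table name. Nothing is executed here; the strings can be reviewed
// first and then run by hand or from a migration.
function buildConvertStatements(array $tables, $charset = 'utf8mb4', $collation = 'utf8mb4_unicode_ci')
{
    $statements = [];

    foreach ($tables as $table) {
        // Backtick-quote the identifier, doubling any backtick inside it.
        $quoted = '`' . str_replace('`', '``', $table) . '`';
        $statements[] = "ALTER TABLE {$quoted} CONVERT TO CHARACTER SET {$charset} COLLATE {$collation};";
    }

    return $statements;
}

// Example table names only.
foreach (buildConvertStatements(['users', 'password_resets']) as $statement) {
    echo $statement, PHP_EOL;
}
```

As the first answer notes, a plain conversion like this can fail on indexed columns longer than 191 characters, so those columns may need shortening first.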