Title: Split MySQL fields by character count to the nearest full word?
Posted: 2016-11-27 18:29:45
Question:

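I have a MySQL field that holds the body of blog posts in Markdown format. Because of the API I'm using, I can only send chunks of 3000 characters, but some of my posts are as large as 4500 characters, and there are over 2000 of them, so I don't want to split them by hand.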

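I'm trying to work out a function that checks the char_length of each field in the column and, if it exceeds 3000 characters, moves everything past 3000 characters (rounded to the nearest word) into a second column. This is beyond the scope of the functions I've worked with before, so I'm hoping for a push in the right direction. Here is the basis of what I have so far: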

SELECT `Body` from `blogposts`
WHERE char_length(Body) > 3000
SET
Body2 = SUBSTRING(`Body`, 3001, char_length(Body))
Body = SUBSTRING(`Body`, 1, 3000)

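Since it isn't finished, I haven't tested it yet. I doubt it comes close to what I want, and there are two other problems I'm still trying to solve before I test it:

1) How to make it cut at the end of the nearest word (rounding down to below 3000 characters), rather than splitting at exactly the 3000th character.

2) Whether splitting on words will break the Markdown/HTML in the text, for example cutting a <div> tag in two if the 3000th character falls inside it.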

For background, I've read the following:

Split string into table in groups of 26 characters or less, rounded to the nearest word

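The answers there seem to propose custom functions that split a string based on a set length, but the functions aren't well explained or commented, so I'm a bit lost.

If this isn't easy to do in MySQL, I'm open to pulling the data out in PHP and manipulating it there.

Any insight would be appreciated!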

    Tags: php mysql sql string split


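    Solution 1:
    -- Cut each long Body at the last space within its first 3000 characters.
    -- Body2 is assigned first on purpose: MySQL generally evaluates single-table
    -- SET clauses left to right, so Body still holds the full text at that point.
    -- Body2 starts at that space; wrap the expression in ltrim() to drop it.
    update   `blogposts`
    
    set     `Body2` = substring(`Body`,3000-instr(reverse(left(`Body`,3000)),' ')+1) 
           ,`Body` = left(`Body`,3000-instr(reverse(left(`Body`,3000)),' '))  
    
    where   char_length(Body) > 3000
    ;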
    

    Demo with 30 characters

    set @Body = 'My name is Inigo Montoya! You''ve killed my father, prepare to die!';
    
    select  left(@Body,30-instr(reverse(left(@Body,30)),' '))         as field_1
           ,substring(@Body,30-instr(reverse(left(@Body,30)),' ')+1)  as field_2
    ;       
    

    +---------------------------+------------------------------------------+
    | field_1                   | field_2                                  |
    +---------------------------+------------------------------------------+
    | My name is Inigo Montoya! | You've killed my father, prepare to die! |
    +---------------------------+------------------------------------------+
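
For readers who would rather do this in PHP, the cut-point arithmetic from the demo can be mirrored roughly as follows. This is only a sketch: the function name `splitAtLastSpace` and its parameters are hypothetical, and it assumes single-byte text (for multibyte content `mb_strlen`, `mb_substr` and `mb_strrpos` would be the functions to reach for). `strrpos` plays the role of `instr(reverse(...))`, finding the last space within the first `$limit` characters.

```php
<?php
// Hypothetical helper mirroring the SQL demo: split $text in two at the
// last space found within its first $limit bytes.
function splitAtLastSpace(string $text, int $limit): array
{
    if (strlen($text) <= $limit) {
        return [$text, null];            // nothing to move, like rows the WHERE clause skips
    }
    $head = substr($text, 0, $limit);    // same as left(Body, limit)
    $pos  = strrpos($head, ' ');         // 0-based offset of the last space, or false
    if ($pos === false) {
        $pos = $limit;                   // no space at all: hard cut at $limit, as the SQL does
    }
    // The second part starts at the space itself, so it keeps a leading
    // space, just like substring(Body, limit - instr(...) + 1).
    return [substr($text, 0, $pos), substr($text, $pos)];
}

[$first, $second] = splitAtLastSpace(
    "My name is Inigo Montoya! You've killed my father, prepare to die!",
    30
);
var_dump($first, $second);
```

Applying `ltrim()` to the second part would remove the leading space if the two parts do not need to concatenate back into the original text.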
    

    Full example

    create table `blogposts` (`Body` varchar(3000),`Body2`  varchar(3000));
    

    insert into blogposts (`Body`) values
    
     ('Hello darkness, my old friend'                          )
    ,('I''ve come to talk with you again'                      )
    ,('Because a vision softly creeping'                       )
    ,('Left its seeds while I was sleeping'                    )
    ,('And the vision that was planted in my brain'            )
    ,('Still remains'                                          )
    ,('Within the sound of silence'                            )
    ,('In restless dreams I walked alone'                      )
    ,('Narrow streets of cobblestone'                          )
    ,('''Neath the halo of a street lamp'                      )
    ,('I turned my collar to the cold and damp'                )
    ,('When my eyes were stabbed by the flash of a neon light' )
    ,('That split the night'                                   )
    ,('And touched the sound of silence'                       )
    ,('And in the naked light I saw'                           )
    ,('Ten thousand people, maybe more'                        )
    ,('People talking without speaking'                        )
    ,('People hearing without listening'                       )
    ,('People writing songs that voices never share'           )
    ,('And no one dared'                                       )
    ,('Disturb the sound of silence'                           )
    ;
    

    select  left(`Body`,30-instr(reverse(left(`Body`,30)),' '))         as Body
           ,substring(`Body`,30-instr(reverse(left(`Body`,30)),' ')+1)  as Body2
    
    from    `blogposts`
    
    where   char_length(Body) > 30
    ;
    

    +------------------------------+---------------------------+
    | Body                         | Body2                     |
    +------------------------------+---------------------------+
    | I've come to talk with you   | again                     |
    +------------------------------+---------------------------+
    | Because a vision softly      | creeping                  |
    +------------------------------+---------------------------+
    | Left its seeds while I was   | sleeping                  |
    +------------------------------+---------------------------+
    | And the vision that was      | planted in my brain       |
    +------------------------------+---------------------------+
    | In restless dreams I walked  | alone                     |
    +------------------------------+---------------------------+
    | 'Neath the halo of a street  | lamp                      |
    +------------------------------+---------------------------+
    | I turned my collar to the    | cold and damp             |
    +------------------------------+---------------------------+
    | When my eyes were stabbed by | the flash of a neon light |
    +------------------------------+---------------------------+
    | And touched the sound of     | silence                   |
    +------------------------------+---------------------------+
    | Ten thousand people, maybe   | more                      |
    +------------------------------+---------------------------+
    | People talking without       | speaking                  |
    +------------------------------+---------------------------+
    | People hearing without       | listening                 |
    +------------------------------+---------------------------+
    | People writing songs that    | voices never share        |
    +------------------------------+---------------------------+
    

    update  `blogposts`
    
    set     `Body2` = substring(`Body`,30-instr(reverse(left(`Body`,30)),' ')+1) 
           ,`Body`  = left(`Body`,30-instr(reverse(left(`Body`,30)),' '))        
    
    where   char_length(`Body`) > 30
    ;
    

    select  `Body`
           ,`Body2`
    
    from    `blogposts`
    
    where   `Body2` is not null
    ;
    

    +------------------------------+---------------------------+
    | Body                         | Body2                     |
    +------------------------------+---------------------------+
    | I've come to talk with you   | again                     |
    +------------------------------+---------------------------+
    | Because a vision softly      | creeping                  |
    +------------------------------+---------------------------+
    | Left its seeds while I was   | sleeping                  |
    +------------------------------+---------------------------+
    | And the vision that was      | planted in my brain       |
    +------------------------------+---------------------------+
    | In restless dreams I walked  | alone                     |
    +------------------------------+---------------------------+
    | 'Neath the halo of a street  | lamp                      |
    +------------------------------+---------------------------+
    | I turned my collar to the    | cold and damp             |
    +------------------------------+---------------------------+
    | When my eyes were stabbed by | the flash of a neon light |
    +------------------------------+---------------------------+
    | And touched the sound of     | silence                   |
    +------------------------------+---------------------------+
    | Ten thousand people, maybe   | more                      |
    +------------------------------+---------------------------+
    | People talking without       | speaking                  |
    +------------------------------+---------------------------+
    | People hearing without       | listening                 |
    +------------------------------+---------------------------+
    | People writing songs that    | voices never share        |
    +------------------------------+---------------------------+
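
The statements above cut at the last space before the limit, which can still land inside an HTML tag that contains spaces (for example between a tag name and its attributes). That is the second concern raised in the question. One possible approach, sketched here in PHP with a hypothetical helper `safeCutPoint`, is to move the cut to just before the tag when the last `<` before the cut has no `>` after it. The sketch assumes single-byte text and that `<` only ever opens a tag, which will not hold for every Markdown document.

```php
<?php
// Hypothetical helper: return a byte offset <= $limit at which $text can be
// cut without splitting a word or an HTML tag. Assumes '<' always opens a tag.
function safeCutPoint(string $text, int $limit): int
{
    if (strlen($text) <= $limit) {
        return strlen($text);                 // short enough: no cut needed
    }
    $head = substr($text, 0, $limit);
    $cut  = strrpos($head, ' ');              // last space within the limit
    if ($cut === false) {
        return $limit;                        // no space: fall back to a hard cut
    }
    $before = substr($text, 0, $cut);
    $open   = strrpos($before, '<');          // last tag opener before the cut
    $close  = strrpos($before, '>');          // last tag closer before the cut
    // An opener with no closer after it means the cut is inside a tag.
    if ($open !== false && ($close === false || $close < $open)) {
        $cut = $open;                         // move the cut to just before the tag
    }
    return $cut;
}

$text = 'Intro text <div class="note">body</div> and more';
$cut  = safeCutPoint($text, 20);
var_dump(substr($text, 0, $cut), substr($text, $cut));
```

If the unfinished tag starts at offset 0 the function returns 0, so a caller would need its own fallback for that case.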
    

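    Comments:

    • Thanks - I'm trying to run it, but I'm a bit confused about what's here. I thought I'd just change every @Body to my column name, but I get the error "Unknown column 'Body' in 'field list'", so I guess I need to change the set @Body to something else?
    • @AdamS.Cochran, I forgot to change 30 to 3000. Done now.
    • Thanks! I guess I also need to change the select to an update to make it a permanent change? From testing it seems to work well.
    • @AdamS.Cochran, yes. Please back up your table (create a copy of it) before updating.
    • Saving these results seems to need something more complex, since using the select functions in an update doesn't seem to be feasible?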
    Solution 2:

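    This code always splits the string into 3000-character pieces at fixed offsets (not at word boundaries) and pushes them into an array. You can use this block whatever the length of the text. Keep in mind that if your text has 3000 characters or fewer, the $bodyParts variable will contain only one element.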

    // $bodyText holds the post body fetched from SQL, e.g. SELECT Body FROM blogposts
    $bodyParts = [];
    $lengthOfBody = strlen($bodyText); // strlen() counts bytes; use mb_strlen()/mb_substr() for multibyte text
    if ($lengthOfBody > 3000) {
        $forLoopInt = (int) ceil($lengthOfBody / 3000); // e.g. a 3500-character body gives 2 parts
        for ($i = 0; $i <= $forLoopInt - 2; $i++) {
            $bodyParts[] = substr($bodyText, $i * 3000, 3000);
        }
        // fetch the last part
        $bodyParts[] = substr($bodyText, ($forLoopInt - 1) * 3000);
    } else {
        $bodyParts[] = $bodyText;
    }
    /* If the body has 3000 characters or fewer, $bodyParts contains a single element; otherwise it contains ceil(length of body / 3000) elements. */
    var_dump($bodyParts);
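
The block above cuts at fixed offsets, so a piece can end in the middle of a word. A variation that backs each cut up to the last space is sketched below. The function name `chunkAtWordBoundaries` is hypothetical, only the space character is treated as a word boundary, and lengths are counted in bytes (for multibyte text the `mb_*` equivalents would be needed).

```php
<?php
// Hypothetical helper: split $text into chunks of at most $limit bytes,
// ending each chunk at the last space that fits instead of mid-word.
function chunkAtWordBoundaries(string $text, int $limit): array
{
    if ($limit < 1) {
        throw new InvalidArgumentException('limit must be positive');
    }
    $chunks = [];
    while (strlen($text) > $limit) {
        // Look one byte past the limit so a space right after a full-length
        // chunk still counts as a boundary.
        $head = substr($text, 0, $limit + 1);
        $cut  = strrpos($head, ' ');
        if ($cut === false || $cut === 0) {
            $cut = $limit;                    // no usable space: hard cut so the loop advances
        }
        $chunks[] = substr($text, 0, $cut);
        $text     = ltrim(substr($text, $cut), ' ');  // drop the space(s) at the cut
    }
    if ($text !== '') {
        $chunks[] = $text;                    // the remainder fits in one chunk
    }
    return $chunks;                           // empty input gives an empty array
}

$parts = chunkAtWordBoundaries('Hello darkness, my old friend', 12);
var_dump($parts);
```

For the API in the question the call would use a limit of 3000; the small limit here just makes the behaviour easy to inspect.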
    
