【Question title】: How to tokenize a sentence using JavaScript
【Posted】: 2016-05-16 06:19:27
【Question】:

I am trying to tokenize the following sentence using JavaScript's split function.

  CHRIS NISWANDEE,
   (SMALLSYS INC,
   795 E DRAGRAM),
   TUCSON AZ 85705,
   USA

My expected result is:

 "chris","niswandee",",","(","smallsys","inc",",","795","e","dragram",")", ...

I can split at word boundaries followed by whitespace with the following code:

"CHRIS NISWANDEE, (SMALLSYS INC, 795 E DRAGRAM), TUCSON AZ 85705, USA".split(/\b\s+/)

Is there a way to keep the commas and parentheses in my result?

【Question comments】:

    Tags: javascript split tokenize


    【Solution 1】:

    It looks like you want to split on /\s+|\b/.

    That means: "any run of whitespace (\s+) or any word boundary (\b)".

    "CHRIS NISWANDEE, (SMALLSYS INC, 795 E DRAGRAM), TUCSON AZ 85705, USA".split(/\s+|\b/)
    

    Output

    ["CHRIS", "NISWANDEE", ",", "(", "SMALLSYS", "INC", ",", "795", "E", "DRAGRAM", "),", "TUCSON", "AZ", "85705", ",", "USA"]
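Note that the output above keeps ")," as a single token, because there is no word boundary between two punctuation characters. If every punctuation mark should be its own token, and the tokens should be lowercased as in the question's expected result, one possible alternative is to match tokens instead of splitting on separators. The sketch below is not part of the original answer; the helper name `tokenize` is hypothetical, and it assumes ASCII input, since `\w` in JavaScript only covers `[A-Za-z0-9_]`.

```javascript
// Hypothetical helper (not from the original answer): returns words and
// individual punctuation marks as separate, lowercased tokens.
function tokenize(text) {
  // \w+     : a run of ASCII letters, digits, or underscores
  // [^\w\s] : any single character that is neither a word character nor whitespace
  const tokens = text.match(/\w+|[^\w\s]/g);
  // match() returns null when nothing matches, so fall back to an empty array
  return (tokens || []).map((token) => token.toLowerCase());
}

const sentence =
  "CHRIS NISWANDEE, (SMALLSYS INC, 795 E DRAGRAM), TUCSON AZ 85705, USA";

console.log(tokenize(sentence));
// ["chris", "niswandee", ",", "(", "smallsys", "inc", ",", "795", "e",
//  "dragram", ")", ",", "tucson", "az", "85705", ",", "usa"]
```

With this approach whitespace is never part of a token, so runs of spaces or line breaks in the input do not produce empty strings.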
    

    【Comments】:
