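【Posted】: 2016-09-24 11:36:24
【Question】:
I want to get the most frequently used words from an array. The only problem is that the Swedish characters (Å, Ä, and Ö) are displayed only as �.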
$string = 'This is just a test post with the Swedish characters Å, Ä, and Ö. Also as lower cased characters: å, ä, and ö.';
echo '<pre>';
print_r(array_count_values(str_word_count($string, 1, 'àáãâçêéíîóõôúÀÁÃÂÇÊÉÍÎÓÕÔÚ')));
echo '</pre>';
This code outputs the following:
Array
(
[This] => 1
[is] => 1
[just] => 1
[a] => 1
[test] => 1
[post] => 1
[with] => 1
[the] => 1
[Swedish] => 1
[characters] => 2
[�] => 1
[�] => 1
[and] => 2
[�] => 1
[Also] => 1
[as] => 1
[lower] => 1
[cased] => 1
[�] => 1
[�] => 1
[�] => 1
)
How can I make it "see" Swedish characters and other special characters?
【Comments】:
-
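You shouldn't be surprised that any PHP function whose name starts with
str is not multibyte safe. The user comments in the manual suggest alternatives. -
@CBroe
...PHP function with a name starting with str... Where is this function? -
Try the function
mb_str_word_count instead of str_word_count: stackoverflow.com/a/17725577/6797531 -
@CatalinB Thanks, but the output will then look like this: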
Array([This is just a test post with the Swedish characters �, �, and Ö. Also as lower cased characters: �, �, and �.] => 1)
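One multibyte-aware alternative to str_word_count, as hinted at in the comments above, is to extract the words with a Unicode-aware regular expression. The following is only a minimal sketch, not the approach from the linked answer: the helper name count_words_utf8 is hypothetical, and it assumes that both the source file and the input string are encoded as UTF-8. It treats any run of Unicode letters (\p{L}) as a word, so apostrophes and hyphens split words, unlike str_word_count.

```php
<?php
// Hypothetical helper (not a PHP built-in): counts word occurrences
// in a UTF-8 string. Assumes the input is valid UTF-8.
function count_words_utf8(string $text): array
{
    // \p{L}+ matches a run of Unicode letters; the /u modifier makes
    // PCRE treat both the pattern and the subject as UTF-8.
    $found = preg_match_all('/\p{L}+/u', $text, $matches);
    if ($found === false || $found === 0) {
        // No words, or the subject was not valid UTF-8.
        return [];
    }
    return array_count_values($matches[0]);
}

$string = 'This is just a test post with the Swedish characters Å, Ä, and Ö. Also as lower cased characters: å, ä, and ö.';
$counts = count_words_utf8($string);
arsort($counts); // most frequently used words first

echo '<pre>';
print_r($counts);
echo '</pre>';
```

With this input, Å, Ä, Ö, å, ä, and ö should each appear as their own key with a count of 1. If the page is served in an encoding other than UTF-8, the browser may still render them as �, so the output charset has to match as well.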