How can I use encode utf-8 in Ruby? Answers

【Question title】: How can I use encode utf-8 in Ruby?
【Posted】: 2014-02-15 20:19:27
【Question description】:

I am trying to extract a word from the first line of a file:

LOCATION,Feij�,AC,a,b,c

like this:

2.0.0-p247 :005 > File.foreach(file).first

=> "LOCATION,Feij\xF3,AC,a,b,c\r\n"

But when I try to use split:

2.0.0-p247 :008 > File.foreach(file).first.split(",")

ArgumentError: invalid byte sequence in UTF-8
    from (irb):8:in `split'
    from (irb):8
    from /home/bleh/.rvm/rubies/ruby-2.0.0-p247/bin/irb:13:in `<main>'

What I expect to get is: Feijó

I have already tried many combinations, such as .encode and .force_encoding.

Any ideas?

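【Question comments】:

Can you try File.foreach(file, :encoding => 'utf-8').first ?
Hi Arup. Yes, I get Feij\xF3. When I try to split, I get the same error.
OK, now try File.foreach(file, :encoding => 'ascii-8:utf-8').first ?
warning: Unsupported encoding ascii-8 ignored
Try this: File.foreach(file, :encoding => 'windows-1252:utf-8').first. It looks like the file is encoded in the Latin-1 Supplement range.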

【Solution 1】:

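In the ISO-8859-1 encoding, the character ó is the single byte \xF3, so that is probably the file's encoding (it could also be CP-1252, which maps \xF3 to the same character).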
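To make the byte-level reasoning concrete, here is a minimal sketch. The helper name `latin1_to_utf8` is hypothetical and not part of the original answer; it only uses `String#force_encoding` and `String#encode` from Ruby core, and it assumes the bytes really are ISO-8859-1.

```ruby
# Hypothetical helper (not from the original answer): label raw bytes as
# ISO-8859-1, then transcode them to UTF-8.
def latin1_to_utf8(bytes)
  bytes.dup.force_encoding("ISO-8859-1").encode("UTF-8")
end

# Raw bytes as they sit in the file; String#b returns a binary copy.
raw = "Feij\xF3".b

# Interpreted as UTF-8, the lone byte 0xF3 is invalid, which is what
# makes split raise "invalid byte sequence in UTF-8".
puts raw.dup.force_encoding("UTF-8").valid_encoding?

word = latin1_to_utf8(raw)
puts word.valid_encoding?

# Hex dump of the transcoded bytes: the single byte F3 becomes C3 B3.
puts word.bytes.map { |b| format("%02X", b) }.join(" ")
```

Note that `force_encoding` only changes the label on the bytes, while `encode` actually rewrites them, so the order of the two calls matters.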

You can pass the encoding as an argument to File::foreach, and in the same argument ask Ruby to re-encode the data to UTF-8 for you:

File.foreach(file, :encoding => 'iso-8859-1:utf-8').first.split(",")
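The following is a self-contained sketch for trying this approach without the original file. The helper `first_row`, its `external` parameter, and the temporary file are assumptions made for illustration. The added `chomp` is also not in the answer; it keeps the trailing "\r\n" out of the last field. `Tempfile.create` needs Ruby 2.1 or later.

```ruby
require "tempfile"

# Hypothetical helper: read the first line of a file stored in the given
# external encoding, transcode it to UTF-8, and split it on commas.
def first_row(path, external = "iso-8859-1")
  File.foreach(path, encoding: "#{external}:utf-8").first.chomp.split(",")
end

# Build a throwaway file holding the same bytes as in the question.
Tempfile.create("latin1") do |f|
  f.binmode
  f.write("LOCATION,Feij\xF3,AC,a,b,c\r\n".b)
  f.flush

  row = first_row(f.path)
  p row
  p row[1].encoding
end
```

If the file turns out to be CP-1252 instead, passing "windows-1252" as the second argument should behave the same way for this particular byte.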

【Comments】: