【Question title】: Unable to load ARFF file in Weka
【Posted】: 2014-05-14 22:13:08
【Question description】:

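I am trying to open an ARFF file in Weka, but I get two errors.

First, the file is not recognized as an "Arff data files" file. Reason: premature end of file, read Token[EOL], line 3267.

If I then click "Use converter", with "?" as the missing value,

the second error is that CSVLoader failed to load the file. Reason: wrong number of values. Read 2, expected 1, read Token[EOF], line 3267.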

The file is:

https://www.dropbox.com/s/xs0ssnvs42bik5c/sg.arff

【Discussion】:

    Tags: weka arff


    【Solution 1】:

    Any ARFF file should contain commas between values, and yours does not. Are you sure this is a valid ARFF file?
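A quick way to check the comma requirement before opening the file in Weka is to count the values on each data row. The following is a minimal sketch, not part of Weka: the function name `check_arff_rows` and the sample text are made up for illustration, and it assumes the dense ARFF format with no quoted values that contain commas.

```python
def check_arff_rows(text):
    """Return (line_number, found, expected) for every data row whose
    comma-separated value count differs from the number of @attribute lines."""
    n_attributes = 0
    in_data = False
    problems = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and ARFF comments
            continue
        lower = line.lower()
        if not in_data:
            if lower.startswith("@attribute"):
                n_attributes += 1
            elif lower.startswith("@data"):
                in_data = True
            continue
        # Naive split: assumes no quoted values containing commas.
        found = len(line.split(","))
        if found != n_attributes:
            problems.append((line_no, found, n_attributes))
    return problems


# Hypothetical sample: the second data row uses a space instead of a comma.
sample = """@relation demo
@attribute a real
@attribute b real
@data
1.0,2.0
1.0 2.0
"""
print(check_arff_rows(sample))  # [(6, 1, 2)]
```

A row that is reported with fewer values than expected is a likely candidate for missing commas.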

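    Your ARFF file is invalid: attribute names are repeated. Each attribute needs to be declared only once. For example, running weka.core.Instances on your file reports the duplicated names: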

    set CLASSPATH=.;d:\tools\Weka-3-7\weka.jar
    d:\atilla\downloads>java weka.core.Instances sg.arff
    java.io.IOException: Unable to determine structure as arff (Reason: java.lang.IllegalArgumentException: Attribute names are not unique! Causes: 'campus' 'friend' 'homework' 'people' 'people' 'do' 'work' 'work' 'study' 'campus' 'people' 'people' 'life' 'learn' 'study' 'learn' 'put' 'study' 'learn' 'institute' 'get' 'put
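The same duplicates can be listed without Weka by scanning the header. This is a minimal sketch under simple assumptions: the helper name `duplicate_attribute_names` and the sample header are hypothetical, and attribute names are assumed not to contain spaces.

```python
from collections import Counter


def duplicate_attribute_names(text):
    """Return attribute names declared more than once, in order of first appearance."""
    names = []
    for raw in text.splitlines():
        line = raw.strip()
        lower = line.lower()
        if lower.startswith("@data"):
            break  # the header ends where the data section starts
        if lower.startswith("@attribute"):
            parts = line.split(None, 2)  # "@attribute", name, type
            if len(parts) >= 2:
                names.append(parts[1].strip("'\""))
    counts = Counter(names)
    return [name for name in counts if counts[name] > 1]


# Hypothetical header with repeated names, similar in spirit to the error above.
header = """@relation demo
@attribute campus real
@attribute friend real
@attribute people real
@attribute people real
@attribute campus real
@data
"""
print(duplicate_attribute_names(header))  # ['campus', 'people']
```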
    

    The following is a valid ARFF file built from your file.

    @relation sg-test
    @attribute campus real
    @attribute utilitarian real
    @attribute put real
    @attribute much real
    @attribute make real
    @attribute look real
    @attribute nice real
    @attribute people real
    @attribute busy real
    @attribute have real
    @attribute real real
    @attribute friendship real
    @attribute institute real
    @attribute end real
    @attribute pick real
    @attribute homework real
    @attribute friend real
    @attribute lose real
    @attribute way real
    @attribute crushed real
    @attribute lie real
    @attribute say real
    @attribute do real
    @attribute work real
    @attribute time real
    @attribute type real
    @attribute study real
    @attribute room real
    @attribute many real
    @attribute great real
    @attribute place real
    @attribute go real
    @attribute city real
    @attribute dull real
    @attribute Class {term,score}
    @data 
    0.0,0.041666666666666664,-0.019185326611942655,0.005523215037172114,0.0,0.012052341597796145,0.02062568512992925,0.0,-0.030000000000000006,0.708941605839416,0.0,0.12317518248175183,0.05020802460556254,-0.019147145462196667,0.125,0.0,0.0,-0.06617570128224504,0.0,0.10948905109489052,0.10948905109489052,0.0,-0.3490625485300618,0.00402808616500622,0.0,-0.125,0.0,-0.028925619834710748,0.006898734933282365,-0.019185326611942655,0.015740237951508994,0.015740237951508994,0.12091857471887278,0.0,term
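One possible way to get from a header with repeated names to a file like the one above is to keep only the first column for each name. This is only a sketch and not necessarily how the file above was produced: the function `drop_duplicate_columns` and the sample values are hypothetical, and whether to drop or merge repeated columns depends on what the values mean.

```python
def drop_duplicate_columns(names, rows):
    """Keep only the first column for each attribute name."""
    seen = set()
    keep = []  # indices of the columns to keep
    for index, name in enumerate(names):
        if name not in seen:
            seen.add(name)
            keep.append(index)
    unique_names = [names[i] for i in keep]
    unique_rows = [[row[i] for i in keep] for row in rows]
    return unique_names, unique_rows


# Hypothetical input: five declared attributes, two of them repeated.
names = ["campus", "friend", "people", "people", "campus"]
rows = [["0.0", "0.1", "0.2", "0.3", "0.4"]]
unique_names, unique_rows = drop_duplicate_columns(names, rows)
print(unique_names)  # ['campus', 'friend', 'people']
print(unique_rows)   # [['0.0', '0.1', '0.2']]
```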
    

    When I run the same command on this file, I get the following from Weka.

    Relation Name:  sg-test
    Num Instances:  1
    Num Attributes: 35
    
         Name                      Type  Nom  Int Real     Missing      Unique  Dist
       1 campus                     Num   0% 100%   0%     0 /  0%     1 /100%     1 
       2 utilitarian                Num   0%   0% 100%     0 /  0%     1 /100%     1 
       3 put                        Num   0%   0% 100%     0 /  0%     1 /100%     1 
       4 much                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
       5 make                       Num   0% 100%   0%     0 /  0%     1 /100%     1 
       6 look                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
       7 nice                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
       8 people                     Num   0% 100%   0%     0 /  0%     1 /100%     1 
       9 busy                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
      10 have                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
      11 real                       Num   0% 100%   0%     0 /  0%     1 /100%     1 
      12 friendship                 Num   0%   0% 100%     0 /  0%     1 /100%     1 
      13 institute                  Num   0%   0% 100%     0 /  0%     1 /100%     1 
      14 end                        Num   0%   0% 100%     0 /  0%     1 /100%     1 
      15 pick                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
      16 homework                   Num   0% 100%   0%     0 /  0%     1 /100%     1 
      17 friend                     Num   0% 100%   0%     0 /  0%     1 /100%     1 
      18 lose                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
      19 way                        Num   0% 100%   0%     0 /  0%     1 /100%     1 
      20 crushed                    Num   0%   0% 100%     0 /  0%     1 /100%     1 
      21 lie                        Num   0%   0% 100%     0 /  0%     1 /100%     1 
      22 say                        Num   0% 100%   0%     0 /  0%     1 /100%     1 
      23 do                         Num   0%   0% 100%     0 /  0%     1 /100%     1 
      24 work                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
      25 time                       Num   0% 100%   0%     0 /  0%     1 /100%     1 
      26 type                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
      27 study                      Num   0% 100%   0%     0 /  0%     1 /100%     1 
      28 room                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
      29 many                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
      30 great                      Num   0%   0% 100%     0 /  0%     1 /100%     1 
      31 place                      Num   0%   0% 100%     0 /  0%     1 /100%     1 
      32 go                         Num   0%   0% 100%     0 /  0%     1 /100%     1 
      33 city                       Num   0%   0% 100%     0 /  0%     1 /100%     1 
      34 dull                       Num   0% 100%   0%     0 /  0%     1 /100%     1 
      35 Class                      Nom 100%   0%   0%     0 /  0%     1 /100%     1 
    

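    【Comments】:

    • Sorry, I posted the old file by mistake. Here is the ARFF file: dropbox.com/s/xs0ssnvs42bik5c/sg.arff
    • Hi atilla, can you suggest any solution? It is urgent for me to continue my project.
    • Thank you very much for your reply. The file now has fewer attributes than the one I sent. Can it be used for classification in Weka? Can I use Naive Bayes or SVM? Could you help me with this?
    • I suggest going through a basic Weka tutorial first, then coming back to your problem.
    • Sure. I will contact you again if my problem is not solved. Thank you.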