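This post does not cover the theory; it only deals with the code implementation:
1. Entropy formula:
P is the proportion of positive examples in S, and Q is the proportion of negative examples.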
Entropy(S) = -P * log2(P) - Q * log2(Q);
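As a minimal sketch, the entropy formula can be written in Java as follows. The class name EntropySketch and its method names are hypothetical and are not part of the original code; the sketch also assumes the usual convention that a term with zero proportion contributes 0.

```java
// Hypothetical helper for illustration; not part of the original post's code.
final class EntropySketch {
    private EntropySketch() {}

    /** Base-2 logarithm, built from the natural logarithm in java.lang.Math. */
    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    /** Entropy of a set given its counts of positive and negative examples. */
    static double entropy(int positives, int negatives) {
        int total = positives + negatives;
        if (total == 0) {
            return 0.0; // assumption: an empty set has zero entropy
        }
        double p = (double) positives / total;
        double q = (double) negatives / total;
        double h = 0.0;
        // Skip zero proportions so that 0 * log2(0) is treated as 0.
        if (p > 0) {
            h -= p * log2(p);
        }
        if (q > 0) {
            h -= q * log2(q);
        }
        return h;
    }
}
```

For a set with 9 positive and 5 negative examples, entropy(9, 5) is approximately 0.940.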
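2. Information gain:
Gain(S,A) = Entropy(S) - Σ (|Sv|/|S|) * Entropy(Sv);
The sum runs over every value v of attribute A, and Sv is the subset of S in which A takes the value v.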
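The information gain of a single attribute can be sketched in Java as below: the entropy of each subset Sv is weighted by |Sv|/|S|, and the weighted sum is subtracted from Entropy(S). The class InfoGainSketch, the method gain, and the positiveLabel parameter are hypothetical names chosen for illustration, and the sketch assumes a two-class label column.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the original post's code.
final class InfoGainSketch {
    private InfoGainSketch() {}

    /** Entropy of a set given its counts of positive and negative examples. */
    static double entropy(int positives, int negatives) {
        int total = positives + negatives;
        if (total == 0) {
            return 0.0;
        }
        double h = 0.0;
        for (int count : new int[] {positives, negatives}) {
            if (count > 0) {
                double p = (double) count / total;
                h -= p * Math.log(p) / Math.log(2.0);
            }
        }
        return h;
    }

    /** Gain(S, A) for one attribute column against a two-class label column. */
    static double gain(String[] attribute, String[] labels, String positiveLabel) {
        // For each attribute value v, keep [positive count, negative count] of Sv.
        Map<String, int[]> counts = new HashMap<>();
        int totalPositives = 0;
        for (int i = 0; i < attribute.length; i++) {
            int[] c = counts.computeIfAbsent(attribute[i], k -> new int[2]);
            if (labels[i].equals(positiveLabel)) {
                c[0]++;
                totalPositives++;
            } else {
                c[1]++;
            }
        }
        int n = attribute.length;
        double result = entropy(totalPositives, n - totalPositives);
        // Subtract the weighted entropy (|Sv| / |S|) * Entropy(Sv) of each subset.
        for (int[] c : counts.values()) {
            result -= (double) (c[0] + c[1]) / n * entropy(c[0], c[1]);
        }
        return result;
    }
}
```

Applied to the Outlook and PlayTennis rows of the example data, with "Yes" as the positive label, this gives a gain of roughly 0.247.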
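Example:
The data is transposed for input. The first line gives the number of rows (5) and the number of samples (14); each following row is a name followed by its 14 values: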
5 14
Outlook Sunny Sunny Overcast Rain Rain Rain Overcast Sunny Sunny Rain Sunny Overcast Overcast Rain
Temperature Hot Hot Hot Mild Cool Cool Cool Mild Cool Mild Mild Mild Hot Mild
Humidity High High High High Normal Normal Normal High Normal Normal Normal High Normal High
Wind Weak Strong Weak Weak Weak Strong Strong Weak Weak Weak Strong Strong Weak Strong
PlayTennis No No Yes Yes Yes No Yes No Yes Yes Yes Yes Yes No
Outlook Temperature Humidity Wind PlayTennis
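One way to read this input is sketched below. It assumes the layout is exactly as shown: two integers (row count and sample count), then one row per attribute starting with its name. The class InputSketch and the method parse are hypothetical, and anything after the declared rows, such as the final line of names, is simply ignored by this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Scanner;

// Hypothetical reader for illustration; not part of the original post's code.
final class InputSketch {
    private InputSketch() {}

    /** Parses the transposed input into a map from row name to its sample values. */
    static Map<String, String[]> parse(String text) {
        Scanner in = new Scanner(text);
        int rows = in.nextInt();     // number of rows, 5 in the example
        int samples = in.nextInt();  // values per row, 14 in the example
        // LinkedHashMap keeps the rows in the order they appear in the input.
        Map<String, String[]> columns = new LinkedHashMap<>();
        for (int r = 0; r < rows; r++) {
            String name = in.next();
            String[] values = new String[samples];
            for (int i = 0; i < samples; i++) {
                values[i] = in.next();
            }
            columns.put(name, values);
        }
        return columns;
    }
}
```

With the example data, parse(...).get("PlayTennis") would hold the 14 Yes/No labels in sample order.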
package com.qunar.data.tree;

/**
 * *********************************************************
 * <p/>
 * Author: XiJun.Gong
 * Date: 2016-09-02 15:28
 * Version: default 1.0.0
 * Class description:
 * <p>Counts how many times a value of this type occurs.</p>
 * <p/>
 * *********************************************************
 */
public class CountMap<T> {

    private T key;      // the value being counted
    private int value;  // number of occurrences

    public CountMap() {
        this(null, 0);
    }

    public CountMap(T key, int value) {
        this.key = key;
        this.value = value;
    }

    public T getKey() {
        return key;
    }

    public void setKey(T key) {
        this.key = key;
    }

    public int getValue() {
        return value;
    }

    public void setValue(int value) {
        this.value = value;
    }
}