Document Image Binarization with Fully Convolutional Neural Networks (FCN-based document image binarization)

FCNs

1.abstract
The paper proposes an FCN for binarization. The FCN is trained to optimize a continuous version of the Pseudo F-measure metric, and an ensemble of FCNs outperforms the competition winners on 4 of 7 DIBCO competitions.
It also notes that the model performs well on Palm Leaf Manuscripts.

2.introduction
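Argues that a document should be binarized before it is recognized.
The proposed FCN binarizes a wide range of document images without any parameter tuning.
Reviews traditional methods and points out that FCNs learn from training data to exploit the spatial arrangements of pixels without relying on a handcrafted bias on local shapes.
Contributions of the paper:
1.Proposes FCNs, and a specific architecture, for binarization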
2.We show that directly optimizing the proposed continuous Pseudo F-measure exceeds the previous state-of-the-art on DIBCO competition data.
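3.Concludes from learning curves that data diversity matters more than data quantity
4.Shows that the FCN performs better when it is given more input features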

3.related work
Summarizes prior methods
4.methods
A.FULLY CONVOLUTIONAL NETWORKS
Describes the architecture, the input and output sizes of each layer, and the ReLU function
B.MULTI-SCALE
Proposes that combining inputs at different scales improves performance
Describes the upsampling and downsampling structure
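To make the multi-scale idea concrete, here is a minimal NumPy sketch. It is not the paper's architecture: the function names, the scale factors (1, 2, 4), average pooling, and nearest-neighbour upsampling are all assumptions chosen for illustration, and the convolution layers that a real FCN would apply at each scale are omitted.

```python
import numpy as np

def downsample(x, factor):
    """Average-pool a (H, W) array by an integer factor (H and W must be divisible by it)."""
    h, w = x.shape
    return x.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def upsample(x, factor):
    """Nearest-neighbour upsampling of a (H, W) array by an integer factor."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

def multi_scale_stack(image, factors=(1, 2, 4)):
    """Hypothetical helper: view the image at several scales, resize each view
    back to full resolution, and stack the views along a new channel axis."""
    branches = [upsample(downsample(image, f), f) for f in factors]
    return np.stack(branches, axis=0)  # shape: (len(factors), H, W)

image = np.arange(16, dtype=float).reshape(4, 4)
stack = multi_scale_stack(image)
print(stack.shape)
```

Each channel of the result has the full image size, so a later layer can combine fine detail (factor 1) with coarser context (factors 2 and 4) at every pixel.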
C.Pseudo F-measure Loss
(I could not follow this section.)
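As a reading aid for this section, here is a rough sketch of what a continuous (differentiable) F-measure loss can look like. It is an assumption-laden illustration, not the paper's exact formulation: the function name, the `eps` term, and the optional weight maps are hypothetical. With uniform weights it is an ordinary soft F-measure; the Pseudo F-measure additionally weights recall and precision per pixel using maps derived from the ground truth, and how those maps are computed is not shown here.

```python
import numpy as np

def soft_f_measure_loss(pred, target, recall_weights=None,
                        precision_weights=None, eps=1e-8):
    """Return 1 - soft F-measure.

    pred   : predicted foreground probabilities in [0, 1]
    target : ground-truth labels, 1 = foreground, 0 = background
    The weight maps are assumed inputs; None means uniform weights.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    rw = np.ones_like(pred) if recall_weights is None else np.asarray(recall_weights, dtype=float)
    pw = np.ones_like(pred) if precision_weights is None else np.asarray(precision_weights, dtype=float)

    # Soft recall: weighted share of ground-truth foreground that is predicted.
    recall = (rw * pred * target).sum() / ((rw * target).sum() + eps)
    # Soft precision: weighted share of predicted foreground that is correct.
    precision = (pw * pred * target).sum() / ((pw * pred).sum() + eps)

    f_measure = 2.0 * precision * recall / (precision + recall + eps)
    return 1.0 - f_measure

target = np.array([[1, 0], [0, 1]])
print(soft_f_measure_loss(np.full((2, 2), 0.5), target))
```

Because probabilities are used in place of hard 0/1 decisions, the loss changes smoothly with the network output, which is what allows it to be optimized directly by gradient descent.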
D.Datasets and Metrics
Two datasets are used: DIBCO [1]–[7] and Palm Leaf Manuscripts (PLM)
E.Implementation Details
I did not fully follow this part; it covers the training details

5.experiments
A.loss functions
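Four loss functions are compared: P-FM, FM, P-FM + FM, and Cross Entropy (CE)
P-FM on its own seems to have a problem (predicting border pixels as background, if I read it correctly), so it is combined with FM as P-FM + FM. CE loss is a standard classification based loss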
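For reference, the CE baseline is the usual per-pixel binary cross entropy. The sketch below is the textbook definition in NumPy, not code from the paper; the function name and the `eps` clipping value are my own choices.

```python
import numpy as np

def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean per-pixel binary cross entropy.

    pred   : predicted foreground probabilities in [0, 1]
    target : ground-truth labels, 1 = foreground, 0 = background
    """
    # Clip so that log(0) never occurs.
    pred = np.clip(np.asarray(pred, dtype=float), eps, 1.0 - eps)
    target = np.asarray(target, dtype=float)
    losses = -(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))
    return float(losses.mean())

target = np.array([[1, 0], [0, 1]])
print(binary_cross_entropy(np.full((2, 2), 0.5), target))
```

Unlike an F-measure loss, CE scores every pixel independently and gives foreground and background pixels equal weight.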
B.DIBCO performance
Compares the authors' FCNs against the competition winners
C.Architecture Search
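The authors varied several architectural hyperparameters (depth, width, kernel, scale) and found little improvement; they conclude that changing the dataset helps more than changing the architecture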
D.How Much Data is Enough?
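More data may improve results, but too much can be counterproductive; the authors plot learning curves against the amount of training data
Argues that a diverse training set is crucial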
E. Input Features
Points out the importance of Relative Darkness (RD) features
We used RD features with a window size of 5x5 and a similarity threshold of ±10 in all experiments in this paper.
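To show how the window size and similarity threshold fit together, here is a sketch of RD-style features based on my understanding of the idea: for each pixel, count how many neighbours in the window are darker, similar (within the threshold), or lighter. The exact definition in the paper may differ. The function name, the edge-replication padding, and the choice to skip the centre pixel are assumptions.

```python
import numpy as np

def relative_darkness(gray, window=5, threshold=10):
    """Return a (3, H, W) array of neighbour counts: darker, similar, lighter.

    gray      : 2-D grayscale image (e.g. values 0..255)
    window    : side length of the square neighbourhood (odd)
    threshold : neighbours within +/- threshold of the centre count as similar
    """
    gray = np.asarray(gray, dtype=np.int32)
    r = window // 2
    padded = np.pad(gray, r, mode="edge")  # assumed border handling
    h, w = gray.shape
    darker = np.zeros((h, w), dtype=np.int32)
    similar = np.zeros((h, w), dtype=np.int32)
    lighter = np.zeros((h, w), dtype=np.int32)
    for dy in range(window):
        for dx in range(window):
            if dy == r and dx == r:
                continue  # skip the centre pixel itself (assumption)
            # Intensity of the neighbour at this offset minus the centre pixel.
            diff = padded[dy:dy + h, dx:dx + w] - gray
            darker += diff < -threshold
            similar += np.abs(diff) <= threshold
            lighter += diff > threshold
    return np.stack([darker, similar, lighter], axis=0)

img = np.full((7, 7), 200)
img[3, 3] = 0  # one dark "ink" pixel on a light background
features = relative_darkness(img)
print(features[:, 3, 3])
```

With the 5x5 window and threshold of 10 quoted above, each pixel has 24 neighbours, so the three counts at any pixel always sum to 24.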


6.conclusion
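First, we combine the P-FM and FM loss functions and train FCNs for document image binarization
Second, the diversity of the training set matters far more than other factors
Finally, we analyzed using additional features as input to the FCN and found that Relative Darkness features [26] and the output of Howe binarization [9] perform best.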
