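【Posted】: 2019-11-11 22:30:02
【Question】:
I am annotating a dataset for a computer vision application. I have the normalized coordinates (xmin, ymin, xmax, ymax) stored in XML files.
A complete XML file looks like this: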
<annotation>
<folder>image</folder>
<filename>100_icdar13.png</filename>
<path>/Users/image/100_icdar13.png</path>
<source>
<database>Unknown</database>
</source>
<size>
<width>816</width>
<height>608</height>
<depth>3</depth>
</size>
<segmented>0</segmented>
<object>
<name>text</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>192</xmin>
<ymin>157</ymin>
<xmax>530</xmax>
<ymax>223</ymax>
</bndbox>
</object>
<object>
<name>text</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>561</xmin>
<ymin>159</ymin>
<xmax>645</xmax>
<ymax>219</ymax>
</bndbox>
</object>
<object>
<name>text</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>74</xmin>
<ymin>247</ymin>
<xmax>465</xmax>
<ymax>311</ymax>
</bndbox>
</object>
<object>
<name>text</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>493</xmin>
<ymin>255</ymin>
<xmax>625</xmax>
<ymax>305</ymax>
</bndbox>
</object>
<object>
<name>text</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>85</xmin>
<ymin>339</ymin>
<xmax>496</xmax>
<ymax>400</ymax>
</bndbox>
</object>
</annotation>
I want to denormalize this dataset and export all of the boxes in the following format:
x1, y1, x2, y2, x3, y3, x4, y4, text
How can I do this, and what algorithm could I use?
【Discussion】:
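One possible approach, sketched under a few assumptions: parse each file with Python's standard `xml.etree.ElementTree`, read every `<bndbox>`, and expand the two corners into four points. The function name `voc_to_quads` and the `normalized` flag are hypothetical names chosen for illustration. The sketch assumes the four points are listed clockwise starting at the top-left corner, and that the trailing `text` field is the object's `<name>` value. Note that the values in the sample above (for example `xmax` of 530 in an 816-pixel-wide image) already look like pixel coordinates, so scaling is only needed if your files really store values in the range 0 to 1.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the annotation above, with a single object.
SAMPLE_XML = """
<annotation>
  <size><width>816</width><height>608</height><depth>3</depth></size>
  <object>
    <name>text</name>
    <bndbox><xmin>192</xmin><ymin>157</ymin><xmax>530</xmax><ymax>223</ymax></bndbox>
  </object>
</annotation>
"""


def voc_to_quads(xml_text, normalized=False):
    """Return one 'x1, y1, x2, y2, x3, y3, x4, y4, label' line per <object>.

    normalized=True is an assumption for files whose coordinates are
    fractions of the image size; they are then scaled by width and height.
    """
    root = ET.fromstring(xml_text)
    width = float(root.findtext("size/width"))
    height = float(root.findtext("size/height"))

    lines = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        if normalized:
            # Denormalize: fractions of the image size become pixels.
            xmin, xmax = xmin * width, xmax * width
            ymin, ymax = ymin * height, ymax * height

        # Corners clockwise from the top-left (assumed ordering).
        corners = [xmin, ymin, xmax, ymin, xmax, ymax, xmin, ymax]
        coords = ", ".join(str(int(round(v))) for v in corners)
        lines.append(coords + ", " + label)
    return lines


for line in voc_to_quads(SAMPLE_XML):
    print(line)
```

To process a whole dataset, the same function could be applied to the contents of each XML file and the resulting lines written to one text file per image. If the target format expects a different corner order, only the `corners` list needs to change.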
Tags: python tensorflow machine-learning computer-vision object-detection