【Question title】: Deep level xml parsing to csv using NodeJS
【Posted】: 2017-11-09 18:24:09
【Question】:

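I have a medium-sized XML file (about 5 MB) that needs to be converted to CSV.

I obviously don't want to reinvent the wheel, so I'm using a two-step approach: (1) XML to JSON, (2) JSON to CSV.

My current code is: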

const xml_obj = {}
const htt = require('http-status-code-node');
var fs = require('fs'); 
var xml2js = require('xml2js');
var converter = require('json-2-csv'); 

xml_obj["convert"] = (req, res, next) => { 
  var parser = new xml2js.Parser();
  fs.readFile(__dirname + '/directoryexport.xml', function (err, data) {   
    parser.parseString(data, function (err, result) { 
      console.log('Done'); 
      var callback = function (err, ycsv) {
        if (err) return console.log(err); 
        ///
        res.setHeader('Content-Disposition', 'attachment; filename=testing.csv');
        res.set('Content-Type', 'text/csv');
        res.status(200).send(result);
        ///
      }
      var documents = [];      
      documents.push(result)
      converter.json2csv(documents, callback);      
    })
  });
} 
module.exports = xml_obj.convert

However, the nested XML yields a multi-level JSON, and that produces a single string instead of a properly delimited CSV.

The current output CSV

The Original xml

The XML structure

The Json I get on converting xml

Also, according to the documentation of the JSON-to-CSV converter, if the input JSON is properly structured, for example:

[
    {
        Make: 'Nissan',
        Model: 'Murano',
        Year: '2013',
        Specifications: {
            Mileage: '7106',
            Trim: 'S AWD'
        }
    },
    {
        Make: 'BMW',
        Model: 'X5',
        Year: '2014',
        Specifications: {
            Mileage: '3287',
            Trim: 'M'
        }
    }
];

then it produces a very nicely formatted CSV, as shown here: Example Perfect CSV From JSON
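To illustrate why that input converts cleanly: nested plain objects can be turned into dotted column names, which stays tabular only as long as the nested values are objects rather than arrays. The sketch below is not json-2-csv's actual implementation; `flattenObject`, `escapeCsv` and `toCsv` are hypothetical helpers written for this example.

```javascript
// Hypothetical helper: flatten nested plain objects into dotted keys.
function flattenObject(obj, prefix = '') {
  const flat = {};
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
      Object.assign(flat, flattenObject(value, path));
    } else {
      flat[path] = value;
    }
  }
  return flat;
}

// Hypothetical helper: quote a field only if it contains a comma, quote or newline.
function escapeCsv(value) {
  const text = value == null ? '' : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Hypothetical helper: build a CSV string from an array of (possibly nested) records.
function toCsv(records) {
  const rows = records.map((record) => flattenObject(record));
  // The header is the union of all keys, in first-seen order.
  const header = [];
  const seen = new Set();
  for (const row of rows) {
    for (const key of Object.keys(row)) {
      if (!seen.has(key)) {
        seen.add(key);
        header.push(key);
      }
    }
  }
  const lines = rows.map((row) => header.map((col) => escapeCsv(row[col])).join(','));
  return [header.map(escapeCsv).join(','), ...lines].join('\n');
}

const cars = [
  { Make: 'Nissan', Model: 'Murano', Year: '2013', Specifications: { Mileage: '7106', Trim: 'S AWD' } },
  { Make: 'BMW', Model: 'X5', Year: '2014', Specifications: { Mileage: '3287', Trim: 'M' } }
];

const csv = toCsv(cars);
console.log(csv);
```

With arrays anywhere in the input (as in a parsed XML tree), this kind of flattening no longer yields one row per record, which is consistent with the single-string output described above.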

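Edit 1: The format I'm looking for is something like the following. It is very important to capture all the parent organization and organizational unit details for each person node. For example,

organizationalUnit UUID "b3b05b77-a8a7-43ed-ab74-b7d898c60296" should generate CSV rows like: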

"Mr Shayne Howard","Howard","Shayne","Mr","","Branch Manager","(02) 6121 5492","","Level 1, 12 Mort Street, Canberra, ACT, 2601","Shayne.Howard@employment.gov.au","","b43e0864-1b9a-40f0-8049-c90af5f9141c","","GPO Box 9880 CANBERRA ACT 2601 Australia",1392,"","Department of Employment","","1300 488 064","","","","http://www.employment.gov.au","GPO Box 9880, Canberra ACT 2601","EMPLOYMENT"
"Mr Luke de Jong","De Jong","Luke","Mr","","Branch Manager, General Counsel","(02) 6240 0909",""(02) 6123 5100"","","Luke.deJong@employment.gov.au","","58a503a8-ce8b-41c0-b690-b9f9efd98a89","","GPO Box 9880 CANBERRA ACT 2601",1393,"","Department of Employment","","1300 488 064","","","","http://www.employment.gov.au","GPO Box 9880, Canberra ACT 2601","EMPLOYMENT"
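One way to read this requirement is to walk the tree and carry each ancestor's details down until a person is reached, so that every person yields one flat record. Below is a minimal sketch of that idea. It assumes a simplified tree shape (`name`, `units`, `persons` keys) invented for this example, not the real structure of the export, and the unit names are made up.

```javascript
// Assumed, simplified shape: a node may have `name`, child `units`, and `persons`.
function collectPersons(node, ancestors = []) {
  const path = [...ancestors, node.name];
  const rows = [];
  for (const person of node.persons || []) {
    const row = { ...person };
    // Copy every ancestor name into a numbered column; level_0 is the root.
    path.forEach((name, depth) => {
      row[`level_${depth}_name`] = name;
    });
    rows.push(row);
  }
  for (const unit of node.units || []) {
    rows.push(...collectPersons(unit, path));
  }
  return rows;
}

const tree = {
  name: 'Department of Employment',
  units: [
    {
      name: 'Example Group', // hypothetical unit name
      persons: [{ fullName: 'Mr Shayne Howard', title: 'Branch Manager' }],
      units: [
        {
          name: 'Example Branch', // hypothetical unit name
          persons: [{ fullName: 'Mr Luke de Jong', title: 'Branch Manager, General Counsel' }]
        }
      ]
    }
  ]
};

const personRows = collectPersons(tree);
console.log(personRows);
```

Persons at different depths end up with a different number of `level_N_name` columns, so the CSV header has to be the union of all keys, with empty cells where a level does not exist.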

Edit 2: Flattening the JSON is a good idea, but it doesn't capture all of the data. I'm using the camaro Node.js module with the following template:

persons: ['//person', {
      root_organization_name: '../../../../name',
      main_organization_name: '../../../name',
      main_organization_website: '../../../website',
      fullName: 'fullName',
      familyName: 'familyName',
      firstName: 'firstName',
      personalTitle: 'personalTitle',
      title: 'title',
      person_phone: 'phone',
      person_location: 'location',
      person_fax: 'fax',
      otherRolesDN: 'otherRolesDN',
      person_mail: 'mail',
      informationPublicationScheme: '../informationPublicationScheme',
      publications: '../../publications',
      annualReport: '../../annualReport',
      mediaReleases: '../../mediaReleases',
      organizationUnit_1_name: '../../name',
      organizationUnit_1_description: '../../description',
      organizationUnit_1_location: '../../location',
      organizationUnit_1_phone: '../../phone',
      organizationUnit_1_fax: '../../fax',
      organizationUnit_1_website: '../../website',
      organizationUnit_2_name: '../name',
      organizationUnit_2_location: '../location',
      organizationUnit_2_phone: '../phone',
      organizationUnit_2_fax: '../fax',
      organizationUnit_2_website: '../website',
      occupantName: './role/occupantName',
      roleName: './role/roleName',
      occupantUUID: './role/occupantUUID',
      role_phone: './role/phone',
      role_fax: './role/fax',
      role_location: './role/location',
      role_mail: './role/ mail'
    }]

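How can I get the array of roles? The current CSV also has some rows with data in the wrong columns:

Wrong csv after camaro

Any tips on how to do this with my input would be appreciated.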

【Comments】:

  • Can you clarify what output format you're looking for?
  • Updated the question with that, @TuanAnhTran
  • Try the code below and see if it helps.

Tags: json node.js xml csv


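【Solution 1】:

Because the resulting JSON structure is deeply nested, you have to flatten it if you want it to convert to CSV correctly.

It looks like you're only interested in the deepest level. Here is an example; feel free to add to the template if you want more data.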

const transform = require('camaro')
const tocsv = require('json2csv')
const fs = require('fs')

const xml = fs.readFileSync('so.xml', 'utf-8')
const template = {
    persons: ['//person', {
        root_organization_name: '../../../../name',
        main_organization_name: '../../../name',
        main_organization_website: '../../../website',
        fullName: 'fullName',
        familyName: 'familyName',
        firstName: 'firstName',
        personalTitle: 'personalTitle',
        title: 'title',
        person_phone: 'phone',
        person_location: 'location',
        person_fax: 'fax',
        otherRolesDN: 'otherRolesDN',
        person_mail: 'mail',
        informationPublicationScheme: '../informationPublicationScheme',
        publications: '../../publications',
        annualReport: '../../annualReport',
        mediaReleases: '../../mediaReleases',
        organizationUnit_1_name: '../../name',
        organizationUnit_1_description: '../../description',
        organizationUnit_1_location: '../../location',
        organizationUnit_1_phone: '../../phone',
        organizationUnit_1_fax: '../../fax',
        organizationUnit_1_website: '../../website',
        organizationUnit_2_name: '../name',
        organizationUnit_2_location: '../location',
        organizationUnit_2_phone: '../phone',
        organizationUnit_2_fax: '../fax',
        organizationUnit_2_website: '../website',
        roles: ['../role', {
            occupantName: 'occupantName',
            roleName: 'roleName',
            occupantUUID: 'occupantUUID',
            role_phone: 'phone',
            role_fax: 'fax',
            role_location: 'location',
            role_mail: 'mail'
        }]
    }]
}

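const result = transform(xml, template)
// The template's only top-level key is `persons`, so print that key
// (`result.roles` would be undefined; roles are nested under each person)
console.log(JSON.stringify(result.persons, null, 4))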

Sample JSON output (use json2csv to convert it to CSV if needed)
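Since the template nests a `roles` array inside every person, one possible last step before CSV conversion is to emit one flat row per person-role pair. This is a minimal sketch on hand-written sample data in roughly the shape the template should yield. `expandRoles` is a hypothetical helper, not part of camaro or json2csv, and the role values are invented for illustration.

```javascript
// Hypothetical helper: turn [{...person, roles: [...]}] into one flat row per role.
function expandRoles(persons) {
  const rows = [];
  for (const person of persons) {
    // Separate the nested array from the scalar person fields.
    const { roles = [], ...personFields } = person;
    if (roles.length === 0) {
      // Keep persons without roles as a single row.
      rows.push(personFields);
      continue;
    }
    for (const role of roles) {
      rows.push({ ...personFields, ...role });
    }
  }
  return rows;
}

// Sample data only; the role values are made up.
const samplePersons = [
  {
    fullName: 'Mr Shayne Howard',
    person_phone: '(02) 6121 5492',
    roles: [{ roleName: 'Branch Manager', role_phone: '' }]
  },
  {
    fullName: 'Mr Luke de Jong',
    person_phone: '(02) 6240 0909',
    roles: []
  }
];

const flatRows = expandRoles(samplePersons);
console.log(flatRows);
```

The resulting array contains no nested values, so each key maps to exactly one CSV column when it is handed to a JSON-to-CSV converter.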

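【Comments】:

  • This is nice, but it only gives the deepest nodes of the JSON. Can you tweak it so that all levels of the XML end up in one flat structure?
  • In my sample code I showed how to get data from the upper-level nodes. Basically you can use ../ to go up one level, like this: organizationUnit: '../name'
  • If you look, data from the phone number is leaking into the name column, and there are similar misalignments. Please check the highlighted fields.
  • @SaleemAhmed See the updated answer for the roles. It looks like roles and persons are at the same level?
  • @SaleemAhmed I don't see anything like that in my output. Can you check the output I got here: drive.google.com/file/d/0B5lnD5gfVd69emd5NjRpNWFyVUU/…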