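【Question title】: Convert nested/complex JSON to CSV: not getting the expected output
【Posted】: 2020-04-18 12:33:15
【Question】:

The input JSON is below. It is a small part of the real data; the real JSON is much longer, has more levels of nesting, and runs to over 30k lines.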

  {
  "data": {
    "getUsers": [
      {
        "userProfileDetail": {
          "userStatus": {
            "name": "Expired"
          },
          "userStatusDate": "2017-04-04T07:48:25+00:00",
          "lastAttestationDate": "2019-02-01T03:50:42.6049634-05:00"
        },
        "userInformation": {
          "Id": 13610875,
          "lastName": "************",
          "suffix": null,
          "gender": "FEMALE",
          "birthDate": "1970-01-01T00:01:00+00:00",
          "ssn": "000000000",
          "ethnicity": "INVALID_REFERENCE_VALUE",
          "languagesSpoken": null,
          "personalEmail": null,
          "otherNames": null,
          "userType": {
            "name": "APN"
          },
          "primaryuserState": "CO",
          "otheruserState": [
            "CO"
          ],
          "practiceSetting": "INPATIENT_ONLY",
          "primaryEmail": "*****@*****.com"
        }
      },
      {
        "userProfileDetail": {
          "userStatus": {
            "name": "Expired newwwwwwwwwwww"
          },
          "userStatusDate": "2017-04-04T07:48:25+00:00",
          "lastAttestationDate": "2019-02-01T03:50:42.6049634-05:00"
        },
        "userInformation": {
          "Id": 13610875,
          "lastName": "************",
          "suffix": null,
          "gender": "FEMALE",
          "birthDate": "1970-01-01T00:01:00+00:00",
          "ssn": "000000000",
          "ethnicity": "INVALID_REFERENCE_VALUE",
          "languagesSpoken": null,
          "personalEmail": null,
          "otherNames": null,
          "userType": {
            "name": "APN"
          },
          "primaryuserState": "CO",
          "otheruserState": [
            "CO"
          ],
          "practiceSetting": "INPATIENT_ONLY",
          "primaryEmail": "*****@*****.com"
        }
      }
    ]
  }
}

The code is:

var obj = JObject.Parse(json);
            // Collect column titles: all property names whose values are of type JValue, distinct, in order of encountering them.
            var jsonValues = obj.DescendantsAndSelf().OfType<JProperty>().Where(p => p.Value is JValue).GroupBy(p => p.Name).ToList();
            var jsonKey = jsonValues.Select(g => g.Key).ToArray();

            // Filter JObjects that have child objects that have values.
            var parentsWithChildren = jsonValues.SelectMany(g => g).SelectMany(v => v.AncestorsAndSelf().OfType<JObject>().Skip(1)).ToHashSet();

            // Collect all data rows: for every object, go through the column titles and get the value of that property in the closest ancestor or self that has a value of that name.
            var rows = obj
                .DescendantsAndSelf()
                .OfType<JObject>()
                .Where(o => o.PropertyValues().OfType<JValue>().Any() && (o == obj || !parentsWithChildren.Contains(o))) // Show a row for the root object + objects that have no children.
                .Select(o => jsonKey.Select(c => o.AncestorsAndSelf().OfType<JObject>().Select(parent => parent[c])
                    .Where(v => v is JValue).Select(v => (string)v).FirstOrDefault()).Reverse() // Trim trailing nulls
                    .SkipWhile(s => s == null).Reverse());

            // Convert to CSV
            var csvRows = new[] { jsonKey }.Concat(rows).Select(r => string.Join(",", r));
            var csv = string.Join("\n", csvRows);
            Console.WriteLine(csv);

This is the output I get:

getUsers_userProfileDetail_userStatus_name,getUsers_userProfileDetail_userStatusDate,getUsers_userProfileDetail_lastAttestationDate,getUsers_userInformation_Id,getUsers_userInformation_lastName,getUsers_userInformation_suffix,getUsers_userInformation_gender,getUsers_userInformation_birthDate,getUsers_userInformation_ssn,getUsers_userInformation_ethnicity,getUsers_userInformation_languagesSpoken,getUsers_userInformation_personalEmail,getUsers_userInformation_otherNames,getUsers_userInformation_userType_name,getUsers_userInformation_primaryuserState,getUsers_userInformation_otheruserState,getUsers_userInformation_practiceSetting,getUsers_userInformation_primaryEmail
Expired,04/04/2017 13:18:25,02/01/2019 14:20:42
APN,,,13610875,************,,FEMALE,01/01/1970 05:31:00,000000000,INVALID_REFERENCE_VALUE,,,,CO,INPATIENT_ONLY,*****@*****.com

Here the userType > name value does not land in the correct column, and the otheruserState array does not appear in the output at all.

Can anyone help me?

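【Comments】:

  • Maybe there is some LINQ kung fu that can dig through those nested items to get your result, but wouldn't it be easier to deserialize the JSON into strong types and write a converter to CSV?
  • Create a class based on this JSON, convert the JSON to that class, and use its properties.
  • Do you need a comma after the curly brace, but before the closing square bracket of getUsers? Not sure...

Tags: c# json csv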


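【Solution 1】:

I would recommend the following approach, because it does not skip null values and does not throw when a value is null. It creates a CSV-formatted string for each user in the JSON and writes string.Empty for any null value.

Lists of strings are joined with | as the separator, because the row itself is comma-separated. You should update all the classes to use a capitalized first letter in property names; I just pasted what I got from the json2csharp site.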
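Joining fields with a plain comma breaks as soon as a value itself contains a comma, a double quote, or a line break. A minimal sketch of quoting such fields follows, assuming the usual CSV convention of wrapping the field in double quotes and doubling any embedded quotes. `CsvField`, `Escape`, and `JoinRow` are hypothetical names used for illustration; they are not part of the answer's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CsvField
{
    // Hypothetical helper: returns the field unchanged unless it needs quoting.
    public static string Escape(string value)
    {
        // Null and empty values become an empty column.
        if (string.IsNullOrEmpty(value)) return string.Empty;

        // Quote only when the field contains a separator, a quote, or a line break.
        bool needsQuotes = value.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        if (!needsQuotes) return value;

        // Embedded quotes are doubled, then the whole field is wrapped in quotes.
        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }

    // Hypothetical helper: builds one CSV row from already-stringified fields.
    public static string JoinRow(IEnumerable<string> fields)
    {
        return string.Join(",", fields.Select(Escape));
    }
}
```

With something like this, a row could be built with `CsvField.JoinRow(userData)` instead of `string.Join(",", userData)`.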

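Getting the JSON classes

I used the json2csharp site to convert your JSON to classes. Once I had the classes, I overrode ToString on GetUser to convert the user data to a string, and then used that to print it.

JSON classes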


    public class UserStatus
    {
        public string name { get; set; }
    }

    public class UserProfileDetail
    {
        public UserStatus userStatus { get; set; }
        public DateTime userStatusDate { get; set; }
        public DateTime lastAttestationDate { get; set; }
    }

    public class UserType
    {
        public string name { get; set; }
    }

    public class UserInformation
    {
        public int Id { get; set; }
        public string lastName { get; set; }
        public string suffix { get; set; }
        public string gender { get; set; }
        public DateTime birthDate { get; set; }
        public string ssn { get; set; }
        public string ethnicity { get; set; }
        public List<string> languagesSpoken { get; set; }
        public string personalEmail { get; set; }
        public List<string> otherNames { get; set; }
        public UserType userType { get; set; }
        public string primaryuserState { get; set; }
        public List<string> otheruserState { get; set; }
        public string practiceSetting { get; set; }
        public string primaryEmail { get; set; }
    }

    public class GetUser
    {
        // Builds one CSV row: nulls become empty strings, lists are joined with "|".
        public override string ToString()
        {
            List<string> userData = new List<string>
            {
                userProfileDetail.userStatus?.name ?? string.Empty,
                userProfileDetail.userStatusDate.ToString(),
                userProfileDetail.lastAttestationDate.ToString(),
                userInformation.Id.ToString(),
                userInformation.lastName ?? string.Empty,
                userInformation.suffix ?? string.Empty,
                userInformation.gender ?? string.Empty,
                userInformation.birthDate.ToString(),
                userInformation.ssn ?? string.Empty,
                userInformation.ethnicity ?? string.Empty,
                string.Join("|", userInformation.languagesSpoken ?? new List<string>()),
                userInformation.personalEmail ?? string.Empty,
                string.Join("|", userInformation.otherNames ?? new List<string>()),
                userInformation.userType?.name ?? string.Empty,
                userInformation.primaryuserState ?? string.Empty,
                // A null list would make string.Join throw, so fall back to an empty list here too.
                string.Join("|", userInformation.otheruserState ?? new List<string>()),
                userInformation.practiceSetting ?? string.Empty,
                userInformation.primaryEmail ?? string.Empty
            };

            return string.Join(",", userData);
        }
        public UserProfileDetail userProfileDetail { get; set; }
        public UserInformation userInformation { get; set; }
    }

    public class Data
    {
        public List<GetUser> getUsers { get; set; }
    }

    public class RootObject
    {
            public string GetHeader()
            {
                return "getUsers_userProfileDetail_userStatus_name,getUsers_userProfileDetail_userStatusDate,getUsers_userProfileDetail_lastAttestationDate,getUsers_userInformation_Id,getUsers_userInformation_lastName,getUsers_userInformation_suffix,getUsers_userInformation_gender,getUsers_userInformation_birthDate,getUsers_userInformation_ssn,getUsers_userInformation_ethnicity,getUsers_userInformation_languagesSpoken,getUsers_userInformation_personalEmail,getUsers_userInformation_otherNames,getUsers_userInformation_userType_name,getUsers_userInformation_primaryuserState,getUsers_userInformation_otheruserState,getUsers_userInformation_practiceSetting,getUsers_userInformation_primaryEmail";
            }
        public Data data { get; set; }
    }

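How to use the classes above

    // Requires: using System; using System.IO; using Newtonsoft.Json;
    // File.ReadAllText returns the whole file as one string (ReadAllLines returns string[]).
    string json = File.ReadAllText("locationOfJson");
    var rootObject = JsonConvert.DeserializeObject<RootObject>(json);
    Console.WriteLine(rootObject.GetHeader()); // Prints the header row
    foreach (var user in rootObject.data.getUsers)
    {
        Console.WriteLine(user.ToString()); // Prints one CSV row per user
    }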

Output

getUsers_userProfileDetail_userStatus_name,getUsers_userProfileDetail_userStatusDate,getUsers_userProfileDetail_lastAttestationDate,getUsers_userInformation_Id,getUsers_userInformation_lastName,getUsers_userInformation_suffix,getUsers_userInformation_gender,getUsers_userInformation_birthDate,getUsers_userInformation_ssn,getUsers_userInformation_ethnicity,getUsers_userInformation_languagesSpoken,getUsers_userInformation_personalEmail,getUsers_userInformation_otherNames,getUsers_userInformation_userType_name,getUsers_userInformation_primaryuserState,getUsers_userInformation_otheruserState,getUsers_userInformation_practiceSetting,getUsers_userInformation_primaryEmail
Expired,4/4/2017 3:48:25 AM,2/1/2019 3:50:42 AM,13610875,************,,FEMALE,12/31/1969 7:01:00 PM,000000000,INVALID_REFERENCE_VALUE,,,,APN,CO,CO,INPATIENT_ONLY,*****@*****.com

I suggest copy-pasting the data into Excel to check that it lines up. I tested it, and all the data appears correctly under its headers.
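The date columns in the output come from DateTime.ToString(), which formats with the current culture of the machine, so the same JSON can print differently on another machine. A minimal sketch of a culture-independent format follows. `CsvDates` and `ToIso` are hypothetical names, and the format shown deliberately drops any UTC offset, which is an assumption about what the CSV consumer needs.

```csharp
using System;
using System.Globalization;

public static class CsvDates
{
    // Hypothetical helper: formats a DateTime the same way on every machine.
    public static string ToIso(DateTime value)
    {
        // The quoted 'T' is a literal; InvariantCulture keeps separators fixed.
        return value.ToString("yyyy-MM-dd'T'HH:mm:ss", CultureInfo.InvariantCulture);
    }
}
```

In the ToString override, a call such as `CsvDates.ToIso(userProfileDetail.userStatusDate)` could replace `userProfileDetail.userStatusDate.ToString()`.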

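【Comments】:

  • The JSON above is a small part of the real data; the real JSON is much longer and has more levels. It is over 30k lines.
【Solution 2】:

The solution for the case you provided is below. It uses JsonTextReader instead of LINQ to JSON, which gives you full control over the output format. For example, you did not specify how a string array (otheruserState) should behave, so in my solution I separate the string values with a dash. I use an empty string for null values.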

string propertyName = "";
var isArray = false;
var arrayHeaderprinted = false;

var headers = new List<string>();
var data = new List<string>();
var arrayData = new List<string>();

using (var reader = new JsonTextReader(new StringReader(json)))
{
    while (reader.Read())
    {
        switch (reader.TokenType)
        {
            case JsonToken.PropertyName:
                propertyName = (string)reader.Value;
                break;
            case JsonToken.StartArray:
                isArray = true;
                break;
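            case JsonToken.EndArray:
            case JsonToken.StartObject:
                isArray = false;
                if (arrayHeaderprinted)
                {
                    // Flush the collected array values as one dash-separated column.
                    arrayHeaderprinted = false;
                    data.Add(string.Join("-", arrayData));
                    arrayData.Clear(); // Otherwise values leak into the next array.
                }
                break;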
            case JsonToken.Null:
            case JsonToken.String:
            case JsonToken.Boolean:
            case JsonToken.Date:
            case JsonToken.Float:
            case JsonToken.Integer:
                if (isArray)
                {
                    if (!arrayHeaderprinted)
                    {
                        arrayHeaderprinted = true;
                        headers.Add(propertyName);
                    }
                    arrayData.Add(reader.Value?.ToString() ?? ""); // A null array element becomes an empty string.
                }
                else
                {
                    headers.Add(propertyName);
                    data.Add(reader.Value?.ToString() ?? "");
                }
                break;
        }
    }
}

Console.WriteLine(string.Join(",", headers));
Console.WriteLine(string.Join(",", data));

The output it produces:

name,userStatusDate,lastAttestationDate,Id,lastName,suffix,gender,birthDate,ssn,ethnicity,languagesSpoken,personalEmail,otherNames,name,primaryuserState,otheruserState,practiceSetting,primaryEmail
Expired,04.04.2017 09:48:25,01.02.2019 09:50:42,13610875,************,,FEMALE,01.01.1970 01:01:00,000000000,INVALID_REFERENCE_VALUE,,,,APN,CO,CO-PP,INPATIENT_ONLY,*****@*****.com
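The header row above contains the column name twice, because only the property name is recorded and not where it sits in the document. For longer JSON with more levels, one option is to flatten each record recursively and use the full path as the column name. The following is a minimal sketch of that idea, not the method of either answer. `JsonFlattener` and its members are hypothetical names; it assumes Newtonsoft.Json, assumes arrays inside a record hold scalar values (joined with |), and does not quote fields that contain commas.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class JsonFlattener
{
    // Parses JSON while keeping date-like strings as plain strings.
    public static JObject ParseWithoutDates(string json)
    {
        var settings = new JsonSerializerSettings { DateParseHandling = DateParseHandling.None };
        return JsonConvert.DeserializeObject<JObject>(json, settings);
    }

    // Flattens one record into (path, text) pairs in document order.
    public static List<KeyValuePair<string, string>> FlattenRecord(JObject record)
    {
        var result = new List<KeyValuePair<string, string>>();
        Visit(record, "", result);
        return result;
    }

    private static string Text(JToken token)
    {
        // Scalars are formatted culture-independently; JSON null becomes "".
        var value = token as JValue;
        if (value == null) return token.ToString(Formatting.None);
        return Convert.ToString(value.Value, CultureInfo.InvariantCulture) ?? "";
    }

    private static void Visit(JToken token, string path, List<KeyValuePair<string, string>> result)
    {
        if (token is JObject obj)
        {
            // Nested objects extend the path with "_" + property name.
            foreach (var property in obj.Properties())
            {
                var childPath = path.Length == 0 ? property.Name : path + "_" + property.Name;
                Visit(property.Value, childPath, result);
            }
        }
        else if (token is JArray array)
        {
            // Assumption: arrays inside a record contain scalars; join them with "|".
            result.Add(new KeyValuePair<string, string>(path, string.Join("|", array.Select(Text))));
        }
        else
        {
            result.Add(new KeyValuePair<string, string>(path, Text(token)));
        }
    }

    // Builds CSV text: headers are the union of all paths in first-seen order.
    public static string ToCsv(IEnumerable<JObject> records)
    {
        var headers = new List<string>();
        var seen = new HashSet<string>();
        var rows = new List<Dictionary<string, string>>();
        foreach (var record in records)
        {
            var row = new Dictionary<string, string>();
            foreach (var pair in FlattenRecord(record))
            {
                row[pair.Key] = pair.Value;
                if (seen.Add(pair.Key)) headers.Add(pair.Key);
            }
            rows.Add(row);
        }
        var lines = new List<string> { string.Join(",", headers) };
        foreach (var row in rows)
        {
            // Columns missing from a record are written as empty fields.
            lines.Add(string.Join(",", headers.Select(h => row.TryGetValue(h, out var v) ? v : "")));
        }
        return string.Join("\n", lines);
    }
}
```

For the JSON in the question, the records could be selected with `JsonFlattener.ParseWithoutDates(json)["data"]["getUsers"].OfType<JObject>()` and passed to `ToCsv`, giving one row per user and headers such as userProfileDetail_userStatus_name and userInformation_userType_name.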
