【Posted】:2018-05-30 18:45:00
【Question】:
This is my schema:
root
|-- DataPartition: string (nullable = true)
|-- TimeStamp: string (nullable = true)
|-- TRFCoraxData_instrumentId: long (nullable = true)
|-- TRFCoraxData_organizationId: long (nullable = true)
|-- Dividends: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- cr:AnnouncementDate: string (nullable = true)
| | |-- cr:CorporateActionAdjustedDividendGrossAmount: double (nullable = true)
| | |-- cr:CorporateActionAdjustedDividendNetAmount: double (nullable = true)
| | |-- cr:CurrencyId: long (nullable = true)
| | |-- cr:DividendEventId: long (nullable = true)
| | |-- cr:DividendGrossAmount: double (nullable = true)
| | |-- cr:DividendNetAmount: double (nullable = true)
| | |-- cr:DividendType: string (nullable = true)
| | |-- cr:ExDate: string (nullable = true)
| | |-- cr:PayDate: string (nullable = true)
| | |-- cr:PeriodDuration: string (nullable = true)
| | |-- cr:PeriodEndDate: string (nullable = true)
| | |-- cr:RecordDate: string (nullable = true)
|-- FFAction|!|: string (nullable = true)
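I want to explode the Dividends array and select all of the resulting columns in the same expression, so that I don't have to write out a column or select expression for each column name individually.
Here is my code for the explode: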
val temp2 = temp1.select(
  getDataPartition($"DataPartition").as("DataPartition"),
  $"TimeStamp".as("TimeStamp"),
  $"TRFCoraxData_instrumentId".as("TRFCoraxData_instrumentId"),
  $"TRFCoraxData_organizationId".as("TRFCoraxData_organizationId"),
  explode($"Dividends"),
  $"FFAction|!|".as("FFAction|!|"))
val temp = temp2.select(temp2.columns.map(x => col(x).as(x.replace("cr:", ""))): _*)
temp.show(false)
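This is the output I get: the exploded data comes back as a single struct column named col.
How can I get the individual field names as columns in the same expression?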
+-----------------+-------------------------+-------------------------+---------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+
|DataPartition |TimeStamp |TRFCoraxData_instrumentId|TRFCoraxData_organizationId|col |FFAction|!||
+-----------------+-------------------------+-------------------------+---------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+
|ThirdPartyPrivate|2017-06-07T09:18:33+00:00|8590925624 |4296241518 |[2009-07-14T00:00:00+00:00,null,0.35,500110,73014469387,0.35,null,INTE,2009-08-13T00:00:00+00:00,2009-09-15T00:00:00+00:00,P3M,2009-09-30T00:00:00+00:00,2009-08-17T00:00:00+00:00] |O|!| |
|ThirdPartyPrivate|2017-06-07T09:18:33+00:00|8590925624 |4296241518 |[2008-02-05T00:00:00+00:00,null,0.3,500110,73015860528,0.3,null,INTE,2008-02-14T00:00:00+00:00,2008-03-17T00:00:00+00:00,P3M,2008-03-31T00:00:00+00:00,2008-02-19T00:00:00+00:00] |O|!| |
|ThirdPartyPrivate|2017-06-07T09:18:33+00:00|8590925624 |4296241518 |[2008-04-29T00:00:00+00:00,null,0.3,500110,73015864496,0.3,null,INTE,2008-05-14T00:00:00+00:00,2008-06-16T00:00:00+00:00,P3M,2008-06-30T00:00:00+00:00,2008-05-16T00:00:00+00:00] |O|!| |
+-----------------+-------------------------+-------------------------+---------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+
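One possible approach, sketched here under assumptions rather than taken from the original post: give the exploded struct an alias, then expand it with a `.*` star selection in a chained select, and strip the "cr:" prefix afterwards. The object name `ExplodeSketch`, the helper `flatten`, the alias `Dividend`, and the reduced two-field sample schema are all hypothetical, and the `getDataPartition` UDF from the question is left out because its definition is not shown.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.functions.{col, explode}
import org.apache.spark.sql.types._

object ExplodeSketch {

  // Hypothetical stand-in for temp1: a reduced schema with only two of the cr: fields.
  def sample(spark: SparkSession): DataFrame = {
    val dividend = StructType(Seq(
      StructField("cr:DividendEventId", LongType),
      StructField("cr:ExDate", StringType)))
    val schema = StructType(Seq(
      StructField("DataPartition", StringType),
      StructField("Dividends", ArrayType(dividend)),
      StructField("FFAction|!|", StringType)))
    val rows = Seq(Row(
      "ThirdPartyPrivate",
      Seq(Row(73014469387L, "2009-08-13T00:00:00+00:00"),
          Row(73015860528L, "2008-02-14T00:00:00+00:00")),
      "O|!|"))
    spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)
  }

  // Explode the array under an alias, expand the struct with ".*",
  // then drop the "cr:" prefix from every column name.
  def flatten(df: DataFrame): DataFrame = {
    val expanded = df
      .select(col("DataPartition"), explode(col("Dividends")).as("Dividend"), col("FFAction|!|"))
      .select(col("DataPartition"), col("Dividend.*"), col("FFAction|!|"))
    expanded.toDF(expanded.columns.map(_.replace("cr:", "")): _*)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("explode-sketch").getOrCreate()
    flatten(sample(spark)).show(false)
    spark.stop()
  }
}
```

With the sample data, each element of Dividends becomes its own row, and the struct fields appear as top-level columns without the "cr:" prefix. The same pattern should extend to the full schema by adding the remaining top-level columns to both select calls.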
【Discussion】:
Tags: scala apache-spark apache-spark-sql