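【Posted】: 2018-03-18 18:20:56
【Question】:
Please look at the code below and tell me how to change the column names to lowercase. I tried withColumnRenamed, but I would have to call it once per column and type out every column name. I want to apply this to all columns at once without listing each name, because there are too many of them.
Scala version: 2.11, Spark: 2.2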
import org.apache.spark.sql.SparkSession
import org.apache.log4j.{Level, Logger}
import com.datastax
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import com.datastax.spark.connector._
import org.apache.spark.sql._
object dataframeset {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("Sample1").setMaster("local[*]")
    val sc = new SparkContext(conf)
    sc.setLogLevel("ERROR")
    val rdd1 = sc.cassandraTable("tdata", "map3")
    Logger.getLogger("org").setLevel(Level.ERROR)
    Logger.getLogger("akka").setLevel(Level.ERROR)
    val spark1 = org.apache.spark.sql.SparkSession.builder().master("local").config("spark.cassandra.connection.host","127.0.0.1")
      .appName("Spark SQL basic example").getOrCreate()
    val df = spark1.read.format("csv").option("header","true").option("inferschema", "true").load("/Users/Desktop/del2.csv")
    import spark1.implicits._
    println("\nTop Records are:")
    df.show(1)
    val dfprev1 = df.select(col = "sno", "year", "StateAbbr")
    dfprev1.show(1)
  }
}
Desired output:
|sno|year|stateabbr| statedesc|cityname|geographiclevel
All column names should be in lowercase.
Actual output:
Top Records are:
+---+----+---------+-------------+--------+---------------+----------+----------+--------+--------------------+---------------+---------------+--------------------+----------+--------------------+---------------------+--------------------------+-------------------+---------------+-----------+----------+---------+--------+---------+-------------------+
|sno|year|StateAbbr| StateDesc|CityName|GeographicLevel|DataSource| category|UniqueID| Measure|Data_Value_Unit|DataValueTypeID| Data_Value_Type|Data_Value|Low_Confidence_Limit|High_Confidence_Limit|Data_Value_Footnote_Symbol|Data_Value_Footnote|PopulationCount|GeoLocation|categoryID|MeasureId|cityFIPS|TractFIPS|Short_Question_Text|
+---+----+---------+-------------+--------+---------------+----------+----------+--------+--------------------+---------------+---------------+--------------------+----------+--------------------+---------------------+--------------------------+-------------------+---------------+-----------+----------+---------+--------+---------+-------------------+
| 1|2014| US|United States| null| US| BRFSS|Prevention| 59|Current lack of h...| %| AgeAdjPrv|Age-adjusted prev...| 14.9| 14.6| 15.2| null| null| 308745538| null| PREVENT| ACCESS2| null| null| Health Insurance|
+---+----+---------+-------------+--------+---------------+----------+----------+--------+--------------------+---------------+---------------+--------------------+----------+--------------------+---------------------+--------------------------+-------------------+---------------+-----------+----------+---------+--------+---------+-------------------+
only showing top 1 row
+---+----+---------+
|sno|year|StateAbbr|
+---+----+---------+
| 1|2014| US|
+---+----+---------+
only showing top 1 row
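One possible way to lowercase every column name without typing each one is to read the names from df.columns, lowercase them, and pass them back through toDF. The following is only a minimal sketch: the object name LowercaseColumns, the helper lowercaseColumns, and the tiny in-memory DataFrame are hypothetical and stand in for the CSV-backed df above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object LowercaseColumns {
  // Hypothetical helper: returns a DataFrame with the same data
  // but with every column name converted to lowercase.
  def lowercaseColumns(df: DataFrame): DataFrame =
    df.toDF(df.columns.map(_.toLowerCase): _*)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("lowercase-columns-sketch")
      .getOrCreate()
    import spark.implicits._

    // Small stand-in for the CSV data in the question (assumed sample row).
    val df = Seq((1, 2014, "US")).toDF("sno", "year", "StateAbbr")

    val lowered = lowercaseColumns(df)
    lowered.show(1)

    spark.stop()
  }
}
```

Applied to the question's code, this would amount to something like val dfLower = df.toDF(df.columns.map(_.toLowerCase): _*) right after the CSV is loaded. Note that if two columns differ only by case, lowercasing gives them the same name, and later references to that name become ambiguous.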
【Comments】:
Tags: scala csv apache-spark apache-spark-sql