Creating a DataFrame

1. Plain creation (from an RDD via a case class and toDF)
import spark.implicits._   // required for the toDF conversion

case class Calllog(fromtel: String, totel: String, time: String, duration: Int)
// read the CSV as an RDD of String arrays
val rdd = sc.textFile("/user/data/calllog.csv").map(_.split(","))
val log = rdd.map(x => Calllog(x(0), x(1), x(2), x(3).toInt))
val df = log.toDF
2. Creating via SparkSession
// this approach takes noticeably more code, so it is omitted here; just be aware that it exists
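A minimal sketch of the SparkSession approach, assuming a local session and the same calllog columns as above; the explicit `Row` RDD and schema are what make it more verbose than `toDF`:

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

val spark = SparkSession.builder().appName("demo").master("local[*]").getOrCreate()
val sc = spark.sparkContext

// build an RDD[Row] instead of an RDD of a case class
val rowRDD = sc.textFile("/user/data/calllog.csv")
  .map(_.split(","))
  .map(x => Row(x(0), x(1), x(2), x(3).toInt))

// describe the schema explicitly instead of inferring it from a case class
val schema = StructType(Seq(
  StructField("fromtel", StringType),
  StructField("totel", StringType),
  StructField("time", StringType),
  StructField("duration", IntegerType)
))

val df = spark.createDataFrame(rowRDD, schema)
```

This trades extra boilerplate for full control over column names, types, and nullability.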
3. Creating from a file in a self-describing format, e.g. JSON
val df1 = spark.read.json("/user/data/people.json")
val df2 = spark.read.format("json").load("/user/data/people.json")
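For reference, `spark.read.json` expects JSON Lines by default (one complete object per line, not a single pretty-printed array); a hypothetical `people.json` for these examples might look like:

```json
{"name":"Michael"}
{"name":"Andy","age":30}
{"name":"Justin","age":19}
```

Spark infers the schema from the objects, so missing fields (like `age` above) simply become nulls.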
Creating a DataSet

1. From a local Seq with toDS
case class mydata(id: Int, name: String)   // the original omits this definition; field names here are assumed
val ds = Seq(mydata(1, "tom"), mydata(2, "jerry")).toDS
2. From a JSON file, typed with as[...]
case class People(name: String, age: BigInt)
val data = spark.read.json("/user/data/people.json")
data.as[People]   // returns a typed Dataset[People]
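The typed Dataset then allows lambda-style operations checked against the People fields at compile time; a brief usage sketch (standard Dataset API, assuming `import spark.implicits._` is in scope for the encoders):

```scala
val peopleDS = data.as[People]
peopleDS.map(_.name).show()                 // typed projection down to the name column
peopleDS.filter(p => p.age != null).show()  // typed predicate over the age field
```

This is the practical difference from a plain DataFrame, where `map` and `filter` would operate on untyped Rows.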
Several ways to create a DataFrame/DataSet with spark-sql
Source: https://www.cnblogs.com/xxfxxf/p/12093873.html