How Spark SQL handles date types
Date: 2021-07-01 10:21:17
Sample data (`person.json`, one JSON object per line):
```
{ "name" : "Andy", "age" : 30, "time" : "2015-03-03T08:25:55.769Z" }
{ "name" : "Justin", "age" : 19, "time" : "2015-04-04T08:25:55.769Z" }
{ "name" : "pan", "age" : 49, "time" : "2015-05-05T08:25:55.769Z" }
{ "name" : "penny", "age" : 29, "time" : "2015-05-05T08:25:55.769Z" }
```
Schema inferred by default (note that `time` comes out as a string, not a timestamp):
```
root
 |-- _corrupt_record: string (nullable = true)
 |-- age: long (nullable = true)
 |-- name: string (nullable = true)
 |-- time: string (nullable = true)
```
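Test code:
```
val fileName = "person.json"
// SparkUtils.getScLocal is the author's own helper; presumably it returns a local SparkContext
val sc = SparkUtils.getScLocal("json file test")
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
// Let Spark infer the schema from the JSON file
val jsonFile = sqlContext.read.json(fileName)
jsonFile.printSchema()
```
## Solutions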
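### Option 1
Store the time in the JSON data as a long, in seconds or milliseconds.
### Option 2: custom schema
```
import java.sql.Timestamp
import org.apache.spark.sql.types._

val fileName = "person.json"
// SparkUtils.getScLocal is the author's own helper
val sc = SparkUtils.getScLocal("json file test")
val sqlContext = new org.apache.spark.sql.SQLContext(sc)

// Declare the schema explicitly so that "time" is read as a timestamp.
// age is numeric in the data, so LongType is used here (the original listing had StringType).
val schema: StructType = StructType(Seq(
  StructField("name", StringType, true),
  StructField("age", LongType, true),
  StructField("time", TimestampType, true)))

val jsonFile = sqlContext.read.schema(schema).json(fileName)
jsonFile.printSchema()
jsonFile.registerTempTable("person")

// Compare against the current time
val now: Timestamp = new Timestamp(System.currentTimeMillis())
val teenagers = sqlContext.sql(
  s"SELECT * FROM person WHERE age >= 20 AND age <= 30 AND time <= '$now'")
teenagers.foreach(println)

// Compare against a timestamp literal
val dataFrame = sqlContext.sql(
  "SELECT * FROM person WHERE age >= 20 AND age <= 30 AND time <= '2015-03-03 16:25:55.769'")
dataFrame.foreach(println)
```
### Option 3: create the table with SQL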
SQL to create the table:
```
CREATE TEMPORARY TABLE person (age BIGINT, name STRING, time TIMESTAMP)
USING org.apache.spark.sql.json
OPTIONS (path 'person.json')

-- Syntax:
-- CREATE [TEMPORARY] TABLE [IF NOT EXISTS] <table-name>
--   [(col-name data-type [, …])]
--   USING <data-source> [OPTIONS ...]
--   [AS <select-statement>]
```
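
As a rough sketch of how a statement like this might be run from Scala: the snippet below assumes Spark 1.4 to 1.6 with the `SQLContext` API used elsewhere in this post, a local master, and a `person.json` file in the working directory. The object name `CreateTableSketch` and the cutoff date in the query are made up for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical driver object; assumes Spark 1.4-1.6 on the classpath.
object CreateTableSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("create-table-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Register person.json as a temporary table with an explicit schema,
    // so that "time" is typed as TIMESTAMP instead of being inferred as a string.
    sqlContext.sql(
      """CREATE TEMPORARY TABLE person (age BIGINT, name STRING, time TIMESTAMP)
        |USING org.apache.spark.sql.json
        |OPTIONS (path 'person.json')""".stripMargin)

    // Filter on the timestamp column; the cutoff value is illustrative only.
    sqlContext.sql(
      "SELECT name, age, time FROM person WHERE time <= '2015-04-01 00:00:00'").show()

    sc.stop()
  }
}
```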
### Option 4
Read the file with `textFile` and convert each line manually.
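
The manual conversion could look roughly like the sketch below. It is plain Scala and assumes Java 8+ (for `java.time`) and the flat one-object-per-line layout of the sample data. `Person`, `PersonLineParser`, and the regex are hypothetical names introduced here, not part of the original post, and a real JSON library would be more robust than a regex.

```scala
import java.sql.Timestamp
import java.time.Instant
import scala.util.Try

// Hypothetical record type for one line of person.json.
case class Person(name: String, age: Long, time: Timestamp)

object PersonLineParser {
  // Naive pattern for the flat layout of the sample data (name, age, time in that order).
  private val Line =
    """.*"name"\s*:\s*"([^"]*)".*"age"\s*:\s*(\d+).*"time"\s*:\s*"([^"]*)".*""".r

  // Parse an ISO-8601 UTC string such as "2015-03-03T08:25:55.769Z";
  // returns None if the string is not in that format.
  def toTimestamp(iso: String): Option[Timestamp] =
    Try(Timestamp.from(Instant.parse(iso))).toOption

  // Returns None for lines that do not match the pattern
  // or whose age or time cannot be converted.
  def parseLine(line: String): Option[Person] = line match {
    case Line(name, age, time) =>
      for {
        a  <- Try(age.toLong).toOption
        ts <- toTimestamp(time)
      } yield Person(name, a, ts)
    case _ => None
  }
}
```

With Spark, this might be applied as `sc.textFile("person.json").flatMap(line => PersonLineParser.parseLine(line))`, and the resulting RDD turned into a DataFrame with `sqlContext.createDataFrame`, which maps `java.sql.Timestamp` fields to `TimestampType`.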