
Scala Spark WordCount

Posted: 2019-12-04 22:00:08

Scala dependency

<dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <version>2.11.8</version>
</dependency>

Scala WordCount code

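import scala.io.Source

// Read every line of the input file into memory
val source: List[String] = Source.fromFile("./src/main/data/wordCount.txt").getLines().toList
source.flatMap(elem => elem.split(" "))  // split each line into words
  .filter(_.nonEmpty)                     // drop empty tokens left by repeated spaces
  .groupBy(elem => elem.toLowerCase)      // group occurrences case-insensitively
  .mapValues(elem => elem.size)           // count the occurrences in each group
  .foreach(println)                       // print (word, count) pairs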
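The pipeline above can also be wrapped in a small function so it can be exercised on in-memory lines. This is a minimal sketch rather than the author's code: the names `WordCountSketch`, `countWords`, and `countWordsInFile` are hypothetical. It uses `map` instead of `mapValues` because `mapValues` returns a lazy view in Scala 2.11 and 2.12, and it closes the `Source` after reading, which the snippet above leaves open.

```scala
import scala.io.Source

object WordCountSketch {
  // Hypothetical helper: counts words case-insensitively over any sequence of lines.
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split(" "))      // split each line on single spaces
      .filter(_.nonEmpty)         // drop empty tokens produced by repeated spaces
      .groupBy(_.toLowerCase)     // group occurrences case-insensitively
      .map { case (word, occurrences) => word -> occurrences.size }

  // Hypothetical helper: reads the file and closes it even if reading fails.
  def countWordsInFile(path: String): Map[String, Int] = {
    val src = Source.fromFile(path)
    try countWords(src.getLines().toList)
    finally src.close()
  }

  def main(args: Array[String]): Unit =
    countWords(Seq("Hello Scala", "hello  spark")).foreach(println)
}
```

Calling `WordCountSketch.countWordsInFile("./src/main/data/wordCount.txt")` would apply the same counting to the file used above.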

Spark dependency

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.3.4</version>
</dependency>
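For readers who build with sbt instead of Maven, the same two dependencies could be declared roughly as follows. This is a sketch under the assumption that sbt is the build tool, which the original post does not state. Setting `scalaVersion` pulls in the matching scala-library, so that artifact needs no separate entry.

```scala
// build.sbt (sketch; assumes an sbt build, which the original post does not use)
scalaVersion := "2.11.8"

// %% appends the Scala binary version, so this resolves to spark-core_2.11
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.4"
```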

Spark WordCount code

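import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Run locally with two worker threads
val sparkContext = new SparkContext(new SparkConf().setAppName("SparkWordCount").setMaster("local[2]"))
sparkContext.setLogLevel("WARN")
val source: RDD[String] = sparkContext.textFile("./src/main/data/wordCount.txt")
source.flatMap(_.split(" "))            // split each line into words
  .filter(_.nonEmpty)                    // drop empty tokens left by repeated spaces
  .map(elem => (elem.toLowerCase, 1))    // pair each lowercased word with a count of 1
  .reduceByKey(_ + _)                    // sum the counts per word
  .foreach(println)                      // print (word, count) pairs
sparkContext.stop()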
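In the job above, `foreach(println)` runs on the executors. That prints to the console in `local[2]` mode, but on a cluster the output would land in executor logs. A possible variant, sketched here with in-memory input, collects the results to the driver and sorts them by count. The names `SparkWordCountSketch` and `countWords` are hypothetical, and `collect()` is only reasonable when the result is small enough to fit in driver memory.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object SparkWordCountSketch {
  // Hypothetical helper: the same transformation chain as above, returned as an RDD.
  def countWords(lines: RDD[String]): RDD[(String, Int)] =
    lines
      .flatMap(_.split(" "))
      .filter(_.nonEmpty)
      .map(word => (word.toLowerCase, 1))
      .reduceByKey(_ + _)

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("SparkWordCountSketch").setMaster("local[2]"))
    sc.setLogLevel("WARN")
    try {
      // Sample input instead of the file, so the sketch runs without extra data
      val lines = sc.parallelize(Seq("Hello Spark", "hello  scala"))
      countWords(lines)
        .sortBy(_._2, ascending = false) // most frequent words first
        .collect()                       // bring the results to the driver
        .foreach(println)
    } finally {
      sc.stop()                          // release the context even on failure
    }
  }
}
```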


Original post: https://www.cnblogs.com/JoshWill/p/11985930.html
