Flink--

时间：2020-06-14 21:58:03 阅读：72 评论：0 收藏：0 [点我收藏+]

Flink是什么？

Apache Flink 是一个框架和分布式处理引擎，用于处理无界和有界的数据流进行状态计算。

为什么选择Flink

流数据更真实地反应我们的生活方式（聊天、导航、转账）
传统的数据架构是基于有限数据集的
我们的目标

低延迟
高吞吐
结果的准确性和良好的容错性

哪些行业需要处理流数据

电商和市场营销

数据报表、广告投放、业务流程需要

物联网

　　传感器实时数据采集和显示、实时报警、交通运输业

电信业

　　基站流量调配

银行和金融业

　　实时结算和通知推送，实时监测异常行为

Flink下载安装

https://www.apache.org/dyn/closer.lua/flink/flink-1.10.1/flink-1.10.1-bin-scala_2.11.tgz

Flink word count 案例

maven依赖

 <dependencies>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-scala_2.11</artifactId>
            <version>1.7.2</version>
        </dependency>

        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-streaming-scala_2.11</artifactId>
            <version>1.7.2</version>
        </dependency>

    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>3.4.6</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>3.0.0</version>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

批处理 word count 程序

在resource目录下创建一个hello.txt文件

hello word
hello scala
hello fink
hello spark

/**
 * 批处理 word count 程序
 */
object WordCount {
  def main(args: Array[String]): Unit = {
    // 创建一个执行环境
    val env = ExecutionEnvironment.getExecutionEnvironment
    // 从文件中读取数据
    val inputPath = "/Users/qiunan/IdeaProjects/FlinkTutorial/src/main/resources/hello.txt"
    val inputDataSet = env.readTextFile(inputPath)
    // 切分数据得到word 然后再按word做分组聚合
    val wordCountDataSet = inputDataSet.flatMap(_.split(" "))
      .map((_,1))
      .groupBy(0)
      .sum(1)
     wordCountDataSet.print()
  }
}

打印结果

(scala,1)
(spark,1)
(word,1)
(fink,1)
(hello,4)

流处理word count 程序

/**
 * 流处理word count 程序
 */
object StreamWordCount {
  def main(args: Array[String]): Unit = {
    // 创建流处理的执行环境
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // 接受一个socket文本流
    val dataStream = env.socketTextStream("localhost", 777);
    // 对每条数据进行处理
    val wordCountDataStream = dataStream.flatMap(_.split(" "))
        .filter(_.nonEmpty)
        .map((_, 1))
        .keyBy(0)
        .sum(1)
    
    wordCountDataStream.print()
    // 启动executor
    env.execute();
  }
}

启动程序在终端窗口输入 nc -lk 777

qiunan@qiunandeiMac ~ % nc -lk 777
hello word
hello flink
hello spark

打印结果

5> (word,1)
2> (hello,1)
5> (flink,1)
2> (hello,2)
2> (hello,3)
1> (spark,1)

Flink--

原文：https://www.cnblogs.com/south-pigeon/p/13127015.html

踩

(0)

评论一句话评论（0）

分享档案

更多>

2021年09月23日 (328)
2021年09月24日 (313)
2021年09月17日 (191)
2021年09月15日 (369)
2021年09月16日 (411)
2021年09月13日 (439)
2021年09月11日 (398)
2021年09月12日 (393)
2021年09月10日 (160)
2021年09月08日 (222)