[Big Data Real-Time Computing Frameworks] Storm: Developing a First WordCount Real-Time Program

  

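  • Topology: a real-time application running on Storm; the name refers to the graph of components that make up the application.
  • Spout: the component that brings source data streams into a topology.
    •   A spout typically reads from an external data source and converts what it reads into tuples that the topology processes internally.
  • Bolt: a component that receives tuples and processes them; this is where user-defined logic runs.
  • Tuple: the basic unit of message passing; one tuple carries one group of values.
  • Stream: the flow of data, that is, a sequence of tuples passed between components.
  • Stream Grouping: the strategy that decides how a stream's tuples are distributed among the tasks of a downstream bolt.
    • Shuffle grouping: random distribution, spreading tuples as evenly as possible across the downstream bolt's tasks.
    • Fields grouping: grouping by the value of the specified fields; tuples with the same field value are always sent to the same task.
    • All grouping: broadcast; every task of the downstream bolt receives a copy of each tuple.
    • Global grouping: the whole stream goes to a single task of the bolt; this can be used to build transactional topologies.
    • None grouping: no grouping is specified.
    • Direct grouping: the component emitting a tuple specifies which downstream task receives it.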
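
To make the grouping strategies concrete, here is a minimal plain-Java sketch of how a tuple might be routed to one of a bolt's tasks. It does not use the Storm API: the class `GroupingSketch` and all of its methods are hypothetical names invented for illustration, and the hash-modulo rule only approximates the idea behind fields grouping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Plain-Java illustration of stream groupings; none of these names are Storm APIs.
final class GroupingSketch {

    // Fields grouping: hash the field value so equal values always pick the same task.
    static int fieldsGrouping(String fieldValue, int numTasks) {
        return Math.floorMod(fieldValue.hashCode(), numTasks);
    }

    // Shuffle grouping: pick a task at random so load spreads roughly evenly.
    static int shuffleGrouping(int numTasks) {
        return ThreadLocalRandom.current().nextInt(numTasks);
    }

    // Global grouping: every tuple goes to one task (index 0 in this sketch).
    static int globalGrouping() {
        return 0;
    }

    // All grouping (broadcast): every task receives a copy of the tuple.
    static List<Integer> allGrouping(int numTasks) {
        List<Integer> targets = new ArrayList<>();
        for (int task = 0; task < numTasks; task++) {
            targets.add(task);
        }
        return targets;
    }

    // Routes words to per-task counters with fields grouping, then merges the
    // per-task maps. Because a word never lands on two tasks, merging cannot
    // overwrite a partial count.
    static Map<String, Integer> countWords(List<String> words, int numTasks) {
        List<Map<String, Integer>> perTask = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            perTask.add(new HashMap<>());
        }
        for (String word : words) {
            perTask.get(fieldsGrouping(word, numTasks)).merge(word, 1, Integer::sum);
        }
        Map<String, Integer> merged = new HashMap<>();
        for (Map<String, Integer> counts : perTask) {
            merged.putAll(counts);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("storm", "kafka", "storm", "redis", "storm");
        System.out.println(countWords(words, 3));
    }
}
```

This is why a word-count bolt is usually fed through fields grouping on the word: with shuffle grouping the same word could reach several tasks, and each task would hold only a partial count.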

II. General architecture of a stream-computing system

  • Flume collects the data.
  • Kafka buffers the data temporarily.
  • Storm performs the computation.
  • Redis, an in-memory database, stores the data.
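
The division of labor among the four stages can be sketched with in-memory stand-ins. The following plain-Java example is only an illustration under stated assumptions: a queue plays the role of Kafka, a map plays the role of Redis, and the hypothetical methods `collect` and `compute` stand in for Flume and Storm. No real Flume, Kafka, Storm, or Redis API is involved.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// In-memory stand-ins for the four stages; no Flume, Kafka, Storm, or Redis API is used.
final class PipelineSketch {

    // "Flume" stage: collect raw lines and append them to the buffer.
    static void collect(List<String> rawLines, Queue<String> buffer) {
        buffer.addAll(rawLines);
    }

    // "Storm" stage: drain the buffer, split each line into words, and
    // update the running counts held in the store.
    static void compute(Queue<String> buffer, Map<String, Long> store) {
        String line;
        while ((line = buffer.poll()) != null) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    store.merge(word.toLowerCase(), 1L, Long::sum);
                }
            }
        }
    }

    public static void main(String[] args) {
        Queue<String> buffer = new ArrayDeque<>();   // "Kafka": temporary buffer
        Map<String, Long> store = new HashMap<>();   // "Redis": in-memory result store

        collect(Arrays.asList("hello storm", "hello kafka"), buffer);
        compute(buffer, store);
        System.out.println(store.get("hello"));  // prints 2
    }
}
```

Keeping a buffer between collection and computation lets the two sides run at different speeds, which is the role Kafka plays in the real pipeline.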

0. Managing the project with Maven: dependencies to add to pom.xml

pom.xml

<!-- apache storm core -->
<dependency>
    <groupId>org.apache.storm</groupId>
    <artifactId>storm-core</artifactId>
    <version>1.0.3</version>
    <scope>provided</scope>
</dependency>

<dependency>
    <groupId>org.apache.storm</groupId>
    <artifactId>storm-rename-hack</artifactId>
    <version>1.0.3</version>
</dependency>

<dependency>
    <groupId>org.apache.storm</groupId>
    <artifactId>storm-hbase</artifactId>
    <version>1.0.3</version>
    <scope>test</scope>
</dependency>

<dependency>
    <groupId>org.apache.storm</groupId>
    <artifactId>storm-redis</artifactId>
    <version>1.0.3</version>
</dependency>

Complete pom.xml

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.wulei</groupId>
    <artifactId>Bigdata</artifactId>
    <version>1.0.0</version>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
        <maven.compiler.encoding>UTF-8</maven.compiler.encoding>
        <hadoop.version>2.7.3</hadoop.version>
    </properties>

    <dependencies>
        <!-- MySQL connector -->
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.43</version>
        </dependency>

        <!-- Hadoop 2.7.3 -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>${hadoop.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>${hadoop.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
            <version>${hadoop.version}</version>
        </dependency>

        <!-- HBase -->
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase</artifactId>
            <version>1.3.1</version>
            <type>pom</type>
        </dependency>

        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-client</artifactId>
            <version>1.3.1</version>
        </dependency>

        <!-- Test dependencies -->
        <dependency>
            <groupId>org.apache.mrunit</groupId>
            <artifactId>mrunit</artifactId>
            <version>1.1.0</version>
            <classifier>hadoop2</classifier>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.mockito</groupId>
            <artifactId>mockito-all</artifactId>
            <version>1.10.19</version>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.12</version>
            <scope>test</scope>
        </dependency>

        <!-- apache storm core -->
        <dependency>
            <groupId>org.apache.storm</groupId>
            <artifactId>storm-core</artifactId>
            <version>1.0.3</version>
            <scope>provided</scope>
        </dependency>

        <dependency>
            <groupId>org.apache.storm</groupId>
            <artifactId>storm-rename-hack</artifactId>
            <version>1.0.3</version>
        </dependency>

        <dependency>
            <groupId>org.apache.storm</groupId>
            <artifactId>storm-hbase</artifactId>
            <version>1.0.3</version>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.apache.storm</groupId>
            <artifactId>storm-redis</artifactId>
            <version>1.0.3</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <!-- Compile for Java 8 -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>

            <!-- Build a single shaded jar whose manifest names the topology's main class -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>2.4.1</version>
                <configuration>
                    <createDependencyReducedPom>false</createDependencyReducedPom>
                </configuration>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <transformers>
                                <transformer
                                    implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                    <mainClass>com.bigdata.storm.WordCountTopology</mainClass>
                                </transformer>
                            </transformers>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>