Spark development in Windows
Zack, 2017-05-12
Summary: take the first step in Spark development.
This doc covers 4 major items:
1. Simulate Hadoop on Windows.
2. How to install Spark on Windows.
3. How to install Scala IDE on Windows.
4. How to package Scala code into a jar via SBT and run it on Spark.
Simulate Hadoop on Windows:
Download winutils.exe from the following link:
https://sundog-spark.s3.amazonaws.com/winutils.exe
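The post does not say where to save winutils.exe. Hadoop on Windows conventionally looks for it at %HADOOP_HOME%\bin\winutils.exe, so the sketch below assumes that a HADOOP_HOME environment variable points at the folder whose bin subfolder holds the file. The object name CheckWinutils is only illustrative.

```scala
import java.nio.file.{Files, Path, Paths}

object CheckWinutils {
  // Build the conventional location: <hadoopHome>\bin\winutils.exe
  def winutilsPath(hadoopHome: String): Path =
    Paths.get(hadoopHome, "bin", "winutils.exe")

  def main(args: Array[String]): Unit =
    // HADOOP_HOME is an assumption; the post does not name the variable
    sys.env.get("HADOOP_HOME") match {
      case Some(home) =>
        val path = winutilsPath(home)
        println(s"$path exists: ${Files.exists(path)}")
      case None =>
        println("HADOOP_HOME is not set")
    }
}
```

If the check reports that the file does not exist, move winutils.exe into the bin subfolder or adjust the variable.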
Install Spark on Windows:
1. Download from the official website:
http://spark.apache.org/downloads.html

2. Unzip to the folder C:\spark

3. Set SPARK_HOME and PATH in the system environment variables:

4. Create the user PATH variable:

5. Verify that the install succeeded:
Open CMD in the folder C:\spark\bin and run:
‘spark-shell’
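Once the spark-shell prompt appears, a quick smoke test can confirm that jobs actually run. This is a minimal sketch to paste at the prompt; it relies on `sc`, the SparkContext that spark-shell creates for you.

```scala
// Paste at the spark-shell prompt; `sc` is provided by the shell.
println(sc.version)                       // version of the installed Spark

// Distribute the numbers 1..100 and count them as a tiny local job
val n = sc.parallelize(1 to 100).count()
println(s"count = $n")                    // prints "count = 100" if the job ran
```

Type `:quit` to leave the shell afterwards.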


Install Scala IDE for Eclipse on Windows:
1. Download the zip from the official website:

2. Unzip it to C:\eclipse

3. Make sure a proper JRE/JDK is installed, then open ‘eclipse.exe’.
Choose the newly created folder ‘C:\SparkScala’ as the workspace.

4. Then do the following:
a, Create a new Scala Project:


b, Create a new Package under src

c, Create a new Scala code file
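To confirm that the new project compiles and runs inside the IDE, the new Scala file can start out as something like the sketch below. The object name HelloScala is hypothetical, and the post does not show what its own file contained.

```scala
// In the IDE, add a `package` line on top that matches the package from step b.
object HelloScala {
  // Kept as a separate function so it can be checked without running main
  def greeting(name: String): String = s"Hello, $name"

  def main(args: Array[String]): Unit =
    println(greeting("Scala IDE"))
}
```

Right-click the file and choose Run As > Scala Application to see the greeting in the console.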


Install SBT on your PC:
1. Download SBT from the official website:
Download the ZIP or TGZ package and expand it.
2. Extract the downloaded package to a directory of your choice, for example d:\sbt
3. Create the file sbtconfig.txt in the sbt\bin directory
4. Set SBT_HOME and PATH in the system environment variables:

5. Create the user PATH variable:

6. Run sbt once to download the jar packages, which takes quite a long time.
Run the command ‘sbt’ in the folder d:\sbt

Press Ctrl + C to stop if there is any issue.

7. Create a jar file using the command ‘sbt package’
a. Put the finished Scala code into the following file structure:
\test\src\main\scala\SimpleApp.scala
b. Put the sbt configuration file in the project root folder:
\test\simple.sbt
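The post does not show the contents of simple.sbt. A minimal sketch that would yield a jar named simple-project_2.11-1.0.jar looks like the following: sbt derives the jar name from `name`, the Scala binary version and `version`. The exact Scala patch version and the spark-core version are assumptions, so match them to your own installation.

```scala
// simple.sbt (sketch; the version numbers below are assumptions)
name := "Simple Project"

version := "1.0"

// Any 2.11.x release gives the _2.11 suffix in the jar name
scalaVersion := "2.11.8"

// Assumed Spark version; use the one you unzipped to C:\spark
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.1.0"
```

The `%%` operator appends the Scala binary version to the artifact name, so sbt resolves spark-core_2.11 here.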

c. Find the jar location in the log:
D:\sbt\test\target\scala-2.11\simple-project_2.11-1.0.jar

d. Run the jar in Spark:
a) Copy the jar into the Spark bin folder:
C:\spark\bin\simple-project_2.11-1.0
b) Run this command in CMD from C:\spark\bin\:
‘spark-submit simple-project_2.11-1.0.jar’
c) The results are shown below:
The program counts how many lines in the file contain a and how many contain b.
Lines with a: 62. Lines with b: 30
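The post does not include the source of SimpleApp.scala. Output in this form matches the line-counting example from the Spark quick start, so a sketch along these lines would produce it. The input path is an assumption (the README.md that ships with Spark, unzipped to C:\spark), and the counts depend on whichever file you point it at.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SimpleApp {
  def main(args: Array[String]): Unit = {
    // Assumed input file; replace with any text file on your machine
    val logFile = "C:/spark/README.md"

    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)

    // Read the file as an RDD of lines and cache it, since it is scanned twice
    val logData = sc.textFile(logFile, 2).cache()

    // Count lines containing the letter a, then lines containing the letter b
    val numAs = logData.filter(_.contains("a")).count()
    val numBs = logData.filter(_.contains("b")).count()

    println(s"Lines with a: $numAs. Lines with b: $numBs")
    sc.stop()
  }
}
```

No master is set in the code, so the master is taken from spark-submit, which falls back to local mode when none is passed. That is why the plain ‘spark-submit’ command above is enough.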
