Blog Archives

07: spark-xml to split & read very large XML files

Processing very large XML files is tricky because, unlike CSV files, they cannot be split and processed line by line in parallel. Each XML record has to stay intact between its matching start and end tags, and if the tags are distributed in parts…
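To make the point about start and end tags concrete, here is a minimal plain-Java sketch (no Spark involved) that cuts a document into records by tag rather than by line. It is an illustration of the idea only, not how spark-xml is implemented. The class and method names are hypothetical, the `rowTag` parameter is named after the spark-xml option of the same name, and the sketch assumes the row element has no attributes and is not nested inside itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Spark or spark-xml.
final class XmlRecordSplitter {

    // Returns every complete <rowTag>...</rowTag> block found in the input.
    // Assumes the start tag carries no attributes and rows are not nested.
    static List<String> splitRecords(String xml, String rowTag) {
        String open = "<" + rowTag + ">";
        String close = "</" + rowTag + ">";
        List<String> records = new ArrayList<>();
        int from = 0;
        while (true) {
            int start = xml.indexOf(open, from);
            if (start < 0) {
                break; // no further start tag
            }
            int end = xml.indexOf(close, start + open.length());
            if (end < 0) {
                break; // start tag without a matching end tag: incomplete record
            }
            end += close.length();
            records.add(xml.substring(start, end));
            from = end;
        }
        return records;
    }

    public static void main(String[] args) {
        String xml = "<books>\n"
                + "  <book>\n    <title>A</title>\n  </book>\n"
                + "  <book>\n    <title>B</title>\n  </book>\n"
                + "</books>\n";
        // A single record spans several lines, so a line is not a usable unit of work.
        System.out.println("lines: " + xml.split("\n").length);
        for (String record : splitRecords(xml, "book")) {
            System.out.println(record.replace("\n", " "));
        }
    }
}
```

A record whose end tag is missing (for example because the input was cut at an arbitrary byte offset) is skipped rather than emitted half-formed, which is the situation a splitter for very large files has to handle at chunk boundaries.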

Members Only Content

This content is for the members with any one of the following paid subscriptions:

30-Day-Java-JEE-Career-Training, 90-Day-Java-JEE-Career-Training, 180-Day-Java-JEE-Career-Training, 365-Day-Java-JEE-Career-Training, 60-Day-Java-JEE-Career-Training and 2-Year-Java-JEE-Career-Training

Want to evaluate the quality of the contents to see if they will add value to you?

Click Here and check the contents with Try.

Posted in member-paid, Spark Tutorials

01B: Spark tutorial – writing to HDFS from Spark using Hadoop API

Step 1: The “pom.xml” that defines the dependencies for Spark & Hadoop APIs.

Step 2: The Spark job that writes numbers 1 to 10 to 10 different files on HDFS.

Step 3: Build the “jar” file.

Step 4: Run the “spark-submit” job.

Step 5: You can…

Posted in member-paid, Spark Tutorials

06: Spark Streaming with Flume Avro Sink Tutorial

This extends “Running a Simple Spark Job in local & cluster modes” and “Apache Flume with JMS source (Websphere MQ) and HDFS sink”. In this tutorial Flume will ingest data from a source such as JMS or HDFS and pass it to an “Avro Sink” that pushes data…

Posted in member-paid, Spark Tutorials

05: Spark SQL & CSV with DataFrame Tutorial

Step 1: Create a simple Maven project.

Step 2: Import the “simple-spark” Maven project into Eclipse or the IDE of your choice.

Step 3: Modify the pom.xml file to include 1) the relevant Spark libraries 2) the shade plugin to create a single jar (i.e. uber jar) with the Spark and other…

Posted in member-paid, Spark Tutorials

04: Running a Simple Spark Job in local & cluster modes

Step 1: Create a simple Maven Spark project using “-B” for non-interactive mode.

Step 2: Import the Maven project “simple-spark” into Eclipse.

Step 3: The pom.xml file should have the relevant dependency jars as shown below.

Step 4: Write the simple Spark job “SimpleSparkJob.java” that prints numbers from…

Posted in member-paid, Spark Tutorials

03: Spark tutorial – reading a Sequence File from HDFS

This extends “Spark submit – reading a file from HDFS”. A SequenceFile is a flat file consisting of binary key/value pairs. It is used extensively in MapReduce as an input/output format. Like CSV files, SequenceFiles do not store metadata, hence the only schema evolution supported is appending new fields to the end…
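The "flat file of binary key/value pairs" idea, and why appending fields at the end is the one safe schema change, can be sketched with the JDK alone. This is a simplified stand-in and not Hadoop's SequenceFile on-disk format; a real job would use Hadoop's `SequenceFile.Writer` and `SequenceFile.Reader`. The class name, the `Integer` key, and the comma-separated value layout are all assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; NOT the Hadoop SequenceFile format.
final class KeyValueFlatFile {

    // Writes each record as a binary int key followed by a UTF string value.
    static byte[] write(Map<Integer, String> records) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (Map.Entry<Integer, String> e : records.entrySet()) {
                out.writeInt(e.getKey());
                out.writeUTF(e.getValue());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    // Reads key/value pairs back in the order they were written.
    static Map<Integer, String> read(byte[] data) {
        Map<Integer, String> records = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            while (in.available() > 0) {
                int key = in.readInt();
                records.put(key, in.readUTF());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return records;
    }

    // Values carry no field names, so fields are identified by position only.
    static String[] fields(String value) {
        return value.split(",", -1);
    }

    public static void main(String[] args) {
        Map<Integer, String> records = new LinkedHashMap<>();
        records.put(1, "Alice,Sydney");      // original layout: name,city
        records.put(2, "Bob,Perth,2016");    // newer layout: a field appended at the end
        for (Map.Entry<Integer, String> e : read(write(records)).entrySet()) {
            String[] f = fields(e.getValue());
            // An older reader that only knows the first two positions still works.
            System.out.println(e.getKey() + " -> name=" + f[0] + ", city=" + f[1]);
        }
    }
}
```

Because the reader addresses fields by position, inserting or reordering fields would silently shift every value, while a field appended at the end leaves the existing positions untouched.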

Posted in member-paid, Spark Tutorials

02: ♥ Spark tutorial – reading a file from HDFS

This extends “Spark tutorial – writing a file from a local file system to HDFS”. This tutorial assumes that you have set up Cloudera as per the “cloudera quickstart vm tutorial installation” YouTube videos, which you can search for on Google or YouTube. You can install it on VMware (non-commercial use) or

Posted in Spark Tutorials

01: ♥ Spark tutorial – writing a file from a local file system to HDFS

This tutorial assumes that you have set up Cloudera as per the “cloudera quickstart vm tutorial installation” YouTube videos, which you can search for on Google or YouTube. You can install it on VMware (non-commercial use) or on VirtualBox. I am using VMware. Cloudera requires at least 8GB RAM and 16GB is…

Posted in member-paid, Spark Tutorials