Blog Archives

01a: Convert XML file To Sequence File – writing & reading – Local File System

Sequence files are good for saving raw data into HDFS. They are compressible and splittable. They are also useful for combining a number of smaller files into a single sequence file of, say, 64MB or larger, as HDFS is better suited to larger files. …
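To make the "many small files into one larger file" idea concrete, here is a minimal sketch in plain Java. It is not Hadoop's SequenceFile format and does not use the Hadoop libraries: the class `SmallFilePackerSketch` and its `pack`/`unpack` methods are hypothetical names, and the record layout (name, length, bytes) is an assumption chosen only to show file names acting as keys and file contents as values. Unlike a real sequence file, this toy container is neither compressed nor splittable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy illustration only: a hypothetical container, not the Hadoop SequenceFile format.
class SmallFilePackerSketch {

    // Writes each (file name, content) pair as one record: name, content length, content bytes.
    static byte[] pack(Map<String, byte[]> files) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                out.writeUTF(file.getKey());          // key: the original file name
                out.writeInt(file.getValue().length); // length prefix for the value
                out.write(file.getValue());           // value: the raw file content
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads the records back in the order they were written.
    static Map<String, byte[]> unpack(byte[] container) {
        Map<String, byte[]> files = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(container))) {
            while (in.available() > 0) {
                String name = in.readUTF();
                byte[] content = new byte[in.readInt()];
                in.readFully(content);
                files.put(name, content);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }

    public static void main(String[] args) {
        Map<String, byte[]> smallFiles = new LinkedHashMap<>();
        smallFiles.put("report-1.xml", "<report id=\"1\"/>".getBytes(StandardCharsets.UTF_8));
        smallFiles.put("report-2.xml", "<report id=\"2\"/>".getBytes(StandardCharsets.UTF_8));

        byte[] container = pack(smallFiles);
        for (Map.Entry<String, byte[]> file : unpack(container).entrySet()) {
            System.out.println(file.getKey() + " -> " + new String(file.getValue(), StandardCharsets.UTF_8));
        }
    }
}
```

In the posts themselves the Hadoop libraries do this work; the sketch only shows why one large keyed container is easier for HDFS to handle than many tiny files.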



01b: Convert XML file To Sequence File – writing & reading – Hadoop File System (i.e. HDFS)

This extends Convert XML file To Sequence File – writing & reading – Local File System. Step 1: Upload “report.xml” to HDFS, e.g. using Cloudera Hue, to the path “/user/cloudera/report-data”. You need to create the “report-data” folder first. Step 2: Change the code to read...



02: Convert XML file To Sequence File with Apache Spark – writing & reading

This extends Convert XML file To Sequence File with Hadoop libraries by using Apache Spark. Step 1: The pom.xml file should include the Apache Spark libraries as shown below. Step 2: The XML file report.xml. Step 3: The Java class “ …



03: Convert XML file To an Avro File – writing & reading

This extends Convert XML file To Sequence File with Hadoop libraries. Avro files are schema driven & support schema evolution, which means you can add new columns & modify existing columns. Step 1: The pom.xml file should include the Apache Spark libraries as shown below. …



04: Convert XML file To an Avro File with Apache Spark – writing & reading

This extends Convert XML file To an Avro File – writing & reading. Step 1: The pom.xml file should include the Apache Spark & Avro libraries as shown below. Step 2: The report.xml file under “src/main/resources/data”. …



05: Convert XML file To an Avro File with avro-maven-plugin & Apache Spark

This extends 04: Convert XML file To an Avro File with Apache Spark – writing & reading. Instead of using the GenericRecord, let’s generate an Avro schema object from the Avro schema. Step 1: The pom.xml file should include the Apache Spark & …



06: Avro Schema evolution tutorial

Q1. What do you understand by the term “Avro schema evolution”?
A1. Schema evolution is the term for how the store behaves when an Avro schema is changed after data has been written to the store using an older version of that schema.
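As a rough illustration of that behaviour, the sketch below models in plain Java what happens when a record written with an older schema is read with a newer one. It does not use the Avro library: `SchemaEvolutionSketch`, `resolve`, and the map-based "schema" (field name to default value) are hypothetical stand-ins. The two rules it applies mirror Avro's schema resolution: a field the reader expects but the writer never wrote takes the reader's default, and a field the writer wrote but the reader no longer declares is ignored.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of schema resolution; an assumption-laden sketch, not the Avro API.
class SchemaEvolutionSketch {

    // readerDefaults stands in for the newer (reader) schema: field name -> default value.
    // In real Avro a reader field may lack a default, in which case reading older data
    // that does not contain the field fails; this sketch assumes every field has one.
    static Map<String, Object> resolve(Map<String, Object> written, Map<String, Object> readerDefaults) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : readerDefaults.entrySet()) {
            String name = field.getKey();
            // Use the written value when the old record has the field, else the reader's default.
            result.put(name, written.containsKey(name) ? written.get(name) : field.getValue());
        }
        // Fields present only in the written record are dropped, since the reader does not declare them.
        return result;
    }

    public static void main(String[] args) {
        // Record written with version 1 of a hypothetical schema.
        Map<String, Object> v1Record = new LinkedHashMap<>();
        v1Record.put("id", 42);
        v1Record.put("name", "report");
        v1Record.put("legacyCode", "X1");

        // Version 2 adds "status" with a default and removes "legacyCode".
        Map<String, Object> v2Reader = new LinkedHashMap<>();
        v2Reader.put("id", 0);
        v2Reader.put("name", "");
        v2Reader.put("status", "UNKNOWN");

        System.out.println(resolve(v1Record, v2Reader)); // prints {id=42, name=report, status=UNKNOWN}
    }
}
```

Giving new fields a default is what keeps older data readable after the schema changes.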



