Blog Archives

01a: Convert XML file To Sequence File – writing & reading – Local File System

Sequence files are well suited to storing raw data in HDFS. They are compressible and splittable. They are also useful for combining a number of smaller files into a single sequence file of, say, 64MB or larger, as HDFS is better suited to larger files. …


01b: Convert XML file To Sequence File – writing & reading – Hadoop File System (i.e. HDFS)

This extends “Convert XML file To Sequence File – writing & reading – Local File System”. Step 1: Upload “report.xml” to HDFS, e.g. using Cloudera Hue, to the path “/user/cloudera/report-data”. You need to create the “report-data” folder. Step 2: Change the code to read...


02: Convert XML file To Sequence File with Apache Spark – writing & reading

This extends “Convert XML file To Sequence File” with Hadoop libraries by using Apache Spark. Step 1: The pom.xml file should include the Apache Spark libraries as shown below. Step 2: The XML file report.xml. Step 3: The Java class “ …


03: Convert XML file To an Avro File – writing & reading

This extends “Convert XML file To Sequence File” with Hadoop libraries. Avro files are schema driven & support schema evolution, which means you can add new columns & modify existing columns. Step 1: The pom.xml file should include the Apache Spark libraries as shown below. …


04: Convert XML file To an Avro File with Apache Spark – writing & reading

This extends “Convert XML file To an Avro File – writing & reading”. Step 1: The pom.xml file should include the Apache Spark & Avro libraries as shown below. Step 2: The report.xml file under “src/main/resources/data”. …


05: Convert XML file To an Avro File with avro-maven-plugin & Apache Spark

This extends “04: Convert XML file To an Avro File with Apache Spark – writing & reading”. Instead of using the GenericRecord, let’s generate an Avro schema object from the Avro schema. Step 1: The pom.xml file should include the Apache Spark & …


06: Avro Schema evolution tutorial

Q1. What do you understand by the term “Avro schema evolution”?
A1. Schema evolution is the term used for how the store behaves when the Avro schema is changed after data has been written to the store using an older version of that schema.
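
To make the idea concrete, here is a minimal plain-Java sketch of how a reader with a newer schema might resolve a record written with an older one. It does not use the Avro library. The class `SchemaEvolutionSketch`, the method `resolve`, and the field names `id`, `name` and `email` are all hypothetical, and the reader schema is modelled as a simple map of field name to default value. The rules it models are: fields the reader does not know are dropped, and fields missing from the written record are filled from the reader's defaults.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java model of Avro-style schema resolution. This is NOT the Avro API;
// all names here are hypothetical and exist only for illustration.
class SchemaEvolutionSketch {

    // readerFieldsWithDefaults: field name -> default value.
    // Simplification: a null default means "no default declared".
    static Map<String, Object> resolve(Map<String, Object> writtenRecord,
                                       Map<String, Object> readerFieldsWithDefaults) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : readerFieldsWithDefaults.entrySet()) {
            String name = field.getKey();
            if (writtenRecord.containsKey(name)) {
                // Field exists in both versions: keep the written value.
                result.put(name, writtenRecord.get(name));
            } else if (field.getValue() != null) {
                // Field was added after the data was written: use the default.
                result.put(name, field.getValue());
            } else {
                // A new field without a default cannot be resolved.
                throw new IllegalStateException(
                        "Field '" + name + "' is missing and has no default");
            }
        }
        // Fields present only in the written record are silently dropped.
        return result;
    }

    public static void main(String[] args) {
        // Record written with the older schema (id, name).
        Map<String, Object> v1Record = new LinkedHashMap<>();
        v1Record.put("id", 1);
        v1Record.put("name", "report");

        // Newer reader schema adds "email" with a default value.
        Map<String, Object> v2Reader = new LinkedHashMap<>();
        v2Reader.put("id", null);
        v2Reader.put("name", null);
        v2Reader.put("email", "unknown");

        // prints {id=1, name=report, email=unknown}
        System.out.println(resolve(v1Record, v2Reader));
    }
}
```

The point of the sketch is that adding a field is only safe for old data when the new field carries a default; without one, the older records can no longer be read with the newer schema.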
