Blog Archives

07: Q62 – Q70 HDFS blocks Vs. splits & Spark partitions Interview Questions & Answers

Q62. Can you explain the difference between HDFS blocks and input splits? A62. A block is a physical representation of data, whereas a split is a logical division of your data or records. For example, an input split might split a large text file at the end of a record...
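
The block-versus-split distinction in A62 can be illustrated with a small, simplified sketch. It is not Hadoop code: the function names `block_ranges` and `records_for_split` are hypothetical, records are assumed to be newline-delimited, and a record is assumed to belong to the split that contains its first byte. Blocks cut the bytes at fixed offsets, while each split is adjusted to whole records.

```python
def block_ranges(data_len, block_size):
    """Physical view: fixed-size byte ranges, ignoring record boundaries."""
    return [(start, min(start + block_size, data_len))
            for start in range(0, data_len, block_size)]


def records_for_split(data, start, end):
    """Logical view: whole records whose first byte lies in [start, end)."""
    if start == 0:
        pos = 0
    else:
        # Move to the first record that begins at or after `start`.
        # A record begins right after a newline, so search from start - 1.
        newline = data.find(b"\n", start - 1)
        pos = len(data) if newline == -1 else newline + 1

    records = []
    while pos < end and pos < len(data):
        newline = data.find(b"\n", pos)
        if newline == -1:
            # Last record has no trailing newline.
            records.append(data[pos:])
            pos = len(data)
        else:
            # May read past `end` to finish a record that straddles blocks.
            records.append(data[pos:newline])
            pos = newline + 1
    return records


data = b"alpha\nbravo\ncharlie\ndelta\n"
blocks = block_ranges(len(data), block_size=10)
splits = [records_for_split(data, start, end) for start, end in blocks]

for (start, end), recs in zip(blocks, splits):
    print(f"block bytes [{start}, {end}) -> split records {recs}")
```

In this sketch the record `bravo` straddles the first block boundary, yet it is read in full by the first split and skipped by the second, so every record is processed exactly once.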



11: Q88 – Q91 Read-Write Vs Append-Only File Systems

Q88. How will you modify a portion of an HDFS file? A88. HDFS is an “append-only” file system. The most common use case of Hadoop data ingestion is to append new sets of event-based and/or sub-transactional data. Large data processing applications are typically built around the premise that things...



12: Q92 – Q97 Hadoop file formats and how to choose

Q92. What are the criteria for choosing storage file formats in Hadoop? A92. Choosing the wrong file format can significantly increase query times and storage space. Choosing a format that does not support flexible schema evolution may require massive re-processing just to add a new field. …



18: Q121 – Q124 Choosing between HDFS & AWS S3 for BigData projects interview Q&As

More and more organisations are adopting a “cloud-first architecture” policy, in which cloud-based storage such as AWS S3 plays a major role. A data lake is a storage repository that holds a vast amount of structured, semi-structured, and unstructured raw data in its native format (aka pristine condition). …



More HDFS & NameNode interview Q&As

Q1. What are the different ways to access files or data stored in HDFS? A1. You can access files or data stored in HDFS in many different ways. HDFS Command-Line Interface: The HDFS command-line interface (CLI), called the hdfs (or hadoop) shell, …


