Installing & getting started with Cloudera QuickStart on VMWare for Windows in 17 steps

Prerequisites: At least 12 GB of RAM (i.e. 4 GB for the operating system and 8 GB for Cloudera), although 16 GB or more is preferred. 80 GB of hard disk space. A 64-bit Windows system. The Cloudera QuickStart VM runs on CentOS, which is a community-supported Linux distribution.
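The prerequisites above can be turned into a quick self-check. The following is only a minimal sketch: the function name and thresholds are illustrative, it assumes the 80 GB figure refers to free disk space, and the RAM value has to be supplied by hand (for example from “msinfo32.exe”), because the Python standard library has no portable call for total RAM.

```python
import platform
import shutil

# Thresholds taken from the prerequisites above.
MIN_RAM_GB = 12   # 4 GB for Windows + 8 GB for the Cloudera VM
MIN_DISK_GB = 80  # assumption: interpreted here as free disk space


def check_prerequisites(total_ram_gb, free_disk_gb, is_64bit):
    """Return a list of problems; an empty list means the machine qualifies."""
    problems = []
    if not is_64bit:
        problems.append("a 64-bit system is required")
    if total_ram_gb < MIN_RAM_GB:
        problems.append(f"need at least {MIN_RAM_GB} GB RAM, found {total_ram_gb:.0f} GB")
    if free_disk_gb < MIN_DISK_GB:
        problems.append(f"need at least {MIN_DISK_GB} GB disk, found {free_disk_gb:.0f} GB")
    return problems


if __name__ == "__main__":
    # platform.machine() typically reports "AMD64" on 64-bit Windows.
    is_64bit = platform.machine().endswith("64")
    free_disk_gb = shutil.disk_usage(".").free / 1024**3
    total_ram_gb = 16  # hypothetical value; read yours from msinfo32.exe
    issues = check_prerequisites(total_ram_gb, free_disk_gb, is_64bit)
    print("OK" if not issues else "\n".join(issues))
```

An empty result only means the stated minimums are met; the VM will still feel sluggish close to the 12 GB limit.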

Install VMWare for Windows

Step 1: Download VMWare Workstation Player for Windows from https://my.vmware.com/web/vmware/free by selecting VMWare Workstation Player. The URL used for this download was “https://my.vmware.com/web/vmware/free#desktop_end_user_computing/vmware_workstation_player/14_0”.

Step 2: Install VMWare by double-clicking the downloaded “.exe” file. Restart Windows after installing.

VMWare workstation player

Install Cloudera for VMWare

Step 1: Download the Cloudera QuickStart VM for VMWare (e.g. cloudera-quickstart-vm-5.12.0-0-vmware) from “https://www.cloudera.com/downloads/quickstart_vms/5-12.html”.

Cloudera Quickstart for VMWare

Fill in the form, mark the purpose as self-learning, and download the “zip” file. The download will take some time.

Cloudera quickstart VMWare zip file

Step 2: Extract the downloaded “cloudera-quickstart-vm-5.12.0-0-vmware.zip”.
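If you prefer to script the extraction rather than use Windows Explorer, a minimal sketch with Python's standard zipfile module could look like this. The function name and the target folder in the usage comment are hypothetical; only the archive name comes from the step above.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, target_dir):
    """Extract zip_path into target_dir and return any .vmx files found there."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    # The .vmx file is the one VMWare Workstation Player opens later.
    return sorted(target.rglob("*.vmx"))


# Example usage (paths are illustrative):
# vmx_files = extract_archive("cloudera-quickstart-vm-5.12.0-0-vmware.zip", "C:/vms/cloudera")
# print(vmx_files)
```

Returning the located “.vmx” paths saves hunting for the file when opening the virtual machine in the next steps.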

Step 3: Open the installed VMWare Workstation Player, which is free for non-commercial use such as learning Cloudera.

Step 4: Open a virtual machine by selecting the previously downloaded and extracted file “cloudera-quickstart-vm-5.12.0-0-vmware.vmx” as shown below.

Open existing Cloudera VM

Step 5: Click on edit settings and allocate at least 2 CPU cores and 8 GB of RAM to the VM. This leaves about 4 GB of RAM for the Windows operating system on a 12 GB machine. You can check your Windows system information via Start -> Run by typing “msinfo32.exe”.

Minimum 2 cores and 8GB RAM
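The settings dialog is the normal way to change these values. For reference, the same allocation is stored as plain text in the “.vmx” file, where `memsize` (in MB) and `numvcpus` are the keys VMWare uses for memory and virtual CPU count. The sketch below is an assumption-laden illustration of editing that text while the VM is powered off; the helper name `set_vmx_option` is hypothetical, and you should back up the file first.

```python
import re


def set_vmx_option(vmx_text, key, value):
    """Return vmx_text with `key = "value"` set, replacing an existing entry if any."""
    # Match the whole existing line for this key (keys are matched case-insensitively).
    pattern = re.compile(
        rf'^[ \t]*{re.escape(key)}[ \t]*=[^\r\n]*',
        re.IGNORECASE | re.MULTILINE,
    )
    new_line = f'{key} = "{value}"'
    if pattern.search(vmx_text):
        return pattern.sub(lambda match: new_line, vmx_text, count=1)
    # Key not present: append it on its own line.
    separator = "" if (not vmx_text or vmx_text.endswith("\n")) else "\n"
    return vmx_text + separator + new_line + "\n"


# Example: 8 GB of RAM (8192 MB) and 2 virtual CPUs, as in Step 5.
sample = 'displayName = "cloudera-quickstart-vm"\nmemsize = "4096"\n'
sample = set_vmx_option(sample, "memsize", 8192)
sample = set_vmx_option(sample, "numvcpus", 2)
print(sample)
```

The function works on a string rather than a file so that the result can be inspected before anything is written back to disk.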

Step 6: Click on play.

Cloudera on VMWare

Step 7: Once it has started, you will see the following screen.

Cloudera Quickstart on VMWare

Step 8: Click on “Launch Cloudera Express”.

Launch Cloudera Express

Step 9: Log in to Cloudera Manager with “cloudera” as both the username and the password.

Cloudera Manager Login

Step 10: Start all the services.

Start Cloudera Services

Step 11: Start Hue, which is a web interface for HDFS, HBase, Spark UI, Hive, etc.

Cloudera Services started and click on Hue

Step 12: Log in to Hue with “cloudera/cloudera”. The Hadoop Distributed File System (i.e. HDFS) is shown below.

Hue showing HDFS
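If a login page does not load, it can help to check whether the service is accepting connections at all. The sketch below assumes it is run inside the VM and that Cloudera Manager and Hue listen on their usual default ports, 7180 and 8888. Neither port is stated in this tutorial, so adjust them if your setup differs.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved host names.
        return False


if __name__ == "__main__":
    # Assumed default ports; not taken from the steps above.
    services = {"Cloudera Manager": 7180, "Hue": 8888}
    for name, port in services.items():
        state = "reachable" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} (port {port}): {state}")
```

An open port only shows that something is listening. Services can take a few minutes after “Start all the services” before their web pages respond.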

Step 13: Log out of Hue and Cloudera Manager, and then shut down the VM.

Logout & Shutdown the VM

