Understanding Cloudera Hadoop users

Step 1: A number of special users are created by default when you install and use CDH and Cloudera Manager. For example:

Unix user id: hdfs
groups: hdfs hadoop

Unix user id: spark
groups: spark

Unix user id: hive
groups: hive

and so on.
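To check which of these accounts exist on a given host, the standard `id` utility can be used. This is a minimal sketch: the function name `show_service_users` is made up for illustration, and the three user names are simply the ones listed above.

```shell
# Print uid, primary group and all groups for each account name passed in.
# Accounts that do not exist on this host are reported rather than treated as errors.
show_service_users() {
  for user in "$@"; do
    if id "$user" >/dev/null 2>&1; then
      id "$user"
    else
      echo "$user: no such user on this host"
    fi
  done
}

show_service_users hdfs spark hive
```

On a host with CDH installed, each line should show the groups listed above; on any other machine the fallback message is printed instead.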

Cloudera Manager

The Cloudera Manager processes run under the Unix user “cloudera-scm” and the group “cloudera-scm”. The Cloudera Manager Server is a central server that hosts the web UI and the application logic for managing CDH (i.e. Cloudera Distribution including Hadoop). Everything related to installing CDH, configuring services, and starting and stopping services is managed by the Cloudera Manager Server.

The Cloudera Manager Agents are installed on every managed host. They are responsible for starting and stopping Linux processes, unpacking configurations, triggering various installation paths, and monitoring the host.
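One way to see which processes actually run under the “cloudera-scm” account is to ask `ps` for everything owned by that user. This is only a sketch: the function name `show_cm_processes` is hypothetical, and which Cloudera Manager processes appear depends on how the host was set up.

```shell
# List processes owned by the cloudera-scm account, if that account exists here.
show_cm_processes() {
  if id cloudera-scm >/dev/null 2>&1; then
    # pid and full command line of every process whose effective user is cloudera-scm
    ps -u cloudera-scm -o pid,args || echo "no processes currently owned by cloudera-scm"
  else
    echo "cloudera-scm: no such user on this host"
  fi
}

show_cm_processes
```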

Unix Terminal on VMware

Open a command line by clicking the terminal window icon or via “Applications –> System Tools –> Terminal”.

The default user is “cloudera”.

You can list the “groups” as shown below:

or as
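As a sketch, the standard `groups` and `id` utilities both report group membership for the current user. That these are the two commands meant above is an assumption.

```shell
groups    # group names for the current user
id        # uid, gid and every group, with numeric ids
id -Gn    # group names only, on one line
```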

Switch to “root” user with

Switch back to “cloudera” user with
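The usual commands for this are `su -` (or `sudo su -`) to become root and `exit` to return. Whether the original used exactly these is an assumption, as is the “cloudera” user having sudo rights on the VM. Because those commands are interactive, the sketch below lists them in comments and runs only a non-interactive check.

```shell
# Interactive use in the terminal (each command opens or closes a shell):
#   su -        switch to root; prompts for the root password
#   sudo su -   same, using sudo rights instead of the root password
#   exit        leave the root shell and return to the previous user

# Non-interactive check of which user a command runs as:
id -un
if command -v sudo >/dev/null 2>&1 && sudo -n true 2>/dev/null; then
  sudo -n id -un    # runs as root when passwordless sudo is available
else
  echo "passwordless sudo not available; use 'su -' interactively"
fi
```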

Step 2: You can list HDFS files as
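A minimal sketch of such a listing, assuming the `hdfs` client is on the PATH and that the root of HDFS is the directory of interest. The function name `list_hdfs_root` is made up for illustration, and the guard only keeps the snippet harmless on a machine without Hadoop.

```shell
# List the top-level directories of HDFS with owner, group and permissions.
list_hdfs_root() {
  if command -v hdfs >/dev/null 2>&1; then
    hdfs dfs -ls /
  else
    echo "hdfs client not found on this host"
  fi
}

list_hdfs_root
```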

You can recursively list all the files in HDFS as the “cloudera” user as shown below:
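The recursive form adds the `-R` flag to `hdfs dfs -ls`. As a sketch, again assuming the `hdfs` client is on the PATH (the function name `list_hdfs_recursive` is hypothetical):

```shell
# Recursively list everything under the given HDFS path (defaults to the root).
list_hdfs_recursive() {
  if command -v hdfs >/dev/null 2>&1; then
    hdfs dfs -ls -R "${1:-/}"
  else
    echo "hdfs client not found on this host"
  fi
}

list_hdfs_recursive /
```

Directories that the current user cannot read are reported as permission errors rather than listed.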

Step 3: Open the Hue UI in a web browser, go to “Files”, and inspect the HDFS folders and users in the “/user” folder as shown below.

HDFS folders with user, group, and permissions

You can change permissions by selecting a folder, opening the “Actions” drop-down, and choosing “Change permissions”.
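The same change can be made from the terminal with `hdfs dfs -chmod`. In this sketch the folder `/user/cloudera/perm_demo` and the mode `750` are hypothetical values chosen for illustration, and the `hdfs` client is assumed to be on the PATH.

```shell
# Create a scratch folder, change its permissions, and show the result.
change_demo_permissions() {
  demo_dir=/user/cloudera/perm_demo    # hypothetical path, used only for illustration
  if command -v hdfs >/dev/null 2>&1; then
    hdfs dfs -mkdir -p "$demo_dir"     # create the folder if it is missing
    hdfs dfs -chmod 750 "$demo_dir"    # rwx for owner, r-x for group, none for others
    hdfs dfs -ls -d "$demo_dir"        # show the folder entry itself with its new mode
  else
    echo "hdfs client not found on this host"
  fi
}

change_demo_permissions
```

Only the owner of a folder or an HDFS superuser can change its permissions, so run this as the user who owns the target path.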

Step 4: Superusers are defined by the “dfs.permissions.superusergroup” property in hdfs-site.xml, which names the UNIX group whose members HDFS treats as superusers. The default is “supergroup” when installing with Cloudera Manager, and it can be changed in the Cloudera Manager UI as shown below.

Cloudera supergroup setting
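The effective value can also be read from the command line with `hdfs getconf`, and `hdfs groups` shows which groups HDFS resolves for a user. This is a sketch that assumes the `hdfs` client is on the PATH and picks up the cluster's client configuration; the function name `show_superuser_group` is made up for illustration.

```shell
# Print the configured HDFS superuser group and the groups HDFS sees for the current user.
show_superuser_group() {
  if command -v hdfs >/dev/null 2>&1; then
    hdfs getconf -confKey dfs.permissions.superusergroup
    hdfs groups "$(id -un)"
  else
    echo "hdfs client not found on this host"
  fi
}

show_superuser_group
```

If the group printed by the first command appears in the second command's output, the current user is treated as an HDFS superuser.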

