Installing CDH involves the following steps, from logging in to Cloudera Manager to setting up the cluster:
- Log in to Cloudera Manager
- Choose Version
- Adding hosts to cluster environment
- Selecting repository
- Installing Java
- Single User Mode
- SSH credentials for Cloudera Manager to log in to other hosts
- Cluster installation
Log in to Cloudera Manager
Cloudera Manager provides a web console for monitoring and performing actions on the cluster. We can access the console from a web browser.
The default username and password are both admin.
The first time you log in, you will be prompted to select the Cloudera Manager edition. We will install the Cloudera Enterprise Data Hub Edition Trial, which can be evaluated for 60 days. This gives us enough time to test all the features of the full version of Cloudera Manager. To obtain a Cloudera Manager license, you will need to contact Cloudera directly.
Adding hosts to cluster environment
In this step, we enter the hostnames or IP addresses of all the machines that are going to be part of the cluster. As shown in the following screenshot, you can enter all the addresses and click on Search to check whether they are available:
After you perform the search, all the machines will be listed as shown in the following screenshot along with the response time from each machine. Once you are satisfied with the results, select the required nodes and click on Continue.
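Before entering the addresses in the console, it can help to confirm from the Cloudera Manager machine that each host resolves and accepts connections. The following is a minimal sketch, not a description of how Cloudera Manager performs its own search: the function names `parse_hosts` and `check_host` are hypothetical, and checking TCP port 22 (the default SSH port) is an assumption about what "available" should mean here.

```python
import socket
import sys
import time


def parse_hosts(text):
    """Split a comma- or whitespace-separated host list, dropping duplicates."""
    hosts = []
    for token in text.replace(",", " ").split():
        if token not in hosts:
            hosts.append(token)
    return hosts


def check_host(host, port=22, timeout=3.0):
    """Try a TCP connection to host:port and return (reachable, seconds)."""
    start = time.monotonic()
    try:
        # Covers both name resolution and the connection attempt.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        # Resolution failures, refusals and timeouts are all OSError subclasses.
        return False, time.monotonic() - start


if __name__ == "__main__":
    # Hosts are taken from the command line; with no arguments nothing is checked.
    for host in parse_hosts(" ".join(sys.argv[1:])):
        ok, seconds = check_host(host)
        status = "reachable" if ok else "unreachable"
        print(f"{host}: {status} ({seconds * 1000:.0f} ms)")
```

Saved under a name of your choice, the script takes the same hostnames or IP addresses you plan to enter in the search box as command-line arguments.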
After you select the hosts, the “Cluster Installation” screen presents the following steps:
- Select Repository – Packages or Parcels
- Single User Mode
- SSH credentials for all other hosts
- Inspecting the hosts
- Successful installation of cluster
Select Repository – Packages or Parcels
You will be presented with a few options for performing the cluster installation, as shown in the following screenshot. The cluster installation is a five-step process. The installer provides two types of installation options: packages and parcels. Cloudera recommends the use of parcels. After selecting the required options, click on Continue.
Single User Mode
Note: Do not enable Single User Mode.
Cloudera Manager versions 5.3 and higher support single user mode, in which the Cloudera Manager Agent and all the processes run by the services it manages are started as a single configured user and group. This option applies to all the nodes in the cluster that are managed by Cloudera Manager. Single user mode prioritizes isolation between Hadoop and the rest of the system over isolation between the Hadoop processes running on the system. We suggest skipping the single user mode option.
SSH credentials for Cloudera Manager to log in to other hosts
In this step, we provide the credentials that Cloudera Manager needs to access the other nodes. We can use either the root account or another user with passwordless sudo access.
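If you plan to use a non-root user, you may want to verify beforehand that passwordless sudo really works on every node. The sketch below is one way to do that under stated assumptions: key-based SSH is already set up for that user from the machine running the script, and the `ssh` client is installed locally. The function names `build_sudo_check` and `has_passwordless_sudo` are hypothetical and are not part of Cloudera Manager.

```python
import subprocess
import sys


def build_sudo_check(user, host):
    """Build an ssh command that fails instead of prompting for any password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",     # never prompt for an SSH password or passphrase
        "-o", "ConnectTimeout=5",  # give up quickly on unreachable hosts
        f"{user}@{host}",
        "sudo", "-n", "true",      # -n: fail rather than prompt for a sudo password
    ]


def has_passwordless_sudo(user, host):
    """Return True if `sudo -n true` succeeds on the remote host."""
    result = subprocess.run(build_sudo_check(user, host), capture_output=True)
    return result.returncode == 0


if __name__ == "__main__":
    if len(sys.argv) < 3:
        print("usage: <script> USER HOST [HOST ...]")
    else:
        user = sys.argv[1]
        for host in sys.argv[2:]:
            verdict = "ok" if has_passwordless_sudo(user, host) else "FAILED"
            print(f"{user}@{host}: passwordless sudo {verdict}")
```

A FAILED result does not say why the check failed: the cause may be the SSH login itself or the sudo configuration, so run the printed command by hand on any host that fails.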
Once all the required files have been copied to all the nodes, Cloudera Manager inspects every host it manages to gather information on the following points:
- System time
- User and group configuration
- HDFS settings
- Component versions
The two most common issues identified at this stage are the transparent huge pages (THP) issue and the swappiness issue. Refer to the Prerequisites topic for how to avoid these issues.
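Both settings can be checked on a node before the inspection runs. The following sketch reads the standard Linux locations for them; it is not the inspector's own logic. The threshold `MAX_SWAPPINESS = 10` is an assumption, so substitute the value given in your Prerequisites topic, and note that some distributions expose THP under a different path. The function names are hypothetical.

```python
from pathlib import Path

SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")
THP_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")
MAX_SWAPPINESS = 10  # assumed threshold; use the value from your prerequisites


def parse_thp_mode(text):
    """Return the active THP mode, which the kernel marks with brackets."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_setting(path):
    """Return the stripped file content, or None if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def host_warnings(swappiness_text, thp_text):
    """Return a list of warnings for the two settings; None inputs are skipped."""
    warnings = []
    if swappiness_text is not None and int(swappiness_text) > MAX_SWAPPINESS:
        warnings.append(f"vm.swappiness is {swappiness_text} (above {MAX_SWAPPINESS})")
    if thp_text is not None:
        mode = parse_thp_mode(thp_text)
        if mode is not None and mode != "never":
            warnings.append(f"transparent huge pages are set to '{mode}'")
    return warnings


if __name__ == "__main__":
    found = host_warnings(read_setting(SWAPPINESS_PATH), read_setting(THP_PATH))
    for warning in found:
        print("WARNING:", warning)
    if not found:
        print("No swappiness or THP warnings found.")
```

Run the script on each node. A setting whose file cannot be read is skipped and produces no warning.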
Successful installation of cluster
At this stage, we are almost done with the installation. The next step is to continue installing the CDH components that are needed for your cluster.
So far, we have installed Cloudera Manager. Here, we will add one component after another to understand the architecture. In a production environment, we can add all the components together or one at a time, with appropriate planning.