An improved upgrade wizard in Cloudera Manager 5.3 makes it easy to upgrade CDH on your clusters.
Upgrades can be hard, and any downtime to mission-critical workloads can have a direct impact on revenue. Upgrading the software that powers these workloads can often be an overwhelming and uncertain task that can create unpredictable issues. Apache Hadoop can be especially complex as it consists of dozens of components running across multiple machines. That’s why an enterprise-grade administration tool is necessary for running Hadoop in production, and is especially important when taking the upgrade plunge.
Cloudera Manager makes it easy to upgrade to the latest version of CDH. Not only does Cloudera Manager have a built-in upgrade wizard to make your CDH upgrades simple and predictable, it also features rolling-restart capability that enables zero-downtime upgrades under certain conditions.
This post illustrates how to leverage Cloudera Manager to upgrade your Cloudera cluster using the upgrade wizard, and also highlights some of the new features in Cloudera Enterprise 5.3.
Why a Wizard?
Upgrading can involve many steps that can depend on the services installed and the start/end versions. A wizard to upgrade across major versions (CDH 4 to CDH 5) has been available since Cloudera Manager 5. Cloudera Manager 5.3 introduces an enhanced CDH upgrade wizard that adds support for minor (CDH 5.x to CDH 5.y) and maintenance (CDH 5.b.x to CDH 5.b.y) version upgrades. The enhanced upgrade wizard performs service-specific upgrade steps that you would have had to run manually in the past.
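To make the three upgrade categories concrete, here is a minimal sketch that classifies an upgrade from its start and end version strings. The function names and the dotted `major.minor.maintenance` format are assumptions made for illustration; this is not code from Cloudera Manager.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a dotted version such as '5.3.1' into (major, minor, maintenance).

    Missing components are treated as 0, so '5.3' becomes (5, 3, 0).
    """
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (3 - len(parts))
    return (parts[0], parts[1], parts[2])


def classify_upgrade(current: str, target: str) -> str:
    """Return 'major', 'minor', or 'maintenance' for an upgrade path."""
    cur, tgt = parse_version(current), parse_version(target)
    if tgt <= cur:
        raise ValueError("target version must be newer than current version")
    if tgt[0] != cur[0]:
        return "major"        # e.g. CDH 4 to CDH 5
    if tgt[1] != cur[1]:
        return "minor"        # e.g. CDH 5.x to CDH 5.y
    return "maintenance"      # e.g. CDH 5.b.x to CDH 5.b.y


if __name__ == "__main__":
    for start, end in [("4.7.1", "5.3.0"), ("5.2.0", "5.3.0"), ("5.3.0", "5.3.1")]:
        print(start, "->", end, ":", classify_upgrade(start, end))
```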
The CDH upgrade wizard supports both parcel and package installations. Parcels are the recommended option: Cloudera Manager installs parcels for you, whereas packages must be installed manually. Consider moving from packages to parcels so that the process is more automated, supports rolling upgrades, and provides an easier upgrade experience. (See this FAQ and this blog post to learn more about parcels.)
If you use parcels, have a Cloudera Enterprise license, and have enabled HDFS high availability, you can perform a rolling upgrade for non-major upgrades. This enables you to upgrade your cluster software and restart the upgraded services without incurring any cluster downtime. Note that it is not possible to perform a rolling upgrade from CDH 4 to CDH 5 because of incompatibilities between the two major versions.
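The conditions for a rolling upgrade can be summarized as a simple checklist. The sketch below is illustrative only: the function name and its parameters are hypothetical, and the wizard performs its own checks internally.

```python
from typing import List


def rolling_upgrade_blockers(
    upgrade_type: str,
    uses_parcels: bool,
    has_enterprise_license: bool,
    hdfs_ha_enabled: bool,
) -> List[str]:
    """Return the reasons a rolling upgrade is unavailable (empty if none).

    upgrade_type is one of 'major', 'minor', or 'maintenance'.
    """
    blockers = []
    if upgrade_type == "major":
        blockers.append("major upgrades (CDH 4 to CDH 5) cannot be rolling")
    if not uses_parcels:
        blockers.append("the cluster must use parcels, not packages")
    if not has_enterprise_license:
        blockers.append("a Cloudera Enterprise license is required")
    if not hdfs_ha_enabled:
        blockers.append("HDFS high availability must be enabled")
    return blockers


if __name__ == "__main__":
    problems = rolling_upgrade_blockers("minor", True, True, False)
    if problems:
        print("Rolling upgrade not available:")
        for reason in problems:
            print(" -", reason)
    else:
        print("Rolling upgrade is available.")
```

Returning a list of reasons rather than a single boolean makes it clear which prerequisite still needs attention.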
Running the Upgrade Wizard
The Cloudera Manager version must always be equal to or greater than the CDH version you upgrade to. For example, to upgrade to CDH 5.3, you must be on Cloudera Manager 5.3 or higher.
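As a quick illustration of this rule, the following sketch compares a Cloudera Manager version with a target CDH version. It assumes the comparison is made on the major and minor components only, which matches the example above but is an assumption; the helper names are hypothetical.

```python
from typing import Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a dotted version string such as '5.3.1'."""
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor)


def manager_supports_target(cm_version: str, cdh_target: str) -> bool:
    """True if Cloudera Manager is at the same or a newer major.minor than the CDH target.

    Assumption: maintenance components are ignored in the comparison.
    """
    return major_minor(cm_version) >= major_minor(cdh_target)


if __name__ == "__main__":
    print(manager_supports_target("5.3.0", "5.3.0"))  # Cloudera Manager 5.3 with CDH 5.3
    print(manager_supports_target("5.2.1", "5.3.0"))  # Cloudera Manager 5.2 with CDH 5.3
```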
- Log in to the Cloudera Manager Admin Console.
- To access the wizard, on the Home page, click the cluster’s drop-down menu, and select “Upgrade Cluster.”
- Alternatively, you can trigger the wizard from the Parcels page by first downloading and distributing the parcel you want to upgrade to, and then clicking the “Upgrade” button for that parcel.
- When starting from the cluster’s Upgrade option, if the option to pick between packages and parcels is provided, click the “Use Parcels” radio button. Select the CDH version.
If no qualifying parcels are listed, add the location of the parcel repository under “Parcel Configuration Settings.”
- The wizard now prompts you to back up existing databases and provides examples of additional steps to prepare your cluster for the upgrade. Before proceeding, read the upgrade documentation for a more complete list of actions to take at this stage. Check “Yes” for all required actions to be able to “Continue.”
- The wizard now performs consistency and health checks on all hosts in the cluster. This feature is particularly helpful if you have mismatched versions of packages across cluster hosts. If any problems are found, you will be prompted to fix these before continuing.
- The selected parcel is downloaded and distributed to all hosts.
- For major upgrades, the wizard will warn that the services are about to be shut down for the upgrade.
For minor and maintenance upgrades, if you are using parcels and have HDFS high availability enabled, you will have the option to select “Rolling Upgrade” on this page. Supported services will undergo a rolling restart, while the rest will undergo a normal restart, with some downtime. Check “Rolling Upgrade” to proceed with this option.
Until this point, you can exit and resume the wizard without impacting any running services.
- The Command Progress screen displays the results of the commands run by the wizard as it shuts down all services, activates the new parcel, upgrades services, deploys client configuration files, and restarts services.
Among other things, the service commands upgrade HDFS metadata, upgrade the Apache Oozie database and install the ShareLib, and upgrade the Apache Sqoop server and the Hive Metastore.
- The Host Inspector runs to validate all hosts, as well as report CDH versions running on them.
- At the end of the wizard process, you are prompted to finalize the HDFS metadata upgrade. At this stage, refer to the upgrade documentation for additional steps that might be relevant to your cluster configuration and upgrade path.
For major (CDH 4 to CDH 5) upgrades, you have the option of importing your MapReduce configurations into your YARN service. Additional steps in the wizard will assist with this migration. On completion, Cloudera recommends that you review the YARN configurations for any additional tuning you might need.
- Your upgrade is now complete!
Cloudera Enterprise 5 provides additional enterprise-ready capabilities and marks the next step in the evolution of the Hadoop-based data management platform. Those enhancements deliver little value, however, if existing users cannot easily access the benefits of the enterprise data hub. That’s why Cloudera has placed an increased emphasis on the upgrade experience, to make it easier to upgrade to the latest version of the software. The team will continue to work on improving this experience.
To ensure the highest level of functionality and stability, consider upgrading to the most recent version of CDH.
Please refer to the upgrade documentation for more comprehensive details on using the CDH upgrade wizard. Also, register for the “Best Practices for Upgrading Hadoop in Production” webinar that will occur live on Feb. 12, 2015.
Jayita Bhojwani is a Software Engineer at Cloudera.
Vala Dormiani is a Product Manager at Cloudera.