Cloudera Manager (CM) 6.0 introduces a new Swagger-based Python API client, cm_client. The new client supports all CM API versions.
The older Python client will still be supported for API versions less than 30, so it can still be used
against Cloudera Manager 6.0 and later as long as API version 19 or earlier is used.
For example, a customer can use the old CM API client version 5.14 against CM 6.0,
which by default invokes API version 19. Customers who want to use features
introduced in Cloudera Manager 6.0 (API version 30), such as Fine-Grained Access Control,
must use this new API client.
The older Python client and the new Swagger-based Python client can coexist in an application,
allowing an incremental transition to the new Swagger-based client.
To install the Python API client, simply:
$ sudo pip install cm_client
If your system does not have pip, you can get it from your distro:
$ sudo apt-get install python-pip
## ... or use `yum install` if you are on CentOS
Alternatively, you can also install from source:
$ wget http://archive.cloudera.com/cm6/6.3.0/generic/jar/cm_api/cloudera-manager-api-swagger-6.3.0.tar
$ tar xvf cloudera-manager-api-swagger-6.3.0.tar
$ cd swagger/python/
$ sudo python setup.py install
Here is the latest SDK doc, for API version 33 (CM 6.3.0).
Each subsection continues from the previous one.
TLS configuration can be specified in cm_client.configuration using the verify_ssl and ssl_ca_cert parameters.
Inspecting a Service
Now that we have identified a CDH 6 cluster, find the HDFS service:
Inspect the HDFS service health and status:
Inspecting a Role
Find the NameNode and get basic info:
Similar to the service example, roles also expose their health checks.
First we look at what metrics are available:
Reading a metric: Suppose we are interested in the files_total and
dfs_capacity_used metrics, over the last 30 minutes.
This example uses the new-style /cm/timeseries endpoint (which uses
tsquery) to get metric data points.
Even though the example is querying HDFS metrics, the processing logic is the
same for all queries.
The old-style .../metrics endpoints (which exist under host, service,
and role objects) are mostly useful for exploring which metrics are available.
Service Lifecycle and Commands
Restart HDFS. Start and stop work similarly:
Here is an example of a wait() method that polls an asynchronous command, such as restart, until it completes.
Restart the NameNodes. Commands on roles are issued through the RoleCommands endpoint
under the service and can be issued in bulk.
Configuring Services and Roles
First, let's look at all possible service configs. For legacy reasons, the response
carries both the service configs and a (typically empty) set of role_type_configs (as of API v3).
Now let’s change dfs_replication to 2. We use “dfs_replication” and not
“dfs.replication” because we must match the keys of the config view. This is
also the same value as ApiConfig.name.
Configuring roles is done similarly. Normally you want to modify groups instead
of modifying each role one by one.
First, find the group(s).
See all possible role configurations. They are the same for all groups of the same
role type in clusters with the same CDH version.
Let’s configure our data nodes to auto-restart:
To reset a config to default, pass in a value of None:
These examples cover how to get a new parcel up and running on
a cluster. Normally you would pick a specific parcel repository
and parcel version you want to install.
Add a CDH parcel repository. Note that in CDH 4, Impala and Solr are
in separate parcels. They are included in the CDH 5 parcel.
These examples require v5 of the CM API or higher.
Download the parcel to the CM server:
Distribute the parcel so all agents on that cluster have a local copy of the parcel.
Activate the parcel so services pick up the new binaries upon next restart:
Restart your cluster to pick up the new parcel:
These examples cover how to export and import a cluster template.
These examples require v30 of the CM API or higher.
Import the following modules:
Export the cluster template as a json file:
Make the required changes in the template file, either manually or using the Python API. You need to map the hosts in the target cluster to the right host templates and provide values for all the variables, such as database information, for the target cluster.
Invoking import cluster template on the target cluster:
You can use the returned command to track progress, either from the command details page in the UI
or by waiting on the command with the wait() method mentioned above.
Project maintained by Cloudera,
and released under Apache License v2.
Contributions, bug reports and feature suggestions welcome.
Please post any questions to the
user mailing list
or the community forum.