Build or download the Llama distribution tarball, llama-1.0.0-cdh5.1.3-SNAPSHOT.tar.gz, and expand it.
Llama requires a local Hadoop client installed and configured to access the Hadoop cluster. The HADOOP_HOME environment variable must be defined.
Within the Llama AM root directory:
NOTE: These files have all configuration properties with their default values.
Llama AM leverages the Hadoop Yarn Fair Scheduler's user-to-queue mapping and queue ACL enforcement.
For Llama to work properly, Hadoop Yarn must be configured with the Fair Scheduler, and the fair-scheduler.xml configuration file used by Hadoop Yarn must be available in Llama AM's configuration directory or in a directory in Llama's CLASSPATH.
Llama AM, like the Hadoop Yarn Fair Scheduler, detects changes to fair-scheduler.xml and reloads the configuration without requiring a restart. For Llama to work properly, the fair-scheduler.xml files used by Hadoop Yarn and by Llama should be kept in sync.
Llama uses CPU-only containers (zero memory) or memory-only containers (zero vcores). Yarn must be configured to allow container allocations with a minimum memory of zero MB and a minimum of zero vcores.
Hadoop's yarn-site.xml must include the following 2 configuration properties:
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>0</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-vcores</name>
  <value>0</value>
</property>
The user running LlamaAM must be configured in Hadoop as a proxy user for itself.
Without Hadoop security enabled, the user running the LlamaAM server is the Unix user that starts the LlamaAM server.
With Hadoop security enabled, the user running the LlamaAM server is the short name of the Kerberos Principal used by the LlamaAM server (i.e. for a llama/HOSTNAME Kerberos principal, the short name is llama).
Hadoop's core-site.xml must include the following 2 configuration properties:
<property>
  <name>hadoop.proxyuser.#LLAMAUSER#.hosts</name>
  <value>#HOSTNAME_RUNNING_LLAMA#</value>
</property>
<property>
  <name>hadoop.proxyuser.#LLAMAUSER#.groups</name>
  <value>#GROUP_LLAMA_USER_BELONGS_TO#</value>
</property>
For example, if the user running the LlamaAM server is llama, the LlamaAM server is running on host foo.com, and the llama user belongs to the llamagroup group, the 2 configuration properties would be:
<property>
  <name>hadoop.proxyuser.llama.hosts</name>
  <value>foo.com</value>
</property>
<property>
  <name>hadoop.proxyuser.llama.groups</name>
  <value>llamagroup</value>
</property>
NOTE: For development and testing, if the values are set to *, the LlamaAM server can run on any host and there is no need to create a llamagroup.
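Continuing the example above, such a development and testing configuration would look like this:

<property>
  <name>hadoop.proxyuser.llama.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.llama.groups</name>
  <value>*</value>
</property>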
To start Llama:
$ bin/llama
2013-08-02 06:57:32,743 INFO Main - -----------------------------------------------------------------
2013-08-02 06:57:32,746 INFO Main - Java runtime version : 1.6.0_51-b11-457-11M4509
2013-08-02 06:57:32,747 INFO Main - Llama version : 1.0.0-cdh5.1.3-SNAPSHOT
2013-08-02 06:57:32,747 INFO Main - Llama built date : 2013-08-02T13:43Z
2013-08-02 06:57:32,747 INFO Main - Llama built by : tucu
2013-08-02 06:57:32,747 INFO Main - Llama revision : ba875da60c9865cceb70c352eb062f4fd1dfa309
2013-08-02 06:57:32,784 INFO Main - Hadoop version : 2.1.0-cdh5.1.3-SNAPSHOT
2013-08-02 06:57:32,784 INFO Main - -----------------------------------------------------------------
2013-08-02 06:57:32,784 INFO Main - Configuration directory: /Users/tucu/llama/conf
2013-08-02 06:57:32,878 INFO Main - Server: com.cloudera.llama.am.LlamaAMServer
2013-08-02 06:57:32,879 INFO Main - -----------------------------------------------------------------
2013-08-02 06:57:33,790 INFO LlamaAMThriftServer - Server listening on: 0.0.0.0:15000
2013-08-02 06:57:33,790 INFO LlamaAMThriftServer - Llama started!
Llama will run in the foreground.
To stop Llama, press CTRL-C in the terminal running Llama or send a kill to its PID; Llama will shut down gracefully on a SIGINT:
...
2013-08-02 07:06:28,434 INFO LlamaAMThriftServer - Llama started!
^C
2013-08-02 07:06:29,653 INFO LlamaAMThriftServer - Llama shutdown!
$
To configure Llama AM with security enabled (Thrift with Kerberos SASL), you need a running KDC, a Kerberos service principal (llama/HOSTNAME) and the corresponding keytab file.
The properties to configure in the llama-site.xml file are (shown with default values):
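A minimal sketch of what this block of llama-site.xml typically looks like, assuming the property names and default values shipped in the Llama distribution's template (verify against the llama-site.xml template of your version):

<!-- Example only: names and defaults assumed from the Llama distribution template -->
<property>
  <name>llama.am.server.thrift.security</name>
  <value>false</value>
</property>
<property>
  <name>llama.am.server.thrift.kerberos.keytab.file</name>
  <value>llama.keytab</value>
</property>
<property>
  <name>llama.am.server.thrift.kerberos.server.principal.name</name>
  <value>llama/localhost</value>
</property>
<property>
  <name>llama.am.server.thrift.kerberos.notification.principal.name</name>
  <value>impala</value>
</property>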
The localhost in the principal name must match the hostname in the Kerberos service principal.
If the path specified in llama.am.server.thrift.kerberos.keytab.file is a relative path, the keytab file will be expected in the Llama configuration directory.
When specifying the principal names, use the short name only; do not include the service hostname. The Thrift SASL implementation composes the complete service principal name (shortName/hostname).
The llama.am.server.thrift.security.QOP property indicates the quality of protection used when security is enabled. Valid values are auth (authentication only), auth-int (authentication and integrity) and auth-conf (authentication, integrity and confidentiality).
IMPORTANT: Authorization is only active when security is enabled.
Llama supports Access Control Lists (ACL) to restrict client and admin access.
An ACL is either the * wildcard or a comma-separated list of users and groups. Users are separated from groups by whitespace.
There are 2 ACL configuration properties, client and admin. The client ACL is applied to calls in the regular thrift service endpoint. The admin ACL is applied to calls in the admin thrift service endpoint.
ACLs are defined in the llama-site.xml configuration file. The ACL configuration properties and their default values are:
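A sketch of these properties, assuming the client/admin ACL property names used by the Llama AM Thrift server and the * default (verify against your llama-site.xml template):

<property>
  <name>llama.am.server.thrift.client.acl</name>
  <value>*</value>
</property>
<property>
  <name>llama.am.server.thrift.admin.acl</name>
  <value>*</value>
</property>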
Llama uses Hadoop's Groups class to retrieve user group information. The configuration for the Groups class must be set in the llama-site.xml configuration file; the default values are:
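For illustration, a sketch using the standard Hadoop group mapping properties and their customary defaults (assumed here, not taken from the Llama template):

<property>
  <name>hadoop.security.group.mapping</name>
  <value>org.apache.hadoop.security.ShellBasedUnixGroupsMapping</value>
</property>
<property>
  <name>hadoop.security.groups.cache.secs</name>
  <value>300</value>
</property>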
For alternate group mapping implementations and their configuration, refer to the Hadoop documentation.
Llama AM exposes a HTTP JSON JMX endpoint (it uses Hadoop's HTTP JSON JMX servlet).
The llama.am.server.thrift.http.jmx.address configuration property defines the address the JMX servlet is bound to; by default it is 0.0.0.0:15001.
The JMX servlet is available at /jmx, for example http://localhost:15001/jmx.
Llama implements client side gang scheduling by waiting for all resources of a gang reservation to be granted before notifying Impala.
To avoid deadlocks among multiple reservations waiting for resources held by each other, Llama implements the following anti-deadlock logic: if no new resources are allocated for any gang reservation within a configured amount of time, a backoff policy is triggered. The backoff policy transparently cancels random gang reservations until a configured percentage of canceled resources is reached. The canceled reservations are backed off for a random delay between a configured minimum and maximum. Once the random delay elapses, the reservation is automatically resubmitted.
The anti-deadlock detection logic is completely transparent to Llama clients.
The Llama configuration properties for anti-deadlock detection and their default values are:
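A sketch of these properties, assuming the llama.am.gang.anti.deadlock.* property names and defaults shipped with the Llama distribution (times in milliseconds, backoff percentage as an integer; verify against your version):

<!-- Example only: names and defaults assumed from the Llama distribution template -->
<property>
  <name>llama.am.gang.anti.deadlock.enabled</name>
  <value>true</value>
</property>
<property>
  <name>llama.am.gang.anti.deadlock.no.allocation.limit.ms</name>
  <value>30000</value>
</property>
<property>
  <name>llama.am.gang.anti.deadlock.backoff.percent</name>
  <value>30</value>
</property>
<property>
  <name>llama.am.gang.anti.deadlock.backoff.min.delay.ms</name>
  <value>10000</value>
</property>
<property>
  <name>llama.am.gang.anti.deadlock.backoff.max.delay.ms</name>
  <value>30000</value>
</property>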
Llama provides the llamaadmin command-line tool to release reservations, handles and queues.
usage:

  llamaadmin help : display usage for all commands or specified command

  llamaadmin release <OPTIONS> : release queues, handles or reservations
    -donotcache          do not cache resources of released resources
    -handles <arg>       client handles (comma separated)
    -llama <arg>         <HOST>:<PORT> of llama
    -queues <arg>        queues (comma separated)
    -reservations <arg>  reservations (comma separated)
    -secure              uses kerberos

  llamaadmin errorcodes : list error codes

  llamaadmin emptycache <OPTIONS> : empty cached resources not in use
    -allqueues           empty cache for all queues
    -llama <arg>         <HOST>:<PORT> of llama
    -queues <arg>        queues (comma separated)
    -secure              uses kerberos
If the -llama or -secure options are not specified, they are looked up in the llamaadmin-site.xml configuration file under the following configuration properties (default values shown):
The Llama server will only enforce the admin ACLs if security is enabled.
Expand the Llama distribution tarball.
The Llama installation has a lib/ directory. The JARs within this directory must be added to the classpath of all NodeManagers in the Yarn cluster.
When configured, the Llama NM auxiliary service is started and stopped by the Hadoop Yarn NodeManager services when they start and stop.
The Llama NM auxiliary service configuration file, llama-site.xml, must be copied to the NodeManager configuration directory.
Any changes to the llama-site.xml file take effect on restart.
The following properties must be set in all NodeManager yarn-site.xml configuration files:
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,llama_nm_plugin</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.llama_nm_plugin.class</name>
  <value>com.cloudera.llama.nm.LlamaNMAuxiliaryService</value>
</property>
NOTE: The mapreduce_shuffle value is not required by the Llama NM auxiliary service, but it is required for MapReduce to work in Yarn.
The security configuration for the Llama NM auxiliary service is identical to the Llama AM security configuration (see above). The only difference is that the property names have .nm. instead of .am.:
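For example, following the naming pattern described above (values shown are assumptions mirroring the AM defaults):

<property>
  <name>llama.nm.server.thrift.security</name>
  <value>false</value>
</property>
<property>
  <name>llama.nm.server.thrift.kerberos.keytab.file</name>
  <value>llama.keytab</value>
</property>
<property>
  <name>llama.nm.server.thrift.kerberos.server.principal.name</name>
  <value>llama/localhost</value>
</property>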
The llama.nm.server.thrift.security.QOP property indicates the quality of protection used when security is enabled. Valid values are auth (authentication only), auth-int (authentication and integrity) and auth-conf (authentication, integrity and confidentiality).
IMPORTANT: Authorization is only active when security is enabled.
Llama supports Access Control Lists (ACL) to restrict client and admin access.
An ACL is either the * wildcard or a comma-separated list of users and groups. Users are separated from groups by whitespace.
There are 2 ACL configuration properties, client and admin. The client ACL is applied to calls in the regular thrift service endpoint. The admin ACL is applied to calls in the admin thrift service endpoint.
ACLs are defined in the llama-site.xml configuration file. The ACL configuration properties and their default values are:
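A sketch, assuming the .nm. counterparts of the AM ACL properties and the * default:

<property>
  <name>llama.nm.server.thrift.client.acl</name>
  <value>*</value>
</property>
<property>
  <name>llama.nm.server.thrift.admin.acl</name>
  <value>*</value>
</property>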
Llama uses Hadoop's Groups class to retrieve user group information. The configuration for the Groups class must be set in the llama-site.xml configuration file; the default values are:
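As in the Llama AM section, a sketch using the standard Hadoop group mapping properties and their customary defaults (assumed here, not taken from the Llama template):

<property>
  <name>hadoop.security.group.mapping</name>
  <value>org.apache.hadoop.security.ShellBasedUnixGroupsMapping</value>
</property>
<property>
  <name>hadoop.security.groups.cache.secs</name>
  <value>300</value>
</property>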
For alternate group mapping implementations and their configuration, refer to the Hadoop documentation.