Resource AM - Llama, Running

Running Llama

Llama Application Master
Llama Admin Command Line tool
Llama Node Manager Auxiliary Service
Notes on Thrift

Llama Application Master

Install Llama AM

Build or download the Llama distribution tarball, llama-1.0.0-cdh5.1.3-SNAPSHOT.tar.gz, and expand it.

Llama requires a local Hadoop client installed and configured to access the Hadoop cluster. The HADOOP_HOME environment variable must be defined.

Llama AM Configuration

Within the Llama AM root directory:

Llama AM startup configuration: libexec/llama-env.sh.
Llama AM server configuration: conf/llama-site.xml (changes take effect on re-start).
Llama AM server logging: conf/llama-log4j.properties (changes take effect within 1 second).

NOTE: These files have all configuration properties with their default values.

Llama AM Requires Hadoop Yarn Fair Scheduler and Its Configuration File

Llama AM leverages the Hadoop Yarn Fair Scheduler's user-to-queue mapping and queue ACL enforcement.

For Llama to work properly, Hadoop Yarn must be configured with the Fair Scheduler, and the fair-scheduler.xml configuration file used by Hadoop Yarn must be available in Llama AM's configuration directory or in a directory in Llama's CLASSPATH.

Llama AM, like the Hadoop Yarn Fair Scheduler, detects changes and reloads the fair-scheduler.xml configuration without requiring a restart. For Llama to work properly, the fair-scheduler.xml files in Hadoop Yarn and in Llama must be kept in sync.
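
As an illustration of the Fair Scheduler requirement, the following yarn-site.xml fragment selects the Fair Scheduler in Hadoop Yarn. It is a sketch rather than part of the Llama distribution: the two property names are the standard Hadoop ones, and the allocation file path /etc/hadoop/conf/fair-scheduler.xml is an assumed location to be replaced with the path your cluster uses.

```xml
<!-- yarn-site.xml (sketch): select the Fair Scheduler in the ResourceManager -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
  <!-- Assumed path; a copy of this file must also be visible to Llama AM -->
  <name>yarn.scheduler.fair.allocation.file</name>
  <value>/etc/hadoop/conf/fair-scheduler.xml</value>
</property>
```

Whatever file is named here is the one whose copy needs to be in the Llama AM configuration directory or on Llama's CLASSPATH.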

Hadoop Yarn Configuration for Llama AM (Required)

Llama requests containers that are either CPU-only (zero memory) or memory-only (zero vcores). Yarn must be configured to allow container allocations with a minimum memory of zero MB and a minimum CPU of zero vcores.

Hadoop's yarn-site.xml must include the following 2 configuration properties:

  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>0</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>0</value>
  </property>

The user running LlamaAM must be configured in Hadoop as a proxy user for itself.

Without Hadoop security enabled, the user running the LlamaAM server is the Unix user that starts the LlamaAM server.

With Hadoop security enabled, the user running the LlamaAM server is the short name of the Kerberos principal used by the LlamaAM server (e.g., for a llama/HOSTNAME Kerberos principal, the short name is llama).

Hadoop's core-site.xml must include the following 2 configuration properties:

  <property>
    <name>hadoop.proxyuser.#LLAMAUSER#.hosts</name>
    <value>#HOSTNAME_RUNNING_LLAMA#</value>
  </property>
  <property>
    <name>hadoop.proxyuser.#LLAMAUSER#.groups</name>
    <value>#GROUP_LLAMA_USER_BELONGS_TO#</value>
  </property>

For example, if the user running the LlamaAM server is llama, the LlamaAM server is running on host foo.com, and the llama user belongs to the llamagroup group, the 2 configuration properties would be:

  <property>
    <name>hadoop.proxyuser.llama.hosts</name>
    <value>foo.com</value>
  </property>
  <property>
    <name>hadoop.proxyuser.llama.groups</name>
    <value>llamagroup</value>
  </property>

NOTE: For development and testing, if the values are set to * the LlamaAM server can be running from any host and there is no need to create a llamagroup.
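
The wildcard variant described in the note would look roughly as follows. This sketch assumes the user running the LlamaAM server is llama, as in the example above; it is intended for development and testing only.

```xml
<!-- core-site.xml (development/testing sketch): accept the llama proxy user
     from any host and for any group -->
<property>
  <name>hadoop.proxyuser.llama.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.llama.groups</name>
  <value>*</value>
</property>
```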

Run Llama AM

To start Llama:

$ bin/llama
2013-08-02 06:57:32,743 INFO  Main - -----------------------------------------------------------------
2013-08-02 06:57:32,746 INFO  Main -   Java runtime version : 1.6.0_51-b11-457-11M4509
2013-08-02 06:57:32,747 INFO  Main -   Llama version        : 1.0.0-cdh5.1.3-SNAPSHOT
2013-08-02 06:57:32,747 INFO  Main -   Llama built date     : 2013-08-02T13:43Z
2013-08-02 06:57:32,747 INFO  Main -   Llama built by       : tucu
2013-08-02 06:57:32,747 INFO  Main -   Llama revision       : ba875da60c9865cceb70c352eb062f4fd1dfa309
2013-08-02 06:57:32,784 INFO  Main -   Hadoop version       : 2.1.0-cdh5.1.3-SNAPSHOT
2013-08-02 06:57:32,784 INFO  Main - -----------------------------------------------------------------
2013-08-02 06:57:32,784 INFO  Main - Configuration directory: /Users/tucu/llama/conf
2013-08-02 06:57:32,878 INFO  Main - Server: com.cloudera.llama.am.LlamaAMServer
2013-08-02 06:57:32,879 INFO  Main - -----------------------------------------------------------------
2013-08-02 06:57:33,790 INFO  LlamaAMThriftServer - Server listening on: 0.0.0.0:15000
2013-08-02 06:57:33,790 INFO  LlamaAMThriftServer - Llama started!

Llama will run in the foreground.

To stop Llama, press CTRL-C in the terminal running Llama or use kill on its PID. Llama shuts down gracefully on a SIGINT:

...
2013-08-02 07:06:28,434 INFO  LlamaAMThriftServer - Llama started!
^C
2013-08-02 07:06:29,653 INFO  LlamaAMThriftServer - Llama shutdown!
$

Security Configuration for Llama AM

If configuring Llama AM with security enabled (Thrift with Kerberos SASL), you need a running KDC, a service principal (llama/HOSTNAME) and the corresponding keytab file.

The properties to configure in the llama-site.xml file are (shown with default values):

llama.am.server.thrift.security=false
llama.am.server.thrift.security.QOP=auth
llama.am.server.thrift.kerberos.keytab.file=llama.keytab
llama.am.server.thrift.kerberos.server.principal.name=llama/localhost
llama.am.server.thrift.kerberos.notification.principal.name=impala

The localhost in the principal name must match the hostname in the Kerberos service principal.

If the path specified in llama.am.server.thrift.kerberos.keytab.file is a relative path, the keytab file will be expected in the Llama configuration directory.

When specifying the principal names, use the short name only; do not include the service hostname. The Thrift SASL implementation composes the complete service principal name (shortName/hostname).

The llama.am.server.thrift.security.QOP property indicates the quality of protection if security is enabled. Valid values are:

auth: authentication
auth-int: authentication and integrity
auth-conf: authentication, integrity and confidentiality
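
For illustration, a llama-site.xml fragment that turns security on might look like the sketch below. Only property names and values listed above are used. The choice of auth-conf is an example, not a recommendation from the Llama documentation, and the two principal name properties are deliberately left out; set them for your environment according to the notes above.

```xml
<!-- llama-site.xml (sketch): enable Kerberos SASL on the Llama AM Thrift endpoint -->
<property>
  <name>llama.am.server.thrift.security</name>
  <value>true</value>
</property>
<property>
  <!-- One of: auth, auth-int, auth-conf -->
  <name>llama.am.server.thrift.security.QOP</name>
  <value>auth-conf</value>
</property>
<property>
  <!-- Relative path: expected in the Llama configuration directory -->
  <name>llama.am.server.thrift.kerberos.keytab.file</name>
  <value>llama.keytab</value>
</property>
```

As with any change to llama-site.xml, this takes effect when Llama AM is restarted.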

Authorization Configuration for Llama AM

IMPORTANT: Authorization is only active when security is enabled.

Llama supports Access Control Lists (ACL) to restrict client and admin access.

An ACL is either the * wildcard or a comma-separated list of users followed by a comma-separated list of groups. The users are separated from the groups by whitespace.

There are 2 ACL configuration properties, client and admin. The client ACL is applied to calls in the regular thrift service endpoint. The admin ACL is applied to calls in the admin thrift service endpoint.

ACLs are defined in the llama-site.xml configuration file. The ACL configuration properties and their default values are:

llama.am.server.thrift.client.acl=*
llama.am.server.thrift.admin.acl=*
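
A non-wildcard ACL could be written as in the following sketch. The user names impala, alice and bob and the group name llamaadmins are hypothetical placeholders chosen for this example. The client ACL lists a single user and no groups; the admin ACL lists two users, then whitespace, then one group.

```xml
<!-- llama-site.xml (sketch): restrict client and admin access -->
<property>
  <!-- Users only: the hypothetical user impala -->
  <name>llama.am.server.thrift.client.acl</name>
  <value>impala</value>
</property>
<property>
  <!-- Users alice and bob, then whitespace, then the group llamaadmins -->
  <name>llama.am.server.thrift.admin.acl</name>
  <value>alice,bob llamaadmins</value>
</property>
```

These ACLs are enforced only when security is enabled.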

Llama uses Hadoop's Groups class to retrieve user group information. The configuration for the Groups class must be set in the llama-site.xml configuration file; the default values are:

hadoop.security.group.mapping=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
hadoop.security.groups.cache.secs=300

For alternate group.mapping implementations and their configuration refer to the Hadoop documentation.

HTTP JSON JMX Endpoint

Llama AM exposes an HTTP JSON JMX endpoint (it uses Hadoop's HTTP JSON JMX servlet).

The llama.am.server.thrift.http.jmx.address configuration property defines the address the JMX servlet is bound to; the default is 0.0.0.0:15001.

The JMX servlet is available at /jmx, for example http://localhost:15001/jmx.
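
As a sketch, the endpoint could be bound to the loopback interface only by overriding the address in llama-site.xml. This assumes the property accepts a hostname in its HOST:PORT value, which the default suggests but the text above does not state explicitly.

```xml
<!-- llama-site.xml (sketch): bind the JMX servlet to the loopback interface -->
<property>
  <name>llama.am.server.thrift.http.jmx.address</name>
  <value>localhost:15001</value>
</property>
```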

Gang Scheduling Anti-Deadlock Detection

Llama implements client side gang scheduling by waiting for all resources of a gang reservation to be granted before notifying Impala.

To avoid deadlocks among multiple reservations waiting for resources held by each other, Llama implements the following anti-deadlock logic. If no new resources are allocated to any gang reservation within a configured amount of time, a backoff policy is triggered. The backoff policy transparently cancels random gang reservations until a configured percentage of canceled resources is reached. The canceled reservations are backed off for a random delay between a configured minimum and maximum. Once the random delay elapses, the reservation is automatically resubmitted.

The anti-deadlock detection logic is completely transparent to Llama clients.

The Llama configuration properties for anti-deadlock detection and their default values are:

llama.am.gang.anti.deadlock.enabled = true
llama.am.gang.anti.deadlock.no.allocation.limit.ms = 30000
llama.am.gang.anti.deadlock.backoff.percent = 30
llama.am.gang.anti.deadlock.backoff.min.delay.ms = 10000
llama.am.gang.anti.deadlock.backoff.max.delay.ms = 30000
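
Expressed as llama-site.xml entries, the defaults above would look like the sketch below. The comments give one reading of how each property maps onto the logic described in this section, inferred from the property names; they are not taken from the Llama source.

```xml
<!-- llama-site.xml (sketch): gang scheduling anti-deadlock settings, default values -->
<property>
  <name>llama.am.gang.anti.deadlock.enabled</name>
  <value>true</value>
</property>
<property>
  <!-- Time without new allocations before the backoff policy triggers (30 s) -->
  <name>llama.am.gang.anti.deadlock.no.allocation.limit.ms</name>
  <value>30000</value>
</property>
<property>
  <!-- Percentage of resources to cancel when backing off -->
  <name>llama.am.gang.anti.deadlock.backoff.percent</name>
  <value>30</value>
</property>
<property>
  <!-- Lower bound of the random delay before resubmission (10 s) -->
  <name>llama.am.gang.anti.deadlock.backoff.min.delay.ms</name>
  <value>10000</value>
</property>
<property>
  <!-- Upper bound of the random delay before resubmission (30 s) -->
  <name>llama.am.gang.anti.deadlock.backoff.max.delay.ms</name>
  <value>30000</value>
</property>
```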

Llama Admin Command Line tool

Llama provides the llamaadmin command-line tool to release reservations, handles and queues, and to empty cached resources.

usage:


      llamaadmin help : display usage for all commands or specified command

      llamaadmin release <OPTIONS> : release queues, handles or reservations

                         -donotcache           do not cache resources of released resources
                         -handles <arg>        client handles (comma separated)
                         -llama <arg>          <HOST>:<PORT> of llama
                         -queues <arg>         queues (comma separated)
                         -reservations <arg>   reservations (comma separated)
                         -secure               uses kerberos

      llamaadmin errorcodes : list error codes

      llamaadmin emptycache <OPTIONS> : empty cached resources not in use

                            -allqueues      empty cache for all queues
                            -llama <arg>    <HOST>:<PORT> of llama
                            -queues <arg>   queues (comma separated)
                            -secure         uses kerberos

If the -llama or -secure options are not specified, they are looked up in the llamaadmin-site.xml configuration file under the following configuration properties (default values shown):

llamaadmin.server.thrift.address = localhost:15002
llamaadmin.server.thrift.secure = false
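
A llamaadmin-site.xml that points the tool at a remote, secured Llama might contain entries like the following sketch. The host foo.com is borrowed from the proxy user example earlier in this document and is only a placeholder; the port is the default shown above.

```xml
<!-- llamaadmin-site.xml (sketch): defaults used when -llama / -secure are omitted -->
<property>
  <name>llamaadmin.server.thrift.address</name>
  <value>foo.com:15002</value>
</property>
<property>
  <name>llamaadmin.server.thrift.secure</name>
  <value>true</value>
</property>
```

With these entries in place, the -llama and -secure options can be left off the llamaadmin command line.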

The Llama server enforces the admin ACLs only if security is enabled.

Llama Node Manager Auxiliary Service

Install Llama NM Auxiliary Service

Expand the Llama distribution tarball.

The Llama installation has a lib/ directory. The JARs within this directory must be added to the classpath of all NodeManagers in the Yarn cluster.

When configured, the Llama NM auxiliary service is started and stopped by the Hadoop Yarn NodeManager services when they start and stop.

Llama NM Auxiliary Service Configuration

The Llama NM auxiliary service configuration file, llama-site.xml, must be copied to the NodeManager configuration directory.

Any changes to the llama-site.xml file take effect on re-start.

Hadoop Yarn Configuration for Llama NM Auxiliary Service (Required)

The following properties must be set in all NodeManager yarn-site.xml configuration files:

    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle,llama_nm_plugin</value>
    </property>
    <property>
      <name>yarn.nodemanager.aux-services.llama_nm_plugin.class</name>
      <value>com.cloudera.llama.nm.LlamaNMAuxiliaryService</value>
    </property>

NOTE: The mapreduce_shuffle value is not required by the Llama NM auxiliary service; it is needed for Map-Reduce to work in Yarn.

Security Configuration for Llama NM Auxiliary Service

The security configuration for Llama NM auxiliary service is identical to Llama AM security configuration (see above). The only difference is that the property names have .nm. instead of .am.:

llama.nm.server.thrift.security=false
llama.nm.server.thrift.security.QOP=auth
llama.nm.server.thrift.kerberos.keytab.file=llama.keytab
llama.nm.server.thrift.kerberos.server.principal.name=llama/localhost
llama.nm.server.thrift.kerberos.notification.principal.name=impala

The llama.nm.server.thrift.security.QOP property indicates the quality of protection if security is enabled. Valid values are:

auth: authentication
auth-int: authentication and integrity
auth-conf: authentication, integrity and confidentiality

Authorization Configuration for Llama NM Auxiliary Service

IMPORTANT: Authorization is only active when security is enabled.

Llama supports Access Control Lists (ACL) to restrict client and admin access.

An ACL is either the * wildcard or a comma-separated list of users followed by a comma-separated list of groups. The users are separated from the groups by whitespace.

ACLs are defined in the llama-site.xml configuration file. The ACL configuration properties and their default values are:

llama.nm.server.thrift.client.acl=*
llama.nm.server.thrift.admin.acl=*
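
Because authorization is only active when security is enabled, a NodeManager-side llama-site.xml that restricts access needs both groups of properties. The sketch below combines them. The user names impala, alice and bob and the group name llamaadmins are hypothetical placeholders, and the keytab and principal name properties are omitted and must be set for your environment.

```xml
<!-- llama-site.xml in the NodeManager configuration directory (sketch) -->
<property>
  <name>llama.nm.server.thrift.security</name>
  <value>true</value>
</property>
<property>
  <name>llama.nm.server.thrift.security.QOP</name>
  <value>auth</value>
</property>
<property>
  <!-- Users only: the hypothetical user impala -->
  <name>llama.nm.server.thrift.client.acl</name>
  <value>impala</value>
</property>
<property>
  <!-- Users alice and bob, then whitespace, then the group llamaadmins -->
  <name>llama.nm.server.thrift.admin.acl</name>
  <value>alice,bob llamaadmins</value>
</property>
```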

Llama uses Hadoop's Groups class to retrieve user group information. The configuration for the Groups class must be set in the llama-site.xml configuration file; the default values are:

hadoop.security.group.mapping=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
hadoop.security.groups.cache.secs=300

For alternate group.mapping implementations and their configuration refer to the Hadoop documentation.

Notes on Thrift

The Llama Thrift server supports unframed transport only (per Impala requirement). Because of this, Llama must use synchronous IO (TThreadPoolServer).

Kerberos Thrift SASL is supported.