Contrib

These modules are not active enough to be officially maintained in core Hue, but they are functional and should still fit your needs. Contributions are welcome!

SDK

Check the Developer guide or contact the community about how to build your own custom app.

HBase Browser

We’ll take a look at the HBase Browser App.

Note: With just a few changes in the Python API, the HBase browser could be compatible with Apache Kudu.

SmartView

The SmartView is the view that you land on when you first enter a table. On the left-hand side are the row keys, and hovering over a row reveals a list of controls on the right. Click a row to select it; once rows are selected, you can perform batch operations, sort columns, or run other standard database operations. To explore a row, simply scroll to the right. As you scroll, the row lazily loads cells until the end is reached.

Adding Data

To initially populate the table, you can insert a new row or bulk upload data (CSV, TSV, and similar formats) into your table.

On the right-hand side of a row is a ‘+’ sign that lets you insert columns into your row.

Mutating Data

To edit a cell, simply click to edit inline.

If you need more control or data about your cell, click “Full Editor” to edit.

In the full editor, you can view cell history or upload binary data to the cell. Binary data of certain MIME Types are detected, meaning you can view and edit images, PDFs, JSON, XML, and other types directly in your browser!

Hovering over a cell also reveals some more controls (such as the delete button or the timestamp). Click the title to select several cells and run batch operations.

If you need some sample data to get started and explore, check out the how-to tutorial on creating an HBase table.

The “Smart Searchbar” is a sophisticated tool that helps you zero-in on your data. The smart search supports a number of operations. The most basic ones include finding and scanning row keys. Here I am selecting two row keys with:

domain.100, domain.200

Submitting this query gives me the two rows I was looking for. If I want to fetch rows after one of these, I have to do a scan. This is as easy as writing a ‘+’ followed by the number of rows you want to fetch.

domain.100, domain.200 +5

This fetches domain.100, and domain.200 followed by the next 5 rows. If you’re ever confused about your results, you can look just below the query bar and also click in to edit your query.
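
As a rough illustration of the syntax above, the following Python sketch splits a query into row keys and optional scan lengths. It is not Hue's actual parser: `RowQuery` and `parse_row_queries` are hypothetical names, and the sketch assumes that a `+N` suffix applies only to the row key it directly follows.

```python
import re
from typing import List, NamedTuple


class RowQuery(NamedTuple):
    row_key: str  # row key to fetch, or to start scanning from
    scan: int     # number of extra rows to fetch after row_key (0 = plain get)


def parse_row_queries(query: str) -> List[RowQuery]:
    """Split a comma-separated query into row keys with optional '+N' scans."""
    results = []
    for part in query.split(","):
        part = part.strip()
        if not part:
            continue  # ignore empty segments such as a trailing comma
        # The key is everything before an optional "+N" suffix.
        match = re.fullmatch(r"(?P<key>.+?)(?:\s*\+(?P<scan>\d+))?", part)
        scan = int(match.group("scan")) if match.group("scan") else 0
        results.append(RowQuery(match.group("key"), scan))
    return results


for row_query in parse_row_queries("domain.100, domain.200 +5"):
    print(row_query)
```

Under these assumptions, the first entry is a plain get for domain.100 and the second is a scan that starts at domain.200 and continues for 5 more rows.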

The Smart Search also supports column filtering. On any row, I can specify the specific columns or families I want to retrieve. With:

domain.100[column_family:]

I can select a bare family, or mix columns from different families like so:

domain.100[family1:, family2:, family3:column_a]

Doing this will restrict my results from one row key to the columns I specified. If you want to restrict column families only, the same effect can be achieved with the filters on the right. Just click to toggle a filter.

Finally, let’s try some more complex column filters. I can query for bare columns:

domain.100[column_a]

This will multiply my query over all column families. I can also do prefixes and scans:

domain.100[family: prefix* +3]

This will fetch all columns that start with prefix, limited to 3 results. Finally, I can filter on range:

domain.100[family: column1 to column100]

This will fetch me all columns in ‘family:’ that are lexicographically >= column1 but <= column100. The first column (‘column1’) must be a valid column, but the second can just be any string for comparison.
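
Because the range is lexicographic rather than numeric, the result can be surprising. The short Python sketch below mimics the comparison on a made-up list of column names; `in_column_range` is a hypothetical helper written for this illustration, not part of Hue or HBase.

```python
def in_column_range(column: str, start: str, stop: str) -> bool:
    """Return True if start <= column <= stop in byte-wise lexicographic order."""
    return start.encode() <= column.encode() <= stop.encode()


# Hypothetical column names, for illustration only.
columns = ["column1", "column10", "column100", "column1000", "column2", "column50"]
selected = [c for c in columns if in_column_range(c, "column1", "column100")]
print(selected)  # ['column1', 'column10', 'column100']
```

Note that column2 and column50 fall outside the range even though 2 and 50 lie between 1 and 100 numerically, because the names are compared character by character.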

The Smart Search also supports prefix filtering on rows. To select a prefixed row, simply type the row key followed by a star *. The prefix should be highlighted like any other searchbar keyword. A prefix scan is performed exactly like a regular scan, but with a prefixed row.

domain.10* +10

Finally, as a new feature, you can also take full advantage of the HBase filtering language by typing your filter string between curly braces. HBase Browser autocompletes your filters for you so you don’t have to look them up every time. You can apply filters to rows or scans.

domain.1000 {ColumnPrefixFilter('100-') AND ColumnCountGetFilter(3)}
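
If such queries are assembled programmatically, for example to paste into the searchbar, a small helper can keep the braces and operators consistent. This is only a sketch: `build_filtered_query` is a hypothetical function, and it assumes each filter is already written as a valid HBase filter string.

```python
from typing import Sequence


def build_filtered_query(row_key: str, filters: Sequence[str], operator: str = "AND") -> str:
    """Join filter strings with an operator and wrap them in curly braces."""
    if not filters:
        return row_key  # nothing to filter on, so return the bare row key
    joined = f" {operator} ".join(filters)
    return f"{row_key} {{{joined}}}"


query = build_filtered_query(
    "domain.1000",
    ["ColumnPrefixFilter('100-')", "ColumnCountGetFilter(3)"],
)
print(query)
```

The helper does no validation of the filter strings themselves; the autocompletion in the app remains the easier way to discover valid filter names.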

This doc only covers a few basic features of the Smart Search. You can take advantage of the full querying language by referring to the help menu when using the app. These include column prefix, bare columns, column range, etc. Remember that if you ever need help with the searchbar, you can use the help menu that pops up while typing, which will suggest next steps to complete your query.

Sqoop 1 Importer

Import data from relational databases into an HDFS file or Hive table using Apache Sqoop 1. It lets us bring large amounts of data into the cluster in just a few clicks via an interactive UI. This Sqoop connector was added to the existing import data wizard of Hue.

In the past, importing data using the Sqoop command line interface could be a cumbersome and inefficient process. The task expected users to have a good knowledge of Sqoop. For example, they would need to put together a series of required parameters with a specific syntax, which made errors easy to make. Getting those right could often take a few hours of work. Now with Hue’s new feature you can submit your Sqoop job in minutes. The imports run on YARN and are scheduled by Oozie. This tutorial offers a step-by-step guide on how to do it.

Learn more about it on the Importing data from traditional databases into HDFS/Hive in just a few clicks post.

Sqoop 2 Editor

The Sqoop UI enables transferring data from a relational database to Hadoop and vice versa. The UI uses Apache Sqoop to do this. See the Sqoop Documentation for more details on Sqoop.

Creating a New Job

  1. Click the New job button at the top right.
  2. In the Name field, enter a name.
  3. Choose the type of job: import or export. The subsequent form fields will change depending on which type is chosen.
  4. Select a connection, or create one if it does not exist.
  5. Fill in the rest of the fields for the job. For importing, the “Table name”, “Storage type”, “Output format”, and “Output directory” are necessary at a minimum. For exporting, the “Table name” and “Input directory” are necessary at a minimum.
  6. Click save to finish.
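
The minimum fields from step 5 can be summarized as a simple checklist. The Python sketch below is illustrative only and is not how Hue validates the form: `REQUIRED_FIELDS` and `missing_fields` are hypothetical names, and the form is modeled as a plain dictionary keyed by the field labels.

```python
from typing import Dict, List

# Minimum fields per job type, as listed in step 5 above.
REQUIRED_FIELDS: Dict[str, List[str]] = {
    "import": ["Table name", "Storage type", "Output format", "Output directory"],
    "export": ["Table name", "Input directory"],
}


def missing_fields(job_type: str, form: Dict[str, str]) -> List[str]:
    """Return the required fields that are absent or blank for the given job type."""
    if job_type not in REQUIRED_FIELDS:
        raise ValueError(f"Unknown job type: {job_type!r}")
    return [
        name for name in REQUIRED_FIELDS[job_type]
        if not form.get(name, "").strip()
    ]


# Hypothetical form contents, for illustration only.
print(missing_fields("export", {"Table name": "customers"}))  # ['Input directory']
```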

Running a Job

There’s a status on each of the items in the job list indicating the last time a job was run. The progress of the job should update dynamically. There’s also a progress bar at the bottom of each item in the job list.

  1. In the list of jobs, click on the name of the job.
  2. On the left hand side of the job editor, there should be a panel containing actions. Click Run.

Creating a New Connection

  1. Click the New job button at the top right.
  2. At the connection field, click the link titled Add a new connection.
  3. Fill in the displayed fields.
  4. Click save to finish.

Editing a Connection

  1. Click the New job button at the top right.
  2. At the connection field, select the connection by name that should be edited.
  3. Click Edit.
  4. Edit any of the fields.
  5. Click save to finish.

Removing a Connection

  1. Click the New job button at the top right.
  2. At the connection field, select the connection by name that should be deleted.
  3. Click Delete.

NOTE: If this does not work, it’s likely because a job is using that connection. Make sure no jobs are using the connection that will be deleted.

Filtering Sqoop Jobs

The text field in the top-left corner of the Sqoop Jobs page enables fast filtering of Sqoop jobs by name.

ZooKeeper Browser

The main two features are:

  • Listing of the ZooKeeper cluster stats and clients
  • Browsing and editing of the ZNode hierarchy

ZooKeeper Browser requires the ZooKeeper REST service to be running. Here is how to set it up:

First get and build ZooKeeper:

git clone https://github.com/apache/zookeeper
cd zookeeper
ant
Buildfile: /home/hue/Development/zookeeper/build.xml

init:
       [mkdir] Created dir: /home/hue/Development/zookeeper/build/classes
       [mkdir] Created dir: /home/hue/Development/zookeeper/build/lib
       [mkdir] Created dir: /home/hue/Development/zookeeper/build/package/lib
       [mkdir] Created dir: /home/hue/Development/zookeeper/build/test/lib

   ...

And start the REST service:

cd src/contrib/rest
nohup ant run&

If ZooKeeper and the REST service are not on the same machine as Hue, update the Hue settings and specify the correct hostnames and ports:

    [zookeeper]

      [[clusters]]

        [[[default]]]
          # Zookeeper ensemble. Comma separated list of Host/Port.
          # e.g. localhost:2181,localhost:2182,localhost:2183
          ## host_ports=localhost:2181

          # The URL of the REST contrib service
          ## rest_url=http://localhost:9998
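
To double-check a host_ports value before putting it in the configuration, a short Python sketch like the one below can split it into host and port pairs. It is not Hue's own configuration code: `parse_host_ports` is a hypothetical helper, it assumes plain hostnames or IPv4 addresses, and it falls back to 2181 (the usual ZooKeeper client port) when no port is given.

```python
from typing import List, Tuple


def parse_host_ports(value: str, default_port: int = 2181) -> List[Tuple[str, int]]:
    """Split a comma-separated 'host:port' list into (host, port) tuples."""
    pairs = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas and surrounding whitespace
        host, sep, port = item.rpartition(":")
        if not sep:
            # No colon present: treat the whole item as a host name.
            pairs.append((item, default_port))
        else:
            pairs.append((host, int(port)))  # int() raises ValueError on a bad port
    return pairs


print(parse_host_ports("localhost:2181,localhost:2182,localhost:2183"))
```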

Git

A basic read-only version is done (HUE-951).