com.cloudera.cdk.morphline.saxon
Class ConvertHTMLBuilder

java.lang.Object
  extended by com.cloudera.cdk.morphline.saxon.ConvertHTMLBuilder
All Implemented Interfaces:
CommandBuilder

public final class ConvertHTMLBuilder
extends Object
implements CommandBuilder

Command that converts HTML to XHTML using the TagSoup library. Instead of parsing well-formed or valid XML, this command parses HTML as it is found in the wild: poor, nasty and brutish, though quite often far from short. TagSoup (and hence this command) is designed for people who have to process this stuff using some semblance of a rational application design. The converter allows standard XML tools to be applied to even the worst HTML.


Constructor Summary
ConvertHTMLBuilder()
           
 
Method Summary
 Command build(com.typesafe.config.Config config, Command parent, Command child, MorphlineContext context)
          Creates and returns a command rooted at the given morphline JSON config.
 Collection<String> getNames()
          Returns the names with which this command can be invoked.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ConvertHTMLBuilder

public ConvertHTMLBuilder()
Method Detail

getNames

public Collection<String> getNames()
Description copied from interface: CommandBuilder
Returns the names with which this command can be invoked. The returned collection can contain synonyms to enable backward-compatible name changes.

Specified by:
getNames in interface CommandBuilder
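
The synonym mechanism described above can be illustrated with a minimal sketch. It does not use the morphline API: `NamedBuilder`, `RenamedBuilder`, `BuilderRegistry`, and the names "newName" and "oldName" are hypothetical stand-ins, assumed only for illustration. The sketch shows how a registry could map every name returned by getNames() to the same builder, so that configurations written against an older name keep working.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for CommandBuilder, reduced to getNames() only.
interface NamedBuilder {
    Collection<String> getNames();
}

// Hypothetical builder that answers to a current name and a legacy synonym.
final class RenamedBuilder implements NamedBuilder {
    @Override
    public Collection<String> getNames() {
        return Arrays.asList("newName", "oldName");
    }
}

// Hypothetical registry: every name and synonym resolves to the same builder.
final class BuilderRegistry {
    private final Map<String, NamedBuilder> byName = new HashMap<>();

    void register(NamedBuilder builder) {
        for (String name : builder.getNames()) {
            NamedBuilder previous = byName.put(name, builder);
            if (previous != null && previous != builder) {
                throw new IllegalStateException("Duplicate command name: " + name);
            }
        }
    }

    // Returns the builder registered under the given name, or null if none.
    NamedBuilder lookup(String name) {
        return byName.get(name);
    }
}
```

With this arrangement, a lookup by either the current name or the legacy synonym returns the same builder instance, which is the property that makes a rename backward compatible.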

build

public Command build(com.typesafe.config.Config config,
                     Command parent,
                     Command child,
                     MorphlineContext context)
Description copied from interface: CommandBuilder
Creates and returns a command rooted at the given morphline JSON config. The command will feed records into child. The command will have parent as its parent. Additional parameters can be passed via the morphline context.

Specified by:
build in interface CommandBuilder
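
The parent and child wiring described above can be sketched as follows. This is not the morphline API: `MiniCommand`, `Collector`, `UpperCase`, and `UpperCaseBuilder` are hypothetical types, records are simplified to plain strings, and the config and context parameters are omitted. The sketch only illustrates the contract that a built command keeps a reference to its parent and feeds each record it processes into its child.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical, simplified stand-in for Command; records are plain strings here.
interface MiniCommand {
    MiniCommand getParent();
    boolean process(String record);
}

// Terminal command that collects whatever reaches the end of the chain.
final class Collector implements MiniCommand {
    final List<String> seen = new ArrayList<>();
    private final MiniCommand parent;

    Collector(MiniCommand parent) {
        this.parent = parent;
    }

    @Override
    public MiniCommand getParent() {
        return parent;
    }

    @Override
    public boolean process(String record) {
        seen.add(record);
        return true;
    }
}

// Command that transforms a record and then feeds the result into its child.
final class UpperCase implements MiniCommand {
    private final MiniCommand parent;
    private final MiniCommand child;

    UpperCase(MiniCommand parent, MiniCommand child) {
        this.parent = parent;
        this.child = child;
    }

    @Override
    public MiniCommand getParent() {
        return parent;
    }

    @Override
    public boolean process(String record) {
        return child.process(record.toUpperCase(Locale.ROOT));
    }
}

// Hypothetical builder; a real build method also receives a config and a context.
final class UpperCaseBuilder {
    MiniCommand build(MiniCommand parent, MiniCommand child) {
        return new UpperCase(parent, child);
    }
}
```

In this sketch the builder holds no state of its own; it only assembles a command from the parent and child it is given, which keeps the chain structure under the control of whoever calls build.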


Copyright © 2013 Cloudera. All rights reserved.