com.cloudera.crunch
Class MapFn<S,T>

java.lang.Object
  extended by com.cloudera.crunch.DoFn<S,T>
      extended by com.cloudera.crunch.MapFn<S,T>
All Implemented Interfaces:
Serializable
Direct Known Subclasses:
CompositeMapFn, IdentityFn, PairMapFn, PGroupedTableType.PairIterableMapFn, PTypes.JacksonInputMapFn, PTypes.JacksonOutputMapFn, PTypes.ProtoInputMapFn, PTypes.ProtoOutputMapFn, PTypes.SmileInputMapFn, PTypes.SmileOutputMapFn, PTypes.ThriftInputMapFn, PTypes.ThriftOutputMapFn

public abstract class MapFn<S,T>
extends DoFn<S,T>

A DoFn for the common case of emitting exactly one value for each input record.

See Also:
Serialized Form

Constructor Summary
MapFn()
           
 
Method Summary
abstract  T map(S input)
          Maps the given input into an instance of the output type.
 void process(S input, Emitter<T> emitter)
          Processes the records from a PCollection.
 float scaleFactor()
          Returns an estimate of how applying this function to a PCollection will cause it to change in size.
 
Methods inherited from class com.cloudera.crunch.DoFn
cleanup, configure, getConfiguration, getCounter, getCounter, getStatus, getTaskAttemptID, initialize, progress, setConfigurationForTest, setContext, setStatus
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

MapFn

public MapFn()
Method Detail

map

public abstract T map(S input)
Maps the given input into an instance of the output type.
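
As a rough illustration of implementing map, the sketch below defines a subclass that turns each line of text into its length. MapFnStandIn is a hypothetical stand-in that mirrors only the abstract map signature documented here, so the example compiles without Crunch on the classpath. LineLengthFn is likewise an invented name.

```java
// Stand-in for com.cloudera.crunch.MapFn, reduced to the one abstract method
// documented above. It is NOT the real Crunch class.
abstract class MapFnStandIn<S, T> {
  public abstract T map(S input);
}

// Hypothetical subclass: maps each line of text to its length in characters.
class LineLengthFn extends MapFnStandIn<String, Integer> {
  @Override
  public Integer map(String input) {
    return input.length();
  }
}
```

With the real library, such a class would presumably extend MapFn<String, Integer> instead and be applied to a PCollection<String>. Because MapFn implements Serializable, any fields a subclass adds should be serializable as well.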


process

public void process(S input,
                    Emitter<T> emitter)
Description copied from class: DoFn
Processes the records from a PCollection.

Specified by:
process in class DoFn<S,T>
Parameters:
input - The input record
emitter - The emitter to send the output to
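
The class description says a MapFn emits exactly one value per input record, which suggests that process hands the result of map to the emitter. The sketch below shows that relationship under that assumption; it is not copied from the Crunch source. EmitterStandIn, OneToOneFn, TrimFn, and ProcessSketch are hypothetical names introduced so the example is self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for an emitter: receives each value the function outputs.
interface EmitterStandIn<T> {
  void emit(T emitted);
}

// Stand-in for MapFn showing the assumed link between process and map.
abstract class OneToOneFn<S, T> {
  public abstract T map(S input);

  // Assumed behavior: emit the mapped value exactly once per input record.
  public void process(S input, EmitterStandIn<T> emitter) {
    emitter.emit(map(input));
  }
}

// Hypothetical subclass: strips leading and trailing whitespace.
class TrimFn extends OneToOneFn<String, String> {
  @Override
  public String map(String input) {
    return input.trim();
  }
}

class ProcessSketch {
  // Feeds each record through process and collects what was emitted.
  static List<String> run(List<String> records) {
    List<String> out = new ArrayList<>();
    TrimFn fn = new TrimFn();
    for (String record : records) {
      fn.process(record, out::add);
    }
    return out;
  }
}
```

Under this reading, subclasses only implement map and leave process alone; the output always has the same number of records as the input.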

scaleFactor

public float scaleFactor()
Description copied from class: DoFn
Returns an estimate of how applying this function to a PCollection will cause it to change in size. The optimizer uses these estimates to decide where to break up dependent MR jobs into separate Map and Reduce phases in order to minimize I/O.

Subclasses of DoFn that will substantially alter the size of the resulting PCollection should override this method.

Overrides:
scaleFactor in class DoFn<S,T>
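
A one-to-one mapping keeps the record count fixed, but it can still shrink or grow the data, for example when a long input record is reduced to a single small field. The sketch below shows a subclass overriding scaleFactor in that situation. SizedMapFn is a hypothetical stand-in, its default of 1.0f is an assumption rather than a documented value, and the 0.1f estimate and the log-line format are purely illustrative.

```java
// Stand-in for MapFn with an assumed default scale factor. NOT the real class.
abstract class SizedMapFn<S, T> {
  public abstract T map(S input);

  // Assumed default: output is roughly as large as the input.
  public float scaleFactor() {
    return 1.0f;
  }
}

// Hypothetical subclass: extracts a trailing status code from a log line.
class StatusCodeFn extends SizedMapFn<String, Integer> {
  @Override
  public Integer map(String logLine) {
    // Assumed format: the status code is the last whitespace-separated token.
    String[] tokens = logLine.trim().split("\\s+");
    return Integer.parseInt(tokens[tokens.length - 1]);
  }

  // Illustrative guess: the output is about a tenth the size of the input.
  @Override
  public float scaleFactor() {
    return 0.1f;
  }
}
```

The value is only a hint to the optimizer, so a rough estimate is enough; it does not affect the records that map produces.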


Copyright © 2012. All Rights Reserved.