com.cloudera.crunch.io.text
Class CBZip2InputStream
java.lang.Object
java.io.InputStream
com.cloudera.crunch.io.text.CBZip2InputStream
- All Implemented Interfaces:
- Closeable, org.apache.hadoop.io.compress.bzip2.BZip2Constants
public class CBZip2InputStream
- extends InputStream
- implements org.apache.hadoop.io.compress.bzip2.BZip2Constants
An input stream that decompresses data in the BZip2 format (without the
file header characters) so that it can be read like any other stream.
- Author:
- Keiron Liddle
Fields inherited from interface org.apache.hadoop.io.compress.bzip2.BZip2Constants
baseBlockSize, END_OF_BLOCK, END_OF_STREAM, G_SIZE, MAX_ALPHA_SIZE, MAX_CODE_LEN, MAX_SELECTORS, N_GROUPS, N_ITERS, NUM_OVERSHOOT_BYTES, rNums, RUNA, RUNB
Constructor Summary
CBZip2InputStream(org.apache.hadoop.fs.FSDataInputStream zStream,
int blockSize,
long end)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
CBZip2InputStream
public CBZip2InputStream(org.apache.hadoop.fs.FSDataInputStream zStream,
int blockSize,
long end)
throws IOException
- Throws:
IOException
getReadLimit
public long getReadLimit()
setReadLimit
public void setReadLimit(long readLimit)
getReadCount
public long getReadCount()
read
public int read()
throws IOException
- Specified by:
read
in class InputStream
- Throws:
IOException
getPos
public long getPos()
throws IOException
- The caller uses getPos to determine when processing of the current
InputSplit is complete. As each bzip block is read, this method keeps
returning the beginning of the InputSplit until it reaches a block that
starts at a position >= the end of the current split. At that point,
retpos is set so that, once one record has been read, subsequent getPos()
calls return a value > the end of the current split. As a result, only one
record is read from that bzip block; the remaining records in that block
should be read by the next map task while it processes the next split.
- Returns:
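- the position described above: the beginning of the InputSplit, or a value past the end of the split once a record has been read from a block that starts at or beyond that end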
- Throws:
IOException
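The split-boundary policy described for getPos can be illustrated with a small standalone simulation. This is a sketch under stated assumptions, not the actual implementation of this class: PositionTracker, onBlockStart, onRecordRead, and readSplit are hypothetical names, and the caller's loop condition (stop once getPos() exceeds the end of the split) is assumed rather than taken from this page. It uses no Hadoop classes.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitPositionSketch {

    /** Hypothetical tracker that mimics the reporting policy described for getPos(). */
    static final class PositionTracker {
        private final long splitEnd;
        private long retPos;           // value handed back by getPos()
        private boolean pastSplitEnd;  // set once a block starting at >= splitEnd is entered

        PositionTracker(long splitStart, long splitEnd) {
            this.splitEnd = splitEnd;
            this.retPos = splitStart;  // report the beginning of the split by default
        }

        /** Called when the decompressor starts a new bzip block. */
        void onBlockStart(long blockStart) {
            if (blockStart >= splitEnd) {
                pastSplitEnd = true;
            }
        }

        /** Called after the caller has consumed one record. */
        void onRecordRead() {
            if (pastSplitEnd) {
                retPos = splitEnd + 1; // later getPos() calls now exceed the split end
            }
        }

        long getPos() {
            return retPos;
        }
    }

    /**
     * Simulates one map task. blockStarts[i] is the byte offset of block i and
     * recordsPerBlock[i] is how many records it holds (both made up for the demo).
     * Returns the records consumed, labelled "block:record".
     */
    static List<String> readSplit(long splitStart, long splitEnd,
                                  long[] blockStarts, int[] recordsPerBlock) {
        PositionTracker tracker = new PositionTracker(splitStart, splitEnd);
        List<String> consumed = new ArrayList<>();
        for (int b = 0; b < blockStarts.length; b++) {
            tracker.onBlockStart(blockStarts[b]);
            for (int r = 0; r < recordsPerBlock[b]; r++) {
                // Assumed caller behaviour: stop once the reported position passes the split end.
                if (tracker.getPos() > splitEnd) {
                    return consumed;
                }
                consumed.add(b + ":" + r);
                tracker.onRecordRead();
            }
        }
        return consumed;
    }

    public static void main(String[] args) {
        long[] blockStarts = {0, 100, 200, 300};
        int[] recordsPerBlock = {2, 2, 2, 2};
        // Split covers offsets 0..150; the block at offset 200 starts past the end.
        System.out.println(readSplit(0, 150, blockStarts, recordsPerBlock));
        // prints [0:0, 0:1, 1:0, 1:1, 2:0]
    }
}
```

With a split ending at offset 150 and blocks starting at 0, 100, 200, and 300, the sketch consumes every record of the first two blocks and exactly one record of the block starting at 200. How the next map task picks up the remaining records of that block is outside the scope of this sketch.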
Copyright © 2012. All Rights Reserved.