com.senseidb.indexing.hadoop.keyvalueformat
Class IntermediateForm

java.lang.Object
  extended by com.senseidb.indexing.hadoop.keyvalueformat.IntermediateForm
All Implemented Interfaces:
org.apache.hadoop.io.Writable

public class IntermediateForm
extends Object
implements org.apache.hadoop.io.Writable

An intermediate form for one or more parsed Lucene documents and/or delete terms. The form is stored in the Lucene file format, held in the files of a RAM directory. Note: if either process(...) method is ever called, closeWriter() should be called afterwards. Otherwise, there is no need to call closeWriter().
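The closeWriter() rule in the note above can be sketched with a try/finally block. This is a minimal illustration that compiles without Lucene or Hadoop on the classpath: FormLike, RecordingForm and FormLifecycle are hypothetical stand-ins, not part of this package, and the stand-in process method takes a single generic argument rather than the (Document, Analyzer) pair documented here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in that exposes only the two calls this sketch needs.
// The real IntermediateForm.process takes (Document, Analyzer) or (IntermediateForm).
interface FormLike<D> {
    void process(D doc) throws IOException;
    void closeWriter() throws IOException;
}

// Hypothetical stand-in that records the order of calls instead of indexing.
class RecordingForm implements FormLike<String> {
    final List<String> calls = new ArrayList<>();

    @Override
    public void process(String doc) {
        calls.add("process:" + doc);
    }

    @Override
    public void closeWriter() {
        calls.add("closeWriter");
    }
}

class FormLifecycle {
    // Feeds every document to the form and calls closeWriter() only if
    // process() was attempted at least once, even when process() fails.
    static <D> void processAll(FormLike<D> form, Iterable<D> docs) {
        boolean processCalled = false;
        try {
            try {
                for (D doc : docs) {
                    processCalled = true; // set first so a failed call still triggers the close
                    form.process(doc);
                }
            } finally {
                if (processCalled) {
                    form.closeWriter();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        RecordingForm form = new RecordingForm();
        processAll(form, Arrays.asList("doc-1", "doc-2"));
        System.out.println(form.calls); // [process:doc-1, process:doc-2, closeWriter]
    }
}
```

With the real class, the same shape would presumably wrap form.process(doc, analyzer) in the try block and form.closeWriter() in the finally block.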


Constructor Summary
IntermediateForm()
          Constructor
 
Method Summary
 void closeWriter()
          Close the Lucene index writer associated with the intermediate form, if created.
 void configure(org.apache.hadoop.conf.Configuration iconf)
          Configure using an index update configuration.
 org.apache.lucene.store.Directory getDirectory()
          Get the RAM directory of the intermediate form.
 void process(org.apache.lucene.document.Document doc, org.apache.lucene.analysis.Analyzer analyzer)
          Used by the index update mapper; processes a document operation into the current intermediate form.
 void process(IntermediateForm form)
          Used by the index update combiner; processes an intermediate form into the current intermediate form.
 void readFields(DataInput in)
           
 String toString()
           
 long totalSizeInBytes()
          The total size of the files in the directory plus the RAM used by the index writer.
 void write(DataOutput out)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

IntermediateForm

public IntermediateForm()
                 throws IOException
Constructor

Throws:
IOException
Method Detail

configure

public void configure(org.apache.hadoop.conf.Configuration iconf)
Configure using an index update configuration.

Parameters:
iconf - the index update configuration

getDirectory

public org.apache.lucene.store.Directory getDirectory()
Get the RAM directory of the intermediate form.

Returns:
the RAM directory

process

public void process(org.apache.lucene.document.Document doc,
                    org.apache.lucene.analysis.Analyzer analyzer)
             throws IOException
Used by the index update mapper; processes a document operation into the current intermediate form.

Parameters:
doc - input document operation
analyzer - the analyzer
Throws:
IOException

process

public void process(IntermediateForm form)
             throws IOException
Used by the index update combiner; processes an intermediate form into the current intermediate form. More specifically, each input intermediate form holds a single-document RAM index and/or a single delete term.

Parameters:
form - the input intermediate form
Throws:
IOException

closeWriter

public void closeWriter()
                 throws IOException
Close the Lucene index writer associated with the intermediate form, if created. Does not close the RAM directory; a RAM directory does not need to be closed.

Throws:
IOException

totalSizeInBytes

public long totalSizeInBytes()
                      throws IOException
The total size of the files in the directory plus the RAM used by the index writer. It does not include memory used by the delete list.

Returns:
the total size in bytes
Throws:
IOException

toString

public String toString()
Overrides:
toString in class Object

write

public void write(DataOutput out)
           throws IOException
Specified by:
write in interface org.apache.hadoop.io.Writable
Throws:
IOException

readFields

public void readFields(DataInput in)
                throws IOException
Specified by:
readFields in interface org.apache.hadoop.io.Writable
Throws:
IOException
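
The write and readFields pair follows the usual Writable contract: whatever write emits, readFields must be able to consume in the same order. A minimal round-trip sketch of that contract is shown here. It assumes nothing about how IntermediateForm lays out its bytes: WritableLike is a local mirror of the two methods of org.apache.hadoop.io.Writable (so the example runs without Hadoop), and NamedBlob and WritableRoundTrip are hypothetical names used only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Local mirror of the two methods declared by org.apache.hadoop.io.Writable,
// so the sketch compiles without Hadoop on the classpath.
interface WritableLike {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical payload; it is NOT IntermediateForm and does not mimic its wire format.
class NamedBlob implements WritableLike {
    String name = "";
    byte[] data = new byte[0];

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(name);
        out.writeInt(data.length); // length prefix so readFields knows how much to read
        out.write(data);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        name = in.readUTF();
        data = new byte[in.readInt()];
        in.readFully(data);
    }
}

class WritableRoundTrip {
    // Serializes src with write() and repopulates dst with readFields().
    static <T extends WritableLike> T copy(T src, T dst) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                src.write(out);
            }
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                dst.readFields(in);
            }
            return dst;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        NamedBlob src = new NamedBlob();
        src.name = "segments";
        src.data = new byte[] {1, 2, 3};
        NamedBlob dst = copy(src, new NamedBlob());
        System.out.println(dst.name + " " + dst.data.length); // segments 3
    }
}
```

The same copy helper pattern should apply to any Writable, including an IntermediateForm, once the local interface is swapped for the Hadoop one.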


Copyright © 2010-2012. All Rights Reserved.