org.apache.poi.xssf.extractor
Class XSSFEventBasedExcelExtractor

java.lang.Object
  extended by org.apache.poi.POITextExtractor
      extended by org.apache.poi.POIXMLTextExtractor
          extended by org.apache.poi.xssf.extractor.XSSFEventBasedExcelExtractor
All Implemented Interfaces:
java.io.Closeable, ExcelExtractor

public class XSSFEventBasedExcelExtractor
extends POIXMLTextExtractor
implements ExcelExtractor

A text extractor for OOXML Excel (.xlsx) files that uses SAX event-based parsing.
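
As a rough illustration of what event-based parsing means here, the following sketch uses only the JDK's SAX parser on a simplified, namespace-free fragment of sheet XML. It is not this class's implementation: the name SaxSheetTextSketch, the handler, and the tab/newline layout are assumptions made for the example, and shared strings, styles, and comments are ignored.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in, not part of Apache POI.
public class SaxSheetTextSketch {

    /** Collects cell text as SAX events arrive; no in-memory tree of the sheet is built. */
    static class CellTextHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();
        private final StringBuilder value = new StringBuilder();
        private boolean inValue;
        private boolean firstCellInRow;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("row".equals(qName)) {
                firstCellInRow = true;
            } else if ("v".equals(qName) || "t".equals(qName)) {
                // <v> holds a stored value, <t> holds inline string text
                inValue = true;
                value.setLength(0);
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (inValue) {
                value.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("v".equals(qName) || "t".equals(qName)) {
                inValue = false;
            } else if ("c".equals(qName)) {
                // Cells in a row are separated by tabs (a choice made for this sketch)
                if (!firstCellInRow) {
                    text.append('\t');
                }
                text.append(value);
                value.setLength(0);
                firstCellInRow = false;
            } else if ("row".equals(qName)) {
                text.append('\n');
            }
        }

        String getText() {
            return text.toString();
        }
    }

    /** Parses one sheet stream and returns its cell text. */
    public static String extractText(InputStream sheetXml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        CellTextHandler handler = new CellTextHandler();
        parser.parse(sheetXml, handler);
        return handler.getText();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<worksheet><sheetData>"
                + "<row r=\"1\"><c r=\"A1\" t=\"inlineStr\"><is><t>Name</t></is></c>"
                + "<c r=\"B1\"><v>42</v></c></row>"
                + "<row r=\"2\"><c r=\"A2\"><v>3.5</v></c></row>"
                + "</sheetData></worksheet>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.print(extractText(in));
    }
}
```

Because the handler keeps only the current cell's text and the accumulated output, memory use in a design like this grows with the extracted text rather than with a full object model of the workbook.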


Nested Class Summary
protected  class XSSFEventBasedExcelExtractor.SheetTextExtractor
           
 
Constructor Summary
XSSFEventBasedExcelExtractor(OPCPackage container)
           
XSSFEventBasedExcelExtractor(java.lang.String path)
           
 
Method Summary
 void close()
          Frees the resources held by the extractor once it is no longer needed.
 POIXMLProperties.CoreProperties getCoreProperties()
          Returns the core document properties
 POIXMLProperties.CustomProperties getCustomProperties()
          Returns the custom document properties
 POIXMLProperties.ExtendedProperties getExtendedProperties()
          Returns the extended document properties
 OPCPackage getPackage()
          Returns the opened OPCPackage container.
 java.lang.String getText()
          Processes the file and returns the text
static void main(java.lang.String[] args)
           
 void processSheet(XSSFSheetXMLHandler.SheetContentsHandler sheetContentsExtractor, StylesTable styles, CommentsTable comments, ReadOnlySharedStringsTable strings, java.io.InputStream sheetInputStream)
          Processes the given sheet
 void setFormulasNotResults(boolean formulasNotResults)
          Should the formula itself be returned, rather than the result it produces? Default is false
 void setIncludeCellComments(boolean includeCellComments)
          Should cell comments be included? Default is false
 void setIncludeHeadersFooters(boolean includeHeadersFooters)
          Should headers and footers be included? Default is true
 void setIncludeSheetNames(boolean includeSheetNames)
          Should sheet names be included? Default is true
 void setIncludeTextBoxes(boolean includeTextBoxes)
          Should text from textboxes be included? Default is true
 void setLocale(java.util.Locale locale)
           
 
Methods inherited from class org.apache.poi.POIXMLTextExtractor
getDocument, getMetadataTextExtractor
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

XSSFEventBasedExcelExtractor

public XSSFEventBasedExcelExtractor(java.lang.String path)
                             throws org.apache.xmlbeans.XmlException,
                                    OpenXML4JException,
                                    java.io.IOException
Throws:
org.apache.xmlbeans.XmlException
OpenXML4JException
java.io.IOException

XSSFEventBasedExcelExtractor

public XSSFEventBasedExcelExtractor(OPCPackage container)
                             throws org.apache.xmlbeans.XmlException,
                                    OpenXML4JException,
                                    java.io.IOException
Throws:
org.apache.xmlbeans.XmlException
OpenXML4JException
java.io.IOException
Method Detail

main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
Throws:
java.lang.Exception

setIncludeSheetNames

public void setIncludeSheetNames(boolean includeSheetNames)
Should sheet names be included? Default is true

Specified by:
setIncludeSheetNames in interface ExcelExtractor

setFormulasNotResults

public void setFormulasNotResults(boolean formulasNotResults)
Should the formula itself be returned, rather than the result it produces? Default is false

Specified by:
setFormulasNotResults in interface ExcelExtractor
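
To make the choice behind this flag concrete, here is a minimal JDK-only sketch. It assumes a simplified, namespace-free sheet fragment in which a formula cell stores its formula in an f element and its last computed result in a v element. The class FormulaOrResultSketch and the method cellTexts are hypothetical names for illustration and are not part of Apache POI; the exact text this class emits for a formula may differ.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in, not part of Apache POI.
public class FormulaOrResultSketch {

    /** Returns tab-separated cell texts, preferring formulas when the flag is set. */
    public static String cellTexts(String sheetXml, final boolean formulasNotResults)
            throws Exception {
        final StringBuilder out = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder formula = new StringBuilder();
            private final StringBuilder value = new StringBuilder();
            private StringBuilder current; // buffer receiving character data, or null

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("c".equals(qName)) {
                    formula.setLength(0);
                    value.setLength(0);
                } else if ("f".equals(qName)) {
                    current = formula;
                } else if ("v".equals(qName)) {
                    current = value;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (current != null) {
                    current.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("f".equals(qName) || "v".equals(qName)) {
                    current = null;
                } else if ("c".equals(qName)) {
                    // In this sketch, cells without a formula always fall back to their value
                    boolean useFormula = formulasNotResults && formula.length() > 0;
                    if (out.length() > 0) {
                        out.append('\t');
                    }
                    out.append(useFormula ? formula.toString() : value.toString());
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(sheetXml)), handler);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<sheetData><row>"
                + "<c r=\"A1\"><v>1</v></c>"
                + "<c r=\"B1\"><f>A1+1</f><v>2</v></c>"
                + "</row></sheetData>";
        System.out.println(cellTexts(xml, false));
        System.out.println(cellTexts(xml, true));
    }
}
```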

setIncludeHeadersFooters

public void setIncludeHeadersFooters(boolean includeHeadersFooters)
Should headers and footers be included? Default is true

Specified by:
setIncludeHeadersFooters in interface ExcelExtractor

setIncludeTextBoxes

public void setIncludeTextBoxes(boolean includeTextBoxes)
Should text from textboxes be included? Default is true


setIncludeCellComments

public void setIncludeCellComments(boolean includeCellComments)
Should cell comments be included? Default is false

Specified by:
setIncludeCellComments in interface ExcelExtractor

setLocale

public void setLocale(java.util.Locale locale)

getPackage

public OPCPackage getPackage()
Returns the opened OPCPackage container.

Overrides:
getPackage in class POIXMLTextExtractor

getCoreProperties

public POIXMLProperties.CoreProperties getCoreProperties()
Returns the core document properties

Overrides:
getCoreProperties in class POIXMLTextExtractor

getExtendedProperties

public POIXMLProperties.ExtendedProperties getExtendedProperties()
Returns the extended document properties

Overrides:
getExtendedProperties in class POIXMLTextExtractor

getCustomProperties

public POIXMLProperties.CustomProperties getCustomProperties()
Returns the custom document properties

Overrides:
getCustomProperties in class POIXMLTextExtractor

processSheet

public void processSheet(XSSFSheetXMLHandler.SheetContentsHandler sheetContentsExtractor,
                         StylesTable styles,
                         CommentsTable comments,
                         ReadOnlySharedStringsTable strings,
                         java.io.InputStream sheetInputStream)
                  throws java.io.IOException,
                         org.xml.sax.SAXException
Processes the given sheet

Throws:
java.io.IOException
org.xml.sax.SAXException

getText

public java.lang.String getText()
Processes the file and returns the text

Specified by:
getText in interface ExcelExtractor
Specified by:
getText in class POITextExtractor
Returns:
All the text from the document

close

public void close()
           throws java.io.IOException
Description copied from class: POITextExtractor
Frees the resources held by the extractor once it is no longer needed. This may include closing open file handles and freeing memory. The extractor cannot be used after close has been called.

Specified by:
close in interface java.io.Closeable
Overrides:
close in class POIXMLTextExtractor
Throws:
java.io.IOException


Copyright 2015 The Apache Software Foundation or its licensors, as applicable.