org.apache.poi.hwpf
Class HWPFDocument

java.lang.Object
  extended by org.apache.poi.POIDocument
      extended by org.apache.poi.hwpf.HWPFDocumentCore
          extended by org.apache.poi.hwpf.HWPFDocument

public final class HWPFDocument
extends HWPFDocumentCore

This class acts as the bucket that we throw all of the Word data structures into.

Author:
Ryan Ackley

Field Summary
protected  Bookmarks _bookmarks
          Holds the bookmarks
protected  BookmarksTables _bookmarksTables
          Holds the bookmarks tables
protected  ComplexFileTable _cft
          Contains text of the document wrapped in an obfuscated Word data structure
protected  byte[] _dataStream
          data stream buffer
protected  DocumentProperties _dop
          Document wide Properties
protected  Notes _endnotes
          Holds the endnotes
protected  NotesTables _endnotesTables
          Holds the endnotes tables
protected  EscherRecordHolder _escherRecordHolder
          Escher Drawing Group information
protected  Fields _fields
          Holds the fields
protected  FieldsTables _fieldsTables
          Holds the fields PLCFs
protected  Notes _footnotes
          Holds the footnotes
protected  NotesTables _footnotesTables
          Holds the footnotes tables
protected  ShapesTable _officeArts
          Deprecated. 
protected  OfficeDrawingsImpl _officeDrawingsHeaders
          Holds Office Art objects
protected  OfficeDrawingsImpl _officeDrawingsMain
          Holds Office Art objects
protected  PicturesTable _pictures
          Holds pictures table
protected  RevisionMarkAuthorTable _rmat
          Holds the revision mark authors for this document.
protected  SavedByTable _sbt
          Holds the save history for this document.
protected  byte[] _tableStream
          table stream buffer
protected  java.lang.StringBuilder _text
          Contains text buffer linked directly to single-piece document text piece
 
Fields inherited from class org.apache.poi.hwpf.HWPFDocumentCore
_cbt, _fib, _ft, _lt, _mainStream, _objectPool, _pbt, _ss, _st, STREAM_OBJECT_POOL, STREAM_WORD_DOCUMENT
 
Fields inherited from class org.apache.poi.POIDocument
directory
 
Constructor Summary
protected HWPFDocument()
           
  HWPFDocument(DirectoryNode directory)
          This constructor loads a Word document from a specific point in a POIFSFileSystem, probably not the default.
  HWPFDocument(DirectoryNode directory, POIFSFileSystem pfilesystem)
          Deprecated. Use HWPFDocument(DirectoryNode) instead
  HWPFDocument(java.io.InputStream istream)
          This constructor loads a Word document from an InputStream.
  HWPFDocument(POIFSFileSystem pfilesystem)
          This constructor loads a Word document from a POIFSFileSystem
 
Method Summary
 int characterLength()
          Returns the character length of a document.
 void delete(int start, int length)
           
 Bookmarks getBookmarks()
           
 Range getCommentsRange()
          Returns the Range which covers all annotations.
 byte[] getDataStream()
           
 DocumentProperties getDocProperties()
           
 Range getEndnoteRange()
          Returns the Range which covers all endnotes.
 Notes getEndnotes()
           
 EscherRecordHolder getEscherRecordHolder()
           
 Fields getFields()
          Returns user-friendly interface to access document Fields
 FieldsTables getFieldsTables()
          Deprecated.  
 Range getFootnoteRange()
          Returns the Range which covers all the Footnotes.
 Notes getFootnotes()
           
 Range getHeaderStoryRange()
          Returns the range which covers all "Header Stories".
 Range getMainTextboxRange()
          Returns the Range which covers all textboxes.
 OfficeDrawings getOfficeDrawingsHeaders()
           
 OfficeDrawings getOfficeDrawingsMain()
           
 Range getOverallRange()
          Returns the range that covers all text in the file, including main text, footnotes, headers and comments
 PicturesTable getPicturesTable()
           
 Range getRange()
          Returns the range which covers the whole of the document, but excludes any headers and footers.
 RevisionMarkAuthorTable getRevisionMarkAuthorTable()
          Gets a reference to the revision mark author table, which holds the revision mark authors for the document.
 SavedByTable getSavedByTable()
          Gets a reference to the saved-by table, which holds the save history for the document.
 ShapesTable getShapesTable()
          Deprecated. use getOfficeDrawingsMain() instead
 byte[] getTableStream()
           
 java.lang.StringBuilder getText()
          Internal method to access document text
 TextPieceTable getTextTable()
           
 int registerList(HWPFList list)
           
 void write(java.io.OutputStream out)
          Writes out the word file that is represented by an instance of this class.
 
Methods inherited from class org.apache.poi.hwpf.HWPFDocumentCore
getCharacterTable, getDocumentText, getFileInformationBlock, getFontTable, getListTables, getObjectsPool, getParagraphTable, getSectionTable, getStyleSheet, verifyAndBuildPOIFS
 
Methods inherited from class org.apache.poi.POIDocument
createInformationProperties, getDocumentSummaryInformation, getPropertySet, getPropertySet, getSummaryInformation, readProperties, writeProperties, writeProperties, writeProperties, writePropertySet
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

_tableStream

protected byte[] _tableStream
table stream buffer


_dataStream

protected byte[] _dataStream
data stream buffer


_dop

protected DocumentProperties _dop
Document wide Properties


_cft

protected ComplexFileTable _cft
Contains text of the document wrapped in an obfuscated Word data structure


_text

protected java.lang.StringBuilder _text
Contains text buffer linked directly to single-piece document text piece


_sbt

protected SavedByTable _sbt
Holds the save history for this document.


_rmat

protected RevisionMarkAuthorTable _rmat
Holds the revision mark authors for this document.


_escherRecordHolder

protected EscherRecordHolder _escherRecordHolder
Escher Drawing Group information


_pictures

protected PicturesTable _pictures
Holds pictures table


_officeArts

@Deprecated
protected ShapesTable _officeArts
Deprecated. 
Holds Office Art objects


_officeDrawingsHeaders

protected OfficeDrawingsImpl _officeDrawingsHeaders
Holds Office Art objects


_officeDrawingsMain

protected OfficeDrawingsImpl _officeDrawingsMain
Holds Office Art objects


_bookmarksTables

protected BookmarksTables _bookmarksTables
Holds the bookmarks tables


_bookmarks

protected Bookmarks _bookmarks
Holds the bookmarks


_endnotesTables

protected NotesTables _endnotesTables
Holds the endnotes tables


_endnotes

protected Notes _endnotes
Holds the endnotes


_footnotesTables

protected NotesTables _footnotesTables
Holds the footnotes tables


_footnotes

protected Notes _footnotes
Holds the footnotes


_fieldsTables

protected FieldsTables _fieldsTables
Holds the fields PLCFs


_fields

protected Fields _fields
Holds the fields

Constructor Detail

HWPFDocument

protected HWPFDocument()

HWPFDocument

public HWPFDocument(java.io.InputStream istream)
             throws java.io.IOException
This constructor loads a Word document from an InputStream.

Parameters:
istream - The InputStream that contains the Word document.
Throws:
java.io.IOException - If there is an unexpected IOException from the passed in InputStream.
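
As a minimal sketch of this constructor in use, the following loads a document from a stream and returns its main body text. The class name ReadDocText, the helper readMainText and the path "report.doc" are illustrative assumptions, not part of this API; Range.text() is taken from the org.apache.poi.hwpf.usermodel.Range class.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.poi.hwpf.HWPFDocument;

final class ReadDocText {
    /** Loads a binary Word (.doc) file from the stream and returns its main body text. */
    static String readMainText(InputStream in) throws IOException {
        HWPFDocument doc = new HWPFDocument(in);
        // getRange() covers the main document and excludes headers and footers.
        return doc.getRange().text();
    }

    public static void main(String[] args) throws IOException {
        // "report.doc" is a placeholder path used only for illustration.
        String path = args.length > 0 ? args[0] : "report.doc";
        try (InputStream in = new FileInputStream(path)) {
            System.out.println(readMainText(in));
        }
    }
}
```

A stream that does not hold an OLE2 container is expected to be rejected with an IOException while the underlying POIFSFileSystem is being read.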

HWPFDocument

public HWPFDocument(POIFSFileSystem pfilesystem)
             throws java.io.IOException
This constructor loads a Word document from a POIFSFileSystem

Parameters:
pfilesystem - The POIFSFileSystem that contains the Word document.
Throws:
java.io.IOException - If there is an unexpected IOException from the passed in POIFSFileSystem.

HWPFDocument

@Deprecated
public HWPFDocument(DirectoryNode directory,
                               POIFSFileSystem pfilesystem)
             throws java.io.IOException
Deprecated. Use HWPFDocument(DirectoryNode) instead

This constructor loads a Word document from a specific point in a POIFSFileSystem, probably not the default. Used typically to open embedded documents.

Parameters:
directory - The DirectoryNode that contains the Word document.
pfilesystem - The POIFSFileSystem that contains the Word document.
Throws:
java.io.IOException - If there is an unexpected IOException from the passed in POIFSFileSystem.

HWPFDocument

public HWPFDocument(DirectoryNode directory)
             throws java.io.IOException
This constructor loads a Word document from a specific point in a POIFSFileSystem, probably not the default. Used typically to open embedded documents.

Parameters:
directory - The DirectoryNode that contains the Word document.
Throws:
java.io.IOException - If there is an unexpected IOException from the passed in POIFSFileSystem.
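
The embedded-document case might look like the sketch below, which looks up a named storage under the root of a POIFSFileSystem and hands it to this constructor. The class OpenEmbeddedDoc, the helper openEmbedded and the entry name passed in are hypothetical; the actual name of an embedded storage depends on the host file.

```java
import java.io.IOException;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.poifs.filesystem.DirectoryNode;
import org.apache.poi.poifs.filesystem.Entry;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

final class OpenEmbeddedDoc {
    /**
     * Opens the Word document stored in the directory called dirName
     * directly under the root of the given file system.
     */
    static HWPFDocument openEmbedded(POIFSFileSystem fs, String dirName) throws IOException {
        // getEntry throws FileNotFoundException when no such entry exists.
        Entry entry = fs.getRoot().getEntry(dirName);
        if (!(entry instanceof DirectoryNode)) {
            throw new IOException("Entry is not a directory: " + dirName);
        }
        return new HWPFDocument((DirectoryNode) entry);
    }
}
```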
Method Detail

getTextTable

@Internal
public TextPieceTable getTextTable()
Specified by:
getTextTable in class HWPFDocumentCore

getText

@Internal
public java.lang.StringBuilder getText()
Description copied from class: HWPFDocumentCore
Internal method to access document text

Specified by:
getText in class HWPFDocumentCore

getDocProperties

public DocumentProperties getDocProperties()

getOverallRange

public Range getOverallRange()
Description copied from class: HWPFDocumentCore
Returns the range that covers all text in the file, including main text, footnotes, headers and comments

Specified by:
getOverallRange in class HWPFDocumentCore

getRange

public Range getRange()
Returns the range which covers the whole of the document, but excludes any headers and footers.

Specified by:
getRange in class HWPFDocumentCore

getFootnoteRange

public Range getFootnoteRange()
Returns the Range which covers all the Footnotes.

Returns:
the Range which covers all the Footnotes.

getEndnoteRange

public Range getEndnoteRange()
Returns the Range which covers all endnotes.

Returns:
the Range which covers all endnotes.

getCommentsRange

public Range getCommentsRange()
Returns the Range which covers all annotations.

Returns:
the Range which covers all annotations.

getMainTextboxRange

public Range getMainTextboxRange()
Returns the Range which covers all textboxes.

Returns:
the Range which covers all textboxes.

getHeaderStoryRange

public Range getHeaderStoryRange()
Returns the range which covers all "Header Stories". A header story contains a header, footer, end note separators and footnote separators.
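
To illustrate how the separate ranges relate, the sketch below collects the text of each story into a map keyed by an arbitrary label. The class StoryTexts, the helper collect and the map keys are assumptions made for this example; it also assumes Range.text() returns an empty string for a story the document does not contain.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.poi.hwpf.HWPFDocument;

final class StoryTexts {
    /** Returns the text of each story, in a fixed order, keyed by an illustrative label. */
    static Map<String, String> collect(HWPFDocument doc) {
        Map<String, String> texts = new LinkedHashMap<>();
        texts.put("main", doc.getRange().text());               // body only, no headers or footers
        texts.put("footnotes", doc.getFootnoteRange().text());
        texts.put("endnotes", doc.getEndnoteRange().text());
        texts.put("comments", doc.getCommentsRange().text());
        texts.put("headerStories", doc.getHeaderStoryRange().text()); // headers, footers, separators
        texts.put("textboxes", doc.getMainTextboxRange().text());
        return texts;
    }

    /** Convenience overload that loads the document from a stream first. */
    static Map<String, String> collect(InputStream in) throws IOException {
        return collect(new HWPFDocument(in));
    }
}
```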


characterLength

public int characterLength()
Returns the character length of a document.

Returns:
the character length of a document

getSavedByTable

@Internal
public SavedByTable getSavedByTable()
Gets a reference to the saved-by table, which holds the save history for the document.

Returns:
the saved-by table.

getRevisionMarkAuthorTable

@Internal
public RevisionMarkAuthorTable getRevisionMarkAuthorTable()
Gets a reference to the revision mark author table, which holds the revision mark authors for the document.

Returns:
the revision mark author table.

getPicturesTable

public PicturesTable getPicturesTable()
Returns:
PicturesTable object, that is able to extract images from this document
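
A possible way to use the returned table is sketched here: every picture is written to a directory under the file name the library suggests. The class ExtractPictures and the helper extractPictures are hypothetical; getAllPictures(), suggestFullFileName() and getContent() come from PicturesTable and org.apache.poi.hwpf.usermodel.Picture.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.Picture;

final class ExtractPictures {
    /** Writes every picture in the document into outDir and returns how many were written. */
    static int extractPictures(InputStream in, Path outDir) throws IOException {
        // Load first, so an unreadable document fails before anything is created on disk.
        HWPFDocument doc = new HWPFDocument(in);
        Files.createDirectories(outDir);
        List<Picture> pictures = doc.getPicturesTable().getAllPictures();
        for (Picture picture : pictures) {
            // suggestFullFileName() proposes a name with an extension matching the image type.
            Path target = outDir.resolve(picture.suggestFullFileName());
            Files.write(target, picture.getContent());
        }
        return pictures.size();
    }
}
```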

getEscherRecordHolder

@Internal
public EscherRecordHolder getEscherRecordHolder()

getShapesTable

@Deprecated
@Internal
public ShapesTable getShapesTable()
Deprecated. use getOfficeDrawingsMain() instead

Returns:
ShapesTable object, that is able to extract Office Art shapes from this document

getOfficeDrawingsHeaders

public OfficeDrawings getOfficeDrawingsHeaders()

getOfficeDrawingsMain

public OfficeDrawings getOfficeDrawingsMain()

getBookmarks

public Bookmarks getBookmarks()
Returns:
user-friendly interface to access document bookmarks

getEndnotes

public Notes getEndnotes()
Returns:
user-friendly interface to access document endnotes

getFootnotes

public Notes getFootnotes()
Returns:
user-friendly interface to access document footnotes

getFieldsTables

@Deprecated
@Internal
public FieldsTables getFieldsTables()
Deprecated. 

Returns:
FieldsTables object, that is able to extract fields descriptors from this document

getFields

public Fields getFields()
Returns user-friendly interface to access document Fields

Returns:
user-friendly interface to access document Fields

write

public void write(java.io.OutputStream out)
           throws java.io.IOException
Writes out the word file that is represented by an instance of this class.

Specified by:
write in class POIDocument
Parameters:
out - The OutputStream to write to.
Throws:
java.io.IOException - If there is an unexpected IOException from the passed in OutputStream.
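
A round trip through this method could be sketched as follows: the document is loaded from one stream and written to another. The class CopyDoc and the helper copy are illustrative names only; any in-memory edits would be made between the two calls.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.poi.hwpf.HWPFDocument;

final class CopyDoc {
    /** Loads a Word document from in and writes it back out to out. */
    static void copy(InputStream in, OutputStream out) throws IOException {
        HWPFDocument doc = new HWPFDocument(in);
        // Changes made through the document's Range objects would go here.
        doc.write(out);
    }
}
```

The caller remains responsible for closing both streams, for example with try-with-resources.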

getDataStream

@Internal
public byte[] getDataStream()

getTableStream

@Internal
public byte[] getTableStream()

registerList

public int registerList(HWPFList list)

delete

public void delete(int start,
                   int length)


Copyright 2015 The Apache Software Foundation or its licensors, as applicable.