Package org.apache.poi.hwpf.extractor
Class Word6Extractor
java.lang.Object
org.apache.poi.extractor.POITextExtractor
org.apache.poi.extractor.POIOLE2TextExtractor
org.apache.poi.hwpf.extractor.Word6Extractor
- All Implemented Interfaces:
Closeable, AutoCloseable
Class to extract the text from old (Word 6 / Word 95) Word documents.
This should only be used on the older files. For most uses you should call WordExtractor, which deals properly with HWPF.
- Author:
Nick Burch
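The advice above (prefer WordExtractor, and use Word6Extractor only for older files) can be sketched as a fallback. This is a minimal illustration, not the library's prescribed pattern. It assumes Apache POI with its HWPF (scratchpad) classes is on the classpath, and that WordExtractor signals a Word 6 / Word 95 file by throwing org.apache.poi.hwpf.OldWordFileFormatException, which this page does not mention. The class name Word6FallbackDemo and the method extractText are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.apache.poi.hwpf.OldWordFileFormatException;
import org.apache.poi.hwpf.extractor.Word6Extractor;
import org.apache.poi.hwpf.extractor.WordExtractor;

// Hypothetical demo class; not part of Apache POI.
public class Word6FallbackDemo {

    // Tries WordExtractor first and falls back to Word6Extractor
    // only when the file turns out to be in the old format.
    static String extractText(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file);
             WordExtractor extractor = new WordExtractor(in)) {
            return extractor.getText();
        } catch (OldWordFileFormatException e) {
            // Assumption: this exception marks a Word 6 / Word 95 file.
            // The first stream has been consumed, so the file is reopened.
            try (InputStream in = Files.newInputStream(file);
                 Word6Extractor extractor = new Word6Extractor(in)) {
                return extractor.getText();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: Word6FallbackDemo <file.doc>");
            return;
        }
        System.out.println(extractText(Paths.get(args[0])));
    }
}
```

Both extractors are opened in try-with-resources blocks because the class implements Closeable, so the underlying resources are released even if extraction fails.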
-
Field Summary
Fields inherited from class org.apache.poi.extractor.POIOLE2TextExtractor
document
-
Constructor Summary
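Constructors
Word6Extractor(InputStream is) - Create a new Word Extractor
Word6Extractor(POIFSFileSystem fs) - Create a new Word Extractor
Word6Extractor(DirectoryNode dir, POIFSFileSystem fs) - Deprecated.
Word6Extractor(DirectoryNode)
Word6Extractor(HWPFOldDocument doc) - Create a new Word Extractor
-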
Method Summary
Methods inherited from class org.apache.poi.extractor.POIOLE2TextExtractor
getDocSummaryInformation, getDocument, getMetadataTextExtractor, getRoot, getSummaryInformation
Methods inherited from class org.apache.poi.extractor.POITextExtractor
close, setFilesystem
-
Constructor Details
-
Word6Extractor
Create a new Word Extractor
- Parameters:
is - InputStream containing the word file
- Throws:
IOException
-
Word6Extractor
Create a new Word Extractor
- Parameters:
fs - POIFSFileSystem containing the word file
- Throws:
IOException
-
Word6Extractor
Deprecated. Use Word6Extractor(DirectoryNode) instead
- Throws:
IOException
-
Word6Extractor
- Throws:
IOException
-
Word6Extractor
Create a new Word Extractor
- Parameters:
doc - The HWPFOldDocument to extract from
-
-
Method Details
-
getParagraphText
Deprecated.
Get the text from the word file, as an array with one String per paragraph
-
getText
Description copied from class: POITextExtractor
Retrieves all the text from the document. How cells, paragraphs, etc. are separated in the text is implementation specific - see the javadocs for a specific project for details.
- Specified by:
getText in class POITextExtractor
- Returns:
All the text from the document
-