Class XWPFWordExtractor

All Implemented Interfaces:
Closeable, AutoCloseable

public class XWPFWordExtractor extends POIXMLTextExtractor
Helper class to extract text from an OOXML Word file.
  • Field Details

    • SUPPORTED_TYPES

      public static final XWPFRelation[] SUPPORTED_TYPES
  • Constructor Details

  • Method Details

    • main

      public static void main(String[] args) throws Exception
      Throws:
      Exception
    • setFetchHyperlinks

      public void setFetchHyperlinks(boolean fetch)
      Sets whether hyperlink targets are fetched along with the text content. By default, only the hyperlink label is output, not the link target.
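
      A minimal sketch of toggling this option before extraction, assuming Apache POI (poi-ooxml) is on the classpath. The class name HyperlinkSketch and the helper extract are hypothetical names used only for illustration; the XWPFDocument(InputStream) and XWPFWordExtractor(XWPFDocument) constructors are used here although this page does not list them.

```java
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, not part of POI.
class HyperlinkSketch {

    // Reads a .docx file and returns its text, with or without hyperlink targets.
    static String extract(Path docx, boolean fetchHyperlinks) throws IOException {
        try (InputStream in = Files.newInputStream(docx);
             XWPFWordExtractor extractor = new XWPFWordExtractor(new XWPFDocument(in))) {
            // Configure the extractor before calling getText().
            extractor.setFetchHyperlinks(fetchHyperlinks);
            return extractor.getText();
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java HyperlinkSketch <path-to-docx>
        System.out.print(extract(Path.of(args[0]), true));
    }
}
```

      Calling extract with false should correspond to the default behaviour described above, where only the hyperlink label appears in the output.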
    • setConcatenatePhoneticRuns

      public void setConcatenatePhoneticRuns(boolean concatenatePhoneticRuns)
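      Sets whether phonetic runs are concatenated into the extracted text. Default is true.
      Parameters:
      concatenatePhoneticRuns - true to concatenate phonetic runs into the extracted text, false to leave them out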
    • getText

      public String getText()
      Description copied from class: POITextExtractor
      Retrieves all the text from the document. How cells, paragraphs, etc. are separated in the text is implementation-specific; see the javadocs for a specific project for details.
      Specified by:
      getText in class POITextExtractor
      Returns:
      All the text from the document
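
      As an illustration of getText, the sketch below builds a small document in memory and extracts its text. It assumes Apache POI (poi-ooxml) is on the classpath; GetTextSketch and extractSample are hypothetical names, and the XWPFWordExtractor(XWPFDocument) constructor is assumed because this page does not list the constructors.

```java
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

import java.io.IOException;

// Hypothetical helper class, not part of POI.
class GetTextSketch {

    // Builds a two-paragraph document in memory and returns its extracted text.
    static String extractSample() throws IOException {
        XWPFDocument doc = new XWPFDocument();
        doc.createParagraph().createRun().setText("First paragraph");
        doc.createParagraph().createRun().setText("Second paragraph");

        // The extractor is Closeable, so try-with-resources applies.
        try (XWPFWordExtractor extractor = new XWPFWordExtractor(doc)) {
            return extractor.getText();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.print(extractSample());
    }
}
```

      The exact separators between paragraphs are implementation-specific, so callers that need a stable layout may prefer to check for content rather than compare whole strings.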
    • appendBodyElementText

      public void appendBodyElementText(StringBuilder text, IBodyElement e)
    • appendParagraphText

      public void appendParagraphText(StringBuilder text, XWPFParagraph paragraph)
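
      The two append methods take a caller-supplied StringBuilder, which suggests they can be used to extract only part of a document. The sketch below tries this for a single paragraph, assuming Apache POI (poi-ooxml) is on the classpath; ParagraphSketch and firstParagraphText are hypothetical names, and the XWPFWordExtractor(XWPFDocument) constructor is assumed because this page does not list the constructors.

```java
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;

import java.io.IOException;

// Hypothetical helper class, not part of POI.
class ParagraphSketch {

    // Returns the text of the first paragraph only, leaving the rest untouched.
    static String firstParagraphText() throws IOException {
        XWPFDocument doc = new XWPFDocument();
        doc.createParagraph().createRun().setText("First paragraph");
        doc.createParagraph().createRun().setText("Second paragraph");

        try (XWPFWordExtractor extractor = new XWPFWordExtractor(doc)) {
            StringBuilder text = new StringBuilder();
            XWPFParagraph first = doc.getParagraphs().get(0);
            // Appends the text of this one paragraph to the builder.
            extractor.appendParagraphText(text, first);
            return text.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.print(firstParagraphText());
    }
}
```

      appendBodyElementText could presumably be used the same way for any IBodyElement, for example when iterating over a document's body elements and skipping some of them.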