WordToText.Net - Extract text from MS Office Documents
What is WordToText.Net?
WordToText.Net is a Microsoft .Net port of the Java-based Jakarta POI library (Java API To Access Microsoft Format Files). We have
tried to keep the library's features the same as in the original Java implementation. In this first public release we have
focused only on utilities that extract text content from MS Office documents. We have not emphasized rendering or extraction of images
and other embedded objects. Our goal is to complement the .Net implementation of the Lucene search engine library.
- Conversion of MS Word Documents To Text
- Conversion of MS Excel Documents To Text
- Conversion of MS PowerPoint Presentations To Text
- 100% managed .Net implementation
- Complete source code availability
V2.1 Release - 1/22/2006
- Added the HWPF project to the library, which gives you more control over Word documents.
- Bug fixes and performance enhancements.
V1.2 Release - 7/7/2005
- First public release of the component.