I am trying to extract the contents of a Word document using this code:
// Open a doc file.
Application application = new Application();
Document document = application.Documents.Open("d:\\a.doc");
// Loop through all words in the document.
int count = document.Words.Count;
for (int i = 1; i <= count; i++)
{
    // Write the word.
    string text = document.Words[i].Text;
    Console.WriteLine("Word {0} = {1}", i, text);
}
// Close word.
application.Quit();
But when I run it, I get this error:
Unable to cast COM object of type 'Microsoft.Office.Interop.Word.ApplicationClass' to interface type
'Microsoft.Office.Interop.Word._Application'. This operation failed
because the QueryInterface call on the COM component for the
interface with IID '{00020970-0000-0000-C000-000000000046}'
failed due to the following error:
No such interface supported (Exception from HRESULT: 0x80004002 (E_NOINTERFACE)).
I have Office 2013 installed.
Answer 0 (score: 0)
In the end I installed the Spire.Doc NuGet package and used it to extract the contents of the Word file, as you can see here:
using System;
using System.Text;
using Spire.Doc;
using Spire.Doc.Documents;

// Load the Word file.
Document document = new Document();
document.LoadFromFile(@"d:\a.docx");
// Initialize a StringBuilder instance.
StringBuilder sb = new StringBuilder();
// Extract the text of every paragraph and append it to the StringBuilder.
foreach (Section section in document.Sections)
{
    foreach (Paragraph paragraph in section.Paragraphs)
    {
        sb.AppendLine(paragraph.Text);
    }
}
// Write the extracted text to the console.
Console.WriteLine(sb.ToString());
Console.ReadLine();
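
The question printed the document word by word, while the code above collects whole paragraphs. If word-level output is still wanted, the extracted text can be split afterwards. This is only a minimal sketch, assuming that "word" means a whitespace-separated token (which may not match how Word's own `Words` collection tokenizes). `WordSplitter` and `SplitWords` are hypothetical names introduced here for illustration, and the sample string stands in for `sb.ToString()`.

```csharp
using System;

public static class WordSplitter
{
    // Hypothetical helper: splits text into whitespace-separated tokens.
    // Passing a null separator array makes Split use all whitespace characters;
    // RemoveEmptyEntries drops the empty strings produced by consecutive whitespace.
    public static string[] SplitWords(string text)
    {
        return text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        // Stand-in for the text extracted from the document (assumed sample data).
        string extracted = "Hello world.\r\nSecond paragraph here.";

        string[] words = SplitWords(extracted);
        for (int i = 0; i < words.Length; i++)
        {
            // Same output format as the loop in the question, numbered from 1.
            Console.WriteLine("Word {0} = {1}", i + 1, words[i]);
        }
    }
}
```

Punctuation stays attached to the neighbouring token with this approach ("world." is one word), so it is a rough substitute rather than an exact equivalent of the Interop loop.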