Reading a docx file that contains a table

Date: 2018-09-24 18:13:18

Tags: c# docx xceed

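I have a simple document that contains a table, and I want to read the contents of its cells. I found many tutorials on writing documents, but none on reading them.

I suppose I should enumerate the sections, but how do I know which one contains a table?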

var document = DocX.Create(@"mydoc.docx");

var s = document.GetSections();
foreach (var item in s)
{

}

2 answers:

Answer 0: (score: 0)

I am using the following namespace aliases:

using System;
using System.IO;
using System.Runtime.InteropServices;
using excel = Microsoft.Office.Interop.Excel;
using word = Microsoft.Office.Interop.Word;

You can use the following code to get the tables specifically:

        private void WordRunButton_Click(object sender, EventArgs e)
        {

            var excelApp = new excel.Application();
            excel.Workbooks workbooks = excelApp.Workbooks;
            var wordApp = new word.Application();
            word.Documents documents = wordApp.Documents;
            wordApp.Visible = false; 
            excelApp.Visible = false;
            // Keep both applications hidden; showing a window for every file would hurt performance.

            string[] fileDirectories = Directory.GetFiles("Some Directory", "*.doc*",
                   SearchOption.AllDirectories);

            foreach (var item in fileDirectories)
            {
                word._Document document = documents.Open(item);
                int tableCount = 1; // Table numbering restarts for each document.

                foreach (word.Table table in document.Tables)
                {
                        // Output file name; not needed if you are not saving each table individually.
                        string appendName = Path.GetFileNameWithoutExtension(item) + " Table " + tableCount + ".xlsx";

                        var workbook = excelApp.Workbooks.Add(1);
                        excel._Worksheet worksheet = (excel.Worksheet)workbook.Sheets[1];

                        for (int row = 1; row <= table.Rows.Count; row++)
                        {
                            for (int col = 1; col <= table.Columns.Count; col++)
                            {

                                var cell = table.Cell(row, col);
                                var range = cell.Range;
                                var text = range.Text;

                                var cleaned = excelApp.WorksheetFunction.Clean(text);

                                worksheet.Cells[row, col] = cleaned;
                            }
                        }
                        workbook.SaveAs(Path.Combine("Some Directory", Path.GetFileName(appendName)), excel.XlFileFormat.xlWorkbookDefault); 
                        // The last argument is the file format; make sure it matches
                        // the extension you set above.

                        workbook.Close();
                        Marshal.ReleaseComObject(workbook);

                    tableCount++;
                }

                document.Close();
                Marshal.ReleaseComObject(document);
            }
            // Office applications are picky about memory. Close and release every COM object
            // once you are done with it; otherwise instances linger in the background.
            workbooks.Close();
            excelApp.Quit();

            Marshal.ReleaseComObject(workbooks);
            Marshal.ReleaseComObject(excelApp);

            wordApp.Quit();

            Marshal.ReleaseComObject(documents);
            Marshal.ReleaseComObject(wordApp);
        }

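Here, document is an actual Word document object (word.Document). Make sure you check for split or merged cells: table.Cell(row, col) can throw when the requested position does not exist in the table.

Hope this helps!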
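
One way to sidestep split or merged cells, sketched here under stated assumptions: rather than indexing by row and column, enumerate only the cells that actually exist through table.Range.Cells and read each cell's RowIndex and ColumnIndex. The names TableCellReader, CleanCellText and ReadCells are hypothetical and not part of the original answer. The trimming relies on Word ending each cell's Range.Text with a carriage return followed by the end-of-cell mark (character 7).

```csharp
using System.Collections.Generic;
using word = Microsoft.Office.Interop.Word;

public static class TableCellReader
{
    // Removes trailing carriage returns and end-of-cell marks ('\a', character 7)
    // that Word appends to the text of a table cell.
    public static string CleanCellText(string raw)
    {
        return (raw ?? string.Empty).TrimEnd('\r', '\a');
    }

    // Returns (row, column, text) for every cell that exists in the table.
    // Positions swallowed by a merge simply never show up in the result.
    public static List<(int Row, int Column, string Text)> ReadCells(word.Table table)
    {
        var result = new List<(int Row, int Column, string Text)>();
        foreach (word.Cell cell in table.Range.Cells)
        {
            result.Add((cell.RowIndex, cell.ColumnIndex, CleanCellText(cell.Range.Text)));
        }
        return result;
    }
}
```

Inside the table loop above, calling TableCellReader.ReadCells(table) could replace the two nested for loops; the returned row and column indices can then be used directly as worksheet coordinates.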

Answer 1: (score: 0)

If the document contains only one table, this should be simple. Try this:

DocX doc = DocX.Load("C:\\Temp\\mydoc.docx");
Table t = doc.Tables[0];
// Read the content of the first cell in the first row
string someText = t.Rows[0].Cells[0].Paragraphs[0].Text;

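If there is more content, you can loop over the table's rows, over the cells within each row, and over the paragraphs within each cell. A simple for loop does the job: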

for (int i = 0; i < t.Rows.Count; i++)
{
    someText = t.Rows[i].Cells[0].Paragraphs[0].Text;
}
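
The full traversal over rows, cells and paragraphs could look like the sketch below. It assumes the Xceed DocX package, where a Table exposes Rows, a Row exposes Cells, and a Cell exposes Paragraphs. Depending on the package version, Table, Row and Cell live in Xceed.Document.NET (newer releases) or in Xceed.Words.NET (older ones), so the using directives may need adjusting. TableReader and ReadTable are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Xceed.Document.NET; // assumption: newer package layout; drop this line on older versions
using Xceed.Words.NET;

public static class TableReader
{
    // Returns the table as a list of rows, each row being a list of cell texts.
    // Multiple paragraphs inside one cell are joined with line breaks.
    public static List<List<string>> ReadTable(Table table)
    {
        var rows = new List<List<string>>();
        foreach (Row row in table.Rows)
        {
            var cells = new List<string>();
            foreach (Cell cell in row.Cells)
            {
                cells.Add(string.Join(Environment.NewLine,
                                      cell.Paragraphs.Select(p => p.Text)));
            }
            rows.Add(cells);
        }
        return rows;
    }
}
```

With the doc variable loaded as shown earlier, TableReader.ReadTable(doc.Tables[0]) would give you every cell, and iterating over doc.Tables instead handles documents with more than one table.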

Hope that helps.