Extracting files from an "Attachment" field in an Access database

Time: 2014-09-16 08:33:52

Tags: .net database ms-access

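We are working on a project where we need to migrate data stored in an Access database to a cache database. The Access database contains columns whose data type is Attachment; some tuples contain multiple attachments. I can get the file names of these files using .FileName, but I am not sure how to determine where one file ends and the next one begins in .FileData.

I am using the following to get this data: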

System.Data.OleDb.OleDbCommand command= new System.Data.OleDb.OleDbCommand();
command.CommandText = "select [Sheet1].[pdf].FileData,* from [Sheet1]";
command.Connection = conn;
System.Data.OleDb.OleDbDataReader rdr = command.ExecuteReader();

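3 answers:

Answer 0 (score: 9)

(My original answer to this question was misleading. It worked fine for PDF files that were subsequently opened with Adobe Reader, but it did not always work for other types of files. The following is the corrected version.)

Unfortunately, we cannot directly retrieve the contents of a file in an Access Attachment field using OleDb. The Access Database Engine prepends some metadata to the binary contents of the file, and that metadata is included if we retrieve .FileData via OleDb.

To illustrate, a document named "Document1.pdf" is saved to an Attachment field using the Access UI. The beginning of that PDF file looks like this: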

Original.png

If we use the following code to try to extract the PDF file to disk

using (OleDbCommand cmd = new OleDbCommand())
{
    cmd.Connection = con;
    cmd.CommandText = 
            "SELECT Attachments.FileData " +
            "FROM AttachTest " +
            "WHERE Attachments.FileName='Document1.pdf'";
    using (OleDbDataReader rdr = cmd.ExecuteReader())
    {
        rdr.Read();
        byte[] fileData = (byte[])rdr[0];
        using (var fs = new FileStream(
                @"C:\Users\Gord\Desktop\FromFileData.pdf", 
                FileMode.Create, FileAccess.Write))
        {
            fs.Write(fileData, 0, fileData.Length);
            fs.Close();
        }
    }
}

then the resulting file will contain the metadata at the beginning of the file (20 bytes in this case)

FromFileData.png

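Adobe Reader is able to open this file because it is robust enough to ignore any "junk" that may appear in the file before the '%PDF-1.4' signature. Unfortunately, not all file formats and applications are so forgiving of extraneous bytes at the beginning of the file.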
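One way to see how many extra bytes were prepended is to search the retrieved byte array for the signature the file format is expected to start with. The following is a minimal sketch, not part of the original answer: `SignatureProbe` and `IndexOf` are hypothetical names, and the approach assumes the file type has a known signature (such as `%PDF-`) that does not also occur inside the metadata.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of any Access or .NET API).
static class SignatureProbe
{
    // Returns the index of the first occurrence of `signature` in `data`,
    // or -1 if the signature does not occur.
    public static int IndexOf(byte[] data, byte[] signature)
    {
        if (signature.Length == 0) return 0;
        for (int i = 0; i <= data.Length - signature.Length; i++)
        {
            int j = 0;
            while (j < signature.Length && data[i + j] == signature[j]) j++;
            if (j == signature.Length) return i;
        }
        return -1;
    }
}
```

Calling `SignatureProbe.IndexOf(fileData, Encoding.ASCII.GetBytes("%PDF-"))` on the bytes read above would be expected to report the 20-byte prefix described in this answer. This is only a diagnostic aid: it measures the prefix for one known file type and is not a general way to recover the original file.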

The only Official™ way of extracting files from an Attachment field in Access is to use the .SaveToFile method of an ACE DAO Field2 object, like so:

// required COM reference: Microsoft Office 14.0 Access Database Engine Object Library
//
// using Microsoft.Office.Interop.Access.Dao; ...
var dbe = new DBEngine();
Database db = dbe.OpenDatabase(@"C:\Users\Public\Database1.accdb");
Recordset rstMain = db.OpenRecordset(
        "SELECT Attachments FROM AttachTest WHERE ID=1",
        RecordsetTypeEnum.dbOpenSnapshot);
Recordset2 rstAttach = (Recordset2)rstMain.Fields["Attachments"].Value;
// Check EOF first so that FileName is never read past the last attachment.
while ((!rstAttach.EOF) && (!"Document1.pdf".Equals(rstAttach.Fields["FileName"].Value)))
{
    rstAttach.MoveNext();
}
if (rstAttach.EOF)
{
    Console.WriteLine("Not found.");
}
else
{
    Field2 fld = (Field2)rstAttach.Fields["FileData"];
    fld.SaveToFile(@"C:\Users\Gord\Desktop\FromSaveToFile.pdf");
}
db.Close();

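Note that if you try to use the .Value of the Field2 object you will still get the metadata at the beginning of the byte sequence; the .SaveToFile process is what strips it out.

Answer 1 (score: 0)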

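//Hello,

//It took me a while to piece together the information needed to retrieve a file stored in an attachment field, so I just thought I'd share it.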

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.Data.OleDb;
using System.IO;
using System.Diagnostics;

namespace AttachCheck
{
public partial class Form1 : Form
{
    DataSet Set1 = new DataSet();
    int ColId;

    public Form1()
    {
        InitializeComponent();

        OleDbConnection connect = new OleDbConnection("Provider=Microsoft.ACE.OLEDB.12.0;Data Source='db/Adb.accdb'"); //set up connection
        //CL_ID is a fk so attachments can be linked to users
        OleDbCommand sql = new OleDbCommand("SELECT at_ID, [at_Name].[FileData], [at_Name].[FileName], [at_Name].[FileType] FROM Attachments WHERE at_ID =1;", connect);
        //adding sql to addapter to be ran

        OleDbDataAdapter OleDA = new OleDbDataAdapter(sql);
        //attempting to open connection
        try { connect.Open(); }
        catch (Exception err) { System.Console.WriteLine(err); }


        OleDA.Fill(Set1); //create and fill dataset
        connect.Close();

        for (int i = 0; i < Set1.Tables[0].Rows.Count; i++)
        {
            System.Console.WriteLine(Set1.Tables[0].Rows[i]["at_Name.FileName"].ToString() + "This is the file name");

            // A DataGridView lists the attachments so the user can select which one to open; the "Open" cell should be a button.
            dataGridView1.Rows.Add(new object[] { Set1.Tables[0].Rows[i]["at_ID"].ToString(), Set1.Tables[0].Rows[i]["at_Name.FileName"].ToString(), "Open" });
        }
    }

    private void dataGridView1_CellContentClick(object sender, DataGridViewCellEventArgs e)
    {

        // Ignore clicks on the column-header row, which reports a RowIndex of -1.
        if (e.RowIndex < 0) return;

        // The event arguments already carry the clicked position, so there is no
        // need to parse it out of the ToString() output of the row and the cell
        // (that approach only worked for single-digit indexes).
        int RowId = e.RowIndex;
        int ColId = e.ColumnIndex;

        System.Console.WriteLine(RowId + " This is Row");
        System.Console.WriteLine(ColId + " This is Column");


        if (ColId == 2)
        {
            string fileName = Set1.Tables[0].Rows[RowId]["at_Name.FileName"].ToString(); //assign the file to variable

            //retrieving the file contents from the database as an array of bytes
            byte[] fileContents = (byte[])Set1.Tables[0].Rows[RowId]["at_Name.FileData"];


            fileContents = GetFileContents(fileContents); //strip the header that Access prepends to the file data

            string fileType = Set1.Tables[0].Rows[RowId]["at_Name.FileType"].ToString();


            DisplayTempFile(fileName, fileContents, fileType); //forward the file type to display file contents   
        }
    }

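    // Byte offsets of the fields in the header that Access prepends to the stored file data.
    private const int CONTENT_START_INDEX_DATA_OFFSET = 0; //offset at which the file content starts
    private const int UNKNOWN_DATA_OFFSET = 4; //value whose meaning is unknown
    private const int EXTENSION_LENGTH_DATA_OFFSET = 8; //length of the file extension, in characters
    private const int EXTENSION_DATA_OFFSET = 12; //the file extension itself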


    private byte[] GetFileContents(byte[] fileContents)
    {

        int contentStartIndex = BitConverter.ToInt32(fileContents, CONTENT_START_INDEX_DATA_OFFSET);

        //'The next four bytes represent a value whose meaning is unknown at this stage, although it may represent a Boolean value indicating whether the data is compressed or not.
        int unknown = BitConverter.ToInt32(fileContents, UNKNOWN_DATA_OFFSET);

        //'The next four bytes contain the the length, in characters, of the file extension.
        int extensionLength = BitConverter.ToInt32(fileContents, EXTENSION_LENGTH_DATA_OFFSET);

        //'The next field in the header is the file extension, not including a dot but including a null terminator.
        //'Characters are Unicode so double the character count to get the byte count.
        string extension = Encoding.Unicode.GetString(fileContents, EXTENSION_DATA_OFFSET, extensionLength * 2);
        return fileContents.Skip(contentStartIndex).ToArray();


    }


    private void DisplayTempFile(string fileName, byte[] fileContents, string fileType)
    {

        // System.Console.WriteLine(fileName + "File Name");
        // System.Console.WriteLine(fileType + "File Type");
        // System.Console.WriteLine(fileContents + "File Contents");

        string tempFolderPath = Path.GetTempPath(); //temporary folder the file will be opened from
        string tempFilePath = Path.Combine(tempFolderPath, fileName); //full path of the temporary file

        if (!string.IsNullOrEmpty(tempFilePath)) //the combined path is never empty here, so this branch always runs
        {
            tempFilePath = Path.Combine(tempFolderPath, //combines the strings 0 and 1 below
            String.Format("{0}{1}",
            Path.GetFileNameWithoutExtension(fileName),      //0                                                    
            Path.GetExtension(fileName))); //1
        }

        //System.Console.WriteLine(tempFolderPath + " tempFolderPath");
        //System.Console.WriteLine(tempFilePath + " tempFilePath");

        //'Save the file and open it.
        File.WriteAllBytes(tempFilePath, fileContents);
        //creates new file, writes bytes array to it then closes the file
        //File.ReadAllBytes(tempFilePath);

        //'Open the file.
        System.Diagnostics.Process attachmentProcess = Process.Start(tempFilePath);
        //chooses the program to open the file if available on the computer

    }
}

}

//Hope this helps someone
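To make the header layout used by `GetFileContents` concrete, here is a minimal, self-contained sketch that builds a header of that shape around some sample bytes and then strips it again. The layout is taken only from the comments in this answer and is an assumption, not documented behavior of Access; `AttachmentHeader`, `Build` and `Strip` are hypothetical names introduced for illustration.

```csharp
using System;
using System.Text;

// Hypothetical helper illustrating the header layout assumed in this answer:
//   bytes 0-3   : offset at which the real file content starts
//   bytes 4-7   : value of unknown meaning (left as zero here)
//   bytes 8-11  : extension length in characters, including the null terminator
//   bytes 12... : extension as UTF-16 text, followed by the file content
static class AttachmentHeader
{
    // Builds synthetic "stored" data so the layout can be tried without a database.
    public static byte[] Build(string extension, byte[] content)
    {
        byte[] ext = Encoding.Unicode.GetBytes(extension + "\0");
        int contentStart = 12 + ext.Length;
        byte[] result = new byte[contentStart + content.Length];
        BitConverter.GetBytes(contentStart).CopyTo(result, 0);
        BitConverter.GetBytes(extension.Length + 1).CopyTo(result, 8);
        ext.CopyTo(result, 12);
        content.CopyTo(result, contentStart);
        return result;
    }

    // Returns the bytes that follow the header, validating the stored offset first.
    public static byte[] Strip(byte[] fileData)
    {
        int contentStart = BitConverter.ToInt32(fileData, 0);
        if (contentStart < 12 || contentStart > fileData.Length)
            throw new ArgumentException("Header offset is out of range.", nameof(fileData));
        byte[] content = new byte[fileData.Length - contentStart];
        Array.Copy(fileData, contentStart, content, 0, content.Length);
        return content;
    }
}
```

For the extension "pdf" this layout gives 12 + 4 × 2 = 20 header bytes, which is consistent with the 20 bytes of metadata reported in Answer 0. Since the meaning of bytes 4-7 is unknown and Answer 0 describes .SaveToFile as the only official route, treat manual stripping as a best-effort approach that may not hold for every file type.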

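Answer 2 (score: 0)

The following code walks through all the records of a Microsoft Access database table and assigns each row to a recordset. It then goes through all the attachments saved in the field "Docs", extracts those files, and saves them to disk. This code is an extension of the code introduced by "Gord Thompson" above. The only thing I did was write the code for Visual Basic.NET.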

Imports Microsoft.Office.Interop.Access.Dao

Use the line of code above to reference Dao.

'Visual Basic.NET
Private Sub ReadAttachmentFiles()
    'required COM reference: Microsoft Office 14.0 Access Database Engine Object Library
    'define a new database engine and a new database
    Dim dbe = New DBEngine
    Dim db As Database = dbe.OpenDatabase("C:\Users\Meisam\Documents\Databases\myDatabase.accdb")
    'define the main recordset object for each row
    Dim rstMain As Recordset = db.OpenRecordset( _
            "SELECT * FROM Companies", _
            RecordsetTypeEnum.dbOpenSnapshot)
    'evaluate whether the recordset is empty of records
    If Not (rstMain.BOF And rstMain.EOF) Then
        'if not empty, then move to the first record
        rstMain.MoveFirst()
        'loop until the end of the recordset is reached
        Do Until rstMain.EOF
            Dim myID As Integer = -1
            ' ID is the name of the primary key field, which holds unique values
            myID = CInt(rstMain.Fields("ID").Value)
            'define the secondary recordset object for the attachment field "Docs"
            Dim rstAttach As Recordset2 = rstMain.Fields("Docs").Value
            'evaluate whether the recordset is empty of records
            If Not (rstAttach.BOF And rstAttach.EOF) Then
                'if not empty, then move to the first record
                rstAttach.MoveFirst()
                'loop until the end of the recordset is reached
                Do Until rstAttach.EOF
                    'get the filename for each attachment in the field "Docs"
                    Dim fileName As String = rstAttach.Fields("FileName").Value
                    Dim fld As Field2 = rstAttach.Fields("FileData")
                    fld.SaveToFile("C:\Users\Meisam\Documents\test\" & myID & "_" & fileName)
                    rstAttach.MoveNext()
                Loop
            End If
            rstMain.MoveNext()
        Loop
    End If
    'close the database
    db.Close()
End Sub