Auto-filling the I-9 PDF XFA form

Time: 2017-01-06 16:39:37

Tags: xml vb.net pdf itext xfa

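Good morning. I'm hoping someone can help me with this. Last year I set up a VB.NET program using iTextSharp in which the user enters information to fill out the I-9, and that information is written into the PDF and printed. With the new I-9, I'm running into difficulties I can't identify.

First off, the code doesn't throw any errors. I just get a bad result: instead of the filled-in form, I get a PDF that says "The document you are trying to load requires Adobe Reader 8 or higher. You may not have the Adobe Reader installed..." and so on. So I made sure I had the latest Reader version, tried again, and got the same result.

Thinking the field-name structure might have changed, I tried to read the form fields the way I did the first time (code below). But now it tells me there are no fields to read (AcroFields.Fields.Count = 0).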

Private Sub ListFieldNames(pdfTemplate As String)
    ' pdfTemplate is the path to the PDF, e.g. "c:\Temp\PDF\fw4.pdf"
    Dim pdfReader As PdfReader = New PdfReader(pdfTemplate)

    ' Print the name of every AcroForm field in the document
    For Each de As KeyValuePair(Of String, iTextSharp.text.pdf.AcroFields.Item) In pdfReader.AcroFields.Fields
        Console.WriteLine(de.Key)
    Next
    pdfReader.Close()
End Sub

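So I started searching and found references to another type of PDF structure they may have switched to: XFA. Honestly, I still haven't found any satisfactory documentation or samples, but I did find some code that looks like it should work for reading an XFA PDF structure (code below). I tried two different approaches here. The first basically shows that there are no XML nodes in xfaFields. The second does find a node named "data" (the only node it finds), but finds no child nodes under it.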

Private Sub ReadXfa(pdfTemplate As String)
    ' Allow opening the password-protected form (static setting on PdfReader)
    PdfReader.unethicalreading = True

    ' Approach 1: enumerate the nodes iTextSharp maps from the XFA datasets
    Dim readerPDF As New PdfReader(pdfTemplate)
    Dim xfaFields = readerPDF.AcroFields.Xfa.DatasetsSom.Name2Node

    For Each xmlNode In xfaFields
        Console.WriteLine(xmlNode.Value.Name + ":" + xmlNode.Value.InnerText)
    Next
    ' Example of how to get a field value
    '   Dim lastName = xfaFields.First(Function(a) a.Value.Name = "textFieldLastNameGlobal").Value.InnerText
    readerPDF.Close()

    ' Approach 2: walk the datasets node and list the children of its "data" node
    Dim reader As New PdfReader(pdfTemplate)
    Dim xfa As New XfaForm(reader)
    Dim node As XmlNode = xfa.DatasetsNode()
    Dim list As XmlNodeList = node.ChildNodes()
    For i As Integer = 0 To list.Count - 1
        Console.WriteLine(list.Item(i).LocalName())
        If "data".Equals(list.Item(i).LocalName()) Then
            node = list.Item(i)
            Exit For
        End If
    Next
    list = node.ChildNodes()
    For i As Integer = 0 To list.Count - 1
        Console.WriteLine(list.Item(i).LocalName())
    Next
    reader.Close()
End Sub

https://www.uscis.gov/system/files_force/files/form/i-9.pdf?download=1

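The link above goes to the government-provided I-9 PDF.

So... I guess I have several questions. The simplest is whether anyone has already done this and can help me. Failing that, if someone can point me in the right direction on how to read from and write to this new PDF file, that would be amazing. Frankly, I'm not even sure how to determine which "type" of form they used: AcroField, XFA, or something else?

Thank you so much for your time and help!

2 answers:

Answer 0: (score: 2)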

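First, sorry, I don't do VB.NET anymore, but you should be able to convert the code that follows.

You've already figured out that the new form is XFA. There is an easy, non-programmatic way to see the form fields and data. You mentioned that you upgraded your Adobe Reader version, so I'm guessing you're using Reader DC. From the menu options: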

Edit => Form Options => Export Data...

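This exports the form to an XML file that you can inspect. The XML file gives you a hint of the corresponding XML document you need in order to fill the form, which works completely differently from how it is done with an AcroForm.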
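To see which element names and nesting the exported file uses, something like the following could list every leaf element together with its path. This is only a minimal sketch in Python (standard library), independent of iTextSharp. The XML below is a made-up fragment that borrows a few element names from this thread; it is not the real export, which is much larger, and `leaf_paths` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment standing in for the XML exported from the form.
SAMPLE = """<form1>
  <section1Page1>
    <textFieldLastNameGlobal/>
    <subSection1Row2>
      <textFieldAddress/>
    </subSection1Row2>
  </section1Page1>
</form1>"""


def leaf_paths(elem, prefix=""):
    """Return the slash-separated path of every element that has no children."""
    path = f"{prefix}/{elem.tag}" if prefix else elem.tag
    children = list(elem)
    if not children:
        return [path]
    paths = []
    for child in children:
        paths.extend(leaf_paths(child, path))
    return paths


paths = leaf_paths(ET.fromstring(SAMPLE))
for p in paths:
    print(p)
```

For an actual exported file, `ET.parse(path_to_export).getroot()` would take the place of `ET.fromstring(SAMPLE)`.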

Here's some simple code to get you started. First, a method that reads the blank XML document and updates it:

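// Requires: using System.Collections.Generic; using System.Linq;
//           using System.Xml.Linq; using System.Xml.XPath;
public string FillXml(Dictionary<string, string> fields)
{
    // XML_INFILE => physical path to XML file exported from I-9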
    XDocument xDoc = XDocument.Load(XML_INFILE);
    foreach (var kvp in fields)
    {
        // handle multiple elements in I-9 form
        var elements = xDoc.XPathSelectElements(
            string.Format("//{0}", kvp.Key)
        );
        if (elements.Count() > 0)
        {
            foreach (var e in elements) { e.Value = kvp.Value; }
        }
    }

    return xDoc.ToString();
}
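One thing to keep in mind with the `//name` pattern: it matches every element with that name anywhere in the document, so a name that occurs under several parents gets the same value everywhere. Qualifying the key with a parent path narrows the match. The sketch below shows the idea in Python with the standard library; it mirrors the logic only, not the .NET API. The sample XML and the `fill` helper are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: the same element name appears under two parents.
SAMPLE = """<form1>
  <subEmployeeInfo><textFieldAddress/></subEmployeeInfo>
  <subTranslatorSignature><textFieldAddress/></subTranslatorSignature>
</form1>"""


def fill(root, fields):
    """Set the text of every element matched by each key; return match counts."""
    counts = {}
    for key, value in fields.items():
        matches = root.findall(f".//{key}")
        for elem in matches:
            elem.text = value
        counts[key] = len(matches)
    return counts


root = ET.fromstring(SAMPLE)
# A bare name hits both address elements ...
broad = fill(root, {"textFieldAddress": "1 Main St"})
# ... while a parent-qualified key hits only one of them.
narrow = fill(root, {"subTranslatorSignature/textFieldAddress": "9 Oak Ave"})
print(broad, narrow)
```

Returning the match counts also makes it easy to spot keys that matched nothing, which would otherwise be skipped silently.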

Now that we have a method that creates valid XML, fill the form fields with some sample data:

var fields = new Dictionary<string, string>()
{
    { "textFieldLastNameGlobal", "Doe" },
    { "textFieldFirstNameGlobal", "Jane" }
};
var filledXml = FillXml(fields);

using (var ms = new MemoryStream())
{
    // PDF_READER => I-9 PdfReader instance
    using (PDF_READER)
    {
        // I-9 has password security
        PdfReader.unethicalreading = true;
        // maintain usage rights on output file
        using (var stamper = new PdfStamper(PDF_READER, ms, '\0', true))
        {
            XmlDocument doc = new XmlDocument();
            doc.LoadXml(filledXml);
            stamper.AcroFields.Xfa.FillXfaForm(doc.DocumentElement);
        }
    }
    File.WriteAllBytes(OUTFILE, ms.ToArray());
}

To answer your last question, how to determine the form 'type', use a PdfReader instance like this:

PDF_READER.AcroFields.Xfa.XfaPresent

true means XFA; false means AcroForm.

Answer 1: (score: 1)

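Here's my final code in case anyone out there can use it... I do have an On Error Resume Next, because the I-9 is a very picky form and I chose to fill in a few things slightly differently from the way it wants them. I've also cut out the part where I set some variables, to keep the listing shorter. Thanks again to kuujinbo for the help!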

Private Sub ExportI9()
    Dim pdfTemplate As String = Path.Combine(Application.StartupPath, "PDFs\2017-I9.pdf")
    pdfTemplate = Replace(pdfTemplate, "bin\Debug\", "")


    Dim fields = New Dictionary(Of String, String)() From {
    {"textFieldLastNameGlobal", Me.tbLast.Text},
    {"textFieldFirstNameGlobal", Me.tbFirst.Text},
    {"textFieldMiddleInitialGlobal", Mid(Me.tbMiddle.Text, 1, 1)},
    {"textFieldOtherNames", Me.tbOtherName.Text},
    {"form1/section1Page1/subSection1PositionWrapper/subSection1Top/subEmployeeInfo/subSection1Row2/textFieldAddress", addr1},
    {"form1/section1Page1/subSection1PositionWrapper/subSection1Top/subEmployeeInfo/subSection1Row2/textFieldAptNum", ""},
    {"form1/section1Page1/subSection1PositionWrapper/subSection1Top/subEmployeeInfo/subSection1Row2/textFieldCityOrTown", city1},
    {"form1/section1Page1/subSection1PositionWrapper/subSection1Top/subEmployeeInfo/subSection1Row2/State", state1},
    {"form1/section1Page1/subSection1PositionWrapper/subSection1Top/subEmployeeInfo/subSection1Row2/textFieldZipCode", zip1},
    {"dateFieldBirthDate", Me.dtpBirth.Value},
    {"SSN", Me.tbSSN.Text},
    {"fieldEmail", ""},
    {"fieldPhoneNum", sphone},
    {"radioButtonListCitizenship", citizenship},
    {"form1/section1Page1/subSection1PositionWrapper/subSection1Bottom/subCitizenshipStatus/textFieldResidentType", alienuscis},
    {"dateAlienAuthDate", dauth},
    {"form1/section1Page1/subSection1PositionWrapper/subSection1Bottom/subAuthorizedAlien/numFormI94Admission", Me.tbi94.Text},
    {"numForeignPassport", Me.tbPassport.Text},
    {"CountryofIssuance", Me.tbPassportCountry.Text},
    {"numAlienOrUSCIS", usc},
    {"form1/section1Page1/subSection1PositionWrapper/subSection1Bottom/subAuthorizedAlien/textFieldResidentType", alienuscis},
    {"rbListPerparerOrTranslator", 3},
    {"dropdownMultiPreparerOrTranslator", 1},
    {"form1/section1Page1/subSection1PositionWrapper/subSection1Bottom/subPreparerTranslator/subPrepererTranslator1/subTranslatorSignature/subRow2/textFieldFirstName", prepfirst},
    {"form1/section1Page1/subSection1PositionWrapper/subSection1Bottom/subPreparerTranslator/subPrepererTranslator1/subTranslatorSignature/subRow2/textFieldLastName", preplast},
    {"form1/section1Page1/subSection1PositionWrapper/subSection1Bottom/subPreparerTranslator/subPrepererTranslator1/subTranslatorSignature/subRow3/textFieldAddress", Replace(prepadd, "#", "No. ")},
    {"form1/section1Page1/subSection1PositionWrapper/subSection1Bottom/subPreparerTranslator/subPrepererTranslator1/subTranslatorSignature/subRow3/textFieldCityOrTown", prepcity},
    {"form1/section1Page1/subSection1PositionWrapper/subSection1Bottom/subPreparerTranslator/subPrepererTranslator1/subTranslatorSignature/subRow3/State", prepstate},
    {"form1/section1Page1/subSection1PositionWrapper/subSection1Bottom/subPreparerTranslator/subPrepererTranslator1/subTranslatorSignature/subRow3/textFieldZipCode", prepzip},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subDocListA1/selectListA1DocumentTitle", doctitle1},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subListBandCBorder/subDocListB/selectListBDocumentTitle", doctitle2},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subListBandCBorder/subDocListC/selectListCDocumentTitle", doctitle3},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subDocListA1/textFieldIssuingAuthority", issued1},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subListBandCBorder/subDocListB/textFieldIssuingAuthority", issued2},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subListBandCBorder/subDocListC/textFieldIssuingAuthority", issued3},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subDocListA1/dateExpiration", expdate1},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subListBandCBorder/subDocListB/dateExpiration", expdate2},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subListBandCBorder/subDocListC/dateExpiration", expdate3},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subDocListA1/textFieldDocumentNumber", docnum1},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subListBandCBorder/subDocListB/textFieldDocumentNumber", docnum2},
    {"form1/section2and3Page2/subSection2/subVerificationListsBorder/subListBandCBorder/subDocListC/textFieldDocumentNumber", docnum3},
    {"form1/section2and3Page2/subSection2/subCertification/subAttest/dateEmployeesFirstDay", CDate(Me.dtpHired.Value).ToShortDateString},
    {"form1/section2and3Page2/subSection2/subCertification/subEmployerInformation/subEmployerInfoRow2/textFieldLastName", certlast},
    {"form1/section2and3Page2/subSection2/subCertification/subEmployerInformation/subEmployerInfoRow2/textFieldFirstName", certfirst},
    {"form1/section2and3Page2/subSection2/subCertification/subEmployerInformation/subEmployerInfoRow3/textFieldAddress", orgadd},
    {"form1/section2and3Page2/subSection2/subCertification/subEmployerInformation/subEmployerInfoRow3/textFieldCityOrTown", orgcity},
    {"form1/section2and3Page2/subSection2/subCertification/subEmployerInformation/subEmployerInfoRow3/State", orgstate},
    {"form1/section2and3Page2/subSection2/subCertification/subEmployerInformation/subEmployerInfoRow3/textFieldZipCode", orgzip},
    {"textBusinessOrgName", orgname}
    }


    Dim PDFUpdatedFile As String = pdfTemplate
    PDFUpdatedFile = Replace(PDFUpdatedFile, "I9", Me.tbSSN.Text & "-I9")
    If System.IO.File.Exists(PDFUpdatedFile) Then System.IO.File.Delete(PDFUpdatedFile)
    Dim readerPDF As New PdfReader(pdfTemplate)


    Dim filledXml = FillXml(fields)
    Using ms = New MemoryStream()
        Using readerPDF
            ' I-9 has password security
            PdfReader.unethicalreading = True
            Dim stamper As New PdfStamper(readerPDF, ms, ControlChars.NullChar, True)
            Using stamper
                Dim doc As New XmlDocument()
                doc.LoadXml(filledXml)
                stamper.AcroFields.Xfa.FillXfaForm(doc.DocumentElement)
            End Using
        End Using
        File.WriteAllBytes(PDFUpdatedFile, ms.ToArray())
    End Using
End Sub


Public Function FillXml(fields As Dictionary(Of String, String)) As String
    ' XML_INFILE => physical path to XML file exported from I-9
    Dim xmlfile As String

    xmlfile = Path.Combine(Application.StartupPath, "PDFs\2017-I9_data.xml")
    xmlfile = Replace(xmlfile, "bin\Debug\", "")
    Dim kvp As KeyValuePair(Of String, String)

    Dim xDoc As XDocument = XDocument.Load(xmlfile)
    For Each kvp In fields
        ' handle multiple elements in I-9 form
        Dim elements = xDoc.XPathSelectElements(String.Format("//{0}", kvp.Key))
        If elements.Count() > 0 Then
            For Each e As XElement In elements
                On Error Resume Next
                e.Value = kvp.Value
            Next
        End If
    Next

    Return xDoc.ToString()
End Function