Failing to verify the integrity of a PDF

Date: 2015-01-12 10:40:28

Tags: security pdf pkcs#7

I am trying to verify the integrity of a PDF file using bash commands.

Using dd, I extracted the signedContent and the detached pkcs7 object from the PDF.

Then I decoded the pkcs7 object with:

xxd -r -p pkcs7_extracted > pkcs7_extracted.bin

openssl asn1parse -inform DER <pkcs7_extracted.bin >pkcs7_extracted_decoded

From the decoded pkcs7 I got some useful information:

 0:d=0  hl=4 l=5498 cons: SEQUENCE         
 4:d=1  hl=2 l=   9 prim: OBJECT            :pkcs7-signedData
 15:d=1  hl=4 l=5483 cons: cont [ 0 ]        
 19:d=2  hl=4 l=5479 cons: SEQUENCE          
 23:d=3  hl=2 l=   1 prim: INTEGER           :01
 26:d=3  hl=2 l=  15 cons: SET               
 28:d=4  hl=2 l=  13 cons: SEQUENCE          
 30:d=5  hl=2 l=   9 prim: OBJECT            :sha256
 41:d=5  hl=2 l=   0 prim: NULL              
 43:d=3  hl=2 l=  11 cons: SEQUENCE          
 ...
 5154:d=7  hl=2 l=   9 prim: OBJECT            :contentType
 5165:d=7  hl=2 l=  11 cons: SET               
 5167:d=8  hl=2 l=   9 prim: OBJECT            :pkcs7-data
 5178:d=6  hl=2 l=  47 cons: SEQUENCE          
 5180:d=7  hl=2 l=   9 prim: OBJECT            :messageDigest
 5191:d=7  hl=2 l=  34 cons: SET               
 5193:d=8  hl=2 l=  32 prim: OCTET STRING      [HEX DUMP]:18B399D208A08815DDF23C93B1B63B13757A6AA24B1932569D7A69D0DB3A34C2
 5227:d=5  hl=2 l=  13 cons: SEQUENCE          
 5229:d=6  hl=2 l=   9 prim: OBJECT            :sha256WithRSAEncryption
 5240:d=6  hl=2 l=   0 prim: NULL              
 5242:d=5  hl=4 l= 256 prim: OCTET STRING      [HEX DUMP]:8F4B21914173EC57E6B0533BB5E04FB7054F23AC299C1BDBF589ED164A3EABB611727BE9117AAC3161D9C18DCA08BD113DD3AA90E5922009FA12BA59E7F6587E81CD79BDED09F862C2C76F35D950926F1A31A3DCCE999A52DCE0C7F67D081E81A44397E8AF96A1051B8E51F2E2271221B06D05C9895E1846B1DBE02B558F5B9EF97C7EB0FF9A7C71A9764D5E205900818F07E82027D79D3F9A5AA72B3A0CF131F1B890D0BCBF3C4DD8A0229FABE15F6C2CA0CE079EB925B3998A1A6190596A88D8F07C1C12B8750636E69108E30E643A653B285A400080C9C5590C112451F6D69BAFC2686D6F1107B37A5DB36B9F797C49E61D4B44E62E17DD541778DE763AC5
 5502:d=0  hl=2 l=   0 prim: EOC              

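In particular, I noticed that the messageDigest field equals the digest I computed over the signedContent obtained via the ByteRange.

I then extracted the encrypted hash, decrypted it with my public key, and decoded the result with asn1parse again.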

# skip the 4 header bytes (hl=4) at offset 5242, then copy the 256 signature bytes
dd if=pkcs7_extracted.bin of=extracted.sign.bin bs=1 skip=$(( 5242 + 4 )) count=256

# decrypt the signature value with the public key

openssl rsautl -verify -pubin -inkey publickey.pem < extracted.sign.bin > verified.bin

# decode the result
openssl asn1parse -inform der -in verified.bin

The result is this object:

0:d=0  hl=2 l=  49 cons: SEQUENCE          
2:d=1  hl=2 l=  13 cons: SEQUENCE          
4:d=2  hl=2 l=   9 prim: OBJECT            :sha256
15:d=2  hl=2 l=   0 prim: NULL              
17:d=1  hl=2 l=  32 prim: OCTET STRING      [HEX DUMP]:EBAA31519CD0CCA793FEC34AA6BDD8DFA5E4D5F63BA4711F6C8ECE5D20FEF393

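I am fairly sure the decryption works, because the object decodes correctly and contains the sha256 object I expected. But as you can see, the digest value is different...

Am I looking in the wrong place? I don't know how to verify the integrity.

Moreover, Acrobat does verify the integrity of this signed document.

Thanks in advance!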

1 Answer:

Answer 0: (score: 0)

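Note that a SignedData object involves several hash values, and they are generally not equal to one another.

Have a look at the definition of Cryptographic Message Syntax (CMS) objects in RFC 3852.

(RFC 3852 is the RFC referenced from the current PDF specification, ISO 32000-1; thus, even though it has been obsoleted by RFC 5652, changes in the newer RFC may not apply in this context.)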

  SignedData ::= SEQUENCE {
    version CMSVersion,
    digestAlgorithms DigestAlgorithmIdentifiers,
    encapContentInfo EncapsulatedContentInfo,
    certificates [0] IMPLICIT CertificateSet OPTIONAL,
    crls [1] IMPLICIT RevocationInfoChoices OPTIONAL,
    signerInfos SignerInfos }

...

  SignerInfo ::= SEQUENCE {
    version CMSVersion,
    sid SignerIdentifier,
    digestAlgorithm DigestAlgorithmIdentifier,
    signedAttrs [0] IMPLICIT SignedAttributes OPTIONAL,
    signatureAlgorithm SignatureAlgorithmIdentifier,
    signature SignatureValue,
    unsignedAttrs [1] IMPLICIT UnsignedAttributes OPTIONAL }

...

  SignedAttributes ::= SET SIZE (1..MAX) OF Attribute

...

  signedAttrs is a collection of attributes that are signed.  The
  field is optional, but it MUST be present if the content type of
  the EncapsulatedContentInfo value being signed is not id-data.
  SignedAttributes MUST be DER encoded, even if the rest of the
  structure is BER encoded.  Useful attribute types, such as signing
  time, are defined in Section 11.  If the field is present, it MUST
  contain, at a minimum, the following two attributes:

     A content-type attribute having as its value the content type
     of the EncapsulatedContentInfo value being signed.  Section
     11.1 defines the content-type attribute.  However, the
     content-type attribute MUST NOT be used as part of a
     countersignature unsigned attribute as defined in section 11.4.

     A message-digest attribute, having as its value the message
     digest of the content.  Section 11.2 defines the message-digest
     attribute.

...

  The result of the message digest calculation process depends on
  whether the signedAttrs field is present.  When the field is absent,
  the result is just the message digest of the content as described
  above.  When the field is present, however, the result is the message
  digest of the complete DER encoding of the SignedAttrs value
  contained in the signedAttrs field.  Since the SignedAttrs value,
  when present, must contain the content-type and the message-digest
  attributes, those values are indirectly included in the result.

So your observation

  the messageDigest field equals the digest computed over the signedContent obtained via the ByteRange.

 5178:d=6  hl=2 l=  47 cons: SEQUENCE          
 5180:d=7  hl=2 l=   9 prim: OBJECT            :messageDigest
 5191:d=7  hl=2 l=  34 cons: SET               
 5193:d=8  hl=2 l=  32 prim: OCTET STRING      [HEX DUMP]:18B399D208A08815DDF23C93B1B63B13757A6AA24B1932569D7A69D0DB3A34C2

indicates that the correct data was signed, because the message-digest attribute value is supposed to be the message digest of the content.
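
For reference, that content digest can be recomputed roughly as follows. This is only a sketch: the helper name byterange_digest is hypothetical, its four numeric arguments must be taken from the /ByteRange array of the signature dictionary, and SHA-256 with sha256sum from GNU coreutils is assumed.

```shell
# Hypothetical helper: SHA-256 over the two signed byte ranges of a PDF.
# Arguments: FILE OFFSET1 LENGTH1 OFFSET2 LENGTH2 (the four /ByteRange values).
byterange_digest() {
    {
        # First range: the bytes before the signature hex string.
        dd if="$1" bs=1 skip="$2" count="$3" 2>/dev/null
        # Second range: the bytes after the signature hex string.
        dd if="$1" bs=1 skip="$4" count="$5" 2>/dev/null
    } | sha256sum | cut -d' ' -f1
}

# Usage (the variables stand for the values found in /ByteRange):
# byterange_digest signed.pdf "$off1" "$len1" "$off2" "$len2"
```

sha256sum prints lowercase hex while asn1parse prints uppercase, so the comparison with the messageDigest value has to ignore case.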

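But as you can also read there, the data signed by the actual inner signature bytes (the ones you decrypted) is not the message digest of the content but the signedAttrs attribute collection.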

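Thus, you must not verify these signature bytes against the content hash but against the hash of the signed attributes, as described in the RFC.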
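
The hash to compare against can be computed roughly like this. RFC 3852 section 5.4 adds one detail: for this digest the signedAttrs are encoded with an explicit SET OF tag rather than the IMPLICIT [0] tag, so the leading byte 0xA0 has to be replaced by 0x31 before hashing. The sketch assumes SHA-256 and sha256sum. The helper name signed_attrs_digest is hypothetical, and the offset and length are left as variables because the line where signedAttrs starts (a "cont [ 0 ]" entry inside the SignerInfo) falls in the elided part of the dump in the question.

```shell
# Hypothetical helper: SHA-256 over the signedAttrs in the form that gets signed.
# Arguments: FILE OFFSET TOTAL_LENGTH, where OFFSET is the asn1parse offset of the
# signedAttrs "cont [ 0 ]" element and TOTAL_LENGTH is its hl + l.
signed_attrs_digest() {
    {
        # Emit the SET OF tag 0x31 (octal 061) in place of the [0] tag 0xA0.
        printf '\061'
        # Copy the rest of the element (length bytes and content) unchanged.
        dd if="$1" bs=1 skip=$(( $2 + 1 )) count=$(( $3 - 1 )) 2>/dev/null
    } | sha256sum | cut -d' ' -f1
}

# Usage (offset and length come from the asn1parse output):
# signed_attrs_digest pkcs7_extracted.bin "$attrs_offset" "$attrs_total_len"
```

If the signature is intact, the result should equal the digest inside the decrypted signature value (the OCTET STRING starting with EBAA31 in the question), compared case-insensitively.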

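PS: In the meantime the OP has found this other answer on the topic of CMS signed data verification, which also illustrates more graphically how to tell which attributes are signed and which are not.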

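PPS: The OP verifies by decrypting the signature bytes, extracting the hash they contain, and comparing it with the actual hash. This works for RSA-based signatures. DSA- or ECDSA-based signatures, however, cannot be decrypted, so no hash can be extracted from them. They have to be verified with the algorithm's dedicated verification routine.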
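
One way to avoid the manual decryption altogether is to let OpenSSL's cms command do the check, since it handles the signedAttrs hashing and the algorithm-specific signature verification itself. A minimal sketch, assuming the signature container is the DER file from the question and that the concatenated ByteRange bytes are stored in a file here called signedContent.bin (a hypothetical name, as is the wrapper name verify_detached). The -noverify option skips certificate chain validation, so this checks only the signature and says nothing about whether the signer is trusted.

```shell
# Hypothetical wrapper around "openssl cms -verify" for a detached DER signature.
# Arguments: SIGNATURE_FILE CONTENT_FILE
verify_detached() {
    # -binary:   treat the content as raw bytes, no line-ending translation
    # -noverify: do not validate the signer's certificate chain
    openssl cms -verify -binary -noverify \
        -inform DER -in "$1" -content "$2" -out /dev/null
}

# Usage:
# verify_detached pkcs7_extracted.bin signedContent.bin && echo "signature OK"
```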

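PPPS: There are different flavors of integrated PDF signatures. While the style used here (PKCS7 / CAdES detached) is the most common and the recommended one, a general solution has to check the style first and verify accordingly.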