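How do I get a field out of a string like this?

Asked: 2009-11-20 14:46:48

Tags: .net regex format serialization

I'm importing some data from a database. The data was stored by a CMS written in PHP that I have no control over. Here is the data (a dense report from a PayPal response):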

a:56:{
s:8:"business";s:19:"abcd@abcdefghij.com";
s:14:"receiver_email";s:19:"abcd@abcdefghij.com";
s:11:"receiver_id";s:13:"KVBRSDFJKLWYE";
s:9:"item_name";s:4:"ABCD";
s:11:"item_number";s:1:"7";
s:8:"quantity";s:1:"1";
s:7:"invoice";s:0:"";
s:6:"custom";s:3:"800";
s:4:"memo";s:0:"";
s:3:"tax";s:4:"0.00";
s:12:"option_name1";s:0:"";
s:17:"option_selection1";s:0:"";
s:12:"option_name2";s:0:"";
s:17:"option_selection2";s:0:"";
s:14:"num_cart_items";s:1:"1";
s:8:"mc_gross";s:6:"255.00";
s:6:"mc_fee";s:5:"19.75";
s:11:"mc_currency";s:3:"USD";
s:13:"payment_gross";s:6:"255.00";
s:11:"payment_fee";s:5:"19.75";
s:14:"payment_status";s:9:"Completed";
s:14:"pending_reason";s:0:"";
s:11:"reason_code";s:0:"";
s:12:"payment_date";s:25:"02:11:51 Sep 15, 2006 PDT";
s:6:"txn_id";s:17:"1EG20446283704116";
s:8:"txn_type";s:4:"cart";
s:12:"payment_type";s:7:"instant";
s:10:"first_name";s:5:"abcde";
s:9:"last_name";s:6:"Abcdef";
s:19:"payer_business_name";s:0:"";
s:12:"address_name";s:12:"abcde Abcdef";
s:14:"address_street";s:24:"asdkjhgfs;lkefh sdfkj 21";
s:12:"address_city";s:15:"agflkjsgkjhsddg";
s:13:"address_state";s:3:"HDJ";
s:11:"address_zip";s:5:"64525";
s:20:"address_country_code";s:2:"DE";
s:15:"address_country";s:7:"Germany";
s:14:"address_status";s:11:"unconfirmed";
s:11:"payer_email";s:15:"thgjk@sjghjk.de";
s:8:"payer_id";s:13:"U89LQDFJGKCJG";
s:12:"payer_status";s:8:"verified";
s:9:"member_id";s:3:"800";
s:11:"verify_sign";s:56:"A1JC72dfgkljhdghjwlQocysUrWOAXNp57t4TP6QkJgCt9.qk7A4UuEq";
s:8:"test_ipn";s:0:"";
s:12:"item_number1";s:1:"7";
s:7:"charset";s:12:"windows-1252";
s:11:"mc_shipping";s:4:"0.00";
s:11:"mc_handling";s:4:"0.00";
s:14:"notify_version";s:3:"2.1";
s:12:"mc_handling1";s:4:"0.00";
s:12:"mc_shipping1";s:4:"0.00";
s:10:"item_name1";s:50:"sdlkjgsdfghlsdkgdhlkjsdggkljdfhlkjsddflkhlkdldfkgj";
s:9:"quantity1";s:1:"1";
s:10:"mc_gross_1";s:6:"255.00";
s:17:"residence_country";s:2:"DE";
s:11:"screen_name";s:8:"dfglkjlf";
}

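As you can see, this is fairly easy to read. In my code I want to grab some fields (say, the value of payment_fee). How can I do that? I suppose a regular expression would be best, but I'm a real rookie with regexes. Of course, I don't want to count the colons and quotes leading up to the field; I'd prefer an automatic way.

Note: I don't care about the s:xx part. As you may guess, it means a string of xx characters, and I don't need to validate it.

Thanks for your help.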

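5 Answers:

Answer 0 (score: 1)

Here is a C# deserialization library for PHP strings: http://sourceforge.net/projects/csphpserial/

I'm not a C# guy, so your mileage may vary, but it looks like it has been around for a while.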

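Answer 1 (score: 1)

This regex should let you find the value of any field. Adjust the character escaping as needed.

var regex = fieldName + "\";s:\\d*:\"([^\"]*)\"";

(This is C#.)

Note that it returns an incomplete value if the field's string contains a " character.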
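
The truncation caveat above can be avoided by using the declared length in s:N instead of stopping at the next quote. The following is a minimal Python sketch, not part of the original answer. The helper name get_field and the sample string are hypothetical. It assumes that PHP's serialize counts the length in bytes and does not escape quotes inside strings, so it works on bytes rather than text.

```python
import re

def get_field(data: bytes, name: str):
    """Return the raw bytes of a string field in PHP-serialized data, or None."""
    # Locate  "name";s:<length>:"  and remember where the value starts.
    header = re.compile(rb'"' + re.escape(name.encode()) + rb'";\s*s:(\d+):"')
    match = header.search(data)
    if match is None:
        return None
    length = int(match.group(1))  # declared length, assumed to be in bytes
    start = match.end()
    # Slice by length, so quotes inside the value do not cut it short.
    return data[start:start + length]

# Hypothetical sample: the memo value contains embedded double quotes.
sample = b's:4:"memo";s:13:"say "hi" once";s:3:"tax";s:4:"0.00";'
print(get_field(sample, "memo"))  # b'say "hi" once'
print(get_field(sample, "tax"))   # b'0.00'
```

One limitation of this sketch: it does not tell keys from values, so if some value happens to equal the field name being searched for, the lookup may land on the wrong entry.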

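Answer 2 (score: 0)

How about something like this:

 // requires: using System.Text.RegularExpressions;
 string fieldName = "address_status";
 // In a verbatim string (@"..."), a double quote is escaped by doubling it.
 string pattern = String.Format(@".*""{0}"";s:[0-9]+:(""[^""]*"").*", fieldName);
 // The captured value keeps its surrounding quotes.
 string value = Regex.Replace(line, pattern, "$1");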

Answer 3 (score: 0)

This regex captures the payment fee in a group.

'payment_fee\";s:\d*:"(\d*\.\d*)'
In Python:

import re

s = 's:11:"payment_fee";s:5:"19.75";'
regex = r'payment_fee";s:\d*:"(\d*\.\d*)'

payment_fee = re.search(regex, s).group(1)  # returns '19.75'

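Answer 4 (score: 0)

This looks like serialized PHP data. There are probably Python packages that can deserialize it. I was able to find one called phpserialize that might be of interest, but I've never used it, so I can't comment on how well it works. There may be others out there as well.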
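
To illustrate what such a deserializer has to do for this particular data, here is a hand-rolled Python sketch. It does not use phpserialize and does not reflect that package's API. The function name parse_flat_array is hypothetical. It handles only a flat array whose keys and values are all strings, assumes declared lengths are byte counts, and assumes the text is windows-1252 as the charset field in the question suggests.

```python
import re

def parse_flat_array(data: bytes) -> dict:
    """Parse a PHP-serialized array whose keys and values are all strings."""
    head = re.match(rb'\s*a:(\d+):\{', data)
    if head is None:
        raise ValueError("not a serialized PHP array")
    count = int(head.group(1))          # number of key/value pairs
    pos = head.end()
    token = re.compile(rb'\s*s:(\d+):"')

    def read_string(pos):
        # Read one  s:<length>:"<bytes>";  item starting at pos.
        m = token.match(data, pos)
        if m is None:
            raise ValueError(f"expected a string at offset {pos}")
        start = m.end()
        end = start + int(m.group(1))   # declared length, assumed to be bytes
        if data[end:end + 2] != b'";':
            raise ValueError(f"bad string terminator at offset {end}")
        # Assumption: text is windows-1252, per the charset field in the data.
        return data[start:end].decode("cp1252"), end + 2

    result = {}
    for _ in range(count):
        key, pos = read_string(pos)
        value, pos = read_string(pos)
        result[key] = value
    return result                       # the closing brace is not checked

# A three-entry excerpt of the data from the question.
sample = b'''a:3:{
s:11:"payment_fee";s:5:"19.75";
s:14:"address_street";s:24:"asdkjhgfs;lkefh sdfkj 21";
s:4:"memo";s:0:"";
}'''
fields = parse_flat_array(sample)
print(fields["payment_fee"])  # 19.75
```

Because each string is sliced by its declared length, values containing semicolons or quotes, such as address_street above, come through intact. Nested arrays, integers, and objects are outside the scope of this sketch.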