I need to extract XML elements from the given XML when a value is null or NA.
<?xml version="1.0" encoding="UTF-8"?>
<log>
<logentry revision="21754">
<author>Madhu</author>
<date>2015-05-12</date>
<QC_ID>NA</QC_ID>
<Rally_ID>US4940</Rally_ID>
<Description>HotFix_MaxConnectionReduction.dBAssembly.xml file in Release- Branch</Description>
<HP_Code_ReviewID>CR1234</HP_Code_ReviewID>
<Deployment_Change_Needed>No</Deployment_Change_Needed>
<Deployment_Change_Description>NA
</Deployment_Change_Description>
</logentry>
<logentry revision="21779">
<author>sudha</author>
<date>2015-05-19</date>
<QC_ID>NA</QC_ID>
<Rally_ID>US4940</Rally_ID>
<Description> Adding Release-Branch</Description>
<HP_Code_ReviewID> NA</HP_Code_ReviewID>
<Deployment_Change_Needed>No</Deployment_Change_Needed>
<Deployment_Change_Description>NA
</Deployment_Change_Description>
</logentry>
<logentry revision="21808">
<author>sudha</author>
<date>2015-05-25</date>
<QC_ID>NA</QC_ID>
<Rally_ID>US4940</Rally_ID>
<Description> modifying 15.6.1 in PP Release-Branch to bring new spaces in modules </Description>
<HP_Code_ReviewID> NA</HP_Code_ReviewID>
<Deployment_Change_Needed>No</Deployment_Change_Needed>
<Deployment_Change_Description>NA
</Deployment_Change_Description>
</logentry>
</log>
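When the value is null or NA, I need to extract the XML elements and create a new XML document from them.
The expected output for the example above (where the HP_Code_ReviewID tag value is NA) is: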
<?xml version="1.0" encoding="UTF-8"?>
<log>
<logentry revision="21808">
<author>sudha</author>
<date>2015-05-25</date>
<QC_ID>NA</QC_ID>
<Rally_ID>US4940</Rally_ID>
<Description> modifying 15.6.1 in PP Release-Branch to bring new spaces in modules </Description>
<HP_Code_ReviewID> NA</HP_Code_ReviewID>
<Deployment_Change_Needed>No</Deployment_Change_Needed>
<Deployment_Change_Description>NA
</Deployment_Change_Description>
</logentry>
</log>
Answer 0 (score: 0)
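Perl has the excellent XML::Twig library, which works well for parsing XML and for this kind of problem. I'll give you an example to get started, but please note: Stack Overflow is for helping you fix problems with your code, not for writing the code for you.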
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;
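my $twig = XML::Twig->new(
    'pretty_print'  => 'indented_a',
    'twig_handlers' => {
        # Called each time an HP_Code_ReviewID element has been parsed.
        'HP_Code_ReviewID' => sub {
            # Print the enclosing logentry if the text contains NA.
            if ( $_->text =~ m/NA/ ) { $_->parent->print }
        }
    }
)->parse( \*DATA );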
__DATA__
<?xml version="1.0" encoding="UTF-8"?>
<log>
<logentry revision="21754">
<author>Madhu</author>
<date>2015-05-12</date>
<QC_ID>NA</QC_ID>
<Rally_ID>US4940</Rally_ID>
<Description>HotFix_MaxConnectionReduction.dBAssembly.xml file in Release- Branch</Description>
<HP_Code_ReviewID>CR1234</HP_Code_ReviewID>
<Deployment_Change_Needed>No</Deployment_Change_Needed>
<Deployment_Change_Description>NA
</Deployment_Change_Description>
</logentry>
<logentry revision="21779">
<author>sudha</author>
<date>2015-05-19</date>
<QC_ID>NA</QC_ID>
<Rally_ID>US4940</Rally_ID>
<Description> Adding Release-Branch</Description>
<HP_Code_ReviewID> NA</HP_Code_ReviewID>
<Deployment_Change_Needed>No</Deployment_Change_Needed>
<Deployment_Change_Description>NA
</Deployment_Change_Description>
</logentry>
<logentry revision="21808">
<author>sudha</author>
<date>2015-05-25</date>
<QC_ID>NA</QC_ID>
<Rally_ID>US4940</Rally_ID>
<Description> modifying 15.6.1 in PP Release-Branch to bring new spaces in modules </Description>
<HP_Code_ReviewID> NA</HP_Code_ReviewID>
<Deployment_Change_Needed>No</Deployment_Change_Needed>
<Deployment_Change_Description>NA
</Deployment_Change_Description>
</logentry>
</log>
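This sets a twig handler for HP_Code_ReviewID; if the text in it contains NA, the handler prints the parent element.
Note that the result isn't, strictly speaking, valid XML, because it captures only the logentry elements and has no root element. The handler also fires as soon as HP_Code_ReviewID is closed, so the elements that follow it inside each logentry have not been parsed yet and are missing from the output. However, XML::Twig lets you do things like delete the elements that don't match and then print the rest of the document.
This prints: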
<logentry revision="21779">
<author>sudha</author>
<date>2015-05-19</date>
<QC_ID>NA</QC_ID>
<Rally_ID>US4940</Rally_ID>
<Description> Adding Release-Branch</Description>
<HP_Code_ReviewID> NA</HP_Code_ReviewID>
</logentry>
<logentry revision="21808">
<author>sudha</author>
<date>2015-05-25</date>
<QC_ID>NA</QC_ID>
<Rally_ID>US4940</Rally_ID>
<Description> modifying 15.6.1 in PP Release-Branch to bring new spaces in modules </Description>
<HP_Code_ReviewID> NA</HP_Code_ReviewID>
</logentry>
To keep these entries in context, you basically have to delete everything that doesn't match:
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;
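my $twig = XML::Twig->new(
    'pretty_print'  => 'indented_a',
    'twig_handlers' => {
        # Called once each logentry element has been fully parsed.
        'logentry' => sub {
            # Drop entries whose HP_Code_ReviewID text does not contain NA.
            if ( not $_->first_child_text('HP_Code_ReviewID') =~ m/NA/ ) {
                $_->delete;
            }
        }
    }
)->parse( \*DATA )->print;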
(With the same DATA block as above.)
This prints:
<?xml version="1.0" encoding="UTF-8"?>
<log>
<logentry revision="21779">
<author>sudha</author>
<date>2015-05-19</date>
<QC_ID>NA</QC_ID>
<Rally_ID>US4940</Rally_ID>
<Description> Adding Release-Branch</Description>
<HP_Code_ReviewID> NA</HP_Code_ReviewID>
<Deployment_Change_Needed>No</Deployment_Change_Needed>
<Deployment_Change_Description>NA
</Deployment_Change_Description>
</logentry>
<logentry revision="21808">
<author>sudha</author>
<date>2015-05-25</date>
<QC_ID>NA</QC_ID>
<Rally_ID>US4940</Rally_ID>
<Description> modifying 15.6.1 in PP Release-Branch to bring new spaces in modules </Description>
<HP_Code_ReviewID> NA</HP_Code_ReviewID>
<Deployment_Change_Needed>No</Deployment_Change_Needed>
<Deployment_Change_Description>NA
</Deployment_Change_Description>
</logentry>
</log>
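The question asks for values that are "null or NA", while m/NA/ matches any text that merely contains NA and never matches an empty value. A minimal sketch of a stricter check follows, assuming that "null" means an empty or missing HP_Code_ReviewID element. The filter_log helper and the shortened sample data are hypothetical and used only for illustration; they are not part of XML::Twig or of the original question.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper (not part of XML::Twig): returns the filtered document
# as a string, keeping only logentry elements whose HP_Code_ReviewID is
# missing, empty, or exactly "NA" once surrounding whitespace is trimmed.
sub filter_log {
    my ($xml) = @_;
    my $twig = XML::Twig->new(
        'pretty_print'  => 'indented',
        'twig_handlers' => {
            'logentry' => sub {
                # Returns '' when the child element is absent.
                my $id = $_->first_child_trimmed_text('HP_Code_ReviewID');
                $_->delete unless $id =~ m/^(?:NA)?$/;
            },
        },
    );
    $twig->parse($xml);
    return $twig->sprint;
}

# Shortened sample data, assumed for illustration only.
my $sample = <<'XML';
<log>
<logentry revision="1"><HP_Code_ReviewID>CR1234</HP_Code_ReviewID></logentry>
<logentry revision="2"><HP_Code_ReviewID> NA</HP_Code_ReviewID></logentry>
<logentry revision="3"><HP_Code_ReviewID></HP_Code_ReviewID></logentry>
<logentry revision="4"><HP_Code_ReviewID>NASA-42</HP_Code_ReviewID></logentry>
</log>
XML

print filter_log($sample);
```

With this check, revisions 2 and 3 should be kept and revisions 1 and 4 dropped. To write the result to a new file rather than standard output, the twig's print_to_file method can be called with a file name in place of sprint.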