.NET: Prevent XmlDocument.LoadXml from retrieving the DTD

Time: 2010-12-14 23:30:40

Tags: .net xml dtd

I have the following code (C#). It takes far too long and then throws an exception:

new XmlDocument().
LoadXml("<?xml version='1.0' ?><!DOCTYPE note SYSTEM 'http://someserver/dtd'><note></note>");

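I understand why it does that. My question is: how do I make it stop? I don't care about DTD validation. I suppose I could strip the DOCTYPE with a regex, but I'm looking for a more elegant solution.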

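Background:
The actual XML comes from a website I don't own. While the site is under maintenance, it returns XML with a DOCTYPE that points to a DTD which is unavailable during the maintenance window. As a result, my service fails unnecessarily, because it tries to fetch the DTD for every XML document I need to parse.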

Here is the exception stack trace:

Unhandled Exception: System.Net.WebException: The remote name could not be resolved: 'someserver'
   at System.Net.HttpWebRequest.GetResponse()
   at System.Xml.XmlDownloadManager.GetNonFileStream(Uri uri, ICredentials credentials)
   at System.Xml.XmlDownloadManager.GetStream(Uri uri, ICredentials credentials)
   at System.Xml.XmlUrlResolver.GetEntity(Uri absoluteUri, String role, Type ofObjectToReturn)
   at System.Xml.XmlTextReaderImpl.OpenStream(Uri uri)
   at System.Xml.XmlTextReaderImpl.DtdParserProxy_PushExternalSubset(String systemId, String publicId)
   at System.Xml.XmlTextReaderImpl.DtdParserProxy.System.Xml.IDtdParserAdapter.PushExternalSubset(String systemId, String publicId)
   at System.Xml.DtdParser.ParseExternalSubset()
   at System.Xml.DtdParser.ParseInDocumentDtd(Boolean saveInternalSubset)
   at System.Xml.DtdParser.Parse(Boolean saveInternalSubset)
   at System.Xml.XmlTextReaderImpl.DtdParserProxy.Parse(Boolean saveInternalSubset)
   at System.Xml.XmlTextReaderImpl.ParseDoctypeDecl()
   at System.Xml.XmlTextReaderImpl.ParseDocumentContent()
   at System.Xml.XmlTextReaderImpl.Read()
   at System.Xml.XmlLoader.LoadDocSequence(XmlDocument parentDoc)
   at System.Xml.XmlLoader.Load(XmlDocument doc, XmlReader reader, Boolean preserveWhitespace)
   at System.Xml.XmlDocument.Load(XmlReader reader)
   at System.Xml.XmlDocument.LoadXml(String xml)
   at ConsoleApplication36.Program.Main(String[] args) in c:\Projects\temp\ConsoleApplication36\Program.cs:line 11

2 Answers:

Answer 0 (score: 10)

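Well, in .NET 4.0, XmlTextReader has a property called DtdProcessing. Setting it to DtdProcessing.Ignore should disable DTD processing, so the reader skips the DOCTYPE declaration instead of trying to download the external DTD.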
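A minimal sketch of how that might look, assuming .NET 4.0 or later. XmlDocument.LoadXml creates its own reader internally, so this sketch builds an XmlTextReader over a StringReader and passes it to XmlDocument.Load instead. The Program class and the variable names are only illustrative, and the XML string is the one from the question:

```csharp
using System;
using System.IO;
using System.Xml;

class Program
{
    static void Main()
    {
        // Same document as in the question: the DOCTYPE points at an unreachable DTD.
        string xml = "<?xml version='1.0' ?><!DOCTYPE note SYSTEM 'http://someserver/dtd'><note></note>";

        using (var stringReader = new StringReader(xml))
        using (var xmlReader = new XmlTextReader(stringReader))
        {
            // Skip the DOCTYPE declaration so the external DTD is never requested.
            xmlReader.DtdProcessing = DtdProcessing.Ignore;

            var doc = new XmlDocument();
            // Load from the configured reader rather than calling LoadXml.
            doc.Load(xmlReader);

            Console.WriteLine(doc.DocumentElement.Name);
        }
    }
}
```

Note that with DtdProcessing.Ignore the DOCTYPE is dropped entirely, so any entities declared in the DTD would not be resolved. That should be fine when, as in the question, DTD validation is not needed.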

Answer 1 (score: 1)

In .NET 4.5.1, I had no luck with setting doc.XmlResolver to null.

The simplest fix for me was to use a string replace to change "xmlns=" to "ignore=" before calling LoadXml(), e.g.

var responseText = await response.Content.ReadAsStringAsync();
responseText = responseText.Replace("xmlns=", "ignore=");
try
{
    var doc = new XmlDocument();
    doc.LoadXml(responseText);
    ...
}