Question

I use the following function to retrieve a web service response:

private String getSoapResponse (String url, String host, String encoding, String soapAction, String soapRequest) throws MalformedURLException, IOException, Exception {         
    URL wsUrl = new URL(url);     
    URLConnection connection = wsUrl.openConnection();     
    HttpURLConnection httpConn = (HttpURLConnection)connection;     
    // Default to UTF-8 when no encoding is supplied (isEmpty(), not ==, compares string content)
    if (encoding == null || encoding.isEmpty())
        encoding = "UTF-8";

    // Encode the request body with the same charset that Content-Type declares
    byte[] b = soapRequest.getBytes(encoding);

    httpConn.setRequestMethod("POST");
    httpConn.setRequestProperty("Host", host);

    httpConn.setRequestProperty("Content-Type", "text/xml; charset=" + encoding);
    httpConn.setRequestProperty("Content-Length", String.valueOf(b.length));
    httpConn.setRequestProperty("SOAPAction", soapAction);

    httpConn.setDoOutput(true);
    httpConn.setDoInput(true);

    OutputStream out = httpConn.getOutputStream();
    out.write(b); 
    out.close();

    InputStreamReader is = new InputStreamReader(httpConn.getInputStream());
    StringBuilder sb = new StringBuilder();
    BufferedReader br = new BufferedReader(is);
    String read = br.readLine();

    while(read != null) {
        sb.append(read);
        read = br.readLine();
    }

    String response = decodeHtmlEntityCharacters(sb.toString());    

    return response = decodeHtmlEntityCharacters(response);
}

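The problem with this code is that the response it returns is full of special characters, which makes the XML structure invalid. Sample response: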

&lt;PLANT&gt;A565&lt;/PLANT&gt;
          &lt;PLANT&gt;A567&lt;/PLANT&gt;
          &lt;PLANT&gt;A585&lt;/PLANT&gt;
          &lt;PLANT&gt;A921&lt;/PLANT&gt;
          &lt;PLANT&gt;A938&lt;/PLANT&gt;
        &lt;/PLANT_GROUP&gt;
      &lt;/KPI_PLANT_GROUP_KEYWORD&gt;
      &lt;MSU_CUSTOMERS/&gt;
    &lt;/DU&gt;
    &lt;DU&gt;

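To work around this, I pass the whole response to the method below, which replaces every special character with the corresponding punctuation mark.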

private final static Hashtable htmlEntitiesTable = new Hashtable();
static {
    htmlEntitiesTable.put("&amp;","&");
    htmlEntitiesTable.put("&quot;","\"");
    htmlEntitiesTable.put("&lt;","<");
    htmlEntitiesTable.put("&gt;",">");  
}

private String decodeHtmlEntityCharacters(String inputString) throws Exception {
    Enumeration en = htmlEntitiesTable.keys();

    while(en.hasMoreElements()){
        String key = (String)en.nextElement();
        String val = (String)htmlEntitiesTable.get(key);

        inputString = inputString.replaceAll(key, val);
    }

    return inputString;
}

But then another problem came up. If the response contains the segment &lt;VALUE&gt;&lt; 0.5 &lt;/VALUE&gt; and the method processes it, the output is:

<VALUE>< 0.5</VALUE>

This makes the XML structure invalid again. The data itself, "< 0.5", is correct and valid, but putting it inside the VALUE element breaks the XML structure.

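Can you help me solve this? Maybe I can improve the way I fetch or build the response. Is there a better way to call a web service and get its response?

How do I handle elements that contain "<" or ">"?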

Answer 1

Are you able to use a third-party open-source library?

You should try Apache commons-lang:

StringEscapeUtils.unescapeXml(xml)

More details are available in the following Stack Overflow post:

how to unescape XML in java

Documentation:

http://commons.apache.org/proper/commons-lang/javadocs/api-release/index.html
http://commons.apache.org/proper/commons-lang/userguide.html#lang3

Answer 2

You are using SOAP wrong.

In particular, you do not need the following line of code:

     String response = decodeHtmlEntityCharacters(sb.toString());

Just return sb.toString(). And for $DEITY's sake, do not use string methods to parse the retrieved string; use an XML parser or a full SOAP stack...
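A minimal sketch of the parser-based approach, using only the DOM parser that ships with the JDK. The envelope, the element names Result and VALUE, and the class and method names are hypothetical. It also assumes the sender escapes correctly, so that a literal "<" in the data arrives as &amp;lt; inside the escaped payload. The parser undoes exactly one level of escaping per parse, so the inner document is parsed a second time rather than patched with string replacement.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class SoapPayloadParser {

    // Parses an XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Returns the text of the first element with the given local name,
    // in any namespace. The parser has already decoded the entities.
    static String firstText(Document doc, String localName) {
        return doc.getElementsByTagNameNS("*", localName).item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical response: the payload is escaped XML inside <Result>.
        String response = "<Envelope><Body><Result>"
                + "&lt;DU&gt;&lt;VALUE&gt;&amp;lt; 0.5&lt;/VALUE&gt;&lt;/DU&gt;"
                + "</Result></Body></Envelope>";

        // First parse: unwraps the envelope and unescapes the payload once.
        String inner = firstText(parse(response), "Result");
        // Second parse: reads the payload as XML in its own right.
        String value = firstText(parse(inner), "VALUE");

        System.out.println(inner); // prints <DU><VALUE>&lt; 0.5</VALUE></DU>
        System.out.println(value); // prints < 0.5
    }
}
```

If the sender does not escape the data a second time, the inner string is not well-formed XML and the second parse throws a SAXParseException, which at least reports the problem instead of silently producing broken markup.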

Answer 3

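Do the > or < characters always appear at the beginning of the value? If so, you could use a regular expression to handle the cases where &gt; or &lt; is followed by a digit (or a dot, for that matter).

Sample code, assuming the replacement strings used in it do not appear anywhere else in the XML: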

private String decodeHtmlEntityCharacters(String inputString) throws Exception {
    Enumeration en = htmlEntitiesTable.keys();

    // Replaces &gt; or &lt; followed by dot or digit (while keeping the dot/digit)
    inputString = inputString.replaceAll("&gt;(\\.?\\d)", "Valuegreaterthan$1");
    inputString = inputString.replaceAll("&lt;(\\.?\\d)", "Valuelesserthan$1");

    while(en.hasMoreElements()){
        String key = (String)en.nextElement();
        String val = (String)htmlEntitiesTable.get(key);

        inputString = inputString.replaceAll(key, val);
    }

    inputString = inputString.replaceAll("Valuelesserthan", "&lt;");
    inputString = inputString.replaceAll("Valuegreaterthan", "&gt;");

    return inputString;
}

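Note that the most appropriate fix (and the easiest for everyone) is to encode the XML correctly on the sender side (which would also make my solution stop working).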

Answer 4

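It is hard to handle every case, but you can cover the most common ones by adding a few more rules: assume that any less-than sign followed by a space, and any greater-than sign preceded by a space, is data and needs to be encoded again.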

private final static Hashtable htmlEntitiesTable = new Hashtable();
static {
    htmlEntitiesTable.put("&amp;","&");
    htmlEntitiesTable.put("&quot;","\"");
    htmlEntitiesTable.put("&lt;","<");
    htmlEntitiesTable.put("&gt;",">");  
}

private String decodeHtmlEntityCharacters(String inputString) throws Exception {
    Enumeration en = htmlEntitiesTable.keys();

    while(en.hasMoreElements()){
        String key = (String)en.nextElement();
        String val = (String)htmlEntitiesTable.get(key);

        inputString = inputString.replaceAll(key, val);
    }

    inputString = inputString.replaceAll("< ","&lt; ");       
    inputString = inputString.replaceAll(" >"," &gt;");       

    return inputString;
}
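The same heuristic can be sketched in a self-contained form. This is a variant, not the code above: the class name EntityDecoder is hypothetical, a LinkedHashMap replaces the Hashtable because Hashtable does not guarantee the order in which its keys are enumerated, and &amp; is decoded last so that each call undoes only one level of escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class EntityDecoder {

    // Insertion-ordered, so "&amp;" is always decoded after the other entities.
    private static final Map<String, String> ENTITIES = new LinkedHashMap<>();
    static {
        ENTITIES.put("&lt;", "<");
        ENTITIES.put("&gt;", ">");
        ENTITIES.put("&quot;", "\"");
        ENTITIES.put("&amp;", "&");
    }

    static String decode(String input) {
        String result = input;
        for (Map.Entry<String, String> entry : ENTITIES.entrySet()) {
            // String.replace substitutes every literal occurrence (no regex involved).
            result = result.replace(entry.getKey(), entry.getValue());
        }
        // Heuristic: "< " and " >" are taken to be data, so they are encoded again.
        result = result.replace("< ", "&lt; ");
        result = result.replace(" >", " &gt;");
        return result;
    }

    public static void main(String[] args) {
        System.out.println(decode("&lt;VALUE&gt;&lt; 0.5&lt;/VALUE&gt;"));
        // prints <VALUE>&lt; 0.5</VALUE>
    }
}
```

This remains a heuristic: a tag written with a space before its closing bracket, such as a start tag ending in " >", would be re-encoded by mistake.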

Answer 5

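'>' is not escaped in XML, so you should not have a problem with it. As for '<', these are the options I can think of.

Use CDATA in the web response for text that contains special characters.
Rewrite the text by reversing the order. For example, if it is x < 2, change it to 2 > x. '>' is not escaped unless it is part of CDATA.
Use another attribute or element in the XML response to indicate '<' or '>'.
Use a regular expression to find a sequence that starts with '<', is followed by a string, and is then followed by the '<' of the closing tag, and replace it with some code or value that you can interpret and replace later.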
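The CDATA option can be illustrated with the JDK's built-in DOM parser. This is only a sketch: the class name CdataExample and the sample documents are made up for illustration. It shows that a parser returns the same text whether the sender wraps the value in CDATA or escapes it as an entity.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class CdataExample {

    // Parses the XML and returns the text content of its root element.
    static String rootText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // The value wrapped in a CDATA section: no escaping needed inside it.
        System.out.println(rootText("<VALUE><![CDATA[< 0.5]]></VALUE>")); // prints < 0.5
        // The same value escaped as an entity.
        System.out.println(rootText("<VALUE>&lt; 0.5</VALUE>"));          // prints < 0.5
    }
}
```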

Also, you do not need to do this:

String response = decodeHtmlEntityCharacters(sb.toString());

You should be able to parse the XML once the literal '<' sign has been taken care of.

You can use this site to test the regular expression.

Answer 6

Why not deserialize your XML? It is much easier than what you are doing.

For example:

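// C#; requires: using System.Xml; using System.Xml.Serialization;
var ser = new XmlSerializer(typeof(MyXMLObject));
using (var reader = XmlReader.Create("http.....xml"))
{
     // Cast to the same type the serializer was created for
     MyXMLObject _myobj = (MyXMLObject)ser.Deserialize(reader);
}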

Formatting a web service response
