Question

I have a function that takes a string containing a math expression, such as 4+9 or 6+9*8, and evaluates it from left to right (ignoring the normal order-of-operations rules).

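I have been stuck on this problem for the past few hours and have finally found the culprit, but I have no idea why it behaves this way. When I split the string with a regex (.split("\\d") and .split("\\D")), I put the results into two arrays: an int[] that holds the numbers in the expression and a String[] that holds the operations.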

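What I've realized is that when I do the following:

String question = "5+9*8";
String[] mathOperations = question.split("\\d");
for (int i = 0; i < mathOperations.length; i++) {
    System.out.println("Math Operation at " + i + " is " + mathOperations[i]);
}

it does not put the first operator at index 0; it puts it at index 1 instead. Why is that? This is the output on the console:

Math Operation at 0 is 
Math Operation at 1 is +
Math Operation at 2 is *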

Answer 1

Because there is an empty string at position 0 of mathOperations. In other words:

mathOperations = {"", "+", "*"};

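According to the split documentation:

"The array returned by this method contains each substring of this string that is terminated by another substring that matches the given expression or is terminated by the end of the string. ..."

Why is there no empty string at the end of the array as well?

"Trailing empty strings are therefore not included in the resulting array."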

A more detailed explanation: your regex matches the string like this:

"(5)+(9)*(8)" -> "" + (5) + "+" + (9) + "*" + (8) + ""

The trailing empty string, however, is discarded, as the documentation specifies. (Hope this silly illustration helps.)
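The discarded trailing string can be made visible with the two-argument form of split. The following is only a minimal sketch; the class and method names are made up for illustration. A negative limit asks split to keep trailing empty strings, so the difference from the default call shows up directly.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the original question.
public class TrailingEmptyDemo {
    // Default behaviour: same as a limit of 0, trailing empty strings are dropped.
    static String[] splitDefault(String s) {
        return s.split("\\d");
    }

    // A negative limit keeps trailing empty strings in the result.
    static String[] splitKeepTrailing(String s) {
        return s.split("\\d", -1);
    }

    public static void main(String[] args) {
        // prints [, +, *]
        System.out.println(Arrays.toString(splitDefault("5+9*8")));
        // prints [, +, *, ]
        System.out.println(Arrays.toString(splitKeepTrailing("5+9*8")));
    }
}
```

In both calls the leading empty string stays, because the digit matched at the start of the input has positive width.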

It is also worth noting that the regex "\\d" splits the string "55+5" into

["", "", "+"]

That is because you are matching only a single character; you should use "\\d+" instead.
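One way to put these points together is sketched below. It assumes the expression starts with a number and contains only digits and single-character operators; the class and method names are invented for the example. Numbers come from splitting on runs of non-digits, operators from splitting on runs of digits, and the leading empty string is filtered out afterwards.

```java
import java.util.Arrays;

// Hypothetical helper class, assuming input such as "55+5*8".
public class SplitDemo {
    // Split on runs of non-digits; no leading empty string appears
    // as long as the expression starts with a digit.
    static String[] numbers(String expr) {
        return expr.split("\\D+");
    }

    // Split on runs of digits, then drop the empty string that the
    // match at the start of the input leaves at index 0.
    static String[] operators(String expr) {
        return Arrays.stream(expr.split("\\d+"))
                .filter(s -> !s.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        // prints [55, 5, 8]
        System.out.println(Arrays.toString(numbers("55+5*8")));
        // prints [+, *]
        System.out.println(Arrays.toString(operators("55+5*8")));
    }
}
```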

Answer 2

You may find the following variation of your program helpful, since a single split does the work of your two...

public class zw {
    public static void main(String[] args) {
        String question = "85+9*8-900+77";
        // Split at every word boundary, i.e. between digits and operators
        String[] bits = question.split("\\b");
        for (int i = 0; i < bits.length; ++i) System.out.println("[" + bits[i] + "]");
    }
}

And its output:

[]
[85]
[+]
[9]
[*]
[8]
[-]
[900]
[+]
[77]

In this program I use \b as a "zero-width boundary" to split on. No characters were harmed during the splitting; they all end up in the array.

More information here: https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html and here: http://www.regular-expressions.info/wordboundaries.html
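For the left-to-right evaluation the question is after, a single zero-width split can feed a simple loop. This is only a rough sketch under a few assumptions: the input contains non-negative integers and the operators +, -, * and /, it starts with a number, and the class and method names are made up here. It splits on digit/non-digit boundaries using lookarounds rather than \b, so the tokens do not depend on how a zero-width match at the very start of the string is handled (Java 8 and later no longer produce the leading empty string seen in the output above).

```java
// Hypothetical evaluator, not taken from either answer.
public class LeftToRight {
    static int evaluate(String expr) {
        // Zero-width split between a digit and a non-digit (either order),
        // e.g. "6+9*8" becomes ["6", "+", "9", "*", "8"].
        String[] tokens = expr.split("(?<=\\d)(?=\\D)|(?<=\\D)(?=\\d)");
        int result = Integer.parseInt(tokens[0]);
        // Tokens alternate: operator at odd indices, operand right after it.
        for (int i = 1; i + 1 < tokens.length; i += 2) {
            int operand = Integer.parseInt(tokens[i + 1]);
            switch (tokens[i]) {
                case "+": result += operand; break;
                case "-": result -= operand; break;
                case "*": result *= operand; break;
                case "/": result /= operand; break;
                default:
                    throw new IllegalArgumentException("Unknown operator: " + tokens[i]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("6+9*8")); // (6+9)*8, prints 120
        System.out.println(evaluate("4+9"));   // prints 13
    }
}
```

Because everything lands in one array, there is no need to keep a separate int[] and String[] aligned by index.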

Regex split does not store an element in the first index
