Question

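Given the string "Hello4.2this.is random 24 text42", I want to return all ints or floats: [4.2, 24, 42]. All the solutions from other questions return only 24. I want to return a float even when non-digit characters sit right next to the number. Since I am new to Python, I am trying to avoid regex or other complicated imports. I have no idea how to start. Please help. Here is a research attempt: "Python: Extract numbers from a string". It did not work because it does not recognize 4.2 and 42. There are other questions like that one, and sadly none of them handle this case either.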

Answer 1

A regular expression from perldoc perlretut:

import re

# Raw string so that \d, \. and "\ " reach the regex engine unchanged.
re_float = re.compile(r"""(?x)
   ^
      [+-]?\ *      # first, match an optional sign *and space*
      (             # then match integers or f.p. mantissas:
          \d+       # start out with a ...
          (
              \.\d* # mantissa of the form a.b or a.
          )?        # ? takes care of integers of the form a
         |\.\d+     # mantissa of the form .b
      )
      ([eE][+-]?\d+)?  # finally, optionally match an exponent
   $""")
m = re_float.match("4.5")
print(m.group(0))
# -> 4.5
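
Because the pattern above is anchored with ^ and $, it only accepts strings that are a number from start to end. As a rough sketch of how such a whole-string check might be wrapped up, the snippet below uses a single-line version of the same pattern. The names NUMBER_PATTERN and is_number are made up for illustration, and re.fullmatch is swapped in for the explicit anchors.

```python
import re

# Single-line form of the pattern; the name is hypothetical.
NUMBER_PATTERN = re.compile(r"[+-]? *(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?")


def is_number(s):
    """Return True if the whole string s is one int or float literal."""
    return NUMBER_PATTERN.fullmatch(s) is not None


print(is_number("4.5"))     # the whole string is a number
print(is_number("4.5abc"))  # trailing text, so the check fails
```

This only validates a single token. It does not pull numbers out of surrounding text.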

To get all numbers from a string:

text = "4.5 foo 123 abc .123"  # renamed from str to avoid shadowing the built-in
print(re.findall(r"[+-]? *(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?", text))
# -> ['4.5', ' 123', ' .123']
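
Note that findall returns strings, some with leading spaces, whereas the question asks for actual numbers. One possible way to convert them is sketched below. The helper name find_numbers is an assumption for illustration, and the int/float rule (float when the token contains a '.' or an exponent) is one reasonable choice, not the only one.

```python
import re

NUMBER = re.compile(r"[+-]? *(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?")


def find_numbers(text):
    """Return every int or float found in text, as Python numbers."""
    numbers = []
    for token in NUMBER.findall(text):
        # The pattern allows spaces before the number and after a sign.
        token = token.replace(' ', '')
        if '.' in token or 'e' in token.lower():
            numbers.append(float(token))
        else:
            numbers.append(int(token))
    return numbers


print(find_numbers("Hello4.2this.is random 24 text42"))
# -> [4.2, 24, 42]
```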

Answer 2

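Using a regular expression will probably give you the most concise code for this problem. It is hard to beat the conciseness of

re.findall(r"[+-]? *(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?", text)

from pythad's answer.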

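However, you said "I am trying to avoid regex", so here is a solution that does not use regular expressions. It is obviously somewhat longer than a solution using a regular expression (and probably much slower), but it is not complicated.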

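The code loops through the input one character at a time. As it takes each character from the string, it appends the character to current (a string holding the number currently being parsed), but only if current is still a valid number, or the start of one, after the append. When it encounters a character that cannot be appended to current, current is saved to the list of numbers, but only if current itself is not one of '', '.', '-' or '-.'. Those strings could begin a number, but they are not valid numbers on their own.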
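
To make the append rule easier to see in isolation, here is a sketch that lifts the condition out of the extract_numbers function shown further down in this answer. The names can_extend and INCOMPLETE are hypothetical and only used here. As in the full function, the input is assumed to be lowercased already.

```python
# Prefixes that may start a number but are not numbers themselves.
INCOMPLETE = ('', '.', '-', '-.')


def can_extend(current, c):
    """Return True if current + c is still a valid number or number prefix."""
    return (
        c.isdigit()
        # An exponent marker needs at least one digit before it, and only one 'e'.
        or (c == 'e' and 'e' not in current and current not in INCOMPLETE)
        # One decimal point at most, and never inside the exponent.
        or (c == '.' and 'e' not in current and '.' not in current)
        # A '+' is only allowed directly after the exponent marker.
        or (c == '+' and current.endswith('e'))
        # A '-' may start a number or follow the exponent marker.
        or (c == '-' and (current == '' or current.endswith('e')))
    )


print(can_extend('4', '.'))    # '4.' can still become a float
print(can_extend('4.2', '.'))  # a second '.' ends the number
```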

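When current is saved, a trailing 'e', 'e-' or 'e+' is removed. This happens with a string such as '1.23eA'. While parsing that string, current eventually becomes '1.23e', but then 'A' is encountered, which means the string does not contain a valid exponent part, so the 'e' is discarded.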
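
The trimming step on its own might look like the following sketch. The helper name strip_dangling_exponent is an assumption for illustration. In the full function the same logic is written inline.

```python
def strip_dangling_exponent(current):
    """Drop an exponent marker that was never followed by digits."""
    if current.endswith('e'):
        return current[:-1]
    if current.endswith(('e-', 'e+')):
        return current[:-2]
    return current


print(strip_dangling_exponent('1.23e'))     # -> 1.23
print(strip_dangling_exponent('6.022e23'))  # -> 6.022e23
```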

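After current is saved, it is reset. Normally current is reset to '', but when the character that triggered the save is '.' or '-', current is set to that character instead, because those characters could be the start of a new number.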
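
As a small illustration of the reset rule, consider "2.3-99": the '-' cannot extend '2.3', so '2.3' is saved and the '-' becomes the start of '-99'. The helper name next_current below is hypothetical. The full function does this with a plain if/else.

```python
def next_current(c):
    """Return the new value of current after character c ended a number."""
    # '.' and '-' may begin the next number, so keep them as its first character.
    return c if c in ('.', '-') else ''


print(repr(next_current('-')))  # -> '-'
print(repr(next_current('t')))  # -> ''
```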

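Here is the function extract_numbers(s). The line before return numbers converts the list of strings to a list of integers and floating point values. If you only want the strings, remove that line.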

def extract_numbers(s):
    """
    Extract numbers from a string.

    Examples
    --------
    >>> extract_numbers("Hello4.2this.is random 24 text42")
    [4.2, 24, 42]

    >>> extract_numbers("2.3+45-99")
    [2.3, 45, -99]

    >>> extract_numbers("Avogadro's number, 6.022e23, is greater than 1 million.")
    [6.022e+23, 1]
    """
    numbers = []
    current = ''
    for c in s.lower() + '!':
        if (c.isdigit() or
            (c == 'e' and ('e' not in current) and (current not in ['', '.', '-', '-.'])) or
            (c == '.' and ('e' not in current) and ('.' not in current)) or
            (c == '+' and current.endswith('e')) or
            (c == '-' and ((current == '') or current.endswith('e')))):
            current += c
        else:
            if current not in ['', '.', '-', '-.']:
                if current.endswith('e'):
                    current = current[:-1]
                elif current.endswith('e-') or current.endswith('e+'):
                    current = current[:-2]
                numbers.append(current)
            if c == '.' or c == '-':
                current = c
            else:
                current = ''

    # Convert from strings to actual python numbers.
    numbers = [float(t) if ('.' in t or 'e' in t) else int(t) for t in numbers]

    return numbers

Find all floats or ints in a given string
