Validating a URL query string with regex

Asked: 2014-05-30 16:33:08

Tags: regex validation expression

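I am trying to validate a query string with a regular expression. Note that I am not trying to extract the values, only to validate the syntax. I am doing this to practice regex, so I would appreciate help rather than "use this lib", although seeing how a library does it would also help, so tell me if you know of one.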

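So, these are the requirements:

  • It must start with a question mark.
  • It may contain keys, each with or without a value separated by an equals sign; the pairs are separated by ampersands.

I have gotten fairly far, but I am having trouble expressing in the regex that the equals signs and ampersands must appear in a particular order, without repeating match groups. This is what I have so far: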

#^\?([\w\-]+((&|=)([\w\-]+)*)*)?$#

It correctly matches ?abc=123&def=345, but it also incorrectly matches, for example, ?abc=123=456.
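
The behaviour described above can be reproduced with a short Python sketch. Python is an assumption here (the question does not name a language), and `FIRST_ATTEMPT` and `is_accepted` are illustrative names. Because `(&|=)` lets either separator follow any token, nothing in the pattern forces `=` and `&` to alternate.

```python
import re

# The first attempt from the question, with the PHP-style # delimiters removed.
FIRST_ATTEMPT = re.compile(r'^\?([\w\-]+((&|=)([\w\-]+)*)*)?$')


def is_accepted(candidate: str) -> bool:
    """Return True if the first-attempt pattern accepts the candidate."""
    return FIRST_ATTEMPT.match(candidate) is not None


for candidate in ('?abc=123&def=345', '?abc=123=456'):
    # Both lines print True: the second input is the unwanted match.
    print(candidate, is_accepted(candidate))
```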

I could go overkill and do something like...

/^\?([\w\-]+=?([\w\-]+)?(&[\w\-]+(=?[\w\-]*)?)*)?$/

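...but I don't want to repeat the same match group.

How do I tell the regex that the separators between values must alternate between & and =, without repeating match groups or causing catastrophic backtracking?

Thanks.

Edit

I want to clarify that this is not meant for a real-world implementation; for that, you should use the built-in library that your language most likely provides. I am asking because I want to improve my regex skills, and parsing a query string seemed like a worthwhile challenge.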

6 answers:

Answer 0 (score: 9)

This seems to be what you want:

^\?([\w-]+(=[\w-]*)?(&[\w-]+(=[\w-]*)?)*)?$

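See the live demo.

This treats each "pair" as a key followed by an optional value (which may be empty). There is a first pair, followed by zero or more repetitions of an & and another pair, and the whole expression (apart from the leading ?) is optional. Structuring it this way prevents matching ?&abc=def.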
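
As a quick check of that structure, here is a minimal Python sketch that runs the expression against a few inputs. The choice of Python and the names `QUERY` and `is_valid_query` are assumptions for illustration, not part of the answer.

```python
import re

# The expression from this answer, unchanged.
QUERY = re.compile(r'^\?([\w-]+(=[\w-]*)?(&[\w-]+(=[\w-]*)?)*)?$')


def is_valid_query(candidate: str) -> bool:
    """Return True if the candidate is a syntactically valid query string."""
    return QUERY.match(candidate) is not None


samples = ['?', '?abc', '?abc=', '?abc=123&def=345',
           '?abc=123=456', '?&abc=def', 'abc=123']
for sample in samples:
    print(f'{sample!r}: {is_valid_query(sample)}')
```

The first four samples are accepted and the last three are rejected. Note that in Python `$` also matches just before a trailing newline, so `\Z` may be preferable if that matters.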

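Also note that a hyphen does not need escaping when it is the last character in a character class, which allows a slight simplification.

You seem to want to allow hyphens anywhere in keys or values. If keys must not contain hyphens: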

^\?(\w+(=[\w-]*)?(&\w+(=[\w-]*)?)*)?$
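
Both expressions in this answer spell out the pair sub-pattern twice. One way to write it only once in the source, sketched here in Python as an assumption about the host language, is to compose the pattern from strings. `build_query_regex` and its `key`/`value` parameters are hypothetical names; the compiled regex still contains the pair twice, only the source avoids the duplication.

```python
import re


def build_query_regex(key: str = r'[\w-]+', value: str = r'[\w-]*') -> re.Pattern:
    """Compose the query-string regex from a single key/value pair pattern."""
    # One pair: a key, optionally followed by '=' and a (possibly empty) value.
    pair = rf'{key}(?:={value})?'
    # A first pair, then any number of '&'-prefixed pairs; all of it optional.
    return re.compile(rf'^\?(?:{pair}(?:&{pair})*)?$')


hyphen_keys = build_query_regex()            # hyphens allowed in keys
plain_keys = build_query_regex(key=r'\w+')   # hyphen-free keys

print(bool(hyphen_keys.match('?my-key=1')))   # True
print(bool(plain_keys.match('?my-key=1')))    # False
print(bool(plain_keys.match('?my_key=a-b')))  # True
```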

Answer 1 (score: 3)

You can use this regex:

^\?([^=]+=[^=]+&)+[^=]+(=[^=]+)?$

Here is what it does:

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  ^                        the beginning of the string
--------------------------------------------------------------------------------
  \?                       '?'
--------------------------------------------------------------------------------
  (                        group and capture to \1 (1 or more times
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    [^=]+                    any character except: '=' (1 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    =                        '='
--------------------------------------------------------------------------------
    [^=]+                    any character except: '=' (1 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    &                        '&'
--------------------------------------------------------------------------------
  )+                       end of \1 (NOTE: because you are using a
                           quantifier on this capture, only the LAST
                           repetition of the captured pattern will be
                           stored in \1)
--------------------------------------------------------------------------------
  [^=]+                    any character except: '=' (1 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
  (                        group and capture to \2 (optional
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    =                        '='
--------------------------------------------------------------------------------
    [^=]+                    any character except: '=' (1 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )?                       end of \2 (NOTE: because you are using a
                           quantifier on this capture, only the LAST
                           repetition of the captured pattern will be
                           stored in \2)
--------------------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
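
To see what the breakdown above implies in practice, the following Python sketch (Python and the names `ANSWER_1` and `accepts` are assumptions) runs the expression on a few inputs. Two consequences stand out: the `(...)+` group requires at least one `key=value&` prefix, and `[^=]` also matches `&`, so the pattern is looser than the requirements in the question.

```python
import re

# The expression from this answer, unchanged.
ANSWER_1 = re.compile(r'^\?([^=]+=[^=]+&)+[^=]+(=[^=]+)?$')


def accepts(candidate: str) -> bool:
    """Return True if the expression from this answer matches the candidate."""
    return ANSWER_1.match(candidate) is not None


print(accepts('?abc=123&def=345'))  # True
print(accepts('?abc=123=456'))      # False: there is no '&' to close the first group
print(accepts('?abc=123'))          # False: a single pair has no 'key=value&' prefix
print(accepts('?a=1&&&b'))          # True: '[^=]' happily consumes extra '&' characters
```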

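Answer 2 (score: 1)

This is probably not a job for regexes, but for an existing tool in your language of choice. Regexes are not a magic wand that you wave at every problem that happens to involve strings. You probably want to use existing code that has already been written, tested, and debugged.

In PHP, use the parse_url function.

Perl: URI module

Ruby: URI module

.NET: 'Uri' class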
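
Python is not among the languages listed above, but its standard library offers an analogous route through `urllib.parse`. The sketch below is only an illustration under that assumption; the example URL and the helper name `query_pairs` are made up.

```python
from urllib.parse import parse_qsl, urlsplit


def query_pairs(url: str) -> list:
    """Split a URL and return its query string as (key, value) tuples."""
    query = urlsplit(url).query
    # keep_blank_values=True keeps keys that have no value, such as 'flag'.
    return parse_qsl(query, keep_blank_values=True)


print(query_pairs('https://example.com/page?abc=123&def=345'))
print(query_pairs('https://example.com/page?abc=123&flag'))
```

Note that this parses rather than validates: a library parser is typically more lenient than the strict syntax check the question asks for.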

Answer 3 (score: 0)

I agree with Andy Lester, but a possible regex solution is

#^\?([\w-]+=[\w-]*(&[\w-]+=[\w-]*))?$#

It is very similar to what you posted.

I have not tested it, and you did not say which language you are using, so it may need some tweaking.
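
Since the expression is untested, here is a small Python check of it (Python and the names `AS_POSTED` and `WITH_STAR` are assumptions). As posted, the inner `(&...)` group has no quantifier, so only an empty query or exactly two pairs match. `WITH_STAR` adds a `*` after that group, which is a guess at the intent rather than something the answer states.

```python
import re

# The expression as posted, with the # delimiters removed.
AS_POSTED = re.compile(r'^\?([\w-]+=[\w-]*(&[\w-]+=[\w-]*))?$')
# Assumed tweak: allow zero or more '&key=value' repetitions.
WITH_STAR = re.compile(r'^\?([\w-]+=[\w-]*(&[\w-]+=[\w-]*)*)?$')

for candidate in ('?abc=123', '?abc=123&def=345', '?a=1&b=2&c=3'):
    print(candidate,
          AS_POSTED.match(candidate) is not None,
          WITH_STAR.match(candidate) is not None)
```

Either way, every key must be followed by `=`, so a bare key such as `?abc` is rejected by both versions.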

Answer 4 (score: 0)

I made this.

function isValidURL(url) {
  // based off https://mathiasbynens.be/demo/url-regex. testing https://regex101.com/r/pyrDTK/2
  // JavaScript regex literal: uses \uXXXX escapes and only the i flag (PCRE's \x{...} escapes and S modifier are not valid here)
  var pattern = /^(?:(?:https?|ftp):\/\/)(?:\S+(?::\S*)?@)?(?:(?!10(?:\.\d{1,3}){3})(?!127(?:\.\d{1,3}){3})(?!169\.254(?:\.\d{1,3}){2})(?!192\.168(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]+-?)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})))(?::\d{2,5})?(?:\/?)(?:(?:\?(?:(?!&|\?)(?:\S))+=(?:(?!&|\?)(?:\S))+)(?:&(?:(?!&|\?)(?:\S))+=(?:(?!&|\?)(?:\S))+)*)?$/i;
  return pattern.test(url);
}

Based on: https://mathiasbynens.be/demo/url-regex

Tests: https://regex101.com/r/pyrDTK/4/

Answer 5 (score: 0)

When you need to validate very complex URLs, you can use this regex
