Parse an HTTP request Authorization header with Python

Asked: 2009-08-28 21:07:56

Tags: python http google-app-engine parsing http-headers

I need to take a header like this:

 Authorization: Digest qop="chap",
     realm="testrealm@host.com",
     username="Foobear",
     response="6629fae49393a05397450978507c4ef1",
     cnonce="5ccc069c403ebaf9f0171e9517f40e41"

and parse it into this using Python:

{'protocol':'Digest',
  'qop':'chap',
  'realm':'testrealm@host.com',
  'username':'Foobear',
  'response':'6629fae49393a05397450978507c4ef1',
  'cnonce':'5ccc069c403ebaf9f0171e9517f40e41'}

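Is there a library that does this, or something I could look at for inspiration?

I'm doing this on Google App Engine, and I'm not sure whether the Pyparsing library is available there, but I could include it with my app if it is the best solution.

Currently I'm creating my own MyHeaderParser object and using it with reduce() on the header string. It works, but it is very fragile.

Brilliant solution by nadia below: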

import re

reg = re.compile(r'(\w+)[=] ?"?(\w+)"?')

s = """Digest
realm="stackoverflow.com", username="kixx"
"""

print(dict(reg.findall(s)))

10 answers:

Answer 0 (score: 12)

A little regex:

import re
reg = re.compile(r'(\w+)[:=] ?"?(\w+)"?')

>>> dict(reg.findall(headers))

{'username': 'Foobear', 'realm': 'testrealm', 'qop': 'chap', 'cnonce': '5ccc069c403ebaf9f0171e9517f40e41', 'response': '6629fae49393a05397450978507c4ef1', 'Authorization': 'Digest'}

Answer 1 (score: 9)

You can also use urllib2, as CherryPy does.

Here is the snippet:

input= """
 Authorization: Digest qop="chap",
     realm="testrealm@host.com",
     username="Foobear",
     response="6629fae49393a05397450978507c4ef1",
     cnonce="5ccc069c403ebaf9f0171e9517f40e41"
"""
import urllib2
field, sep, value = input.partition("Authorization: Digest ")
if value:
    items = urllib2.parse_http_list(value)
    opts = urllib2.parse_keqv_list(items)
    opts['protocol'] = 'Digest'
    print opts

Output:

{'username': 'Foobear', 'protocol': 'Digest', 'qop': 'chap', 'cnonce': '5ccc069c403ebaf9f0171e9517f40e41', 'realm': 'testrealm@host.com', 'response': '6629fae49393a05397450978507c4ef1'}
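The snippet above targets Python 2. As a rough sketch for Python 3, the same two helpers live in urllib.request (they are not prominently documented, so treat their availability as an assumption to verify for your Python version). The function name parse_digest_header and the single-line sample header are made up for illustration.

```python
from urllib.request import parse_http_list, parse_keqv_list


def parse_digest_header(header):
    """Return the Digest parameters as a dict, or None if this is not a Digest header.

    parse_digest_header is a hypothetical helper name, not part of any library.
    """
    field, sep, value = header.strip().partition("Authorization: Digest ")
    if not sep:
        return None
    # Split on commas, leaving commas inside quoted strings alone
    items = parse_http_list(value)
    # Turn ['key="value"', ...] into {'key': 'value', ...}
    opts = parse_keqv_list(items)
    opts['protocol'] = 'Digest'
    return opts


# Illustrative header; note the comma inside the quoted username
header = 'Authorization: Digest qop="chap", realm="testrealm@host.com", username="Foob,ear"'
opts = parse_digest_header(header)
print(opts)
```

Because parse_http_list understands quoted strings, the comma inside the username does not split the value.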

Answer 2 (score: 3)

Here is my pyparsing attempt:

text = """Authorization: Digest qop="chap",
    realm="testrealm@host.com",     
    username="Foobear",     
    response="6629fae49393a05397450978507c4ef1",     
    cnonce="5ccc069c403ebaf9f0171e9517f40e41" """

from pyparsing import *

AUTH = Keyword("Authorization")
ident = Word(alphas,alphanums)
EQ = Suppress("=")
quotedString.setParseAction(removeQuotes)

valueDict = Dict(delimitedList(Group(ident + EQ + quotedString)))
authentry = AUTH + ":" + ident("protocol") + valueDict

print(authentry.parseString(text).dump())

Prints:

['Authorization', ':', 'Digest', ['qop', 'chap'], ['realm', 'testrealm@host.com'],
 ['username', 'Foobear'], ['response', '6629fae49393a05397450978507c4ef1'], 
 ['cnonce', '5ccc069c403ebaf9f0171e9517f40e41']]
- cnonce: 5ccc069c403ebaf9f0171e9517f40e41
- protocol: Digest
- qop: chap
- realm: testrealm@host.com
- response: 6629fae49393a05397450978507c4ef1
- username: Foobear

I'm not familiar with the RFC, but I hope this gets you rolling.

Answer 3 (score: 1)

If those components will always be there, then a regex will do the trick:

test = '''Authorization: Digest qop="chap", realm="testrealm@host.com", username="Foobear", response="6629fae49393a05397450978507c4ef1", cnonce="5ccc069c403ebaf9f0171e9517f40e41"'''

import re

re_auth = re.compile(r"""
    Authorization:\s*(?P<protocol>[^ ]+)\s+
    qop="(?P<qop>[^"]+)",\s+
    realm="(?P<realm>[^"]+)",\s+
    username="(?P<username>[^"]+)",\s+
    response="(?P<response>[^"]+)",\s+
    cnonce="(?P<cnonce>[^"]+)"
    """, re.VERBOSE)

m = re_auth.match(test)
print(m.groupdict())

Produces:

{ 'username': 'Foobear', 
  'protocol': 'Digest', 
  'qop': 'chap', 
  'cnonce': '5ccc069c403ebaf9f0171e9517f40e41', 
  'realm': 'testrealm@host.com', 
  'response': '6629fae49393a05397450978507c4ef1'
}

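Answer 4 (score: 1)

I would recommend finding a proper library for parsing HTTP headers, but unfortunately I can't recall any. :(

In the meantime, check the snippet below (it should mostly work):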

header = """
 Authorization: Digest qop="chap",
     realm="testrealm@host.com",
     username="Foob,ear",
     response="6629fae49393a05397450978507c4ef1",
     cnonce="5ccc069c403ebaf9f0171e9517f40e41"
"""

field, sep, value = header.partition(":")
if field.endswith('Authorization'):
    protocol, sep, opts_str = value.strip().partition(" ")

    opts = {}
    # Splitting on ",\n" keeps commas inside quoted values intact,
    # but only works when every parameter sits on its own line
    for opt in opts_str.split(",\n"):
        # Split on the first "=" only, so a value may itself contain "="
        key, value = opt.strip().split('=', 1)
        key = key.strip(" ")
        value = value.strip(' "')
        opts[key] = value

    opts['protocol'] = protocol

    print(opts)

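Answer 5 (score: 1)

Your original idea of using PyParsing would be the best approach. What you have implicitly asked for is something that requires a grammar. A regular expression or a simple parsing routine is always going to be brittle, and that sounds like what you are trying to avoid.

It appears that getting pyparsing onto Google App Engine is easy: How do I get PyParsing set up on the Google App Engine?

So I would go with that, and then implement full HTTP authentication/authorization header support from rfc2617.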

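Answer 6 (score: 1)

The HTTP digest Authorization header field is a bit of an odd beast. Its format is similar to that of RFC 2616's Cache-Control and Content-Type header fields, but just different enough to be incompatible. If you are still looking for a library that is a little smarter and more readable than the regex, you might try removing the "Authorization: Digest" part with str.split() and parsing the rest with parse_dict_header() from Werkzeug's http module. (Werkzeug can be installed on App Engine.)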

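Answer 7 (score: 1)

Nadia's regex only matches alphanumeric characters in a parameter's value. That means it fails to parse at least two fields, namely uri and qop. According to RFC 2617, the uri field is a duplicate of the string in the request line (i.e. the first line of the HTTP request). And qop fails to parse correctly if its value is "auth-int", because of the non-alphanumeric '-'.

This modified regex allows the URI (or any other value) to contain anything but ' ' (space), '"' (quote), or ',' (comma). That is probably more permissive than it needs to be, but it shouldn't cause any problems with correctly formed HTTP requests.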

reg = re.compile(r'(\w+)[:=] ?"?([^" ,]+)"?')
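To make the difference concrete, here is a small comparison of the two patterns on a made-up header (the sample string and the variable names are illustrative only):

```python
import re

# Original pattern: values are limited to \w characters
strict = re.compile(r'(\w+)[:=] ?"?(\w+)"?')
# Modified pattern: values may contain anything except quote, space, or comma
relaxed = re.compile(r'(\w+)[:=] ?"?([^" ,]+)"?')

# Illustrative sample, not taken from a real request
sample = 'Digest username="Foobear", qop=auth-int, uri="/dir/index.html"'

strict_result = dict(strict.findall(sample))
relaxed_result = dict(relaxed.findall(sample))

print(strict_result)   # qop is cut off at the '-', and uri is missing entirely
print(relaxed_result)  # qop and uri both come through whole
```

One limitation follows directly from the character class: a quoted value that contains a space is cut off at the first space.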

Bonus tip: from there, it is fairly straightforward to convert the example code in RFC-2617 to Python. Using Python's md5 API, "MD5Init()" becomes "m = md5.new()", "MD5Update()" becomes "m.update()" and "MD5Final()" becomes "m.digest()".
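In current Python the md5 module is gone and hashlib.md5() takes its place. The following is only a sketch of the RFC 2617 response calculation for the qop="auth" case with the default MD5 algorithm; the helper names md5_hex and digest_response are made up, and the input values are the ones from the RFC's own example (user "Mufasa", password "Circle Of Life").

```python
import hashlib


def md5_hex(text):
    """Hex MD5 digest of a string (hypothetical helper)."""
    return hashlib.md5(text.encode('utf-8')).hexdigest()


def digest_response(username, realm, password, method, uri,
                    nonce, nc, cnonce, qop):
    """Compute the Digest 'response' value for qop="auth" (sketch only)."""
    ha1 = md5_hex(':'.join([username, realm, password]))  # H(A1)
    ha2 = md5_hex(':'.join([method, uri]))                # H(A2)
    return md5_hex(':'.join([ha1, nonce, nc, cnonce, qop, ha2]))


# Values from the example in RFC 2617
response = digest_response(
    username='Mufasa',
    realm='testrealm@host.com',
    password='Circle Of Life',
    method='GET',
    uri='/dir/index.html',
    nonce='dcd98b7102dd2f0e8b11d0f600bfb0c093',
    nc='00000001',
    cnonce='0a4f113b',
    qop='auth',
)
print(response)
```

A server would compare this value with the response parameter parsed from the client's header. The "auth-int" variant also hashes the request body into A2 and is not covered here.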

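Answer 8 (score: 1)

An old question, but one I found very helpful.

I needed a parser to handle any properly formed Authorization header, as defined by RFC7235 (raise your hand if you enjoy reading ABNF).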

Authorization = credentials

BWS = <BWS, see [RFC7230], Section 3.2.3>

OWS = <OWS, see [RFC7230], Section 3.2.3>

Proxy-Authenticate = *( "," OWS ) challenge *( OWS "," [ OWS
 challenge ] )
Proxy-Authorization = credentials

WWW-Authenticate = *( "," OWS ) challenge *( OWS "," [ OWS challenge
 ] )

auth-param = token BWS "=" BWS ( token / quoted-string )
auth-scheme = token

challenge = auth-scheme [ 1*SP ( token68 / [ ( "," / auth-param ) *(
 OWS "," [ OWS auth-param ] ) ] ) ]
credentials = auth-scheme [ 1*SP ( token68 / [ ( "," / auth-param )
 *( OWS "," [ OWS auth-param ] ) ] ) ]

quoted-string = <quoted-string, see [RFC7230], Section 3.2.6>

token = <token, see [RFC7230], Section 3.2.6>
token68 = 1*( ALPHA / DIGIT / "-" / "." / "_" / "~" / "+" / "/" )
 *"="

Starting with PaulMcG's answer, I came up with this:

import pyparsing as pp

tchar = '!#$%&\'*+-.^_`|~' + pp.nums + pp.alphas
t68char = '-._~+/' + pp.nums + pp.alphas

token = pp.Word(tchar)
token68 = pp.Combine(pp.Word(t68char) + pp.ZeroOrMore('='))

scheme = token('scheme')

header = pp.Keyword('Authorization')
name = pp.Word(pp.alphas, pp.alphanums)
value = pp.quotedString.setParseAction(pp.removeQuotes)
name_value_pair = name + pp.Suppress('=') + value
params = pp.Dict(pp.delimitedList(pp.Group(name_value_pair)))

credentials = scheme + (token68('token') ^ params('params'))

auth_parser = header + pp.Suppress(':') + credentials

This allows parsing any Authorization header:

parsed = auth_parser.parseString('Authorization: Basic Zm9vOmJhcg==')
print('Authenticating with {0} scheme, token: {1}'.format(parsed['scheme'], parsed['token']))

Which outputs:

Authenticating with Basic scheme, token: Zm9vOmJhcg==

Bringing everything together into an Authenticator class:

import pyparsing as pp
from base64 import b64decode
import re

class Authenticator:
    def __init__(self):
        """
        Use pyparsing to create a parser for Authorization headers
        """
        tchar = "!#$%&'*+-.^_`|~" + pp.nums + pp.alphas
        t68char = '-._~+/' + pp.nums + pp.alphas

        token = pp.Word(tchar)
        token68 = pp.Combine(pp.Word(t68char) + pp.ZeroOrMore('='))

        scheme = token('scheme')

        auth_header = pp.Keyword('Authorization')
        name = pp.Word(pp.alphas, pp.alphanums)
        value = pp.quotedString.setParseAction(pp.removeQuotes)
        name_value_pair = name + pp.Suppress('=') + value
        params = pp.Dict(pp.delimitedList(pp.Group(name_value_pair)))

        credentials = scheme + (token68('token') ^ params('params'))

        # the moment of truth...
        self.auth_parser = auth_header + pp.Suppress(':') + credentials


    def authenticate(self, auth_header):
        """
        Parse auth_header and call the correct authentication handler
        """
        authenticated = False
        try:
            parsed = self.auth_parser.parseString(auth_header)
            scheme = parsed['scheme']
            details = parsed['token'] if 'token' in parsed.keys() else parsed['params']

            print('Authenticating using {0} scheme'.format(scheme))
            try:
                # Map every non-alphanumeric token character to "_" so the
                # scheme can be used in a method name; "-" is placed last in
                # the class so it is a literal rather than a range
                safe_scheme = re.sub(r"[!#$%&'*+.^_`|~-]", '_', scheme.lower())
                handler = getattr(self, 'auth_handle_' + safe_scheme)
                authenticated = handler(details)
            except AttributeError:
                print('This is a valid Authorization header, but we do not handle this scheme yet.')

        except pp.ParseException as ex:
            print('Not a valid Authorization header')
            print(ex)

        return authenticated


    # The following methods are fake, of course.  They should use what's passed
    # to them to actually authenticate, and return True/False if successful.
    # For this demo I'll just print some of the values used to authenticate.
    @staticmethod
    def auth_handle_basic(token):
        print('- token is {0}'.format(token))
        try:
            username, password = b64decode(token).decode().split(':', 1)
        except Exception:
            # The original raised an undefined DecodeError here
            raise ValueError('Malformed Basic credentials')
        print('- username is {0}'.format(username))
        print('- password is {0}'.format(password))
        return True

    @staticmethod
    def auth_handle_bearer(token):
        print('- token is {0}'.format(token))
        return True

    @staticmethod
    def auth_handle_digest(params):
        print('- username is {0}'.format(params['username']))
        print('- cnonce is {0}'.format(params['cnonce']))
        return True

    @staticmethod
    def auth_handle_aws4_hmac_sha256(params):
        print('- Signature is {0}'.format(params['Signature']))
        return True

To test this class:

tests = [
    'Authorization: Digest qop="chap", realm="testrealm@example.com", username="Foobar", response="6629fae49393a05397450978507c4ef1", cnonce="5ccc069c403ebaf9f0171e9517f40e41"',
    'Authorization: Bearer cn389ncoiwuencr',
    'Authorization: Basic Zm9vOmJhcg==',
    'Authorization: AWS4-HMAC-SHA256 Credential="AKIAIOSFODNN7EXAMPLE/20130524/us-east-1/s3/aws4_request", SignedHeaders="host;range;x-amz-date", Signature="fe5f80f77d5fa3beca038a248ff027d0445342fe2855ddc963176630326f1024"',
    'Authorization: CrazyCustom foo="bar", fizz="buzz"',
]

authenticator = Authenticator()

for test in tests:
    authenticator.authenticate(test)
    print()

Which outputs:

Authenticating using Digest scheme
- username is Foobar
- cnonce is 5ccc069c403ebaf9f0171e9517f40e41

Authenticating using Bearer scheme
- token is cn389ncoiwuencr

Authenticating using Basic scheme
- token is Zm9vOmJhcg==
- username is foo
- password is bar

Authenticating using AWS4-HMAC-SHA256 scheme
- Signature is fe5f80f77d5fa3beca038a248ff027d0445342fe2855ddc963176630326f1024

Authenticating using CrazyCustom scheme
This is a valid Authorization header, but we do not handle this scheme yet.

In the future, if we wish to handle CrazyCustom, we will just add an auth_handle_crazycustom(params) method to the class.

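Answer 9 (score: 0)

If your response comes in as a single string whose format never varies, and it has as many lines as there are expressions to match, you can split it on newlines into an array called authentication_array and use regexes: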

import re

pattern_array = ['qop', 'realm', 'username', 'response', 'cnonce']
parsed_dict = {}

# Pair each expected key with the line it should appear on
for name, line in zip(pattern_array, authentication_array):
    pattern = '(' + name + ')="(.*)"'   # capture the value without its quotes
    match = re.search(pattern, line)    # make the match
    if match:
        parsed_dict[match.group(1)] = match.group(2)