Creating a clean URL from text in Excel

Date: 2015-07-09 20:09:30

Tags: regex excel

I want to create a clean URL from text such as:

Alpha test' purchase Berta Global Associates (C)

The URL should look like this:

alpha-test-purchase-berta-global-associates-c

I currently use this formula in Excel:

=LOWER(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A38;"--";"-");" / ";"-");"  ";"-");": ";"-");" - ";"-");"_";"-");"?";"");",";"");".";"");"'";"");")";"");"(";"");":";"");" ";"-");"&";"and");"!";"");"/";"-");"""";""))

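However, it does not seem to catch every special character, so my URLs are not as clean as I would like.

Do you know of an Excel formula or VBA code that makes sure all special characters are converted properly into a clean URL?

Thanks.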

1 answer:

Answer 0 (score: 2)

I suggest putting the following function in a VBA module and calling it like a regular formula:

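Function NormalizeToUrl(cell As Range) As String

Dim strPattern As String
Dim regEx As Object

' Late-bound VBScript regex object, so no library reference is needed
Set regEx = CreateObject("vbscript.regexp")
' One or more characters that are neither word characters nor hyphens
strPattern = "[^\w-]+"

With regEx
    .Global = True   ' replace every match, not only the first
    .Pattern = strPattern
End With

' Spaces become hyphens first, then everything the pattern matches is removed
NormalizeToUrl = LCase(regEx.Replace(Replace(cell.Value, " ", "-"), ""))
End Function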


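The idea is to replace all spaces with hyphens first, then use a regex that matches any character that is neither a word character nor a hyphen, and remove those matches with RegExp.Replace.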
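To make the two steps easier to try out, here is a minimal sketch of the same logic that takes a String instead of a Range, so it can be run from the Immediate window. The names SlugFromText and DemoSlugFromText and the sample inputs are my own and only for illustration; they are not part of the function above.

```vba
' Hypothetical String-based variant of NormalizeToUrl, for illustration only.
Function SlugFromText(ByVal source As String) As String
    Dim regEx As Object
    Set regEx = CreateObject("VBScript.RegExp")

    With regEx
        .Global = True
        .Pattern = "[^\w-]+"   ' runs of characters that are neither word characters nor hyphens
    End With

    ' Step 1: every space becomes a hyphen.
    ' Step 2: whatever the pattern still matches is removed.
    SlugFromText = LCase(regEx.Replace(Replace(source, " ", "-"), ""))
End Function

Sub DemoSlugFromText()
    ' Sample inputs are assumptions, not taken from the question.
    Debug.Print SlugFromText("Hello, World! (Draft 2)")   ' prints hello-world-draft-2
    Debug.Print SlugFromText("A - B")                     ' prints a---b
End Sub
```

The second call shows a limit of this version: a hyphen that already has spaces around it ends up as three hyphens, because existing hyphens are kept and each space adds another one. In a worksheet, the Range-based function is called like any built-in, for example =NormalizeToUrl(A38), where A38 is the cell used in the question's formula.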

Update

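After your comment, it is still unclear what you want to do with Unicode letters: remove them or replace them with hyphens. Here is my attempt to rebuild your formula as a function, although the logic may be flawed. I prefer the generic approach above.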

Function NormalizeToUrl(cell As Range)

Dim strPattern As String
Dim regEx As Object

Set regEx = CreateObject("vbscript.regexp")
strPattern = "[^\w -]"

With regEx
    .Global = True
    .Pattern = "[?,.')(:!""]+" ' THESE ARE REMOVED
End With

NormalizeToUrl = regEx.Replace(cell.Value, "")
NormalizeToUrl = Replace(NormalizeToUrl, "&", "and") ' & TURNS INTO "and"

With regEx
    .Global = True
    .Pattern = strPattern ' WE REPLACE ALL NON-WORD CHARS WITH HYPHEN
End With
NormalizeToUrl = LCase(regEx.Replace(Replace(NormalizeToUrl, " ", "-"), "-"))
With regEx
    .Global = True
    .Pattern = "--+" ' WE SHRINK ALL HYPHEN SEQUENCES TO SINGLE HYPHEN
End With
NormalizeToUrl = regEx.Replace(NormalizeToUrl, "-")
End Function
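The passes in the updated function can also be written through one small helper, which may make the order of operations easier to follow. This is only a sketch under my own assumptions: RegexReplaceAll, SlugFromFormulaRules and the demo inputs are hypothetical names and values, and the function takes a String rather than a Range so it can be run from the Immediate window.

```vba
' Hypothetical helper: replace every match of pat in source with replacement.
Private Function RegexReplaceAll(ByVal source As String, ByVal pat As String, _
                                 ByVal replacement As String) As String
    Dim regEx As Object
    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Global = True
    regEx.Pattern = pat
    RegexReplaceAll = regEx.Replace(source, replacement)
End Function

' Hypothetical String-based variant of the updated NormalizeToUrl.
Function SlugFromFormulaRules(ByVal source As String) As String
    Dim s As String
    s = RegexReplaceAll(source, "[?,.')(:!""]+", "")  ' punctuation that is simply removed
    s = Replace(s, "&", "and")                         ' & turns into "and"
    s = Replace(s, " ", "-")                           ' spaces turn into hyphens
    s = RegexReplaceAll(s, "[^\w -]", "-")             ' any other non-word character turns into a hyphen
    s = RegexReplaceAll(s, "--+", "-")                 ' hyphen runs shrink to a single hyphen
    SlugFromFormulaRules = LCase(s)
End Function

Sub DemoSlugFromFormulaRules()
    ' Sample inputs are assumptions, not taken from the question.
    Debug.Print SlugFromFormulaRules("Tom & Jerry: The Movie!")   ' prints tom-and-jerry-the-movie
    Debug.Print SlugFromFormulaRules("A / B")                     ' prints a-b
End Sub
```

Lowercasing at the end rather than before the last pass does not change the result, since the final pattern only touches hyphens. Like the function above, this sketch does not trim a hyphen left at the start or end of the text.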