Replacing non-standard characters in Ruby

Time: 2016-08-29 16:06:00

Tags: ruby string

I have an array of strings:

strings = ["\u2014 some text", "\u00A0 Foo", "Bar"]

What statement should I write to get an array like the following?

strings = [" some text", " Foo", "Bar"]

I tried the following, but with no luck:

strings.map!{|string| string.gsub!(/(Wu2014|Wu00A0)/, '')}

4 answers:

Answer 0: (score: 1)

I'd suggest removing one of the bangs (!): gsub! returns nil when it makes no substitution, so combining it with map! puts nil into the array for strings such as "Bar". If you want to remove all non-ASCII characters, you can replace every one of them with an empty string using a non-bang gsub inside map!. For me, this results in:

[" some text", " Foo", "Bar"]
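To make the "drop one of the bangs" advice concrete, here is a minimal sketch. It assumes the pattern \P{ASCII} for "any non-ASCII character", which is one reasonable choice and not necessarily the exact code of this answer; the names broken and cleaned are made up for the illustration.

```ruby
strings = ["\u2014 some text", "\u00A0 Foo", "Bar"]

# gsub! returns nil when it makes no substitution, so using its return
# value inside map turns every already-clean string into nil.
# (dup keeps the original strings untouched for the second step.)
broken = strings.map { |s| s.dup.gsub!(/\P{ASCII}/, '') }

# The non-bang gsub always returns a string, so it is safe inside map.
cleaned = strings.map { |s| s.gsub(/\P{ASCII}/, '') }

p broken   # the last element is nil
p cleaned
```

The same reasoning applies to map! with gsub!: only one of the two methods needs to be the mutating variant.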

Answer 1: (score: 1)

You may want to remove all non-ASCII characters from the start of the string:

strings = ["\u2014 some text", "\u00A0 Foo", "Bar"]
strings.map!{|s| s.sub(/\A\P{ASCII}+/,'')} # remove non-ASCII from the start of the string

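See the Ruby demo.

Alternatively, to remove any characters other than word and whitespace characters from the start of the string, you can use

strings.map!{|s| s.sub(/\A[^\w\s]+/,'')}

See this Ruby demo.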

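Details:

  • \A - start of the string
  • \P{ASCII} - any character that is not an ASCII character
  • [^\w\s] - any character that is neither a word (\w) nor a whitespace (\s) character
  • + - a quantifier that matches one or more occurrences of the quantified pattern.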
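The difference between the two patterns described above can be sketched as follows. The extra sample strings ("\u2014 caf\u00E9" and "-- note") are assumptions added for illustration and are not part of the question.

```ruby
samples = ["\u2014 some text", "\u00A0 Foo", "Bar", "\u2014 caf\u00E9"]

# \A anchors the match at the start of the string, so only a leading
# run of non-ASCII characters is removed; the trailing e-acute survives.
leading_non_ascii = samples.map { |s| s.sub(/\A\P{ASCII}+/, '') }

# [^\w\s] is broader: it also strips leading ASCII punctuation.
leading_non_word = "-- note".sub(/\A[^\w\s]+/, '')

p leading_non_ascii
p leading_non_word
```

Because sub (rather than gsub) is combined with \A, at most one run at the very beginning of each string is touched.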

Answer 2: (score: 1)

If you want to remove non-ASCII characters, then

strings.map{| s | s.encode('ASCII', 'binary', invalid: :replace, undef: :replace, replace: '')}
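A short sketch of what this encode call does, using the array from the question. The name ascii_only is made up for the example; note that map without a bang returns a new array and leaves strings unchanged.

```ruby
strings = ["\u2014 some text", "\u00A0 Foo", "Bar"]

# Treat each string as raw bytes ('binary'), then convert to ASCII.
# Bytes >= 0x80 have no ASCII equivalent, so they are replaced with ''.
ascii_only = strings.map do |s|
  s.encode('ASCII', 'binary', invalid: :replace, undef: :replace, replace: '')
end

p ascii_only
p ascii_only.map(&:encoding).uniq  # encoding tag of the converted strings
p strings                          # the original array is not modified
```

To update the array in place, assign the result back to strings or use map! instead.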

Answer 3: (score: -1)

You have too many bangs and an incorrect regular expression.

Try:

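# Option 1: mutate each string in place (ignore map's return value, since gsub! returns nil when nothing changes)
strings.map{|string| string.gsub!(/(\u2014|\u00A0)/, '')}

# Option 2: replace each element of the array with a cleaned copy
strings.map!{|string| string.gsub(/(\u2014|\u00A0)/, '')}

# Option 3: build a new array, then reassign it
new_strings = strings.map{|string| string.gsub(/(\u2014|\u00A0)/, '')}
strings = new_strings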
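As a possible variation on the snippets above (not part of the original answer), the alternation can be written as a character class, and String#delete can do the same job without a regex. The names with_class and with_delete are illustrative.

```ruby
strings = ["\u2014 some text", "\u00A0 Foo", "Bar"]

# A character class matches the same two characters as (\u2014|\u00A0).
with_class = strings.map { |s| s.gsub(/[\u2014\u00A0]/, '') }

# String#delete removes every listed character and needs no regex.
with_delete = strings.map { |s| s.delete("\u2014\u00A0") }

p with_class
p with_delete
```

Both forms remove only the em dash and the no-break space, so other non-ASCII characters in the strings would be kept.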