Question

I want to write a regular expression that matches the following strings:

/en-us/newyork/stores.storelocation.json
/es/colecciones/víveres.storelocation.json
/es/colección/víveres%C3%ADa.storelocation.json
/fr/collections/magasins.storelocation.json
/ja/%E5%95%86%E5%93%81%E3%82%AB%E3%83%86%E3%82%B4%E3%83%AA%E3%83%BC/%E3%82%B8%E3%83%A5%E3%82%A8%E3%83%AA%E3%83%BC.storelocation.json

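For English I wrote \/en-us\/[a-zA-Z]+\/[a-zA-Z]+.storelocation.json, but it does not work for other languages such as French, Chinese, or Russian. If I replace [a-zA-Z] with [\w], it matches every character in the hierarchy.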

The static part of the string is ".storelocation.json", and the hierarchy will always stay the same, e.g. "/language/location/stores.storelocation.json".

Can anyone help? I need a single regex that matches all of the strings above.

Answer 1

Instead of [a-zA-Z], use \p{L}, which matches any Unicode letter.

You can use this regex:

^/\p{L}{2}(?:-\p{L}{2})?/(?:\p{L}|%[A-F\d]{2})+/(?:\p{L}|%[A-F\d]{2})+\.storelocation\.json$

In Java, double each backslash inside the string literal (the braces must not be escaped):

final String regex =
"^/\\p{L}{2}(?:-\\p{L}{2})?/(?:\\p{L}|%[A-F\\d]{2})+/(?:\\p{L}|%[A-F\\d]{2})+\\.storelocation\\.json$";

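To check the pattern from this answer against the sample paths in the question, something like the following minimal sketch could be used. The class name StoreLocationMatcher, the constant STORE_PATH, and the helper matches are hypothetical names chosen for illustration; the non-ASCII letters are written as \u escapes only so that the source compiles regardless of file encoding.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
public class StoreLocationMatcher {

    // The regex from this answer; only the backslashes are doubled for Java.
    private static final Pattern STORE_PATH = Pattern.compile(
        "^/\\p{L}{2}(?:-\\p{L}{2})?"      // language tag, e.g. "es" or "en-us"
        + "/(?:\\p{L}|%[A-F\\d]{2})+"     // location: Unicode letters or %XX escapes
        + "/(?:\\p{L}|%[A-F\\d]{2})+"     // store name: same rule
        + "\\.storelocation\\.json$");    // literal suffix

    // Returns true when the whole path has the expected shape.
    static boolean matches(String path) {
        return STORE_PATH.matcher(path).matches();
    }

    public static void main(String[] args) {
        // Sample paths from the question (\u00ed = i-acute, \u00f3 = o-acute).
        List<String> paths = List.of(
            "/en-us/newyork/stores.storelocation.json",
            "/es/colecciones/v\u00edveres.storelocation.json",
            "/es/colecci\u00f3n/v\u00edveres%C3%ADa.storelocation.json",
            "/fr/collections/magasins.storelocation.json",
            "/ja/%E5%95%86%E5%93%81%E3%82%AB%E3%83%86%E3%82%B4%E3%83%AA%E3%83%BC"
                + "/%E3%82%B8%E3%83%A5%E3%82%A8%E3%83%AA%E3%83%BC.storelocation.json");

        for (String path : paths) {
            System.out.println(matches(path) + "  " + path);
        }
    }
}
```

Two caveats with the pattern as written: \p{L} matches letters only, so digits, spaces, or combining marks (an accent stored as a separate code point) in a segment will not match, and %[A-F\d]{2} accepts only uppercase percent-escapes. If lowercase escapes such as %c3 can occur, widening the class to [A-Fa-f\d] would be one way to handle them.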
