Replace single quotes with double quotes inside brackets

Time: 2016-10-30 21:13:24

Tags: json regex sed

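I have to modify a JSON file by replacing single quotes with double quotes. However, I can't use the command sed -i -r "s/'/\"/g" file, because the file contains other single quotes that must not be changed.

The following is an example of a string: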

"categories": [['Clothing, Shoes & Jewelry', 'Girls'], ['Clothing, Shoes & Jewelry', 'Novelty, Costumes & More', 'Costumes & Accessories', 'More Accessories', 'Kids & Baby']]

The result should be:

"categories": [["Clothing, Shoes & Jewelry", "Girls"], ["Clothing, Shoes & Jewelry", "Novelty, Costumes & More", "Costumes & Accessories", "More Accessories", "Kids & Baby"]]

Sample file:

{"categories": [['Movies & TV', 'Movies']], "title": "Understanding Seizures and Epilepsy DVD"},
{"title": "Who on Earth is Tom Baker?", "salesRank": {"Books": 3843450}, "categories": [['Books']]},
{"categories": [['Clothing, Shoes & Jewelry', 'Girls'], ['Clothing, Shoes & Jewelry', 'Novelty, Costumes & More', 'Costumes & Accessories', 'More Accessories', 'Kids & Baby']], "description": "description, "title": "Mog's Kittens", "salesRank": {"Books": 1760368}}},
{"description": "Three Dr. Suess' Puzzles", "brand": "Dr. Seuss", "categories": [['Toys & Games', 'Puzzles', 'Jigsaw Puzzles']]},

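I tried regular expressions, but the problem is that I don't know how many elements are inside the brackets. So I am looking for a way to replace every single quote that appears inside brackets. That would be ideal, but I can't find a solution.

2 answers:

Answer 0: (score: 1)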

#!/usr/bin/perl -w
use strict;

# read each line from stdin
while (my $l=<>) {    
   chomp($l); # remove newline char

   # split: get contents of innermost square brackets
   my @a=split(/(\[[^][]*\])/,$l);

   foreach my $i (@a) {
      # replace quotes iff innermost square brackets
      if ($i=~/^\[/) { $i=~s/'/"/g; }
   }

   # join and print
   print join('',@a)."\n";
}
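
For comparison, the same idea can be sketched in Python: match each innermost bracket group (a `[...]` with no further brackets inside) and replace quotes only within those matches. The function name `fix_bracket_quotes` is made up for this illustration. The sketch shares the limits of the Perl version: an apostrophe inside a bracketed item would also be converted, and brackets that appear inside ordinary strings would be treated as groups.

```python
import re

# Matches an innermost bracket group: "[" ... "]" with no brackets in between.
INNERMOST = re.compile(r"\[[^\[\]]*\]")

def fix_bracket_quotes(line):
    """Replace single quotes with double quotes, but only inside innermost [...] groups."""
    return INNERMOST.sub(lambda m: m.group(0).replace("'", '"'), line)

if __name__ == "__main__":
    # Shortened version of a line from the question's sample file.
    sample = """{"description": "Three Dr. Suess' Puzzles", "categories": [['Toys & Games', 'Puzzles']]},"""
    print(fix_bracket_quotes(sample))
```

The apostrophe in the description sits outside any bracket group, so it is left untouched, while the quotes around the category names are converted.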

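Answer 1: (score: 0)

I found a way to do it using Python.

Note that Python's json module cannot parse the JSON stream you provided, because of the single quotes (and because of some copy/paste issues, such as a missing quote, which I fixed).

My solution uses only Python's standard library. I doubt the same can be done with sed, which is why I am offering it even though you did not mention this technology.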

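  • I use ast.literal_eval to read the data, since it is a list of dictionaries in exact Python syntax. Single quotes are not a problem for ast.
  • I use json.dump to write the data. It writes the data using double quotes.
  • Note that I write it to a "fake" file (i.e. a string with I/O write methods, to "trick" the json serializer).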

Here is a self-contained snippet:

import ast,io,json

foo = """[{"categories": [['Movies & TV', 'Movies']], "title": "Understanding Seizures and Epilepsy DVD"},
{"title": "Who on Earth is Tom Baker?", "salesRank": {"Books": 3843450}, "categories": [['Books']]},
{"categories": [['Clothing, Shoes & Jewelry', 'Girls'], ['Clothing, Shoes & Jewelry', 'Novelty, Costumes & More', 'Costumes & Accessories', 'More Accessories', 'Kids & Baby']], "description": "description", "title": "Mog's Kittens", "salesRank": {"Books": 1760368}},
{"description": "Three Dr. Suess' Puzzles",
"brand": "Dr. Seuss", "categories": [['Toys & Games', 'Puzzles', 'Jigsaw Puzzles']]}
]"""

fp = io.StringIO()

json_data=ast.literal_eval(foo)
json.dump(json_data,fp)
print(fp.getvalue())

Result:

[{"categories": [["Movies & TV", "Movies"]], "title": "Understanding Seizures and Epilepsy DVD"}, {"salesRank": {"Books": 3843450}, "categories": [["Books"]], "title": "Who on Earth is Tom Baker?"}, {"description": "description", "salesRank": {"Books": 1760368}, "categories": [["Clothing, Shoes & Jewelry", "Girls"], ["Clothing, Shoes & Jewelry", "Novelty, Costumes & More", "Costumes & Accessories", "More Accessories", "Kids & Baby"]], "title": "Mog's Kittens"}, {"brand": "Dr. Seuss", "description": "Three Dr. Suess' Puzzles", "categories": [["Toys & Games", "Puzzles", "Jigsaw Puzzles"]]}]
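
As a small illustration of the points above, the sketch below shows `json.loads` rejecting single-quoted text while `ast.literal_eval` accepts it. It also uses `json.dumps`, which returns a string directly, as a possible alternative to the `io.StringIO` buffer. The `raw` sample is a shortened version of the question's data, and `to_json_text` is a name made up for this example.

```python
import ast
import json

def to_json_text(python_literal_text):
    """Parse text written in Python literal syntax and re-serialize it as JSON."""
    data = ast.literal_eval(python_literal_text)
    return json.dumps(data)

# Double-quoted keys, but single-quoted strings inside the nested list.
raw = """[{"title": "Mog's Kittens", "categories": [['Books']]}]"""

# The json module refuses the single-quoted strings.
try:
    json.loads(raw)
    parsed_as_json = True
except json.JSONDecodeError:
    parsed_as_json = False

if __name__ == "__main__":
    print(parsed_as_json)
    print(to_json_text(raw))
```

The apostrophe in "Mog's Kittens" survives because it is part of a string value rather than a delimiter.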

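Here is a complete script that takes 2 arguments (input file and output file) and performs the conversion. If you are not comfortable with Python, you can call this script from an existing bash script (for example, save it as fix_quotes.py):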

import ast,json,sys

input_file = sys.argv[1]
output_file = sys.argv[2]

with open(input_file,"r") as fr:
    json_data=ast.literal_eval(fr.read())
with open(output_file,"w") as fw:
    json.dump(json_data,fw)
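
One caveat: `ast.literal_eval` expects the whole input to be a single Python literal, whereas the sample file in the question lists one record per line with a trailing comma and no enclosing brackets. A possible line-by-line variant is sketched below. The function name `convert_lines` and the assumption of exactly one record per line are mine, not the original author's. Because the input is parsed as Python syntax, JSON literals such as `true`, `false`, or `null` would be rejected.

```python
import ast
import json

def convert_lines(lines):
    """Yield one JSON string per non-empty input line.

    Assumes each line holds one dict in Python literal syntax,
    optionally followed by a trailing comma.
    """
    for line in lines:
        text = line.strip().rstrip(",")
        if not text:
            continue  # skip blank lines
        yield json.dumps(ast.literal_eval(text))

if __name__ == "__main__":
    # Two records copied from the question's sample file.
    sample = [
        """{"categories": [['Movies & TV', 'Movies']], "title": "Understanding Seizures and Epilepsy DVD"},""",
        """{"title": "Who on Earth is Tom Baker?", "salesRank": {"Books": 3843450}, "categories": [['Books']]},""",
    ]
    for converted in convert_lines(sample):
        print(converted)
```

Each yielded string is a standalone JSON document, so the output is one JSON object per line rather than the comma-terminated layout of the input.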