Extracting multiple pattern matches from a line

Time: 2019-05-29 06:04:42

Tags: awk sed grep

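I have a CSV file containing customer data in mixed formats. I want to extract several fields from each line, but only the ones that match certain patterns.

For example: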

email":"stanleyers32@gmail.com",,,"state":"NY","city":"NORTH CHARLESTON",, 16","last_name":"STANLEY","first_name":"GLENN",,"__v":0,,
first_name":"Dawn","last_name":"Alston",,"email":"dawlston@gmail.com",,,,"__v":0}
email":"666horfan@gmail.com",,,"state":"NJ","city":"MONROE CITY","last_name":"White","first_name":"Danny",,"__v":0,

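I want to extract the first name, last name, city, state, and email from these lines and drop everything else, while keeping the delimiters between the fields that remain. As you can see, the column structure is inconsistent, so I want to strip out any data I don't need first.

I tried using sed and grep to match each value by its suffix, but could not get the correct output. The expected output is: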

email":"stanleyers32@gmail.com","state":"NY","city":"NORTH CHARLESTON","last_name":"STANLEY","first_name":"GLENN",    
first_name":"Dawn","last_name":"Alston","email":"dawlston@gmail.com"
email":"666horfan@gmail.com","state":"NJ","city":"MONROE CITY","last_name":"White","first_name":"Danny"
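
One possible approach, sketched here with awk rather than sed or grep: scan each line for key":"value" pairs whose key is one of the five wanted names, and join the pairs found with a single comma. The function name extract_fields and the file name customers.csv are made up for illustration. The sketch also assumes that values never contain a double quote and that no unwanted key ends in one of the wanted names.

```shell
# Hypothetical helper: keep only the wanted key":"value" pairs on each line.
extract_fields() {
  awk '{
    rest = $0
    out = ""
    # Take the leftmost wanted pair, append it, then continue after it.
    # The leading quote is optional because the first key on a line lacks it.
    while (match(rest, /"?(first_name|last_name|city|state|email)":"[^"]*"/)) {
      pair = substr(rest, RSTART, RLENGTH)
      out = (out == "" ? pair : out "," pair)
      rest = substr(rest, RSTART + RLENGTH)
    }
    print out
  }' "$@"
}

# Example call (customers.csv is an assumed file name):
#   extract_fields customers.csv
```

Pairs are printed in the order they appear on the line, so the inconsistent column order of the input is preserved rather than normalized. A line with none of the wanted keys produces an empty output line.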

0 answers:

No answers