Parsing a txt file into CSV

Time: 2018-01-24 15:52:59

Tags: python csv parsing

I have a large text file that keeps repeating blocks like this:

[Event "Rated Crazyhouse game"]
[Site "https://lichess.org/NsNeyqv1"]
[Date "2018.01.23"]
[Round "-"]
[White "nikoskaterini"]
[Black "Ominous"]
[Result "0-1"]
[UTCDate "2018.01.23"]
[UTCTime "18:22:39"]
[WhiteElo "1611"]
[BlackElo "2118"]
[WhiteRatingDiff "-10"]
[BlackRatingDiff "+2"]
[Variant "Crazyhouse"]
[TimeControl "30+0"]
[ECO "?"]
[Opening "?"]
[Termination "Normal"]

1. d4 { [%clk 0:00:30] } d5 { [%clk 0:00:30] } 2. Bf4 { [%clk 0:00:27] } Nc6 { [%clk 0:00:30] } 3. e3 { [%clk 0:00:26] } Nf6 { [%clk 0:00:30] } 4. Bd3 { [%clk 0:00:25] } Bg4 { [%clk 0:00:30] } 5. c3 { [%clk 0:00:25] } Bxd1 { [%clk 0:00:29] } 6. Kxd1 { [%clk 0:00:24] } e6 { [%clk 0:00:28] } 7. Nd2 { [%clk 0:00:23] } Bd6 { [%clk 0:00:28] } 8. Bxd6 { [%clk 0:00:21] } cxd6 { [%clk 0:00:28] } 9. Ngf3 { [%clk 0:00:19] } O-O { [%clk 0:00:27] } 10. Ng5 { [%clk 0:00:17] } B@e4 { [%clk 0:00:27] } 11. Bxe4 { [%clk 0:00:14] } dxe4 { [%clk 0:00:27] } 12. B@e2 { [%clk 0:00:11] } B@f5 { [%clk 0:00:27] } 13. Nh3 { [%clk 0:00:08] } Bxh3 { [%clk 0:00:24] } 14. gxh3 { [%clk 0:00:08] } N@d3 { [%clk 0:00:23] } 15. Rg1 { [%clk 0:00:08] } Nxf2+ { [%clk 0:00:22] } 16. Ke1 { [%clk 0:00:06] } Nd3+ { [%clk 0:00:22] } 17. Bxd3 { [%clk 0:00:05] } exd3 { [%clk 0:00:20] } 18. B@h6 { [%clk 0:00:04] } Q@e2# { [%clk 0:00:18] } 0-1


[Event "Rated Crazyhouse game"]
[Site "https://lichess.org/0r8jJe5d"]
[Date "2018.01.23"]
[Round "-"]
[White "RefuteMeThisWaste"]
[Black "Ominous"]
[Result "0-1"]
[UTCDate "2018.01.23"]
[UTCTime "15:51:19"]
[WhiteElo "1718"]
[BlackElo "2115"]
[WhiteRatingDiff "-23"]
[BlackRatingDiff "+3"]
[Variant "Crazyhouse"]
[TimeControl "300+0"]
[ECO "?"]
[Opening "?"]
[Termination "Time forfeit"]

1. e4 { [%clk 0:05:00] } e5 { [%clk 0:05:00] } 2. Bc4 { [%clk 0:04:58] } Bc5 { [%clk 0:04:58] } 3. Nf3 { [%clk 0:04:50] } Nc6 { [%clk 0:04:56] } 4. Bxf7+ { [%clk 0:04:42] } Kxf7 { [%clk 0:04:55] } 5. d4 { [%clk 0:04:37] } Bxd4 { [%clk 0:04:53] } 6. Ng5+ { [%clk 0:04:27] } Kf8 { [%clk 0:04:46] } 7. Qf3+ { [%clk 0:03:38] } Nf6 { [%clk 0:04:42] } 8. P@d5 { [%clk 0:03:11] } B@g4 { [%clk 0:04:37] } 9. Qa3+ { [%clk 0:02:45] } P@c5 { [%clk 0:04:21] } 10. dxc6 { [%clk 0:02:40] } Bxf2+ { [%clk 0:04:19] } 11. Kxf2 { [%clk 0:02:30] } dxc6 { [%clk 0:04:18] } 12. h3 { [%clk 0:01:41] } Bge6 { [%clk 0:04:06] } 13. B@g4 { [%clk 0:01:00] } Bxg4 { [%clk 0:03:52] } 14. hxg4 { [%clk 0:00:57] } B@d4+ { [%clk 0:03:45] } 15. B@e3 { [%clk 0:00:53] } Nxg4+ { [%clk 0:03:38] } 16. Kf1 { [%clk 0:00:38] } Nxe3+ { [%clk 0:03:34] } 17. Bxe3 { [%clk 0:00:35] } Qf6+ { [%clk 0:03:19] } 18. N@f5 { [%clk 0:00:29] } Bxe3 { [%clk 0:03:14] } 19. Qxe3 { [%clk 0:00:24] } B@d4 { [%clk 0:03:11] } 20. N@e6+ { [%clk 0:00:16] } Bxe6 { [%clk 0:03:09] } 21. Nxe6+ { [%clk 0:00:12] } Qxe6 { [%clk 0:03:08] } 22. B@e7+ { [%clk 0:00:08] } Qxe7 { [%clk 0:03:01] } 23. Nxe7 { [%clk 0:00:07] } P@e2+ { [%clk 0:03:00] } 24. Qxe2 { [%clk 0:00:03] } N@g3+ { [%clk 0:02:59] } 25. Ke1 { [%clk 0:00:01] } B@f2+ { [%clk 0:02:55] } 0-1

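I want to convert it into a CSV file in which "Event", "Site", "Date", and so on are the column headers. Can anyone point me in the right direction for getting started on this? Thanks for your help.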

1 answer:

Answer 0 (score: 0)

I would use pandas for this task:

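import pandas as pd

games = []
dic = {}
# Lines like [Event "..."] are headers; the move list (starting with "1.") ends a game.
with open("file.txt", "r") as file:
    for line in file:
        line = line.strip()
        if line.startswith("["):
            # Split '[Event "Rated Crazyhouse game"]' into the tag name and its quoted value.
            header, value = line[1:-1].split(" ", 1)
            dic[header] = value.strip('"')
        elif line.startswith("1"):
            # The move text marks the end of the current game's headers.
            games.append(dic)
            dic = {}

df = pd.DataFrame(games)

df.to_csv('games.csv', index=False)

# Read the file back to check the result.
df1 = pd.read_csv('games.csv')
print(df1)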

This results in:

     Black  BlackElo  BlackRatingDiff        Date ECO                  Event  \
0  Ominous      2118                2  2018.01.23   ?  Rated Crazyhouse game   
1  Ominous      2115                3  2018.01.23   ?  Rated Crazyhouse game   

  Opening Result Round                          Site   Termination  \
0       ?    0-1     -  https://lichess.org/NsNeyqv1        Normal   
1       ?    0-1     -  https://lichess.org/0r8jJe5d  Time forfeit   

  TimeControl     UTCDate   UTCTime     Variant              White  WhiteElo  \
0        30+0  2018.01.23  18:22:39  Crazyhouse      nikoskaterini      1611   
1       300+0  2018.01.23  15:51:19  Crazyhouse  RefuteMeThisWaste      1718   

   WhiteRatingDiff  
0              -10  
1              -23
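
Two details of the approach above are worth noting. The printed BlackRatingDiff column shows 2 and 3 rather than "+2" and "+3", because pd.read_csv infers a numeric type for it. Also, a game is only stored once a line starting with "1" is seen, so a game without a move list would be merged into the next one. As a rough alternative sketch using only the standard library, the tag lines can be matched with a regular expression and written with csv.DictWriter, which keeps every value as text. The names TAG_LINE, parse_games, and write_csv are hypothetical helpers made up for this illustration, and the sample text is an abbreviated stand-in for the real file:

```python
import csv
import io
import re

# Matches a tag line such as: [Event "Rated Crazyhouse game"]
TAG_LINE = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')


def parse_games(lines):
    """Collect the tag pairs of each game into one dict per game."""
    games = []
    current = {}
    in_tags = False
    for line in lines:
        match = TAG_LINE.match(line.strip())
        if match:
            # A tag line that follows a non-tag line starts a new game.
            if not in_tags and current:
                games.append(current)
                current = {}
            current[match.group(1)] = match.group(2)
            in_tags = True
        else:
            # Blank lines and move text both end the current tag section.
            in_tags = False
    if current:
        games.append(current)
    return games


def write_csv(games, out):
    """Write the games as CSV, using every tag name seen as a column."""
    fieldnames = []
    for game in games:
        for key in game:
            if key not in fieldnames:
                fieldnames.append(key)
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(games)


# Abbreviated sample in the same layout as the file in the question.
sample = '''[Event "Rated Crazyhouse game"]
[Site "https://lichess.org/NsNeyqv1"]
[BlackRatingDiff "+2"]

1. d4 { [%clk 0:00:30] } d5 { [%clk 0:00:30] } 0-1

[Event "Rated Crazyhouse game"]
[Site "https://lichess.org/0r8jJe5d"]
[BlackRatingDiff "+3"]

1. e4 { [%clk 0:05:00] } e5 { [%clk 0:05:00] } 0-1
'''

games = parse_games(sample.splitlines())
buffer = io.StringIO()
write_csv(games, buffer)
print(buffer.getvalue())
```

For the real file, the same functions could be called with an open file object in place of sample.splitlines(), and with a file opened via open("games.csv", "w", newline="") in place of the StringIO buffer. If pandas is preferred for reading the CSV back, passing dtype=str to pd.read_csv should keep values such as "+2" as text.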