Converting tree-structured text file content into a flat relational dataset

Time: 2014-06-19 19:37:08

Tags: java parsing

Given a text file with the following content:

A.  PRODUCT TYPE [A0001]
    MUTUAL FUNDS [A0002]
        OPEN FUNDED
        CLOSED FUNDS [A1313]
            MONEY FUNDS [A1317]
                INDEX TRACKING [A1318]
                EQUITY TRACKING[A1319]
                SECTOR TRACKING [A1320]
            REGION-SPECIFIC FUNDS [A1325]
            COUNTRY-SPECIFIC FUNDS [A1314]
                AUSTRIA [A1315]
                AUSTRALIA [A1323]
                XXXXX
                XXXXX]
            CXXXXXX [A1321]
            XXXXXXXX [A1324]
        XXXXXXXXX [A1306]
            XXXXX [A1308]
            XXXXX [A1307]
        XXXXXXX [A1309]
            XXXXXX [A1311]
            XXXXXX [A1310]
            XXXXXXX [A1312]
        XXXXXXXXXXX [A1299]
            XXXXXXXX [A1300]
            XXXXXXX [A1301]
        XXXXXXXXXX [A1329]
        XXXXXXXXXX [A1326]
            XXXXXXX [A1327]
            XXXXXXXXXX [A1328]
        XXXXXXXXXXXXX [A1302]
            XXXXXXXXXXX [A1303]
            XXXXXXXXXX [A1304]
    XXXXXXX [A0323]
        XXXXXXXXXX [A0351]
            XXXXXXX [A0362]
            XXXXXXX [A0363]
            XXXXXXXX [A0364]
            XXXXXXX [A0365]

What is the recommended way to convert each line of the text into the following:

PRODUCT TYPE [A0001] > MUTUAL FUNDS [A0002] > CLOSED FUNDS [A1313] > MONEY FUNDS [A1317] >  INDEX TRACKING [A1318]
PRODUCT TYPE [A0001] > MUTUAL FUNDS [A0002] > CLOSED FUNDS [A1313] > MONEY FUNDS [A1317] >  EQUITY TRACKING[A1319]
PRODUCT TYPE [A0001] > MUTUAL FUNDS [A0002] > CLOSED FUNDS [A1313] > MONEY FUNDS [A1317] >  SECTOR TRACKING [A1320]

1 answer:

Answer 0: (score: 1)

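  • Create POJOs to represent each entity in the file. So create a class ProductType, a class Fund, and perhaps a class Tracking (I don't know your domain, so I don't know these entities or what they mean).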

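  • Then build a parser that turns the text file into those POJOs. This is easy to verify with unit tests (some string input in, some object structure out). In your case you can parse the file line by line and work out each entry's level in the hierarchy from the tabs/whitespace at the start of the line.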

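  • Once you have the object structure, you can traverse it and generate whatever output you like. You could, for example, use the Visitor pattern to encapsulate the generation of the final text.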
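
As a rough illustration of the "level from leading whitespace" idea, here is a minimal sketch. It is not the full POJO-plus-Visitor design described above: it swaps in a simpler technique, a stack of ancestor entries, and emits one " > "-joined path per input line. The names TreeFlattener, Node and flatten are hypothetical. The sketch assumes that indentation is consistent (tabs and spaces are not mixed), and that a leading outline marker such as "A.  " on the root line is decoration to be dropped, since the desired output in the question omits it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper class; none of these names come from the question.
class TreeFlattener {

    /** One entry of the tree: its indentation width and its label. */
    private static final class Node {
        final int indent;
        final String label;

        Node(int indent, String label) {
            this.indent = indent;
            this.label = label;
        }
    }

    /** Returns one "root > ... > entry" path for every non-blank input line. */
    static List<String> flatten(List<String> lines) {
        List<String> paths = new ArrayList<>();
        Deque<Node> ancestors = new ArrayDeque<>(); // innermost entry on top

        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            int indent = leadingWhitespace(line);
            // Assumption: a marker like "A.  " is decoration, not part of the label.
            String label = line.trim().replaceFirst("^[A-Z]\\.\\s+", "");

            // Drop entries that are siblings of, or deeper than, the current line.
            while (!ancestors.isEmpty() && ancestors.peek().indent >= indent) {
                ancestors.pop();
            }
            ancestors.push(new Node(indent, label));

            // Walk from the root (bottom of the stack) to the current entry.
            StringBuilder path = new StringBuilder();
            Iterator<Node> it = ancestors.descendingIterator();
            while (it.hasNext()) {
                if (path.length() > 0) {
                    path.append(" > ");
                }
                path.append(it.next().label);
            }
            paths.add(path.toString());
        }
        return paths;
    }

    /** Counts leading whitespace characters; a tab counts as one character. */
    private static int leadingWhitespace(String line) {
        int i = 0;
        while (i < line.length() && Character.isWhitespace(line.charAt(i))) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "A.  PRODUCT TYPE [A0001]",
                "    MUTUAL FUNDS [A0002]",
                "        CLOSED FUNDS [A1313]",
                "            MONEY FUNDS [A1317]",
                "                INDEX TRACKING [A1318]");
        for (String path : flatten(sample)) {
            System.out.println(path);
        }
    }
}
```

To read a real file, the lines could be loaded with Files.readAllLines and passed to flatten. If only the leaf entries are wanted, one option is to keep a path only when the next line is not indented more deeply. The POJO approach remains the better fit once the entities need behaviour of their own.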