Java

Posted: 2018-12-14 06:52:06

Tags: java csv parsing univocity

I am writing a program to parse key-value based logs like the following:

dstcountry="United States" date=2018-12-13 time=23:47:32

I am using the univocity parser to do this. Here is my code:

    CsvParserSettings parserSettings = new CsvParserSettings();
    parserSettings.getFormat().setDelimiter(' ');
    parserSettings.getFormat().setQuote('"');
    parserSettings.getFormat().setQuoteEscape('"');
    parserSettings.getFormat().setCharToEscapeQuoteEscaping('"');
    CsvParser keyValueParser = new CsvParser(parserSettings);
    String line = "dstcountry=\"United States\" date=2018-12-13 time=23:47:32";
    String[] resp = keyValueParser.parseLine(line);

But the parser gives me this output:

dstcountry="United, 
States", 
date=2018-12-13, 
time=23:47:32

The expected output is:

dstcountry="United States", 
date=2018-12-13, 
time=23:47:32

Is there a problem with my code, or is this a parser bug?

Thanks,
Hari

2 answers:

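Answer 0 (score: 0)

I ended up writing my own parser. I am pasting it here for future reference in case anyone needs it. Suggestions and comments are welcome.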

import UIKit

let json = """
[
    [
        {
        "id": 22,
        "request_id": "rqst5c12fc9e856ae1.06631647",
        "business_name": "Code Viable",
        "business_email": "code@viable.com",
        "title": "Apache Load/Ubuntu"
        }
    ],
    [
        {
        "id": 24,
        "request_id": "rqst5c130cae6f7609.41056231",
        "business_name": "Code Viable",
        "business_email": "code@viable.com",
        "title": "Load"
        }
    ]
]
"""

struct ResponseObject: Codable {
    let id: Int?
    let requestId: String?
    let businessName: String?
    let businessEmail: String?
    let title: String?
}

if let data = json.data(using: .utf8) {
    let decoder = JSONDecoder()
    decoder.keyDecodingStrategy = .convertFromSnakeCase

    do {
        let parsedResponse = try decoder.decode([[ResponseObject]].self, from: data)
        print(parsedResponse)
    } catch {
        print(error.localizedDescription)
    }
}
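
For the key=value log format in the question, a hand-written parser can be quite small. The following is a minimal Java sketch under stated assumptions, not necessarily the approach this answer took: the class name KeyValueLogParser is hypothetical, values are either bare tokens or wrapped in double quotes, quotes inside a quoted value are not escaped, and a token without '=' is stored with an empty value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class KeyValueLogParser {

    /** Parses one line of key=value pairs into an insertion-ordered map. */
    public static Map<String, String> parse(String line) {
        Map<String, String> result = new LinkedHashMap<>();
        int i = 0;
        int n = line.length();
        while (i < n) {
            // Skip whitespace between pairs.
            while (i < n && Character.isWhitespace(line.charAt(i))) {
                i++;
            }
            if (i >= n) {
                break;
            }
            // Read the key up to '=' (or up to whitespace for a bare token).
            int keyStart = i;
            while (i < n && line.charAt(i) != '=' && !Character.isWhitespace(line.charAt(i))) {
                i++;
            }
            String key = line.substring(keyStart, i);
            if (i >= n || line.charAt(i) != '=') {
                // Assumption: a token without '=' is kept with an empty value.
                result.put(key, "");
                continue;
            }
            i++; // skip '='
            StringBuilder value = new StringBuilder();
            if (i < n && line.charAt(i) == '"') {
                i++; // skip the opening quote
                while (i < n && line.charAt(i) != '"') {
                    value.append(line.charAt(i));
                    i++;
                }
                if (i < n) {
                    i++; // skip the closing quote, if present
                }
            } else {
                // Unquoted value: read up to the next whitespace.
                while (i < n && !Character.isWhitespace(line.charAt(i))) {
                    value.append(line.charAt(i));
                    i++;
                }
            }
            result.put(key, value.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "dstcountry=\"United States\" date=2018-12-13 time=23:47:32";
        parse(line).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

A LinkedHashMap keeps the pairs in the order they appear in the log line, which makes the output easier to compare with the input. If a key can repeat within a line, the later value overwrites the earlier one in this sketch.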

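Answer 1 (score: 0)

Author of the library here. This is not a parser bug. The problem is that you are not parsing a CSV file.

When the parser sees dstcountry="United followed by a space (which is your delimiter), it treats that as a complete value.

The quote settings apply only to fields whose value starts with the quote character. Because your input is not "dstcountry=""United States""", the parser cannot process it the way you want. No CSV parser can do that for you.

Again, you are not processing CSV. The only thing you can do here is use 2 parser instances: one to break down the row around the = character, and another to break down the space-separated values in the result of the first parser. For example: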
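
To make the rule concrete, here is a toy splitter in plain Java that treats a quote as special only when it is the first character of a field. It is a simplified model for illustration, not univocity's implementation: the class name QuoteRuleDemo is hypothetical, and escaped quotes are not handled.

```java
import java.util.ArrayList;
import java.util.List;

public class QuoteRuleDemo {

    /** Toy splitter: a quote only opens a quoted field when it starts the field. */
    public static List<String> split(String line, char delimiter, char quote) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean atFieldStart = true;
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (atFieldStart && c == quote) {
                inQuotes = true;        // field starts with a quote: quoted field
                atFieldStart = false;
            } else if (inQuotes && c == quote) {
                inQuotes = false;       // closing quote of a quoted field
            } else if (!inQuotes && c == delimiter) {
                fields.add(field.toString());
                field.setLength(0);
                atFieldStart = true;
            } else {
                field.append(c);        // includes quotes found mid-field
                atFieldStart = false;
            }
        }
        fields.add(field.toString());
        return fields;
    }

    public static void main(String[] args) {
        // The quote appears mid-field, so the space inside it still splits.
        System.out.println(split("dstcountry=\"United States\" date=2018-12-13", ' ', '"'));
        // The field starts with a quote, so the space inside it is preserved.
        System.out.println(split("\"United States\" date=2018-12-13", ' ', '"'));
    }
}
```

The first call splits United and States apart, as in the question; the second keeps United States together because the quote opens the field.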

    CsvParserSettings parserSettings = new CsvParserSettings();
    //break down the rows around the `=` character
    parserSettings.getFormat().setDelimiter('=');

    CsvParser keyValueParser = new CsvParser(parserSettings);
    String line = "dstcountry=\"United States\" date=2018-12-13 time=23:47:32";
    String[] keyPairs = keyValueParser.parseLine(line);

    //break down each value around the whitespace.
    parserSettings.getFormat().setDelimiter(' ');
    CsvParser valueParser = new CsvParser(parserSettings);

    //add all values to a list
    List<String> row = new ArrayList<String>();

    for(String value : keyPairs){
        //if a value has a whitespace, break it down using the other parser instance
        String[] values = valueParser.parseLine(value);

        Collections.addAll(row, values);
    }

    //here is your result
    System.out.println(row);

This will print:

[dstcountry, United States, date, 2018-12-13, time, 23:47:32]

You now have the keys and values. The following code will print them out the way you want:

    for (int i = 0; i < row.size(); i += 2) {
        System.out.println(row.get(i) + " = " + row.get(i + 1));
    }

Output:

dstcountry = United States

date = 2018-12-13

time = 23:47:32
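
If the pairs are needed for lookup rather than printing, the flat list can be collected into a map. This is a small sketch that assumes the list strictly alternates key, value; the class name PairsToMap and the method toMap are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PairsToMap {

    /** Collects an alternating [key, value, key, value, ...] list into an ordered map. */
    public static Map<String, String> toMap(List<String> row) {
        Map<String, String> map = new LinkedHashMap<>();
        // Step two at a time; the bound i + 1 < size skips a dangling key with no value.
        for (int i = 0; i + 1 < row.size(); i += 2) {
            map.put(row.get(i), row.get(i + 1));
        }
        return map;
    }

    public static void main(String[] args) {
        List<String> row = Arrays.asList(
                "dstcountry", "United States", "date", "2018-12-13", "time", "23:47:32");
        System.out.println(toMap(row).get("dstcountry"));
    }
}
```

The loop bound differs from the printing loop on purpose: with an odd number of elements, row.get(i + 1) would otherwise throw an IndexOutOfBoundsException on the last key.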

Hope this helps, and thank you for using our parser!