How to get specific data from a CSV file

Time: 2018-05-25 20:16:43

Tags: java arrays csv arraylist sequential

I have a very large CSV file, and I have managed to read all of it into an ArrayList using Scanner:

    List<String> wholefile = new ArrayList<>();
    Path filepath = Paths.get("./data.csv");

    // Scanner.next() returns whitespace-delimited tokens, not CSV fields or rows
    try (Scanner input = new Scanner(filepath)) {
      while (input.hasNext()) {
        wholefile.add(input.next());
      }
      System.out.println(wholefile);
    } catch (IOException e) {
      e.printStackTrace();
    }

My array looks like this:

wholefile = [id,property,address,first_name,last_name,email,Owner,contact,address,Price,Date,sold,1,94032,Mockingbird,Alley,Brander,Verillo,bverillo0 @ sogou.com ,, 435587.57 ,, 2,293,Haas,Lane,Maxy,Reynalds ...........]

Here is a screenshot of the CSV file in Excel: https://plus.google.com/photos/photo/115135191238195349859/6559552907258825106?authkey=CIu-hovf5pj29gE

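I want to do a few things with this data, but I'm confused about which methods I need to write:

  1. Get a property record by ID
  2. Get a list of the top n properties
  3. Get the total sales for a month

Any help or guidance would be much appreciated. I'm not sure whether I'm going about this the right way.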

    https://plus.google.com/photos/photo/115135191238195349859/6559637333893665186

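3 answers:

Answer 0: (score: 1)

Don't waste time reinventing the wheel.

I suggest using the Apache Commons CSV library to work with .csv files.

You can find the official documentation here

Some examples here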

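Answer 1: (score: 0)

An ArrayList of strings will perform poorly for what you want to do. First create a class whose fields match the CSV header. Then, while reading the file, add one instance per row to an ArrayList of that class. Sorting, searching, and total sales then only need a stream over that ArrayList.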
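
A rough sketch of that idea, not the answerer's own code. The `Sale` record, its four fields, the ISO date format, and the sample rows are all made up for illustration and do not match the asker's real header; it assumes Java 16 or later for records and `Stream.toList()`.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.time.YearMonth;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class SalesQueries {

    // Hypothetical row type; the field names are assumptions, not the real CSV header.
    record Sale(long id, String address, BigDecimal price, LocalDate dateSold) {}

    // Parses one data line of the assumed form "id,address,price,yyyy-MM-dd".
    static Sale parse(String line) {
        String[] cells = line.split("\\s*,\\s*", -1);
        return new Sale(Long.parseLong(cells[0]), cells[1],
                new BigDecimal(cells[2]), LocalDate.parse(cells[3]));
    }

    // Searching: first sale with the given id, if any.
    static Optional<Sale> findById(List<Sale> sales, long id) {
        return sales.stream().filter(s -> s.id() == id).findFirst();
    }

    // Sorting: the n most expensive sales, highest price first.
    static List<Sale> topByPrice(List<Sale> sales, int n) {
        return sales.stream()
                .sorted(Comparator.comparing(Sale::price).reversed())
                .limit(n)
                .toList();
    }

    // Total sales: sum of prices for sales dated within the given month.
    static BigDecimal totalForMonth(List<Sale> sales, YearMonth month) {
        return sales.stream()
                .filter(s -> YearMonth.from(s.dateSold()).equals(month))
                .map(Sale::price)
                .reduce(BigDecimal.ZERO, BigDecimal::add);
    }

    public static void main(String[] args) {
        // Invented sample rows, standing in for lines read from the file.
        List<Sale> sales = List.of(
                parse("1,10 Example Street,300000.00,2018-03-14"),
                parse("2,22 Sample Road,150000.50,2018-03-30"),
                parse("3,5 Placeholder Lane,500000.00,2018-04-02"));

        System.out.println(findById(sales, 2));
        System.out.println(topByPrice(sales, 2));
        System.out.println(totalForMonth(sales, YearMonth.of(2018, 3)));
    }
}
```

In a real program the list would be built by parsing each line of the file after the header, and fields that may contain commas or quotes would need a proper CSV parser rather than `split`.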

Answer 2: (score: 0)

I had to roll a custom CSV parser for a proof of concept we were working on, and I think you could repurpose it here:

CSVReader.java

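import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Reads the whole file into memory and iterates over it one row at a time
public class CSVReader implements Iterable<CSVRow> {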

    private List<String> _data;
    private int _itPos = 0;
    private int _skip = 0;
    private FileIterator _it;
    private boolean _hasTrailingComma = false;

    public CSVReader(Path path, boolean hasTrailingComma) throws IOException {
        this(Files.readAllLines(path), hasTrailingComma);
    }

    public CSVReader(Path path) throws IOException {
        this(path, false);
    }

    public CSVReader(List<String> data, boolean hasTrailingComma) {
        _data = data;
        _it = new FileIterator();
        _hasTrailingComma = hasTrailingComma;
    }

    public CSVReader(List<String> data) {
        this(data, false);
    }

    public CSVRow getHeaders() {
        return new CSVRow(_data.get(0), _hasTrailingComma);
    }

    public void skip(int rows) {
        _skip = rows;
    }

    @Override
    public Iterator<CSVRow> iterator() {
        _itPos = _skip;
        return _it;
    }

    private class FileIterator implements Iterator<CSVRow> {

        @Override
        public boolean hasNext() {
            return _itPos < _data.size();
        }

        @Override
        public CSVRow next() {
            if (_itPos == _data.size()) {
                throw new NoSuchElementException();
            }
            return new CSVRow(_data.get(_itPos++), _hasTrailingComma);
        }

    }
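}

CSVRow.java

import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// One line of the file, split into cells on commas (quoted fields are not supported)
public class CSVRow implements Iterable<String> {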

    private String[] _data;
    private int _itPos = 0;
    private int _skip = 0;
    private RowIterator _it = null;
    private int _actualLength = 0;

    public CSVRow(String row, boolean trailingComma) {
        // Minor hack
        // in case the data doesn't end in commas
        // we check for the last character and add
        // a comma. Ideally, the input file should be fixed;
        if(trailingComma && !row.endsWith(",")) {
            row += ",";
        }
        _data = row.split("\\s*,\\s*", -1);
        _actualLength = trailingComma ? _data.length - 1 : _data.length;
        _it = new RowIterator();
    }

    public CSVRow(String row) {
        this(row, false);
    }

    public void skip(int cells) {
        _skip = cells;
    }

    @Override
    public Iterator<String> iterator() {
        _itPos = _skip;
        return _it;
    }

    public String[] toArray() {
        return Arrays.copyOf(_data, _actualLength);
    }

    private class RowIterator implements Iterator<String> {

        @Override
        public boolean hasNext() {
            return _itPos < _actualLength;
        }

        @Override
        public String next() {
            if (_itPos == _actualLength) {
                throw new NoSuchElementException();
            }
            return _data[_itPos++];
        }

    }
}

Usage

public static void main(String[] args) throws IOException {
    Path filepath = Paths.get("./data.csv");
    CSVReader reader = new CSVReader(filepath);
    // Prints every row, including the header; call reader.skip(1) first to omit it
    for (CSVRow row : reader) {
        for (String str : row) {
            System.out.printf("%s ", str);
        }
        System.out.println();
    }
}

Now it is useful to model each row as an object, so that you can work with it in Java. You could define a class Property that models one row:

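import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

public class Property {

    // long, to match getId() and Long.parseLong() in setId()
    private long id;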
    private String address;
    private String firstName;
    private String lastName;
    private String email;
    private String ownerContactAddress;
    private BigDecimal price;
    private java.sql.Date dateSold;

    public Property() {
    } 

    // Setters and getters
    public long getId() {
        return this.id;
    }
    public void setId(String id) {
        this.id = Long.parseLong(id);
    }
    public String getAddress() {
        return this.address;
    }
    public void setAddress(String address) {
        this.address = address;
    }
    // TODO: setter/getters for firstName, lastName, email, ownerContactAddress

    public BigDecimal getPrice() {
        return this.price;
    }
    public void setPrice(String price, Locale locale) throws ParseException {
        NumberFormat format = NumberFormat.getNumberInstance(locale);
        if (format instanceof DecimalFormat) {
            ((DecimalFormat) format).setParseBigDecimal(true);
        }
        // Strip currency symbols and other non-numeric characters before parsing
        this.price = (BigDecimal) format.parse(price.replaceAll("[^\\d.,]",""));
    }
    public java.sql.Date getDateSold() {
        return this.dateSold;
    }
    public void setDateSold(String date, String format) throws ParseException {
        SimpleDateFormat sdf = new SimpleDateFormat(format);
        this.dateSold = new java.sql.Date(sdf.parse(date).getTime());
    }
}

Putting it all together (untested):

public static void main(String[] args) throws IOException, ParseException {

    // Collection to store properties
    // You could also write a class to wrap this 
    // map along with the methods you need to implement
    // Say PropertyTable {
    //        private Map<Long, Property> properties ...
    //        Property getPropertyById(long id);
    //        getHighestPriced() // sort the map by price
    // }
    Map<Long, Property> properties = new HashMap<>();

    Path filepath = Paths.get("./data.csv");
    CSVReader reader = new CSVReader(filepath);
    // Skip the header row so that "id" is not passed to Long.parseLong
    reader.skip(1);
    for (CSVRow row : reader) {
        Iterator<String> it = row.iterator();
        Property p = new Property();
        p.setId(it.next());
        p.setAddress(it.next());
        // ... set the remaining properties
        p.setPrice(it.next(), new Locale("en", "GB"));
        p.setDateSold(it.next(), "MM/dd/yyyy");
        properties.put(p.getId(), p);
    }
    // At this point, you should have all the properties read

    // let's try to get property with id 5
    Property prop = properties.get(5L);
}
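
The PropertyTable wrapper outlined in the comments above might look something like this. It is only a sketch: `Entry` is a cut-down, hypothetical stand-in for Property so the example compiles on its own, and `add` and the `n` parameter of `getHighestPriced` are my own additions, not part of the outline.

```java
import java.math.BigDecimal;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PropertyTable {

    // Hypothetical stand-in for Property, reduced to the two fields used here.
    public static class Entry {
        final long id;
        final BigDecimal price;

        public Entry(long id, BigDecimal price) {
            this.id = id;
            this.price = price;
        }
    }

    // Keyed by id, so lookups by id do not scan every row.
    private final Map<Long, Entry> properties = new HashMap<>();

    public void add(Entry e) {
        properties.put(e.id, e);
    }

    // Returns null when no property has the given id (plain Map.get behaviour).
    public Entry getPropertyById(long id) {
        return properties.get(id);
    }

    // Sorts all values by price, highest first, and keeps the first n.
    public List<Entry> getHighestPriced(int n) {
        return properties.values().stream()
                .sorted(Comparator.comparing((Entry e) -> e.price).reversed())
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        PropertyTable table = new PropertyTable();
        // Invented values, for illustration only.
        table.add(new Entry(1L, new BigDecimal("100.00")));
        table.add(new Entry(2L, new BigDecimal("300.00")));
        table.add(new Entry(3L, new BigDecimal("200.00")));

        System.out.println(table.getPropertyById(2L).price);
        for (Entry e : table.getHighestPriced(2)) {
            System.out.println(e.id + " " + e.price);
        }
    }
}
```

Sorting on every call is fine for occasional queries. If you need the top n often, keeping a second structure sorted by price would avoid the repeated sort.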

I hope this helps.