Fastest way to bulk load a large amount of data from a Vertica table into a MySQL table

Date: 2017-09-27 05:32:56

Tags: java mysql hibernate batch-processing vertica

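I need to run some select * queries against a Vertica table and write the results into a MySQL table. Because I insert the rows one at a time in a loop, it is very slow. What would be a faster approach? Can someone explain how to do this in Java with Hibernate, or with any other fast method?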

import java.sql.*;

public class Main {
    public static void main(String[] args) {
        try {
            // Read the source rows from Vertica (host, user and pass are placeholders).
            Class.forName("com.vertica.jdbc.Driver");
            Connection c = DriverManager.getConnection("jdbc:vertica://host", "user", "pass");
            Statement stmt = c.createStatement();
            //File f2 = new File("/Users/pragati.ratan/Desktop/Kalyan.csv");
            //CSVWriter csvWriter = new CSVWriter(new FileWriter(f2), ',');
            String sql = "select * from unified_global_dw.offnetwork_daily_burn_fact_v where date >= '2017-09-03 00:00:00';";
            ResultSet rs = stmt.executeQuery(sql);
            //csvWriter.writeAll(rs, true);

            // Insert into MySQL one row per statement; this loop is the slow part.
            Class.forName("com.mysql.jdbc.Driver");
            Connection c1 = DriverManager.getConnection("jdbc:mysql://hostname/database", "user", "pass");
            String sql1 = "insert into offnetwork_daily_burn (id,offer_id,date,latest_pull_on,burn) values (null,?,?,?,?)";
            PreparedStatement preparedStatement = c1.prepareStatement(sql1);
            while (rs.next()) {
                int offer_id = rs.getInt(2);
                Date dateTime = rs.getDate(3);
                Date datetime1 = rs.getDate(4);
                double burn = rs.getDouble(5);

                preparedStatement.setInt(1, offer_id);
                preparedStatement.setDate(2, dateTime);
                preparedStatement.setDate(3, datetime1);
                preparedStatement.setDouble(4, burn);

                preparedStatement.executeUpdate();
            }
            c.close();
            c1.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

This is the code I have written so far.

1 answer:

Answer 0 (score: 1)

This may not be exactly the answer you are looking for, but:

Why not do a bulk export and a bulk import using the native tools of each stack?

Vertica export

vsql -U user -w password -h hosts -F ',' -At -c "SELECT * FROM schema.TableName" > /tmp/TableName.csv

MySQL import

mysqlimport --fields-terminated-by=, \
            --local -u root \
            -p Database \
            /tmp/TableName.csv
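
If the copy has to stay in Java, JDBC batching is a common alternative to the row-by-row insert in the question. The following is only a minimal sketch under some assumptions: the class name BatchCopy, the method names and the batch size of 5000 are hypothetical, while the SQL and the column positions are taken from the question's code. It queues rows with addBatch, sends them with executeBatch every BATCH_SIZE rows, and commits once at the end.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class BatchCopy {
    // Hypothetical batch size; tune it for your row width and available memory.
    static final int BATCH_SIZE = 5000;

    // SQL taken from the question.
    static final String SELECT_SQL =
        "select * from unified_global_dw.offnetwork_daily_burn_fact_v where date >= '2017-09-03 00:00:00'";
    static final String INSERT_SQL =
        "insert into offnetwork_daily_burn (id,offer_id,date,latest_pull_on,burn) values (null,?,?,?,?)";

    /** Queues every row of rs on insert, flushing every batchSize rows; returns the number of rows read. */
    static long copyRows(ResultSet rs, PreparedStatement insert, int batchSize) throws SQLException {
        long rows = 0;
        while (rs.next()) {
            // Same column positions as in the question's loop.
            insert.setInt(1, rs.getInt(2));
            insert.setDate(2, rs.getDate(3));
            insert.setDate(3, rs.getDate(4));
            insert.setDouble(4, rs.getDouble(5));
            insert.addBatch();
            if (++rows % batchSize == 0) {
                insert.executeBatch(); // send one full batch to MySQL
            }
        }
        if (rows % batchSize != 0) {
            insert.executeBatch(); // send the final partial batch
        }
        return rows;
    }

    /** Copies the query result from the Vertica connection to the MySQL connection in one transaction. */
    static long transfer(Connection vertica, Connection mysql) throws SQLException {
        mysql.setAutoCommit(false); // commit once at the end instead of once per row
        try (Statement select = vertica.createStatement();
             PreparedStatement insert = mysql.prepareStatement(INSERT_SQL)) {
            select.setFetchSize(BATCH_SIZE); // a hint only; drivers may ignore it
            try (ResultSet rs = select.executeQuery(SELECT_SQL)) {
                long rows = copyRows(rs, insert, BATCH_SIZE);
                mysql.commit();
                return rows;
            }
        }
    }
}
```

To use it, open both connections with DriverManager.getConnection as in the question and call BatchCopy.transfer(c, c1). With MySQL Connector/J, adding rewriteBatchedStatements=true to the JDBC URL (for example jdbc:mysql://hostname/database?rewriteBatchedStatements=true) lets the driver rewrite a batch into multi-row inserts, which usually matters more than the batching itself. For very large tables the export and import approach above is still likely to be faster.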