Problem inserting records into a Teradata table with Python

Time: 2018-06-07 18:58:33

Tags: python sql teradata

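I have a project where I pass a Teradata database table name as a parameter and execute a SQL statement that computes aggregates (min, max, etc.) for each column of that table. The results are returned and loaded into a DataFrame. What I want to do is take the rows in the DataFrame (one row per column_name in the table) and insert them into another database table, which is where the data analysis results will be stored.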

def main():
    def func_1(cfg_tbl):
        udaExec = teradata.UdaExec(appName="DataAnalysis", version="1.0", logConsole=False)
        main_query = """
        SELECT 'SELECT '''
        || TRIM(ColumnName)
        || ''', COUNT(DISTINCT "' || ColumnName || '") AS DISTINCT_COUNT,'
        || ' COUNT(1) - COUNT("' || ColumnName || '") AS NULL_COUNT,'
        || ' MAX("' || ColumnName || '") AS MAX_COL_VALUE,'
        || ' MIN("' || ColumnName || '") AS MIN_COL_VALUE,'
        || CASE WHEN ColumnType IN ('I', 'D', 'F', 'I1', 'I2', 'I8', 'N', 'DA', 'TS') THEN ' MAX(LENGTH(TO_CHAR("' || ColumnName || '")))'
                WHEN ColumnType IN ('CF', 'CV', 'CO') THEN ' MAX(LENGTH("' || ColumnName || '"))'
                ELSE NULL END || ' AS MAX_COLUMN_LENGTH,'
        || CASE WHEN ColumnType IN ('I', 'D', 'F', 'I1', 'I2', 'I8', 'N', 'DA', 'TS') THEN ' MIN(LENGTH(TO_CHAR("' || ColumnName || '")))'
                WHEN ColumnType IN ('CF', 'CV', 'CO') THEN ' MIN(LENGTH("' || ColumnName || '"))'
                ELSE NULL END || ' AS MIN_COLUMN_LENGTH,'
        || ' COUNT(1) AS TABLE_COUNT,'
        || ' ''%s_%s'' AS TABLE_NM,'
        || ' ''%s'' AS SOURCE_TYPE'
        || ' FROM ' || TRIM(DatabaseName) || '.' || TRIM(TableName) || ';' AS COL
        FROM DBC.ColumnsV A
        WHERE DatabaseName = 'XXX'
        AND TableName = '%s_%s'
        """ % (cfg_pre, cfg_tbl, cfg_src, cfg_pre, cfg_tbl)
        #  connect to Teradata, execute above sql and subsequent SELECT statements
        session = udaExec.connect(method="odbc", dsn="XXXX", username="XXX", password="XXX")
        pd.set_option('max_colwidth', 500)
        df = pd.read_sql(main_query, session)
        sql_execute = list(df.values.flatten())  
        col = ['COLUMN_NAME', 'DISTINCT_COUNT', 'NULL_COUNT', 'MAX_COL_VALUE', 'MIN_COL_VALUE',
        'MAX_COL_LENGTH', 'MIN_COL_LENGTH', 'TABLE_CNT', 'TABLE_NM', 'DATA_SOURCE']

        jdf = pd.DataFrame(columns=col)
        for script in sql_execute:
            jdf = pd.read_sql(script, session)
            print(jdf)
            session.execute("""INSERT INTO DATA_ANALYSIS(COLUMN_NM, DISTINCT_COUNT, NULL_COUNT, MAX_COL_VALUE, MIN_COL_VALUE,
                                            MAX_COL_LENGTH, MIN_COL_LENGTH, TABLE_CNT, TABLE_NM, DATA_SOURCE) 
                                            VALUES(?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""", (jdf[0], jdf[1], jdf[2], jdf[3], 
                                                                                     jdf[4], jdf[5], jdf[6], jdf[7], jdf[8], jdf[9]))

The problem I am facing is with the INSERT code at the bottom. I get a KeyError: 0, so I know I must be doing something wrong in the insert. Any ideas?
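One likely reading of the KeyError: 0 is that `jdf[0]` on a DataFrame is a lookup by column label, not by position, and the result of `pd.read_sql` has string column names. The sketch below reproduces that behaviour without a database. The one-row frame and its values are made up for illustration, and only three of the ten columns are shown.

```python
import pandas as pd

# Hypothetical one-row result, shaped like the output of one generated SELECT.
jdf = pd.DataFrame(
    [["CUST_ID", 120, 3]],
    columns=["COLUMN_NAME", "DISTINCT_COUNT", "NULL_COUNT"],
)

try:
    jdf[0]  # looks for a column *labelled* 0, which does not exist
    raised = False
except KeyError:
    raised = True

# Positional access goes through .iloc (row 0, column 0).
first_by_position = jdf.iloc[0, 0]

# Label access uses the column name, then picks the first row.
first_by_label = jdf["COLUMN_NAME"].iloc[0]
```

Note that even with a valid label, `jdf["COLUMN_NAME"]` is a whole Series rather than a single value, so it would still need `.iloc[0]` (or similar) before being bound to a `?` placeholder.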

The exact error is:

Traceback (most recent call last):
  File "C:\Users\xxx\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 2442, in get_loc
    return self._engine.get_loc(key)
  File "pandas\_libs\index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5280)
  File "pandas\_libs\index.pyx", line 154, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5126)
  File "pandas\_libs\hashtable_class_helper.pxi", line 1210, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20523)
  File "pandas\_libs\hashtable_class_helper.pxi", line 1218, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20477)
KeyError: 0
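The traceback ends inside `get_loc` in pandas' index code, which suggests the failure happens while looking up a label on the DataFrame, before any SQL is sent. A minimal sketch of one way to pass plain row values to the INSERT instead is shown below. `FakeSession` and `insert_rows` are hypothetical names introduced here so the example runs without a Teradata connection; the only assumption about the real session is that it accepts `execute(sql, params)` as in the code above. The column list is shortened to three columns for brevity.

```python
import pandas as pd


class FakeSession:
    """Hypothetical stand-in for the real session: records calls, runs no SQL."""

    def __init__(self):
        self.calls = []

    def execute(self, sql, params=None):
        self.calls.append((sql, params))


INSERT_SQL = (
    "INSERT INTO DATA_ANALYSIS (COLUMN_NM, DISTINCT_COUNT, NULL_COUNT) "
    "VALUES (?, ?, ?)"
)


def insert_rows(session, frame):
    """Issue one INSERT per DataFrame row, binding the row's values in order."""
    # name=None yields plain tuples; index=False leaves the row index out.
    for row in frame.itertuples(index=False, name=None):
        session.execute(INSERT_SQL, row)


# Made-up profiling results for two columns.
frame = pd.DataFrame(
    [["CUST_ID", 120, 3], ["CUST_NAME", 80, 0]],
    columns=["COLUMN_NAME", "DISTINCT_COUNT", "NULL_COUNT"],
)

session = FakeSession()
insert_rows(session, frame)
```

Because the values are bound by position, the DataFrame column names do not have to match the target table's column names; only the order and count must line up with the placeholders.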

0 answers:

No answers