I am running a script for a dataset transfer over ssh, which will take nearly 3-4 months to complete. Unfortunately, the connection drops after 6-8 days, so the script has to be restarted.
The script:
import psycopg2
from time import sleep
from config import config
from tqdm import tqdm
import requests
import json
import subprocess

subprocess.call("./airquality.sh", shell=True)


def val_json():
    db = "select to_json(d) from ( select \
        a.particles_data as particles, \
        a.o3_data as \"O3\", \
        to_timestamp(a.seconds) as \"dateObserved\", \
        l.description as name, \
        json_build_object( \
            'coordinates', \
            json_build_array(l.node_lon, l.node_lat) \
        ) as location \
        from airquality as a \
        inner join deployment as d on \
            d.deployment_id = a.deployment_id \
        inner join location as l on \
            l.location_id = d.location_id \
    ) as d"
    return db


def main():
    url = 'http://localhost:1026/v2/entities/003/attrs?options=keyValues'
    headers = {"Content-Type": "application/json",
               "fiware-service": "urbansense",
               "fiware-servicepath": "/basic"}
    conn = None
    try:
        params = config()
        with psycopg2.connect(**params) as conn:
            with conn.cursor(name='my_cursor') as cur:
                cur.itersize = 2000
                cur.execute(val_json())
                # row = cur.fetchone()
                for row in tqdm(cur):
                    jsonData = json.dumps(row)
                    if jsonData.startswith('[') and jsonData.endswith(']'):
                        jsonData = jsonData[1:-1]
                    print(jsonData)
                    requests.post(url, data=jsonData, headers=headers)
                    sleep(1)
                cur.close()
    except (Exception, psycopg2.DatabaseError) as error:
        print(error)
    finally:
        if conn is not None:
            conn.close()


if __name__ == '__main__':
    main()
How can I create a file to track the transfer progress, so that when the script is run again (after the connection drops), it picks up the dataset from where it previously stopped?
Edit:
Oops! I am stuck.
I managed to get the script running and writing its progress to a text file (air.txt), which I had to create manually with the content 0 (otherwise the script would not run at all). While the script runs, the contents of air.txt are updated with the cursor position value.
Problem:
My problem now is that when I stop the script (as a way of checking) and restart it to verify that it picks up from the previous position, it starts over from 0 and overwrites the previous value (beginning a new count instead of reading the stored value as the starting position).
Here is my updated script:
def val_json():
    db = "select to_json(d) from ( select \
        a.particles_data as particles, \
        a.o3_data as \"O3\", \
        to_timestamp(a.seconds) as \"dateObserved\", \
        l.description as name, \
        json_build_object( \
            'coordinates', \
            json_build_array(l.node_lon, l.node_lat) \
        ) as location \
        from airquality as a \
        inner join deployment as d on \
            d.deployment_id = a.deployment_id \
        inner join location as l on \
            l.location_id = d.location_id \
    ) as d"
    return db


def main():
    RESTART_POINT_FILE = 'air.txt'
    conn = None
    try:
        params = config()
        with open(RESTART_POINT_FILE) as fd:
            rows_to_skip = int(next(fd))
        #except OSError:
        rows_to_skip = 0
        with psycopg2.connect(**params) as conn:
            with conn.cursor(name='my_cursor') as cur:
                cur.itersize = 2000
                cur.execute(val_json())
                for processed_rows, row in enumerate(tqdm(cur)):
                    if processed_rows < rows_to_skip: continue
                    jsonData = json.dumps(row)
                    if jsonData.startswith('[') and jsonData.endswith(']'):
                        jsonData = jsonData[1:-1]
                    print('\n', processed_rows, '\t', jsonData)
                    # update progress file...
                    with open(RESTART_POINT_FILE, "w") as fd:
                        print(processed_rows, file=fd)
                    sleep(1)
                cur.close()
    except (Exception, psycopg2.DatabaseError) as error:
        print(error)
    finally:
        if conn is not None:
            conn.close()


if __name__ == '__main__':
    main()
Answer 0 (score: 1)
A simple approach is to use a dedicated file in a well-known place.
That file either contains a single line with the number of rows successfully processed, or does not exist.
At startup, if the file does not exist, the number of records to skip is 0; if it exists, it is the number on the first line of the file. The loop should be changed to skip those records and to keep track of the number of the last processed record.
On successful termination the file should be deleted; on a write error, the number of the last successfully processed record should be written to it.
Skeleton code:
RESTART_POINT_FILE = ...  # full path of the restart point file

# begin: read the file:
try:
    with open(RESTART_POINT_FILE) as fd:
        rows_to_skip = int(next(fd))
except OSError:
    rows_to_skip = 0

# loop:
for processed_rows, row in enumerate(tqdm(cur)):
    if processed_rows < rows_to_skip: continue
    ...

# end
except (Exception, psycopg2.DatabaseError) as error:
    print(error)
    # write the file
    with open(RESTART_POINT_FILE, "w") as fd:
        print(processed_rows, file=fd)
finally:
    if conn is not None:
        conn.close()
    # try to remove the file if it exists
    try:
        os.remove(RESTART_POINT_FILE)
    except OSError:
        pass
Note: not tested...
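To make the pattern concrete, here is a minimal, self-contained sketch of the restart-point file idea, with a plain list standing in for the server-side cursor and an appended list standing in for the requests.post call (the function names and the list data are placeholders, not part of the original script):

```python
import os


def load_restart_point(path):
    """Return the number of rows already processed, or 0 if the file is absent/invalid."""
    try:
        with open(path) as fd:
            return int(next(fd))
    except (OSError, ValueError, StopIteration):
        return 0


def save_restart_point(path, rows_done):
    """Record how many rows have been fully processed so far."""
    with open(path, "w") as fd:
        print(rows_done, file=fd)


def process(rows, path):
    """Send each row, skipping rows already completed in a previous run."""
    rows_to_skip = load_restart_point(path)
    sent = []
    for processed_rows, row in enumerate(rows):
        if processed_rows < rows_to_skip:
            continue  # already sent in a previous run
        sent.append(row)  # stand-in for the real requests.post(...) call
        # write the count of rows *completed* (index + 1), so a restart
        # does not re-send the last processed row
        save_restart_point(path, processed_rows + 1)
    # successful termination: remove the file so a fresh run starts from 0
    try:
        os.remove(path)
    except OSError:
        pass
    return sent
```

Note the off-by-one choice: saving `processed_rows + 1` pairs correctly with the `processed_rows < rows_to_skip` skip test, whereas saving the bare index (as the updated script in the question does) would re-send one row after every restart.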
Answer 1 (score: 0)
Try using a while loop on the connection status, True or False, and while the connection is False, wait until it becomes True again.
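This answer is terse, but the idea of looping until a connection attempt succeeds can be sketched as a generic retry wrapper (the function and its parameters are illustrative, not part of the original script):

```python
import time


def retry(func, attempts=5, delay=1.0, exceptions=(Exception,)):
    """Call func(), retrying up to `attempts` times on the given exceptions."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise  # out of retries: propagate the last error
            time.sleep(delay)  # back off before trying again


# Example: a call that fails twice before succeeding.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("not yet")
    return "ok"

print(retry(flaky, attempts=5, delay=0.01))  # succeeds on the third call
```

In the question's script, `func` would wrap the `psycopg2.connect(...)` or `requests.post(...)` call; note that this alone does not resume a dropped server-side cursor, so it complements rather than replaces the restart-point file.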
Answer 2 (score: 0)
If your problem is caused solely by the ssh remote terminal timing out, the simple answer is: use a terminal multiplexer that runs on the remote machine, such as tmux or screen. It keeps the program running even when the session times out; you just reconnect at your convenience and reattach the terminal to watch its progress. Even a "terminal detacher" like nohup will do (but then you will need to redirect standard output to a file).
However, that will not save you from an occasional OOM kill, a server restart, and so on. For that, regular serialization of the program state together with a reload mechanism is a good idea.
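For the nohup and tmux variants, the invocations might look like this (the script name, session name, and log path are placeholders):

```shell
# Start the transfer detached from the controlling terminal; an SSH
# disconnect no longer kills it. Stdout/stderr are redirected to a log file.
nohup python3 transfer.py > transfer.log 2>&1 &
echo "transfer running as PID $!"

# With tmux instead, the session survives disconnects and can be reattached
# later to inspect the tqdm progress bar:
#   tmux new-session -d -s airquality 'python3 transfer.py'
#   tmux attach -t airquality
```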