Unable to read a Parquet file with fastparquet, but it works with pyarrow (nullable integers)

Time: 2019-06-04 14:58:15

Tags: python pandas parquet pyarrow fastparquet

I am currently running code like this:

df = pd.read_parquet('/tmp/my-file.parquet', engine='pyarrow')

Because the file is large, I have run into memory consumption problems, so I wanted to investigate whether fastparquet makes better use of memory.
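To compare the two engines on memory, one rough starting point is to measure the size of the DataFrame each of them returns. The sketch below is only an illustration: `frame_mebibytes` is a hypothetical helper, the sample data is a stand-in for the real file, and it reports the size of the finished DataFrame, not the peak memory used while reading.

```python
import pandas as pd


def frame_mebibytes(df: pd.DataFrame) -> float:
    """Return the in-memory size of a DataFrame in MiB.

    deep=True also counts the Python objects held by object-dtype columns.
    """
    return df.memory_usage(deep=True).sum() / 1024 ** 2


# Hypothetical stand-in data; in practice this would be the result of
# pd.read_parquet(...) for each engine being compared.
sample = pd.DataFrame({"id": range(1000), "name": ["x"] * 1000})
print(f"{frame_mebibytes(sample):.3f} MiB")
```

Peak usage during the read can be much higher than this figure, so an external tool such as a process-level memory monitor would be needed for that comparison.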

When I switch the engine to fastparquet, the following line now raises an error:

df = pd.read_parquet('/tmp/my-file.parquet', engine='fastparquet')

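I believe this is because I have an integer column that contains null values. I could not find any documentation saying that this is unsupported.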
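A minimal sketch of how this suspicion could be checked on the pandas side, assuming the column uses the nullable `Int64` extension dtype. The column names and the helper `nullable_int_columns_with_nulls` are hypothetical, and the conversion to `float64` is only one possible workaround to try before writing the file; it is not confirmed to be the cause or the fix here, and it loses exact integer precision above 2**53.

```python
import pandas as pd

# Hypothetical frame: "count" is a nullable integer column with one missing value.
df = pd.DataFrame(
    {
        "count": pd.array([1, None, 3], dtype="Int64"),
        "value": [0.5, 1.5, 2.5],
    }
)


def nullable_int_columns_with_nulls(frame: pd.DataFrame) -> list:
    """List extension-dtype integer columns that actually contain nulls."""
    return [
        name
        for name, col in frame.items()
        if pd.api.types.is_extension_array_dtype(col.dtype)
        and pd.api.types.is_integer_dtype(col.dtype)
        and col.isna().any()
    ]


cols = nullable_int_columns_with_nulls(df)
print(cols)

# Possible workaround (an assumption, not a verified fix): store the affected
# columns as float64 so that missing values become NaN.
converted = df.copy()
converted[cols] = converted[cols].astype("float64")
print(converted.dtypes)
```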

Any ideas as to why this happens, or how to work around it while still using fastparquet?

0 answers:

No answers