Reaching the memory limit in R

Time: 2013-05-22 13:27:49

Tags: r date memory memory-leaks

I am running into a strange problem in R.

I have a large data.table, dataTs1:

Classes ‘data.table’ and 'data.frame':  419172 obs. of  5 variables:
 $ TimeStamp: chr  "01MAR13:07:15:00" "01MAR13:07:16:00" "01MAR13:07:18:00" ...
 $ col1     : chr  "ALL1" "ALL1" "ALL1" "ALL1" ...
 $ col2     : int  NA NA NA NA NA NA NA NA NA NA ...
 $ col3     : int  4 4 4 4 4 4 4 4 4 4 ...
 $ col4     : int  621 810 4 4 8 1 3 1 1 1 ...

I loaded this table with the fread function.

Memory allocation seems fine:

> memory.size(max=TRUE)
[1] 82.94

I tried to convert the first column to a POSIX class, so I wrote:

  

dataTs1$TimeStamp <- strptime(dataTs1$TimeStamp, "%d%b%y:%H:%M:%S")

With this line, I hit the 16 GB memory limit. But when I write:

test <- 1:length(dataTs1$TimeStamp)
dataTs1$TimeStamp <- test

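it runs perfectly, without any memory overload.

I am quite new to R, and I would appreciate it if you could help me figure out what I am doing wrong here.

Thanks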


Edit:

Actually, when I do not hit the memory overload, I get a strange warning instead:

>dataTs1[,TimeStamp:=strptime(TimeStamp,"%d%b%y:%H:%M:%S")]
Warning messages:
1: In `[<-.data.table`(x, j = name, value = value) :
  Supplied 9 items to be assigned to 419172 items of column 'TimeStamp' (recycled leaving remainder of 6 items).
2: In `[<-.data.table`(x, j = name, value = value) :
  Coerced 'list' RHS to 'character' to match the column's type. Either change the target column to 'list' first (by creating a new 'list' vector length 419172 (nrows of entire table) and assign that; i.e. 'replace' column), or coerce RHS to 'character' (e.g. 1L, NA_[real|integer]_, as.*, etc) to make your intent clear and for speed. Or, set the column type correctly up front when you create the table and stick to it, please.
> str(dataTs1)
Classes ‘data.table’ and 'data.frame':  419172 obs. of  5 variables:
 $ TimeStamp: chr  "c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N"| __truncated__ "c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N"| __truncated__ "c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N"| __truncated__ "c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N"| __truncated__ ...
 $ V6FCDSB  : chr  "ALL1" "ALL1" "ALL1" "ALL1" ...
 $ V6FCDTD  : int  NA NA NA NA NA NA NA NA NA NA ...
 $ _TYPE_   : int  4 4 4 4 4 4 4 4 4 4 ...
 $ N        : int  621 810 4 4 8 1 3 1 1 1 ...
 - attr(*, ".internal.selfref")=<externalptr> 

1 answer:

Answer 0 (score: 6)

data.table does not support POSIXlt, and never will. "The no-support for POSIXlt is set in stone"
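
The usual workaround, which is not spelled out in the answer, is to store the column as POSIXct. strptime() returns a POSIXlt object, which is internally a list of date-time components rather than one value per row; that is consistent with the "Supplied 9 items" and "Coerced 'list' RHS" warnings in the question. as.POSIXct() returns an ordinary atomic vector instead. The following is a minimal sketch under assumptions: the small table `dt` is a hypothetical stand-in for dataTs1, and `tz = "UTC"` is an assumed time zone, since the question does not say which one the timestamps use.

```r
library(data.table)

# Hypothetical miniature of dataTs1, built from the timestamp strings in the question.
dt <- data.table(
  TimeStamp = c("01MAR13:07:15:00", "01MAR13:07:16:00", "01MAR13:07:18:00"),
  N         = c(621L, 810L, 4L)
)

fmt <- "%d%b%y:%H:%M:%S"  # same format string as in the question

# strptime() returns POSIXlt, which is a list of components underneath.
# Note: %b matches month names in the current locale, so "MAR" is assumed
# to be parsed under an English or C locale.
lt <- strptime(dt$TimeStamp, fmt, tz = "UTC")  # tz = "UTC" is an assumption
is_list_backed <- is.list(unclass(lt))

# as.POSIXct() returns an atomic vector with one value per row,
# so it can replace the character column by reference with `:=`.
dt[, TimeStamp := as.POSIXct(TimeStamp, format = fmt, tz = "UTC")]

str(dt)
```

On the full table the same call would be written against dataTs1 directly. Because `:=` replaces the whole column here, the column type changes from character to POSIXct without copying the other columns.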