My data.frame looks like this:
> head(df,10)
DateTime BP1 BQ1 BP2 BQ2 BP3 BQ3 BP4 BQ4 BP5 BQ5
1 2015-09-16 09:15:01 70730 1 0 0 0 0 0 0 0 0
2 2015-09-16 09:15:01 70735 1 70730 1 70285 1 70185 1 0 0
3 2015-09-16 09:15:01 70905 1 70735 3 70730 1 0 0 0 0
4 2015-09-16 09:15:01 70910 1 70905 1 70735 2 70730 1 0 0
5 2015-09-16 09:15:03 70905 1 70900 1 70730 1 70230 1 70220 1
6 2015-09-16 09:15:06 70910 1 70905 2 70900 1 70795 1 70730 1
7 2015-09-16 09:15:06 70905 2 70900 1 70795 1 70730 1 70220 1
8 2015-09-16 09:15:06 70910 1 70905 2 70900 1 70795 1 70730 1
9 2015-09-16 09:15:07 70915 1 70910 1 70905 1 70900 1 70795 1
10 2015-09-16 09:15:07 71000 1 70915 1 70905 1 70785 1 70730 1
Here BP = BidPrice and BQ = BidQty.
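My goal is to compare each row with the previous row to see whether there has been any modification, addition, or cancellation, assign a score for each change, and add those scores up.
For example, df[1,2] equals df[2,4]: the price 70730 moved down one level (from BP1 to BP2), so it gets a score of -1.
Likewise, df[4,4] equals df[5,2]: the price 70905 moved up one level (from BP2 to BP1), so it gets a score of +1.
And so on. I want to do this for every row and then sum the scores for that row, so the output should be a vector or data.frame of positive and negative integers.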
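To make the scoring rule concrete, here is a minimal sketch of it on the bid prices of the first four rows shown above. The helper name `level_shift_score` is hypothetical, and the sketch rests on assumptions the description does not settle: a price of 0 is treated as an empty level, and a price that disappears from the book or appears for the first time contributes nothing to the score.

```r
# Bid prices (BP1..BP5) from the first four rows above; 0 marks an empty level.
bp <- rbind(
  c(70730,     0,     0,     0, 0),
  c(70735, 70730, 70285, 70185, 0),
  c(70905, 70735, 70730,     0, 0),
  c(70910, 70905, 70735, 70730, 0)
)

# Hypothetical helper: score the change from one row of prices to the next.
# A price that sat at level k and is now at level m contributes k - m,
# so moving down one level gives -1 and moving up one level gives +1.
level_shift_score <- function(prev, curr) {
  live      <- prev != 0                  # assumption: 0 means "no order here"
  old_level <- which(live)
  new_level <- match(prev[live], curr)    # NA when the price is gone
  sum(old_level - new_level, na.rm = TRUE)
}

# One score per row, starting from the second row.
scores <- vapply(
  seq_len(nrow(bp))[-1],
  function(i) level_shift_score(bp[i - 1, ], bp[i, ]),
  integer(1)
)
scores
```

Under these assumptions the first score is -1, which matches the df[1,2] / df[2,4] example above; how cancelled and newly added prices should be scored is left open here.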
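What I have so far is very little. I have been racking my brain for hours without getting anywhere, and I think I have ended up with a never-ending loop. My code is below: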
mod <- function(file, level = 5){
  whole_data <- read.csv(file = file, header = FALSE, sep = "",
    col.names = c("DateTime","Seq",
      "BP1","BQ1","BO1","AP1","AQ1","AO1","BP2","BQ2","BO2","AP2","AQ2","AO2",
      "BP3","BQ3","BO3","AP3","AQ3","AO3","BP4","BQ4","BO4","AP4","AQ4","AO4",
      "BP5","BQ5","BO5","AP5","AQ5","AO5","BP6","BQ6","BO6","AP6","AQ6","AO6",
      "BP7","BQ7","BO7","AP7","AQ7","AO7","BP8","BQ8","BO8","AP8","AQ8","AO8",
      "BP9","BQ9","BO9","AP9","AQ9","AO9","BP10","BQ10","BO10","AP10","AQ10","AO10"),
    colClasses = c(NA, rep("integer",31), rep("NULL", 30)))
  whole_data <- whole_data[which(whole_data$DateTime != 0),]
  whole_data$DateTime = as.POSIXct(whole_data$DateTime/(10^9), origin="1970-01-01") #timestamp conversion
  df <- data.frame(DateTime = whole_data$DateTime,
    BP1 = whole_data$BP1, BQ1 = whole_data$BQ1, BP2 = whole_data$BP2, BQ2 = whole_data$BQ2,
    BP3 = whole_data$BP3, BQ3 = whole_data$BQ3, BP4 = whole_data$BP4, BQ4 = whole_data$BQ4,
    BP5 = whole_data$BP5, BQ5 = whole_data$BQ5)
  v <- NULL
  for(i in 1:nrow(df)){
    for(j in seq_along(c(2,4,6,8,10)))
      if(level == 5){
        x <- df[i+1,j]==df[i,c(2,4,6,8,10)]
        y <- which(x==TRUE)
        v[i] <- c(v,(y-(j-(j-1))*-1))
        i=i+1
      }
    v
  }
}
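For anyone with a quantitative finance background: I am trying to analyze electronic order data by capturing order modifications, cancellations, and new orders. This is only the first step of many things I want to do, but any pointers, comments, or help would go a long way and I would be grateful.
What is the fastest way to do this? A for loop? I am really lost about how to approach it! Anything that points me in the right direction would be much appreciated.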
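On the loop question, one possible direction (a sketch, not a benchmarked answer) is to loop over pairs of levels instead of over rows, so that each comparison runs on whole columns at once. With 5 levels that is 25 vectorised comparisons regardless of how many rows there are. The sketch assumes each non-zero price appears at most once per row, treats 0 as an empty level, and gives no score to prices that vanish or newly appear; the variable names are illustrative only.

```r
# Bid prices (BP1..BP5) from the first five rows of the sample data.
bp <- rbind(
  c(70730,     0,     0,     0,     0),
  c(70735, 70730, 70285, 70185,     0),
  c(70905, 70735, 70730,     0,     0),
  c(70910, 70905, 70735, 70730,     0),
  c(70905, 70900, 70730, 70230, 70220)
)

n_rows   <- nrow(bp)
n_levels <- ncol(bp)
prev <- bp[-n_rows, , drop = FALSE]   # rows 1 .. n-1
curr <- bp[-1, , drop = FALSE]        # rows 2 .. n, aligned with prev

score <- integer(n_rows - 1)
for (old in seq_len(n_levels)) {
  for (new in seq_len(n_levels)) {
    # TRUE where the price at level `old` reappears at level `new` in the next row
    hit <- prev[, old] != 0 & prev[, old] == curr[, new]
    score <- score + hit * (old - new)
  }
}
score
```

The last element covers the df[4,4] / df[5,2] example: 70905 moves up one level and 70730 also moves up one level, so that row scores +2 under these assumptions.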
Please let me know if any other information is needed.