How do I calculate the percentage of data points that fall within a range of values?

Time: 2012-10-19 16:48:59

Tags: r

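Given a table of values (say, between 0 and 100) and the attached plot, what is the simplest way to use R to calculate how many of the data points fall between the values 20 and 60 (the red box in the image)?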

Is there a way to create the red box using R's plotting functions (I made it with an image editor...)?

Thanks for your help.

2 answers:

Answer 0 (score: 13)

To calculate the probability mass contained in the interval:

x <- rnorm(1e6)  ## data forming your empirical distribution
ll <- -1.96      ## lower bound of interval of interest
ul <- 1.96       ## upper bound of interval of interest

sum(x > ll & x < ul)/length(x)
# [1] 0.949735
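
Applied to the question's own numbers, the same idea might look like the sketch below. The data here are made up (`vals` is a hypothetical stand-in for the table of values), and whether the endpoints 20 and 60 count as inside the range is an assumption; switch `>=`/`<=` to `>`/`<` if they should be excluded.

```r
set.seed(1)                              # assumption: made-up, reproducible data
vals <- runif(1000, min = 0, max = 100)  # hypothetical stand-in for the table of values

in_range <- vals >= 20 & vals <= 60      # TRUE where a value lies in [20, 60]
n_in     <- sum(in_range)                # number of data points in the range
pct_in   <- 100 * mean(in_range)         # percentage of data points in the range
```

Taking `mean()` of a logical vector gives the fraction of `TRUE` values, so it is equivalent to `sum(...)/length(...)` above.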

Then plot the histogram and the red box:

h <- hist(x, breaks=100, plot=FALSE)       # Calculate but don't plot histogram
maxct <- max(h$counts)                     # Extract height of the tallest bar
## Or, if you want the height of the tallest bar within the interval
# start <- findInterval(ll, h$breaks)
# end   <- findInterval(ul, h$breaks)
# maxct <- max(h$counts[start:end])

plot(h, ylim=c(0, 1.05*maxct), col="blue") # Plot, leaving a bit of space up top

rect(xleft = ll, ybottom = -0.02*maxct,    # Add box extending a bit above
     xright = ul, ytop = 1.02*maxct,       # and a bit below the bars
     border = "red", lwd = 2)


Answer 1 (score: 8)

set.seed(42) 
x <- rlnorm(5000) #some data
hist(x) #histogram
rect(7,-50,10,100,border="red") #red rectangle
table(cut(x,breaks=c(0,7,10,Inf)))/length(x) #fraction of values in intervals
#(0,7]    (7,10]   (10,Inf] 
#0.9754   0.0136   0.0110 

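cut classifies each value according to the interval it falls in. table then builds a table of counts per interval, which can be divided by the total count, length(x), to give the fraction of values in each interval.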
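
For the 0 to 100 data in the question, a small sketch of the same approach could look like this. The vector `vals` is a hypothetical toy data set chosen so the result is easy to check by hand. By default `cut` uses right-closed intervals, so 20 falls in the first bin and 60 in the second; `include.lowest = TRUE` is added here so that a value of exactly 0 would not become `NA`.

```r
vals <- c(5, 20, 35, 60, 61, 100)     # hypothetical toy data

bins <- cut(vals,
            breaks = c(0, 20, 60, 100),
            include.lowest = TRUE)    # bins: [0,20], (20,60], (60,100]

counts <- table(bins)                 # number of values per bin
pcts   <- 100 * prop.table(counts)    # percentage of values per bin
```

Here each bin holds two of the six values, so every entry of `pcts` is one third of 100. `prop.table(counts)` is the same as `counts / length(vals)` as long as no value falls outside the breaks.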
