ggplot2 loess line is not drawn for all groups

Time: 2018-04-28 16:56:20

Tags: r ggplot2

I am trying to plot data with an unequal n in each bin. For very unbalanced categories, ggplot2's geom_smooth(method="loess") fails to draw a line. How can I get a line drawn in every panel?

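I can't provide the full dataset, but here is a sample produced with dput(). It contains only the data for the first column on the left of the plot: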

structure(list(SyllPos = structure(c(2L, 2L, 1L, 2L, 1L, 2L, 
2L, 2L, 3L, 2L, 1L, 3L, 3L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 3L, 
2L, 2L, 2L, 3L, 2L, 3L, 2L, 1L, 2L, 1L, 3L, 1L, 2L, 1L, 1L, 2L, 
3L, 2L, 2L, 2L, 1L, 2L, 1L, 3L, 2L, 1L, 3L, 1L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 3L, 2L, 1L, 2L, 3L, 1L, 2L, 2L, 1L), .Label = c("1", 
"2", "3"), class = "factor"), Vowel = structure(c(6L, 6L, 5L, 
1L, 1L, 4L, 4L, 5L, 4L, 4L, 4L, 1L, 6L, 4L, 5L, 1L, 1L, 4L, 4L, 
6L, 6L, 5L, 6L, 6L, 4L, 6L, 5L, 5L, 4L, 1L, 1L, 5L, 1L, 1L, 5L, 
4L, 1L, 4L, 4L, 4L, 4L, 4L, 6L, 6L, 1L, 5L, 5L, 1L, 6L, 5L, 6L, 
6L, 5L, 4L, 6L, 4L, 1L, 4L, 4L, 6L, 4L, 1L, 1L, 4L, 4L, 4L), .Label = c("a", 
"A", "e", "i", "o", "u"), class = "factor"), F0_min = c(175.4793618612, 
161.9387247949, 156.4967046937, 173.1514145171, 159.8804957163, 
175.2917843952, 172.7116138335, 174.6049809487, 195.8368591846, 
195.4172420312, 182.6852946151, 188.4100959521, 188.983672073, 
214.0355579244, 169.23097112, 152.1439895502, 156.621222189, 
175.8042928291, 171.367861216, 193.7238081091, 179.3106662597, 
182.9049959569, 178.1478311468, 171.9863659221, 185.8157515956, 
196.6284794848, 179.9183082837, 180.7406792084, 165.2450513336, 
152.0289582284, 173.1748795491, 168.4186744926, 188.4070592283, 
149.6196463529, 168.7081562312, 179.1505731882, 151.186575271, 
187.8501100842, 208.5895328686, 167.7112210192, 174.5130946688, 
167.1739428511, 189.0655970555, 198.5328530886, 156.4296130688, 
186.6138701847, 173.8934695337, 159.7378035477, 209.431835937, 
172.915664809, 177.6465488766, 188.7978637368, 172.3481292301, 
178.2089953258, 178.2263785249, 178.2383870226, 174.3579576427, 
201.3184127519, 201.7100790628, 194.3845286637, 174.2388389206, 
177.534, 157.5118957437, 173.408586022, 201.8608904925, 202.1619587211
)), .Names = c("SyllPos", "Vowel", "F0_min"), row.names = c(NA, 
-66L), class = c("tbl_df", "tbl", "data.frame"))

Here is the output plot and my code:

plot <- ggplot(mydata, aes(x=as.integer(SyllPos), y=F0_min)) +
    geom_point(shape=1) +
    geom_smooth(method="loess")+
    theme_bw()+
    facet_grid(Vowel ~ SpeakerId)

faceted plots with most but not all cells containing a loess line

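Extra credit to anyone who can explain why the fifth cell from the left in row 3 and the rightmost cell in the last row show large standard errors and wild bumps that do not correspond to any outlying data points.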

1 answer:

Answer 0 (score: 0)

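The problem is that x takes very few distinct values (the integers 1 to 3). The loess method of geom_smooth() uses the span parameter to set the size of the window used to compute the smooth curve. The default is span = 0.75, meaning each local fit uses the nearest 75% of the points in the panel; the default does not depend on the range of x. Because x only takes integer values, in an unbalanced panel that window often does not contain enough distinct x values to fit the local curve. With span = 1, as in the call below, everything works!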

plot <- ggplot(mydata , aes(x=as.integer(SyllPos), y=F0_min)) +
    geom_point(shape=1) +
    geom_smooth(method="loess", span = 1)+
    theme_bw()+
    facet_grid(Vowel ~ SpeakerId)

almost the same graph as in the question, but with loess lines for all categories, and only computed with F0_min instead of F0_min-min_min on the y-axis
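
To see which panels are at risk before plotting, one option is to count the points and distinct x values per panel and to call stats::loess() (the function behind geom_smooth(method="loess")) directly with both spans. The sketch below is only an illustration: the data frame toy and the helper loess_status() are made up here and are not part of the question's data or of any package.

```r
# Toy stand-in for one column of facets. All values are invented.
set.seed(42)
toy <- data.frame(
  Vowel   = rep(c("a", "i"), times = c(9, 12)),
  SyllPos = c(1, 1, 1, 2, 2, 2, 3, 3, 3,             # "a": balanced
              2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 3),   # "i": almost all at 2
  F0_min  = rnorm(21, mean = 175, sd = 10)
)

# Hypothetical helper: fit loess on one panel and report how it went.
# Returns "ok", or the first warning or error message raised by loess().
loess_status <- function(d, span) {
  tryCatch({
    loess(F0_min ~ SyllPos, data = d, span = span)
    "ok"
  },
  warning = function(w) paste("warning:", conditionMessage(w)),
  error   = function(e) paste("error:", conditionMessage(e)))
}

# One row per panel: point count, distinct x values, and loess status
# for the default span (0.75) and for span = 1.
panel_check <- do.call(rbind, lapply(split(toy, toy$Vowel), function(d) {
  data.frame(
    Vowel          = as.character(d$Vowel[1]),
    n              = nrow(d),
    distinct_x     = length(unique(d$SyllPos)),
    status_default = loess_status(d, 0.75),
    status_span1   = loess_status(d, 1),
    stringsAsFactors = FALSE
  )
}))
print(panel_check)
```

Applied to the real data (split by both Vowel and SpeakerId), panels where status_default is not "ok" would be the likely candidates for a missing or erratic line.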

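For the record, I could not have worked this out without stumbling on this answer (to another question, which the community simply voted down): Working with span in ggplot2 / geom_smooth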
