I am trying to plot some data with unequal n in each bin. For very unbalanced categories, ggplot2's geom_smooth(method="loess") fails to draw a line. How can I get a line drawn in every panel?
I can't provide the full dataset, but here is a dput() sample. It contains only the data for the first column on the left of the plot:
structure(list(SyllPos = structure(c(2L, 2L, 1L, 2L, 1L, 2L,
2L, 2L, 3L, 2L, 1L, 3L, 3L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 3L,
2L, 2L, 2L, 3L, 2L, 3L, 2L, 1L, 2L, 1L, 3L, 1L, 2L, 1L, 1L, 2L,
3L, 2L, 2L, 2L, 1L, 2L, 1L, 3L, 2L, 1L, 3L, 1L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 3L, 2L, 1L, 2L, 3L, 1L, 2L, 2L, 1L), .Label = c("1",
"2", "3"), class = "factor"), Vowel = structure(c(6L, 6L, 5L,
1L, 1L, 4L, 4L, 5L, 4L, 4L, 4L, 1L, 6L, 4L, 5L, 1L, 1L, 4L, 4L,
6L, 6L, 5L, 6L, 6L, 4L, 6L, 5L, 5L, 4L, 1L, 1L, 5L, 1L, 1L, 5L,
4L, 1L, 4L, 4L, 4L, 4L, 4L, 6L, 6L, 1L, 5L, 5L, 1L, 6L, 5L, 6L,
6L, 5L, 4L, 6L, 4L, 1L, 4L, 4L, 6L, 4L, 1L, 1L, 4L, 4L, 4L), .Label = c("a",
"A", "e", "i", "o", "u"), class = "factor"), F0_min = c(175.4793618612,
161.9387247949, 156.4967046937, 173.1514145171, 159.8804957163,
175.2917843952, 172.7116138335, 174.6049809487, 195.8368591846,
195.4172420312, 182.6852946151, 188.4100959521, 188.983672073,
214.0355579244, 169.23097112, 152.1439895502, 156.621222189,
175.8042928291, 171.367861216, 193.7238081091, 179.3106662597,
182.9049959569, 178.1478311468, 171.9863659221, 185.8157515956,
196.6284794848, 179.9183082837, 180.7406792084, 165.2450513336,
152.0289582284, 173.1748795491, 168.4186744926, 188.4070592283,
149.6196463529, 168.7081562312, 179.1505731882, 151.186575271,
187.8501100842, 208.5895328686, 167.7112210192, 174.5130946688,
167.1739428511, 189.0655970555, 198.5328530886, 156.4296130688,
186.6138701847, 173.8934695337, 159.7378035477, 209.431835937,
172.915664809, 177.6465488766, 188.7978637368, 172.3481292301,
178.2089953258, 178.2263785249, 178.2383870226, 174.3579576427,
201.3184127519, 201.7100790628, 194.3845286637, 174.2388389206,
177.534, 157.5118957437, 173.408586022, 201.8608904925, 202.1619587211
)), .Names = c("SyllPos", "Vowel", "F0_min"), row.names = c(NA,
-66L), class = c("tbl_df", "tbl", "data.frame"))
Here is the output plot and my code:
library(ggplot2)
plot <- ggplot(mydata, aes(x = as.integer(SyllPos), y = F0_min)) +
  geom_point(shape = 1) +
  geom_smooth(method = "loess") +
  theme_bw() +
  facet_grid(Vowel ~ SpeakerId)
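Bonus points to anyone who can explain why the fifth cell from the left in row 3 and the rightmost cell in the last row show large standard errors / wild bumps that don't correspond to any outlying data points.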
Answer 0 (score: 0)
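The problem is that the range of x values is very small (1-3). The loess method of geom_smooth() uses the span parameter to set the size of the window used to compute the smooth curve. span is the proportion of points used for each local fit, and the default is 0.75, a little under 1. Because x takes only three integer values, a window of that size can fail to contain enough distinct x values to fit the local curve. With span = 1, as in the call below, everything works!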
plot <- ggplot(mydata, aes(x = as.integer(SyllPos), y = F0_min)) +
  geom_point(shape = 1) +
  geom_smooth(method = "loess", span = 1) +
  theme_bw() +
  facet_grid(Vowel ~ SpeakerId)
For the record, I couldn't have worked this out without stumbling on this answer (to another question that the community simply downvoted): Working with span in ggplot2 / geom_smooth
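If you want to look at the effect of span without ggplot2, here is a minimal sketch that calls stats::loess() directly (the function that method="loess" refers to). The toy data, the fit_loess() helper and the variable names are my own assumptions for illustration, not the poster's data; x takes only the values 1, 2 and 3 with unequal counts, like SyllPos. The fits may or may not raise warnings on your R version, so the sketch only collects them for comparison.

```r
# Toy data (hypothetical, not the poster's): x takes only the values 1, 2, 3,
# with unequal counts per value, similar to SyllPos in the question.
set.seed(1)
x <- rep(c(1, 2, 3), times = c(15, 40, 11))
y <- 170 + 8 * x + rnorm(length(x), sd = 10)
toy <- data.frame(x = x, y = y)

# Fit loess with a given span and collect any warnings instead of printing them.
# fit_loess() is a helper defined here for illustration only.
fit_loess <- function(span) {
  msgs <- character(0)
  fit <- withCallingHandlers(
    loess(y ~ x, data = toy, span = span),
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  list(fit = fit, warnings = msgs)
}

default_fit <- fit_loess(0.75)  # the default span of loess()
wide_fit    <- fit_loess(1)     # the span used in the answer

# Compare the warnings raised by each fit.
default_fit$warnings
wide_fit$warnings

# Fitted values of the span = 1 model at the three x positions.
wide_pred <- predict(wide_fit$fit, newdata = data.frame(x = 1:3))
wide_pred
```

Comparing the two warning vectors is a quick way to check whether a given span leaves each local window with enough distinct x values before you pass it to geom_smooth().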