Density plot based on time of day

Asked: 2018-01-23 17:39:25

Tags: r ggplot2 lubridate density-plot

I have the following dataset:

https://app.box.com/s/au58xaw60r1hyeek5cua6q20byumgvmj

I want to create a density plot based on the time of day. This is what I have done so far:

library("ggplot2")
library("scales")
library("lubridate")

timestamp_df$timestamp_time <- format(ymd_hms(hn_tweets$timestamp), "%H:%M:%S")

ggplot(timestamp_df, aes(timestamp_time)) + 
       geom_density(aes(fill = ..count..)) +
       scale_x_datetime(breaks = date_breaks("2 hours"),labels=date_format("%H:%M"))

It gives the following error: Error: Invalid input: time_trans works with objects of class POSIXct only

If I convert the column to POSIXct, a date gets added to the data.

Update 1

The following converts the data to NA:

timestamp_df$timestamp_time <- as.POSIXct(timestamp_df$timestamp_time, format = "%H:%M%:%S", tz = "UTC")

Update 2

This is what I am trying to achieve: [image]

1 Answer:

Answer 0: (score: 0)

Here is one approach:

library(ggplot2)
library(lubridate)
library(scales)

df <- read.csv("data.csv") #given in OP

Convert the character column to POSIXct:

df$timestamp <- as.POSIXct(strptime(df$timestamp, "%m/%d/%Y %H:%M",  tz = "UTC"))

library(hms)

Extract the time of day (hours, minutes and seconds):

df$time <- hms::hms(second(df$timestamp), minute(df$timestamp), hour(df$timestamp))  

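Convert back to POSIXct, because ggplot does not work with class hms. Every value then shares the same dummy date, so only the time of day varies: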

df$time <- as.POSIXct(df$time)
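
To make the effect of these two conversions easier to see, here is a minimal base R sketch on two made-up timestamps (the values and the names `ts`, `secs` and `time_of_day` are illustrative, not taken from the OP's data). It computes the seconds since midnight and places them on 1970-01-01 UTC, which should be equivalent in effect to the hms round trip: all observations end up on one dummy date.

```r
# Hypothetical sample timestamps, in the same format as the data above
ts <- as.POSIXct(strptime(c("1/23/2018 17:39", "1/24/2018 08:05"),
                          "%m/%d/%Y %H:%M", tz = "UTC"))

# Seconds elapsed since midnight for each timestamp
secs <- as.numeric(format(ts, "%H", tz = "UTC")) * 3600 +
  as.numeric(format(ts, "%M", tz = "UTC")) * 60 +
  as.numeric(format(ts, "%S", tz = "UTC"))

# Put every time of day on the same dummy date (1970-01-01, UTC)
time_of_day <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")

# Inspect the result: the date part is identical, only the time differs
format(time_of_day, "%Y-%m-%d %H:%M")
```

Because the date part is constant, a label format of "%H:%M" on the x axis hides it entirely.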


ggplot(df, aes(time)) + 
  geom_density(fill = "red", alpha = 0.5) + #also play with adjust such as adjust = 0.5
  scale_x_datetime(breaks = date_breaks("2 hours"), labels=date_format("%H:%M"))


To scale the density so that its maximum is 1:

ggplot(df) + 
  geom_density( aes(x = time, y = ..scaled..), fill = "red", alpha = 0.5) +
  scale_x_datetime(breaks = date_breaks("2 hours"), labels=date_format("%H:%M"))

Here ..scaled.. is a variable computed by stat_density when the plot is built.
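
In ggplot2 3.3.0 and later the same computed variable can also be written as after_stat(scaled), and the double-dot notation was deprecated in 3.4.0. Below is a sketch of the scaled plot under that assumption, using simulated times in place of the OP's data (the data frame `sim` and its values are made up for illustration):

```r
library(ggplot2)
library(scales)

# Simulated times of day on the dummy date 1970-01-01 (not the OP's data)
set.seed(1)
sim <- data.frame(
  time = as.POSIXct(runif(200, min = 0, max = 86399),
                    origin = "1970-01-01", tz = "UTC")
)

# after_stat(scaled) is the density estimate rescaled to a maximum of 1
p <- ggplot(sim) +
  geom_density(aes(x = time, y = after_stat(scaled)),
               fill = "red", alpha = 0.5) +
  scale_x_datetime(breaks = date_breaks("2 hours"),
                   labels = date_format("%H:%M"))
p
```

The other computed variables of stat_density, such as count, can be mapped the same way if a different y scale is wanted.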
