Question

I am building a data frame out of random entries (rows). This is the function that creates one random entry:

createRandomEntry <- function() {
    names <- c('Dilbert', 'Wally', 'Alice', 'Ashok', 'Topper')
    ages <- 30:45
    return(
        data.frame(
            Name = sample(names, 1),
            Age = sample(ages, 1),
            stringsAsFactors = FALSE
        )
    )
}

I then use this function to combine the entries into one large data.frame:

createRandomEntries <- function(n) {
    df <- createRandomEntry()
    for (i in 2:n) {
        df <- rbind(df, createRandomEntry())
    }
    return(df)
}

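Technically this works, but it is clumsy for several reasons:

- I have to call createRandomEntry() in two places
- I have to use a loop
- The loop has to start at index 2, because the first entry already exists
- Calling rbind repeatedly may be inefficient, I don't know...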

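In an earlier version, createRandomEntry() returned a list instead of a data.frame. I then used replicate() to build a matrix, which first had to be transposed (by calling t()) before a data.frame could be created from it. That data.frame could not be sorted (error: "unimplemented type 'list' in 'orderVector1'"). Calling unlist() on each row, or returning a vector from createRandomEntry(), fixed the sorting problem, but then every column contained only strings.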
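The likely cause of that error is that replicate() simplifies list results into a matrix of mode list, so the columns of the resulting data.frame are lists rather than atomic vectors, and order() cannot sort them. Below is a minimal sketch that keeps a list-returning function but still ends up with typed, sortable columns. The name createRandomEntryList is hypothetical and not part of the original code.

```r
# Hypothetical list-returning variant of the entry function.
createRandomEntryList <- function() {
    list(
        Name = sample(c('Dilbert', 'Wally', 'Alice', 'Ashok', 'Topper'), 1),
        Age = sample(30:45, 1)
    )
}

# simplify = FALSE keeps a plain list of entries instead of a list-matrix.
entries <- replicate(3, createRandomEntryList(), simplify = FALSE)

# Convert each entry to a one-row data.frame, then bind all rows in one call.
df <- do.call(rbind, lapply(entries, as.data.frame, stringsAsFactors = FALSE))

# Columns are now atomic (character / integer), so sorting works.
df[order(df$Age), ]
```

Because each entry is converted separately, Name stays a character column and Age stays an integer column, which avoids the "everything becomes a string" effect of unlist().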

There must be a better way. But how?

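Edit: It is important to have a function that creates a single entry, because some values of an entry may depend on each other, as this enhanced version shows: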

createRandomEntry <- function() {
    names <- c('Dilbert', 'Wally', 'Alice', 'Ashok', 'Topper')
    ages <- 30:45
    startedIn <- sample(1995:2005, 1)
    lostMotivation <- startedIn + sample(1:3, 1)
    return(
        data.frame(
            Name = sample(names, 1),
            Age = sample(ages, 1),
            StartYear = startedIn,
            LostMotivation = lostMotivation,
            stringsAsFactors = FALSE
        )
    )
}
createRandomEntries(3)

This yields:

     Name Age StartYear LostMotivation
1   Ashok  42      1998           2000
2 Dilbert  43      1997           1999
3 Dilbert  30      1996           1999

Answer 1

Why not simply move n from the second function into the first?

createRandomEntries <- function(n) {
    names <- c('Dilbert', 'Wally', 'Alice', 'Ashok', 'Topper')
    ages <- 30:45
    return(
        data.frame(
            Name = sample(names, n, TRUE),
            Age = sample(ages, n, TRUE),
            stringsAsFactors = FALSE
        )
    )
}

Answer 2

Based on Bruno Zamengo's answer, I have now rewritten the function:

createRandomEntries <- function(n) {
    names <- c('Dilbert', 'Wally', 'Alice', 'Ashok', 'Topper')
    ages <- 30:45
    df <- data.frame(
        Name = sample(names, n, replace = TRUE),
        Age = sample(ages, n, replace = TRUE),
        StartYear = sample(1995:2005, n, replace = TRUE),
        stringsAsFactors = FALSE
    )
    df$LostMotivation <- df$StartYear + sample(1:3, n, replace = TRUE)
    return(df)
}

However, I did not use merge as suggested.
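If keeping a function that creates exactly one entry still matters, as the question's edit says, another option is to collect the one-row data.frames in a list and bind them once. This is only a sketch and does not come from either answer. The name createRandomEntriesFromSingle is hypothetical.

```r
# Single-entry generator, as in the question's edited version.
createRandomEntry <- function() {
    names <- c('Dilbert', 'Wally', 'Alice', 'Ashok', 'Topper')
    ages <- 30:45
    startedIn <- sample(1995:2005, 1)
    lostMotivation <- startedIn + sample(1:3, 1)
    data.frame(
        Name = sample(names, 1),
        Age = sample(ages, 1),
        StartYear = startedIn,
        LostMotivation = lostMotivation,
        stringsAsFactors = FALSE
    )
}

# Hypothetical name: build all rows in a list, then bind them in one call.
createRandomEntriesFromSingle <- function(n) {
    rows <- lapply(seq_len(n), function(i) createRandomEntry())
    do.call(rbind, rows)
}

set.seed(1)  # only to make the example reproducible
df <- createRandomEntriesFromSingle(5)
df[order(df$Age), ]
```

This calls createRandomEntry() in one place only and binds once instead of growing the data.frame on every iteration. Using seq_len(n) also behaves sensibly for n = 1, whereas a loop over 2:n counts down (2, 1) and appends two extra rows. The vectorized version above is likely still faster for large n.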

R: Combining data.frames in a more elegant way
