How do I replace items in a list after prediction?

Time: 2017-12-12 21:26:05

Tags: r prediction

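I am trying to use a list to build predictions for missing values and then write those values back into the list. I am happy with the predictions, but I am stuck after that: how do I write the newly predicted values back into my_list?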

#my_list is a list with cars, some are missing MPG

# These cars have no MPG
empty_rows <- subset(my_list, cartable.mpg=='0')

#These have an MPG, we'll use them to build our model
usable_rows <- subset(my_list, cartable.mpg !='0')

#Do a regression based on mpg,cylinders and weight
fitted_lm = lm(as.numeric(cartable.mpg) ~ as.numeric(cartable.cyl)+as.numeric(cartable.wt), usable_rows)

#Predict the missing rows
filled_rows <- predict(fitted_lm, empty_rows)

1 answer:

Answer 0: (score: 1)

Since you did not provide a minimal reproducible dataset, here is an example using mtcars.

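In short, I split mtcars into a training dataset (used to build the model) and a test dataset from which the response variable (mpg in this case) has been removed. I then fit a linear model lm(mpg ~ wt) and use it to predict mpg for the test dataset: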

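# Training sample is half the full sample (16 of 32 rows)
# Note: sample(16) returns a random permutation of 1:16, so the training
# rows are the first 16 rows of mtcars in shuffled order. For a random half
# of all rows, use sample(nrow(mtcars), nrow(mtcars) / 2) instead (the
# output shown below would then differ).
# Set fixed RNG seed for reproducibility
set.seed(2017);
idx <- sample(nrow(mtcars) / 2);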

# Training sample to build the model
df.train <- mtcars[idx, ];

# Test sample without response variable in column 1
df.test <- mtcars[-idx, -1];

# Linear model
fit <- lm(mpg ~ wt, data = df.train);

# Prediction for test sample
pred <- predict(fit, df.test);
df.test <- cbind.data.frame(
    mpg = pred,
    df.test);

# Bind data for training and test sample and flag which one is which
df <- rbind.data.frame(
    cbind.data.frame(df.train, train = TRUE),
    cbind.data.frame(df.test, train = FALSE));

df[, c("mpg", "wt", "train")];
#                         mpg    wt train
#Cadillac Fleetwood  10.40000 5.250  TRUE
#Merc 230            22.80000 3.150  TRUE
#Duster 360          14.30000 3.570  TRUE
#Hornet 4 Drive      21.40000 3.215  TRUE
#Merc 280            19.20000 3.440  TRUE
#Lincoln Continental 10.40000 5.424  TRUE
#Mazda RX4           21.00000 2.620  TRUE
#Merc 450SL          17.30000 3.730  TRUE
#Merc 280C           17.80000 3.440  TRUE
#Mazda RX4 Wag       21.00000 2.875  TRUE
#Hornet Sportabout   18.70000 3.440  TRUE
#Merc 450SE          16.40000 4.070  TRUE
#Valiant             18.10000 3.460  TRUE
#Merc 450SLC         15.20000 3.780  TRUE
#Merc 240D           24.40000 3.190  TRUE
#Datsun 710          22.80000 2.320  TRUE
#Chrysler Imperial   10.17314 5.345 FALSE
#Fiat 128            24.32264 2.200 FALSE
#Honda Civic         26.95458 1.615 FALSE
#Toyota Corolla      25.96479 1.835 FALSE
#Toyota Corona       23.13039 2.465 FALSE
#Dodge Challenger    18.38390 3.520 FALSE
#AMC Javelin         18.76632 3.435 FALSE
#Camaro Z28          16.94420 3.840 FALSE
#Pontiac Firebird    16.92171 3.845 FALSE
#Fiat X1-9           25.51488 1.935 FALSE
#Porsche 914-2       24.59258 2.140 FALSE
#Lotus Europa        27.41348 1.513 FALSE
#Ford Pantera L      19.95856 3.170 FALSE
#Ferrari Dino        21.75818 2.770 FALSE
#Maserati Bora       18.15895 3.570 FALSE
#Volvo 142E          21.71319 2.780 FALSE
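
The example above rebuilds the table with cbind/rbind. If the goal is to write the predictions back into the original object in place, a minimal sketch with logical indexing could look like the following. It assumes my_list is really a data frame and that missing mpg values are coded as 0, as in the question. The names cars, missing_idx and is_missing are hypothetical stand-ins, and the 8 blanked rows are chosen only for illustration.

```r
# Hypothetical stand-in for the asker's data: mtcars with some mpg values
# overwritten by 0, the question's marker for "missing".
cars <- mtcars
set.seed(2017)
missing_idx <- sample(nrow(cars), 8)
cars$mpg[missing_idx] <- 0

# Logical index of the rows that need a prediction
is_missing <- cars$mpg == 0

# Fit the model on the rows with a known mpg only
fit <- lm(mpg ~ cyl + wt, data = cars[!is_missing, ])

# Predict for the missing rows and assign straight back into the same column
cars$mpg[is_missing] <- predict(fit, newdata = cars[is_missing, ])
```

Because the same logical vector selects both the rows passed to predict() and the rows being assigned, the predictions land in the right positions without any reordering. If the missing values are stored as NA rather than 0, is.na(cars$mpg) would take the place of the comparison.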