Collapsing repeated columns into rows

Time: 2015-11-18 00:45:26

Tags: r dataframe reshape

I have a data frame pulled from an API. After some cleaning, it looks like this:

Title   Year  Rating  Title    Year  Rating  Title    Year  Rating
Movie 1 1997  6.7     Movie 2  1987  8.2     Movie 3  2009  7.1

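The column headers repeat, so in this case a single row holds 3 separate entries.

How can I reshape this so that I end up with 3 columns (Title, Year, Rating) and 3 rows (Movie 1, Movie 2, Movie 3)?

What is the simplest way to do this?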

3 answers:

Answer 0: (score: 2)

Convert the input data.frame to a list and split its columns into groups according to their common column names. Then unlist each group of columns to produce a single column per group, and convert the result back to a data.frame. (This also works if there are multiple rows in DF.)

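as.data.frame(lapply(split(as.list(DF), names(DF)), unlist))

giving:

  Rating  Title Year
1    6.7 Movie1 1997
2    8.2 Movie2 1987
3    7.1 Movie3 2009

Note: the output above assumes an input data.frame DF laid out as in the question, a single row with the Title, Year and Rating columns repeated three times.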
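The same column-grouping idea can be written so that the columns keep their original order (Title, Year, Rating) instead of the alphabetical order that split() produces. This is only a minimal sketch: the input DF is a hypothetical reconstruction of the layout in the question, and the helper name collapse_repeated is illustrative, not part of the answer.

```r
# Hypothetical single-row input with repeated column names (an assumption
# mirroring the question). check.names = FALSE keeps the duplicate names.
DF <- data.frame(
  Title = "Movie1", Year = 1997L, Rating = 6.7,
  Title = "Movie2", Year = 1987L, Rating = 8.2,
  Title = "Movie3", Year = 2009L, Rating = 7.1,
  check.names = FALSE, stringsAsFactors = FALSE
)

# Illustrative helper: stack all columns that share a name into one column,
# keeping the names in order of first appearance.
collapse_repeated <- function(df) {
  cols <- unique(names(df))
  stacked <- lapply(cols, function(nm) {
    # Pick every column called nm and concatenate them into one vector.
    unlist(unname(as.list(df)[names(df) == nm]))
  })
  names(stacked) <- cols
  as.data.frame(stacked, stringsAsFactors = FALSE)
}

result <- collapse_repeated(DF)
print(result)
```

With several input rows the values are stacked group by group (all rows of the first Title/Year/Rating group, then the second, and so on), so the three output columns stay aligned with each other.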

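Answer 1: (score: 2)

I think that if you got this data from an API, something must have gone wrong in your cleaning. You have lost all the information for determining which rating and which title go with which movie, apart from the column order.

But anyway, you can do this: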

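library(dplyr)
library(tidyr)

data %>%
  gather(variable, value) %>%
  # gather() stacks the columns left to right, so in the single-row example
  # each run of three values (Title, Year, Rating) belongs to one movie
  mutate(ID = rep(1:3, each = 3)) %>%
  spread(variable, value)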
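A long-to-wide reshape needs every (ID, variable) pair to be unique, so the ID has to mark the movie rather than the field. The base R sketch below compares two ways of building such an ID for the single-row example. It assumes the values are stacked column by column in their original left-to-right order, and the vector names variable, id_cycle and id_block are illustrative only.

```r
# Column names of the single-row example, in their original left-to-right order.
variable <- rep(c("Title", "Year", "Rating"), times = 3)

# Cycling pattern 1,2,3,1,2,3,...: the ID changes with every value,
# so it tracks the field (Title/Year/Rating), not the movie.
id_cycle <- rep(1:3, length.out = length(variable))

# Block pattern 1,1,1,2,2,2,...: three consecutive values share an ID,
# one block per movie.
id_block <- rep(1:3, each = 3)

# Check whether any (ID, variable) pair occurs more than once.
cycle_has_dups <- anyDuplicated(data.frame(id_cycle, variable)) > 0
block_has_dups <- anyDuplicated(data.frame(id_block, variable)) > 0

print(cycle_has_dups)
print(block_has_dups)
```

Only the block pattern gives unique pairs here, which is why it suits data where one row holds several Title/Year/Rating groups side by side.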

Answer 2: (score: 1)

This can be done with melt from data.table, which can take multiple columns in measure by specifying patterns:

library(data.table)  # v1.9.6+
melt(setDT(df1), measure = patterns('Title', 'Year', 'Rating'),
     value.name = c('Title', 'Year', 'Rating'))[, variable := NULL][]
#     Title Year Rating
#1: Movie 1 1997    6.7
#2: Movie 2 1987    8.2
#3: Movie 3 2009    7.1
Data

df1 <- structure(list(Title = "Movie 1", Year = 1997L, Rating = 6.7,
Title = "Movie 2", Year = 1987L, Rating = 8.2, Title = "Movie 3",
Year = 2009L, Rating = 7.1), .Names = c("Title", "Year",
"Rating", "Title", "Year", "Rating", "Title", "Year", "Rating"
), class = "data.frame", row.names = c(NA, -1L))
