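Create a new column by aggregating multiple columns in R

Time: 2020-05-10 23:45:38

Tags: r dplyr tidyr

Background

I have a dataset, df, in which I would like to aggregate multiple columns and create a new column. I need to multiply the "Type", "Span", and "Population" columns and create a new "Output" column.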

ID       Status      Type     Span   State   Population

A        Yes         2        70%    Ga      10000

Desired output

ID        Status     Type      Span   State   Population   Output

A         Yes        2         70%    Ga      10000        14000      

dput

structure(list(ID = structure(1L, .Label = "A ", class = "factor"), 
Status = structure(1L, .Label = "Yes", class = "factor"), 
Type = 2L, Span = structure(1L, .Label = "70%", class = "factor"), 
State = structure(1L, .Label = "Ga", class = "factor"), Population = 10000L), class = "data.frame", 
row.names = c(NA, 
-1L))

This is what I have tried

 df %>% 
 mutate(Output = Type * Span * Population)

2 Answers:

Answer 0: (score: 2)

Here, we are creating a new column based on input from different columns. We can just use mutate to get the 'Span' percentage of 'Population' and then multiply by 'Type'. Note that 'Span' is not numeric because it contains '%', so we extract the numeric part with parse_number, divide by 100, and then multiply by 'Population' and 'Type'.

library(dplyr)
df %>%
  mutate(Output = Type * Population * readr::parse_number(as.character(Span))/100)
#  ID Status Type Span State Population Output
#1 A     Yes    2  70%    Ga      10000  14000

If the columns 'Type' and 'Population' are not numeric, it is better to convert them to numeric first (assuming they are of factor class). Another option is to convert the column classes first and then work on the data set with the modified classes.
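The conversion to numeric mentioned above could look roughly like this. It is a minimal base R sketch, assuming 'Type' and 'Population' arrived as factors. The data frames df2 and df3 are made up for illustration, and as.numeric(as.character(...)) and type.convert(..., as.is = TRUE) are my choice of calls, not necessarily the ones the answer had in mind.

```r
# Hypothetical data: Type and Population stored as factors (assumption)
df2 <- data.frame(
  Type = factor("2"),
  Span = factor("70%"),
  Population = factor("10000")
)

# Go through character first: as.numeric() applied directly to a factor
# returns the internal level codes, not the printed values
df2$Type <- as.numeric(as.character(df2$Type))
df2$Population <- as.numeric(as.character(df2$Population))

# Strip the percent sign from Span and rescale it to a proportion
span_prop <- as.numeric(sub("%", "", as.character(df2$Span), fixed = TRUE)) / 100

df2$Output <- df2$Type * df2$Population * span_prop

# Alternative: let type.convert() pick suitable classes for every column
df3 <- type.convert(
  data.frame(Type = factor("2"), Population = factor("10000")),
  as.is = TRUE
)
```

Going through as.character() matters here: calling as.numeric() on the factor itself would silently give 1 for both columns, because each factor has a single level.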

Answer 1: (score: 2)

We can use sub to remove the '%' sign, convert the result to numeric, and multiply the values.

This can be done in base R:

df$output <- with(df, Type * as.numeric(sub('%', '', Span)) * Population/100)
df

#  ID Status Type Span State Population  output
#1 A     Yes    2  70%    Ga      10000   14000
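Because sub and as.numeric are vectorised, the same approach should carry over to data with more than one row. A small sketch, assuming a made-up three-row data frame; the helper name pct_to_prop and all values in df_multi are hypothetical and not from the question.

```r
# Hypothetical helper: turn strings such as "70%" into proportions (0.7)
pct_to_prop <- function(x) {
  as.numeric(sub("%", "", as.character(x), fixed = TRUE)) / 100
}

# Made-up example data, not taken from the question
df_multi <- data.frame(
  Type = c(2L, 1L, 3L),
  Span = c("70%", "50%", "10%"),
  Population = c(10000L, 2000L, 500L)
)

# Row-wise product of Type, Span (as a proportion), and Population
df_multi$output <- with(df_multi, Type * pct_to_prop(Span) * Population)
```

Wrapping the conversion in a helper keeps the as.character() step in one place, so the same code works whether Span is a character or a factor column.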