Question

I am using dplyr in an R script that is called from a SQL Server stored procedure. The query output I am analyzing looks like this:

So far, my R script and T-SQL query are:

sqlQuerySensory <- "Select
        c.StudyID, c.RespID, c.ProductNumber, c.ProductSequence, c.BottomScaleValue, 
        c.BottomScaleAnchor, c.TopScaleValue, c.TopScaleAnchor, c.StudyDate,
        c.DayOfWeek, c.A, c.B, c.C, c.D, c.E, c.F,
        c.DependentVarYN, c.VariableAttributeID, c.VarAttributeName, c.[1] as c1, 
        c.[2] as c2, c.[3] as c3, c.[4] as c4, c.[5] as c5, c.[6] as c6, c.[7] as c7, c.[8] as c8
        from ClosedStudyResponses c
        --Sensory Value Attributes only for mean and standard deviation analytics.
        where VariableAttributeID = 1
        and c.StudyID = 21"

 x = sqlQuerySensory

 codemean <- function(x) {
   '%>%' = magrittr::'%>%'
   dplyr::group_by(x, .data$code)
   dplyr::summarize_at(dplyr::vars(dplyr::matches("c\\d+")), mean)
   return ()
 }
 result <- codemean(x = x)

 OutputDataSet <- result$x

The R Interactive window shows the following errors:

Error in UseMethod("group_by_") : 
no applicable method for 'group_by_' applied to an object of class 
"character"
Error: object 'result' not found

How can I modify the dplyr script to produce the mean of column c3 for a given set of responses?
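For reference, the error message says that `group_by` received an object of class "character", which is consistent with `x` holding the SQL text rather than the rows the query returns. The following is a minimal sketch of that distinction, not a tested fix for the script above. The `responses` data frame and its values are made up for illustration and stand in for the real query result; it assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical stand-in for the query result; the values are invented.
responses <- data.frame(
  StudyID = c(21, 21, 21),
  c1 = c(1, 2, 3),
  c3 = c(4, 5, 9)
)

# The SQL text is only a character string, so dplyr verbs such as
# group_by() have no method to dispatch on it.
sql_text <- "Select c.StudyID, c.[3] as c3 from ClosedStudyResponses c"
class(sql_text)

# Once the rows are in a data frame, the mean of c3 can be summarised directly.
c3_mean <- responses %>%
  summarise(c3_mean = mean(c3, na.rm = TRUE))

print(c3_mean)
```

In this sketch the pipe result is assigned to a variable. The original function discards the results of `group_by` and `summarize_at` and then calls `return ()` with no value.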

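Update: I have corrected most of the script using feedback from the comments and the documentation. I am also testing with a static StudyID, and I now get a full summary of every column in the query. The revised T-SQL is as follows: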

ALTER PROCEDURE [dbo].[spCodeMeans]
-- Add the parameters for the stored procedure here
@StudyID int,
@StudyID_outer int OUT


AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;

-- Insert statements for procedure here
Declare @inquery nvarchar(max) = N'Select
        c.StudyID, c.RespID, c.ProductNumber, c.ProductSequence, c.BottomScaleValue, 
        c.BottomScaleAnchor, c.TopScaleValue, c.TopScaleAnchor, c.StudyDate,
        c.DayOfWeek, c.A, c.B, c.C, c.D, c.E, c.F,
        c.DependentVarYN, c.VariableAttributeID, c.VarAttributeName, c.[1] as c1, 
        c.[2] as c2, c.[3] as c3, c.[4] as c4, c.[5] as c5, c.[6] as c6, c.[7] as c7, c.[8] as c8
        from ClosedStudyResponses c
        --Sensory Value Attributes only for mean and standard deviation analytics.
        where VariableAttributeID = 1
        and c.StudyID = 21'
        ;
BEGIN TRY

        exec sp_execute_external_script
        @language = N'R',
        @script = N'
            OutputDataSet = data.frame(summary(InputDataSet))',
@input_data_1 = @inquery

END TRY

BEGIN CATCH
    THROW;
END CATCH
END

Given my stored procedure above, how can I return the means of columns c1 through c8, grouped by StudyID?
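One possible shape for the R side of this is sketched below, under several assumptions: dplyr is installed in the R library that SQL Server uses, the columns arrive with the aliases `c1` to `c8` from the query, and those columns are numeric. The `InputDataSet` built here is a made-up stand-in so the snippet runs on its own; inside `sp_execute_external_script` that data frame is supplied by `@input_data_1`. The helper name `code_means` is hypothetical.

```r
library(dplyr)

# Hypothetical stand-in for InputDataSet; in the stored procedure this
# data frame comes from @input_data_1. The values are invented.
InputDataSet <- data.frame(
  StudyID = c(21, 21, 22, 22),
  c1 = c(1, 3, 5, 7),
  c2 = c(2, 4, 6, 8)
)

# Group by StudyID, then average every column named "c" followed by digits.
code_means <- function(df) {
  df %>%
    group_by(StudyID) %>%
    summarise_at(vars(matches("^c\\d+$")), mean, na.rm = TRUE)
}

# sp_execute_external_script expects a plain data frame as OutputDataSet.
OutputDataSet <- as.data.frame(code_means(InputDataSet))

print(OutputDataSet)
```

If the body of this sketch were placed inside the `@script = N'...'` literal, any single quotes in the R code would need to be doubled. The sketch uses double quotes throughout to avoid that.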

How do I compute the mean of column values using dplyr?

0 answers: