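I'm using dplyr in an R script that is called from a SQL Server stored procedure. The output of the query I'm analyzing is as follows:
So far, my R script and T-SQL query are: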
sqlQuerySensory <- "Select
c.StudyID, c.RespID, c.ProductNumber, c.ProductSequence, c.BottomScaleValue,
c.BottomScaleAnchor, c.TopScaleValue, c.TopScaleAnchor, c.StudyDate,
c.DayOfWeek, c.A, c.B, c.C, c.D, c.E, c.F,
c.DependentVarYN, c.VariableAttributeID, c.VarAttributeName, c.[1] as c1,
c.[2] as c2, c.[3] as c3, c.[4] as c4, c.[5] as c5, c.[6] as c6, c.[7] as c7, c.[8] as c8
from ClosedStudyResponses c
--Sensory Value Attributes only for mean and standard deviation analytics.
where VariableAttributeID = 1
and c.StudyID = 21"
x = sqlQuerySensory
codemean <- function(x) {
  '%>%' = magrittr::'%>%'
  dplyr::group_by(x, .data$code)
  dplyr::summarize_at(dplyr::vars(dplyr::matches("c\\d+")), mean)
  return ()
}
result <- codemean(x = x)
OutputDataSet <- result$x
The R Interactive window shows the following error:
Error in UseMethod("group_by_") :
no applicable method for 'group_by_' applied to an object of class
"character"
Error: object 'result' not found
How do I modify the dplyr script to produce the mean of column c3 for a given response set?
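For reference, the error message is consistent with `x` holding the SQL text (a character string) instead of a data frame, so `group_by` has nothing to dispatch on. The function also does not pass the grouped result on to the summarizing step, and `return()` with no argument yields `NULL`. Below is a minimal sketch of what a working version might look like, assuming the query results are already in a data frame. The `responses` data and the choice of `StudyID` as the grouping column are assumptions for illustration; the query shown does not select a `code` column.

```r
library(dplyr)

# Hypothetical stand-in for the rows the query would return; in the real
# script this data frame would come from the database, not from the SQL text.
responses <- data.frame(
  StudyID = c(21, 21, 21),
  RespID  = c(1, 2, 3),
  c1 = c(5, 7, 6),
  c2 = c(4, 4, 7),
  c3 = c(2, 4, 6)
)

# Group the rows, then average every column whose name is "c" plus digits.
# Grouping by StudyID is an assumption; the original code grouped by `code`.
codemean <- function(df) {
  df %>%
    group_by(StudyID) %>%
    summarise_at(vars(matches("^c\\d+$")), mean, na.rm = TRUE)
}

result <- codemean(responses)
result$c3  # mean of c3 within each StudyID
```

The pipe carries the grouped data into `summarise_at`, and the last expression in the function is its return value, so no explicit `return()` is needed.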
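Update: I've corrected most of the script using feedback from the comments I received and the documentation. I'm also testing with a static StudyID, and I now get a full summary of every column in the query. The revised T-SQL is as follows: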
ALTER PROCEDURE [dbo].[spCodeMeans]
    -- Add the parameters for the stored procedure here
    @StudyID int,
    @StudyID_outer int OUT
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;
    -- Insert statements for procedure here
    Declare @inquery nvarchar(max) = N'Select
        c.StudyID, c.RespID, c.ProductNumber, c.ProductSequence, c.BottomScaleValue,
        c.BottomScaleAnchor, c.TopScaleValue, c.TopScaleAnchor, c.StudyDate,
        c.DayOfWeek, c.A, c.B, c.C, c.D, c.E, c.F,
        c.DependentVarYN, c.VariableAttributeID, c.VarAttributeName, c.[1] as c1,
        c.[2] as c2, c.[3] as c3, c.[4] as c4, c.[5] as c5, c.[6] as c6, c.[7] as c7, c.[8] as c8
        from ClosedStudyResponses c
        --Sensory Value Attributes only for mean and standard deviation analytics.
        where VariableAttributeID = 1
        and c.StudyID = 21'
    ;
    BEGIN TRY
        exec sp_execute_external_script
            @language = N'R',
            @script = N'
                OutputDataSet = data.frame(summary(InputDataSet))',
            @input_data_1 = @inquery
    END TRY
    BEGIN CATCH
        THROW;
    END CATCH
END
Given my stored procedure above, how do I return the means of columns c1 through c8, grouped by StudyID?
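One possible shape for the R portion is sketched below in base R, so it does not depend on any package being installed in the SQL Server R runtime. It assumes `InputDataSet` arrives with the column names produced by the query above (`StudyID`, `c1` through `c8`). The small data frame at the top is a made-up stand-in so the snippet can run on its own, and `value_cols` is a name introduced only for this illustration.

```r
# Hypothetical stand-in for InputDataSet; inside sp_execute_external_script
# SQL Server supplies this data frame from @input_data_1.
InputDataSet <- data.frame(
  StudyID = c(21, 21, 22),
  c1 = c(1, 3, 5),
  c2 = c(2, 4, 6)
)

# Pick the response columns by name pattern (c1, c2, ...).
value_cols <- grep("^c[0-9]+$", names(InputDataSet), value = TRUE)

# Mean of each response column within each StudyID.
OutputDataSet <- aggregate(
  InputDataSet[value_cols],
  by = list(StudyID = InputDataSet$StudyID),
  FUN = mean,
  na.rm = TRUE
)
```

Only the lines after the stand-in data frame would go inside `@script`. They contain no single quotes, so they should not need escaping inside the `N'...'` literal. Since the query currently hard-codes `c.StudyID = 21`, the result would contain a single group until the filter is changed.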