I am trying to run a frequency analysis on exam scores. My dataset contains students and their recorded scores.
It looks like this:
student, score
1, 1
2, 1
3, 1
4, 3
5, 3
6, 3
7, 4
8, 4
9, 4
10, 4
I run this code:
proc freq data=stuff;
tables score;
run;
Output:
score, freq, pct, cum.freq., cum.pct.
1, 3, .3, 3, .3
3, 3, .3, 6, .6
4, 4, .4, 10, 1
What I want to show:
score, freq, pct, cum.freq., cum.pct.
1, 3, .3, 3, .3
2, 0, 0, 3, .3
3, 3, .3, 6, .6
4, 4, .4, 10, 1
5, 0, 0, 10, 1
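Is there a way to do this in SAS? Thanks for your help.
Answer 0: (score: 0)
There is always a way. Here are a few approaches. They either build the table by hand or preprocess the input before asking SAS to summarise it.
The following datasets are used as input in the approaches below: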
/* Your input data */
data have;
infile datalines dlm = ",";
input student score;
datalines;
1,1
2,1
3,1
4,3
5,3
6,3
7,4
8,4
9,4
10,4
;
run;
%let max = 5;
/* Create a dummy set of all the desired groups */
data numbers;
do score = 1 to &max;
output;
end;
run;
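This method adds dummy values to the original data and uses the weight statement of proc freq to decide which rows should contribute to the count (this is the method I prefer):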
/* Combine the inputs and add weight variable */
data input;
set
have (in = a)
numbers;
/* Set weight so that empty groups won't be counted */
weight = ifn(a, 1, 0);
run;
/* Create the report */
proc freq data = input;
table score;
/* Only add to the count when weight = 1, include 0 rows */
weight weight / zeros;
run;
This method builds the table manually:
/* Get the total count and each group's count */
proc summary data = have;
class score;
output out = counts;
run;
/* Combine and create the summary stats */
data want;
merge
counts (rename = (_FREQ_ = freq))
numbers;
by score;
/* Keep the total and cumulative total when new rows are loaded */
retain total cumsum;
/* Set up the total and delete the total row */
if _TYPE_ = 0 then do;
total = freq;
delete;
end;
/* Prevent missing values by adding 0 */
freq + 0;
/* Calculate stats */
pct = freq / total;
cumsum + freq;
cumpct = cumsum / total;
drop _TYPE_ total;
run;
proc print;run;
This method modifies the proc freq output to add the extra rows:
/* Create the frequency report as a dataset */
proc freq data = have;
table score;
/* Output as dataset */
ods output OneWayFreqs = freqs (drop = table f_score);
run;
/* Combine and build the extra rows */
data want (drop = _:);
merge
freqs (in = f)
numbers;
by score;
/* Set up temporary variables for storing cumulatives */
retain _lp _lf;
/* Store cumulatives */
if f then do;
_lf = CumFrequency;
_lp = CumPercent;
end;
/* Put stored values into new rows */
else do;
Frequency = 0;
Percent = 0;
CumFrequency = _lf;
CumPercent = _lp;
end;
run;
proc print; run;
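Joe's suggestion above can also help with your problem, but unfortunately it cannot calculate the cumulative columns. Here is a partial solution using Joe's method:
/* Create a format showing each of the desired levels */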
proc format;
value score
1='1'
2='2'
3='3'
4='4'
5='5';
run;
proc tabulate data = have;
/* Specify that every formatted value should be included */
class score / preloadfmt;
/* Apply the format */
format score score.;
tables score,
/* Request the output table include frequency and percent */
(n="freq" pctn="pct" ) /
/* Include missings and display them as 0s */
printmiss misstext="0";
run;
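If the cumulative columns are needed as well, one possible extension of the format-based idea is to let proc summary build the counts and then hand them to proc freq as weights. This is only a sketch that I have not run against your data: it assumes the have dataset and the score. format defined above, and the dataset name counts is just an illustrative choice. The completetypes option together with preloadfmt should give every formatted level a row, with _FREQ_ = 0 for the empty ones, and the zeros option keeps those rows in the proc freq table.

```sas
/* Assumes: dataset HAVE and format SCORE. already exist (see above) */

/* Count each formatted level; COMPLETETYPES + PRELOADFMT keep empty levels */
proc summary data = have nway completetypes;
    class score / preloadfmt;
    format score score.;
    /* _FREQ_ holds the count per level, 0 for levels with no students */
    output out = counts (drop = _TYPE_);
run;

/* Use the counts as weights; ZEROS keeps the zero-count levels in the table */
proc freq data = counts;
    tables score;
    weight _FREQ_ / zeros;
run;
```

Because proc freq produces the table itself here, the cumulative frequency and cumulative percent columns come along without any manual arithmetic. Note that proc freq reports percentages on a 0 to 100 scale rather than as fractions.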