How can I show values with a frequency of zero when using Proc Freq?

Asked: 2015-01-31 00:32:39

Tags: sas

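I am trying to run a frequency analysis on exam scores. My dataset has one record per student, with that student's score.

It looks like this: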

student,  score
1,           1
2,           1
3,           1
4,           3
5,           3
6,           3
7,           4
8,           4
9,           4
10,          4

I run this code:

proc freq data=stuff;
tables score;
run;

Output:

score, freq, pct, cum.freq., cum.pct.
1,      3,    .3,    3,        .3
3,      3,    .3,    6,        .6
4,      4,    .4,    10,        1

I would like it to show:

score, freq, pct, cum.freq., cum.pct.
1,      3,    .3,    3,        .3
2,      0,     0,    3,        .3
3,      3,    .3,    6,        .6
4,      4,    .4,    10,        1
5,      0,     0,    10,        1

Is there a way to do this in SAS? Thanks for your help.

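1 Answer:

Answer 0: (score: 0)

There is always a way. Here are a few approaches. They either build the table by hand or preprocess the input before asking SAS to summarise it.

The following datasets are used as input to the methods below: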

/* Your input data */
data have;
    infile datalines dlm = ",";
    input student score;
    datalines;
1,1
2,1
3,1
4,3
5,3
6,3
7,4
8,4
9,4
10,4
;
run;
%let max = 5;
/* Create a dummy set of all the desired groups */
data numbers;
    do score = 1 to &max;
        output;
    end;
run;

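This method appends dummy rows to the original data and uses the weight statement in proc freq to decide which rows contribute to the counts (this is the approach I prefer):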

/* Combine the inputs and add weight variable */
data input;
    set 
        have (in = a) 
        numbers;
    /* Set weight so that empty groups won't be counted */
    weight = ifn(a, 1, 0);
run;
/* Create the report */
proc freq data = input;
    table score;
    /* Only add to the count when weight = 1, include 0 rows */
    weight weight / zeros;
run;

This method builds the table by hand:

/* Get the total count and each group's count */
proc summary data = have;
    class score;
    output out = counts;
run;
/* Combine and create the summary stats */
data want;
    merge 
        counts (rename = (_FREQ_ = freq)) 
        numbers;
    by score;
    /* Keep the total and cumulative total when new rows are loaded */
    retain total cumsum;
    /* Set up the total and delete the total row */
    if _TYPE_ = 0 then do; 
        total = freq;
        delete;
    end;
    /* Prevent missing values by adding 0 */
    freq + 0;
    /* Calculate stats */
    pct = freq / total; 
    cumsum + freq;
    cumpct = cumsum / total;
    drop _TYPE_ total;
run;
proc print;run;

This method post-processes the proc freq output to add the extra rows:

/* Create the frequency report as a dataset */
proc freq data = have;
    table score;
    /* Output as dataset */
    ods output OneWayFreqs = freqs (drop = table f_score);
run;
/* Combine and build the extra rows */
data want (drop = _:);
    merge 
        freqs (in = f)
        numbers;
    by score;
    /* Set up temporary variables for storing cumulatives */
    retain _lp _lf;
    /* Store cumulatives */
    if f then do;
        _lf = CumFrequency;
        _lp = CumPercent;
    end;
    /* Put stored values into new rows */
    else do;
        Frequency = 0;
        Percent = 0;
        CumFrequency = _lf;
        CumPercent = _lp;
    end;
run;
proc print; run;

Joe's suggestion above can also help, but unfortunately it cannot compute the cumulative columns. Here is a partial solution using Joe's approach:

/* Create a format showing each of the desired levels */
proc format;
    value score
        1='1'
        2='2'
        3='3'
        4='4'
        5='5';
quit;
proc tabulate data = have;
    /* Specify that every formatted value should be included */
    class score / preloadfmt;
    /* Apply the format */
    format score score.;
    tables score, 
        /* Request the output table include frequency and percent */
        (n="freq" pctn="pct") / 
        /* Include missings and display them as 0s */
        printmiss misstext="0";
run;
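
The format in the proc tabulate approach lists every level by hand, which gets tedious if &max is large. As a rough sketch (not part of Joe's suggestion), the same format could probably be generated from the numbers dataset with the CNTLIN= option of proc format. The dataset name score_fmt is made up for illustration, and the sketch assumes the numbers dataset created earlier already exists.

```sas
/* Build a control dataset describing the format (score_fmt is a hypothetical name) */
data score_fmt;
    set numbers;
    /* fmtname and type identify a numeric format called score. */
    retain fmtname "score" type "N";
    /* Each level maps to its own value as a label */
    start = score;
    label = strip(put(score, best12.));
    keep fmtname type start label;
run;
/* Create the format from the control dataset */
proc format cntlin = score_fmt;
run;
```

If this runs before the proc tabulate step, it takes the place of the hand-written value statement, and the preloadfmt option should then pick up all levels from 1 to &max.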