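Capture the name of an imported file in a variable in SAS

Time: 2018-05-18 19:17:32

Tags: sas sas-macro

I am trying to find a way to capture, in a variable, the name of the file I am importing into SAS, because I want to use it to build the name of the export file. For example, I have the file TEST.xlsx in the directory c:\TEMP, which I want to import into SAS and, after some manipulation of the data, export the result as TEST-01.xlsx. Could someone please help me do this?

Thank you, Dan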

PROC IMPORT DATAFILE= "c:\TEMP\*.xlsx" DBMS=xlsx out=TABLE_START  REPLACE;          
RUN;

3 answers:

Answer 0: (score: 1)

First, find the name.

data _null_;
  length fname $300 ;
  infile "c:\TEMP\*.xlsx" filename=fname;
  input @;
  call symputx('fname',fname);
  stop;
run;

Then you can use the file name in the IMPORT step or in other steps.

PROC IMPORT DATAFILE= "&fname" DBMS=xlsx out=TABLE_START  REPLACE;          
RUN;
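To get from the captured name to the export name the question asks for (TEST-01.xlsx), the path in &fname can be split with macro functions. The following is only a sketch under some assumptions: &fname holds a full path such as c:\TEMP\TEST.xlsx, the file name contains a single dot, and the macro variable names file, base, dir and outfile are invented here for illustration. It exports TABLE_START; substitute whatever data set your manipulation produces.

```sas
/* Assumes &fname was set by the DATA _NULL_ step above,
   e.g. c:\TEMP\TEST.xlsx */

/* Last path component, e.g. TEST.xlsx */
%let file = %qscan(%superq(fname), -1, %str(\/));

/* Text before the first dot, e.g. TEST
   (a file name with several dots would be cut short) */
%let base = %qscan(&file, 1, %str(.));

/* Everything before the file name, e.g. c:\TEMP\ */
%let dir = %qsubstr(%superq(fname), 1,
             %eval(%length(%superq(fname)) - %length(&file)));

/* Export name, e.g. c:\TEMP\TEST-01.xlsx */
%let outfile = %unquote(&dir.&base.-01.xlsx);

proc export data=TABLE_START outfile="&outfile" dbms=xlsx replace;
run;
```

The quoting functions (%superq, %qscan, %qsubstr) are used so that characters in the path are not interpreted by the macro processor.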

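Answer 1: (score: 0)

I did not know you could use a wildcard in PROC IMPORT. As far as I know, there is no easy way to capture this information when importing data this way. Wildcards are normally used on an INFILE statement, where you are trying to read several files at once. If the folder holds more than one XLSX file, I also don't know which one would be read; it appears to be the first. Unfortunately, the XLSX engine doesn't add any further information to the log either.

I think this means you need to change your approach entirely. If you need the file name later, I would recommend this approach: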

%let myfile = 'path to your excel file';

proc import out=want datafile=&myFile. dbms=xlsx replace; run;

data want2;
   set want;
   source = &myfile;
run;

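If this were not an Excel file there would be other options, but unfortunately it is XLSX. If you were importing multiple CSV files, for example, the FILEVAR and FILENAME options of the INFILE statement would be available.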
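For the CSV case, a minimal sketch of the FILENAME= option might look like the following. The directory c:\TEMP comes from the question; the layout (a header row followed by two numeric columns named id and value) and the data set name all_csv are purely hypothetical.

```sas
/* Hypothetical layout: each CSV has one header row and two numeric columns */
data all_csv;
  length fname source $300;
  /* FILENAME= stores the path of the file currently being read in fname */
  infile "c:\TEMP\*.csv" dsd truncover filename=fname;
  input @;                               /* load the record and hold it */
  if fname ne lag(fname) then delete;    /* first record of each file is its header */
  source = fname;                        /* fname itself is not written to the output */
  input id value;
run;
```

The variable named in FILENAME= is not kept in the output data set, which is why its value is copied to source.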

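If the name were in the log you could capture it in a roundabout way, but that doesn't seem to work either. This seems a bit odd to me, so I'm going to repost it on communities.sas.com to see whether SAS or other advanced users have suggestions.

Answer 2: (score: 0)

It is not clear from the OP whether one or several Excel files must be processed.

Another way to answer this question is to reference the file through a FILENAME statement with the PIPE option, which captures the result of an operating system command such as ls in Linux or dir in Windows.

If several files match the wildcard listed in the OP, and each of them must be processed, you have to use a technique that assigns the multiple file names to SAS macro variables.

As example data, I'll use Excel versions of the data from Alberto Barradas's Pokémon with Stats database on Kaggle. I saved three Excel files, containing Generation 1, Generation 2, and all Pokémon, to a subdirectory on a Windows-based computer.

To import the set of files, we first generate a list of the file names and then read it into a SAS data set.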

%let dirname = /folders/myshortcuts/sf_gitrepos/pokemonData;
filename DIRLIST pipe "dir /B &dirname\*.xlsx";

data dirlist;
   length fname $256;
   infile dirlist length = reclen;
   input fname $varying256. reclen;
run;

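Note that if your SAS installation runs in LOCKDOWN mode, it is restricted from issuing operating system commands. In that case, you have to go to the operating system, issue the command to generate the directory listing, and save it to a file. In Windows, it looks like: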

dir /b *.xlsx > excelfiles.txt

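Note that the /b option means "bare" and prints the directory listing without summary information. Since I used SAS University Edition for this answer, I had to use the following code to read the file names into SAS.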

data dirlist;
   length fname $256;
   infile "/folders/myshortcuts/sf_gitrepos/pokemonData/excelfiles.txt" length=reclen;
   input fname $varying256. reclen;
run;

SAS log output:

1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 72         
 73         data dirlist;
 74            length fname $256;
 75            infile "/folders/myshortcuts/sf_gitrepos/pokemonData/excelfiles.txt" length=reclen;
 76            input fname $varying256. reclen;
 77         run;

 NOTE: The infile "/folders/myshortcuts/sf_gitrepos/pokemonData/excelfiles.txt" is:
       Filename=/folders/myshortcuts/sf_gitrepos/pokemonData/excelfiles.txt,
       Owner Name=root,Group Name=vboxsf,
       Access Permission=-rwxrwx---,
       Last Modified=20May2018:03:56:41,
       File Size (bytes)=38

 NOTE: 3 records were read from the infile "/folders/myshortcuts/sf_gitrepos/pokemonData/excelfiles.txt".
       The minimum record length was 10.
       The maximum record length was 12.
 NOTE: The data set WORK.DIRLIST has 3 observations and 1 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.01 seconds
       cpu time            0.01 seconds
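When operating system commands are unavailable, a different technique from the one used in this answer is to list the directory with the DATA step functions DOPEN, DNUM and DREAD, which do not shell out. The sketch below assumes the same directory as above; the fileref name dref and the variables rc, did and i are chosen here for illustration.

```sas
%let dirname = /folders/myshortcuts/sf_gitrepos/pokemonData;

data dirlist;
  length fname $256;
  rc  = filename('dref', "&dirname");   /* point fileref DREF at the directory */
  did = dopen('dref');                  /* 0 means the directory could not be opened */
  if did > 0 then do;
    do i = 1 to dnum(did);              /* loop over the directory members */
      fname = dread(did, i);
      /* keep only names whose extension is xlsx */
      if lowcase(scan(fname, -1, '.')) = 'xlsx' then output;
    end;
    rc = dclose(did);
  end;
  rc = filename('dref');                /* clear the fileref */
  keep fname;
run;
```

This produces the same one-column DIRLIST data set, so the steps that follow should work unchanged.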

Next, we use the INTO feature of PROC SQL to generate one SAS macro variable per file name, as described in William Murphy's paper Changing Data Set Variables to Macro Variables, presented at the 2007 SAS users conference.

proc sql noprint;
   select count(*) into :NObs from dirlist;
   select fname into :Name1-:Name%left(&NObs) from dirlist;
run;

...and the SAS log output:

 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 72         
 73         proc sql;
 74            select count(*) into :NObs from dirlist;
 75            select fname into :Name1-:Name%left(&NObs) from dirlist;
 MPRINT(LEFT):  Name3
 76         run;
 NOTE: PROC SQL statements are executed immediately; The RUN statement has no effect.
 77         
 78         OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 91
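As the log note points out, PROC SQL is ended by QUIT rather than RUN. A shorter variant, offered only as a sketch and assuming SAS 9.3 or later (where the upper bound of the INTO range may be omitted), reads the count from the automatic macro variable SQLOBS and skips the separate count(*) query:

```sas
proc sql noprint;
   /* open-ended range: creates Name1, Name2, ... one per row */
   select fname into :Name1- from dirlist;
   /* SQLOBS holds the number of rows returned by the last SELECT */
   %let NObs = &sqlobs;
quit;
```

The %LET is placed right after the SELECT because SQLOBS is reset by every subsequent PROC SQL statement.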

Finally, we write and execute a SAS macro that runs PROC IMPORT iteratively to import the Excel files.

%macro genimport;
    %local i;
    %do i = 1 %to &NObs;
      proc import out=want&i datafile="&dirname/&&Name&i" dbms=xlsx replace; 
      run;
   %end;
%mend;
%genimport;

...and the output:

1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 72         
 73         %macro genimport;
 74             %local i;
 75             %do i = 1 %to &NObs;
 76               proc import out=want&i datafile="&dirname/&&Name&i" dbms=xlsx replace;
 77               run;
 78            %end;
 79         %mend;
 80         %genimport;
 MPRINT(GENIMPORT):   proc import out=want1 datafile="/folders/myshortcuts/sf_gitrepos/pokemonData/gen01.xlsx" dbms=xlsx replace;
 MPRINT(GENIMPORT):   RXLX;
 MPRINT(GENIMPORT):   run;

 NOTE: One or more variables were converted because the data type is not supported by the V9 engine. For more details, run with 
       options MSGLEVEL=I.
 NOTE: The import data set has 165 observations and 13 variables.
 NOTE: WORK.WANT1 data set was successfully created.
 NOTE: PROCEDURE IMPORT used (Total process time):
       real time           0.06 seconds
       cpu time            0.03 seconds


 MPRINT(GENIMPORT):   proc import out=want2 datafile="/folders/myshortcuts/sf_gitrepos/pokemonData/gen02.xlsx" dbms=xlsx replace;
 MPRINT(GENIMPORT):   RXLX;
 MPRINT(GENIMPORT):   run;

 NOTE: One or more variables were converted because the data type is not supported by the V9 engine. For more details, run with 
       options MSGLEVEL=I.
 NOTE: The import data set has 106 observations and 13 variables.
 NOTE: WORK.WANT2 data set was successfully created.
 NOTE: PROCEDURE IMPORT used (Total process time):
       real time           0.05 seconds
       cpu time            0.04 seconds


 MPRINT(GENIMPORT):   proc import out=want3 datafile="/folders/myshortcuts/sf_gitrepos/pokemonData/Pokemon.xlsx" dbms=xlsx replace;
 MPRINT(GENIMPORT):   RXLX;
 MPRINT(GENIMPORT):   run;

 NOTE: One or more variables were converted because the data type is not supported by the V9 engine. For more details, run with 
       options MSGLEVEL=I.
 NOTE: The import data set has 800 observations and 13 variables.
 NOTE: WORK.WANT3 data set was successfully created.
 NOTE: PROCEDURE IMPORT used (Total process time):
       real time           0.16 seconds
       cpu time            0.13 seconds

To confirm that the files were read into SAS, we can view one of the resulting SAS data sets in the SAS Studio output data viewer.


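At this point, the macro variables &Name1 and so on can be parsed with %scan() to obtain the file names, which can then be used to write the output files after further manipulation in subsequent steps.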
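A minimal sketch of that last step, modeled on the import macro above, could look like this. It assumes &NObs, &dirname and &Name1 through &Name&NObs are still defined, that each file name contains a single dot, and that the data to export is still in want1, want2 and so on; the macro name genexport and the -01 suffix (taken from the question) are illustrative.

```sas
%macro genexport;
    %local i base;
    %do i = 1 %to &NObs;
      /* text before the first dot, e.g. gen01 from gen01.xlsx */
      %let base = %scan(&&Name&i, 1, .);
      proc export data=want&i outfile="&dirname/&base.-01.xlsx" dbms=xlsx replace;
      run;
    %end;
%mend;
%genexport;
```

In practice the DATA= value would point at whichever data set holds the manipulated results rather than the raw import.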