Downloading data from a website

Date: 2014-07-27 15:22:46

Tags: regex matlab

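I am using the following code to download two files from a folder on a website.

I want to download only the files named "MOD09GA.A2008077.h22v05.005.2008080122814.hdf" and "MOD09GA.A2008077.h23v05.005.2008080122921.hdf" from the page, but I don't know how to select just those files. The code below downloads every file, and I only need these two.

Does anyone have any ideas?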

URL = 'http://e4ftl01.cr.usgs.gov/MOLT/MOD09GA.005/2008.03.17/';
% Local path on your machine
localPath = 'E:/myfolder/';

% Read the HTML contents and parse file names ending in .hdf.xml
urlContents = urlread(URL);
ret = regexp(urlContents, '"\S+\.hdf\.xml"', 'match');


% Loop over all files and download them
for k=1:length(ret)
    filename = ret{k}(2:end-1);
    filepathOnline = strcat(URL, filename);
    filepathLocal = fullfile(localPath, filename);
    urlwrite(filepathOnline, filepathLocal);
end

1 Answer:

Answer 0 (score: 2)

Try using the 'tokens' option of regexp instead of 'match':

localPath = 'E:/myfolder/';
urlContents = 'aaaa "MOD09GA.A2008077.h22v05.005.2008080122814.hdf.xml" and "MOD09GA.A2008077.h23v05.005.2008080122921.hdf.xml"  aaaaa';

ret = regexp(urlContents , '"(\S+)(?:\.\d+){2}(\.hdf\.xml)"', 'tokens');
%// Loop over each file name
for k=1:length(ret)
    filename = [ret{k}{:}];
    filepathLocal = fullfile(localPath, filename)
end
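
If the goal is to fetch only the two .hdf tiles named in the question, one possible approach is to put the tile identifiers into the pattern itself. The following is a minimal sketch, not a tested solution against the live server: it assumes the file names appear in the page as quoted href values, the doDownload switch and the sample urlContents string are hypothetical stand-ins so the pattern can be tried offline, and the tile codes h22v05 and h23v05 are taken from the question.

```matlab
% Sketch: keep only the h22v05 and h23v05 tiles from the directory listing.
URL = 'http://e4ftl01.cr.usgs.gov/MOLT/MOD09GA.005/2008.03.17/';
localPath = 'E:/myfolder/';
doDownload = false;   % hypothetical switch; set to true to actually fetch files

% Hypothetical stand-in for the page contents, used while doDownload is false
urlContents = ['<a href="MOD09GA.A2008077.h22v05.005.2008080122814.hdf">x</a> ' ...
               '<a href="MOD09GA.A2008077.h22v05.005.2008080122814.hdf.xml">x</a> ' ...
               '<a href="MOD09GA.A2008077.h23v05.005.2008080122921.hdf">x</a> ' ...
               '<a href="MOD09GA.A2008077.h21v05.005.2008080122700.hdf">x</a>'];
if doDownload
    urlContents = urlread(URL);
end

% Quoted names for tile h22v05 or h23v05 that end in .hdf (not .hdf.xml)
pattern = '"([^"\s]*\.h2[23]v05\.[^"\s]*\.hdf)"';
tokens = regexp(urlContents, pattern, 'tokens');

% Flatten the nested token cells and drop duplicate names
files = unique(cellfun(@(t) t{1}, tokens, 'UniformOutput', false));

% Loop over the selected files only
for k = 1:numel(files)
    filepathOnline = strcat(URL, files{k});
    filepathLocal = fullfile(localPath, files{k});
    if doDownload
        urlwrite(filepathOnline, filepathLocal);
    end
end
```

With the sample string above, files should hold just the two .hdf names for h22v05 and h23v05. The .hdf.xml entry is skipped because the pattern requires the closing quote right after .hdf. To select other tiles, the h2[23]v05 part of the pattern would need to change.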