I have multiple text files containing data in this format:
File1.txt
subID imageCondition trial textItem imageFile response RT
Participant003 images 7 Is there a refrigerator? 07_targetPresent-refrigerator.jpg z 1.436971
Participant003 images 6 Is there an oven mitt? 06_targetPresent-ovenmitt.jpg z 0.519301
Participant003 images 1 Is there a toaster? 01_targetAbsent-toaster.jpg m 1.110664
Participant003 images 3 Is there a wine bottle? 03_targetAbsent-winebottle.jpg m 1.278945
Participant003 images 2 Is there a kettle? 02_targetAbsent-kettle.jpg z 2.672123
Participant003 images 5 Is there a blender? 05_targetPresent-blender.jpg m 2.633802
Participant003 images 8 Is there a bucket? 08_targetPresent-bucket.jpg m 2.596154
Participant003 images 4 Is there a surf board? 04_targetAbsent-surfboard.jpg m 1.072850
File2.txt
subID imageCondition trial textItem imageFile response RT
Participant005 images 1 Is there a toaster? 01_targetAbsent-toaster.jpg 0.000000
Participant005 images 2 Is there a kettle? 02_targetAbsent-kettle.jpg m 8.213927
Participant005 images 6 Is there an oven mitt? 06_targetPresent-ovenmitt.jpg z 3.569293
Participant005 images 4 Is there a surf board? 04_targetAbsent-surfboard.jpg 0.000000
Participant005 images 3 Is there a wine bottle? 03_targetAbsent-winebottle.jpg m 8.538699
Participant005 images 7 Is there a refrigerator? 07_targetPresent-refrigerator.jpg z 0.857319
Participant005 images 5 Is there a blender? 05_targetPresent-blender.jpg 0.000000
Participant005 images 8 Is there a bucket? 08_targetPresent-bucket.jpg z 1.967220
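I want to read this data into a cell array so that I can access the individual values.
I have the following code for reading the data, but it doesn't help, because it doesn't store the data in a way that lets me access individual values. For example, I want all the values in the "trial" or "response" column.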
function content = load_data(fileName)
  fid = fopen(fileName,'r')
  if fid > 0
    line_no = 1;
    oneline{line_no} = fgetl(fid);
    while ischar(oneline{line_no})
      line_no = line_no + 1;
      oneline{line_no} = fgetl(fid);
    endwhile
    fclose(fid)
    content = oneline;
  endif
endfunction
for i = 1:size(txtFiles,2)
  data{i} = load_data(txtFiles{1,i});
end
for i = 1:1:length(data)
  dataMat = cell2mat(data(i));
  for j = 1:1:length(dataMat)
    line = dataMat{1,j};
    % Here I can only fetch each line as a single string whose fields are separated by one or more spaces, which makes it difficult to access the required data
  endfor
endfor
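I'm looking for a way to read the data from the text files into a cell array or matrix so that I can easily access the values I need, but so far I have only found the conventional methods of importing data from text files. Alternatively, I'd appreciate help with parsing the lines I already fetch so that I can access what I need.
Note: there are multiple such text files. It would be great if you could show how to access the values in an individual column (for example, the "response" column).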
Answer 0 (score: 1)
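This would be easy to do with, for example, strsplit, which can split the data on spaces, except that the textItem field itself contains spaces. So I suggest using a regular expression instead. When you are looking for several separate pieces at once, named tokens are a convenient way to organize the result. I realize that regular expressions are hard to jump into if you aren't familiar with them. Have a look at regex101.com for information; it is also a very helpful online tool for testing your regular expressions. See this specific example on regex101. That said, here is my answer, which works on your data: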
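text = fileread(filename);
% Match against the file contents (text); each matching line becomes one element of the struct array data
data = regexp(text,'^(?<subID>\w+)\s+(?<imageCondition>\w+)\s+(?<trial>\d+)\s+(?<textItem>.*?\?)\s+(?<imageFile>[-\.\w]+)\s+(?<response>\w)\s+(?<RT>[\d\.]+)','names','lineanchors')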
Or you can turn it into a table:
dataTable = struct2table(data)
The result looks like this:
subID imageCondition trial textItem imageFile response RT
__________________ ______________ _____ ____________________________ _____________________________________ ________ ____________
{'Participant003'} {'images'} {'7'} {'Is there a refrigerator?'} {'07_targetPresent-refrigerator.jpg'} {'z'} {'1.436971'}
{'Participant003'} {'images'} {'6'} {'Is there an oven mitt?' } {'06_targetPresent-ovenmitt.jpg' } {'z'} {'0.519301'}
{'Participant003'} {'images'} {'1'} {'Is there a toaster?' } {'01_targetAbsent-toaster.jpg' } {'m'} {'1.110664'}
{'Participant003'} {'images'} {'3'} {'Is there a wine bottle?' } {'03_targetAbsent-winebottle.jpg' } {'m'} {'1.278945'}
{'Participant003'} {'images'} {'2'} {'Is there a kettle?' } {'02_targetAbsent-kettle.jpg' } {'z'} {'2.672123'}
{'Participant003'} {'images'} {'5'} {'Is there a blender?' } {'05_targetPresent-blender.jpg' } {'m'} {'2.633802'}
{'Participant003'} {'images'} {'8'} {'Is there a bucket?' } {'08_targetPresent-bucket.jpg' } {'m'} {'2.596154'}
{'Participant003'} {'images'} {'4'} {'Is there a surf board?' } {'04_targetAbsent-surfboard.jpg' } {'m'} {'1.072850'}
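One thing to watch for: the pattern above requires a one-character response, while some rows in File2.txt have none (the RT is recorded as 0.000000). Those rows will not match as written, and because `.` in a MATLAB pattern also matches newlines by default, the lazy textItem token may stretch across the line break into the following row. The following is only a sketch of one possible adjustment, not part of the original answer: it lets response be empty and keeps textItem on a single line. The choice of `[a-zA-Z]?` assumes responses are always single letters, which is a guess based on the sample data.

```matlab
% Two data rows copied from File2.txt; the first has no response recorded.
% In practice this text would come from fileread(filename).
text = sprintf('%s\n', ...
    'subID imageCondition trial textItem imageFile response RT', ...
    'Participant005 images 1 Is there a toaster? 01_targetAbsent-toaster.jpg 0.000000', ...
    'Participant005 images 2 Is there a kettle? 02_targetAbsent-kettle.jpg m 8.213927');

% Changes relative to the pattern above (assumptions, not the answer's code):
%   textItem uses [^\n]*? so it cannot run past the end of its own line
%   response uses [a-zA-Z]? so it may be empty and cannot swallow a digit of RT
%   the separators around response are [ \t] so they cannot cross a line break
pattern = ['^(?<subID>\w+)\s+(?<imageCondition>\w+)\s+(?<trial>\d+)\s+' ...
           '(?<textItem>[^\n]*?\?)\s+(?<imageFile>[-\.\w]+)[ \t]+' ...
           '(?<response>[a-zA-Z]?)[ \t]*(?<RT>[\d\.]+)'];

data = regexp(text, pattern, 'names', 'lineanchors');

% Rows without a response can then be detected with isempty
noResponse = cellfun(@isempty, {data.response});
```

With this variant, `noResponse` flags the rows where no key was pressed, so they can be dropped or kept as needed.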
If you want to convert the numeric fields to numbers:
dataTable.trial = str2double(dataTable.trial);
dataTable.RT = str2double(dataTable.RT);
which then gives:
subID imageCondition trial textItem imageFile response RT
__________________ ______________ _____ ____________________________ _____________________________________ ________ ______
{'Participant003'} {'images'} 7 {'Is there a refrigerator?'} {'07_targetPresent-refrigerator.jpg'} {'z'} 1.437
{'Participant003'} {'images'} 6 {'Is there an oven mitt?' } {'06_targetPresent-ovenmitt.jpg' } {'z'} 0.5193
{'Participant003'} {'images'} 1 {'Is there a toaster?' } {'01_targetAbsent-toaster.jpg' } {'m'} 1.1107
{'Participant003'} {'images'} 3 {'Is there a wine bottle?' } {'03_targetAbsent-winebottle.jpg' } {'m'} 1.2789
{'Participant003'} {'images'} 2 {'Is there a kettle?' } {'02_targetAbsent-kettle.jpg' } {'z'} 2.6721
{'Participant003'} {'images'} 5 {'Is there a blender?' } {'05_targetPresent-blender.jpg' } {'m'} 2.6338
{'Participant003'} {'images'} 8 {'Is there a bucket?' } {'08_targetPresent-bucket.jpg' } {'m'} 2.5962
{'Participant003'} {'images'} 4 {'Is there a surf board?' } {'04_targetAbsent-surfboard.jpg' } {'m'} 1.0729
You also asked how to access the values. To get the third "response" from the table:
dataTable.response{3}
Or from the struct:
data(3).response
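The question also asked for every value in a column and mentioned several files. Leaving off the index on the table (`dataTable.response`) returns the whole column. For the struct array, a comma-separated list does the same job. Below is a minimal sketch that applies the pattern from this answer to more than one file and collects whole columns; the variable names `text1`, `text2`, `texts`, and `allData` are made up for illustration, and the inline strings stand in for `fileread(txtFiles{k})`.

```matlab
% One header plus one data row from each file in the question,
% standing in for the result of fileread on each file name.
header = 'subID imageCondition trial textItem imageFile response RT';
text1 = sprintf('%s\n', header, ...
    'Participant003 images 7 Is there a refrigerator? 07_targetPresent-refrigerator.jpg z 1.436971');
text2 = sprintf('%s\n', header, ...
    'Participant005 images 2 Is there a kettle? 02_targetAbsent-kettle.jpg m 8.213927');
texts = {text1, text2};

% Same pattern as in the answer, split across lines for readability
pattern = ['^(?<subID>\w+)\s+(?<imageCondition>\w+)\s+(?<trial>\d+)\s+' ...
           '(?<textItem>.*?\?)\s+(?<imageFile>[-\.\w]+)\s+' ...
           '(?<response>\w)\s+(?<RT>[\d\.]+)'];

allData = [];   % struct array that accumulates the rows of every file
for k = 1:numel(texts)
    rows = regexp(texts{k}, pattern, 'names', 'lineanchors');
    allData = [allData, rows];   % growing in a loop is fine for a few files
end

% Whole columns from the struct array
responses = {allData.response};            % cell array of character vectors
trials    = str2double({allData.trial});   % numeric row vector
```

The subID field identifies which participant each row came from, so the rows from all files can live in one struct array (or one table via `struct2table(allData)`).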