How to search the lines of a string and extract data from a specific column

Time: 2014-12-12 20:06:03

Tags: python python-3.x

I have a string named search that contains several columns, like this:

1     4     +/+
2     6     +/+
4     3     -/-
5     3     +/+

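It is tab-separated. I want to go through it line by line (I assume with a for loop) and check whether the line contains "+/+". If it does, the value from the first column should be added to a nested list together with the value from the second column. If there is no entry (for example, 3 does not appear in the first column), I want the paired value to be 0. So for this particular case the output should be [[1,4],[2,6],[3,0],[4,0],[5,3]].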

So far I have tried:

for i = 5
    if search[3] == '+/+'
        basefile[i]=(i, search[5])
    else
        basefile[i]=(i, 0)

What steps am I missing? I am new to Python.

2 answers:

Answer 0: (score: 0)

Assuming you have opened the file and stored it in the variable f:

counts = 0
for line in f:
    if '+/+' in line:
        counts += 1

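Alternatively, if the whole thing is stored as a string, you can use the split method (which splits the string on whitespace and puts the non-whitespace pieces in a list), iterate over the resulting list, and count. So, if s is your string: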

counts = 0
for i in s.split():
    if i == '+/+':
        counts += 1
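
The two loops above only count the "+/+" rows. As a minimal sketch of how the same line-by-line idea could produce the nested list the question asks for, assuming the first two columns are always positive integers (the function name pair_plus_rows and the sample string are made up for illustration):

```python
def pair_plus_rows(search):
    """Return [[id, value], ...] for every id from 1 to the largest id seen."""
    values = {}   # first column -> second column, only for '+/+' rows
    largest = 0
    for line in search.splitlines():
        fields = line.split()          # split on any whitespace, tabs included
        if len(fields) < 3:            # skip blank or malformed lines
            continue
        row_id, value, flag = int(fields[0]), int(fields[1]), fields[2]
        largest = max(largest, row_id)
        if flag == '+/+':
            values[row_id] = value
    # ids that are missing or not flagged '+/+' fall back to 0
    return [[i, values.get(i, 0)] for i in range(1, largest + 1)]


# sample data copied from the question, with explicit tab characters
search = "1\t4\t+/+\n2\t6\t+/+\n4\t3\t-/-\n5\t3\t+/+"
result = pair_plus_rows(search)
print(result)   # [[1, 4], [2, 6], [3, 0], [4, 0], [5, 3]]
```

The dictionary holds only the rows that matched, so dict.get(i, 0) covers both the missing id 3 and the "-/-" row for id 4.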

Answer 1: (score: 0)

lines.csv:

1   4   +/+
2   6   +/+
4   3   -/-
5   3   +/+

Here is an example:

lines = {}  # maps column 1 to [column 2, column 3]
with open('lines.csv', 'r') as f:
    orig_lines = f.read().split('\n')  # split on new lines
    for line in orig_lines:
        line = line.split('\t')  # split the line by tabs
        if len(line) < 3:  # skip lines that do not have 3 columns
            continue
        lines[line[0]] = [line[1], line[2]]  # value is a list with columns 2 and 3

# grab the max value from column 1; the keys are strings, so compare them as ints
max_value = max(int(key) for key in lines)
print("max:", max_value)  # Python 3 prints "max: 5"; Python 2 printed the tuple ('max:', 5)
paired = []  # create an empty list

# if you want to take a look at lines uncomment this:
# import pdb;pdb.set_trace()
#
# and then you can actively inspect what is happening in the script
# as it is running
# print(lines) or print(lines['1'])

# we can iterate up to max_value
# since we are not sure which values in between are missing (ie "3")
for i in range(1, max_value + 1):
    i = str(i)  # convert the int to a string so we can use it as a key
    try:
        print(lines[i])
    except KeyError:
        # a KeyError means the key doesn't exist
        # let's add a zero entry and continue to the next iteration
        # (ie "3")
        paired.append([i, 0])
        continue
    # if '+/+' is in the list stored under this key
    if '+/+' in lines[i]:
        print(lines[i])
        paired.append([i, lines[i][0]])
    else:  # matches any line that doesn't have '+/+' in it
        print(lines[i])
        paired.append([i, 0])
print(paired)
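
One detail in the script is easy to get wrong: the keys of lines are strings, so the largest first-column value has to be found numerically. A small sketch of the difference, using made-up keys that are not in the original data:

```python
keys = ['1', '2', '9', '10']   # hypothetical first-column values, stored as strings

string_max = max(keys)                     # compares character by character
numeric_max = max(int(k) for k in keys)    # compares as numbers

print(string_max)    # 9
print(numeric_max)   # 10
```

With the single-digit ids in lines.csv both approaches give 5, so the difference only shows up once an id has two or more digits.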

Note: there is probably a much simpler solution using itertools, collections, or something similar.

Output (from a Python 2 run):

bob@bob-p7-1298c:~/Desktop$ python test.py
('max:', 5)
['4', '+/+']
['4', '+/+']
['6', '+/+']
['6', '+/+']
['3', '-/-']
['3', '-/-']
['3', '+/+']
['3', '+/+']
[['1', '4'], ['2', '6'], ['3', 0], ['4', 0], ['5', '3']]
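
As the note above suggests, a shorter version is possible. The following is only one possible sketch, not the answer author's code: it uses the standard csv module rather than itertools or collections, returns integers as the question requested, and assumes the first column holds positive integers. The name paired_from_rows is hypothetical, and io.StringIO stands in for the real file.

```python
import csv
import io


def paired_from_rows(lines):
    """Build [[id, value], ...] from tab-separated lines (a file or any iterable)."""
    rows = [row for row in csv.reader(lines, delimiter='\t') if len(row) >= 3]
    if not rows:
        return []
    largest = max(int(row[0]) for row in rows)
    # start with a 0 placeholder for every id, then overwrite the '+/+' rows
    paired = [[i, 0] for i in range(1, largest + 1)]
    for row_id, value, flag in (row[:3] for row in rows):
        if flag == '+/+':
            paired[int(row_id) - 1][1] = int(value)
    return paired


# io.StringIO stands in for open('lines.csv', newline='')
sample = io.StringIO("1\t4\t+/+\n2\t6\t+/+\n4\t3\t-/-\n5\t3\t+/+\n")
print(paired_from_rows(sample))   # [[1, 4], [2, 6], [3, 0], [4, 0], [5, 3]]
```

Pre-filling the list with zeros removes the need for the try/except around missing keys, at the cost of reading all rows before building the result.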