Question

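I am having trouble replacing two columns in one file with two columns from another file. The first file contains 29 lines and two columns, while the second file is a huge file with 1400 lines and varying columns. The first file looks like this: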

    ag-109    3.905E-07  
    am-241    1.121E-06  
    am-243    7.294E-09  
    cs-133    1.210E-05  
    eu-151   2.393E-08  
    eu-153   4.918E-07  
    gd-155   2.039E-08  
    mo-95   1.139E-05  
    nd-143  9.869E-06 
      ..............
      .............

The second file looks like this:

 u-234       101  0  7.471e-06   293   end  
 u-235       101  0  0.0005265   293   end  
 u-236       101  0  0.0001285   293   end  
 u-238       101  0  0.02278     293   end  
 np-237      101  0  1.018e-05   293   end  
 pu-238      101  0  2.262e-06   293   end  
 pu-239      101  0  0.000147    293   end  
 .........  
 .......
 # The 29 nuclide names in column 1 repeat, and each group of 29 lines shares
 # one value in column 2: "101" for 29 lines, then "102" for 29 lines, and so
 # on up to "1018", as below.
 u-234       1018  0  7.471e-06  293   end  
 u-235       1018  0  0.0005265  293   end  
 u-236       1018  0  0.0001285  293   end  
 u-238       1018  0  0.02278    293   end  
 np-237      1018  0  1.018e-05  293   end  
 pu-238      1018  0  2.262e-06  293   end 

 After the "1018" block, the text of file2 continues like this:

    u-234       201 0  8.856E-06 293   end 
    u-235       201 0  7.832E-04 293   end 
    u-236       201 0  8.506E-05 293   end

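I want to stop replacing columns once column 2 equals "201", and leave everything from there to the end of the file as it is.

Note: the rest of file2 is completely different text, and it continues with other text and numbers that have different column lengths.

In addition: file1 contains 29 lines, and I have several files like file1; the columns of all of them must be substituted into the file2 columns in order. To clarify: the value "101" in column 2 of file2 is repeated 29 times, matching the 29 lines of file1. This continues until file1_18 replaces the "1018" lines in file2.

I hope this is clear; it is hard for me to explain.

I am trying to replace the first column of file2 with the first column of file1, and the third column of file 2 with the second column of file1.

I face two problems: 1- I cannot replace the columns. 2- I do not know how to write out the whole file after changing the columns.

I tried reading both files and splitting them into columns, then reading specific columns under a condition.

I also tried converting the file to *.csv, but that messed up the whitespace, and the file has to be run by a system code that requires a specific extension.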

    import re

    # file2: print column 1 and column 4 of every data line before the "201" block
    with open('100_60.inp') as f:
        while True:
            line = f.readline()
            if not line:
                break
            columns = re.split(r"\s+", line.strip())
            if len(columns) == 6 and columns[5] == 'end' and columns[1] != '11':
                if columns[1] == '201':
                    break
                repla = columns[0]
                compo = columns[3]
                print(repla, compo)  # this will print col1 and col4 of file2

    # file1: print column 1 and column 2 of every line
    with open('20_3.2_10_100_18.txt') as s:
        while True:
            nuclide = s.readline()
            if not nuclide:
                break
            rows = re.split(r"\s+", nuclide.strip())
            material = rows[0]
            com2 = rows[1]
            print(material, com2)  # col1 and col2 from file1

The output should look like this:

     ag-109      101  0  3.905E-07  293   end  
     am-241      101  0  1.121E-06  293   end  
     am-243      101  0  7.294E-09  293   end  
     cs-133      101  0  1.210E-05  293   end  
     eu-151      101  0  2.393E-08  293   end  
     eu-153      101  0  4.918E-07  293   end  
     gd-155      101  0  2.039E-08  293   end

....
....
....

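I am a real beginner in Python. I don't know how to finish this, and I also don't know how to write out the complete file after editing.

Any help would be appreciated.

Thanks in advance.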

Answer 1

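You can use pandas DataFrames to replace one column with another.

Try the following code (the comments show what is being done):

import pandas as pd

df = pd.read_csv('file1.txt', sep=r'\s+', header=None)
df2 = pd.read_csv('file2.txt', sep=r'\s+', header=None)
# Replace only the rows that file1 covers. A plain `df2[0] = df[0]` aligns on
# the index and would set the remaining rows to NaN when file2 is longer.
df2.loc[df.index, 0] = df[0]  # replace column in 2nd dataframe with column in first.
df2.loc[df.index, 3] = df[1]  # similarly replace another column.
print(df2)

Output: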

        0    1  2             3    4    5
0  ag-109  101  0  3.905000e-07  293  end
1  am-241  101  0  1.121000e-06  293  end
2  am-243  101  0  7.294000e-09  293  end
3  cs-133  101  0  1.210000e-05  293  end
4  eu-151  101  0  2.393000e-08  293  end
5  eu-153  101  0  4.918000e-07  293  end
6  gd-155  101  0  2.039000e-08  293  end

To write to a file:

df2.to_csv('outfile.txt', sep=' ', index=False, header=False)

File output:

ag-109 101 0 3.9049999999999996e-07 293 end
am-241 101 0 1.121e-06 293 end
am-243 101 0 7.294000000000001e-09 293 end
cs-133 101 0 1.21e-05 293 end
eu-151 101 0 2.3930000000000003e-08 293 end
eu-153 101 0 4.918e-07 293 end
gd-155 101 0 2.0390000000000003e-08 293 end
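
The long digits above (for example `3.9049999999999996e-07`) come from writing the floats at full precision. If the output should keep the `3.905E-07` style of the input, one option is the `float_format` argument of `to_csv`. The following is a minimal sketch, assuming three decimal places are enough for your data; the `sample` dataframe and the `to_text` helper are made up for illustration and stand in for `df2`:

```python
import io

import pandas as pd

# Hypothetical two-row sample standing in for the merged dataframe df2.
sample = pd.DataFrame(
    [
        ["ag-109", 101, 0, 3.905e-07, 293, "end"],
        ["am-241", 101, 0, 1.121e-06, 293, "end"],
    ]
)


def to_text(df):
    """Return df as space-separated text, floats in 3-decimal E notation."""
    buf = io.StringIO()
    # float_format only affects float columns; the integer columns stay as-is.
    df.to_csv(buf, sep=" ", index=False, header=False, float_format="%.3E")
    return buf.getvalue()


print(to_text(sample))
```

To write to disk instead, pass a file name such as `'outfile.txt'` in place of the buffer. Note that this still collapses the original spacing to single spaces.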

To select specific rows by condition, you can write:

newdf = df2[(df2[0] == 'u-235') | (df2[0]=='u-238')]
print(newdf)

Output:

       0    1  2         3    4    5
1  u-235  101  0  0.000526  293  end
3  u-238  101  0  0.022780  293  end

Answer 2

Your input files look like "fixed-width field" text files. If you can use pandas, that is easy:

import pandas as pd

# load the input files
df1 = pd.read_fwf('file1.txt', header=None)
df2 = pd.read_fwf('file2.txt', header=None)

# create an empty dataframe and feed it with the length of df1
df3 = pd.DataFrame()
df3[0] = df1[0]
df3[1] = df2.iloc[0:len(df1), 1]
df3[2] = df2.iloc[0:len(df1), 2]
df3[3] = df1[1]
df3[4] = df2.iloc[0:len(df1), 4]
df3[5] = df2.iloc[0:len(df1), 5]

# output a file
with open('output.txt', 'w') as fd:
    fd.write(df3.to_string(header=False, index=False))

After your edit, using pandas is not an option, so you should read both files in parallel and write each line of the output file as you go. The code could be:

import re

# placeholder file names; substitute your own paths
file1, file2, outfile = 'file1.txt', 'file2.txt', 'output.txt'

with open(file1) as f1, open(file2) as f2, open(outfile, 'w') as fout:
    sep = re.compile(r'\s+')    # compile the separator for re
    while True:
        # read a line from each file
        try:
            line1 = next(f1)
            line2 = next(f2)
        except StopIteration:
            break                # stop processing on end of any file

        # split lines in fields
        fields1 = sep.split(line1.strip())
        fields2 = sep.split(line2.strip())

        if fields2[1] != '101':
            break                # stop processing if past 101

        # replace fields and write a line on the output file
        fields2[0] = fields1[0]
        fields2[3] = fields1[1]
        fout.write(' {}      {}  {}  {}  {}   {} \n'.format(*fields2))

The output file looks like the expected one from your question.
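
The loop above handles only the "101" block. For the full case in the question (one file1-like file per block "101" to "1018", stop at "201", keep the rest of file2 untouched), a possible extension is sketched below. It is an untested-on-your-data sketch under these assumptions: a data line has exactly 6 fields ending in `end`, a new block starts whenever column 2 changes, and lines with `11` in column 2 are skipped (taken from the condition in your own code). The function names `replace_blocks`, `read_pairs` and `main`, and the values in the demo, are made up for illustration.

```python
def read_pairs(path):
    """Read a file1-like file into a list of (name, value) string pairs."""
    with open(path) as f:
        return [tuple(line.split()[:2]) for line in f if line.strip()]


def replace_blocks(file2_lines, file1_tables, stop_id="201", skip_ids=("11",)):
    """Return the lines of file2 with columns 1 and 4 replaced block by block.

    file1_tables holds one list of (name, value) pairs per block, in the
    order the blocks appear in file2. Everything from the first data line
    whose column 2 equals stop_id onward is copied unchanged.
    """
    out = []
    block = -1         # index of the current block in file1_tables
    row = 0            # position inside the current block
    current_id = None  # column-2 value of the current block
    stopped = False
    for line in file2_lines:
        fields = line.split()
        is_data = len(fields) == 6 and fields[5] == "end"
        if stopped or not is_data or fields[1] in skip_ids:
            out.append(line)           # copy non-data lines verbatim
            continue
        if fields[1] == stop_id:
            stopped = True             # nothing is replaced from here on
            out.append(line)
            continue
        if fields[1] != current_id:    # column 2 changed: next file1 table
            current_id = fields[1]
            block += 1
            row = 0
        if block < len(file1_tables) and row < len(file1_tables[block]):
            fields[0], fields[3] = file1_tables[block][row]
            line = " {}      {}  {}  {}  {}   {} \n".format(*fields)
        out.append(line)
        row += 1
    return out


def main(file2_path, file1_paths, out_path):
    """Apply replace_blocks to files on disk (paths supplied by the caller)."""
    tables = [read_pairs(p) for p in file1_paths]
    with open(file2_path) as f:
        lines = f.readlines()
    with open(out_path, "w") as f:
        f.writelines(replace_blocks(lines, tables))


# Small in-memory demo; the second table's value is invented for the demo.
file2_lines = [
    " u-234       101  0  7.471e-06   293   end  \n",
    " u-235       101  0  0.0005265   293   end  \n",
    " u-234       102  0  7.471e-06   293   end  \n",
    "    u-234       201 0  8.856E-06 293   end \n",
    "some unrelated text\n",
]
tables = [
    [("ag-109", "3.905E-07"), ("am-241", "1.121E-06")],
    [("ag-109", "4.000E-07")],
]
result = replace_blocks(file2_lines, tables)
print("".join(result), end="")
```

Because the values are kept as strings, the numbers are written exactly as they appear in file1, and every line that is not replaced keeps its original spacing. With real files you would call `main` with the path of file2, the list of file1-like paths in block order, and an output path.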
