【Posted】: 2018-05-20 05:48:25
【Question】:
I have a text file like this:
sfdfd
kgfkhgjk
fsdfs
sgsgggggfsdf
Node: RBS6301 CXP102051/26_R30F L17A.4-6 (C17.0_LSV198_PA24)
=================================
col1 clo2 clo3
=================================
1 avb wer21g2
---------------------------------
=================================
empcode Emnname Date DESC
12d sf 2018-02-06 dghsjf hfhgf jfjh
asf2 asdfw2 2018-02-16 fsfsfg jhjhhjghk
dsf21 sdf2 2016-02-06 sdgfsgf
sdgg dsds dkfd-sffddfdf aaaa
dfd gfg dfsdffd aaaa
df dfdf efefkhgvkjgjk kgkjjk
4fr freff klhlkkl
-----------------------------------
hfjh
vkgjlbljkbkjbk/n/l jhfjhfhj kutiugjm iugiuk
hfhj
fggggggggggggggggggggggg
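From the above, I extracted the section below using this code:
import re
import pandas as pd
import csv

findStr = 'empcode Emnname'
EndStr = '-----------------------------------'
tmp = []
with open('test123.txt') as f:
    out = []
    for line in f:
        # The header row marks the start of the block to extract.
        if line.startswith(findStr):
            tmp.append(re.findall(r'\w+', line.strip()))
            # Keep reading from the same file handle until the closing rule.
            for line in f:
                if line.rstrip() == EndStr:
                    out.append(tmp)
                    break
                # Turn tabs and other whitespace characters into plain spaces.
                tmp.append(re.sub(r'\s', ' ', line.strip()))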
Output of tmp:
[['empcode', 'Emnname', 'Date', 'DESC'],
'12d sf 2018-02-06 dghsjf hfhgf jfjh',
'asf2 asdfw2 2018-02-16 fsfsfg jhjhhjghk',
'dsf21 sdf2 2016-02-06 sdgfsgf',
'sdgg dsds dkfd-sffddfdf aaaa',
'dfd gfg dfsdffd aaaa',
'df dfdf efefkhgvkjgjk kgkjjk',
'4fr freff klhlkkl']
However, I want NA in the blank positions, i.e. under gfg or after 4fr. Can anyone please help? The result should be:
[['empcode', 'Emnname', 'Date', 'DESC'],
'12d sf 2018-02-06 dghsjf hfhgf jfjh',
'asf2 asdfw2 2018-02-16 fsfsfg jhjhhjghk',
'dsf21 sdf2 2016-02-06 sdgfsgf',
'sdgg dsds dkfd-sffddfdf aaaa',
'dfd gfg dfsdffd aaaa',
'df NA dfdf efefkhgvkjgjk kgkjjk',
'4fr NA NA freff klhlkkl']
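One possible way to get this result, sketched here under the assumption that the real file is fixed-width (each value starts at the same character offset as its header word; the column spacing is not visible in the sample above). The function name `split_fixed_width` and the `layout` widths are hypothetical and used only for illustration. The idea is to take the start offset of every header word, slice each data line at those offsets, and substitute NA wherever a slice is empty.

```python
import re

def split_fixed_width(header, rows, fill="NA"):
    """Slice rows at the header's column offsets; empty fields become `fill`."""
    # Start offset of each header word.
    starts = [m.start() for m in re.finditer(r"\S+", header)]
    # A column runs up to the next start; the last one runs to end of line.
    bounds = list(zip(starts, starts[1:] + [None]))
    names = header.split()
    parsed = [[row[a:b].strip() or fill for a, b in bounds] for row in rows]
    return names, parsed

# Hypothetical aligned sample: these column widths are an assumption.
layout = "{:<9}{:<9}{:<12}{}"
header = layout.format("empcode", "Emnname", "Date", "DESC")
rows = [
    layout.format("12d", "sf", "2018-02-06", "dghsjf hfhgf jfjh"),
    layout.format("df", "", "dfdf", "efefkhgvkjgjk kgkjjk"),
    layout.format("4fr", "", "", "freff klhlkkl"),
]

names, parsed = split_fixed_width(header, rows)
for fields in parsed:
    print(" ".join(fields))
```

This only works if no value is wider than its column; a value that runs past the next header offset would be cut across two fields. If the columns are separated by tabs rather than padded with spaces, splitting each line on the tab character would be the simpler route. Since pandas is already imported, `pandas.read_fwf` may also be worth trying on the extracted lines.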
【Discussion】: