[Posted]: 2019-11-19 19:10:44
[Question]:
I have a function that returns values in the following format:
["Stage 1 : Package Description: Blisters are made in a cold-forming process from an aluminium base web. Each tablet isfilled into a separate blister and a lidding foil of aluminium is welded on. The blistersare opened by pressing the tablets through the lidding foil. Values: ['Blister', 'Foil', 'Aluminium']", "Stage 2 : Package Description: The tablets are filled into box cylindrically shaped bottles made of white colouredpolyethylene. The volumes of the bottles depend on the tablet strength and amount oftablets, ranging from 20 to 175 ml. The screw type cap is made of white colouredpolypropylene and is equipped with a tamper proof ring. Values: ['Bottle', 'Cylindrically shaped Bottles', 'Polyethylene'] Colour: White", "Stage 3 : Package Description: The tablets are filled into box cylindrically shaped bottles made of white colouredpolyethylene. The volumes of the bottles depend on the tablet strength and amount oftablets, ranging from 20 to 175 ml. The screw type cap is made of white colouredpolypropylene and is equipped with a tamper proof ring. Values: ['Bottle', 'Screw Type Cap', 'Polypropylene'] Colour: White"]
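So "Colour" is present for some stages and absent for others. I want to extract these values into a CSV file with the columns shown below.
Expected output in CSV format: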
StageNumber,PackageDescription,Values1,Values2,Values3,Colour
1,Blisters are made in a ...,Blister,Foil,Aluminium,
2,The tablets are filled ...,Bottle,Cylindrically shaped Bottles,Polyethylene,White
Code so far:
paragraphs = ['The tablets are filled into cylindrically shaped bottles made of white coloured\npolyethylene. The volumes of the bottles depend on the tablet strength and amount of\ntablets, ranging from 20 to 175 ml. The screw type cap is made of white coloured\npolypropylene and is equipped with a tamper proof ring.', 'PVC/PVDC blister pack', 'Blisters are made in a cold-forming process from an aluminium base web. Each tablet is\nfilled into a separate blister and a lidding foil of aluminium is welded on. The blisters\nare opened by pressing the tablets through the lidding foil.', '\n']
final_ref = [['Blister', 'Foil', 'Aluminium'], ['Blister', 'Base Web', 'PVC/PVDC'], ['Bottle', 'Cylindrically shaped Bottles', 'Polyethylene'], ['Bottle', 'Screw Type Cap', 'Polypropylene'], ['Blister', 'Base Web', 'PVC'], ['Blister', 'Base Web', 'PVD/PVDC'], ['Bottle', 'Square Shaped Bottle', 'Polyethylene']]
colours = ['White', 'Yellow', 'Blue', 'Red', 'Green', 'Black', 'Brown', 'Silver', 'Purple', 'Navy blue', 'Gray', 'Orange', 'Maroon', 'pink', 'colourless', 'blue']
TEXT_WITHOUT_COLOUR = 'Stage {counter} : Package Description: {sen} Values: {values}'
TEXT_WITH_COLOUR = TEXT_WITHOUT_COLOUR + ' Colour: {colour}'
counter = 1
result = []
def is_missing(words, sen):
    for w in words:
        if w.lower() not in sen.lower():
            return True
    return False

for words in final_ref:
    for sen in paragraphs:
        if is_missing(words, sen):
            continue
        kwargs = {
            'counter': counter,
            'sen': sen,
            'values': str(words)
        }
        if words[0] == 'Bottle':
            for wd in colours:
                if wd.lower() in sen.lower():
                    kwargs['colour'] = wd
                    break
            text_const = TEXT_WITH_COLOUR
        else:
            text_const = TEXT_WITHOUT_COLOUR
        result.append(text_const.format(**kwargs).replace('\n', '').replace('\t', ''))
        counter += 1
print(result)
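One possible way to get from the formatted strings to the CSV layout described above is to parse each string back into its parts. The following is only a sketch: it assumes every entry follows the `Stage N : Package Description: ... Values: [...]` template with an optional trailing `Colour: ...`, and the names `ENTRY_RE`, `HEADER`, `parse_entry`, and `write_csv` are hypothetical helpers, not part of the original code.

```python
import ast
import csv
import re

# Pattern for one formatted entry; the trailing "Colour: ..." part is optional.
# Assumption: the Values list never contains nested brackets.
ENTRY_RE = re.compile(
    r"Stage\s+(?P<stage>\d+)\s*:\s*Package Description:\s*(?P<desc>.*?)\s*"
    r"Values:\s*(?P<values>\[.*?\])"
    r"(?:\s*Colour:\s*(?P<colour>.+))?\s*$",
    re.DOTALL,
)

HEADER = ["StageNumber", "PackageDescription",
          "Values1", "Values2", "Values3", "Colour"]


def parse_entry(entry):
    """Split one formatted string into the six CSV columns."""
    match = ENTRY_RE.match(entry)
    if match is None:
        raise ValueError(f"unexpected entry format: {entry!r}")
    # The values were written with str(list), so literal_eval restores the list.
    values = ast.literal_eval(match.group("values"))
    values = (list(values) + ["", "", ""])[:3]   # pad or trim to three columns
    colour = (match.group("colour") or "").strip()  # empty when no colour
    return [match.group("stage"), match.group("desc"), *values, colour]


def write_csv(entries, fileobj):
    """Write the header row plus one row per formatted entry."""
    writer = csv.writer(fileobj)
    writer.writerow(HEADER)
    for entry in entries:
        writer.writerow(parse_entry(entry))
```

It could be called as `write_csv(result, f)` with `f` opened via `open(path, "w", newline="")`, where `path` is whatever output file is wanted. Since the loop already has the stage number, sentence, values, and colour as separate variables, appending those as a row directly instead of formatting and re-parsing a string would probably be simpler; the parser is only needed if the string list is the only thing available.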
[Discussion]:
Tags: python python-3.x list csv file-io