a = "The process maps are similar to Manual Excellence Process Framework (MEPF)"
input = "The process maps are similar to Manual Excellence Process Framework (MEPF)"
output = Manual Excellence Process Framework (MEPF)
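I want to write a Python script that processes a piece of text and extracts the full form of an acronym given in parentheses. For example, the full form of (MEPF) is Manual Excellence Process Framework. I would like to build up the full form by matching each capital letter inside the parentheses.
My idea is that whenever an acronym appears in parentheses, each of its capital letters is mapped to a word. For (MEPF), start from the last letter, F, and match it against the last word before the parenthesis, here Framework; then P (Process), then E (Excellence), and finally M (Manual). The final output would be the full form (Manual Excellence Process Framework). It would be very helpful if you could try it this way.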
Answer:
Use a simple regular expression and a little post-processing:
import re
a = "I like International Business Machines (IBM). The Manual Excellence Process Framework (MEPF)"
m = re.findall(r'([^)]+) \(([A-Z]+)\)', a)
out = {b: ' '.join(a.split()[-len(b):]) for a,b in m}
out
Output:
{'IBM': 'International Business Machines', 'MEPF': 'Manual Excellence Process Framework'}
If you want to check that the acronym's letters actually match the first letters of the words:
out = {b: ' '.join(a.split()[-len(b):]) for a,b in m if all(x[0]==y for x,y in zip(a.split()[-len(b):], b)) }
Example:
a = "No match (ABC). I like International Business Machines (IBM). The Manual Excellence Process Framework (MEPF)."
m = re.findall(r'([^)]+) \(([A-Z]+)\)', a)
{b: ' '.join(a.split()[-len(b):]) for a,b in m if all(x[0]==y for x,y in zip(a.split()[-len(b):], b))}
# {'IBM': 'International Business Machines',
#  'MEPF': 'Manual Excellence Process Framework'}
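For comparison, the right-to-left matching described in the question could be sketched roughly as follows. This is only one possible reading of that idea, not part of the original answer: the function name expand_acronyms and the variable names are made up for illustration, and it assumes one word per acronym letter, with each word starting with that capital letter. Unlike the zip-based check, which stops at the shorter sequence, it also rejects an acronym that has more letters than there are words before it.

```python
import re

def expand_acronyms(text):
    """Map each parenthesised acronym to the words preceding it.

    Letters are matched right to left: the last letter against the
    last word before the parenthesis, then the next letter against
    the word before that, and so on.
    """
    result = {}
    # Capture the text before each "(ACRONYM)" and the acronym itself.
    for before, acronym in re.findall(r'([^)]+) \(([A-Z]+)\)', text):
        words = before.split()
        matched = []
        # Walk the acronym backwards, consuming one word per letter.
        for letter in reversed(acronym):
            if not words or words[-1][0] != letter:
                break
            matched.append(words.pop())
        # Keep the expansion only if every letter found its word.
        if len(matched) == len(acronym):
            result[acronym] = ' '.join(reversed(matched))
    return result

sample = ("No match (ABC). I like International Business Machines (IBM). "
          "The Manual Excellence Process Framework (MEPF).")
print(expand_acronyms(sample))
```

Full forms that contain lowercase connecting words (such as "of" or "and") would not be matched by this sketch; handling those would need an extra rule for skipping words.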