【Posted】: 2021-11-26 08:42:00
【Problem description】:
I have a database like this:
>2654570298
MRNYSYKGKWEKLLTPEIVKKLTLINEFKGEQRLFIKAHKDELKELSELA
KIQSTEASNKIEGIFTSDDRFKSLAQAKTTPRNRNESEIAGYRDVLNTIH
DSYEYIPISASYFLQLHRDLYKFVAKNDVGKFKSSDNIIRETDEKGNERL
RFRPVPAWETPAAIDELCKAYADAKEEIDPLILNAMFILDFLCIHPFNDG
NGRMSRLLTLLLLYKTGFIVGKYISIEKIIEESKETYYEVLQDSLVGWHE
NENDYKPFVNYMLGVIVNAYKEFESRTELVTNPNLTKSDRIREIIKDHIG
TITKAELLEMNPDISDTTVQRTLAKLLKNNDIKKIGGGRYTKYTWNTEEQ
>2654570299|K03427
MITGELKNKIDGLWDVFAAGGLVNPLEVIEQITYLMFIKDLDDVDKRKEK
ESAMLGLPYKSIFAGEVKIGDRTIEGTQLKWSVFHDFSAGRMYAIMQEWV
FPFIKNLHSDKNSTYSKYMDDAIFKFPTPLLLSKVVDSLDEIYEIMNSTL
VLDVRGDVYEYLLNKIASAGRNGQFRTPRHIIRMMVEMVEPKADDVICDP
GDLLKVCKTKKTELLFLALFLRMLKVGGRCACIVPDGVLFGSSKAHKDIR
KQVVEENRLEAVISMPSGVFKPYAGVSTAILIFTKTGHGGTDNVWFYDMT
ADGYSLDDKRTPVSENDIPDIIERFKNLDKEIDRERTDKSFMVPKQDIAD
NDYDLSINKYKEVVYEKIEYPPTSEIMADIREIEMEIGKEMDELEKLLNI
>2654570301
MNESELYKELGILTKDKSKWAENIQYVSSLLNHESAKIQAKALWLLGEMG
LEYPDSIQDAVPMVASFCDSENALLRERAVNALGRIGRGNYNLIEPYWSD
LFRFASDDEPKVRLSFIWASENVATNTPDIYENHMSVFESLLHDIDDKVR
MESPEIFRVLGKRRPEFVIPYIEQLQKMAETDSNRVVRIHSLGAIKVTTS
K
>2654570302
MWNMIWPLVLIVGSNCFYNICTKSMPEGTNTFGALTVTYLVGAVLSAVLF
VVSVKPAGVLNEISKINWTSFVLGLVIVGLEAGYVFLYRAGWKVSNGALT
ANICLAIALIVIGFLLYKESISIKQVAGIVVCGFGLFLING
>2654570303|K01153
MKNKELLKRVGYVVLICLSFFVATWYFFENNKICTICWIAIGSKNVYDLV
HRIKNSKKED
I want to filter it and print only the sequences whose header contains "|K", using awk, grep, or something similar. Desired output:
>2654570299|K03427
MITGELKNKIDGLWDVFAAGGLVNPLEVIEQITYLMFIKDLDDVDKRKEK
ESAMLGLPYKSIFAGEVKIGDRTIEGTQLKWSVFHDFSAGRMYAIMQEWV
FPFIKNLHSDKNSTYSKYMDDAIFKFPTPLLLSKVVDSLDEIYEIMNSTL
VLDVRGDVYEYLLNKIASAGRNGQFRTPRHIIRMMVEMVEPKADDVICDP
GDLLKVCKTKKTELLFLALFLRMLKVGGRCACIVPDGVLFGSSKAHKDIR
KQVVEENRLEAVISMPSGVFKPYAGVSTAILIFTKTGHGGTDNVWFYDMT
ADGYSLDDKRTPVSENDIPDIIERFKNLDKEIDRERTDKSFMVPKQDIAD
NDYDLSINKYKEVVYEKIEYPPTSEIMADIREIEMEIGKEMDELEKLLNI
>2654570303|K01153
MKNKELLKRVGYVVLICLSFFVATWYFFENNKICTICWIAIGSKNVYDLV
HRIKNSKKED
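Note that the number of lines between one header and the next is not always the same, and a newline always separates a sequence from the header that follows.
Can anyone help?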
【Discussion】:
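One possible approach, as a minimal sketch rather than a definitive answer: since every record starts with a line beginning with ">", awk can keep a flag that is updated on each header line and print every line while the flag is set. This handles records with any number of sequence lines. The function name `filter_k_records` and the file name `proteins.fasta` are hypothetical, chosen only for illustration.

```shell
# Print only the records whose header line contains the literal text "|K".
# Assumes every header starts with ">" and sequence lines never do.
filter_k_records() {
    # On a header line, set keep to 1 if it contains "|K", else 0.
    # The bare pattern "keep" then prints the current line while keep is non-zero.
    awk '/^>/ { keep = (index($0, "|K") > 0) } keep' "$@"
}

# Usage (hypothetical file name):
#   filter_k_records proteins.fasta > filtered.fasta
```

`index()` is used instead of a regular expression so that "|" is matched literally and not treated as the alternation operator. With no file argument, the function reads from standard input.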