【Posted】: 2019-01-15 19:21:51
【Problem description】:
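I have a file with a fixed structure, and I want to delete the lines wherever that structure is not followed. Each record should consist of: 1) a line starting with "Sequence", 2) a line starting with "Start", 3) a line starting with a number.
In my file, some records have the first two lines but no number line (the number lines were already removed with grep). I am looking for a way, using awk or sed, to delete those first two lines whenever no number line follows them. Is this possible?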
cat file.txt
Sequence: HM855457_IGHV1-8*02_Homosapiens_F_V-REGION_24..319_296nt_1_____296+0=296__rev-compl_ from: 1 to: 296
Start End Strand Pattern Mismatch Sequence
217 225 + pattern:AA[CT]NNN[AT]CN . aacacctcc
Sequence: MG719312_IGHV1-8*03_Homosapiens_F_V-REGION_127..422_296nt_1_____296+0=296___ from: 1 to: 296
Start End Strand Pattern Mismatch Sequence
217 225 + pattern:AA[CT]NNN[AT]CN . aacacctcc
Sequence: M99648_IGHV2-26*01_Homosapiens_F_V-REGION_164..464_301nt_1_____301+0=301___ from: 1 to: 301
Start End Strand Pattern Mismatch Sequence
Sequence: L21969_IGHV2-70*01_Homosapiens_F_V-REGION_144..444_301nt_1_____301+0=301___ from: 1 to: 301
Start End Strand Pattern Mismatch Sequence
176 184 + pattern:AA[CT]NNN[AT]CN . aatactaca
Sequence: X92241_IGHV2-70*02_Homosapiens_F_V-REGION_144..433_290nt_1_____290+0=290_partialin3'__ from: 1 to: 290
Start End Strand Pattern Mismatch Sequence
176 184 + pattern:AA[CT]NNN[AT]CN . aatactaca
Expected output:
cat file.txt
Sequence: HM855457_IGHV1-8*02_Homosapiens_F_V-REGION_24..319_296nt_1_____296+0=296__rev-compl_ from: 1 to: 296
Start End Strand Pattern Mismatch Sequence
217 225 + pattern:AA[CT]NNN[AT]CN . aacacctcc
Sequence: MG719312_IGHV1-8*03_Homosapiens_F_V-REGION_127..422_296nt_1_____296+0=296___ from: 1 to: 296
Start End Strand Pattern Mismatch Sequence
217 225 + pattern:AA[CT]NNN[AT]CN . aacacctcc
Sequence: L21969_IGHV2-70*01_Homosapiens_F_V-REGION_144..444_301nt_1_____301+0=301___ from: 1 to: 301
Start End Strand Pattern Mismatch Sequence
176 184 + pattern:AA[CT]NNN[AT]CN . aatactaca
Sequence: X92241_IGHV2-70*02_Homosapiens_F_V-REGION_144..433_290nt_1_____290+0=290_partialin3'__ from: 1 to: 290
Start End Strand Pattern Mismatch Sequence
176 184 + pattern:AA[CT]NNN[AT]CN . aatactaca
【Discussion】:
-
Can you show the expected output and what you have tried?
-
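Sigh. Adapting the awk script in my previous answer for this is trivial. This is what I was trying to warn you about with respect to using sed for tasks like this: now you need a completely different solution.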
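
For illustration, here is a minimal awk sketch of the buffering approach the question asks for. It is not the script from the earlier answer mentioned above, which is not shown here. The function name `filter_records` is hypothetical, and the sketch assumes every record begins with a line starting with "Sequence", followed by a line starting with "Start", and that data lines start with a digit. The header lines are held back and printed only once a data line turns up, so a header pair with no data line is silently dropped.

```shell
#!/bin/sh
# filter_records: hypothetical helper name, not from the original post.
# Reads the named files (or stdin) and prints only complete records.
filter_records() {
    awk '
        # A "Sequence" line starts a new record; any header still pending
        # from the previous record had no data line, so it is discarded.
        /^Sequence/ { hdr = $0; next }

        # Append the "Start ..." column-header line to the pending header.
        /^Start/    { hdr = (hdr == "" ? "" : hdr ORS) $0; next }

        # First data line of a record: flush the pending header, then print.
        # Later data lines of the same record are printed as they come.
        /^[0-9]/    { if (hdr != "") { print hdr; hdr = "" } print }
    ' "$@"
}

# Small demo on inline sample data; record B has no data line.
filter_records <<'EOF'
Sequence: A
Start End
1 2 +
Sequence: B
Start End
Sequence: C
Start End
3 4 +
EOF
```

On the file from the question this would be run as `filter_records file.txt > filtered.txt`. Lines that match none of the three patterns are dropped as well, which may or may not be what you want for other inputs.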