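[Posted]: 2022-01-07 02:14:10
[Question]:
When I submit my Snakemake workflow to a cluster, I get the error 'Wildcards' object has no attribute 'output', similar to the earlier question "'Wildcards' object has no attribute 'output'". Do you have any suggestions for making the workflow cluster-compatible?
My rule annotate_snps works when I test it locally, but on the cluster I get the following error: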
input: results/CI226380_S4/vars/CI226380_S4_bwa_H37Rv_gatk.vcf.gz
output: results/CI226380_S4/vars/CI226380_S4_bwa_H37Rv_gatk_rename.vcf.gz, results/CI226380_S4/vars/CI226380_S4_bwa_H37Rv_gatk_tmp.vcf.gz, results/CI226380_S4/vars/CI226380_S4_bwa_H37Rv_gatk_ann.vcf.gz
log: results/CI226380_S4/vars/CI226380_S4_bwa_H37Rv_annotate_snps.log
jobid: 1139
wildcards: samp=CI226380_S4, mapper=bwa, ref=H37Rv
WorkflowError in line 173 of /oak/stanford/scg/lab_jandr/walter/tb/mtb/workflow/Snakefile:
'Wildcards' object has no attribute 'output'
My rule is defined as:
rule annotate_snps:
    input:
        vcf='results/{samp}/vars/{samp}_{mapper}_{ref}_gatk.vcf.gz'
    log:
        'results/{samp}/vars/{samp}_{mapper}_{ref}_annotate_snps.log'
    output:
        rename_vcf=temp('results/{samp}/vars/{samp}_{mapper}_{ref}_gatk_rename.vcf.gz'),
        tmp_vcf=temp('results/{samp}/vars/{samp}_{mapper}_{ref}_gatk_tmp.vcf.gz'),
        ann_vcf='results/{samp}/vars/{samp}_{mapper}_{ref}_gatk_ann.vcf.gz'
    params:
        bed=config['bed_path'],
        vcf_header=config['vcf_header']
    shell:
        '''
        # Rename Chromosome to be consistent with snpEff/Ensembl genomes.
        zcat {input.vcf} | sed 's/NC_000962.3/Chromosome/g' | bgzip > {output.rename_vcf}
        tabix {output.rename_vcf}
        # Run snpEff
        java -jar -Xmx8g {config[snpeff]} eff {config[snpeff_db]} {output.rename_vcf} -dataDir {config[snpeff_datapath]} -noStats -no-downstream -no-upstream -canon > {output.tmp_vcf}
        # Also use bed file to annotate vcf
        bcftools annotate -a {params.bed} -h {params.vcf_header} -c CHROM,FROM,TO,FORMAT/PPE {output.tmp_vcf} > {output.ann_vcf}
        '''
Thank you very much!
[Discussion]:
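As a reading aid for the error message above: it is an ordinary Python AttributeError, raised when something looks up an attribute named output on the wildcards object, which for this job only carries samp, mapper and ref. The sketch below reproduces the same message in plain Python. The Wildcards class and the try_format helper are hypothetical stand-ins written for illustration, not Snakemake's own code, and the sketch does not claim to identify where the lookup happens in this workflow or in the cluster submission setup.

```python
class Wildcards:
    """Hypothetical stand-in for a job's wildcards (not Snakemake's real class)."""

    def __init__(self, **values):
        # Store each wildcard as a plain attribute, e.g. wc.samp, wc.mapper.
        self.__dict__.update(values)


def try_format(template, wildcards):
    """Format the template; return the error message if an attribute lookup fails."""
    try:
        return template.format(wildcards=wildcards)
    except AttributeError as err:
        return str(err)


# Wildcard values copied from the job log in the question.
wc = Wildcards(samp="CI226380_S4", mapper="bwa", ref="H37Rv")

# An existing wildcard formats normally.
ok = try_format("results/{wildcards.samp}/vars", wc)

# A template that asks for wildcards.output fails with the message from the log.
bad = try_format("{wildcards.output}", wc)

print(ok)
print(bad)
```

If this reading applies, a place to look might be any string or function outside the rule body that refers to wildcards.output, for example in the cluster submission command or a cluster profile, since the rule shown in the question only uses {input...}, {output...}, {params...} and {config[...]}.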
Tags: cluster-computing wildcard snakemake