Omitting summary statistics from -tabstat- by subgroup if they are missing values

【Question title】: Omitting summary statistics from -tabstat- by subgroup if they are missing values
【Posted】: 2015-09-21 09:38:37
【Question description】:

I am writing code to export tables of summary statistics to LaTeX using tabstat, by() and esttab.

Here is a toy example that replicates my data structure:

cls
clear all
set more off

sysuse auto, clear

// Create two groups to be used in the -by()- option
gen rep2="First dataset" if rep78>=3
replace rep2="Second dataset" if rep78<3 

// Recode "price" as completely missing in the first group
replace price=. if rep2=="First dataset"  

// Table
eststo: estpost tabstat weight price mpg trunk, ///
    column(statistics) statistics(count mean median sd) by(rep2) nototal

local sum_statistics "count(label(Observations)) mean(label(Mean) fmt(2)) p50(label(Median)) sd(label(Standard deviation) fmt(2))"

esttab using "table1.tex", replace type ///
    title("Summary Statistics")  ///
    cells("`sum_statistics'") ///   
    noobs nonum booktabs

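The output displays the summary statistics in two subtables, one for each dataset (as defined by rep2). The two datasets do not necessarily have the same variables: price is completely missing in the first dataset.

I would like to omit the row of summary statistics for price entirely, but only for "First dataset" (keeping it for "Second dataset"). Because price is missing throughout "First dataset", all of its summary statistics there are missing values. This amounts to omitting a whole row of summary statistics whenever "Observations" equals 0 for a given group.

I looked at the documentation for tabstat, but I am not quite sure how to proceed. Do I have to use the drop() option of estout?

Many thanks, S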

【Question discussion】:

Tags: latex stata

【Solution 1】:

As you mentioned, you can use the drop() option:

clear all
set more off

sysuse auto, clear

// Create two groups to be used in the -by()- option
gen rep2="First" if rep78>=3
replace rep2="Second" if rep78<3 

// Recode "price" as completely missing in the first group
replace price=. if rep2=="First"

// Table
eststo: estpost tabstat weight price mpg trunk, ///
    column(statistics) statistics(count mean median sd) by(rep2) nototal

local sum_statistics "count(label(Observations)) mean(label(Mean) fmt(2)) p50(label(Median)) sd(label(Standard deviation) fmt(2))"

esttab, replace type ///
    title("Summary Statistics")  ///
    cells("`sum_statistics'") ///   
    noobs nonum booktabs drop(First:price)

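This involves using the full name (group and variable, as in First:price), not just the variable name.

Note that I took the spaces out of the values of the grouping variable. Spaces seem to be troublesome when calling esttab, but I leave that for you to explore.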
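If the empty group/variable combinations are not known in advance, the list passed to drop() could be built in a loop rather than typed by hand. The following is only a sketch under a few assumptions: estout is installed, the values of rep2 contain no spaces (as above), and the local macro names droplist and dropopt are illustrative, not part of any package.

```stata
clear all
set more off
sysuse auto, clear

// Same setup as above: two groups, price entirely missing in "First"
gen rep2 = "First" if rep78 >= 3
replace rep2 = "Second" if rep78 < 3
replace price = . if rep2 == "First"

// Collect "group:variable" pairs that have no non-missing observations
local droplist
levelsof rep2, local(groups)
foreach g of local groups {
    foreach v of varlist weight price mpg trunk {
        quietly count if rep2 == "`g'" & !missing(`v')
        if r(N) == 0 {
            local droplist `droplist' `g':`v'
        }
    }
}
display "`droplist'"

// Only add drop() when there is something to drop (illustrative guard)
local dropopt
if "`droplist'" != "" {
    local dropopt drop(`droplist')
}

eststo clear
eststo: estpost tabstat weight price mpg trunk, ///
    column(statistics) statistics(count mean median sd) by(rep2) nototal

esttab, cells("count mean(fmt(2)) p50 sd(fmt(2))") ///
    noobs nonum `dropopt'
```

The loop uses count with !missing() so that a row is dropped only when a group has no usable observations for that variable, which matches the "Observations equals 0" condition in the question.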

【Discussion】:

The variable names are no big deal. Many thanks for the help!