Programmatically copying data between cases using SPSS

【Question title】: Programmatically copying data between cases using SPSS
【Posted】: 2013-10-18 18:58:07
【Question】:

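For a school project, I find myself working with data from the Census Bureau's Current Population Survey. I chose SPSS to work with the data because it seemed like the easiest software to jump straight into given my limited time frame. Everything has been straightforward except for one operation that is giving me trouble.

For each case in my dataset (each case represents one surveyed individual), the following relevant variables are defined:

Household ID (HHID) - a unique number for each surveyed household
Person ID (PID) - a unique number for each individual within the household
The person's age (AGE)
Whether the person receives public health insurance - a 0 or 1 (HASHEALTH)
The PID of the person's father, if present in the household (0 if not present) (POPNUM)
The PID of the person's mother, if present in the household (0 if not present) (MOMNUM)

The problem: I need to set KIDHASHEALTH for any given parent to the HASHEALTH value of the youngest person whose HHID matches the current case's HHID and whose POPNUM or MOMNUM matches the current case's PID. Functionally, that person is their youngest child.

So far I have not been able to figure out how to do this with SPSS syntax. Can anyone think of a way to accomplish what I am trying to do, with syntax or otherwise?

Many thanks in advance.

Edit with sample data: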

HHID |PID |AGE |POPNUM |MOMNUM |HASHEALTH |KIDHASHEALTH
-----+----+----+-------+-------+----------+------------
1    |1   |45  |0      |0      |0         |0 //KIDHASHEALTH == 0 because
1    |2   |48  |0      |0      |0         |0 //youngest child's HASHEALTH == 0
1    |3   |13  |1      |2      |0         |0
2    |1   |33  |0      |0      |0         |1 // == 1 because youngest child's
2    |2   |28  |0      |0      |0         |1 // HASHEALTH == 1
2    |3   |15  |1      |2      |0         |0
2    |4   |12  |1      |2      |1         |0
-----+----+----+-------+-------+----------+------------
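
For cross-checking results outside SPSS, the rule described above can be written as a direct lookup. The following is a minimal Python sketch, not SPSS syntax. The function name `kid_has_health` and the one-dict-per-case layout are assumptions made for illustration. It also assumes that real PIDs are never 0 and that cases without children get 0. If two children tie for youngest, the first one in file order is used.

```python
from collections import defaultdict


def kid_has_health(cases):
    """Return KIDHASHEALTH for each case, in input order.

    A case's children are the cases in the same household whose POPNUM
    or MOMNUM equals this case's PID. Cases without children get 0.
    """
    # Group cases by household so each lookup only scans one household.
    households = defaultdict(list)
    for case in cases:
        households[case["HHID"]].append(case)

    result = []
    for person in cases:
        children = [
            c for c in households[person["HHID"]]
            if person["PID"] in (c["POPNUM"], c["MOMNUM"])
        ]
        if children:
            youngest = min(children, key=lambda c: c["AGE"])
            result.append(youngest["HASHEALTH"])
        else:
            result.append(0)
    return result


# The sample data from the question (without the KIDHASHEALTH column).
columns = ("HHID", "PID", "AGE", "POPNUM", "MOMNUM", "HASHEALTH")
rows = [
    (1, 1, 45, 0, 0, 0),
    (1, 2, 48, 0, 0, 0),
    (1, 3, 13, 1, 2, 0),
    (2, 1, 33, 0, 0, 0),
    (2, 2, 28, 0, 0, 0),
    (2, 3, 15, 1, 2, 0),
    (2, 4, 12, 1, 2, 1),
]
sample = [dict(zip(columns, row)) for row in rows]

print(kid_has_health(sample))  # [0, 0, 0, 1, 1, 0, 0]
```

The printed list matches the KIDHASHEALTH column in the sample table, row by row.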

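【Comments】:

Since this is purely about programming, it is off-topic for this site but would be on-topic on StackOverflow. Given your description, I suspect you can do this either with AGGREGATE or by using CASESTOVARS to reshape the dataset to one row per household. It would be easier to help if you provided an example of what your data look like (I have a hard time following all the different variables you list and how they relate to each other).
...you should give a snippet of the data and a snippet of the desired result.
@AndyW You're right, this would probably do better on StackOverflow. I've flagged it to be moved, and I'd appreciate it if someone would migrate it to SO. I'll edit in some sample data now.
@ttnphns Edited with sample data.
@AndyW Edited with sample data.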

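Tags: spss

【Solution 1】:

The following code was tested only on your small data snippet, so there is no guarantee that it handles every peculiarity of the full data. The code assumes that AGE is an integer.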

*Add small fractional noise to AGE for every case with HASHEALTH=1,
*so that the health flag is encoded right in the age number.
if hashealth age= age+rv.unif(-.1,+.1).

*Turn to fathers. Combine POPNUM and PID numbers in one column.
compute parent= popnum. /*Copy POPNUM as a new var PARENT.
if parent=0 parent= pid. /*and if the case is not a child, fill there PID.
*Now a father and his children have the same code in PARENT
*and so we can propagate the minimal age in that group (which is the age of the
*youngest child, provided the man has children) to all cases of the group,
*including the father.
aggregate /outfile= * mode= addvari
          /break= hhid parent /*breaking is done also by household, of course
          /youngage1= min(age). /*The variable showing that minimal age.
*Turn to mothers and do the same thing.
compute parent= momnum.
if parent=0 parent= pid.
aggregate /outfile= * mode= addvari
          /break= hhid parent
          /youngage2= min(age). /*The variable showing that minimal age.
*Take the minimal value from the two passes.
compute youngage= min(youngage1,youngage2).

*Compute binary KIDHASHEALTH variable.
*Remember that YOUNGAGE is not integer if that child has HASHEALTH=1.
compute kidhashealth= 0.
if popnum=0 and momnum=0 /*if we deal with a parent
   and age<>youngage /*and the YOUNGAGE value is not their own age
   and rnd(youngage)<>youngage kidhashealth= 1. /*and that age isn't integer, assign 1.
compute age= rnd(age). /*Restore integer age
exec.
delete vari parent youngage1 youngage2 youngage.
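
The trick in the syntax above is to carry HASHEALTH inside AGE as a fractional part, so a single group minimum returns both the youngest age and that child's flag. The Python sketch below mirrors those steps for illustration only. It swaps the random noise for a fixed +0.1 offset, and the name `kid_has_health_via_min` and the dict layout are hypothetical. Like the SPSS code, it assumes integer ages.

```python
def kid_has_health_via_min(cases):
    """Mirror of the encode / group-minimum / decode steps, assuming integer AGE."""
    # Encode HASHEALTH in the fractional part of AGE. A fixed +0.1 is used
    # here; the SPSS code adds random noise between -0.1 and +0.1 instead.
    coded = [c["AGE"] + (0.1 if c["HASHEALTH"] else 0.0) for c in cases]

    def group_min(parent_field):
        # One AGGREGATE pass: break by (HHID, PARENT) and attach the group
        # minimum of the coded age to every case in the group.
        keys = []
        mins = {}
        for case, age in zip(cases, coded):
            parent = case[parent_field] or case["PID"]  # 0 means "use own PID"
            key = (case["HHID"], parent)
            keys.append(key)
            mins[key] = min(age, mins.get(key, age))
        return [mins[key] for key in keys]

    young_pop = group_min("POPNUM")  # father pass
    young_mom = group_min("MOMNUM")  # mother pass

    result = []
    for case, age, y1, y2 in zip(cases, coded, young_pop, young_mom):
        young = min(y1, y2)
        no_parents_listed = case["POPNUM"] == 0 and case["MOMNUM"] == 0
        not_own_age = young != age             # someone younger is in the group
        flag_encoded = young != round(young)   # fractional part means HASHEALTH=1
        result.append(int(no_parents_listed and not_own_age and flag_encoded))
    return result


columns = ("HHID", "PID", "AGE", "POPNUM", "MOMNUM", "HASHEALTH")
rows = [
    (1, 1, 45, 0, 0, 0),
    (1, 2, 48, 0, 0, 0),
    (1, 3, 13, 1, 2, 0),
    (2, 1, 33, 0, 0, 0),
    (2, 2, 28, 0, 0, 0),
    (2, 3, 15, 1, 2, 0),
    (2, 4, 12, 1, 2, 1),
]
sample_cases = [dict(zip(columns, row)) for row in rows]
childless = [dict(zip(columns, (3, 1, 30, 0, 0, 1)))]  # insured, no children

print(kid_has_health_via_min(sample_cases))  # [0, 0, 0, 1, 1, 0, 0]
print(kid_has_health_via_min(childless))     # [0]
```

The second call shows why the own-age check matters: without it, a childless person whose own HASHEALTH is 1 would be read as their own youngest child and get a 1.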

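【Comments】:

This is fantastic, thank you very much. The only problem I found is that people with no children end up with kidhashealth = 1, but that should be a quick fix; I'll post the result here.
Ask if you find it hard to fix.
It was a simple matter of adding "and age<>youngage" to the last condition, so that people without children are not treated as their own child, which threw things off when their own hashealth=1. I'll edit the answer to reflect this.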