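【Posted】: 2015-11-24 14:27:49
【Problem description】:
I have a legacy data source column that is delimited by semicolons and commas. Commas separate entries, one per person. Within each entry, semicolons separate three fields: the last name, the first and middle name (or initials), and the type of individual. Here is a sample of the data.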
+-------+---------------------------------------------------------------------------------------------------------------------+
| ID | SOURCE |
+-------+---------------------------------------------------------------------------------------------------------------------+
| 62963 | RENZ;MICHAEL;DECEASED,WANDER;MARIA;MINOR,WANDER;HENRY RUDOLPH;MINOR,WANDER;ROSA;MINOR,WANDER;PAUL EMIL;MINOR |
| 62964 | HERNDON;A C;ESTATE,BERRING;A F;DECEASED,BEIRING;A F;DECEASED,BEIRING;ANDREAS FREDERICK;DECEASED |
| 62965 | ZINCH;;ESTATE,ZINTZ;;ESTATE,HAYNES;HENRY;DECEASED |
| 62966 | KRAUS;JOSEPHINE;MINOR,KENNEDY;GEORGE;DECEASED |
| 62967 | CAREY;JAMES;ESTATE,DE LA GARZA;REFUGIO;DECEASED |
| 62968 | LEWIS;FLORENCE;ESTATE,LOCKWOOD;ALBERT A;DECEASED |
| 62969 | GLAESER;EMMA;MINOR,GLAESER;HERMAN JR;MINOR,GLAESER;HERMAN;MINOR,RODRIGUEZ;HILARIO;DECEASED,RODRIGUEZ;MARIE;DECEASED |
| 62970 | STORY;BETTIE;ESTATE,EIGENDORFF;FRANZ;DECEASED |
| 62971 | HOWELL;MAMIE;MINOR,HOWELL;ETHEL;MINOR |
+-------+---------------------------------------------------------------------------------------------------------------------+
I am trying to extract the data into the following form:
+-----------+------------+-------------+-------------------+----------+
| ID | SEQUENCE | LAST | FIRSTMIDDLE | TYPE |
+-----------+------------+-------------+-------------------+----------+
| 62963 | 1 | RENZ | MICHAEL | DECEASED |
| 62963 | 2 | WANDER | MARIA | MINOR |
| 62963 | 3 | WANDER | HENRY RUDOLPH | MINOR |
| 62963 | 4 | WANDER | ROSA | MINOR |
| 62963 | 5 | WANDER | PAUL EMIL | MINOR |
| 62964 | 1 | HERNDON | A C | ESTATE |
| 62964 | 2 | BERRING | A F | DECEASED |
| 62964 | 3 | BEIRING | A F | DECEASED |
| 62964 | 4 | BEIRING | ANDREAS FREDERICK | DECEASED |
| 62965 | 1 | ZINCH | | ESTATE |
| 62965 | 2 | ZINTZ | | ESTATE |
| 62965 | 3 | HAYNES | HENRY | DECEASED |
| 62966 | 1 | KRAUS | JOSEPHINE | MINOR |
| 62966 | 2 | KENNEDY | GEORGE | DECEASED |
| 62967 | 1 | CAREY | JAMES | ESTATE |
| 62967 | 2 | DE LA GARZA | REFUGIO | DECEASED |
| 62968 | 1 | LEWIS | FLORENCE | ESTATE |
| 62968 | 2 | LOCKWOOD | ALBERT A | DECEASED |
| 62969 | 1 | GLAESER | EMMA | MINOR |
| 62969 | 2 | GLAESER | HERMAN JR | MINOR |
| 62969 | 3 | GLAESER | HERMAN | MINOR |
| 62969 | 4 | RODRIGUEZ | HILARIO | DECEASED |
| 62969 | 5 | RODRIGUEZ | MARIE | DECEASED |
| 62970 | 1 | STORY | BETTIE | ESTATE |
| 62970 | 2 | EIGENDORFF | FRANZ | DECEASED |
| 62971 | 1 | HOWELL | MAMIE | MINOR |
| 62971 | 2 | HOWELL | ETHEL | MINOR |
+-----------+------------+-------------+-------------------+----------+
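This kind of data extraction is not something I'm very familiar with. I assume I need some complicated combination of SUBSTRING and CHARINDEX, but since the number of entries in the source column varies from row to row, I'm not sure how best to approach it. Any guidance on where to start would be very helpful.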
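To pin down the parsing rule independently of any SQL dialect, here is a minimal reference sketch in Python (not T-SQL). It assumes every entry contains exactly two semicolons and that names never contain a comma or semicolon; `Person` and `split_source` are hypothetical names used only for illustration. It may be useful for checking the output of whatever T-SQL split function or CLR function ends up doing the real work.

```python
from typing import List, NamedTuple


class Person(NamedTuple):
    """One output row: ID, SEQUENCE, LAST, FIRSTMIDDLE, TYPE."""
    id: int
    sequence: int
    last: str
    first_middle: str
    person_type: str


def split_source(row_id: int, source: str) -> List[Person]:
    """Split one SOURCE value into one Person per comma-separated entry."""
    people = []
    # enumerate(..., start=1) supplies the SEQUENCE number within this ID.
    for sequence, entry in enumerate(source.split(","), start=1):
        # Assumption: each entry is exactly LAST;FIRSTMIDDLE;TYPE.
        # An empty middle field (as in "ZINCH;;ESTATE") becomes "".
        last, first_middle, person_type = entry.split(";")
        people.append(Person(row_id, sequence, last, first_middle, person_type))
    return people


if __name__ == "__main__":
    for person in split_source(62965, "ZINCH;;ESTATE,ZINTZ;;ESTATE,HAYNES;HENRY;DECEASED"):
        print(person)
```

An entry with more or fewer than three fields raises a `ValueError` at the unpacking step, which makes malformed legacy rows visible instead of silently misaligned.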
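【Comments】:
-
Google "SQL split function"; there are literally thousands of examples.
-
Ideally, the purpose of this code is to fix this badly broken schema. You never want to find yourself storing delimited data in a column.
-
SQL was not designed for parsing strings; doing it there is really about the worst approach. Have you considered SQL CLR functions?
-
@RBarryYoung I'll look into that, thanks.
-
@JoelCoehoorn Exactly. I am trying to convert the source data we were given into a format that fits our new schema.
Tags: sql sql-server tsql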