[Posted]: 2020-06-19 17:40:28
[Question]:
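My database stores SSNs and phone numbers in a formatted form, so I need a way to take incoming data, whatever its format, and reformat it to match how my database stores those fields. The data I am migrating is imported ad hoc by end users from an external application into a temporary table, then restructured and manipulated for insertion into my client database.
I am having trouble processing this data without regular expressions. How can this kind of DML task be done in SQL Server? The required output for my two data types is listed below. I am struggling to convert my source data into these output formats.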
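Required insert output formats for storage
SSN: 123-45-6789
SSN: if 8 characters, pad with a leading zero
SSN: if fewer than 8 characters, pad with question marks "?" ... ???-??-1234 (don't ask)
Phone: 123-456-7890
Sample code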
WITH fakeCSVData AS
(
SELECT '111223333' AS SSN, '(444) 4444444' AS Phone UNION ALL
SELECT '211222121' AS SSN, '101 232-4545' AS Phone UNION ALL
SELECT '12334556' AS SSN, '(191) 330-4345' AS Phone UNION ALL
SELECT '41531' AS SSN, '(039) 084-8309' AS Phone UNION ALL
SELECT '220981278' AS SSN, '(298) 372-9234' AS Phone UNION ALL
SELECT '222013450' AS SSN, '(78) 909-7790' AS Phone UNION ALL
SELECT '123456789' AS SSN, '(717)_272-7277' AS Phone UNION ALL
SELECT '113344556' AS SSN, '210-973-2123' AS Phone UNION ALL
SELECT '808768252' AS SSN, '(219) 362-1895' AS Phone UNION ALL
SELECT '3456' AS SSN, '895 536-5356' AS Phone UNION ALL
SELECT '204874556' AS SSN, '(909) 544-9124' AS Phone UNION ALL
SELECT '80832934' AS SSN, '0271932132' AS Phone
)
SELECT
    CASE
        -- Already formatted as NNN-NN-NNNN: keep as-is
        WHEN LTRIM(RTRIM(csv.ssn)) LIKE '[0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]' THEN LTRIM(RTRIM(csv.ssn))
        -- Formatted as NN-NN-NNNN (one digit short): prepend a zero
        WHEN LTRIM(RTRIM(csv.ssn)) LIKE '[0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]' THEN RIGHT( REPLICATE('0', 1) + LTRIM(RTRIM( csv.ssn )), 11)
        -- Nine bare digits: insert the dashes
        WHEN LTRIM(RTRIM(csv.ssn)) LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]' THEN SUBSTRING(LTRIM(RTRIM(csv.ssn)),1,3) + '-' + SUBSTRING(LTRIM(RTRIM(csv.ssn)),4,2) + '-' + SUBSTRING(LTRIM(RTRIM(csv.ssn)),6,4)
        -- Eight bare digits: insert the dashes, then prepend a zero
        WHEN LTRIM(RTRIM(csv.ssn)) LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]' THEN RIGHT( REPLICATE('0', 1) + LTRIM(RTRIM( SUBSTRING(LTRIM(RTRIM(csv.ssn)),1,2) + '-' + SUBSTRING(LTRIM(RTRIM(csv.ssn)),3,2) + '-' + SUBSTRING(LTRIM(RTRIM(csv.ssn)),5,4) )), 11)
        -- Anything else ending in four digits: mask everything but the last four
        WHEN RIGHT(LTRIM(RTRIM(csv.ssn)),4) LIKE '%[0-9][0-9][0-9][0-9]' THEN '???-??-' + RIGHT(LTRIM(RTRIM(csv.ssn)),4)
    END AS SocSecNo
    -- Phone: drop parentheses, trim, turn spaces into dashes, keep at most 12 characters
    , NULLIF(LEFT( REPLACE( LTRIM(RTRIM( REPLACE(REPLACE(csv.Phone, ')', ''), '(', '') )), ' ' , '-') , 12), '') AS Phone
FROM fakeCSVData csv
Current output of the sample code
SocSecNo | Phone
--------------------------
111-22-3333 | 444-4444444
211-22-2121 | 101-232-4545
012-33-4556 | 191-330-4345
???-??-1531 | 039-084-8309
220-98-1278 | 298-372-9234
222-01-3450 | 78-909-7790
123-45-6789 | 717_272-7277
113-34-4556 | 210-973-2123
808-76-8252 | 219-362-1895
???-??-3456 | 895-536-5356
204-87-4556 | 909-544-9124
080-83-2934 | 0271932132
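I keep thinking that if I just had a simple way to first strip all non-numeric characters from the incoming source data, I could then format the string as needed... but I have not found any native SQL Server function that does this.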
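To make that idea concrete, here is a minimal sketch of the "strip first, format second" approach. It is an illustration under assumptions, not a tested solution: the function name dbo.StripNonDigits and the VARCHAR(100) length are hypothetical choices of mine, and GO is the batch separator used by SSMS and sqlcmd rather than a T-SQL statement. The function removes one non-digit character at a time with PATINDEX and STUFF until none remain.

```sql
-- Hypothetical helper (name and length are assumptions): removes every
-- character that is not 0-9 from the input string.
CREATE FUNCTION dbo.StripNonDigits (@input VARCHAR(100))
RETURNS VARCHAR(100)
AS
BEGIN
    -- Position of the first non-digit character, or 0 if there is none
    DECLARE @pos INT = PATINDEX('%[^0-9]%', @input);

    WHILE @pos > 0
    BEGIN
        -- Delete that single character, then look for the next one
        SET @input = STUFF(@input, @pos, 1, '');
        SET @pos = PATINDEX('%[^0-9]%', @input);
    END;

    RETURN @input;  -- NULL input falls through the loop and stays NULL
END;
GO

-- Format the cleaned digit strings by inserting dashes with STUFF.
WITH cleaned AS
(
    SELECT dbo.StripNonDigits(csv.SSN)   AS ssnDigits,
           dbo.StripNonDigits(csv.Phone) AS phoneDigits
    FROM (VALUES ('12334556', '(191) 330-4345'),
                 ('41531',    '(717)_272-7277')) AS csv (SSN, Phone)
)
SELECT
    CASE
        -- 9 digits: NNN-NN-NNNN
        WHEN LEN(ssnDigits) = 9 THEN STUFF(STUFF(ssnDigits, 4, 0, '-'), 7, 0, '-')
        -- 8 digits: pad with one leading zero, then insert dashes
        WHEN LEN(ssnDigits) = 8 THEN STUFF(STUFF('0' + ssnDigits, 4, 0, '-'), 7, 0, '-')
        -- 4 to 7 digits: mask everything except the last four
        WHEN LEN(ssnDigits) BETWEEN 4 AND 7 THEN '???-??-' + RIGHT(ssnDigits, 4)
        -- anything else yields NULL
    END AS SocSecNo,
    -- 10 digits: NNN-NNN-NNNN; other lengths yield NULL
    CASE WHEN LEN(phoneDigits) = 10
         THEN STUFF(STUFF(phoneDigits, 4, 0, '-'), 8, 0, '-')
    END AS Phone
FROM cleaned;
```

With the two sample rows above, this should return 012-33-4556 with 191-330-4345, and ???-??-1531 with 717-272-7277. Phone values that do not reduce to exactly 10 digits (such as '(78) 909-7790') come back as NULL in this sketch, so they can be flagged for review rather than silently stored in the wrong shape. A scalar function in a loop may be slow on large imports, so this is probably only reasonable for staging-table volumes.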
[Discussion]:
Tags: sql-server data-migration dml