【Question title】: Insert from string only inserting first row
【Posted】: 2021-04-19 19:11:59
【Question description】:

Every month we receive a set of plain-text files. They contain no line breaks or delimiters.

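I load each file into a table that has one large single-string column, and I am trying to insert substrings of that string into multiple columns and multiple rows of another table. But only the first row gets inserted...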

CREATE TABLE [dbo].[claims_stage]
(
     [stage] [nvarchar](max) NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

Insert from the flat file into claims_stage:

bulk insert CLAIMS_STAGE from '\\filepath\filename'

Insert the split string from CLAIMS_STAGE into the new table:

insert into
CCLF8
(
    BENE_MBI_ID,
    BENE_HIC_NUM,
    BENE_FIPS_STATE_CD,
    BENE_FIPS_CNTY_CD,
    BENE_ZIP_CD,
    BENE_DOB,
    BENE_SEX_CD,
    BENE_RACE_CD,
    BENE_AGE,
    BENE_MDCR_STUS_CD,
    BENE_DUAL_STUS_CD,
    BENE_DEATH_DT,
    BENE_RNG_BGN_DT,
    BENE_RNG_END_DT,
    BENE_1ST_NAME,
    BENE_MIDL_NAME,
    BENE_LAST_NAME,
    BENE_ORGNL_ENTLMT_RSN_CD,
    BENE_ENTLMT_BUYIN_IND,
    BENE_PART_A_ENRLMT_BGN_DT,
    BENE_PART_B_ENRLMT_BGN_DT,
    BENE_LINE_1_ADR,
    BENE_LINE_2_ADR,
    BENE_LINE_3_ADR,
    BENE_LINE_4_ADR,
    BENE_LINE_5_ADR,
    BENE_LINE_6_ADR,
    GEO_ZIP_PLC_NAME,
    GEO_USPS_STATE_CD,
    GEO_ZIP5_CD,
    GEO_ZIP4_CD, 
    [INSERT_DT],
    [FILE_NAME]
)
select
    substring(stage, 1, 11)     as 'BENE_MBI_ID',
    substring(stage, 12, 11)    as 'BENE_HIC_NUM',
    substring(stage, 23, 2)     as 'BENE_FIPS_STATE_CD',
    substring(stage, 25, 3)     as 'BENE_FIPS_CNTY_CD',
    substring(stage, 28, 5)     as 'BENE_ZIP_CD',
    substring(stage, 33, 10)    as 'BENE_DOB',
    substring(stage, 43, 1)     as 'BENE_SEX_CD',
    substring(stage, 44, 1)     as 'BENE_RACE_CD',
    substring(stage, 45, 3)     as 'BENE_AGE',
    substring(stage, 48, 2)     as 'BENE_MDCR_STUS_CD',
    substring(stage, 50, 2)     as 'BENE_DUAL_STUS_CD',
    substring(stage, 52, 10)    as 'BENE_DEATH_DT',
    substring(stage, 62, 10)    as 'BENE_RNG_BGN_DT',
    substring(stage, 72, 10)    as 'BENE_RNG_END_DT',
    substring(stage, 82, 30)    as 'BENE_1ST_NAME',
    substring(stage, 112, 15)   as 'BENE_MIDL_NAME',
    substring(stage, 127, 40)   as 'BENE_LAST_NAME',
    substring(stage, 167, 1)    as 'BENE_ORGNL_ENTLMT_RSN_CD',
    substring(stage, 168, 1)    as 'BENE_ENTLMT_BUYIN_IND',
    substring(stage, 169, 10)   as 'BENE_PART_A_ENRLMT_BGN_DT',
    substring(stage, 179, 10)   as 'BENE_PART_B_ENRLMT_BGN_DT',
    substring(stage, 189, 45)   as 'BENE_LINE_1_ADR',
    substring(stage, 234, 45)   as 'BENE_LINE_2_ADR',
    substring(stage, 279, 40)   as 'BENE_LINE_3_ADR',
    substring(stage, 319, 40)   as 'BENE_LINE_4_ADR',
    substring(stage, 359, 40)   as 'BENE_LINE_5_ADR',
    substring(stage, 399, 40)   as 'BENE_LINE_6_ADR',
    substring(stage, 439, 100)  as 'GEO_ZIP_PLC_NAME',
    substring(stage, 539, 2)    as 'GEO_USPS_STATE_CD',
    substring(stage, 541, 5)    as 'GEO_ZIP5_CD',
    substring(stage, 546, 4)    as 'GEO_ZIP4_CD',
    GETDATE()                   as 'INSERT_DT',
    'filename'                  as 'FILE_NAME'
from 
    CLAIMS_STAGE

Should I loop, and if so, how?

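【Comments】:

  • Why accept unstructured files like this in the first place?
  • @YitzhakKhabinsky It is fixed width, which is common on mainframe systems.
  • If the file contains no line breaks, then as far as bulk insert is concerned there is of course only one row/record.
  • So how many rows are in CLAIMS_STAGE after the bulk insert? Even a fixed-width file must have an end-of-line marker of some kind, and bulk insert lets you specify a ROWTERMINATOR.
  • By design, there is only one row in CLAIMS_STAGE. I am trying to insert substrings from that table into multiple records in CCLF8. There is no ROWTERMINATOR or delimiter of any kind. How do I tell the query to start the next row at the following character?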
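
The question raised in the comments (how many rows were staged, and how many records they hold) can be checked directly. The following is a minimal sketch, not part of the original thread. It assumes a record width of 549 characters, which is only what the offsets in the question imply (the last field starts at 546 and is 4 wide), and it assumes the data is plain single-width text in the nvarchar column declared above.

```sql
-- Sketch: inspect what BULK INSERT actually staged.
-- DATALENGTH / 2 is used instead of LEN because LEN ignores trailing spaces,
-- which matter in fixed-width data; nvarchar stores 2 bytes per character here.
select
    count(*) over ()              as staged_rows,
    datalength(stage) / 2         as total_chars,
    datalength(stage) / 2 / 549   as whole_records,   -- 549 is an assumed record width
    datalength(stage) / 2 % 549   as leftover_chars   -- non-zero hints at a terminator or a wrong width
from dbo.claims_stage;
```

If leftover_chars is not zero, the assumed width or the "no terminator" premise is probably worth rechecking before splitting the string.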

Tags: sql-server


【Solution 1】:

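Here is one way to parse the data. It needs a numbers/tally table: normally a permanent table with a single int column, starting at 1 and going up to a bazillion or whatever value you need.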
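
For reference, a permanent tally table of the kind described could be built roughly as follows. This is a sketch under assumptions: the name dbo.Numbers and the 1,000,000 row cap are illustrative, and it relies on the cross join of sys.all_objects producing at least that many rows, which is typical but not guaranteed.

```sql
-- Hypothetical permanent tally table; name and size are illustrative only.
create table dbo.Numbers (n int not null primary key);

insert into dbo.Numbers (n)
select top (1000000)
       row_number() over (order by (select null))   -- 1, 2, 3, ...
from sys.all_objects a
cross join sys.all_objects b;                        -- system view used only as a row source
```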

Here I just create it on the fly for the example:

/* Sample data, 3 "rows" of fixed-width data as a single row in table */

create table claims_stage (stage varchar(max));
insert into claims_stage
select Cast('  123' as char(5)) + cast('Homer' as Char(10)) + Cast('Simpson' as char(10)) + Cast('  456' as char(5)) + 
       Cast('   56' as char(5)) + cast('Bart' as Char(10)) + Cast('Simpson' as char(10)) + Cast('  888' as char(5)) +
       Cast('  122' as char(5)) + cast('Stuey' as Char(10)) + Cast('Griffin' as char(10)) + Cast('  60' as char(5));


with d as (select * from (values (0),(1),(2))n(n)), /* Tally table */
CCLF8 as (
  select Substring(stage, 1 + d.n * 30, 30) stage     /* Knowing the length of each row, */
  from claims_stage s                                 /* break it into multiple rows */
  cross join d
)
/* Now select our columns from it (same statement as the CTEs above) */
select 
    Substring(stage,1,5) Col1,
    Substring(stage,6,10) Col2,
    Substring(stage,16,10) Col3,
    Substring(stage,26,5) Col4    /* starts at 26: 5 + 10 + 10 = 25 characters precede it */
from CCLF8;
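
Applied to the table in the question, the same idea might look like the sketch below. It is not from the original answer. It assumes a 549-character record (implied by the question's offsets), the nvarchar staging column from the question, and an on-the-fly tally drawn from sys.all_objects; the names @rec_len, @rec_count, tally and rec are illustrative.

```sql
-- Sketch only: split one long fixed-width string into 549-character records.
declare @rec_len   int = 549;   -- assumed record width
declare @rec_count int;

-- Number of whole records in the longest staged string (integer division
-- drops any trailing partial record). DATALENGTH / 2 counts nvarchar characters
-- without discarding trailing spaces the way LEN would.
select @rec_count = isnull(max(datalength(stage)) / 2 / @rec_len, 0)
from dbo.claims_stage;

with tally as (                                   -- 0-based record index
    select top (@rec_count)
           row_number() over (order by (select null)) - 1 as n
    from sys.all_objects a
    cross join sys.all_objects b
),
rec as (                                          -- one row per fixed-width record
    select substring(s.stage, 1 + t.n * @rec_len, @rec_len) as stage
    from dbo.claims_stage s
    cross join tally t
    where (t.n + 1) * @rec_len <= datalength(s.stage) / 2
)
select
    substring(stage, 1, 11)   as BENE_MBI_ID,
    substring(stage, 12, 11)  as BENE_HIC_NUM,
    substring(stage, 546, 4)  as GEO_ZIP4_CD
    -- the remaining columns follow the select list in the question
from rec;
```

Once the output looks right, the final select can be used as the source of the insert into CCLF8 from the question, so no explicit loop is needed.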

【Comments】:

  • Thanks Stu, I'll give it a try. A bazillion is a big number!