SQL Server - design a deduplication SQL statement using NULLs vs MAX value
I'm trying to sculpt a SQL statement to de-duplicate a table.
The table has 3 keys: key1, key2, key3, used as the business key. A date is also being used.
The rules (assuming matches on key1, key2, key3):
If all rows have dates, retain the row with max(date).
If 1 row has a date and the others are null, retain the row with the date.
If all rows have date = null, keep all rows.
I've been using this code as a basis:
with cte as (
    select [key1], [key2], [key3], [date],
           rn = row_number() over (partition by [key1], [key2], [key3], [date] order by [date] desc)
    from dbo.table1
)
delete from cte
where rn > 1
I'm not educated on how to apply these rules in a SQL statement. Any wisdom is appreciated.
Example deduplications:
Case 1:
Before deduplication:
key1 key2 key3 date
1 1 null
1 1 null
1 1 null
After deduplication:
key1 key2 key3 date
1 1 null
1 1 null
1 1 null

Case 2:
Before deduplication:
key1 key2 key3 date
1 1 1/1/2016
1 1 1/1/2016
1 1 1/1/2016
After deduplication:
key1 key2 key3 date
1 1 1/1/2016

Case 3:
Before deduplication:
key1 key2 key3 date
1 1 1/1/2016
1 1 1/2/2016
1 1 1/3/2016
After deduplication:
key1 key2 key3 date
1 1 1/3/2016

Case 4:
Before deduplication:
1 1 1/1/2016
1 1 1/1/2016
1 1 null
After deduplication:
key1 key2 key3 date
1 1 1/1/2016

Case 5:
Before deduplication:
1 1 1/1/2016
1 1 1/2/2016
1 1 null
After deduplication:
key1 key2 key3 date
1 1 1/2/2016
with cte as (
    select [key1], [key2], [key3], [date],
           rn  = row_number() over (partition by [key1], [key2], [key3] order by [date] desc),
           mxd = max([date]) over (partition by [key1], [key2], [key3])
    from dbo.table1
)
delete from cte
where rn > 1
  and mxd is not null
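One way to check the statement above against the five example cases is to run it on a throwaway temp table first. The following is only a sketch under stated assumptions: the table name #dedup_demo, the int/date column types, and the key values are made up for illustration (key1 is used here as the case number so the five groups stay separate), and the dates are read as month/day/year.

```sql
-- Hypothetical demo table; name, column types and key values are assumptions,
-- not taken from the original post.
create table #dedup_demo ([key1] int, [key2] int, [key3] int, [date] date);

-- key1 doubles as the case number so that each example forms its own group.
insert into #dedup_demo ([key1], [key2], [key3], [date]) values
    (1, 1, 1, null),       (1, 1, 1, null),       (1, 1, 1, null),        -- case 1: all null
    (2, 1, 1, '20160101'), (2, 1, 1, '20160101'), (2, 1, 1, '20160101'),  -- case 2: same date
    (3, 1, 1, '20160101'), (3, 1, 1, '20160102'), (3, 1, 1, '20160103'),  -- case 3: different dates
    (4, 1, 1, '20160101'), (4, 1, 1, '20160101'), (4, 1, 1, null),        -- case 4: same date plus null
    (5, 1, 1, '20160101'), (5, 1, 1, '20160102'), (5, 1, 1, null);        -- case 5: different dates plus null

with cte as (
    select [key1], [key2], [key3], [date],
           -- descending order puts the latest date first; SQL Server sorts nulls last here
           rn  = row_number() over (partition by [key1], [key2], [key3] order by [date] desc),
           -- mxd is null only when every row in the group has a null date
           mxd = max([date]) over (partition by [key1], [key2], [key3])
    from #dedup_demo
)
delete from cte
where rn > 1
  and mxd is not null;   -- groups with only null dates are left untouched

-- Inspect what survived the delete.
select [key1], [key2], [key3], [date]
from #dedup_demo
order by [key1], [date];

drop table #dedup_demo;
```

With these assumed values, the key1 = 1 group should keep its three null rows, and each of the other groups should keep a single row carrying its latest date, which lines up with the "after deduplication" examples. Replacing the delete with a select from the cte using the same where clause would preview the rows to be removed without changing anything.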