SQL Azure Index Fragmentation with No Data Changes

Is it possible for a SQL table’s index to get fragmented even if its data doesn’t get updated, inserted or deleted?

Recently I found that this is the case in very specific circumstances. I was using a merge statement to update some existing data, though in my scenario the data had not changed so the merge statement shouldn’t have made any changes to the table data and the index fragmentation should not have changed dramatically.

It should also be noted this only happened with SQL Azure with an index fill factor of 0, but the same didn’t happen with my on premise SQL server. With SQL Azure, if my index fill factor is set to 0 the index gets heavily fragmented. The full results can be found at the end of this post but first I will present a couple of SQL scripts that recreate this index fragmentation.

Script 1 - Data insert into a test table with ~100,000 records.

Script 2 - Update test table with 30,000 records. The merge statement in the script will only update existing data (as we are selecting from the original data) but I have intentionally commented out the insert part of the merge to make sure we are only updating. The values being updated in this case are actually unchanged.

--Script 1:
 
--Do initial insert
IF EXISTS (SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = N'TestTable')
BEGIN
  DROP TABLE TestTable
END
 
CREATE TABLE [dbo].[TestTable]
(
       [UnitID] [int] not null,
       [ReportedDate] [datetime] not null,
       [Serial] [bigint] null,
       [Value] [smallint] NULL
)
GO
 
ALTER table [dbo].[TestTable] ADD CONSTRAINT [PK_TestTable] PRIMARY KEY CLUSTERED ([ReportedDate], [UnitID])
GO
 
declare @currentDate datetime = dateadd(day, -14, getdate())
declare @currentUnitID int = 1
declare @unitTotal int = 300
 
while (@currentDate < GETDATE())
begin
       set @currentUnitID = 1;
 
       while (@currentUnitID < @unitTotal)
       begin
              insert into TestTable
              (UnitID, ReportedDate, Serial, [Value])
              values
              (@currentUnitID, @currentDate, 1234, 50)
 
              set @currentUnitID = @currentUnitID + 1
       end
 
       set @currentDate = dateadd(hour, 1,@currentDate)
end
 
 
select COUNT(*) from [dbo].[TestTable]
 
--Get Fragmentation
SELECT a.index_id, name, avg_fragmentation_in_percent, b.fill_factor
FROM sys.dm_db_index_physical_stats (DB_ID(N'MSLMessagingTest'),  OBJECT_ID('TestTable'), NULL, NULL, NULL) AS a 
    JOIN sys.indexes AS b  ON a.object_id = b.object_id AND a.index_id = b.index_id;

--Script 2

--Do updates (approx 30,000) using merge
 
--Create table varible and populate
declare @UpdateTable TABLE
(
       [UnitID] [int] not null,
       [ReportedDate] [datetime] not null,
       [Serial] [bigint] null,
       [Value] [smallint] NULL
)
 
insert into @UpdateTable
(UnitID, ReportedDate, Serial, [Value])
select top 30000 *
from [dbo].TestTable t
order by t.[ReportedDate] desc
 
select COUNT(*) from @UpdateTable
 
MERGE INTO  [dbo].TestTable WITH (HOLDLOCK) AS target
USING @UpdateTable AS source
    ON target.ReportedDate = source.ReportedDate and target.UnitID = source.UnitID
WHEN MATCHED THEN
    UPDATE SET target.[Value] = source.[Value],
       target.[Serial] = source.[Serial]
	   /*
WHEN NOT MATCHED BY TARGET THEN
    INSERT ([ReportedDate], [UnitID],[Value], [Serial])
    VALUES (source.[ReportedDate], source.[UnitID], source.[Value], source.[Serial])*/;
 
 
--Get Fragmentation
SELECT a.index_id, name, avg_fragmentation_in_percent, b.fill_factor
FROM sys.dm_db_index_physical_stats (DB_ID(N'MSLMessagingTest'),  OBJECT_ID('TestTable'), NULL, NULL, NULL) AS a 
    JOIN sys.indexes AS b  ON a.object_id = b.object_id AND a.index_id = b.index_id;

Results (index fill factor set to 0):

  • Run script 1 to insert ~100,000 records:

    • Local SQL Fragmentation: 0.52%
    • Azure SQL Fragmentation: 0.53%
  • Run script 2 to ‘update’ the data

    • Local SQL Fragmentation: 0.52%
    • Azure SQL Fragmentation: 43.82%

As you can see the difference in fragmentation is very different with the Azure SQL database after we run the second script.

If I set the fill factor to 80% it helps to alleviate the fragmentation. In my real world case, the fragmentation creeps up my a fraction of a percentage per day and is a lot more manageable.

I reached out to Azure support and they responded with the following:

Product group has confirmed that with accelerated database recovery feature, the new behavior is expected. It you want the ability to disable in-row versioning for a table, we have a traceflag that can be used to disable in-row versioning (but it can cause performance issues on the updates as the versions would go off-row). Currently the traceflag is not exposed in golden bits. We cannot disable Accelerated Database Recovery feature as that is needed to handle long recovery issues during failovers due to deployment or other reasons. Possible workarounds are the following:

  • Setting the fillfactor. You are currently using this option.
  • Enabling the traceflag to disable in-row versioning on the logical server al together. Not recommended
Alex Orpwood Written by:

Software developing and architecting for 20 years. Satellite monitoring by day, writing my own app by night. More about me