I have had an annoying problem for a while. In a database used for a statistical survey system reporting is painfully slow in the beginning of each reporting period.
The system
The tables contain a few million rows. Ola Hallengren’s index maintenance (which includes UPDATE STATISTICS) is running weekly. Each month is a new reporting period. When a new reporting period opens, there are no rows for the current period. From the first day of the month, we receive input, each input being less than 2000 new rows in the table.
The problem
Reporting of any previous period is always consistent in execution time – around 3 seconds to produce a full report. That’s an OK performance. But when reporting is done for current period early in a reporting period, execution takes up to 10 minutes.
The reason
Looking at the execution plans for the problem queries, I see that there are some lookups for current period, where the same query for a previous reporting period uses a scan operator. Looking at the estimated number of rows, compared to actual number of rows shows that the issue is bad statistics. The estimated plan shows 1 row from each of the lookups while actual shows some ten thousand rows. Since the queries always have Reporting Period as part of the WHERE-clause, it’s logical that this happens. We run UPDATE STATISTICS each week. When the statistics is collected, there are zero rows for current period. So when we add some thousand rows for current period, the statistics still shows zero rows for current period. This has been verified using DBCC SHOW_STATISTICS.
Possible solutions
I have been looking at different solutions to this problem. One would be to add a query hint, but the system uses an OR-mapper (Entity Framework) and I don’t even think it’s possible to add query hints to Linq2Entities. A stored procedure mapped as a function in Entity Framework would perhaps have worked, but I would rather solve the problem without having to deploy new .NET-code.
Next possible solution would be to enable trace flags 2389 and 2390, to tell SQL Server to try and identify ASCENDING columns and treat them differently in the optimizer (use average estimated number of rows for previous values). This might or might not solve our problem with this specific query (we run four consecutive UPDATE STATISTICS Before any increments to the period column so it’s still not sure that the column would be marked ASCENDING). But 2389 and 2390 have other problems, mainly that the trace flags are enabled on the whole instance, not just one database or one table.
What we did until recently was to manually run UPDATE STATISTICS when the users reported bad performance. Sometimes that wasn’t enough – we also had to rebuild an index containing the reporting period column. But that’s not really a long term solution either, since it requires a DBA/DBDev to be around to quickly respond to the situation.
The “final” solution
What we have done to answer to this problem as a long-term solution is to implement a SQL Agent job which compares distribution of values in the histogram from DBCC SHOW_STATISTICS with the actual values in the table. If the difference between histogram rowcount differs much (500% difference is the threshold we have used) the job updates statistics on the table and rebuilds a specific index. This job runs once each 10 minutes. When there’s no big difference, the job runs in a second. When statistics are updated and the index is being rebuilt, it takes a little longer, but using an online rebuild, it doesn’t affect system performance too much.
The job is very specific to the table(s) in our system. I have been thinking about making it more general, but haven’t had the time to really do it. It requires some logic to find out which column(s) to group on, which datatype the columns have etc. So for now, a solution specific to our table(s) will do.
Here’s the code for the job (obfuscated in terms of table/column names).
DECLARE @UpperThreshold numeric(10,0)=5; DECLARE @LowerThreshold numeric(10,0)=0.2; DECLARE @indexname sysname=N'ix_Period' CREATE TABLE #histogram ( RANGE_HI_KEY char(6), RANGE_ROWS bigint, EQ_ROWS bigint, DISTINCT_RANGE_ROWS bigint, AVG_RANGE_ROWS bigint); CREATE INDEX ix_range_hi_key ON #histogram(range_hi_key) INCLUDE (eq_rows); INSERT INTO #histogram EXEC ('DBCC SHOW_STATISTICS([myTable],' + @indexname + ') WITH HISTOGRAM'); --Now we have the statistics estimation of number of rows for each period in #histogram IF (SELECT count(*) FROM( SELECT h.range_hi_key,COALESCE(h.eq_rows,1) AS eq_rows,d.period,COUNT(*) AS cnt FROM #histogram h --RIGHT OUTER JOIN to the table in order to catch those periods not in the histogram at all RIGHT OUTER join myTable d ON h.RANGE_HI_KEY = d.period GROUP BY h.RANGE_HI_KEY, h.EQ_ROWS, d.period HAVING cast(count(*) AS numeric(10,0)) / cast(COALESCE(h.eq_rows,1) AS numeric(10,0)) > @UpperThreshold OR cast(count(*) AS numeric(10,0)) / cast(COALESCE(h.eq_rows,1) AS numeric(10,0))<@LowerThreshold) as t)>0 BEGIN PRINT 'Index ix_Period ON myTable is rebuilt and index for myTable is updated'; UPDATE STATISTICS myTable; EXEC('ALTER INDEX ' + @indexname + ' ON myTable REBUILD'); END DROP TABLE #histogram