I recently started seeing lots of implicit conversions in execution plans in a system for which I oversee databases. Implicit conversion happens when SQL Server has to compare two values with different data types. First, here’s an example of implicit conversion that does NOT affect performance all that much. I’m running the query in the AdventureWorks2014 database.
What is Implicit Conversion? Is it good or bad?
SELECT
BusinessEntityID,
FirstName,
MiddleName,
LastName
FROM
Person.Person
WHERE
LastName='Sánchez';
What’s gonna happen when this query executes? The SQL Server optimizer is going to find an index, IX_Person_LastName_FirstName_MiddleName, with the names as key columns and BusinessEntityID on the leaf level (BusinessEntityID is the clustering key, and the clustering key is always included on the leaf level of nonclustered indexes).
So the optimizer is going to choose to do an index seek on that covering index. But before seeking into the index, SQL Server has to CONVERT values, because ‘Sánchez’ is a varchar value and the LastName column is an nvarchar column. So implicit conversion is going to happen. The optimizer is going to convert the varchar value to nvarchar and then perform the seek operation. The reason is that nvarchar has higher data type precedence than varchar. This makes sense: nvarchar can store text that varchar can’t. Doing it the other way around (converting nvarchar to varchar) could lose data in the conversion and therefore give us the wrong results.
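The precedence rule and the risk of a lossy conversion can be tried out in isolation. The following is a minimal sketch, not taken from the system described here. It assumes a database whose default collation uses a single-byte code page such as Latin1; under a UTF-8 collation the second statement would not lose anything. The variable names and the sample text are made up for illustration.

```sql
-- Assumption: the database default collation uses a single-byte code page (e.g. Latin1).
DECLARE @v varchar(20)  = 'Sánchez';
DECLARE @n nvarchar(20) = N'Sánchez';

-- When varchar and nvarchar are mixed in one expression,
-- the result takes the type with the higher precedence.
SELECT SQL_VARIANT_PROPERTY(CASE WHEN 1 = 1 THEN @v ELSE @n END, 'BaseType') AS ResultType;

-- Converting the other way can lose characters that the varchar
-- code page cannot represent (they are typically replaced with '?').
SELECT CAST(N'日本語' AS varchar(20)) AS LossyValue;
```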
Here’s where you can see implicit conversion in the execution plan.
Will this implicit conversion affect the performance of our query? No, it won’t. At least not noticeably. Our constant is converted to nvarchar(4000) and then it can be used to seek into the index.
Let’s look at a more problematic example. Let’s run two queries against the Sales.Customer table.
SELECT
CustomerId,
AccountNumber
FROM
Sales.Customer
WHERE
AccountNumber = 'AW00000013';
SELECT
CustomerId,
AccountNumber
FROM
Sales.Customer
WHERE
AccountNumber = N'AW00000013';
What do you think will happen with these two very similar queries?
The first one will perform an index seek into the index AK_Customer_AccountNumber.
The second one will perform an index scan of the index AK_Customer_AccountNumber.
What happens here? The upper query performs just the way we expect: it uses the index and seeks into it. But in the lower query, we send in an nvarchar value. Since the table column is varchar, we get implicit conversion on the COLUMN, not on the parameter/constant expression. This is really bad. It means we are scanning every single row of the index and passing it on to the Filter operator, where the predicate is evaluated.
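One way to picture the difference is to write the conversion out explicitly. This is only a sketch of roughly what the implicit conversion amounts to, not the literal plan contents. It assumes AccountNumber is a varchar(10) column, which is what I expect in AdventureWorks2014; adjust the length if your schema differs.

```sql
-- Roughly what the second query amounts to: the conversion wraps the COLUMN,
-- so every row has to be converted before it can be compared.
SELECT CustomerID, AccountNumber
FROM Sales.Customer
WHERE CONVERT(nvarchar(10), AccountNumber) = N'AW00000013';

-- Converting the VALUE instead leaves the column untouched,
-- so the predicate can be used to seek into the index.
SELECT CustomerID, AccountNumber
FROM Sales.Customer
WHERE AccountNumber = CONVERT(varchar(10), N'AW00000013');
```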
Dapper
Now we have looked at what implicit conversion is. We also now know a little more about when it can be bad for us.
I said in the beginning that I saw implicit conversion in Query Store. And it wasn’t the good kind of implicit conversion, where parameters or constant values are converted. I saw implicit conversion on the table side of things, resulting in index scans.
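If you want to look for the same thing in your own system, a simple text search over the plans stored in Query Store is one place to start. This is a rough sketch that assumes Query Store is enabled on the database. It matches every plan containing CONVERT_IMPLICIT, including the harmless kind, so each hit still has to be inspected by hand.

```sql
-- Assumption: Query Store is enabled on the current database.
-- Lists queries whose stored plan contains an implicit conversion of any kind.
SELECT TOP (50)
    q.query_id,
    p.plan_id,
    qt.query_sql_text
FROM sys.query_store_plan AS p
JOIN sys.query_store_query AS q
    ON q.query_id = p.query_id
JOIN sys.query_store_query_text AS qt
    ON qt.query_text_id = q.query_text_id
WHERE p.query_plan LIKE N'%CONVERT_IMPLICIT%';
```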
Investigating things led me to queries sent from a service, in a code path using Dapper. Dapper is a library that can be used to map database results to application code objects. It’s relatively easy to use and lots of developers favour it over just using SqlClient.SqlCommand or other frameworks like Entity Framework.
Dapper. And parameters.
Dapper provides a really simple way to pass parameters to queries. It looks a little something like this:
var template = new SalesCustomer { AccountNumber = "AW00000013" };
var parameters = new DynamicParameters(template);
var sql = "select CustomerId, AccountNumber from Sales.Customer where AccountNumber = @AccountNumber";
using (var connection = new SqlConnection(connString))
{
var customers = connection.QuerySingle<SalesCustomer>(sql, parameters);
}
See how simple the parameter passing is? Just create a template from a SalesCustomer class with a specific AccountNumber and create a DynamicParameters object from this template. But we never specify a database type for the parameters. This means Dapper/.NET will choose a data type for us, based on the .NET type of each property. The query is sent as an sp_executesql call with defined parameters, and since AccountNumber is a .NET string, it will be passed in as an nvarchar(4000) parameter.
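To make that concrete, the call arriving at SQL Server looks roughly like the following. This is a sketch of the shape of the call based on the description above; the exact text you capture in a trace or Extended Events session may be formatted differently.

```sql
-- Sketch of the call generated for the untyped string parameter.
EXEC sp_executesql
    N'select CustomerId, AccountNumber from Sales.Customer where AccountNumber = @AccountNumber',
    N'@AccountNumber nvarchar(4000)',
    @AccountNumber = N'AW00000013';
```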
How is this bad?
As we saw in the execution plans above, when we send in an nvarchar value to a query that uses it to compare it to a varchar column, we will get implicit conversion on the table side of things. SQL Server will scan every row of the index and then apply the filter predicate. That’s not an effective use of a covering index. Effective use would have been an index seek.
What should we do then?
We saved a couple of characters of code by not specifying the datatype for the parameters when we created the Dapper code. But we make the database suffer every single time we run the query. What’s the correct way of doing this?
In Dapper, we could instead create an empty parameters collection and use the Add-method to add parameters, with datatype, direction and more. In Dapper, it would look a little something like this.
var parameters = new DynamicParameters();
var accountNumber = "AW00000013";
parameters.Add("@AccountNumber", accountNumber, DbType.AnsiString, ParameterDirection.Input, accountNumber.Length);
var sql = "select CustomerId, AccountNumber from Sales.Customer where AccountNumber = @AccountNumber";
using (var connection = new SqlConnection(connString))
{
var customers = connection.QuerySingle<SalesCustomer>(sql, parameters);
}
In the above example, we explicitly define the AccountNumber parameter as DbType.AnsiString, which will translate to varchar in SQL. Using DbType.String would give us an nvarchar parameter. But in our case, we want to pass a varchar parameter.
And while we’re at it, we’re setting the LENGTH of that varchar parameter. Because otherwise it’s going to be sent as varchar(8000). For the comparison itself this doesn’t matter too much. But the larger the data types sent in as parameters, the larger the memory grant you’ll need to execute the query. And if you’re going to explicitly set the data type, you might as well explicitly set the length too.
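With the data type and length specified, the call that reaches SQL Server should look roughly like this. Again, this is a sketch of the expected shape rather than a captured trace. The length is 10 here only because that is the length of the sample value.

```sql
-- Sketch of the call generated for the explicitly typed parameter.
-- The parameter type now matches the varchar column, so no conversion
-- of the column is needed.
EXEC sp_executesql
    N'select CustomerId, AccountNumber from Sales.Customer where AccountNumber = @AccountNumber',
    N'@AccountNumber varchar(10)',
    @AccountNumber = 'AW00000013';
```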
Conclusion
Be explicit. A few extra lines of code could potentially save you from that 2AM call when you’re on call, because that one important batch brought the database to its knees.
Use AnsiString for varchar and String for nvarchar.