Algorithm to avoid SQL injection on MSSQL Server from C# code?
Solution 1:
There's no algorithm needed - just don't use string concatenation to build SQL statements. Use the SqlCommand.Parameters collection instead. This does all the necessary escaping of values (such as replacing '
with ''
) and ensures the command will be safe because somebody else (i.e. Microsoft) has done all the testing.
e.g. calling a stored procedure:
using (var connection = new SqlConnection("..."))
using (var command = new SqlCommand("MySprocName", connection))
{
command.CommandType = CommandType.StoredProcedure;
command.Parameters.AddWithValue("@Param1", param1Value);
return command.ExecuteReader();
}
This technique also works for inline SQL statements, e.g.
var sql = "SELECT * FROM MyTable WHERE MyColumn = @Param1";
using (var connection = new SqlConnection("..."))
using (var command = new SqlCommand(sql, connection))
{
command.Parameters.AddWithValue("@Param1", param1Value);
return command.ExecuteReader();
}
Solution 2:
Top 10 things we can do to be safe (No one of these will do it all.)
Adopt the notion that, "All data is evil." All data, even the data stored in the database or on our file system is suspect. Not just data input from apps outside our firewall like query strings, form fields, cookies, etc. Anything can be used to compromise a system.
Don't rely on client-side validation of javascript or html field lengths or even server-side web APIs that use client-side validation. Use it to improve usability, but don't rely on it as the sole guard. Know how validators provided by APIs like NET work. Don't take them for granted. There are ways around them.
Do positive matching to catch all data as it goes in. If the Data matches character ranges of a regular expression, then it's okay. This disallows weird unicode characters into our database that might accidentally delimit something in sql or create other problems like Homographic XSS/Phishing Attacks. In contrast, Negative matching requires lists of all the bad characters, which seem to grow all the time. This is a bad approach. Positive matching is better. We reject bad data, don't sanitize or escape it.
When possible, consider filtering, flagging, or catching string data with "update", "delete", "drop", "select", "alter", etc. This may not be possible given the nature of the string. "1212 Lemondrop Ln", "Waltersburg, PA", and "Table Rock, NE" are valid address fields. Running a daily scan of all table data for fields that match any of these could reveal delayed attacks or vulnerabilties. Also logging, ip banning, email alerts, etc etc could be used as the data comes inbound.
Use stored procedures and/or parameterized queries as much as possible. Avoid dynamic sql both in db client code and in sql. (Avoid exec statements with dynamic code with external sections in your stored procedures!!!) Parameterization will escape string terminators like the apostrophe, catch field lengths, and type check. We can't always rely on the APIs that provide parameterization to be perfect, but they are written by people much more aware of database idiosyncracies than most of us.
Be sure that no stray code is sitting around in a world readable/executable web directory. If it's not part of the active site, archive it somewhere secure and delete it from public view. Same goes for unused stored procedures.
Stay up to date on the database APIs. Some ways of executing SQL statements in some APIs are not as secure as others.
Store passwords securely with one-way encryption. This way, a table dump of usernames and passwords should still keep people out.
Harden the server in all the usual ways. For example, when possible, give least privilege on database tables. Limit access of web server database accounts strictly to the tables in question. Use read only as much as possible. Create multiple accounts that create a divide between the access rights of public and in-house/trusted traffic.
Catch errors gracefully. This goes for all code, not just code that uses the database. Sql injection attacks specifically do rely on error messages however and so it's desirable to hide as much as is possible about the database from the public. Always write code that handles exceptions or empty datasets in a vanilla fashion as to reveal as little as possible about what type of database we're using, what the fields are in our tables, or how what kind of queries we're running. Log errors on the server. Even in the non-database code, it's best to keep quiet about third party components, file folder structures, other services we may be running, etc. Giving malacious users as little information as possible is key to keeping them clueless.
And #11, always revisit/revise this list. Always be up to date. Be proactive. Make it an upfront priority and requirement, not an after thought.