Why I Avoid Stored Procedures (And You Should Too)

Subtitle:  Object Relational Mappers are the New Standard Practice for Application Development

Okay – DBA’s are already upset after reading the title.  I understand.  Stored procedures have been the standard practice of most professional software developers for more than a decade.  But like my recommendations on Flash, there are newer and better options than some old tried-and-true technologies.

I’ll have you know that I’m not an extremist on this subject.  I have personally worked on many applications that use stored procedures, and I have written plenty of them myself to follow a client’s “established best practices.”  Stored procedures still have a place in some scenarios.  However, given a choice, I avoid stored procedures as often as possible.

Why Do Developers Use Stored Procedures?

Traditionally, there have been a lot of reasons given for the use of stored procedures.  The most popular arguments include:

  • Performance – The query plan for stored procedures is compiled in SQL Server so that subsequent requests can run faster.  Also, a single stored procedure can perform multiple SQL commands, reducing traffic between an application and the database server.
  • Security – Stored procedures are well defined database objects that can be locked down with security measures.  Use of typed parameters can help prevent SQL injection attacks.
  • Code Re-Use – Database queries can be written once and re-used multiple times without writing the same SQL commands over and over again.
  • Business Logic Encapsulation – The database can house a lot of business logic so that the brain of your application is kept “neatly in one place.”

Ten years ago, I was completely on-board with this train of thought.  But around 2005, I started working with Object Relational Mappers (ORMs) and it changed my whole thinking about stored procedures.

What is an Object Relational Mapper?

An ORM is a code-based tool for application developers to work with databases.  The purpose of an ORM is to create a code representation of the data model.  Once you have this code in place you can access your database without writing a line of SQL code.  No stored procedures (or ad-hoc SQL) needed.  A lengthy study is beyond the scope of this article, but you can read about ORMs on Wikipedia.

Oh CRUD

ORMs are very efficient at Create [INSERT], Read [SELECT], Update and Delete (CRUD) operations.  Usually at least 90% of SQL commands used in an application are simple CRUD operations.  ORMs automatically write parameterized queries for you so that you never have to spend time writing CRUD from scratch.

Writing stored procedures for simple operations is a waste of time and it muddies up the database.  I cringe when I see a list of 500 stored procedures in a single database.  I cringe further when I discover that most of the stored procedures contain mind-numbingly simple queries like “select * from [tablename] where [col1]=@param” and “update [tablename] set [col 1]=@param where [id]=@userId”.

It drives me batty when I know that someone took the time to write all of those stored procedures, then an application developer manually wrote code to call the stored procedure, convert and pass in parameters, check the result and pass back a DataSet.  I commonly see ~20 lines of C# or VB application code to call a simple one-line stored procedure.

With LINQ, I can accomplish more functionality with less code.  By writing “var result = context.TableName.Where(tn => tn.UserId == userId);” the data context writes a parameterized SQL query, the query is executed, the result is returned and translated into a strongly-typed set of objects (much better than a DataSet).  I get more benefit out of one line of LINQ code than a stored procedure coupled with 20 lines of code to call it.

Good N-Tier Application Design

N-Tier Application DiagramIf you are familiar with n-tier (or multi-tier) application design, you know that the business layer belongs in the middle of your application stack.  In other words, business logic should be compiled application code that is testable via automated unit tests.

Your data access layer should be comprised of an ORM and basic repository methodology.  Some people would throw stored procedures into the diagram as a second data access layer, but this is unnecessary and (IMHO) incorrect.

A database should be limited to the role of a “persistence layer” – a technology agnostic storage mechanism.  When business logic is scattered through stored procedures, database triggers and application code, the n-tier model is broken.

Since T-SQL is technology-specific, your stored procedures would need to be re-written in order to migrate to MySQL, Oracle or another database.  If you avoid stored procedures and use an ORM like NHibernate, moving to a new database architecture is basically as simple as changing a connection string (okay, maybe not quite that simple in the real world).

Benefits of Object Relational Mappers

Every ORM is different, some with more benefits than others.  My personal favorite is still PLINQ for LINQ-to-SQL, and there are versions for Entity Framework and NHibernate now.

These are the most common benefits to look for in a good ORM:

  • Less Hand-Written Code – There is substantial time and cost savings when using a good ORM versus hand-writing stored procedures and stored procedure calls.
  • Rapid Application Development – Developers are all under pressure to build more features with less time.  Tools like PLINQO offer code generation with intelligent re-generation.  This is extremely important because PLINQO can update your data layer to match the database with two clicks of a mouse.  If you’ve been using stored procedures and your data model changes, you’re in for a lot of fun tracking down and fixing all of the references.
  • Maintainability & Refactoring – The foundational principle of agile development is the expectation that applications will need to be changed in the future.  The data model and business rules will change.  If you use a good ORM, these changes can be handled quickly and easily with a refactoring tool (like ReSharper).  Compiling your application is often enough to verify that your changes were done properly.  But if you used stored procedures as the basis of your data access and/or business logic layer, you will be frightened to make any changes for fear of breaking something.  This “stored procedure paralysis” leads to hacks and workarounds instead of properly addressing new requirements.  I just saw this happen a few weeks ago on a client’s project.
  • Performance – A LINQ data context uses deferred execution for selects, and it batches record updates and deletes in a single database round-trip on Submit().  Good ORMs offer additional performance enhancements like caching, futures and batch operations (select, update and delete).  These can have a dramatic impact on application speed.
  • Proper N-Tier Design – Your database should merely be a “persistence layer.”  It should not contain the business logic of your application.  Using an ORM helps you architect a solution where your business logic resides in strongly typed managed code (instead of being scattered through stored procedures and triggers).  This keeps the separation of concerns more clear, avoiding a broken or inverted n-tier structure.
  • Business Logic Encapsulation – The data entities that are generated by your ORM are effectively business objects.  Instead of working with “dumb DataSets,” these entities can enforce business rules that can interact with your presentation layer and data layer.

Common Objections to Object Relational Mappers

  • “Stored procedures are faster.” – In speed tests that I’ve seen, parameterized queries perform just as well as stored procedures (sometimes better).  From what I’ve read, SQL Server does cache the execution plan of parameterized queries.
  • “LINQ syntax is not as flexible as T-SQL.” – This is true but misleading.  95% of T-SQL queries I’ve seen can be accomplished easily in LINQ syntax (even aggregate functions).  If you look at some sample LINQ commands you may be surprised what is possible.  Also, you can execute custom SQL or call a stored procedure from LINQ and return the results in a strongly-typed object.
  • “ORMs add too much overhead.” – Compared to what?  If you’re comparing with straight ADO.NET and a DataReader, there is a slight performance hit.  If you’re comparing with explicit hydration of custom business objects, there is likely a performance gain.  Plus, good use of caching, futures and batch operations more than make up for any overhead.  Remember that data access speed is not the only measure of good application design.
  • “ORMs pull unnecessary data.” – By default most ORMs will return every column in the table(s) that you requested.  However, most ORMs support projection queries where you can limit the columns returned.  The results of your projection are returned into a strongly typed collection.
  • “Stored procedures still provide better security.” – If you need to give data access to someone other than your development team, you should use a web service.  As for SQL injection attacks, parameterized queries prevent this problem too.
  • “ORMs should only be used to call stored procedures.” – Most ORMs can call stored procedures, so these two technologies are not mutually exclusive.  However, calling stored procedures from an ORM negates many of the benefits of ORM technology and design patterns.  It adds work and complexity while reducing benefits.

When Should I Use Stored Procedures?

I see several cases where stored procedures are still acceptable:

  • You’re doing data warehousing / ETL / data aggregation.  An ORM is usually not the right tool for this kind of bulk data management.  You should consider a tool like SQL Server Integration Services (SSIS), BizTalk, or another dedicated ETL tool.
  • You’re working on a legacy application where everything is already built using stored procedures.  Sometimes you have to go with the flow.  Sometimes adding a second tool is worse than supporting the wrong tool.
  • You need to feed data to SQL Server Reporting Services (SSRS).  Strangely, the CLR integration for SQL Server does not have LINQ support, so T-SQL is your only option with SSRS.  But frankly, SSRS isn’t always the best choice either.  There are a lot of good third party tools for report generation to HTML, PDF and Excel that can interface with your ORM.  Since SSRS is not available in the Express Edition of SQL Server (and definitely not available in MySQL or Oracle) you are limiting your deployment options again.
  • Your query absolutely cannot be performed with LINQ syntax.  This is rare, but does happen occasionally.  Thankfully, stored procedures or custom SQL can be called from a LINQ data context.
  • You need to perform a lot of complex SQL statements from a small set of input.  Although there are lots of ways to handle batch operations with PLINQO, sometimes “you gotta do what you gotta do” to address a performance issue.
  • You need to push thousands of records to SQL in a single statement.  This is most efficient through table-valued stored procedure parameters in SQL Server.

I hope you find this information useful.  Happy coding!

Kick it on DotNetKicks.com [Slashdot] [Digg] [Reddit] [del.icio.us] [Facebook] [Technorati] [Google] [StumbleUpon]

Version 4.0.1 of PLINQO Released

Okay, I’m a little behind on blogging. Since I previously wrote multiple posts regarding PLINQO (Professional LINQ-to-SQL), I wanted to drop a quick update that version 4.0.1 was recently released. I also skipped writing about the release of version 4.0.0 because I couldn’t find a definitive enhancement list. I stumbled across it today, so I’m posting both.

Version 4.0.0 Highlights:

  • Futures Support – Allows creation of a queue of objects to be loaded all at once. This differs from the old Multiple Result Sets feature in that it defers execution until the data is really needed. It’s also easier to support more than 2 result sets in a single call.
  • Caching Improvements – Added support for various caching providers including Memcached.
  • Detach/Attach Entities – Added more methods for serialization/deserialization so detached entities can be stored as binary or XML.
  • More DetailsClick here for the full set of enhancements

Version 4.0.1 Highlights:

  • DataContextName – You can finally control the name of the DataContext that is generated. This is long overdue and greatly appreciated.
  • Pagination Improvements – Added methods for NextPage, PreviousPage and GoToPage for PagedList.
  • Null Handling – Added NotNull rule and attribute and improved SQL queries that use null comparisons.
  • More DetailsClick here for the full set of enhancements

I know that some people are a little hesitant of continuing to use LINQ-to-SQL (L2S) given Microsoft’s shift in direction to LINQ-to-Entities (L2E). However, Microsoft has not dropped support for L2S in .NET 4.0. They actually added some features to LINQ-to-SQL in the recent release of .NET 4.0 and Visual Studio 2010. L2S is widely adopted and (from what I can tell) MS intends to continue supporting it in future versions of .NET, even though they aren’t going to develop it further.

At this time, the PLINQO team intends to provide LINQ-to-Entities support in a future release. This means that PLINQO users should require little-to-no-work in making the switch to L2E. In the meantime, I’m happy using PLINQO as my primary OR/M on new projects.

Kick it on DotNetKicks.com [Slashdot] [Digg] [Reddit] [del.icio.us] [Facebook] [Technorati] [Google] [StumbleUpon]

PLINQO 3.0 – Even Better

I wrote a blog article a few months ago giving kudos to the guys at CodeSmith after my discovery of PLINQO 2.0. Since then, I haven’t done much with LINQ-to-SQL because the legacy projects at my day job use Castle ActiveRecord with NHibernate. But I recently started a new project (giving me the freedom to investigate other technologies) and was pleasantly surprised to find PLINQO 3.0. In addition to a new major code revision, I found that CodeSmith released a new website with more information about the benefits of PLINQO and sample usage.

In case you’re not familiar with PLINQO, this set of code-generation templates is designed to enhance the LINQ-to-SQL development experience. They’re not only a time-saver like most code generation templates, but they allow you to overcome many of the limitations of “raw” LINQ-to-SQL. See Does LINQ StinQ? Not with PLINQO!

My first article covered some of the main benefits of PLINQO 2.0, including:

  • Generates one file per entity instead of one massive DBML file.
  • Generates partial classes where custom code can be written and won’t be overwritten.
  • Generated entity files are added to the project as code behind files to their corresponding custom entity files.
  • Adds customizable business rules engine to enforce entity validation, business and security rules.
  • Generation of entity manager classes… Provides access to common queries based on primary keys, foreign keys, and indexes.
  • Ability to automatically remove object prefix and suffixes (ie. tbl and usp) [based on RegEx].

In addition to those features, PLINQO 3.0 has the following benefits:

  • Entity Detach – Detach entities from one DataContext and attach to another (very useful for caching scenarios).
  • Entity Clone – Create copies of entities in-memory, set only the properties that need to be changed and persist as a new object with a new primary key.
  • Many-to-Many Relationships – Yes, M:M can be done in LINQ-to-SQL without writing goofy code to manage the link tables.
  • Auditing – The app can track all property changes complete with a copy of the old value and new value. Tracked changes can be read iteratively or dumped to an XML string.
  • Batch Updates and Deletes – You can perform updates and deletes on records based on criteria on the SQL Server without pulling each record into your app first. I’d already been using another implementation of this concept, but it’s nice to have it built into PLINQO.
  • Multiple Result Sets – PLINQO can pull multiple recordsets back in a single request. This can be done either by using a stored procedure or using the ExecuteQuery method passing a list of queries as parameters.

I think some of those benefits may have existed in the 2.0 release, but weren’t documented. I’m glad to see they’re starting to provide more documentation and samples. It would still be nice to see more, however (as it occurs to me) your custom PLINQO code really sits on top of LINQ-to-SQL, so all of the standard LINQ documentation applies.

I do have some suggestions for CodeSmith to implement in future versions of PLINQO:

  • I’m fond of the IRepository pattern because of unit testing with frameworks such as Rhino Mocks. I’ve seen a couple of implementations of IRepository with LINQ (example 1, example 2). This should be a code generation option.
  • I’d like to see a DataContext session factory with per-web-request lifestyle. This is available in other ORM systems like ActiveRecord. After some digging, I found an example of this that also demonstrates integration with Microsoft’s MVC and Castle Windsor (IoC). Sweet.
  • There are some helpful LINQ libraries out there, such as LINQKit and the LINQ Dynamic Query Library. It would be nice to include these and/or other free libraries with PLINQO.
  • I’ve gotten the impression that Microsoft is going to favor the Entity Framework (LINQ-to-Entities) over LINQ-to-SQL. I’d love to see PLINQO adapted to support the Entity Framework. That would certainly placate the domain-driven design fans along with those who use db’s other than MS SQL.

Finally, a bit of a rant: I’m kind-of annoyed that PLINQO only has one way to select the tables you want to include in code generation: you have to write a RegEx to identify tables to exclude. I’ve worked on several projects where I want to generate entities for less than 50% of the tables in my database. For instance, when writing modules for DotNetNuke, I only want to generate entities for my 5 tables, not the 100+ tables that come with a DNN installation.

NetTiers had a dialog to select tables for code generation. It sure would be nice to bring that back in PLINQO. If a dialog box is too much trouble, at least there could be a switch to specify whether my RegEx is an include list or an exclude list. I submitted a ticket to CodeSmith on this one. Please vote and add comments on their website if you support this idea. How about it, CodeSmith? :-)

See the new PLINQO website at http://www.plinqo.com/ for downloads, documentation and an offer to get a free copy of CodeSmith. I also suggest that you watch both introductory videos: Video 1. Video 2.

Kick it on DotNetKicks.com [Slashdot] [Digg] [Reddit] [del.icio.us] [Facebook] [Technorati] [Google] [StumbleUpon]

Does LINQ StinQ? Not with PLINQO!

There has been much debate over Microsoft’s first major Object Relational Mapper (ORM) called LINQ (Language-Integrated Query). Microsoft has released a several flavors of LINQ suitable for different purposes, most notably LINQ-to-SQL and LINQ-to-Entities. Most developers use LINQ-to-SQL given that LINQ-to-Entities is brand new and most info available online is about LINQ-to-SQL, so for the purposes of this article (and in most people’s minds), “LINQ” == “LINQ-to-SQL”. [Yes, I know LINQ is really an expression language and not an ORM, but let's not get technical on a technology blog. :-) ]

LINQ has been met with mixed reactions from the development community. Some are enthralled that Microsoft has finally built their own “supported” ORM. Some love the ease-of-use in being able to drag tables onto a designer surface in Visual Studio 2008 (reminiscent of TableAdapters). Others like the (usually) clean and efficient SQL queries generated by the ORM.

To me, the best feature of LINQ is the integration of a strongly-typed data query language built into .NET. Gone are the days of query building from strings. Even an impressive ORM like NHibernate with its own query language (HQL) suffers from the lack of a strongly-typed query language (leading to hybrid attempts like LINQ-to-NHibernate). Altogether, it is a powerful and efficient system of data access when you understand it and use it properly.

Of course, LINQ is not without its problems and critics. I’ve heard a lot of complaints about “lack of support” for implementing dynamic queries, that is, creating queries with joins and other non-comparison criteria at runtime. There are some solutions for this like the LINQ Dynamic Query Library or Predicate Builder.

One of the biggest concerns from enterprise-level developers is that LINQ is a “black box” and the Visual Studio designer writes a ton of “hidden code” to map the entities to the database. Bear in mind that all ORMs are a black box to some degree. Even though NHibernate is open-source, most developers use a DLL from a release and never tinker with its inner workings. NHibernate extensions like Castle ActiveRecord even hide the XML column mappings from the developer. But in all honesty, LINQ-to-SQL is very bad at allowing the developer open access to the mysterious column mapping code and has a number of issues when you make changes to the underlying database schema.

A couple of months ago, I decided to try out LINQ-to-SQL on one of my pet projects. It took some getting used to, but after pulling some hair out, learning some important lessons and finding some handy tools, it worked pretty well. I even found that I can usually avoid many dynamic query issues by chaining sub-queries together to execute a single query thanks to Deferred Execution.

My biggest gripes centered around the creation and maintenance of the entity mappings in the DBML file. Microsoft’s O/R Designer documentation openly admits:

…the designer is a one-way code generator. This means that only changes that you make to the designer surface are reflected in the code file. Manual changes to the code file are not reflected in the O/R Designer. Any changes that you make manually in the code file are overwritten when the designer is saved and code is regenerated.

More than that, when you make changes to your database schema, you either need to manually update the entity through the O/R Designer or you need to delete and re-add the entity, losing any customizations you’ve made (including entity relationships). Suddenly the code-generation time savings doesn’t make up for customization frustration.

Enter “Professional LINQ to Objects,” a.k.a. PLINQO, a code generation tool for LINQ that does what you want. I’ve used other ORM frameworks from CodeSmith before, particularly NetTiers. I believe in the value of code-generation when it’s implemented properly. One of the most important attributes of a good code-gen tool is that it lets you re-generate code without overwriting your customizations. PLINQO delivers intelligent regeneration to LINQ-to-SQL.

PLINQO works by generating your DBML file for you, but surprisingly this doesn’t break Visual Studio’s O/R Designer. In fact, you can still open and modify the .dbml file with the Designer and your changes will not be overwritten next time you generate. You can also modify many aspects of the entities via code (including validation attributes) and your code customizations will be untouched by re-generation. Sweet!

Some other benefits of PLINQO (as noted on their site):

  • Generates one file per entity instead of one massive [DBML] file.
  • Generates partial classes where custom code can be written and won’t be overwritten.
  • Generated entity files are added to the project as code behind files to their corresponding custom entity files.
  • Adds customizable business rules engine to enforce entity validation, business and security rules.
  • Generation of entity manager classes… Provides access to common queries based on primary keys, foreign keys, and indexes.
  • Ability to automatically remove object prefix and suffixes (ie. tbl and usp) [based on RegEx].

Another nice bonus is that you configure and execute the code generation right inside VS 2008. CodeSmith Pro has Visual Studio integration that lets you add a .csp (CodeSmith Project) file inside your VS project and manage all the settings from Solution Explorer. Just right-click your .csp and select “Generate Output” and your .dbml, entities, managers and query classes appear in your solution.

If you’ve been hesitant to try LINQ because of “black box” or code re-generation concerns, now is the time to give it a shot. Download a trial of CodeSmith and try the PLINQO 2.0 release or latest development trunk. Happy coding!

Kick it on DotNetKicks.com [Slashdot] [Digg] [Reddit] [del.icio.us] [Facebook] [Technorati] [Google] [StumbleUpon]
Follow

Get every new post delivered to your Inbox.

%d bloggers like this: