Entity Framework - Code First or Database First?

One of the questions that tends to pop up in most new (and old) .NET projects using Entity Framework nowadays, is which approach to go for, either having a Code First approach, or Database First.

The Code First approach is relatively new, and gives developers the ability to essentially forget that there's a database involved, you simply create the classes you want to be represented as database tables, add properties to related classes, and the framework generates all the required tables, constraints and you never even have to know what SQL is to be up and running.

On the other hand, the Database First approach involves you creating the data model in the database, normally using either SQL Management Studio or some other tool, creating all tables, specifying values and adding on Primary Keys, Foreign Keys and various constraints if needed, the framework then extracts the database schema, and creates the .NET classes for you.

Apart from this process, the development is pretty much exactly the same, and developers working at the higher tiers don't have to know which approach was used, as they would use the generated code in the exact same way regardless.

Why then does it matter? Where is the difference?

For developers, the ease of access by never having to use any database tools gives Code First immediately an advantage over DB First. It's quite simply much easier and quicker to add a property to a class in the code than it is to add a column to a table, as you generally then also need to generate a database script to be able to use the added column in production, as well as regenerating the model once the column is added, while all this is covered automatically by Code First and its "Migrations" for database changes.

For DBAs, well, if you have DBAs in your organization or projects, as a developer you likely don't have much say in the matter, and projects will be DB first regardless. However, if you're a developer, you might want explicit control over how the data model looks and how it is constructed, and the best way to do this, is to do it at the database level.

What to choose then?

Personally, all my recent personal projects are Code First, and I don't see that changing any time soon. I might also endorse this approach for limited team projects or smaller data storage projects, but that's where it ends. For any major team effort, or major high-performing data storage project, I simply can't encourage this approach as opposed to creating the database first.

I will start off my reasoning by saying that in my career, I've generally been the go-to guy when something feels sluggish, or a page is slow. I'm just as comfortable looking at SQL as I am looking at code, and I've come to realize that I'm not the typical developer due to my high focus on performance which involves among other things always keeping SQL Profiler running on a second monitor while I'm developing and testing.

When I use Code First, I always verify the migration scripts thoroughly, verify the migration scripts, ensure that it's doing *exactly* what I want it to do. It's very easy, especially with relationships, to mess up and add unintended columns, join tables and lacking foreign keys by accident if you don't have the right attributes on your classes. I also verify that everything looks alright in SQL Management Studio once the changes have been made. 

This is the bare minimum I'd expect *anyone* to do if they're serious about their development work. The problem here is that most developers won't do this. It's an unfortunate fact for those of us who "live and breathe" development, you know the type - we get twitchy eyes and blood boiling when we see someone using a capitalized variable name for a scope variable, messing up the code "feng shui" that's implicit when we open any solution - not everyone is like this. For some (most?) this is "just a job", where the goal is to simply be done with your tasks/backlogs/defects in a timely manner and cash your paycheck. This is a reality of life, all developers are different, and I'm not saying everyone should be the embodiment of Code Complete 2nd edition, just that you should expect and anticipate that some developers want to do exactly what they're tasked with in the least complicated matter regardless of the implications.

As such, for general development in major projects and/or high performing solutions, I will strongly recommend database first. The developers who may not know how to do anything in SQL shouldn't be tasked with model changes that could potentially hugely affect database performance regardless. Leave that to the people who can and, most importantly, care about the overall health of the model and the performance of the database.

Add comment

Loading