SOLID Architecture and the Repository Pattern

The repository pattern is super common.

It’s a layer in your code that exists between your business code and your database. It helps isolate code that interfaces with your database in a single place. Without it, you end up with data access code sprinkled throughout your codebase.

What happens if you don’t use the repository pattern? In the case that you update a table or stored procedure in the database, you need to find every spot in your code that relies on it and update all of them.

With the repository pattern you only need to update a single class and the rest of your code is none the wiser.

Yet I find a lot of confusion in platform architecture with the repository pattern. When I see implementations I always end up seeing things like validation, business logic, and a mix of things like caching in with bare metal data access. This might sound like a good thing, except a generally see these things all jammed into a single class.

Developers who are familiar with the Single Responsibility Principle seem to forget all about it when it comes to implementing the repository pattern. All of a sudden a “data access layer” (DAL) is born, but it’s not really a layer. It’s just a single class.

When I talk to developers about this, or read blog posts about DAL creation, it seems to be a very common case that people forget about the SRP, or believe that the DAL should not follow the SOLID principles. Many times the reason ends up being that they believe their code is too simple to have multiple components in the DAL.

But then something changes, they want caching, validation, auditing, or logging. These can be great things and should be desired, but they are added directly into the unified DAL class or into Helper classes without considering the overall design of the “layer,” which is now really more of a god class.

What is the Single Responsibility Principle? How can it help us design a better data access layer?

The Single Responsibility Principle states that each thing should have a single reason for change. This helps us isolate changes, and limit the scope of a class to a single level of abstraction. In the end, it’s largely about readability and maintainability.

What can possibly go wrong?

Let’s start with a single DAL method. It was built with SQLClient for bare metal database access and runs off of stored procedures.

public List<BlogArticle> GetArticles()
{
    using (var connection = new SqlConnection(connectionString))
    {
        var command = new SqlCommand("dbo.BlogArticles_Get", connection);
        command.CommandType = CommandType.StoredProcedure;
        connection.Open();

        var articles = new List<BlogArticle>();

        using (SqlDataReader reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                articles.Add(new BlogArticle(reader[title], reader[publishDate]));
            }
        }

        return articles;
    }
}

This looks good. Nice and easy and only one reason to change — because there was a change to the stored procedure. But, at some point someone realizes that there’s a performance issue developing. This is called a lot due to some of the front end design and how users interact with our service. A simple solution is simple to make fewer calls to the database and add in a server level cache. We only need caching in this one spot right now, so we won’t bother adding caching everywhere. As a result, this is what the method becomes:

public List<BlogArticle> GetArticles()
{
    ObjectCache cache = MemoryCache.Default;
    List<BlogArticles> articles = cache["articles"] as List<BlogArticle>;

    if (articles != null)
    {
        return articles;
    }

    using (var connection = new SqlConnection(connectionString))
    {
        var command = new SqlCommand("dbo.BlogArticles_Get", connection);
        command.CommandType = CommandType.StoredProcedure;
        connection.Open();

        using (SqlDataReader reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                articles.Add(new BlogArticle(reader[title], reader[publishDate]));
            }
        }
    }

    var policy = new CacheItemPolicy();
    policy.AbsoluteExpiration = DateTimeOffset.Now.AddSeconds(5.0);

    cache.Set("articles", articles, policy);
    return articles;
}

This code gets added to production and your performance improves as expected. Next, you are alerted that the database is being accessed and blog articles are being pulled by people who don’t have permissions. A discussion is had about permissions and you can’t think of a case where you would ever want to get the blog articles unless the current user has permission to do so. That code then gets added to the DAL.

public List<BlogArticle> GetArticles()
{
    if (!Request.User.Roles.Contains("blogReader"))
    {
        throw new Exception("User lacks permissions to view articles.");
    }

    ObjectCache cache = MemoryCache.Default;
    List<BlogArticles> articles = cache["articles"] as List<BlogArticle>;

    if (articles != null)
    {
        return articles;
    }

    using (var connection = new SqlConnection(connectionString))
    {
        var command = new SqlCommand("dbo.BlogArticles_Get", connection);
        command.CommandType = CommandType.StoredProcedure;
        connection.Open();

        using (SqlDataReader reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                articles.Add(new BlogArticle(reader[title], reader[publishDate]));
            }
        }
    }

    var policy = new CacheItemPolicy();
    policy.AbsoluteExpiration = DateTimeOffset.Now.AddSeconds(5.0);

    cache.Set("articles", articles, policy);
    return articles;
}

This code now has a lot of reasons to change. If the caching changes, the database changes, or the permissions that we want to check change we will need to update this class. And of course you’re going to set an example for other people on your team who will possibly imitate the functionality of your method when adding new ones.

What would I propose in its place?

Separation of concerns. What we have is going to become difficult to grow and maintain. The first change is to move the bare metal database access to its own class.

public List<BlogArticle> GetArticles()
{
    if (!Request.User.Roles.Contains("blogReader"))
    {
        throw new Exception("User lacks permissions to view articles.");
    }

    ObjectCache cache = MemoryCache.Default;
    List<BlogArticles> articles = cache["articles"] as List<BlogArticle>;

    if (articles != null)
    {
        return articles;
    }

    articles = _dataContext.GetArticles();

    var policy = new CacheItemPolicy();
    policy.AbsoluteExpiration = DateTimeOffset.Now.AddSeconds(5.0);

    cache.Set("articles", articles, policy);
    return articles;
}

public List<BlogArticle> GetArticles()
{
    using (var connection = new SqlConnection(connectionString))
    {
        var command = new SqlCommand("dbo.BlogArticles_Get", connection);
        command.CommandType = CommandType.StoredProcedure;
        connection.Open();

        var articles = new List<BlogArticle>();

        using (SqlDataReader reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                articles.Add(new BlogArticle(reader[title], reader[publishDate]));
            }
        }

        return articles;
    }
}

Now database changes have been isolated to a single class with a single reason to change. Next is to move the caching over to its own class.

public List<BlogArticle> GetArticles()
{
    if (!Request.User.Roles.Contains("blogReader"))
    {
        throw new Exception("User lacks permissions to view articles.");
    }

    List<BlogArticles> articles = _cacheContext.GetArticles();

    if (articles == null)
    {
        articles = _dataContext.GetArticles();
        _cacheContext.CacheArticles(articles);
    }

    return articles;
}

public List<BlogArticle> GetArticles()
{
    return _cache["articles"] as List<BlogArticle>;
}

public void CacheArticles(List<BlogArticle> articles)
{
    var policy = new CacheItemPolicy();
    policy.AbsoluteExpiration = DateTimeOffset.Now.AddSeconds(5.0);

    _cache.Set("articles", articles, policy);
}

Cache access has now been isolated to a single class. It changes when the details of caching change. Next, move out the permissions check. Permissions should be checked in the layer where you actually care about a user object. Putting them in your repository is going to limit the use of your repository to times when you have a contextual user. This might not seem limiting now, but you might find that it requires significant rework later.

public List<BlogArticle> GetArticles()
{
    List<BlogArticles> articles = _cacheContext.GetArticles();

    if (articles == null)
    {
        articles = _dataContext.GetArticles();
        _cacheContext.CacheArticles(articles);
    }

    return articles;
}

What reason does the repository have to change now? It changes if the workflow of data access changes. The method operates on only a single layer of abstraction and doesn’t need to change when bare metal data access changes to caching or to the database.

We end up with three classes instead of one, but a significant readability and maintainability improvement.

What reasons does this code have to change? The repository method only needs to change if the flow of data access changes — as in, logging or auditing is added.

The code in the other classes need to change if caching or the stored procedures change, but that code is now isolated away from the data access workflow and no longer impacts it directly.

As an added bonus, it’s likely that others will see how the data access has been done on this method, and when caching needs to be added elsewhere they will follow the existing pattern. Having worked using both styles of data access, I can verify that working with a repository built on SOLID adds a lot more velocity to future feature addition than one without proper separation.

Advertisements

2 thoughts on “SOLID Architecture and the Repository Pattern

  1. Actually, you just made things more complicated, without a real benefit.
    Your repository still ‘knows’ about caching, just not directly. It still has multiple responsibilities: read/write data from db AND cache.
    You abstracted out the implementation of the caching mechanism, and introduced a dependency on the caching class.
    What if you need to test your app with caching disabled?

    Let’s look at your case, how I would solve it:

    We have an interface named IArticlesRepository.
    I will have a class ArticlesRepository : IArticlesRepository and a class: ArticlesCache : IArticlesRepository.
    The ArticlesCache is a decorator for the ArticlesRepository class.

    Now, my ArticlesRepository class only deals with access to db.
    My ArticlesCache, deals with caching.

    Any consumer of my repository accepts an IArticleRepository and doesn’t care which concrete implementation it gets.

    This way you have much greater flexibility, true separation of concerns and single responsibility.
    I used this for example to implement multiple levels of caching in a website (local inproc cache and distributed Redis cache).
    You could also implement the authorization logic in another decorator, but as you mentioned, that belongs to another layer, where you actually care about users.

    • You’re right that adding a dependency on the cache could be troublesome and using the Decorator pattern definitely has its advantages.

      I do have some hesitation about its use in my current code base and areas where I’m not sure it will work as well as it seems at first glance.

      I think it warrants some investigation on my part and possibly another blog post.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s