The other day at work I was looking to reduce the load time for a popular web control. It seemed that the more data a customer had, the longer the control took to load. That makes complete logical sense, but the increase in load time seemed excessive when I see other sites loading quickly with large data sets. Clearly this wasn't the first time in the world this problem had been solved.
My first thought was to look at a LINQ Union statement. The code looked similar to this:
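The original snippet didn't survive here, so this is a hedged reconstruction of what a chained Union over several report lists might look like; the list names and element type are hypothetical stand-ins:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class UnionExample
{
    static void Main()
    {
        // Hypothetical stand-ins for the customer's report lists.
        var dailyReports = new List<int> { 1, 2, 3 };
        var weeklyReports = new List<int> { 4, 5 };
        var monthlyReports = new List<int> { 6 };

        // Union merges the sequences and removes duplicates, which
        // means comparing elements across the lists as it goes.
        var allReports = dailyReports
            .Union(weeklyReports)
            .Union(monthlyReports)
            .ToList();

        Console.WriteLine(allReports.Count); // 6
    }
}
```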
I thought that this had to be the cause of the slowdown. All of these lists should already be unique, with no duplicates between them, so the de-duplication work was wasted. The larger the lists get, the more comparisons need to be done, which could eventually mean a quadratic increase in load time. My plan was to switch it to Concat.
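The Concat version would look something like the following sketch (same hypothetical list names as above); Concat simply appends the sequences without any duplicate checking:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class ConcatExample
{
    static void Main()
    {
        // Hypothetical stand-ins for the report lists.
        var dailyReports = new List<int> { 1, 2, 3 };
        var weeklyReports = new List<int> { 4, 5 };

        // Concat appends sequences as-is, skipping Union's
        // duplicate comparisons entirely.
        var allReports = dailyReports.Concat(weeklyReports).ToList();

        Console.WriteLine(allReports.Count); // 5
    }
}
```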
As it turned out, this did not provide the performance benefit I was seeking.
So, what next? At the suggestion of a colleague, I loaded a performance profiler to pin down the slowdown more precisely.
What did I find? Something that looked like this:
```csharp
foreach (Report r in reports)
{
    ICountList list = lists.FirstOrDefault(s => s.ListID == r.ListID);
    // ...
}
```
This deceptive little piece of code is O(n²). It's a foreach over the report list, and FirstOrDefault is essentially another foreach over the list of lists. If the two lists are of equal size (the expected case), the run time is n².
After much thought and experimentation I came across the following solution:
```csharp
// Build the lookup once: O(n)
var listDict = new Dictionary<int, ICountList>();
foreach (var list in lists)
    listDict.Add(list.ListID, list);

// Each dictionary lookup is O(1)
foreach (var r in reports)
{
    ICountList list = listDict[r.ListID];
    // ...
}
```
This new solution runs in O(n), or linear time: regardless of how many lists are added, the load time increases linearly. Checked again with the profiler, it showed a significant performance boost.
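The same fix can be written a little more idiomatically with LINQ's ToDictionary, and with TryGetValue to preserve FirstOrDefault's behavior of returning null when an ID is missing. This is a sketch using minimal stand-in types, since the post's Report and ICountList definitions aren't shown:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class DictionaryLookupExample
{
    // Minimal stand-ins for the post's Report and ICountList types.
    record Report(int ListID);
    record CountList(int ListID);

    static void Main()
    {
        var reports = new List<Report> { new(1), new(2), new(3) };
        var lists = new List<CountList> { new(1), new(2), new(3) };

        // Build the ID -> list index once: O(n)
        var listDict = lists.ToDictionary(l => l.ListID);

        int found = 0;
        foreach (var r in reports)
        {
            // TryGetValue mirrors FirstOrDefault: no exception
            // when a report's ListID has no matching list.
            if (listDict.TryGetValue(r.ListID, out var list))
                found++;
        }

        Console.WriteLine(found); // 3
    }
}
```

Note that the indexer (`listDict[r.ListID]`) throws if the key is absent, so TryGetValue is the safer drop-in replacement when a missing list is possible.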