Lightweight Database Cache Dependencies: Part II

Posted on November 08, 2004  |  

Posted in Development

5 comments

In the previous installment of this discussion of database cache dependencies a-la ASP.NET 2.0, I alluded to some of the problems current implementations have, including the one in ASP.NET 2.0. Since the previous post my idea and its implementation got traction with patent lawyers, so I'm going to have to wait and see how it pans out with them and whether I may share any code (I also get to be the product architect at my company, so my hands are tied). At least, I can share my idea, and y'all can take it from there.

All implementations of database cache dependencies I've seen rely on triggers, additional tables, extended stored procs, and other heavy-duty stuff that requires more privileges than what is reasonable.

I can't help wondering why you can't use binary checksums to watch for changes. SQL Server supports a number of very helpful functions that read table, row or expression checksums.

  • BINARY_CHECKSUM
  • CHECKSUM
  • CHECKSUM_AGG

In my code I cache table checksums and poll to detect changes. It's all done in one stored proc. My SQL Server login with minimal rights fits the bill just fine and I don't need to set up anything in addition to the stored proc. I could also build a SQL query on the fly and do away with the stored proc altogether!

I don't know if I'm overlooking some critical issue with this approach, but this solution has been powering my site by caching blog lists, articles and syndication feeds.

As with any approach there are gains and losses. I'm ignorant as to whether Oracle allows you to compute checksums the same way. Also, I'm polling for table changes, but you can get crafty enough and watch for changes in individual rows.

I'm going to leave it at that. I'd love to hear about your experiences with this technique.

5 comments

Michael O'Brien
on November 9, 2004

This sounds like a really good approach, and one I might consider myself, but it's lightweight only in the sense that it's a small implementation -- actually using this can create pretty significant demands on a server.

Unless I've got something wrong (which is very possible), the checksum functions read the values of each field and calculates a checksum. CHECKSUM_AGG then sums these (or does something a little cleverer, I'm not sure). While this is going to work well for smaller tables, running this on a table with several million large records is going to cause serious problems.

That said, for a small table this is a pretty nice solution!

I'm new to the site by the way and really enjoying it -- keep up the good work!


Milan Negovan
on November 9, 2004

Actually, I'm amazed how fast checksum functions work. I mean *really* fast. In our commercial product I poll every 500ms on a whole bunch of tables (in one sweep) with no performance burden.

I seriously doubt they actually get a hash on whole tables. I've done some syncing with Active Directory, and they have this so called High-Watermark which increments every time you update a record in AD. That saves you a lot of work calculating checksums on your own, and works extremely well, considering how slow AD is.

I'd imagine they use a similar approach. If not, I'd be very surprised.


Brant LeClercq
on November 9, 2004

It is my understanding that checksums are very fast as well, but I do not believe that they can definitely determine a change was made. The chances of one change to the next resulting in the same checksum value is extremely small, but nonetheless possible.

Checksums also seem to be unable to detect changes in reference columns like text, ntext, or image. Therefore checksums would not work for tables that store article text and/or images. As an aside, if you add a rowversion (timestamp) column to your table then you can detect those changes.

Your previous article hinted at using minimal permissions (db_datareader, db_datawriter), so I would be interested to see how you tackle these particular issues without making any changes to the structure of the database.


Milan Negovan
on November 9, 2004

You're right, checksums do not reflect changes to certain fields:

"BINARY_CHECKSUM ignores columns of noncomparable data types in its computation. Noncomparable data types are text, ntext, image, and cursor, as well as sql_variant with any of the above types as its base type." - Books Online

However, most tables contain more than an ntext field. Most of them come with some sort of "name", "description", "date last updated" fields, etc, and those SQL Server does track changes to.


Michael O'Brien
on November 10, 2004

I'd be interested to hear if anyone knows how the checksums are calculated. I ran a few tests in query analyzer and it looked like the number of recorded reads rose with the size of the table which is why I guessed it wouldn't be a good idea with large tables...