Sep 05

Debugging Django in Production Revisited

In a previous post I talked about a neat middleware for debugging production environments in Django. It checked whether you were a superuser, or whether your IP was in settings.INTERNAL_IPS, and if so, it displayed a technical 500 page for you (the yellow one you know and love). At that point it was more of a simple idea, and not really used in production.

At work the other day I was working on a bug that only showed up in production, and not on staging. I remembered this middleware and thought it would be perfect. However, at work we have a lot of non-technical people who are superusers (think my boss's boss). We also all share the same external IP when at work, so neither of the checks I had before would work here.

After thinking about it and talking to my co-worker Ben Spaulding, we realized that Django has Groups built in, so why not use them? So I went ahead and re-jiggered the middleware to be based around groups.

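from django.views.debug import technical_500_response
from django.contrib.auth.models import Group
from django.core.cache import cache
import sys

class UserBasedExceptionMiddleware(object):
    def process_exception(self, request, exception):
        # Members of the group, cached so most exceptions skip the DB.
        users = cache.get('technical_error_users')
        # Compare against None so an empty group is not re-queried each time.
        if users is None:
            # A cached flag records that the group does not exist yet.
            skip = cache.get('no_technical_error_users')
            if skip:
                return None
            try:
                g = Group.objects.get(name='Technical Errors')
                # Evaluate the queryset so a plain list (possibly empty) is cached.
                users = list(g.user_set.all())
                cache.set('technical_error_users', users, 60)
            except Group.DoesNotExist:
                cache.set('no_technical_error_users', True, 60*60)
                return None
        # Only superusers who are also in the group get the technical 500 page.
        if request.user in users and request.user.is_superuser:
            return technical_500_response(request, *sys.exc_info())
        # Everyone else falls through to the normal 500 handling.
        return None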

Since it is middleware, I decided to use the cache framework to make sure that we weren't doing a DB query every time an exception was handled. I also had to account for the case where the group hasn't been added yet: when that happens, it caches the fact and doesn't check again for another hour. If the Technical Errors group exists, it caches the members that are in it for a minute. This means that a DB query happens at most once a minute, which is fine.

I'd be curious how other people might improve this, as it seems a little bit janky still. However, it works for us, and is incredibly useful when debugging. Instead of getting a link to a broken page, you go to the page and get a nice 500, telling you exactly what went wrong.

I can think of one basic improvement just from writing this post: import settings and return None if DEBUG is True, or if CACHE_BACKEND is set to None. This would let the middleware stay out of the way when there is no caching, or when the technical 500 page is already going to be shown.
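Here is a rough sketch of what that guard might look like. The helper name should_check_group is made up for illustration, and it takes the two settings as plain arguments so it can be tried out without a configured Django project. In the middleware itself, these values would presumably come from settings.DEBUG and settings.CACHE_BACKEND.

```python
def should_check_group(debug, cache_backend):
    """Decide whether the middleware should bother with the group lookup.

    Hypothetical helper, not part of Django. `debug` and `cache_backend`
    stand in for settings.DEBUG and settings.CACHE_BACKEND.
    """
    if debug:
        # With DEBUG on, Django already shows the technical 500 page.
        return False
    if not cache_backend:
        # No cache configured, so stay out of the way instead of
        # hitting the database on every exception.
        return False
    return True


# Assumed usage at the top of process_exception:
#
#     if not should_check_group(settings.DEBUG,
#                               getattr(settings, 'CACHE_BACKEND', None)):
#         return None
```

Returning None from process_exception hands the exception back to Django's default handling, so the guard only ever makes the middleware do less, never more.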

I do think that this middleware removes a lot of the reason to run a site under DEBUG=True, so hopefully it will result in fewer sites launching with DEBUG on.


Comments

1 Alex says...

I'm not sure I see the purpose of the caching. Yes, it would mean an extra DB query, but we're talking about scenarios where a) an exception was thrown (hopefully rare), and b) the user is already known to be a superuser. This seems like a sufficiently exceptional set of circumstances that caching is unnecessary.

Posted at 5:51 p.m. on September 6, 2009

2 Morak says...

Nice article ! :)

Posted at 10:12 p.m. on September 6, 2009

3 Paul Tarjan says...

+1 on the caching being unnecessary. Don't overcomplicate the code for early performance reasons. Do profiling, then caching.

Otherwise, nice idea.

Posted at 6:38 a.m. on September 7, 2009

4 Andy Baker says...

Surely this is a permission you want rather than a group? Groups should be for the grouping of permissions. Admittedly Django makes it tricky to have permissions that aren't coupled to specific models but there are workarounds.

Posted at 7:57 a.m. on September 7, 2009

5 Andrew Shultz says...

What we did was to display the user appropriate 404 page but save the full technical debug page to a directory and have a way to retrieve them. Then when a user reported getting an error page we would just go read the saved debug page and fix the problem.

write up at http://everything-not-nailed-down.blogspot.com/2009/08/dj...

Posted at 3:01 p.m. on September 7, 2009

6 Harro says...

Why are so many people superusers? Just make them staff and give them the proper permissions.

Posted at 7:01 p.m. on September 7, 2009

7 Jeremy Dunck says...

I posted a variation on this here: http://www.djangosnippets.org/snippets/1719/

I grab exc_info early in case something else in the middleware raises and catches before I get a chance to call the 500 view. Otherwise, it's just a simpler version w/o caching for people that don't have as many superusers.

Posted at 1:48 a.m. on September 8, 2009

8 john moylan says...

Erm, maybe I'm missing the point. Errors can be mailed to a user out of the box in Django anyway.

Posted at 5:50 p.m. on September 10, 2009

9 Charlie says...

Awesome -- just now learning about exception middleware and this is a great example.

Posted at 4:18 p.m. on September 25, 2009

Comments are disabled for this post.