Advanced Django querying: sorting events by date

Filipe Ximenes
April 19, 2017
<p>Imagine the situation where our application has events (scheduled tasks, appointments, python conferences across the world) happening in different moments of time. Almost anything with a date attached to it. We want to display them in a simple list to the user. Given we are in February 2017 (the date this post was written), what would be the best way to order these events from the user's point of view?</p><figure class="kg-card kg-image-card"><img src="https://vinta-cms.s3.amazonaws.com/media/filer_public/96/00/960085aa-e752-4015-8a13-eb16b9f66e25/chronological.png" class="kg-image"></figure><p>To make this easier to answer, we can change the question a little bit: what entry in the list would be the most valuable to the user? I suppose a good answer would be <code>DjangoCon Europe</code>, which is the closest upcoming event. Cool, so let's display it first. What is the second most valuable entry? Well, considering that <code>PyCamp Argentina</code> and <code>PyCon Brasil</code> are long passed, there is not much the user can do about them. So <code>PyCon US</code> would be a good choice. Now we are left only with the two passed entries. Which should come next? My personal opinion is that the most recent events should come first. The older an entry is, the less I care about it. So, here is the final order we came up with:</p><figure class="kg-card kg-image-card"><img src="https://vinta-cms.s3.amazonaws.com/media/filer_public/e2/86/e286f5b8-79f8-4e82-a172-978c20a281dd/best.png" class="kg-image"></figure><p>Let's take a close look at what we are doing here. First, we have upcoming events ordered chronologically [a.k.a. in the order they happen] and then we have passed events ordered in the reverse chronological order. Weird.</p><figure class="kg-card kg-image-card"><img src="https://vinta-cms.s3.amazonaws.com/media/filer_public/ba/8a/ba8a131e-a498-48fd-a541-690687cef14b/chronological_directions.png" class="kg-image"></figure><p>Alright, now how do we query these events in our database to show them in our Django view?</p><h2 id="the-naive-approach">The naive approach</h2><p>A simple solution would be to make two separate queries and concatenate the results. This is how it would look like:</p><pre><code>from django.utils import timezone from myapp.models import Event class EventListView(generics.ListView): def get_queryset(self): now = timezone.now() upcoming = Event.objects.filter(date__gte=now).order_by('date') passed = Event.objects.filter(date__lt=now).order_by('-date') return list(upcoming) + list(passed) </code></pre><p>The main problem here is that by making these two queries we will be fetching ALL events from the database. This will be a problem once we have many entries. The usual solution to avoid retrieving all the data from the database is to do pagination. In this case, pagination won't help. Both queries will always need to run before so we have a single list and can then slice the page from it. Not cool.</p><h2 id="the-sagacious-way">The sagacious way</h2><p>Alright, so what we want is a way to return the events in the correct order in a single query. To achieve this we will need to use some more advanced tools from the Django ORM. We'll break the solution into two parts. In the first part, we are going to separate upcoming events from previous ones and make upcoming ones show up first.</p><p>We are going to use Django's <code>Case</code>, <code>When</code> queryset operators. I'm not going to explain them in detail in this blog post. If you don't know how they work I recommend reading <a href="https://micropyramid.com/blog/django-conditional-expression-in-queries/">this other blog post</a>. The other feature we are going to use is annotations. You can read about it in the <a href="https://docs.djangoproject.com/en/1.10/topics/db/aggregation/#generating-aggregates-for-each-item-in-a-queryset">Django documentation</a></p><p>Here is the first part of our query:</p><pre><code>now = timezone.now() (Event.objects.annotate( relevance=models.Case( models.When(date__gte=now, then=1), models.When(date__lt=now, then=2), output_field=models.IntegerField(), )).order_by('relevance')) </code></pre><p>We are annotating upcoming events with <code>relevance = 1</code> and passed ones with <code>relevance = 2</code>. When we order the query by <code>relevance</code> upcoming events will be placed in front of passed ones. This is what we achieved so far:</p><figure class="kg-card kg-image-card"><img src="https://vinta-cms.s3.amazonaws.com/media/filer_public/0f/64/0f6439c4-c273-4124-b730-1341b03e7297/intermediary.png" class="kg-image"></figure><p>Pretty neat huh?! But we're still not done. Upcoming events are showing in the correct order, but passed ones are reversed. The second part is a little more tricky because upcoming events go in <code>ascendant date order</code> and previous in the <code>descendant date order</code>. The solution is to annotate the time diff from current date to the event date.</p><pre><code>now = timezone.now() (Event.objects.annotate( relevance=models.Case( models.When(date__gte=now, then=1), models.When(date__lt=now, then=2), output_field=models.IntegerField(), )).annotate( timediff=models.Case( models.When(date__gte=now, then=F('date') - now), models.When(date__lt=now, then=now - F('date')), output_field=models.DurationField(), )).order_by('relevance', 'timediff')) </code></pre><p>Notice that upcoming events are annotated with <code>event_date - current_date</code> and passed ones with <code>current_date - event_date</code>. When we use <code>timediff</code> as the second parameter in the <code>order_by</code> it will make the former to be returned in the chronological order and the later in the reverse chronological order.</p><p>Hooray! Mission accomplished, we can now get events in the correct order from a single database query. It can be paginated and will scale nicely.</p><p>**BTW: Vinta will be present [and giving talks] at both DjangoCon Europe and PyCon US this year. **<br>If you are attending <a href="https://2017.djangocon.eu/">DjangoCon Europe</a>, you should come by to watch <a href="https://www.vinta.com.br/blog/author/laisvarejao/">Lais</a> talking about our <a href="https://github.com/vintasoftware/vinta-boilerplate">Django/React boilerplate project</a> (<a href="https://2017.djangocon.eu/programme/">Defining a customizable boilerplate using Django, React and Bootstrap</a>) and <a href="https://www.vinta.com.br/blog/author/flaviojuvenal/">Flávio</a> talking about <a href="https://2017.djangocon.eu/programme/">Qualities of great reusable Django apps</a>.<br>At <a href="https://us.pycon.org/2017/">Pycon US</a> look for <a href="https://www.vinta.com.br/blog/author/flaviojuvenal/">Flávio</a> in his <a href="https://us.pycon.org/2017/schedule/presentation/49/">How to make a good library API</a> talk!</p>