Caching is Your Friend - Creating Tons of Maps on the Fly and Auto-Updating Some
Julie Goldberg is the co-founder and chief engineer behind Empower Engine, a two-person start-up that maps electoral campaign data for Democrats and their allies. She’ll talk a bit about the domain and a lot about some technical challenges she faced, mostly solved by caching. The system is built using Django, Leaflet, TileStache and Postgres.
Campaigns organize geographically, so maps are very important. Organizers have their own turf, and it’s important to see how things vary in just their local area. Some data is static, but other data must be recomputed every day. We never know when a user will ask for a map of a specific data set in some small city or a specific organizer’s turf.
Geometric queries are much slower than most database queries, especially when you have hundreds of thousands of geometries. Empower Engine never does geometric calculations during a web request, and the system minimizes geometric queries as much as possible. District overlaps are pre-computed, and person-district lookups are recomputed daily (reusing the previous day’s data if it hasn’t changed). Shapes are stored in a separate table from other information about the district, so Postgres can store the district table efficiently. We have a custom wrapper around TileStache that allows one configuration file to handle tons of maps. Tiles are only created on request and then stored on S3. Every map has all its data in a single cache table to speed up TileStache. The cache table is recreated and all tiles are invalidated whenever the data or anything else about the map changes.
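The create-on-request and invalidate-on-change pattern described above can be sketched roughly as follows. This is a minimal illustration, not Empower Engine's code: the TileCache class, the fake_render function, and the in-memory dict standing in for an S3 bucket are all hypothetical, and a real system would call TileStache to render and an S3 client to store.

```python
from typing import Callable, Dict, Tuple

TileKey = Tuple[str, int, int, int]  # (map_id, zoom, x, y)


class TileCache:
    """Render tiles lazily and keep them until their map changes.

    Hypothetical sketch: the dict below stands in for an S3 bucket.
    """

    def __init__(self, render: Callable[[str, int, int, int], bytes]) -> None:
        self._render = render
        self._store: Dict[TileKey, bytes] = {}
        self.render_count = 0  # how many tiles were actually rendered

    def get_tile(self, map_id: str, zoom: int, x: int, y: int) -> bytes:
        """Return a stored tile, rendering it only on the first request."""
        key = (map_id, zoom, x, y)
        if key not in self._store:
            self.render_count += 1
            self._store[key] = self._render(map_id, zoom, x, y)
        return self._store[key]

    def invalidate_map(self, map_id: str) -> int:
        """Drop every stored tile for one map; return how many were dropped."""
        stale = [key for key in self._store if key[0] == map_id]
        for key in stale:
            del self._store[key]
        return len(stale)


def fake_render(map_id: str, zoom: int, x: int, y: int) -> bytes:
    """Placeholder renderer; a real one would produce image bytes."""
    return f"{map_id}/{zoom}/{x}/{y}".encode()


if __name__ == "__main__":
    cache = TileCache(fake_render)
    cache.get_tile("turf-1", 10, 3, 4)   # rendered on first request
    cache.get_tile("turf-1", 10, 3, 4)   # served from the store
    cache.invalidate_map("turf-1")       # the map's data changed
    cache.get_tile("turf-1", 10, 3, 4)   # rendered again
    print(cache.render_count)
```

In this sketch a tile costs rendering time only the first time someone asks for it, and a change to one map discards only that map's tiles, leaving every other map's tiles in place.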