The SnapLogic.org site runs on open source software, including a lot of Python. The bottom of the stack is RedHat Linux, the web server is Apache, the database engine is MySQL, and the main applications are Mailman, Subversion, Trac, WordPress, and a custom Django application.
All of these pieces play together pretty well, and over the past year we’ve only had a few minor problems to fix. But over the past couple of weeks, we’ve really been hitting Trac hard as we work through our upcoming release, and Trac performance was bad. Really Bad. Like, ‘30 seconds to open a ticket‘ bad.
It’s hard to say if this was a sudden performance drop because we crossed some magic threshold in database size, or if it was slowing getting worse and we really noticed it when we shifted from design to coding. In any case, there are only 1000 tickets in the system so we knew something was wrong.
We turned on the MySQL Slow query log, and quickly found slow queries. A little more digging, and it seems that the problem was the TEXT datatype that Trac uses for all of it’s text fields. It seems that access to TEXT columns requires a table access rather than an index access. (This might be resolved in versions of MySQ > 5.0.37, that thread is 20 months old…) In any case, I was taught to use VARCHAR for text columns, and to use TEXT for documents and other BLOB-like things. The problem was probably complicated by the way Trac uses different index sizes for the same column in different tables.
In the traditional enterprise software model, we would have opened cases with both vendors, haggled with the vendors individually for a couple of days, then put them on the phone together to let them haggle over whose problem it was. Who knows when a fix would arrive….it would probably be scheduled in the next major release.
But we have the source code for both! And given the time / complexity constraints, Trac was the more hackable of the two products in this case.
It took 1/2 day to update the Trac schema definition to use VARCHAR() insted of TEXT, and run some tests. We saw query execution time for one common backend query drop from 8 seconds to 0.1 seconds. It took 15 minutes to implement the changes in production.
The actual changes are in Trac ticket #6986. if anyone is interested. I don’t know if the Trac team will pick these particular changes up, since they have long expressed a desire to minimize the difference in back end databases.
We will probably have some extra work to do on our next Trac upgrade, but we’ll cross that bridge when we reach it.