Using TG2 models in external scripts; SQLAlchemy logging woes

A common need, when building a web-application (or any database-driven application) is to create batch scripts that can do processing external to the request/response cycle of the app itself.

Below is a skeleton for writing your own scripts that make use of your TG2.0 app environment and models.

#!/usr/bin/env python2.5
# -*- coding: utf-8 -*-
 
import os
import pkg_resources
pkg_resources.require('TurboGears2')
from tg import config
from tg.exceptions import HTTPNotFound
from paste.deploy import appconfig
from myprojectname.config.environment import load_environment
 
config_dir = '..'
config_file = 'deployment.ini'
# Load the application config
conf = appconfig('config:'+config_file, relative_to=config_dir)
# Load the logging options
# (must be done before environment is loaded or sqlalchemy won't log)
from paste.script.util.logging_config import fileConfig
fileConfig(config_dir+os.sep+config_file)
# Load the environment
load_environment(conf.global_conf, conf.local_conf)
 
# Now TG2 is loaded, along with your config (available as the 'config' object)
# Now you can import your models and do whatever you like
from myprojectname.model import DBSession, modela, modelb
 
def commit_session():
    DBsession.flush()
    transaction.commit()
 
#
# cool code that does stuff goes here
#
 
# Just remember to DBSession.flush() and transaction.commit()
# before exiting, to actually save your changes to the database.
commit_session()

Caveat Coder!

I got hung up on that logging code for a long time. It’s just two lines, but I found out the placement is very important. My hint came from the pylons-devel post linked to below.

It turns out that SQLAlchemy does some tricky stuff with logging. At first glance it’s ridiculous, but I’m disinclined to believe the SQLAlchemy guys would do something without reason. Maybe it’s a performance thing. You can see evidence of it inside sqlalchemy/log.py in the lines that reference “_should_log_info” and “_should_log_debug” variables. You can dig around the code for the details of where these are used, but essentially, SQLAlchemy sets flags that enable or disable logging based on whether the appropriate loggers are intialized and the appropriate logging levels are set at the time the engine is instantiated. Evidently, you can’t change log level after instantiation of the SQLAlchemy engine.

Related posts

These are the relevant links that my Googling turned up. I ended up figuring out the solution from the code, like I say above, but maybe these can help somebody else–not all the problems are identical.

  • just for kicks, I went through engine/base.py and pool.py and commented out all the “if _should_log_XXX” conditionals, allowing all the info() and debug() calls to happen, just not logging anything. A run of test_zoomark shows it adds about 10% method call overhead. I’m still disinclined to take the conditionals out, but I will try perhaps storing “shouldLogInfo()” etc. per Connection, which might cut out most of the superfluous calls while allowing logging to be “live” at least to the extent that you checked out a new connection from the pool.

    It might help to know that you can call (in 0.6) sqlalchemy.log._refresh_class_loggers which will re-set all the _should_log flags to the current reality, but let me try this thing first.

  • Anthony Theocharis March 1st, 2010 at 3:28 pm

    Hey Mike,

    Thanks for the follow up! It’s interesting to know there’s such an overhead saved by adding those conditionals.

    We’re running with 0.5.8 right now, but it’s good to hear there’s some love being given to the process for the next version of sqlalchemy.