Dariusz on Software: Entries Tagged python

About This Site

Software development stuff

Entries tagged "python".

How to make SQLObject instances "picklable"?

Sat, 01 Nov 2008 20:17:53 +0000

I tried to store SQLObject instances in a session, but ran into problems pickling them. Here is a small base class for model classes that makes pickling and unpickling fast:

from sqlobject import *
import logging

class Base(SQLObject):

    def __getstate__(self):
        return self.id

    def __setstate__(self,id):
        obj = self.__class__.get(id)
        self.__dict__ = obj.__dict__

Just inherit from that class and that's all. Remember: restoring an object from the database is cheap, because SQLObject caches instances in memory.
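To see the mechanism in isolation, here is a minimal sketch written for Python 3 that needs neither SQLObject nor a database. The REGISTRY dict and the Record class are hypothetical stand-ins for the table and the model class; only the __getstate__ / __setstate__ pair mirrors the base class above.

```python
import pickle

# Hypothetical in-memory "table" standing in for the database and the instance cache.
REGISTRY = {}


class Record(object):

    def __init__(self, id, name):
        self.id = id
        self.name = name
        REGISTRY[id] = self

    @classmethod
    def get(cls, id):
        # Stand-in for SQLObject's get(): look the instance up by primary key.
        return REGISTRY[id]

    def __getstate__(self):
        # Only the primary key goes into the pickle.
        return self.id

    def __setstate__(self, id):
        # Re-attach to the stored instance by borrowing its attribute dict.
        self.__dict__ = self.__class__.get(id).__dict__


original = Record(1, "alice")
payload = pickle.dumps(original)
restored = pickle.loads(payload)
print(restored.name)
```

The pickled payload carries only the class reference and the id, so it stays small no matter how many columns the row has. One caveat of this approach: pickle skips __setstate__ when the state is falsy, so an id of 0 would not be restored.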

Tags: python, sql.

Automatically migrate SQL schema in SQLObject

Sun, 02 Nov 2008 18:58:35 +0000

When you're building an application you have to extend your model (and alter the database schema that sits underneath it). I've found a method to upgrade the SQL schema automatically (only ADD COLUMN operations for now):

from sqlobject import *
import logging

class Base(SQLObject):

    @classmethod
    def migrate(cls, *args, **kv):
        # Add every column declared on the model but missing from the table.

        conn = cls._connection
        schema = {}
        cols = cls.sqlmeta.columns

        # Columns that currently exist in the database table
        for a in conn.columnsFromSchema(cls.sqlmeta.table, cls):
            schema[a.name] = a
        logging.debug("existing schema: %s" % schema)

        for name, col in cols.items():
            if name not in schema:
                logging.info("add column: %s.%s" % (cls.__name__, name))
                conn.addColumn(cls.sqlmeta.table, col)

The migrate() class method only handles the simplest case (adding a new column), but that may be sufficient in a typical development process (more advanced transformations will still require SQL migration scripts written by hand). Enjoy!
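The same "compare declared columns with existing ones, then ADD COLUMN" idea can be tried without SQLObject. The sketch below is written for Python 3 and uses the standard sqlite3 module; the function name add_missing_columns and the person table are made up for illustration and are not part of SQLObject.

```python
import sqlite3


def add_missing_columns(conn, table, wanted):
    """Add columns from `wanted` (name -> SQL type) that `table` lacks.

    Table and column names are interpolated directly, so they must come
    from trusted code, never from user input.
    """
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute("PRAGMA table_info(%s)" % table)}
    added = []
    for name, sql_type in wanted.items():
        if name not in existing:
            conn.execute("ALTER TABLE %s ADD COLUMN %s %s" % (table, name, sql_type))
            added.append(name)
    return added


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")

model_columns = {"name": "TEXT", "email": "TEXT"}
first_run = add_missing_columns(conn, "person", model_columns)
second_run = add_missing_columns(conn, "person", model_columns)
print(first_run, second_run)
```

Running the function twice shows that the operation is idempotent: the second call finds nothing left to add.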

Tags: python, sql.

Django 1.1 released

Thu, 30 Jul 2009 07:40:03 +0000

A new version (1.1) of the very popular web framework Django has been released. It includes ORM improvements, better testing performance, conditional view processing and URL namespaces.

ORM Improvements

One of SQLObject's benefits over the Django ORM was, in my opinion, the ability to run aggregations without issuing raw SQL. This feature is now present in Django, which allows more efficient report implementations.

Model improvements

A table can now be marked as "unmanaged": syncdb/reset will not touch such a table. Proxy models and deferred fields are now available as well.

Testing improvements

A big optimisation for unit testing fans like me: tests now run inside a transaction (rolled back after each test), so performance is much better than before. I don't use the Django unit test infrastructure (I selected nosetests for this task), but I'm pleased to hear about this improvement. I hope it will make unit testing more popular among Django programmers.

Tags: django, python.

Migrate Database Schema in Django

Wed, 07 Oct 2009 19:49:59 +0000

As your application grows, your database must be extended along with the application's needs. Schema migrations can be a pain if done improperly. Let's review some methods for migrating your database schema.

Manual schema migrations, single database instance

This approach assumes you use a database administration tool to change the schema manually so that it matches the application. Typically there is a single "master" database that holds the "official" schema and is used for tests and (possibly as a cloned version) for production. It's the way most novice programmers attack the problem. This approach is simple and straightforward, but has few other benefits. Problems that may occur:

It's problematic to develop in parallel on branches: you need to change the schema, but other developers will see your changes, which may break their code
Schema migration is chaotic: it's very easy to forget to add a field to the production database

Migrations by series of SQL scripts, manual apply

Another, more advanced method is to use a series of SQL files that ALTER the database and optionally UPDATE some fields to reflect the migration. You can order those scripts by prefixing their names with a date, and have all developers patch their databases manually. Benefits:

Arbitrary schema modifications can be implemented this way

Some problems to notice:

It's still very easy to forget to apply some patches (or to apply them in the wrong order), leaving your environment broken
Additional work is needed to prepare scripts

Migrations by series of SQL scripts, applied automatically

A variation of the method above. SQL migration scripts are recorded (by name) in a special database table once applied. They are applied automatically (the system lists a special upgrade/*.sql directory) and already applied patches are silently skipped. Benefits:

The whole sequence of patches can be retested automatically (load an old database dump and run the upgrade)
Any complicated schema migration can be implemented this way
Automatic order of execution
The "double execution" problem is excluded (patch names are recorded)

We use this solution and it has worked perfectly through 6 years of continuous development (400 patches applied so far!).
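A minimal sketch of such a patch runner, written for Python 3 with the standard sqlite3 module, could look like the code below. It is not the code we use in production: the applied_patches table name, the apply_patches function and the two sample patch files are assumptions made up for the illustration.

```python
import os
import sqlite3
import tempfile


def apply_patches(conn, directory):
    """Apply every not-yet-recorded *.sql file from `directory` in name order."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applied_patches (name TEXT PRIMARY KEY)")
    done = {row[0] for row in conn.execute("SELECT name FROM applied_patches")}
    applied = []
    # Date-prefixed file names sort into the intended execution order.
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".sql") or name in done:
            continue
        with open(os.path.join(directory, name)) as f:
            conn.executescript(f.read())
        # Record the patch by name so it is never executed twice.
        conn.execute("INSERT INTO applied_patches (name) VALUES (?)", (name,))
        conn.commit()
        applied.append(name)
    return applied


# Hypothetical upgrade directory with two date-prefixed patches.
upgrade_dir = tempfile.mkdtemp()
with open(os.path.join(upgrade_dir, "2009-10-01_create_book.sql"), "w") as f:
    f.write("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT);")
with open(os.path.join(upgrade_dir, "2009-10-07_add_author.sql"), "w") as f:
    f.write("ALTER TABLE book ADD COLUMN author TEXT;")

conn = sqlite3.connect(":memory:")
first_run = apply_patches(conn, upgrade_dir)
second_run = apply_patches(conn, upgrade_dir)
print(first_run, second_run)
```

The second run applies nothing, which is exactly the "double execution" protection described above.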

Automatic schema migrations for ORM

ORMs (Object Relational Mappers) have a few ways to detect the current database schema and compare it with the existing model. Migrations can then be applied to the database automatically to make it work with the application. Examples:

Django Evolution - fills a gap in basic Django (where only new tables are created)
Deseb - Django External Schema Evolution Branch

Using Django Evolution is very simple:

./manage.py evolve --execute --hint --noinput

Similar for Deseb:

./manage.py evolvedb --noinput

Which approach should I choose?

I suggest starting with Django Evolution (or another ORM's equivalent) and then, if migrations become more advanced, switching to registered, automatically applied SQL scripts. This way you get speed at the start of a project and, later on, more confidence thanks to manually created SQL patches.

Tags: django, python.

How To Migrate Django To Different Database Backend

Sat, 24 Apr 2010 21:41:57 +0000

Changing the database location is simple: run a dump on the source database server, import it into the destination database, redirect the domain and voila! You can use the same method to migrate your database to a newer version of the database engine. But what can you do if you realize the whole backend must be changed (e.g. from MySQL to PostgreSQL)?

Migrating an SQL dump to a different database dialect is not very easy (column types and date formats are the first examples that come to mind). But you don't have to operate on SQL dumps. The simple answer here is: "dumpdata".

Django uses a special manage.py script for typical operations such as initialising the database, preloading data, dropping the database etc. The command:

manage.py dumpdata appname

prints all data contained in appname to stdout in the universal JSON format. Once that output has been redirected to a file such as filename.json, you can load the dump with:

manage.py sqlreset appname | psql ...
manage.py loaddata filename.json

The database state must be reset before the import, which is why sqlreset is used. On its own, sqlreset only prints DROP TABLE (and CREATE TABLE) statements to stdout; piping them into an SQL execution tool purges the application's tables from the database.

Additionally, you can gzip the resulting JSON data to make the transfer (much) faster:

manage.py dumpdata appname | gzip -c | ssh destinationserv 'cat > data.json.gz'
(login to destinationserv ...)
manage.py sqlreset appname | psql ...
manage.py loaddata data.json.gz

Happy migrating!

Tags: django, python.

Python Web Framework Selection

Tue, 11 Oct 2011 22:32:18 +0000

I've been using many different Python Web Frameworks so far:

WebStack + mysqldb + custom template framework (long-term ERP project for small company)
Django (online shop)
CherryPy + Cheetah (online PDF design)
Zope (financial)
WebStack + SimpleTAL (online PDF design)

Every framework has its strengths and weaknesses. For a new project that will handle appointments using an existing calendar I decided to give web2py a try. My rationale:

all stuff included on board, no manual integration of 3rd party libraries
stable API
small and elegant
integrates with GAE (with subset of DB layer)
template selection separated from controller (easier unit testing)
easy template syntax (reuses Python code embedded into markup language)

After the first phase of the product I'll report whether the expectations above were correct and what kind of problems I encountered (if any).

Tags: python, web.

Static verification tool for web2py templates

Sun, 15 Jan 2012 23:09:24 +0000

Web2py is a full-stack Python web framework that can be compared to Django, but is easier to learn thanks to its preference for convention over explicit statements. In this article I'll check how the static verification techniques I have developed for many different environments (JSP, Django templates, TAL, ...) can be applied to web2py.

Static verification means locating simple bugs without running the application, so very high (>95%) test coverage (and the high cost that comes with it) is not required. Instead of trying to cover every possible screen/workflow/code line/... with tests, we can scan the whole codebase and check it against some constraints. Most of them (based on my experience) are static: they do not depend on runtime data and thus can be checked effectively without running the application.

Here's a short example of the web2py template language:

<a href="{{=URL('books','show')}}">...</a>

As you can see, web2py substitutes each {{=<python-expression>}} with its evaluated result. In this example the URL points to the existing module controllers/books.py and to the function named 'show' inside that module. I assume you see the problem here: one can reference an undefined module or function, and that will result in a runtime error.

This first example is purely static: the reference (URL('books','show')) will not change at runtime, and neither will the source code. So our static checker can be applied successfully: check that every URL in every *.html file has a matching function defined in the source code.

The technical solution can be broken down into the following steps:

locating all resources to check: scanning a given directory in the filesystem tree
locating all interesting fragments in an HTML file: I used regexps with groups to easily extract the interesting data
locating functions in *.py code: because controllers are not plain Python modules (they expect some data in the global namespace) I decided just to scan them textually
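The second and third steps can be tried on in-memory strings before touching the filesystem. The following is a reduced sketch written for Python 3, not the checker itself: the helper names, the sample template and the sample controller source are invented for the illustration.

```python
import re

# Extract the controller/function pair from {{=URL('controller','function')}}.
RE_URL2 = re.compile(r"\{\{=URL\('([^']*)' *, *'([^']*)'")


def find_url_refs(html):
    """Return a list of (controller, function) pairs referenced in a template."""
    return RE_URL2.findall(html)


def defines_function(controller_source, function_name):
    """Textual scan: web2py actions are plain functions without arguments."""
    return ("def " + function_name + "()") in controller_source


def missing_functions(html, controllers):
    """`controllers` maps a controller name to its source text."""
    problems = []
    for controller, function in find_url_refs(html):
        source = controllers.get(controller, "")
        if not defines_function(source, function):
            problems.append((controller, function))
    return problems


template = """<a href="{{=URL('books','show')}}">Show</a>
<a href="{{=URL('books','edit')}}">Edit</a>"""
controllers = {"books": "def show():\n    return dict()\n"}

problems = missing_functions(template, controllers)
print(problems)
```

Here the reference to 'edit' is reported because the sample controller only defines show(). A textual scan like this is deliberately crude: it would miss an action defined with extra whitespace or aliased under another name.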

Another check that can be done covers references inside HTML files (to CSS resources, JS files, ...). This can also be automated:

<script type="text/javascript" src="svgcanvas.min.js"></script>

Source code refactorings might break your links/references, and a static scan can ensure you are not breaking the application by refactoring.

Complete source code of the URL() / SRC= / HREF= checker:

import sys

sys.path.append("web2py/applications")
sys.path.append("web2py")

import os
import copy
import re
import string
from gluon.globals import *
from gluon.http import *
from gluon.html import *

RE_SRC = re.compile(r'src *= *"([^"]*)"')
RE_HREF = re.compile(r'href *= *"([^"]*)"')
RE_URL = re.compile(r'{{=URL\(\'([^\']*)\'')
RE_URL2 = re.compile(r'{{=URL\(\'([^\']*)\' *, *\'([^\']*)\'')

FILENAME= None
FNR = 0

request = Request()

def report_error(s):

    print "%s:%d: %s" % (FILENAME, FNR, s)

def check_src_exists(arg):

    if arg.startswith("{"):
        # variable value
        return

    elif arg.find("{{") >= 0:
        # embedded template expression, skip this URL
        pass

    elif arg[0] == "/":
        # absolute file
        fullPath = "web2py/applications" + "/" + arg
        if not os.path.exists(fullPath):
            report_error("file %s doesn't exists" % fullPath)

    else:
        # relative file
        fullPath = os.path.dirname(FILENAME) + "/" + arg
        if not os.path.exists(fullPath):
            report_error("file %s doesn't exists" % fullPath)

def check_href_exists(arg):

    #print arg
    if arg.startswith("{{="):
        pass
    elif arg.find("{{") > 0:
        pass
    elif arg.startswith("/"):
        if not os.path.exists("web2py/applications" + arg):
            report_error("absolute file %s doesn't exists" % arg)
    elif arg.startswith("http://"):
        # external link, do not check
        pass
    elif arg.startswith("https://"):
        # external link, do not check
        pass
    elif arg.startswith("mailto:"):
        # external link, do not check
        pass
    elif arg.startswith("javascript:"):
        # external link, do not check
        pass
    elif arg.find("#") == 0:
        # anchor, skip
        pass
    else:
        fullPath = os.path.dirname(FILENAME) + "/" + arg
        if not os.path.exists(fullPath):
            report_error("relative file %s doesn't exists" % fullPath)

def templatePathToPythonPath(templatePath):

    return string.join(templatePath.replace("/views/", "/controllers/").split("/")[:-1], "/") + ".py"

def eq(got, expected):
    if got != expected:
        print "got:'%s' != expected:'%s'" % (got, expected)
        return False
    return True

assert eq(templatePathToPythonPath(
    "web2py/applications/ad/views/default/details.html"),
    "web2py/applications/ad/controllers/default.py")
assert eq(templatePathToPythonPath(
    "web2py/applications/ad/views/default/index.html"),
    "web2py/applications/ad/controllers/default.py")
assert eq(templatePathToPythonPath(
    "web2py/applications/examples/views/ajax_examples/index.html"),
    "web2py/applications/examples/controllers/ajax_examples.py")

name_to_contents = {}
def get_file_contents(fileName):

    global name_to_contents
    if not name_to_contents.has_key(fileName):
        if os.path.exists(fileName):
            f = file(fileName)
            name_to_contents[fileName] = f.read()
            f.close()
        else:
            report_error("Cannot load %s" % fileName)
            name_to_contents[fileName] = ""
    return name_to_contents[fileName]

def check_url_exists(url):

    if FILENAME.find("appadmin.html") > 0:
        return

    if url.find(".") > 0:
        functionName = url.split(".")[-1]
    else:
        functionName = url

    # print "check_url_exists(%s) moduleName=%s" % (url, moduleName)
    pythonFilePath = templatePathToPythonPath(FILENAME)

    if get_file_contents(pythonFilePath).find("def " + functionName + "()") < 0:
        report_error("cannot find %s in %s" % (functionName, pythonFilePath))

def check_url2_exists(moduleName, functionName):

    if moduleName == "static":
        return

    # print "check_url_exists(%s) moduleName=%s" % (url, moduleName)
    pythonFilePath = string.join(FILENAME.replace("/views/", "/controllers/").split("/")[:-1], "/") + "/" + moduleName + ".py"

    if get_file_contents(pythonFilePath).find("def " + functionName + "()") < 0:
        report_error("cannot find %s in %s" % (functionName, pythonFilePath))

def scan_file(path):

    global FILENAME
    global FNR

    FILENAME = path
    FNR = 0

    f = file(path)

    while 1:
        FNR += 1
        line = f.readline()
        if not line:
            break

        m = RE_SRC.search(line)
        if m:
            check_src_exists(m.group(1))

        m = RE_HREF.search(line)
        if m:
            check_href_exists(m.group(1))

        m = RE_URL2.search(line)
        if m:
            check_url2_exists(m.group(1), m.group(2))
        else:
            m = RE_URL.search(line)
            if m:
                check_url_exists(m.group(1))

    f.close()

def test_html(directory):

    for a in os.listdir(directory):
        if a == "epydoc":
            continue
        p = directory + "/" + a
        if os.path.isdir(p):
            test_html(p)
        if a.endswith(".html"):
            scan_file(p)

test_html("web2py/applications")

Tags: python, quality, static.

Web2py Lighttpd Deployment

Sun, 19 Feb 2012 07:29:07 +0000

Web2py is a "full stack" Python web framework and lighttpd is a fast, lightweight HTTP server. I'll present a method for running a web2py-based application behind lighttpd.

I assume the following setup is already done:

A domain named "myapp.com" is configured to point to your server
Python / lighttpd is already installed on server
Your web2py app is placed under /var/www/web2py
Your web2py app has application "myapproot" configured

First of all, you have to configure lighttpd so that it can locate the web2py application. Create the file /etc/lighttpd/conf-enabled/myapp.conf:

$HTTP["host"] =~ "(www\.)?myapp\.com" {
    server.indexfiles = ( "/myapproot" )
    server.document-root = "/var/www/myapp"
    server.dir-listing = "disable"
    fastcgi.server = (
        ".fcgi" => ("localhost" => (
            "check-local" => "disable",
            "min-procs" => "1",
            "max-procs" => "2",
            "socket" => "/tmp/myapp.sock")
        )
    )

    url.rewrite-once = (
        "^/$" => "/ad",
        "^(/.+?/static/.+)$" => "/applications$1",
        "(^|/.*)$" => "/fcgihandler.fcgi$1",
    )
    $HTTP["url"] !~ "^(/ad|/fcgihandler.fcgi|/applications/myapproot/static/)" {
        url.access-deny = ("")
    }
}

Explanation:

(www\.)?myapp\.com: regular expression matching the domain with or without the "www." prefix
server.indexfiles: specifies the relative URL that should be called when only the domain is given
server.document-root: specifies the location of the web2py app in the filesystem
server.dir-listing: we do not want users to list our files over HTTP
fastcgi.server: specifies where the socket file is located
url.rewrite-once: allows elegant (short) URLs
url.access-deny: everything other than the static directory should be forbidden (security)
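The host pattern can be checked quickly in Python before reloading the server. This is only an approximation: it assumes that lighttpd's =~ operator performs an unanchored regular expression search, which is what re.search does, and the host names below are invented test values.

```python
import re

# Same pattern as in the $HTTP["host"] condition above.
HOST_PATTERN = re.compile(r"(www\.)?myapp\.com")
# Stricter variant with explicit anchors (an optional hardening, not in the config above).
ANCHORED_PATTERN = re.compile(r"^(www\.)?myapp\.com$")


def host_matches(pattern, host):
    # re.search looks for the pattern anywhere in the host name.
    return pattern.search(host) is not None


for host in ("myapp.com", "www.myapp.com", "example.org", "notmyapp.com.example.org"):
    print(host, host_matches(HOST_PATTERN, host), host_matches(ANCHORED_PATTERN, host))
```

Under that assumption the unanchored pattern also accepts a host such as notmyapp.com.example.org, so adding ^ and $ may be worth considering if other virtual hosts live on the same server.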

Then you have to configure fcgihandler.fcgi script properly:

(...)
fcgi.WSGIServer(application, bindAddress='/tmp/myapp.sock').run()

Note that /tmp/myapp.sock must be the same as specified in lighttpd configuration.

Then you have to start the fcgihandler.fcgi process and ensure it starts on every boot. That's all.

Tags: httpd, python, web.

Migration to python subprocess module

Sat, 25 May 2013 10:55:19 +0000

After a recent OS upgrade one of my unit tests started to fail (to be precise, it started to hang). A quick check showed me that the CGI process started by os.popen() was hanging. The old source code:



f = os.popen("./cgi_script.cgi > /dev/null", "w")
f.write(postBody)
f.flush()
f.close()

As os.popen() is deprecated now (I know, it's a very old codebase that started with Python 1.5), I've moved to the newer subprocess module:



fNull = file("/dev/null", "w")
p = subprocess.Popen("./cgi_script.cgi", shell=False, bufsize=1024, stdin = subprocess.PIPE, stdout = fNull)
fw = p.stdin
fw.write(postBody)
fw.flush()
fw.close()
del p

As you can see it's more verbose now, but I've eliminated the shell (slightly faster operation).

Some notes found during migration:

without "del p" the process may not be terminated, causing problems with DB state (the CGI process updates the database and the test checks this state later)
I/O configuration is more flexible than with os.popen(): you can set up pipes more easily
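For comparison, here is how the same pattern might look on Python 3, where waiting for the child explicitly replaces the reliance on "del p". This is a sketch under assumptions: the run_with_stdin helper is made up, and a tiny inline Python script stands in for cgi_script.cgi so that the example runs anywhere.

```python
import subprocess
import sys


def run_with_stdin(argv, body):
    """Start argv without a shell, feed `body` to its stdin, discard stdout."""
    proc = subprocess.Popen(
        argv, stdin=subprocess.PIPE, stdout=subprocess.DEVNULL)
    # communicate() writes the body, closes stdin and waits for the child to exit.
    proc.communicate(body)
    return proc.returncode


# Stand-in for the CGI script: exits with 0 only if it received the expected body.
child = [
    sys.executable, "-c",
    "import sys; data = sys.stdin.read(); "
    "sys.exit(0 if data == 'a=1&b=2' else 3)",
]

exit_code = run_with_stdin(child, b"a=1&b=2")
print(exit_code)
```

Because communicate() returns only after the child has exited, any database update made by the child is finished by the time the test inspects the state.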

Tags: python.
