A follow-up to the SQLAlchemy vs sqlite post.
It's only now that I realize what a bad idea the SQLAlchemy approach is for loading data. The ORM is good for querying, which is what I plan to use it for, but I finally have a solution for loading data that uses sqlite directly, along with some fast Python algorithms for parsing the data.
It took a lot of tweaking, but I have finally brought the load time down to 5 minutes. It was 30 minutes initially and went up to 2 hours when using SQLAlchemy. I think it can be tweaked even further. If anyone is reading this blog at all and is interested, let me know and I can post the files up somewhere.
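To give a rough idea of the general approach (this is only a minimal sketch, not the actual loader), the code below parses lines in plain Python and hands them to the standard library `sqlite3` module in one batch inside a single transaction. The `records` table, the tab-separated `name<TAB>value` format, and the `parse_line` and `load` helpers are all assumptions made up for illustration.

```python
import sqlite3


def parse_line(line):
    # Hypothetical input format: "name<TAB>value", one record per line.
    name, value = line.rstrip("\n").split("\t")
    return name, int(value)


def load(lines, db_path=":memory:"):
    """Parse an iterable of lines and bulk-insert them into sqlite."""
    conn = sqlite3.connect(db_path)
    # Hypothetical table, created only for this illustration.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records (name TEXT, value INTEGER)"
    )
    # Using the connection as a context manager wraps the whole batch in
    # one transaction: it commits on success and rolls back on an error.
    with conn:
        # executemany accepts any iterable of parameter tuples, so the
        # parsed rows are streamed in without building a full list first.
        conn.executemany(
            "INSERT INTO records (name, value) VALUES (?, ?)",
            (parse_line(line) for line in lines),
        )
    return conn


if __name__ == "__main__":
    conn = load(["alpha\t1\n", "beta\t2\n"])
    print(conn.execute("SELECT COUNT(*) FROM records").fetchone()[0])
```

Committing once per batch rather than once per row, and skipping the per-object bookkeeping an ORM does, is the usual reason this style of loading tends to be faster. How much faster depends on the data and the schema.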
Here are some links I found that were really helpful in tweaking my Python code:
http://www.python.org/doc/essays/list2str.html - Guido van Rossum's essay about optimization.
http://www.tbray.org/ongoing/When/200x/2007/10/30/WF-Results - The Wide Finder project. An interesting project about parsing text using different approaches and different scripting languages.
http://effbot.org/zone/wide-finder.htm - A really useful article about optimization techniques for parsing with Python.