jugad2 - Vasudev Ram on software innovation: PyMySQL

Tuesday, December 30, 2014

pyDAL, a pure Python Database Abstraction Layer

pyDAL is a pure Python Database Abstraction Layer. It seems to be roughly comparable to SQLAlchemy Core, the lower layer of SQLAlchemy, on which the upper layer, SQLAlchemy ORM, is built. See the SQLAlchemy (0.8) documentation.

From the pyDAL site:

[ It dynamically generates the SQL in real time using the specified dialect for the database back end, so that you do not have to write SQL code or learn different SQL dialects (the term SQL is used generically), and your code will be portable among different types of databases.

pyDAL comes from the original web2py's DAL, with the aim of being wide-compatible. pyDAL doesn't require web2py and can be used in any Python context. ]

IOW, pyDAL has been separated out into a different project from web2py, a Python web framework, of which it was originally a part.

Whether to use an ORM (Object Relational Mapper), write plain SQL, or use an intermediate option like pyDAL or SQLAlchemy Core can be controversial; each of the three approaches has pros and cons. I have read some about this and have used some of these options in different projects, but I am not an expert on which approach is best, and the answer can vary with your project's needs, so I'm not getting into that topic in this post.

pyDAL seems to support many popular databases, mostly SQL ones, but also a NoSQL one or two, and even IMAP. Here is a list, from the site: SQLite, PostgreSQL, MySQL, Oracle, MSSQL, FireBird, DB2, Informix, Ingres, Cubrid, Sybase, Teradata, SAPDB, MongoDB, IMAP.

For some of those databases, it uses PyMySQL, pyodbc or fdb, which are all Python database libraries that I had blogged about earlier.

I tried out pyDAL a little, with this simple program, adapted from its documentation:

import sys
import time
from pydal import DAL, Field
db = DAL('sqlite://storage.db')
db.define_table('product', Field('name'))
t1 = time.time()
num_rows = int(sys.argv[1])
for product_number in range(num_rows):
    # The '{}' placeholder is needed; without it every row is named just 'Product-'.
    db.product.insert(name='Product-{}'.format(str(product_number).zfill(4)))
t2 = time.time()
print "time to insert {} rows = {} seconds".format(num_rows, int(t2 - t1))
query = db.product.name
t1 = time.time()
rows = db(query).select()
for idx, row in enumerate(rows):
    #print idx, row.name
    pass
t2 = time.time()
print "time to select {} rows = {} seconds".format(num_rows, int(t2 - t1))

It worked, and gave this output:

$ python test_pydal2.py 100000
No handlers could be found for logger "web2py"
time to insert 100000 rows = 18 seconds
time to select 100000 rows = 7 seconds

Note: I first ran it with this statement uncommented:

print idx, row.name

to confirm that it did select the records, and then commented it and replaced it with "pass" in order to time the select without the overhead of displaying the records to the screen.

I'll check out pyDAL some more, for other commonly needed database operations, and may write about it here.
There may be a way to disable that message about a logger.
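
One possible way to silence that message, as a minimal sketch: it assumes the message comes from Python 2's logging module finding no handler attached to the logger named "web2py" (the name shown in the output above). I have not verified this against pyDAL itself. The lines would go before the pydal import:

```python
import logging

# Attach a do-nothing handler to the logger named in the message, so that
# the logging module no longer reports that the logger has no handlers.
# Assumption: pyDAL logs under the name "web2py", as the message suggests.
logging.getLogger('web2py').addHandler(logging.NullHandler())
```

logging.NullHandler is available from Python 2.7 onward.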

The timing statements in the code and the time output can be ignored for now, since they are not meaningful without doing a comparison against the same operations done without pyDAL (i.e. just using SQL from Python with the DB API). I will do a comparison later on and blog about it if anything interesting is found.
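
As a rough sketch of what such a baseline could look like, here is the same insert-then-select sequence using only the sqlite3 module from the standard library (a DB API driver). The function name time_sqlite_baseline and the in-memory default for db_path are my assumptions for illustration; the pyDAL program above uses a file, storage.db, so the two are not directly comparable unless the same kind of storage is used for both.

```python
import sqlite3
import time


def time_sqlite_baseline(num_rows, db_path=':memory:'):
    """Insert and read back num_rows product names using only the DB API.

    Returns (insert_seconds, select_seconds, rows_read).
    """
    conn = sqlite3.connect(db_path)
    curs = conn.cursor()
    curs.execute('create table product (id integer primary key, name text)')

    # Time the inserts, one statement per row, as in the pyDAL program.
    t1 = time.time()
    for product_number in range(num_rows):
        name = 'Product-{}'.format(str(product_number).zfill(4))
        curs.execute('insert into product (name) values (?)', (name,))
    conn.commit()
    insert_seconds = time.time() - t1

    # Time the select, iterating over the rows without printing them.
    t1 = time.time()
    rows_read = 0
    for row in curs.execute('select id, name from product'):
        rows_read += 1
    select_seconds = time.time() - t1

    conn.close()
    return insert_seconds, select_seconds, rows_read


if __name__ == '__main__':
    ins, sel, n = time_sqlite_baseline(100000)
    print('time to insert {} rows = {:.2f} seconds'.format(n, ins))
    print('time to select {} rows = {:.2f} seconds'.format(n, sel))
```

Passing a file name as db_path (for example 'baseline.db', a made-up name) would make the run closer to the pyDAL one.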

- Vasudev Ram - Dancing Bison Enterprises - Python training and consulting


Monday, November 10, 2014

PyMySQL, a pure-Python client library for MySQL

By Vasudev Ram

I came across PyMySQL on GitHub today. Excerpt from the PyMySQL GitHub page:

[ This package contains a pure-Python MySQL client library. The goal of PyMySQL is to be a drop-in replacement for MySQLdb and work on CPython, PyPy, IronPython and Jython. ]

In other words, PyMySQL is a pure-Python MySQL driver. As the docs say, it aims to be a drop-in replacement for MySQLdb, which is also a Python driver for MySQL, but is implemented as a Python C extension. One advantage of a pure-Python library over a C extension is that you do not have to build it from source. Potentially, it may also work without any build steps or code changes on various Python implementations, such as CPython, PyPy, etc., as the excerpt above says. Also see this Stack Overflow question:

What actually is pymysql and how it differs from mysqldb?

Here is PyMySQL on PyPI (the Python Package Index).

The statistics for PyMySQL on PyPI show this:

Downloads (All Versions):
104 downloads in the last day
7168 downloads in the last week
49124 downloads in the last month

So I tried out PyMySQL a bit, on CPython 2.7.8 on Ubuntu Linux.

To install PyMySQL, do:

$ pip install pymysql

I checked whether it was properly installed with:

$ python
>>> import PyMySQL

Oops, didn't work. I then tried:

>>> import pymysql

That did work.

Here is the schema of the MySQL table (called sales, in a database also called sales) that I used to try PyMySQL:

$ mysql -u root
mysql> use sales;

Database changed
mysql> desc sales;
+------------+-------------+------+-----+---------+-------+
| Field      | Type        | Null | Key | Default | Extra |
+------------+-------------+------+-----+---------+-------+
| id         | int(11)     | YES  |     | NULL    |       |
| region     | varchar(6)  | YES  |     | NULL    |       |
| item       | varchar(15) | YES  |     | NULL    |       |
| unit_price | int(11)     | YES  |     | NULL    |       |
| quantity   | int(11)     | YES  |     | NULL    |       |
+------------+-------------+------+-----+---------+-------+
5 rows in set (0.01 sec)

mysql>

Basically, it is a table to record sales data by region, for items with unit prices and quantities sold.

And here is my Python program to try PyMySQL:

$ cat test_pymysql.py

import pymysql
conn = pymysql.connect(host='localhost', port=3306, user='some_user', passwd='some_password', db='sales')
curs = conn.cursor()
curs.execute('select * from sales')
# Added this to give a proper header for the output, with field names.
field_names = [ item[0] for item in curs.description ]
for field_name in field_names:
    print field_name.rjust(12),
print
for row in curs:
    for col in row:
        print str(col).rjust(12),
    print
print
curs.close()
conn.close()

I ran the program and got the correct output:

$ python test_pymysql.py
          id       region         item   unit_price     quantity
           1        North        Chair          100            2
           2        North        Table          200            3
           3        South         Desk          300            1
           3        South        Shelf          400            2
$

I also ran this other SQL query with the GROUP BY clause, to get the region-wise sales:

curs.execute('select region, sum(unit_price * quantity) as value from sales group by region order by 1')

and got the expected output:

$ python test_pymysql.py
      region        value
       North          800
       South         1100
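
For readers without a MySQL server at hand, the same query and the same header technique (taking field names from curs.description) can be tried with the sqlite3 module from the standard library. This is only a sketch: it uses an in-memory SQLite database in place of MySQL, with column types that are SQLite approximations of the schema above, and with the four rows copied from the first output above.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL sales table shown earlier.
conn = sqlite3.connect(':memory:')
curs = conn.cursor()
curs.execute('create table sales (id integer, region text, item text, '
             'unit_price integer, quantity integer)')
curs.executemany('insert into sales values (?, ?, ?, ?, ?)', [
    (1, 'North', 'Chair', 100, 2),
    (2, 'North', 'Table', 200, 3),
    (3, 'South', 'Desk', 300, 1),
    (3, 'South', 'Shelf', 400, 2),
])

curs.execute('select region, sum(unit_price * quantity) as value '
             'from sales group by region order by 1')

# Each entry in curs.description is a sequence whose first item is the column name.
field_names = [item[0] for item in curs.description]
print(' '.join(name.rjust(12) for name in field_names))

result = curs.fetchall()
for row in result:
    print(' '.join(str(col).rjust(12) for col in row))

curs.close()
conn.close()
```

The totals agree with the rows listed earlier: North is 100 * 2 + 200 * 3 = 800, and South is 300 * 1 + 400 * 2 = 1100.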

The PyMySQL GitHub page shows how it works: PyMySQL implements the client side of the MySQL Client/Server Protocol. The MySQL database server listens for database requests from clients on port 3306, and the clients (which may be on the same machine or on another machine on the network) send their requests to the server at that port. The protocol also supports UNIX domain sockets, so the connect method has a named parameter unix_socket=None.
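
To make the two connection styles concrete, here is a small sketch that builds the keyword arguments for pymysql.connect() either for a TCP connection or for a UNIX domain socket. The helper name mysql_connect_args is hypothetical, and the socket path is only an assumption (the location differs between systems). The argument names passwd and db are the ones used in my test program above.

```python
def mysql_connect_args(user, passwd, db, host='localhost', port=3306,
                       unix_socket=None):
    """Build keyword arguments for pymysql.connect() (hypothetical helper)."""
    args = {'user': user, 'passwd': passwd, 'db': db}
    if unix_socket is not None:
        # Local connection through a UNIX domain socket file.
        args['unix_socket'] = unix_socket
    else:
        # TCP connection to the server's host and port.
        args['host'] = host
        args['port'] = port
    return args


tcp_args = mysql_connect_args('some_user', 'some_password', 'sales')

# The socket path below is an assumption; check your MySQL configuration.
sock_args = mysql_connect_args('some_user', 'some_password', 'sales',
                               unix_socket='/var/run/mysqld/mysqld.sock')

# With PyMySQL installed and a server running, either dict could be used as:
#     conn = pymysql.connect(**tcp_args)
```

The sketch itself does not import PyMySQL or open a connection, so it runs without a database server.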

Related links:

PyPy, a Python interpreter and Just-In-Time compiler

IronPython, Python for .NET

Jython, Python for the JVM

MySQL

- Vasudev Ram - Dancing Bison Enterprises
