Thursday, January 16, 2014

urllib3, the library used by the Python requests library


By Vasudev Ram



While checking out a tool that uses the requests HTTP library for Python, I happened to see that requests itself uses a library called urllib3 internally. (Here is urllib3 on PyPI.)

Since I had requests installed in my Python installation's directory, I searched for filenames like urllib3* in Python's lib/site-packages. I found the package there, in the directory:

requests/packages/urllib3

I also searched the Net and found this article by Kenneth Reitz, creator of the requests library:

Major Progress for Requests

in which he mentions collaborating with the creator of urllib3 to make use of it in requests.

urllib3 seems to have a good set of features, some of which are:

[
Re-use the same socket connection for multiple requests (HTTPConnectionPool and HTTPSConnectionPool) (with optional client-side certificate verification).

File posting (encode_multipart_formdata).

Built-in redirection and retries (optional).

Supports gzip and deflate decoding.

Thread-safe and sanity-safe.

Tested on Python 2.6+ and Python 3.2+, 100% unit test coverage.

Small and easy to understand codebase perfect for extending and building upon. For a more comprehensive solution, have a look at Requests which is also powered by urllib3.
]
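As a quick illustration of the file posting feature in the list above, here is a minimal sketch that builds a multipart/form-data body with encode_multipart_formdata, without sending anything over the network. It assumes a standalone urllib3 is installed and importable as urllib3 (rather than through requests.packages). The field names, the file name hello.txt, the boundary string and the helper name build_form are made up for the example.

```python
# multipart_sketch.py
# Sketch: build (but do not send) a multipart/form-data body with urllib3.
# Assumes urllib3 is installed as a standalone package.

from urllib3 import encode_multipart_formdata


def build_form():
    # Hypothetical form: one plain text field and one small "file" field.
    fields = {
        "comment": "uploaded by a test script",
        # (filename, file contents, MIME type)
        "attachment": ("hello.txt", b"hello world", "text/plain"),
    }
    # A fixed boundary keeps the output reproducible; by default
    # urllib3 picks a random one.
    return encode_multipart_formdata(fields, boundary="demo-boundary")


if __name__ == "__main__":
    body, content_type = build_form()
    print(content_type)
    print(body.decode("utf-8"))
```

The returned content type string is what would go into the Content-Type header of a POST request that carries the body.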

So after checking the urllib3 docs a bit, I wrote a small program to test urllib3 by using it to download the home page of my web site, dancingbison.com:

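# try_urllib3.py
# A program to try basic usage of the urllib3 Python library.

from requests.packages import urllib3

# A PoolManager creates and reuses connection pools as needed.
http = urllib3.PoolManager()
r = http.request('GET', 'http://dancingbison.com/index.html')

# This form of print works on both Python 2 and Python 3.
print("r.status: %s" % r.status)
print("r.data: %s" % r.data)

# r.data holds the raw response bytes, so write the file in binary mode.
with open("dancingbison_index.html", "wb") as out_fil:
    out_fil.write(r.data)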

It worked: the program downloaded index.html and saved it as dancingbison_index.html.

Interestingly, urllib3 itself uses httplib under the hood. So it's turtles at least 3 levels down ... :-)
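To get a feel for that bottom layer, here is a small sketch that uses the standard library directly (httplib is named http.client in Python 3) and sends two requests over a single connection, which is roughly the kind of socket re-use that urllib3's connection pools automate. It is not urllib3 code. So that it runs without Internet access, it starts a throwaway local server; the handler class, the function name and the paths /a and /b are invented for the example.

```python
# one_connection.py
# Sketch: two GET requests over one http.client connection (Python 3).

import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    # HTTP/1.1 keeps the connection open between requests by default.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        # Reply with the path and the client's port, so the caller can
        # see whether both requests arrived over the same TCP connection.
        text = "%s %d" % (self.path, self.client_address[1])
        body = text.encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch_twice_on_one_connection():
    # Port 0 asks the OS for any free port.
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()

    port = server.server_address[1]
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    results = []
    try:
        for path in ("/a", "/b"):
            conn.request("GET", path)
            resp = conn.getresponse()
            # Read the whole body before sending the next request.
            results.append((resp.status, resp.read()))
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return results


if __name__ == "__main__":
    for status, data in fetch_twice_on_one_connection():
        print(status, data.decode("ascii"))
```

If both printed lines end with the same client port number, the two requests shared one socket.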


- xtopdf: programmable PDF creation for business


Vasudev Ram - Python / open source / Linux training and consulting





2 comments:

matej said...

http://thread.gmane.org/gmane.comp.python.ideas/23946

I feel a lot of dislike for urllib3, and other libraries requests depends upon.

Vasudev Ram said...

@matej: Interesting, will check out that thread.