jugad2 - Vasudev Ram on software innovation

Follow me on: LinkedIn * Twitter

Are you a blogger with some traffic? Get Convertkit:

- Vasudev Ram - Online Python training and consulting

Saturday, May 5, 2018

A Python version of the Linux watch command

By Vasudev Ram

Watcher image attribution: Yours truly

Hi readers,

[ Update: A note to those reading this post via Planet Python or other aggregators:

Before first publishing it, I reviewed the post in Blogger's preview mode, and it appeared okay regarding the use of the less-than character, so I did not escape it. I did not know (or did not remember) that Planet Python's behavior may be different. As a result, the code appeared without the less-than signs in the Planet, thereby garbling it. After noticing this, I fixed the issue in the post. Apologies to those seeing the post twice as a result. ]

I was browsing Linux command man pages (section 1) for some work, and saw the page for an interesting command called watch. I had not come across it before. So I read the watch man page, and after understanding how it works (it's pretty straightforward [1]), thought of creating a Python version of it. I have not tried to implement exactly the same functionality as watch, though, just something similar to it. I called the program watch.py.

[1] The one-line description of the watch command is:

watch - execute a program periodically, showing output fullscreen

How watch.py works:

It is a command-line Python program. It takes an interval argument (in seconds), followed by a command with optional arguments. It runs the command with those arguments, repeatedly, at that interval. (The Linux watch command has a few more options, but I chose not to implement those in this version. I may add some of them [2], and maybe some other features that I thought of, in a future version.)

[2] For example, the -t, -b and -e options should be easy to implement. The -p (--precise) option is interesting. The idea here is that there is always some time "drift" [3] when trying to run a command periodically at some interval, due to unpredictable and variable overhead of other running processes, OS scheduling overhead, and so on. I had experienced this issue earlier when I wrote a program that I called pinger.sh, at a large company where I worked earlier.

[3] You can observe the time drift in the output of the runs of the watch.py program, shown further below, after its code. Compare the interval with the times shown for successive runs of the same command.

I had written it at the request of some sysadmin friends there, who wanted a tool like that to monitor the uptime of multiple Unix servers on the company network. So I wrote the tool, using a combination of Unix shell, Perl and C. They later told me that it was useful, and they used it to monitor the uptime of multiple servers of the company in different cities. The C part was where the more interesting stuff was, since I used C to write a program (used in the overall shell script) that sort of tried to compensate for the time drift, by doing some calculations about remaining time left, and sleeping for those intervals. It worked somewhat okay, in that it reduced the drift a good amount. I don't remember the exact logic I used for it right now, but do remember finding out later, that the gettimeofday function might have been usable in place of the custom code I wrote to solve the issue. Good fun. I later published the utility and a description of it in the company's Knowledge Management System.
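
As mentioned, I don't remember the exact logic of that C program, so the following is not a reconstruction of it. It is only a minimal Python sketch of the general idea of drift compensation: instead of always sleeping for the full interval, compute when the next run should start relative to the first one, and sleep only for the time remaining. The names next_sleep and run_periodically are made up for this illustration.

```python
import time

def next_sleep(start, interval, runs_done, now):
    """Return the seconds to sleep so that the next run starts on schedule.

    The next run is scheduled at start + runs_done * interval. If we are
    already past that time, do not sleep at all.
    """
    target = start + runs_done * interval
    return max(0.0, target - now)

def run_periodically(task, interval, count, clock=time.time, sleep=time.sleep):
    """Call task() count times, aiming for one start every interval seconds."""
    start = clock()
    for runs_done in range(1, count + 1):
        task()
        # Sleep only for what is left of the current interval, so the time
        # taken by task() itself does not accumulate as drift.
        sleep(next_sleep(start, interval, runs_done, clock()))
```

I used time.time() here because it is the closest Python counterpart to gettimeofday; on Python 3, time.monotonic() may be a better choice, since it is not affected by system clock adjustments.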

Anyway, back to watch.py: each time, it first prints a header line with the interval, the command string (truncated if needed), and the current date and time, followed by some initial lines of the output of that command (this is what "watching" the command means). It does this by creating a pipe with the command, using subprocess.Popen and then reading the standard output of the command, and printing the first num_lines lines, where num_lines is an argument to the watch() function in the program.

The screen is cleared with "clear" for Linux and "cls" for Windows. Using "echo ^L" instead of "clear" works on some Linux systems, so changing the clear screen command to that may make the program a little faster, on systems where echo is a shell built-in, since there will be no need to load the clear command into memory each time [4]. (As a small aside, on earlier Unix systems I've worked on, on which there was sometimes no clear command (or it was not installed), as a workaround, I used to write a small C program that printed 25 newlines to the screen, and compile and install that as a command called clear or cls :)
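
Another way to avoid starting an external command for each screen clear is to write an ANSI escape sequence directly to the terminal. This is only a sketch and not what watch.py does; it assumes a terminal that understands ANSI escape sequences, which may not be true of older Windows consoles. The name clear_screen is made up for this example.

```python
import sys

# ESC [ 2 J erases the whole display; ESC [ H moves the cursor to the
# top-left corner.
CLEAR_SEQUENCE = "\033[2J\033[H"

def clear_screen(out=sys.stdout):
    """Clear an ANSI-capable terminal without running an external command."""
    out.write(CLEAR_SEQUENCE)
    out.flush()
```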

[4] Although, on recent Windows and Linux systems, after a program is run once, if you run it multiple times a short while later, I've noticed that the startup time is faster from the second time onwards. I guess this is because the OS loads the program code into a memory cache in some way, and runs it from there for the later times it is called. Not sure if this is the same as the OS buffer cache, which I think is only for data. I don't know if there is a standard name for this technique. I've noticed for sure, that when running Python programs, for example, the first time you run:

python some_script_name.py

it takes a bit of time - maybe a second or three, but after the first time, it starts up faster. Of course this speedup disappears when you run the same program after a bigger gap, say the next day, or after a reboot. Presumably this is because that program cache has been cleared.

Here is the code for watch.py.

"""
------------------------------------------------------------------
File: watch.py
Version: 0.1
Purpose: To work somewhat like the Linux watch command.
See: http://man7.org/linux/man-pages/man1/watch.1.html
Does not try to replicate its functionality exactly.

Author: Vasudev Ram
Copyright 2018 Vasudev Ram
Web site: https://vasudevram.github.io
Blog: https://jugad2.blogspot.com
Product store: https://gumroad.com/vasudevram
Twitter: https://mobile.twitter.com/vasudevram
------------------------------------------------------------------
"""

from __future__ import print_function

import sys
import os
from subprocess import Popen, PIPE
import time

from error_exit import error_exit

# Number of lines of command output to show. Assumes a 25-line
# terminal; adjust if different.
# If on Unix / Linux, can get value of environment variable
# LINES (if defined) and use that instead of 25.
DEFAULT_NUM_LINES = 20

def usage(args):
    lines = [
        "Usage: python {} interval command [ argument ... ]".format(
            args[0]),
        "Run command with the given arguments every interval seconds,",
        "and show some initial lines from command's standard output.",
        "Clear screen before each run.",
    ]
    for line in lines:
        sys.stderr.write(line + '\n')

def watch(command, interval, num_lines):
    # Truncate command for display in the header of watch output.
    if len(command) > 50:
        command_str = command[:50] + "..."
    else:
        command_str = command
    hdr_part_1 = "Every {}s: {} ".format(interval, command_str)
    # Assuming 80 columns terminal width. Adjust if different.
    # If on Unix / Linux, can get value of environment variable 
    # COLUMNS (if defined) and use that instead of 80.
    columns = 80
    # Compute pad_len only once, before the loop, because 
    # neither len(hdr_part_1) nor len(hdr_part_2) change, 
    # even though hdr_part_2 is recomputed each time in the loop.
    hdr_part_2 = time.asctime()
    pad_len = columns - len(hdr_part_1) - len(hdr_part_2) - 1
    while True:
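        # Clear screen based on OS platform.
        # Use startswith() here: a test like '"win" in sys.platform' would
        # also match "darwin" (macOS) and "cygwin".
        if sys.platform.startswith("win"):
            os.system("cls")
        elif "linux" in sys.platform:
            os.system("clear")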
        hdr_str = hdr_part_1 + (" " * pad_len) + hdr_part_2
        print(hdr_str + "\n")
        # Run the command, read and print its output up to num_lines lines.
        # os.popen is the old deprecated way, Python docs recommend to use 
        # subprocess.Popen.
        #with os.popen(command) as pipe:
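        # universal_newlines=True makes the pipe yield text instead of bytes
        # on Python 3, so that print() shows each line as-is.
        with Popen(command, shell=True, stdout=PIPE,
                universal_newlines=True).stdout as pipe:
            # Count lines from 1, so that exactly num_lines lines are printed.
            for line_num, line in enumerate(pipe, 1):
                print(line, end='')
                if line_num >= num_lines:
                    break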
        time.sleep(interval)
        hdr_part_2 = time.asctime()

def main():

    sa, lsa = sys.argv, len(sys.argv)

    # Check arguments and exit if invalid.
    if lsa < 3:
        usage(sa)
        error_exit(
        "At least two arguments are needed: interval and command;\n"
        "optional arguments can be given following command.\n")

    try:
        # Get the interval argument as an int.
        interval = int(sa[1])
        if interval < 1:
            error_exit("{}: Invalid interval value: {}".format(sa[0],
                interval))
        # Build the command to run from the remaining arguments.
        command = " ".join(sa[2:])
        # Run the command repeatedly at the given interval.
        watch(command, interval, DEFAULT_NUM_LINES)
    except ValueError as ve:
        error_exit("{}: Caught ValueError: {}".format(sa[0], str(ve)))
    except OSError as ose:
        error_exit("{}: Caught OSError: {}".format(sa[0], str(ose)))
    except Exception as e:
        error_exit("{}: Caught Exception: {}".format(sa[0], str(e)))

if __name__ == "__main__":
    main()
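
The comments in watch.py mention using the COLUMNS environment variable instead of hard-coding 80. On Python 3.3 and later, shutil.get_terminal_size() does this for you: it checks the COLUMNS and LINES environment variables first, then asks the terminal, and falls back to a default. Here is a small sketch of how the header padding could use it; the function names terminal_dimensions and header_line are made up for this example, and watch.py itself does not use them.

```python
import shutil

def terminal_dimensions(default_columns=80, default_lines=25):
    """Return (columns, lines) for the terminal, with fallback defaults."""
    size = shutil.get_terminal_size(fallback=(default_columns, default_lines))
    return size.columns, size.lines

def header_line(part_1, part_2, columns):
    """Left-align part_1 and right-align part_2 in a line of columns - 1 chars."""
    # Keep at least one space between the two parts on narrow terminals.
    pad_len = max(1, columns - len(part_1) - len(part_2) - 1)
    return part_1 + (" " * pad_len) + part_2
```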

Here is the code for error_exit.py, which watch imports.

# error_exit.py

# Author: Vasudev Ram
# Web site: https://vasudevram.github.io
# Blog: https://jugad2.blogspot.com
# Product store: https://gumroad.com/vasudevram

# Purpose: This module, error_exit.py, defines a function with 
# the same name, error_exit(), which takes a string message 
# as an argument. It prints the message to sys.stderr, or 
# to another file object open for writing (if given as the 
# second argument), and then exits the program.
# The function error_exit can be used when a fatal error condition occurs, 
# and you therefore want to print an error message and exit your program.

import sys

def error_exit(message, dest=sys.stderr):
    dest.write(message)
    sys.exit(1)

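def main():
    # Note: error_exit() exits the program, so only this first call
    # actually runs; the calls after it are never reached.
    error_exit("Testing error_exit with dest sys.stderr (default).\n")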
    error_exit("Testing error_exit with dest sys.stdout.\n", 
        sys.stdout)
    with open("temp1.txt", "w") as fil:
        error_exit("Testing error_exit with dest temp1.txt.\n", fil)

if __name__ == "__main__":
    main()
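
Since error_exit() calls sys.exit(), a test that calls it more than once has to catch the SystemExit exception that sys.exit() raises. Here is a minimal sketch of that, assuming Python 3; the helper name check_error_exit is made up for this example, and error_exit is repeated so that the snippet is self-contained.

```python
import io
import sys

def error_exit(message, dest=sys.stderr):
    dest.write(message)
    sys.exit(1)

def check_error_exit(message):
    """Call error_exit() with an in-memory destination.

    Returns a tuple of (text written, exit code).
    """
    buf = io.StringIO()
    try:
        error_exit(message, buf)
    except SystemExit as exc:
        # sys.exit(1) raises SystemExit with code 1; catching it lets the
        # caller carry on and check the results.
        return buf.getvalue(), exc.code
    return buf.getvalue(), None
```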

Here are some runs of watch.py and their output:
(BTW, the dfs command shown, is from the Quick-and-dirty disk free space checker for Windows post that I had written recently.)

$ python watch.py 15 ping google.com

Every 15s: ping google.com                             Fri May 04 21:15:56 2018

Pinging google.com [2404:6800:4007:80d::200e] with 32 bytes of data:
Reply from 2404:6800:4007:80d::200e: time=117ms
Reply from 2404:6800:4007:80d::200e: time=109ms
Reply from 2404:6800:4007:80d::200e: time=117ms
Reply from 2404:6800:4007:80d::200e: time=137ms

Ping statistics for 2404:6800:4007:80d::200e:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 109ms, Maximum = 137ms, Average = 120ms

Every 15s: ping google.com                             Fri May 04 21:16:14 2018

Pinging google.com [2404:6800:4007:80d::200e] with 32 bytes of data:
Reply from 2404:6800:4007:80d::200e: time=501ms
Reply from 2404:6800:4007:80d::200e: time=56ms
Reply from 2404:6800:4007:80d::200e: time=105ms
Reply from 2404:6800:4007:80d::200e: time=125ms

Ping statistics for 2404:6800:4007:80d::200e:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 56ms, Maximum = 501ms, Average = 196ms

Every 15s: ping google.com                             Fri May 04 21:16:33 2018

Pinging google.com [2404:6800:4007:80d::200e] with 32 bytes of data:
Reply from 2404:6800:4007:80d::200e: time=189ms
Reply from 2404:6800:4007:80d::200e: time=141ms
Reply from 2404:6800:4007:80d::200e: time=245ms
Reply from 2404:6800:4007:80d::200e: time=268ms

Ping statistics for 2404:6800:4007:80d::200e:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 141ms, Maximum = 268ms, Average = 210ms

$ python watch.py 15 c:\ch\bin\date

Every 15s: c:\ch\bin\date                              Tue May 01 00:33:00 2018

Tue May  1 00:33:00 India Standard Time 2018

Every 15s: c:\ch\bin\date                              Tue May 01 00:33:15 2018

Tue May  1 00:33:16 India Standard Time 2018

Every 15s: c:\ch\bin\date                              Tue May 01 00:33:31 2018

Tue May  1 00:33:31 India Standard Time 2018

Every 15s: c:\ch\bin\date                              Tue May 01 00:33:46 2018

Tue May  1 00:33:47 India Standard Time 2018

In one CMD window:

$ d:\temp\fill-and-free-disk-space

In another:

$ python watch.py 10 dfs d:\

Every 10s: dfs d:\                                     Tue May 01 00:43:25 2018
Disk free space on d:\
37666.6 MiB = 36.78 GiB

Every 10s: dfs d:\                                     Tue May 01 00:43:35 2018
Disk free space on d:\
37113.7 MiB = 36.24 GiB

$ python watch.py 20 dir /b "|" sort

Every 20s: dir /b | sort                               Fri May 04 21:29:41 2018

README.txt
runner.py
watch-outputs.txt
watch-outputs2.txt
watch.py
watchnew.py

$ python watch.py 10 ping com.nosuchsite

Every 10s: ping com.nosuchsite                         Fri May 04 21:30:49 2018

Ping request could not find host com.nosuchsite. Please check the name and try again.

$ python watch.py 20 dir z:\

Every 20s: dir z:\                                     Tue May 01 00:54:37 2018
The system cannot find the path specified.

$ python watch.py 2b echo testing
watch.py: Caught ValueError: invalid literal for int() with base 10: '2b'

$ python watch.py 20 foo

Every 20s: foo                                         Fri May 04 21:33:35 2018

'foo' is not recognized as an internal or external command,
operable program or batch file.

$ python watch.py -1 foo
watch.py: Invalid interval value: -1

- Enjoy.

Interested in a Python programming or Linux commands and shell scripting course? I have good experience built over many years of real-life experience, as well as teaching, in both those subject areas. Contact me for course details via my contact page here.


Sunday, April 15, 2018

Quick-and-dirty disk free space checker for Windows

By Vasudev Ram

'I mean, if 10 years from now, when you are doing something quick and dirty, you suddenly visualize that I am looking over your shoulders and say to yourself "Dijkstra would not have liked this", well, that would be enough immortality for me.'

Dijkstra quote attribution

Hi readers,

[ This is the follow-up post that I said I would do after this previous post: Quick-and-clean disk usage utility in Python. This follow-up post describes the quick-and-dirty version of the disk space utility, which is the one I wrote first, before the quick-and-clean version linked above. Note that the two utilities do not give the exact same output - the clean one gives more information. Compare the outputs to see the difference. ]

I had a need to periodically check the free space on my disks in Windows. So I thought of semi-automating the process and came up with this quick-and-dirty utility for it. It used the DOS DIR command, a grep utility for Windows, and a simple Python script, all together in a pipeline, with the Python script processing the results provided by the previous two.

I will first show the Python script and then show its usage in a command pipeline together with the DIR command and a grep command. Then I will briefly discuss other possible ways of doing this same task.

Here is the Python script, disk_free_space.py:

from __future__ import print_function
import sys

# Author: Vasudev Ram
# Copyright 2018 Vasudev Ram
# Web site: https://vasudevram.github.io
# Blog: https://jugad2.blogspot.com
# Product store: https://gumroad.com/vasudevram
# Software mentoring: https://www.codementor.io/vasudevram

#for line in sys.stdin:

# The first readline (below) is to read and throw away the line with 
# just "STDIN" in it. We do this because the grep tool that is used 
# before this program in the pipeline (see dfs.bat below), adds a line 
# with "STDIN" before the real grep output.
# Another alternative is to use another grep which does not do that; 
# in that case, delete the first readline statement.
line = sys.stdin.readline()

# The second readline (below) gets the line we want, with the free space in bytes.
line = sys.stdin.readline()

if line.endswith("bytes free\n"):
    words = line.split()
    bytes_free_with_commas = words[2]
    try:
        free_space_mb = int(bytes_free_with_commas.replace(
            ",", "")) / 1024.0 / 1024.0
        free_space_gb = free_space_mb / 1024.0 
        print("{:.1f} MiB = {:.2f} GiB".format(
            free_space_mb, free_space_gb))
    except ValueError as ve:
        sys.stdout.write("{}: Caught ValueError: {}\n".format(
            sys.argv[0], str(ve)))
    #break

An alternative method is to remove the first readline call above, and un-comment the for loop line at the top, and the break statement at the bottom. In that approach, the program will loop over all the lines of stdin, but skip processing all of them except for the single line we want, the one that has the pattern "bytes free". This is actually an extra level of checking that mostly will not be needed, since the grep preceding this program in the pipeline, should filter out all lines except for the one we want.
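
That alternative method could look roughly like the sketch below. It assumes that the wanted line has the same shape as the one the script above handles, that is, it ends with "bytes free" and has the byte count as its third word. The function name free_space_from_dir_output is made up for this example.

```python
def free_space_from_dir_output(lines):
    """Scan lines of DIR output for the one that ends with "bytes free".

    Returns (free space in MiB, free space in GiB), or None if no such
    line is found.
    """
    for line in lines:
        if line.rstrip().endswith("bytes free"):
            # The third word is the byte count, with commas, as in the
            # script above.
            bytes_free = int(line.split()[2].replace(",", ""))
            free_space_mb = bytes_free / 1024.0 / 1024.0
            return free_space_mb, free_space_mb / 1024.0
    return None
```

To use it in the pipeline, you would call it as free_space_from_dir_output(sys.stdin), and print the result in the same format as before.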

For why I used MiB and GiB units instead of MB and GB, refer to this Wikipedia article: Mebibyte

Once we have the above program, we call it from the pipeline, which I have wrapped in this batch file, dfs.bat, for convenience, to get the end result we want:

@echo off
echo Disk free space on %1
dir %1 | grep "bytes free" | python c:\util\disk_free_space.py

Here is a run of dfs.bat to get disk free space information for drive D:\ :

$ dfs d:\
Disk free space on d:\
40103.0 MiB = 39.16 GiB

You can run dfs for both C: and D: in one single command like this:

$ dfs c:\ & dfs d:\

(It uses the Windows CMD operator & which means run the command to the left of the ampersand, then run the command to the right.)

Another way of doing the same task as this utility, is to use the Python psutil library. That way is shown in the quick-and-clean utility post linked near the top of this post. That way would be cross-platform, at least between Windows and Linux, as shown in that post. The only small drawback is that you have to install psutil for it to work, whereas this utility does not need it. This one does need a grep, of course.
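
If you are on Python 3.3 or later, there is also a way that needs neither psutil nor grep: the standard library function shutil.disk_usage(). This is a minimal sketch and not one of the two utilities described in these posts; the function name disk_free_mib_gib is made up for this example.

```python
import shutil

def disk_free_mib_gib(path):
    """Return the free space for path as (MiB, GiB)."""
    # shutil.disk_usage() returns a named tuple with the fields
    # total, used and free, all in bytes.
    usage = shutil.disk_usage(path)
    free_mib = usage.free / (1024.0 * 1024.0)
    return free_mib, free_mib / 1024.0
```

Calling print("{:.1f} MiB = {:.2f} GiB".format(*disk_free_mib_gib("d:\\"))) should then give output in the same format as dfs.bat does.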

Yet another way could be to use lower-level Windows file system APIs directly, to get the needed information. In fact, that is probably how psutil does it. I have not looked into that approach yet, but it might be interesting to do so. Might have to use techniques of calling C or C++ code from Python, like ctypes, SWIG or cffi for that, since those Windows APIs are probably written in C or C++. Check out this post for a very simple example on those lines:

Calling C from Python with ctypes

Enjoy.


Sunday, April 8, 2018

Quick-and-clean disk usage utility in Python

By Vasudev Ram

Hard disk image attribution

Hi readers,

Recently, I thought that I should check the disk space on my PC more often, possibly because of having installed a lot of software on it over a period. As you know, these days, many software apps take up a lot of disk space, sometimes in the range of a gigabyte or more for one app. So I wanted a way to check more frequently whether my disks are close to getting full.

I thought of creating a quick-and-dirty disk free space checker tool in Python, to partially automate this task. Worked out how to do it, and wrote it - initially for Windows only. I called it disk_free_space.py. Ran it to check the disk free space on a few of my disk partitions, and it worked as intended.

Then I slapped my forehead as I realized that I could do it in a cleaner as well as more cross-platform way, using the psutil library, which I knew and had used earlier.

So I wrote another version of the tool using psutil, that I called disk_usage.py.

Here is the code for disk_usage.py:

#----------------------------------------------------------------------
#
# disk_usage.py
#
# Author: Vasudev Ram
# Copyright 2018 Vasudev Ram
# Web site: https://vasudevram.github.io
# Blog: https://jugad2.blogspot.com
# Product store: https://gumroad.com/vasudevram
# Software mentoring: https://www.codementor.io/vasudevram
#
# Description: A Python app to show disk usage.
# Usage: python disk_usage.py path
#
# For the path given as command-line argument, it shows 
# the percentage of space used, and the total, used and 
# free space, in both MiB and GiB. For definitions of
# MiB vs. MB and GiB vs. GB, see:
# https://en.wikipedia.org/wiki/Mebibyte
#
# Requires: The psutil module, see:
# https://psutil.readthedocs.io/
#
#----------------------------------------------------------------------

from __future__ import print_function
import sys
import psutil

BYTES_PER_MIB = 1024.0 * 1024.0

def disk_usage_in_mib(path):
    """ Return disk usage data in MiB. """
    # Here percent means percent used, not percent free.
    total, used, free, percent = psutil.disk_usage(path)
    # psutil returns usage data in bytes, so convert to MiB.
    return total/BYTES_PER_MIB, used/BYTES_PER_MIB, \
    free/BYTES_PER_MIB, percent

def main():
    if len(sys.argv) == 1:
        print("Usage: python {} path".format(sys.argv[0]))
        print("Shows the disk usage for the given path (file system).")
        sys.exit(0)
    path = sys.argv[1]
    try:
        # Get disk usage data.
        total_mib, used_mib, free_mib, percent = disk_usage_in_mib(path)
        # Print disk usage data.
        print("Disk Usage for {} - {:.1f} percent used. ".format( \
        path, percent))
        print("In MiB: {:.0f} total; {:.0f} used; {:.0f} free.".format(
            total_mib, used_mib, free_mib))
        print("In GiB: {:.3f} total; {:.3f} used; {:.3f} free.".format(
            total_mib/1024.0, used_mib/1024.0, free_mib/1024.0))
    except OSError as ose:
        sys.stdout.write("{}: Caught OSError: {}\n".format(
            sys.argv[0], str(ose)))
    except Exception as e:
        sys.stdout.write("{}: Caught Exception: {}\n".format(
            sys.argv[0], str(e)))

if __name__ == '__main__':
    main()

Here is the output from running it a few times:

On Linux:

$ df -BM -h /
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/precise32-root   79G  5.2G   70G   7% /

$ python disk_usage.py /
Disk Usage for / - 6.8 percent used.
In MiB: 80773 total; 5256 used; 71472 free.
In GiB: 78.880 total; 5.132 used; 69.797 free.

$ df -BM -h /boot
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       228M   24M  192M  12% /boot

$ python disk_usage.py /boot
Disk Usage for /boot - 11.1 percent used.
In MiB: 228 total; 24 used; 192 free.
In GiB: 0.222 total; 0.023 used; 0.187 free.

On Windows:

$ python disk_usage.py d:\
Disk Usage for d:\ - 59.7 percent used.
In MiB: 100000 total; 59667 used; 40333 free.
In GiB: 97.656 total; 58.268 used; 39.388 free.

$ python disk_usage.py h:\
Disk Usage for h:\ - 28.4 percent used.
In MiB: 100 total; 28 used; 72 free.
In GiB: 0.098 total; 0.028 used; 0.070 free.

I had to tweak the df command invocation to be as you see it above, to make the results of my program match those of df. This is because of the difference between MB and MiB, and between GB and GiB; see the Wikipedia link in the header comment of my program above if you do not know the differences.
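
To make the difference concrete, here is a small worked example. A megabyte (MB) is 1000 * 1000 bytes, while a mebibyte (MiB) is 1024 * 1024 bytes, so the same number of bytes gives a larger figure in MB than in MiB. The function names are made up for this example.

```python
BYTES_PER_MB = 1000.0 * 1000.0    # decimal megabyte
BYTES_PER_MIB = 1024.0 * 1024.0   # binary mebibyte

def to_mb(num_bytes):
    return num_bytes / BYTES_PER_MB

def to_mib(num_bytes):
    return num_bytes / BYTES_PER_MIB

# The d:\ partition above shows a total of 100000 MiB.
total_bytes = 100000 * 1024 * 1024
print("{:.1f} MiB = {:.1f} MB".format(to_mib(total_bytes), to_mb(total_bytes)))
# Prints: 100000.0 MiB = 104857.6 MB
```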

So this program using psutil is both cleaner and more cross-platform than my original quick-and-dirty one which was only for Windows, but which did not need psutil installed. Pros and cons for both. I will show the latter program in a following post.

The image at the top of the post is of "a newer 2.5-inch (63.5 mm) 6,495 MB HDD compared to an older 5.25-inch full-height 110 MB HDD".

I've worked some years earlier in system engineer roles where I encountered such older models of hard disks, and also had good experiences and learning in solving problems related to them, mainly on Unix machines, including sometimes using Unix commands and tricks of the trade that I learned or discovered, to recover data from systems where the machine or the hard disk had crashed, and of course, often without backups available. Here is one such anecdote, which I later wrote up and published as an article for Linux For You magazine (now called Open Source For You):

How Knoppix saved the day.

Talk of Murphy's Law ...

Enjoy.


Saturday, March 31, 2018

Checking if web sites are online with Python

By Vasudev Ram

Hi readers,

Recently, I thought of writing a small program to check if one or more web sites are online or not. I used the requests Python library with the HTTP HEAD method. I also checked out PycURL for this. It is a thin wrapper over libcurl, the library that powers the well-known and widely used curl command line tool. While PycURL looks powerful and fast (since it is a thin wrapper that exposes most or all of the functionality of libcurl), I decided to use requests for this version of the program. The code for the program is straightforward, but I found a few interesting things while running it with a few different sites as arguments. I mention those points below.

Here is the tool: I named it is_site_online.py:

"""
is_site_online.py
Purpose: A Python program to check if a site is online or not.
Uses the requests library and the HTTP HEAD method.
Tries both with and without HTTP redirects.
Author: Vasudev Ram
Copyright 2018 Vasudev Ram
Web site: https://vasudevram.github.io
Blog: https://jugad2.blogspot.com
Product store: https://gumroad.com/vasudevram
"""

from __future__ import print_function
import sys
import requests
import time

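if len(sys.argv) < 2:
    # End each usage line with a newline, so that the two messages do not
    # run together on one line.
    sys.stderr.write("Usage: {} site ...\n".format(sys.argv[0]))
    sys.stderr.write("Checks if the given site(s) are online or not.\n")
    sys.exit(0)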

print("Checking if these sites are online or not:")
print("   ".join(sys.argv[1:]))

print("-" * 60)
try:
    for site in sys.argv[1:]:
        for allow_redirects in (False, True):
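            # time.clock() was removed in Python 3.8; time.time() gives
            # wall-clock time on both Python 2 and Python 3.
            tc1 = time.time()
            r = requests.head(site, allow_redirects=allow_redirects)
            tc2 = time.time()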
            print("Site:", site)
            print("Check with allow_redirects =", allow_redirects)
            print("Results:")
            print("r.ok:", r.ok)
            print("r.status_code:", r.status_code)
            print("request time:", round(tc2 - tc1, 3), "secs")
            print("-" * 60)
except requests.ConnectionError as ce:
    print("Error: ConnectionError: {}".format(ce))
    sys.exit(1)
except requests.exceptions.MissingSchema as ms:
    print("Error: MissingSchema: {}".format(ms))
    sys.exit(1)
except Exception as e:
    print("Error: Exception: {}".format(e))
    sys.exit(1)

The results of some runs of the program:

Check for Google and Yahoo!:

$ python is_site_online.py http://google.com http://yahoo.com
Checking if these sites are online or not:
http://google.com   http://yahoo.com
-----------------------------------------------------------
Site: http://google.com
Check with allow_redirects = False
Results:
r.ok: True
r.status_code: 302
request time: 0.217 secs
------------------------------------------------------------
Site: http://google.com
Check with allow_redirects = True
Results:
r.ok: True
r.status_code: 200
request time: 0.36 secs
------------------------------------------------------------
Site: http://yahoo.com
Check with allow_redirects = False
Results:
r.ok: True
r.status_code: 301
request time: 2.837 secs
------------------------------------------------------------
Site: http://yahoo.com
Check with allow_redirects = True
Results:
r.ok: True
r.status_code: 200
request time: 1.852 secs
------------------------------------------------------------

In the cases where allow_redirects is False, google.com gives a status code of 302 and yahoo.com gives a status code of 301. The 3xx series of codes are related to HTTP redirection.
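
Note that r.ok is True for these 3xx responses too, because requests treats any status code below 400 as ok. To look up what a status code means, you can use the table of standard reason phrases in the Python standard library. Here is a small sketch, assuming Python 3 (on Python 2 the module is httplib); the function name describe_status is made up for this example.

```python
from http.client import responses

def describe_status(code):
    """Return (reason phrase, True if the code is in the 3xx redirect range)."""
    reason = responses.get(code, "Unknown")
    is_redirect = 300 <= code < 400
    return reason, is_redirect

for code in (200, 301, 302):
    print(code, describe_status(code))
```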

After seeing this, I looked up HTTP status code information in a few sites such as Wikipedia and the official site www.w3.org (the World Wide Web Consortium), and found a point worth noting. See the part in the Related links section at the end of this post about "302 Found", where it says: "This is an example of industry practice contradicting the standard.".

Now let's check for some error cases:

One error case: the http:// prefix is left out (assume some novice user who is mixed up about schemes and paths), and a garbled site name is typed instead, say http.om:

$ python is_site_online.py http.om
Checking if these sites are online or not:
http.om
------------------------------------------------------------
Traceback (most recent call last):
  File "is_site_online.py", line 32, in <module>
    r = requests.head(site, allow_redirects=allow_redirects)
[snip long traceback]
    raise MissingSchema(error)
requests.exceptions.MissingSchema: Invalid URL 'http.om':
No schema supplied. Perhaps you meant http://http.om?

This traceback tells us that when no HTTP 'scheme' [1][2] is given, requests raises a MissingSchema exception. So we now know that we need to catch that exception in our code, by adding another except clause to the try statement, which I later did, in the program you see in this post. In general, this technique can be useful when using a new Python library for the first time: just don't handle any exceptions in the beginning, use it a few times with variations in input or modes of use, and see what sorts of exceptions it throws. Then add code to handle them.

[1] The components of a URL

[2] Parts of URL
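
Besides catching MissingSchema, another option is to add a default scheme before making the request, which is what the error message itself suggests. The program in this post does not do this; the sketch below is only an illustration, and the function name ensure_scheme is made up.

```python
def ensure_scheme(url, default_scheme="http"):
    """Return url unchanged if it has a scheme, else prefix a default one."""
    # A simple check, good enough for site names typed on the command
    # line: treat anything containing "://" as already having a scheme.
    if "://" in url:
        return url
    return "{}://{}".format(default_scheme, url)
```

With this, ensure_scheme("http.om") gives "http://http.om", which is the URL that requests suggested in its error message.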

Another error case - a made-up site name that does not exist:

$ python is_site_online.py http://abcd.efg
Checking if these sites are online or not:
http://abcd.efg
------------------------------------------------------------
Caught ConnectionError: HTTPConnectionPool(host='abcd.efg',
port=80): Max retries exceeded with url: / (Caused
by NewConnectionError(': Failed
to establish a new connection: [Errno 11004] getaddrinfo
failed',))

From the above error we can see or figure out a few things:

- the requests library defines a ConnectionError exception. I first ran the above command without catching ConnectionError in the program; it gave that error, then I added the handler for it.

- requests uses an HTTP connection pool

- requests does some retries when you try to get() or head() a URL (a site name)

- requests uses urllib3 (a third-party library, not part of the Python standard library) under the hood

I had discovered that last point earlier too; see this post: urllib3, the library used by the Python requests library

And as I mentioned in that post, urllib3 itself uses httplib.

Now let's check for some sites that are misspellings of the site google.com:

$ python is_site_online.py http://gogle.com
Checking ...
------------------------------------------------------------
Site: http://gogle.com
With allow_redirects: False
Results:
r.ok: True
r.status_code: 301
request time: 3.377
------------------------------------------------------------
Site: http://gogle.com
With allow_redirects: True
Results:
r.ok: True
r.status_code: 200
request time: 1.982
------------------------------------------------------------

$ python is_site_online.py http://gooogle.com
Checking ...
------------------------------------------------------------
Site: http://gooogle.com
With allow_redirects: False
Results:
r.ok: True
r.status_code: 301
request time: 0.425
------------------------------------------------------------
Site: http://gooogle.com
With allow_redirects: True
Results:
r.ok: True
r.status_code: 200
request time: 1.216
------------------------------------------------------------

Interestingly, the results show that both those misspellings of google.com exist as sites.

It is known that some people register domains that are similar in spelling to well-known / popular / famous domain names, maybe hoping to capture some of the traffic resulting from users mistyping the famous ones. Although I did not plan it that way, I realized, from the above two results for gogle.com and gooogle.com, that this tool can be used to detect the existence of such sites (if they are online when you check, of course).
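
If you wanted to try this idea on more than two names, candidate misspellings could be generated rather than typed by hand. The sketch below is only an illustration: typo_variants is a hypothetical helper, and it covers just two kinds of typo (one letter dropped, one letter doubled), which happen to produce both gogle and gooogle from google.

```python
def typo_variants(name):
    """Return sorted misspellings of name made by dropping or doubling one letter."""
    variants = set()
    for i, ch in enumerate(name):
        variants.add(name[:i] + name[i + 1:])   # drop the letter at position i
        variants.add(name[:i] + ch + name[i:])  # double the letter at position i
    variants.discard(name)  # safety net; neither edit can reproduce the original
    return sorted(variants)

for variant in typo_variants("google"):
    print("http://%s.com" % variant)
```

Each printed URL could then be passed as an argument to is_site_online.py.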

Sunday, March 12, 2017

Find the number of bits needed to store an integer, and its binary representation (Python)

By Vasudev Ram

Hi readers,

I wrote this post yesterday:

Analyse this Python code snippet

I was initially going to give the solution (to the question asked in that post) today, but then realized that I posted it at the weekend. So, to give a bit of time for anyone to attempt it, including some of my programming students, I decided to post the solution early next week.

But in the meantime, inspired by some Q&A in a class I taught, I had the idea of creating this simple program to find the number of bits needed to represent integers of various sizes (0 to 256, specifically, though the code can easily be modified to do it for any size of int). Note that this is the calculation of the minimum number of bits needed to represent some integers per se, not necessarily the number of bits that Python or any other language actually uses to store those same integers, which can be more than the minimum. This is because, at least in the case of Python, being a dynamic language, most data types have more capabilities than just being data - e.g. ints in Python are objects, so they incur some overhead for being objects (instances of classes, which in CPython carry a reference to their type, a reference count, and so on). The other reason is that data objects in dynamic languages often take up extra pre-allocated space, to store some metadata or to allow for future expansion in the size of the value being stored - e.g. that may or may not apply in the case of ints, but it can in the case of lists.

(See this earlier post by me: Exploring sizes of data types in Python for more on this topic.)
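
As a rough illustration of the gap between the minimum number of bits and what is actually stored, this small sketch (written for Python 3) compares int.bit_length() with sys.getsizeof(), which reports the size of the whole int object in bytes. The exact sizes depend on the Python implementation, version and platform, so no expected output is shown.

```python
import sys

# Compare the minimum bits needed for the value with the size of the object.
for n in (0, 1, 255, 256, 2 ** 64):
    print(n, "needs", n.bit_length(), "bits as a value;",
          "the int object takes", sys.getsizeof(n) * 8, "bits")
```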

Note: the level of this post is targeted towards relative beginners to programming, who might not be too familiar with computer representation of numbers, or even towards somewhat experienced programmers, who are not familiar with that topic either (I've come across some). Believe it or not, I've come across people (ex-colleagues in some cases, as well as others, and developers as well as system administrators) who did not know that a compiled binary for one processor will usually not run on another type of processor [1], even though they may be running the same OS (such as Unix), because their (processor) instruction sets are different. (What's an instruction set? - some of them might ask.) This is like expecting a person who only knows language A to understand something spoken in language B - impossible, at least without some source of help, be it another person or a dictionary.

Having some knowledge in these areas (i.e. system internals or under-the-hood stuff, even at a basic level) is quite useful, and almost needed, for many real-life situations, ranging from things like choosing an appropriate representation for your data, finding and fixing bugs quicker, system integration, data reading / writing / transformation / conversion, to bigger-picture issues like system performance and portability.

[1] Though there are exceptions to that these days, such as fat binaries.

Anyway, end of rant :)

I chose 256 as the upper limit because it is one more than 255, the highest unsigned integer that can be stored in a single byte, and because values in the range 0 to 255 (or 256) are very commonly used in low-level code, such as bit manipulation, assembly or machine language, C, some kinds of data processing (e.g. of many binary file formats), and so on. Of course values of word size (2 bytes / 16 bits) or double-word size (4 bytes / 32 bits) are also often used in those areas, but the program can be modified to handle them too.
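
To see those byte, word and double-word limits as numbers, here is a small sketch (Python 3). The helper unsigned_range is a name invented for this example; it just returns the lowest and highest unsigned values that fit in a given number of bits.

```python
def unsigned_range(bits):
    """Return (lowest, highest) unsigned values that fit in the given number of bits."""
    return 0, 2 ** bits - 1

for bits in (8, 16, 32):
    low, high = unsigned_range(bits)
    # The value just past the top of the range needs one extra bit.
    print(bits, "bits:", low, "to", high,
          "| the next value needs", (high + 1).bit_length(), "bits")
```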

If you want to get a preview (and a clue) about what is coming up, check this snippet and its output first:

>>> for item in (0, 1, 2, 4, 8, 16):
...     print item.bit_length()
...
0
1
2
3
4
5

Hint: Notice that, apart from 0, those are all powers of 2 in the tuple above, and correlate that fact with the output values.
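
One way to state the relationship behind that hint: for a positive integer n, the bit length is floor(log2(n)) + 1. The sketch below (Python 3) checks that formula against bit_length() for a few values; it relies on floating-point math.log2, which is fine for small numbers like these but should not be trusted for very large ints.

```python
import math

for n in (1, 2, 3, 4, 8, 16, 255, 256):
    by_formula = math.floor(math.log2(n)) + 1
    print(n, n.bit_length(), by_formula)
```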

Here is the program to find the number of bits needed to store an integer, and its binary representation (Python):

# int_bit_length_and_binary_repr.py
# Purpose: For integers from 0 to 256, print the number of 
# bits needed to represent them, and their values in binary.
# Author: Vasudev Ram
# Website: https://vasudevram.github.io
# Product store on Gumroad: https://gumroad.com/vasudevram
# Blog: https://jugad2.blogspot.com
# Twitter: @vasudevram

for an_int in range(0, 256 + 1):
    print an_int, "takes", an_int.bit_length(), "bits to represent,",
    print "and equals", bin(an_int), "in binary"

Before showing the output (which is long, since I've shown all 257 rows of it):

If you found this post informative, you may also be interested in this earlier one on a related topic:

Converting numeric strings to integers with handrolled code

(I didn't remember to say it in that earlier post, but the image at the top of it is of a roti being rolled out with a rolling pin:)

And here is the output when I run the program:

$ python int_bit_length_and_binary_repr.py
0 takes 0 bits to represent, and equals 0b0 in binary
1 takes 1 bits to represent, and equals 0b1 in binary
2 takes 2 bits to represent, and equals 0b10 in binary
3 takes 2 bits to represent, and equals 0b11 in binary
4 takes 3 bits to represent, and equals 0b100 in binary
5 takes 3 bits to represent, and equals 0b101 in binary
6 takes 3 bits to represent, and equals 0b110 in binary
7 takes 3 bits to represent, and equals 0b111 in binary
8 takes 4 bits to represent, and equals 0b1000 in binary
9 takes 4 bits to represent, and equals 0b1001 in binary
10 takes 4 bits to represent, and equals 0b1010 in binary
11 takes 4 bits to represent, and equals 0b1011 in binary
12 takes 4 bits to represent, and equals 0b1100 in binary
13 takes 4 bits to represent, and equals 0b1101 in binary
14 takes 4 bits to represent, and equals 0b1110 in binary
15 takes 4 bits to represent, and equals 0b1111 in binary
16 takes 5 bits to represent, and equals 0b10000 in binary
17 takes 5 bits to represent, and equals 0b10001 in binary
18 takes 5 bits to represent, and equals 0b10010 in binary
19 takes 5 bits to represent, and equals 0b10011 in binary
20 takes 5 bits to represent, and equals 0b10100 in binary
21 takes 5 bits to represent, and equals 0b10101 in binary
22 takes 5 bits to represent, and equals 0b10110 in binary
23 takes 5 bits to represent, and equals 0b10111 in binary
24 takes 5 bits to represent, and equals 0b11000 in binary
25 takes 5 bits to represent, and equals 0b11001 in binary
26 takes 5 bits to represent, and equals 0b11010 in binary
27 takes 5 bits to represent, and equals 0b11011 in binary
28 takes 5 bits to represent, and equals 0b11100 in binary
29 takes 5 bits to represent, and equals 0b11101 in binary
30 takes 5 bits to represent, and equals 0b11110 in binary
31 takes 5 bits to represent, and equals 0b11111 in binary
32 takes 6 bits to represent, and equals 0b100000 in binary
33 takes 6 bits to represent, and equals 0b100001 in binary
34 takes 6 bits to represent, and equals 0b100010 in binary
35 takes 6 bits to represent, and equals 0b100011 in binary
36 takes 6 bits to represent, and equals 0b100100 in binary
37 takes 6 bits to represent, and equals 0b100101 in binary
38 takes 6 bits to represent, and equals 0b100110 in binary
39 takes 6 bits to represent, and equals 0b100111 in binary
40 takes 6 bits to represent, and equals 0b101000 in binary
41 takes 6 bits to represent, and equals 0b101001 in binary
42 takes 6 bits to represent, and equals 0b101010 in binary
43 takes 6 bits to represent, and equals 0b101011 in binary
44 takes 6 bits to represent, and equals 0b101100 in binary
45 takes 6 bits to represent, and equals 0b101101 in binary
46 takes 6 bits to represent, and equals 0b101110 in binary
47 takes 6 bits to represent, and equals 0b101111 in binary
48 takes 6 bits to represent, and equals 0b110000 in binary
49 takes 6 bits to represent, and equals 0b110001 in binary
50 takes 6 bits to represent, and equals 0b110010 in binary
51 takes 6 bits to represent, and equals 0b110011 in binary
52 takes 6 bits to represent, and equals 0b110100 in binary
53 takes 6 bits to represent, and equals 0b110101 in binary
54 takes 6 bits to represent, and equals 0b110110 in binary
55 takes 6 bits to represent, and equals 0b110111 in binary
56 takes 6 bits to represent, and equals 0b111000 in binary
57 takes 6 bits to represent, and equals 0b111001 in binary
58 takes 6 bits to represent, and equals 0b111010 in binary
59 takes 6 bits to represent, and equals 0b111011 in binary
60 takes 6 bits to represent, and equals 0b111100 in binary
61 takes 6 bits to represent, and equals 0b111101 in binary
62 takes 6 bits to represent, and equals 0b111110 in binary
63 takes 6 bits to represent, and equals 0b111111 in binary
64 takes 7 bits to represent, and equals 0b1000000 in binary
65 takes 7 bits to represent, and equals 0b1000001 in binary
66 takes 7 bits to represent, and equals 0b1000010 in binary
67 takes 7 bits to represent, and equals 0b1000011 in binary
68 takes 7 bits to represent, and equals 0b1000100 in binary
69 takes 7 bits to represent, and equals 0b1000101 in binary
70 takes 7 bits to represent, and equals 0b1000110 in binary
71 takes 7 bits to represent, and equals 0b1000111 in binary
72 takes 7 bits to represent, and equals 0b1001000 in binary
73 takes 7 bits to represent, and equals 0b1001001 in binary
74 takes 7 bits to represent, and equals 0b1001010 in binary
75 takes 7 bits to represent, and equals 0b1001011 in binary
76 takes 7 bits to represent, and equals 0b1001100 in binary
77 takes 7 bits to represent, and equals 0b1001101 in binary
78 takes 7 bits to represent, and equals 0b1001110 in binary
79 takes 7 bits to represent, and equals 0b1001111 in binary
80 takes 7 bits to represent, and equals 0b1010000 in binary
81 takes 7 bits to represent, and equals 0b1010001 in binary
82 takes 7 bits to represent, and equals 0b1010010 in binary
83 takes 7 bits to represent, and equals 0b1010011 in binary
84 takes 7 bits to represent, and equals 0b1010100 in binary
85 takes 7 bits to represent, and equals 0b1010101 in binary
86 takes 7 bits to represent, and equals 0b1010110 in binary
87 takes 7 bits to represent, and equals 0b1010111 in binary
88 takes 7 bits to represent, and equals 0b1011000 in binary
89 takes 7 bits to represent, and equals 0b1011001 in binary
90 takes 7 bits to represent, and equals 0b1011010 in binary
91 takes 7 bits to represent, and equals 0b1011011 in binary
92 takes 7 bits to represent, and equals 0b1011100 in binary
93 takes 7 bits to represent, and equals 0b1011101 in binary
94 takes 7 bits to represent, and equals 0b1011110 in binary
95 takes 7 bits to represent, and equals 0b1011111 in binary
96 takes 7 bits to represent, and equals 0b1100000 in binary
97 takes 7 bits to represent, and equals 0b1100001 in binary
98 takes 7 bits to represent, and equals 0b1100010 in binary
99 takes 7 bits to represent, and equals 0b1100011 in binary
100 takes 7 bits to represent, and equals 0b1100100 in binary
101 takes 7 bits to represent, and equals 0b1100101 in binary
102 takes 7 bits to represent, and equals 0b1100110 in binary
103 takes 7 bits to represent, and equals 0b1100111 in binary
104 takes 7 bits to represent, and equals 0b1101000 in binary
105 takes 7 bits to represent, and equals 0b1101001 in binary
106 takes 7 bits to represent, and equals 0b1101010 in binary
107 takes 7 bits to represent, and equals 0b1101011 in binary
108 takes 7 bits to represent, and equals 0b1101100 in binary
109 takes 7 bits to represent, and equals 0b1101101 in binary
110 takes 7 bits to represent, and equals 0b1101110 in binary
111 takes 7 bits to represent, and equals 0b1101111 in binary
112 takes 7 bits to represent, and equals 0b1110000 in binary
113 takes 7 bits to represent, and equals 0b1110001 in binary
114 takes 7 bits to represent, and equals 0b1110010 in binary
115 takes 7 bits to represent, and equals 0b1110011 in binary
116 takes 7 bits to represent, and equals 0b1110100 in binary
117 takes 7 bits to represent, and equals 0b1110101 in binary
118 takes 7 bits to represent, and equals 0b1110110 in binary
119 takes 7 bits to represent, and equals 0b1110111 in binary
120 takes 7 bits to represent, and equals 0b1111000 in binary
121 takes 7 bits to represent, and equals 0b1111001 in binary
122 takes 7 bits to represent, and equals 0b1111010 in binary
123 takes 7 bits to represent, and equals 0b1111011 in binary
124 takes 7 bits to represent, and equals 0b1111100 in binary
125 takes 7 bits to represent, and equals 0b1111101 in binary
126 takes 7 bits to represent, and equals 0b1111110 in binary
127 takes 7 bits to represent, and equals 0b1111111 in binary
128 takes 8 bits to represent, and equals 0b10000000 in binary
129 takes 8 bits to represent, and equals 0b10000001 in binary
130 takes 8 bits to represent, and equals 0b10000010 in binary
131 takes 8 bits to represent, and equals 0b10000011 in binary
132 takes 8 bits to represent, and equals 0b10000100 in binary
133 takes 8 bits to represent, and equals 0b10000101 in binary
134 takes 8 bits to represent, and equals 0b10000110 in binary
135 takes 8 bits to represent, and equals 0b10000111 in binary
136 takes 8 bits to represent, and equals 0b10001000 in binary
137 takes 8 bits to represent, and equals 0b10001001 in binary
138 takes 8 bits to represent, and equals 0b10001010 in binary
139 takes 8 bits to represent, and equals 0b10001011 in binary
140 takes 8 bits to represent, and equals 0b10001100 in binary
141 takes 8 bits to represent, and equals 0b10001101 in binary
142 takes 8 bits to represent, and equals 0b10001110 in binary
143 takes 8 bits to represent, and equals 0b10001111 in binary
144 takes 8 bits to represent, and equals 0b10010000 in binary
145 takes 8 bits to represent, and equals 0b10010001 in binary
146 takes 8 bits to represent, and equals 0b10010010 in binary
147 takes 8 bits to represent, and equals 0b10010011 in binary
148 takes 8 bits to represent, and equals 0b10010100 in binary
149 takes 8 bits to represent, and equals 0b10010101 in binary
150 takes 8 bits to represent, and equals 0b10010110 in binary
151 takes 8 bits to represent, and equals 0b10010111 in binary
152 takes 8 bits to represent, and equals 0b10011000 in binary
153 takes 8 bits to represent, and equals 0b10011001 in binary
154 takes 8 bits to represent, and equals 0b10011010 in binary
155 takes 8 bits to represent, and equals 0b10011011 in binary
156 takes 8 bits to represent, and equals 0b10011100 in binary
157 takes 8 bits to represent, and equals 0b10011101 in binary
158 takes 8 bits to represent, and equals 0b10011110 in binary
159 takes 8 bits to represent, and equals 0b10011111 in binary
160 takes 8 bits to represent, and equals 0b10100000 in binary
161 takes 8 bits to represent, and equals 0b10100001 in binary
162 takes 8 bits to represent, and equals 0b10100010 in binary
163 takes 8 bits to represent, and equals 0b10100011 in binary
164 takes 8 bits to represent, and equals 0b10100100 in binary
165 takes 8 bits to represent, and equals 0b10100101 in binary
166 takes 8 bits to represent, and equals 0b10100110 in binary
167 takes 8 bits to represent, and equals 0b10100111 in binary
168 takes 8 bits to represent, and equals 0b10101000 in binary
169 takes 8 bits to represent, and equals 0b10101001 in binary
170 takes 8 bits to represent, and equals 0b10101010 in binary
171 takes 8 bits to represent, and equals 0b10101011 in binary
172 takes 8 bits to represent, and equals 0b10101100 in binary
173 takes 8 bits to represent, and equals 0b10101101 in binary
174 takes 8 bits to represent, and equals 0b10101110 in binary
175 takes 8 bits to represent, and equals 0b10101111 in binary
176 takes 8 bits to represent, and equals 0b10110000 in binary
177 takes 8 bits to represent, and equals 0b10110001 in binary
178 takes 8 bits to represent, and equals 0b10110010 in binary
179 takes 8 bits to represent, and equals 0b10110011 in binary
180 takes 8 bits to represent, and equals 0b10110100 in binary
181 takes 8 bits to represent, and equals 0b10110101 in binary
182 takes 8 bits to represent, and equals 0b10110110 in binary
183 takes 8 bits to represent, and equals 0b10110111 in binary
184 takes 8 bits to represent, and equals 0b10111000 in binary
185 takes 8 bits to represent, and equals 0b10111001 in binary
186 takes 8 bits to represent, and equals 0b10111010 in binary
187 takes 8 bits to represent, and equals 0b10111011 in binary
188 takes 8 bits to represent, and equals 0b10111100 in binary
189 takes 8 bits to represent, and equals 0b10111101 in binary
190 takes 8 bits to represent, and equals 0b10111110 in binary
191 takes 8 bits to represent, and equals 0b10111111 in binary
192 takes 8 bits to represent, and equals 0b11000000 in binary
193 takes 8 bits to represent, and equals 0b11000001 in binary
194 takes 8 bits to represent, and equals 0b11000010 in binary
195 takes 8 bits to represent, and equals 0b11000011 in binary
196 takes 8 bits to represent, and equals 0b11000100 in binary
197 takes 8 bits to represent, and equals 0b11000101 in binary
198 takes 8 bits to represent, and equals 0b11000110 in binary
199 takes 8 bits to represent, and equals 0b11000111 in binary
200 takes 8 bits to represent, and equals 0b11001000 in binary
201 takes 8 bits to represent, and equals 0b11001001 in binary
202 takes 8 bits to represent, and equals 0b11001010 in binary
203 takes 8 bits to represent, and equals 0b11001011 in binary
204 takes 8 bits to represent, and equals 0b11001100 in binary
205 takes 8 bits to represent, and equals 0b11001101 in binary
206 takes 8 bits to represent, and equals 0b11001110 in binary
207 takes 8 bits to represent, and equals 0b11001111 in binary
208 takes 8 bits to represent, and equals 0b11010000 in binary
209 takes 8 bits to represent, and equals 0b11010001 in binary
210 takes 8 bits to represent, and equals 0b11010010 in binary
211 takes 8 bits to represent, and equals 0b11010011 in binary
212 takes 8 bits to represent, and equals 0b11010100 in binary
213 takes 8 bits to represent, and equals 0b11010101 in binary
214 takes 8 bits to represent, and equals 0b11010110 in binary
215 takes 8 bits to represent, and equals 0b11010111 in binary
216 takes 8 bits to represent, and equals 0b11011000 in binary
217 takes 8 bits to represent, and equals 0b11011001 in binary
218 takes 8 bits to represent, and equals 0b11011010 in binary
219 takes 8 bits to represent, and equals 0b11011011 in binary
220 takes 8 bits to represent, and equals 0b11011100 in binary
221 takes 8 bits to represent, and equals 0b11011101 in binary
222 takes 8 bits to represent, and equals 0b11011110 in binary
223 takes 8 bits to represent, and equals 0b11011111 in binary
224 takes 8 bits to represent, and equals 0b11100000 in binary
225 takes 8 bits to represent, and equals 0b11100001 in binary
226 takes 8 bits to represent, and equals 0b11100010 in binary
227 takes 8 bits to represent, and equals 0b11100011 in binary
228 takes 8 bits to represent, and equals 0b11100100 in binary
229 takes 8 bits to represent, and equals 0b11100101 in binary
230 takes 8 bits to represent, and equals 0b11100110 in binary
231 takes 8 bits to represent, and equals 0b11100111 in binary
232 takes 8 bits to represent, and equals 0b11101000 in binary
233 takes 8 bits to represent, and equals 0b11101001 in binary
234 takes 8 bits to represent, and equals 0b11101010 in binary
235 takes 8 bits to represent, and equals 0b11101011 in binary
236 takes 8 bits to represent, and equals 0b11101100 in binary
237 takes 8 bits to represent, and equals 0b11101101 in binary
238 takes 8 bits to represent, and equals 0b11101110 in binary
239 takes 8 bits to represent, and equals 0b11101111 in binary
240 takes 8 bits to represent, and equals 0b11110000 in binary
241 takes 8 bits to represent, and equals 0b11110001 in binary
242 takes 8 bits to represent, and equals 0b11110010 in binary
243 takes 8 bits to represent, and equals 0b11110011 in binary
244 takes 8 bits to represent, and equals 0b11110100 in binary
245 takes 8 bits to represent, and equals 0b11110101 in binary
246 takes 8 bits to represent, and equals 0b11110110 in binary
247 takes 8 bits to represent, and equals 0b11110111 in binary
248 takes 8 bits to represent, and equals 0b11111000 in binary
249 takes 8 bits to represent, and equals 0b11111001 in binary
250 takes 8 bits to represent, and equals 0b11111010 in binary
251 takes 8 bits to represent, and equals 0b11111011 in binary
252 takes 8 bits to represent, and equals 0b11111100 in binary
253 takes 8 bits to represent, and equals 0b11111101 in binary
254 takes 8 bits to represent, and equals 0b11111110 in binary
255 takes 8 bits to represent, and equals 0b11111111 in binary
256 takes 9 bits to represent, and equals 0b100000000 in binary

If you look carefully at the values in the output, you can notice some interesting bit patterns, e.g.:

1. Look at the bit patterns for the values of (2 ** n) - 1, i.e. values one less than each power of 2.
2. The same for the values halfway between any two adjacent powers of 2.

Notice any patterns or regularities?
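
If you would rather not scan all 257 rows, this short sketch (Python 3) prints just the two families of values mentioned above. The variable names below and halfway are only for this example.

```python
for n in range(1, 9):
    below = 2 ** n - 1               # one less than a power of 2
    halfway = 2 ** n + 2 ** (n - 1)  # midway between 2**n and 2**(n+1)
    print(n, bin(below), bin(halfway))
```

The first column of binary values is all 1 bits, and the second is always 11 followed by zeros.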

The number columns in the output should really be right-justified, and the repeated (and hence redundant) text in between numbers in the rows should be replaced by a header line at the top, but this time, I'm leaving this as an elementary exercise for the reader :)
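
For anyone who wants to compare notes after trying it, here is one possible way to do that exercise (Python 3). It is a sketch, not the only layout: format_rows is a hypothetical helper, the column widths are arbitrary, and format(n, "b") is used so the binary column has no 0b prefix.

```python
def format_rows(limit=256):
    """Return a header line plus one right-justified row per integer from 0 to limit."""
    layout = "{:>5} {:>5} {:>10}"
    rows = [layout.format("Value", "Bits", "Binary")]
    for n in range(limit + 1):
        rows.append(layout.format(n, n.bit_length(), format(n, "b")))
    return rows

# Print a short table; call format_rows() with no argument for all 257 rows.
print("\n".join(format_rows(16)))
```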

Enjoy.

- Vasudev Ram - Online Python training and consulting

Wednesday, March 1, 2017

Show error numbers and codes from the os.errno module

By Vasudev Ram

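While browsing the Python standard library docs, in particular those for the errno module (which is reachable as os.errno in the Python versions used here), I got the idea of writing this small utility to display os.errno error codes and error names, which are stored in the dict os.errno.errorcode. Note that newer Python 3 releases (3.7 onward) no longer provide os.errno, so there you need to import errno directly.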

Here is the program, os_errno_info.py:

from __future__ import print_function
'''
os_errno_info.py
To show the error codes and 
names from the os.errno module.
Author: Vasudev Ram
Copyright 2017 Vasudev Ram
Web site: https://vasudevram.github.io
Blog: https://jugad2.blogspot.com
Product store: https://gumroad.com/vasudevram
'''

import sys
import os

def main():
    
    print("Showing error codes and names\nfrom the os.errno module:")
    print("Python sys.version:", sys.version[:6])
    print("Number of error codes:", len(os.errno.errorcode))
    print("{0:>4}{1:>8}   {2:<20}    {3:<}".format(\
        "Idx", "Code", "Name", "Message"))
    for idx, key in enumerate(sorted(os.errno.errorcode)):
        print("{0:>4}{1:>8}   {2:<20}    {3:<}".format(\
            idx, key, os.errno.errorcode[key], os.strerror(key)))

if __name__ == '__main__':
    main()

And here is the output on running it:

$ py -2 os_errno_info.py >out2 && gvim out2
Showing error codes and names
from the os.errno module:
Python sys.version: 2.7.12
Number of error codes: 86
 Idx    Code   Name                    Message
   0       1   EPERM                   Operation not permitted
   1       2   ENOENT                  No such file or directory
   2       3   ESRCH                   No such process
   3       4   EINTR                   Interrupted function call
   4       5   EIO                     Input/output error
   5       6   ENXIO                   No such device or address
   6       7   E2BIG                   Arg list too long
   7       8   ENOEXEC                 Exec format error
   8       9   EBADF                   Bad file descriptor
   9      10   ECHILD                  No child processes
  10      11   EAGAIN                  Resource temporarily unavailable
  11      12   ENOMEM                  Not enough space
  12      13   EACCES                  Permission denied
  13      14   EFAULT                  Bad address
  14      16   EBUSY                   Resource device
  15      17   EEXIST                  File exists
  16      18   EXDEV                   Improper link
  17      19   ENODEV                  No such device
  18      20   ENOTDIR                 Not a directory
  19      21   EISDIR                  Is a directory
  20      22   EINVAL                  Invalid argument
  21      23   ENFILE                  Too many open files in system
  22      24   EMFILE                  Too many open files
  23      25   ENOTTY                  Inappropriate I/O control operation
  24      27   EFBIG                   File too large
  25      28   ENOSPC                  No space left on device
  26      29   ESPIPE                  Invalid seek
  27      30   EROFS                   Read-only file system
  28      31   EMLINK                  Too many links
  29      32   EPIPE                   Broken pipe
  30      33   EDOM                    Domain error
  31      34   ERANGE                  Result too large
  32      36   EDEADLOCK               Resource deadlock avoided
  33      38   ENAMETOOLONG            Filename too long
  34      39   ENOLCK                  No locks available
  35      40   ENOSYS                  Function not implemented
  36      41   ENOTEMPTY               Directory not empty
  37      42   EILSEQ                  Illegal byte sequence
  38   10000   WSABASEERR              Unknown error
  39   10004   WSAEINTR                Unknown error
  40   10009   WSAEBADF                Unknown error
  41   10013   WSAEACCES               Unknown error
  42   10014   WSAEFAULT               Unknown error
  43   10022   WSAEINVAL               Unknown error
  44   10024   WSAEMFILE               Unknown error
  45   10035   WSAEWOULDBLOCK          Unknown error
  46   10036   WSAEINPROGRESS          Unknown error
  47   10037   WSAEALREADY             Unknown error
  48   10038   WSAENOTSOCK             Unknown error
  49   10039   WSAEDESTADDRREQ         Unknown error
  50   10040   WSAEMSGSIZE             Unknown error
  51   10041   WSAEPROTOTYPE           Unknown error
  52   10042   WSAENOPROTOOPT          Unknown error
  53   10043   WSAEPROTONOSUPPORT      Unknown error
  54   10044   WSAESOCKTNOSUPPORT      Unknown error
  55   10045   WSAEOPNOTSUPP           Unknown error
  56   10046   WSAEPFNOSUPPORT         Unknown error
  57   10047   WSAEAFNOSUPPORT         Unknown error
  58   10048   WSAEADDRINUSE           Unknown error
  59   10049   WSAEADDRNOTAVAIL        Unknown error
  60   10050   WSAENETDOWN             Unknown error
  61   10051   WSAENETUNREACH          Unknown error
  62   10052   WSAENETRESET            Unknown error
  63   10053   WSAECONNABORTED         Unknown error
  64   10054   WSAECONNRESET           Unknown error
  65   10055   WSAENOBUFS              Unknown error
  66   10056   WSAEISCONN              Unknown error
  67   10057   WSAENOTCONN             Unknown error
  68   10058   WSAESHUTDOWN            Unknown error
  69   10059   WSAETOOMANYREFS         Unknown error
  70   10060   WSAETIMEDOUT            Unknown error
  71   10061   WSAECONNREFUSED         Unknown error
  72   10062   WSAELOOP                Unknown error
  73   10063   WSAENAMETOOLONG         Unknown error
  74   10064   WSAEHOSTDOWN            Unknown error
  75   10065   WSAEHOSTUNREACH         Unknown error
  76   10066   WSAENOTEMPTY            Unknown error
  77   10067   WSAEPROCLIM             Unknown error
  78   10068   WSAEUSERS               Unknown error
  79   10069   WSAEDQUOT               Unknown error
  80   10070   WSAESTALE               Unknown error
  81   10071   WSAEREMOTE              Unknown error
  82   10091   WSASYSNOTREADY          Unknown error
  83   10092   WSAVERNOTSUPPORTED      Unknown error
  84   10093   WSANOTINITIALISED       Unknown error
  85   10101   WSAEDISCON              Unknown error

In the above Python command line, you can of course skip the "&& gvim out2" part. It is just there to automatically open the output file in gVim (text editor) after the utility runs.

The above output was from running it with Python 2.
The utility is written to also work with Python 3.
To change the command line to use Python 3, just change 2 to 3 everywhere in the above Python command :)
(You need to install or already have py, the Python Launcher for Windows, for the py command to work. If you don't have it, or are not on Windows, use python instead of py -2 or py -3 in the above python command line - after having set your OS PATH to point to Python 2 or Python 3 as wanted.)

The only differences in the output are the version message (2.x vs 3.x), and the number of error codes - 86 in Python 2 vs. 101 in Python 3.
Unix people will recognize many of the messages (EACCES, ENOENT, EBADF, etc.) as being familiar ones that you get while programming on Unix.
The error names starting with W are probably Windows-specific errors. Not sure how to get the messages for those, need to look it up. (It currently shows "Unknown error" for them.)
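
Since os.errno is not available on newer Python 3 releases (3.7 onward), a variant that imports errno directly is a safer bet there. The following is a minimal sketch of that, not a replacement for the utility above; error_rows is a helper name made up for this example, and the codes, names and messages you get will differ by platform.

```python
import errno
import os

def error_rows():
    """Return (code, name, message) tuples for all known error codes, sorted by code."""
    return [(code, errno.errorcode[code], os.strerror(code))
            for code in sorted(errno.errorcode)]

print("{0:>4}{1:>8}   {2:<20}    {3}".format("Idx", "Code", "Name", "Message"))
for idx, (code, name, message) in enumerate(error_rows()):
    print("{0:>4}{1:>8}   {2:<20}    {3}".format(idx, code, name, message))
```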

The above Python utility was inspired by an earlier auxiliary utility I wrote, called showsyserr.c, as part of my IBM developerWorks article, Developing a Linux command-line utility (not the main utility described in the article). Following (recursively) the link in the previous sentence will lead you to the code for both the auxiliary and the main utility, as well as the PDF version of the article.

Enjoy.

- Vasudev Ram - Online Python training and consulting

Saturday, February 11, 2017

tp, a simple text pager in Python

By Vasudev Ram

Yesterday I got this idea of writing a simple text file pager in Python.

Here it is, in file tp.py:

'''
tp.py
Purpose: A simple text pager.
Version: 0.1
Platform: Windows-only.
Can be adapted for Unix using tty / termios calls.
Only the use of msvcrt.getch() needs to be changed.
Author: Vasudev Ram
Copyright 2017 Vasudev Ram
Web site: https://vasudevram.github.io
Blog: https://jugad2.blogspot.com
Product store: https://gumroad.com/vasudevram
'''

import sys
import string
from msvcrt import getch

def pager(in_fil=sys.stdin, lines_per_page=10, quit_key='q'):
    assert lines_per_page > 1 and lines_per_page == int(lines_per_page)
    assert len(quit_key) == 1 and \
        quit_key in (string.ascii_letters + string.digits)
    lin_ctr = 0
    for lin in in_fil:
        sys.stdout.write(lin)
        lin_ctr += 1
        if lin_ctr >= lines_per_page:
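            # getch() returns str on Python 2 but bytes on Python 3,
            # so normalize to str before comparing with quit_key.
            c = getch()
            if not isinstance(c, str):
                c = c.decode('latin-1')
            if c.lower() == quit_key.lower():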
                break
            else:
                lin_ctr = 0

def main():
    try:
        sa, lsa = sys.argv, len(sys.argv)
        if lsa == 1:
            pager()
        elif lsa == 2:
            with open(sa[1], "r") as in_fil:
                pager(in_fil)
        else:
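            # Call write() with its argument; the original split the call
            # across two statements, so the message was never printed.
            sys.stderr.write(
                "Only one input file allowed in this version\n")
            sys.exit(1)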
                    
    except IOError as ioe:
        sys.stderr.write("Caught IOError: {}\n".format(repr(ioe)))
        sys.exit(1)

    except Exception as e:
        sys.stderr.write("Caught Exception: {}\n".format(repr(e)))
        sys.exit(1)

if __name__ == '__main__':
    main()

I added a couple of assertions for sanity checking.

The logic of the program is fairly straightforward:

- open (for reading) the filename given as command line argument, or just read (the already-open) sys.stdin
- loop over the lines of the file, lines-per-page lines at a time
- read a character from the keyboard (without waiting for Enter, hence the use of msvcrt.getch [1])
- if it is the quit key, quit, else reset line counter and print another batch of lines
- do error handling as needed
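
The steps above can also be sketched in a platform-neutral form, by passing the key-reading function in as a parameter instead of calling msvcrt.getch directly. This is only an illustrative variant, not the code of tp.py; the names paginate, read_key and out are made up for the sketch.

```python
import sys

def paginate(lines, read_key, out=sys.stdout,
             lines_per_page=10, quit_key='q'):
    """Write lines to out, pausing for a key after each full page.

    read_key is any zero-argument callable that returns one character.
    Returns the number of lines written.
    """
    written = 0
    for lin in lines:
        out.write(lin)
        written += 1
        # A full page has been shown: wait for a key before continuing.
        if written % lines_per_page == 0:
            if read_key().lower() == quit_key.lower():
                break
    return written
```

Because the key source is injected, the paging logic can be exercised without a keyboard, for example by feeding it keys from a list, and the same function could be reused on Unix with a different key reader.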

[1] The msvcrt module is on Windows only, but there are ways to get equivalent functionality on Unixen; google for phrases like "reading a keypress on Unix without waiting for Enter", and look up Unix terms like tty, termios, curses, cbreak, etc.
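
As a rough idea of what the Unix equivalent might look like, here is a minimal sketch using the standard termios and tty modules. It is not part of tp.py and has not been tried in the pager; the function name read_key is made up. It puts the terminal into cbreak mode, reads one byte, and then restores the saved settings.

```python
import os
import sys
import termios
import tty

def read_key(fd=None):
    """Read one keypress without waiting for Enter (Unix only)."""
    if fd is None:
        fd = sys.stdin.fileno()
    if not os.isatty(fd):
        # Not a terminal (e.g. a pipe): there is no mode to change,
        # so just read one byte.
        return os.read(fd, 1).decode("latin-1")
    saved = termios.tcgetattr(fd)
    try:
        tty.setcbreak(fd)  # deliver each key immediately, no line buffering
        return os.read(fd, 1).decode("latin-1")
    finally:
        # Always restore the original terminal settings.
        termios.tcsetattr(fd, termios.TCSADRAIN, saved)
```

When the pager sits at the end of a pipeline, standard input carries the text to be paged, so the keys would presumably have to come from the controlling terminal instead, for example by passing in a descriptor obtained with os.open("/dev/tty", os.O_RDONLY).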

Here are two runs of the program that dogfood it: one with a file (the program itself) as a command-line argument, and the other with the program at the end of a pipeline. Output is not shown, since in both cases it is the same as the input file. You just have to press some key other than q (which makes it quit) repeatedly to page through the content:

$ python tp.py tp.py

$ type tp.py | python tp.py

I could have golfed the code a bit, but chose not to, in the interest of the Zen of Python. Heck, Python is already Zen enough.

- Vasudev Ram - Online Python training and consulting
