Monday, January 16, 2017

Classifying letters and counting their frequencies

By Vasudev Ram

Here is a program that takes a string as input and classifies the characters in it as vowels or consonants. It also counts the frequency of each vowel separately and the total frequency of all consonants together. It is a contrived problem, of course, for teaching purposes.

I gave it as an example / exercise for a Python class recently, then modified / enhanced it slightly for this post.

It is fairly simple, but happens, partly by chance, to illustrate the use of multiple Python language features in 35 or so lines of code.
I say "partly by chance" because, after writing it initially and then noticing that it used multiple language features, I thought of and added a few more (for the same program functionality as earlier), trying not to be too artificial or contrived about it :)

(The statement that creates the input string s is, of course, contrived, but it does manage to illustrate the ''.join(lis) idiom and string 'multiplication', and also the use of a backslash to continue lines, if you can call that a language feature. Also, the string multiplication as used, though contrived, does allow us to quickly find the frequencies of either vowels or consonants, so it has a use.)

Some of the Python language features used are:
- dictionary comprehensions (dict comps)
- continue statement (beginners sometimes ask what it is used for)
- the .get() method of dicts - the 2-argument version, which lets you avoid an if/else when counting frequencies
- returning multiple values from a function
- tuple unpacking (of the multiple values returned as a tuple)
- the ''.join(lis) idiom to join the characters (or strings) in a list into a single string
- string 'multiplication' by an integer (a shorthand for repeating the string n times)
- assert statements for checking post-conditions
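
To make the .get() item in the list above concrete, here is a small sketch that counts characters both ways. It is not part of classify_letters1.py; the names count_with_if, count_with_get and sample are made up for illustration. It avoids print, so it should run unchanged under Python 2 or Python 3.

```python
def count_with_if(text):
    # Counting with an explicit if/else on key membership.
    freqs = {}
    for c in text:
        if c in freqs:
            freqs[c] += 1
        else:
            freqs[c] = 1
    return freqs

def count_with_get(text):
    # Same counting, using the 2-argument dict.get():
    # get(c, 0) returns 0 when c is not yet a key.
    freqs = {}
    for c in text:
        freqs[c] = freqs.get(c, 0) + 1
    return freqs

# The ''.join(lis) idiom and string 'multiplication' build a test input.
sample = ''.join(['a' * 2, 'b' * 3, 'a' * 1])   # 'aabbba'
```

Both functions return the same dict for the same input; the second one just needs one line in the loop body instead of four.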

Here is the program, classify_letters1.py:
# classify_letters1.py
# Classify input characters as vowels or consonants.
# Count frequencies of each vowel.
# Count total frequency of all consonants together.
# Author: Vasudev Ram
# Copyright 2017 Vasudev Ram
# Web site: https://vasudevram.github.io
# Blog: https://jugad2.blogspot.com
# Product store: https://gumroad.com/vasudevram

import string

VOWELS = 'aeiou'

def classify_letters(text):
    vowel_freqs = { vowel: 0 for vowel in VOWELS }
    consonants = 0
    for c in text:
        # Skip anything that is not a lowercase ASCII letter.
        if c not in string.ascii_lowercase:
            continue
        if c in VOWELS:
            vowel_freqs[c] = vowel_freqs.get(c, 0) + 1
        else:
            consonants += 1
    return vowel_freqs, consonants

s = ''.join(['a' * 1, 'b' * 1, 'c' * 2, 'd' * 3, 'e' * 2, 'f' * 4, \
    'g' * 5, 'h' * 6, 'i' * 3, 'j' * 7, 'k' * 8, 'l' * 9, 'm' * 10, \
    'n' * 11, 'o' * 4, 'p' * 12, 'q' * 13, 'r' * 14, 's' * 15, \
    't' * 16, 'u' * 5, 'w' * 17, 'y' * 18, 'z' * 19])

print "Classifying letters in string:", s
print '-' * 70
vowel_freqs, consonants = classify_letters(s)
print 'vowel freqs:', vowel_freqs
print 'consonants total freq:', consonants
print '-' * 70
print 'Checking results:'
assert len(s) == sum(vowel_freqs.values()) + consonants
print 'OK'
And here is the output on running it:
$ py -2 classify_letters1.py
Classifying letters in string: abccdddeeffffggggghhhhhhiiijjjjjjjkkkkkkkklllllll
llmmmmmmmmmmnnnnnnnnnnnooooppppppppppppqqqqqqqqqqqqqrrrrrrrrrrrrrrssssssssssssss
sttttttttttttttttuuuuuwwwwwwwwwwwwwwwwwyyyyyyyyyyyyyyyyyyzzzzzzzzzzzzzzzzzzz
----------------------------------------------------------------------
vowel freqs: {'a': 1, 'i': 3, 'e': 2, 'u': 5, 'o': 4}
consonants total freq: 190
----------------------------------------------------------------------
Checking results:
OK
I used an assert as a sanity check: the vowel counts and the consonant count, added together, must equal the number of letters in the original string.
Putting the print 'OK' after the assert has a nice side effect: if the assert does not trigger, the program prints OK; if it does trigger, the program prints an error message instead of OK.
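
Here is a rough sketch of that assert behaviour in isolation. The helper names check_total and run_check are hypothetical, not from the program above. The totals 15 and 190 are taken from the output shown earlier, and 205 is simply their sum. The sketch should run under Python 2 or 3.

```python
def check_total(total_len, vowel_total, consonants):
    # Post-condition check; the text after the comma becomes the
    # message of the AssertionError if the condition is false.
    assert total_len == vowel_total + consonants, \
        'counts do not add up: %d != %d + %d' % (
            total_len, vowel_total, consonants)
    # Reached only when the assert did not trigger.
    return 'OK'

def run_check(total_len, vowel_total, consonants):
    # Catch the failure so that both outcomes can be shown as strings.
    try:
        return check_total(total_len, vowel_total, consonants)
    except AssertionError as e:
        return 'FAILED: %s' % e
```

With the figures from the run above, run_check(205, 15, 190) returns 'OK', while a deliberately wrong consonant count such as run_check(205, 15, 189) returns a string starting with 'FAILED'. Note that Python skips assert statements entirely when run with the -O option, so asserts suit sanity checks like this one rather than validation of user input.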

- Vasudev Ram - Online Python training and consulting

Get updates (via Gumroad) on my forthcoming apps and content.

Jump to posts: Python * DLang * xtopdf

Subscribe to my blog by email

My ActiveState Code recipes

Follow me on: LinkedIn * Twitter

Managed WordPress Hosting by FlyWheel



Thursday, January 12, 2017

Python video: Loop like a native - by Ned Batchelder

By Vasudev Ram

I saw this really good Python video a while ago, and remembered it today in the context of some Python work I was doing.

It is a talk by Ned Batchelder at PyCon US 2013. The title of the talk is:

Loop like a native: while, for, iterators, generators

The video is also embedded below.

Basically it is about more idiomatic ways of looping in Python, which can often lead to shorter, clearer, less redundant code.
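
As a small illustration of the kind of thing meant by idiomatic looping, here is a sketch of my own (it is not taken from the talk, and the sample data and the evens function are made up). It compares an index-based loop with one that uses enumerate() and zip(), and adds a tiny generator. It should run under Python 2 or 3.

```python
names = ['ant', 'bee', 'cat']
ages = [3, 1, 7]

# Index-based loop, common in code translated from other languages.
pairs_by_index = []
for i in range(len(names)):
    pairs_by_index.append((i, names[i], ages[i]))

# More idiomatic: enumerate() supplies the index, zip() pairs the lists.
pairs_idiomatic = []
for i, (name, age) in enumerate(zip(names, ages)):
    pairs_idiomatic.append((i, name, age))

def evens(limit):
    # A generator: yields values one at a time instead of building a list.
    for n in range(limit):
        if n % 2 == 0:
            yield n
```

Both loops build the same list of tuples, but the second never touches an index into names or ages, so there is less to get wrong.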







Wednesday, January 11, 2017

Two simple Python object introspection functions

By Vasudev Ram



While browsing some Python code and docs, I recently got the idea for, and wrote, these two simple convenience functions for introspecting Python objects.

The function oa (for object attributes) can be used to get the attributes of any Python object:
def oa(o):
    for at in dir(o):
        print at,
(The reason I use oa(o) instead of just typing dir(o), for some object o, is that in IPython (though not in vanilla Python), dir(o) displays the attributes in a vertical line, so the output scrolls off the screen if there are many attributes. The oa() function prints them horizontally, so the output fits in a few lines without scrolling off.)

Running oa() a few times in the Python shell gives (shell prompts removed):
# object attributes of a dict:
oa({})
__class__ __cmp__ __contains__ __delattr__ __delitem__ __doc__ __eq__ __format__
__ge__ __getattribute__ __getitem__ __gt__ __hash__ __init__ __iter__ __le__ __len__ 
__lt__ __ne__ __new__ __reduce__ __reduce_ex__ __repr__ __setattr__ __setitem__ 
__sizeof__ __str__ __subclasshook__ clear copy fromkeys get has_key items
iteritems iterkeys itervalues keys pop popitem setdefault update values viewitems 
viewkeys viewvalues

# object attributes of a list:
oa([])
__add__ __class__ __contains__ __delattr__ __delitem__ __delslice__ __doc__ __eq__ 
__format__ __ge__ __getattribute__ __getitem__ __getslice__ __gt__ __hash__ __iadd__ 
__imul__ __init__ __iter__ __le__ __len__ __lt__ __mul__ __ne__ __new__
__reduce__ __reduce_ex__ __repr__ __reversed__ __rmul__ __setattr__ __setitem__
__setslice__ __sizeof__ __str__ __subclasshook__ append count extend index insert 
pop remove reverse sort

# object attributes of an int:
oa(1)
__abs__ __add__ __and__ __class__ __cmp__ __coerce__ __delattr__ __div__ __divmod__ 
__doc__ __float__ __floordiv__ __format__ __getattribute__ __getnewargs__ __hash__ 
__hex__ __index__ __init__ __int__ __invert__ __long__ __lshift__ __mod__
__mul__ __neg__ __new__ __nonzero__ __oct__ __or__ __pos__ __pow__ __radd__ __rand__ 
__rdiv__ __rdivmod__ __reduce__ __reduce_ex__ __repr__ __rfloordiv__ __rlshift__ 
__rmod__ __rmul__ __ror__ __rpow__ __rrshift__ __rshift__ __rsub__ __rtruediv__ 
__rxor__ __setattr__ __sizeof__ __str__ __sub__ __subclasshook__ __truediv__ 
__trunc__ __xor__ bit_length conjugate denominator imag numerator real

The function oar (for object attributes regular, meaning exclude the special or "dunder" methods, i.e. those starting and ending with a double underscore) can be used to get only the "regular" attributes of any Python object:
def oar(o):
    for at in dir(o):
        # Skip only names that both start and end with '__'.
        if not (at.startswith('__') and at.endswith('__')):
            print at,
The output from running it:
# regular object attributes of a dict:
oar({})
clear copy fromkeys get has_key items iteritems iterkeys itervalues keys pop popitem 
setdefault update values viewitems viewkeys viewvalues

# regular object attributes of an int:
oar(1)
bit_length conjugate denominator imag numerator real

# regular object attributes of a string:
oar('')
_formatter_field_name_split _formatter_parser capitalize center count decode encode 
endswith expandtabs find format index isalnum isalpha isdigit islower isspace 
istitle isupper join ljust lower lstrip partition replace rfind rindex rjust rpartition 
rsplit rstrip split splitlines startswith strip swapcase title translate upper zfill
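
For use outside the interactive shell, a variant that returns the names instead of printing them may be handier, since the result can then be tested or processed further. The following is only a sketch along those lines; the names attrs and regular_attrs are made up for it, and it should run under Python 2 or 3.

```python
def attrs(o):
    # All attribute names of o, joined on one line
    # (like oa, but returned as a string instead of printed).
    return ' '.join(dir(o))

def regular_attrs(o):
    # Names that are not "dunder" names, i.e. that do not both
    # start and end with a double underscore.
    return [at for at in dir(o)
            if not (at.startswith('__') and at.endswith('__'))]
```

For example, ' '.join(regular_attrs(1)) gives a one-line listing similar to the oar(1) output above; the exact names depend on the Python version.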

Here are some more posts about Python introspection.

Enjoy.




Sunday, January 8, 2017

A Unix seq-like utility in Python

By Vasudev Ram


Due to a chain (or sequence - pun intended :) of thoughts, I got the idea of writing a simple version of the Unix seq command-line utility in Python. (Some Unix versions have a similar command called jot.)

Note: I wrote this program just for fun. As the seq Wikipedia page says, modern versions of bash can do the work of seq. But this program may still be useful on Windows; I am not sure whether the CMD shell has seq-like functionality or not. My guess is that PowerShell probably has it.

The seq command lets you specify one, two or three numbers as command-line arguments (some of which are optional): the start, stop and step values. It outputs all numbers in that range, with that step between them (the default step is 1). I have not tried to emulate seq exactly; instead, I've written my own version. One difference is that mine does not support the step argument (so the step can only be 1), at least in this version. That can be added later. Another is that I print the numbers with spaces between them, not newlines. Yet another is that I don't support floating-point numbers in this version (again, that can be added).

The seq command has more uses than the above description might suggest (in fact, it is mainly used for things other than just printing a sequence of numbers - after all, who would need to do that much?). Here is one example, on Unix (from the Wikipedia article about seq):
# Remove file1 through file17:
for n in `seq 17`
do
    rm file$n
done
Note that those are backquotes or grave accents around seq 17 in the above code snippet. It uses sh / bash syntax, so requires one of them, or a compatible shell.

Here is the code for seq1.py:
'''
seq1.py
Purpose: To act somewhat like the Unix seq command.
Author: Vasudev Ram
Copyright 2017 Vasudev Ram
Web site: https://vasudevram.github.io
Blog: https://jugad2.blogspot.com
Product store: https://gumroad.com/vasudevram
'''

import sys

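def main():
    sa, lsa = sys.argv, len(sys.argv)
    # No arguments: nothing to print, exit with an error status.
    if lsa < 2:
        sys.exit(1)
    try:
        start = 1
        if lsa == 2:
            # One argument: print 1 through end.
            end = int(sa[1])
        elif lsa == 3:
            # Two arguments: print start through end.
            start = int(sa[1])
            end = int(sa[2])
        else: # lsa > 3
            sys.exit(1)
    except ValueError:
        # A non-integer argument was given.
        sys.exit(1)

    # end + 1 because xrange() excludes its stop value.
    for num in xrange(start, end + 1):
        print num,
    sys.exit(0)
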
if __name__ == '__main__':
    main()
And here are a few runs of seq1.py, and the output of each run, below:
$ py -2 seq1.py

$ py -2 seq1.py 1
1

$ py -2 seq1.py 2
1 2

$ py -2 seq1.py 3
1 2 3

$ py -2 seq1.py 1 1
1

$ py -2 seq1.py 1 2
1 2

$ py -2 seq1.py 1 3
1 2 3

$ py -2 seq1.py 4
1 2 3 4

$ py -2 seq1.py 1 4
1 2 3 4

$ py -2 seq1.py 2 2
2

$ py -2 seq1.py 5 3

$ py -2 seq1.py -6 -2
-6 -5 -4 -3 -2

$ py -2 seq1.py -4 -0
-4 -3 -2 -1 0

$ py -2 seq1.py -5 5
-5 -4 -3 -2 -1 0 1 2 3 4 5
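
As mentioned above, seq1.py does not support a step argument. Here is a sketch of one way it might be added; it is not part of seq1.py. The function name seq_numbers is made up, and I am assuming the argument order of Unix seq (last, or first last, or first increment last). It returns a list instead of printing, and should run under Python 2 or 3.

```python
def seq_numbers(args):
    # args: a list of 1 to 3 strings, in the order assumed above:
    #   [last] or [first, last] or [first, increment, last]
    nums = [int(a) for a in args]   # raises ValueError on bad input
    first, step = 1, 1
    if len(nums) == 1:
        last = nums[0]
    elif len(nums) == 2:
        first, last = nums
    elif len(nums) == 3:
        first, step, last = nums
    else:
        raise ValueError('expected 1 to 3 arguments')
    if step == 0:
        raise ValueError('step must not be zero')
    # range() excludes its stop value, so move the stop one unit past
    # last, in the direction of travel, to make last inclusive.
    stop = last + 1 if step > 0 else last - 1
    return list(range(first, stop, step))
```

For example, ' '.join(str(n) for n in seq_numbers(['1', '2', '10'])) gives '1 3 5 7 9', and a negative step counts down, so seq_numbers(['5', '-1', '3']) gives [5, 4, 3].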

There are many other possible uses for seq, if one uses one's imagination, such as rapidly generating various filenames or directory names, with numbers in them (as a prefix, suffix or in the middle), for testing or other purposes, etc.
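
The filename idea can be sketched directly in Python too. The function numbered_names below, along with its parameters, is hypothetical and only for illustration; it builds the names as strings and does not create any files. It should run under Python 2 or 3.

```python
def numbered_names(prefix, start, end, suffix='', width=0):
    # Returns names of the form prefix + number + suffix, for the
    # numbers start through end inclusive. A width greater than 0
    # zero-pads each number to at least that many digits.
    return [prefix + str(n).zfill(width) + suffix
            for n in range(start, end + 1)]
```

For instance, numbered_names('file', 1, 3) gives ['file1', 'file2', 'file3'], and numbered_names('test_', 8, 10, '.txt', 3) gives ['test_008.txt', 'test_009.txt', 'test_010.txt'].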

- Enjoy.




Friday, January 6, 2017