Monday, January 16, 2017

Classifying letters and counting their frequencies

By Vasudev Ram

Here is a program that takes a string as input and classifies its characters as vowels or consonants. It counts the frequency of each vowel separately and the total frequency of all consonants together. Characters that are not lowercase ASCII letters are skipped. It is a contrived problem, of course, for teaching purposes.

I gave it as an example / exercise for a Python class recently, then modified / enhanced it slightly for this post.

It is fairly simple, but happens, partly by chance, to illustrate the use of multiple Python language features in about 35 lines of code.
I say "partly by chance" because, after writing it initially and then noticing that it used multiple language features, I thought of and added a few more (for the same program functionality as earlier), trying not to be too artificial or contrived about it :)

(The statement that creates the input string s is, of course, contrived, but it does manage to illustrate the ''.join(lis) idiom and string 'multiplication', and also the use of a backslash to continue lines, if you can call that a language feature. The string multiplication, though contrived, also has a use: each letter's count is written out in the source, so the expected frequencies of the vowels and consonants are quick to check.)
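As a small aside, here is a minimal sketch of those two idioms on their own. It is not part of the program; the names parts and joined are made up for this illustration, and the snippet uses only asserts, so it should run the same way on Python 2.7 and Python 3.

```python
# String 'multiplication': repeat a string n times.
assert 'c' * 2 == 'cc'
assert 'ab' * 3 == 'ababab'
assert 'x' * 0 == ''  # zero repetitions gives the empty string

# ''.join(lis): concatenate the strings in a list, using '' as the separator.
# 'parts' and 'joined' are hypothetical names used only in this sketch.
parts = ['a' * 1, 'b' * 1, 'c' * 2, 'd' * 3]
joined = ''.join(parts)
assert joined == 'abccddd'

# Each repeat count is written out above, so the expected length
# is simply the sum of those counts.
assert len(joined) == 1 + 1 + 2 + 3
```

The same reasoning applies to the full input string s in the program: the multipliers in the source are the expected frequencies.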

Some of the Python language features used are:
- dictionary comprehensions (dict comps)
- continue statement (beginners sometimes ask what it is used for)
- the .get() method of dicts - the 2-argument version, which lets you avoid an if/else when counting frequencies
- returning multiple values from a function
- tuple unpacking (of the multiple values returned as a tuple)
- the ''.join(lis) idiom to join the characters (or strings) in a list into a single string
- string 'multiplication' by an integer (a shorthand for repeating the string n times)
- assert statements for checking post-conditions
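To make the .get() item in the list above concrete, here is a minimal sketch comparing the two ways of counting. The function names count_with_if_else and count_with_get and the sample word are my own, chosen just for this illustration; they are not part of the program below.

```python
def count_with_if_else(text):
    """Count character frequencies using an explicit membership test."""
    freqs = {}
    for ch in text:
        if ch in freqs:
            freqs[ch] += 1
        else:
            freqs[ch] = 1
    return freqs

def count_with_get(text):
    """Count character frequencies using the 2-argument dict.get()."""
    freqs = {}
    for ch in text:
        # .get(ch, 0) returns 0 when ch is not yet a key,
        # so no if/else is needed.
        freqs[ch] = freqs.get(ch, 0) + 1
    return freqs

word = 'banana'
assert count_with_if_else(word) == {'b': 1, 'a': 3, 'n': 2}
assert count_with_get(word) == count_with_if_else(word)
```

Both versions give the same result; the .get() version just folds the "first time seen" case into the default value.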

Here is the program:
# Classify input characters as vowels or consonants.
# Count frequencies of each vowel.
# Count total frequency of all consonants together.
# Author: Vasudev Ram
# Copyright 2017 Vasudev Ram
# Web site:
# Blog:
# Product store:

import string

VOWELS = 'aeiou'

def classify_letters(input):
    vowel_freqs = { vowel: 0 for vowel in VOWELS }
    consonants = 0
    for c in input:
        # Skip anything that is not a lowercase ASCII letter.
        if not (c in string.ascii_lowercase):
            continue
        if c in VOWELS:
            vowel_freqs[c] = vowel_freqs.get(c, 0) + 1
        else:
            consonants += 1
    return vowel_freqs, consonants

s = ''.join(['a' * 1, 'b' * 1, 'c' * 2, 'd' * 3, 'e' * 2, 'f' * 4, \
    'g' * 5, 'h' * 6, 'i' * 3, 'j' * 7, 'k' * 8, 'l' * 9, 'm' * 10, \
    'n' * 11, 'o' * 4, 'p' * 12, 'q' * 13, 'r' * 14, 's' * 15, \
    't' * 16, 'u' * 5, 'w' * 17, 'y' * 18, 'z' * 19])

print "Classifying letters in string:", s
print '-' * 70
vowel_freqs, consonants = classify_letters(s)
print 'vowel freqs:', vowel_freqs
print 'consonants total freq:', consonants
print '-' * 70
print 'Checking results:'
assert len(s) == sum(vowel_freqs.values()) + consonants
print 'OK'
And here is the output on running it:
$ py -2
Classifying letters in string: abccdddeeffffggggghhhhhhiiijjjjjjjkkkkkkkklllllll
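----------------------------------------------------------------------
vowel freqs: {'a': 1, 'i': 3, 'e': 2, 'u': 5, 'o': 4}
consonants total freq: 190
----------------------------------------------------------------------
Checking results:
OK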
I used an assert as a sanity check: the vowel counts plus the consonant total must add up to the number of letters in the original string.
Putting the print 'OK' after the assert has a nice side effect: if the assert passes, the program prints OK, but if it fails, the program stops with an AssertionError instead and never prints OK.
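Here is a minimal sketch of that assert-then-OK behavior in isolation. The helper check_total and the variable reached_ok are hypothetical names for this illustration only. The numbers are the ones from this run: 15 vowels in all (1 + 2 + 3 + 4 + 5), 190 consonants, so 205 letters.

```python
def check_total(total_len, vowel_total, consonants):
    # Same shape of post-condition as in the program:
    # every letter is either a vowel or a consonant.
    assert total_len == vowel_total + consonants
    # Only reached when the assert passes.
    return 'OK'

# Passing case: 205 == 15 + 190, so we get 'OK' back.
assert check_total(205, 15, 190) == 'OK'

# Failing case: a deliberately wrong consonant count.
# The assert raises AssertionError, so 'OK' is never returned.
try:
    check_total(205, 15, 189)
    reached_ok = True
except AssertionError:
    reached_ok = False

assert reached_ok is False
```

One caveat: assert statements are skipped when Python is run with the -O option, so this kind of check is a development aid rather than a guaranteed runtime check.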

- Vasudev Ram - Online Python training and consulting
