Friday, June 1, 2018

Porting a prime checker from Q to Python (with improvements)

By Vasudev Ram


Q => Py

Hi readers,

I was checking out the Q programming language.

It's an interesting language, a descendant of APL, that supports array and functional programming styles. Some information about Q from its Wikipedia page:

Paradigm: Array, functional
Designed by: Arthur Whitney
Developer: Kx Systems
First appeared: 2003
Typing discipline: Dynamic, strong
Influenced by: A+, APL, Scheme, K

Q is a proprietary array processing language developed by Arthur Whitney and commercialized by Kx Systems. The language serves as the query language for kdb+, a disk-based and in-memory, column-based database. kdb+ is based upon K, a terse variant of APL. Q is a thin wrapper around K, providing a more readable, English-like interface.

I had tried out Q a bit recently, using one of the tutorials.

Then, while reading the Q Wikipedia page again, I saw this paragraph:

[
When x is an integer greater than 2, the following function will return 1 if it is a prime, otherwise 0:

{min x mod 2_til x}
The function is evaluated from right to left:

"til x" enumerates the non-negative integers less than x.
"2_" drops the first two elements of the enumeration (0 and 1).
"x mod" performs modulo division between the original integer and each value in the truncated list.
"min" finds the minimum value of the list of modulo results.
]
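
To make the right-to-left evaluation concrete, here is a minimal Python sketch that mirrors each Q step and returns the intermediate values. The function name q_steps is made up for this illustration; it is not part of Q or of the is_prime.py program shown further below.

```python
def q_steps(x):
    """Mirror the Q expression {min x mod 2_til x}, one step at a time."""
    til_x = list(range(x))            # "til x": 0, 1, ..., x - 1
    dropped = til_x[2:]               # "2_": drop the first two items (0 and 1)
    mods = [x % d for d in dropped]   # "x mod": remainder of x by each item
    return til_x, dropped, mods, min(mods)  # "min": the smallest remainder

# Show the intermediate lists for one prime (7) and one composite (9).
for n in (7, 9):
    print(n, q_steps(n))
```

For 7 the remainders are [1, 1, 3, 2, 1], so the minimum is 1; for 9 the remainder for divisor 3 is 0, so the minimum is 0.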

It's basically an algorithm in Q to find whether a given number x (> 2) is prime or not [1]. The algorithm returns 1 if x is a prime, else 0. Notice how concise it is. That conciseness is a property of the APL family of languages, such as APL, J and Q. In fact Q is slightly less concise than some of the others, I've read, because it puts a thin English-like wrapper on the hard-core APL symbolic syntax. But all of them are still very concise.

So I thought of implementing that algorithm in Python, just for fun. I first wrote a naive version, more or less a port of the Q version. Then I rewrote that first version in a more functional way. Then I realized that there are other opportunities for improving the code [2], and implemented a few of them.

So I combined the few different versions of the is_prime_* functions (where * = 1, 2, 3, etc.) that I had written, in a single program, with a driver function to exercise all of them. The code is in file is_prime.py, shown further below. There are comments in the program that explain the logic and improvements or differences of the various is_prime function versions.

[1] Prime number

[2] There are obviously many other ways of checking if numbers are prime, and many of them are faster than these approaches; see [3]. These are not among the most efficient ways, or even close; I was just experimenting with different ways of rewriting and refactoring the code, after the initial port from Q to Python. Even the original Q version in the Wikipedia page was not meant to be a fast version, since it does not use any of the improvements I mention below, let alone using any special Q-specific algorithm or language feature, or advanced general prime-finding algorithms.

[3] Primality test

There is still scope for further changes or improvements. I have implemented some of them in the code shown further below, and may do others in a future post. A few such improvements are:

1. Don't divide x by all values up to x - 1. Instead, only check up to the square root of x. This is a standard improvement often taught to programming beginners; in fact, it is mentioned in the Wikipedia articles about primes.

2. Terminate early when the first remainder equal to 0 is found. This avoids unnecessarily computing all the other remainders. I did that in is_prime_3().

3. Don't check for divisibility by any even number > 2 (call it b), because if any such b divides x evenly, then so will 2, and we would have checked earlier if 2 divides x evenly.

4. Create a function for the output statements, most of which are common across the different is_prime versions, and call that function instead from those places.

5. Use a generator to lazily yield divisors, and when any divisor gives a zero remainder, exit early, since it means the number is not prime. Done in is_prime_4().
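
Improvements 1 and 3 are not implemented in is_prime.py. A minimal sketch of how they might look, combined with the early termination of improvement 2, is below. The name is_prime_sqrt is hypothetical, and the loop compares d * d with x instead of computing a floating-point square root, which is one possible choice among several.

```python
def is_prime_sqrt(x):
    """Trial division up to sqrt(x), testing 2 first and then odd divisors only."""
    assert x > 2, "in is_prime_sqrt: x > 2 failed"
    if x % 2 == 0:
        return False        # Improvement 3: 2 stands in for every even divisor.
    d = 3
    while d * d <= x:       # Improvement 1: stop once d exceeds sqrt(x).
        if x % d == 0:
            return False    # Improvement 2: stop at the first zero remainder.
        d += 2              # Skip the even candidates.
    return True

print([n for n in range(3, 31) if is_prime_sqrt(n)])
```

The square root bound works because any divisor of x larger than sqrt(x) must be paired with one smaller than sqrt(x), which the loop would already have found.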

Other ways of doing it may include use of some other functional programming features of Python such as filter(), itertools.takewhile/dropwhile, etc. (The itertools module has many functions and is very interesting, but that is a subject for a different post.)
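
As a rough sketch of that direction (not code from is_prime.py), the two functions below use the built-in all(), which stops at the first false value, and itertools.takewhile with itertools.count. The function names are hypothetical, and all() is swapped in here for illustration; the post itself only names filter() and takewhile/dropwhile.

```python
from itertools import count, takewhile

def is_prime_all(x):
    """x is prime if every remainder x % d is non-zero for d in 2 .. x - 1."""
    assert x > 2, "in is_prime_all: x > 2 failed"
    # all() returns False as soon as it sees a zero remainder.
    return all(x % d for d in range(2, x))

def is_prime_takewhile(x):
    """Same idea, but take divisors lazily only while d * d <= x."""
    assert x > 2, "in is_prime_takewhile: x > 2 failed"
    divs = takewhile(lambda d: d * d <= x, count(2))
    return all(x % d for d in divs)

print([n for n in range(3, 31) if is_prime_takewhile(n)])
```

Both versions terminate early on composites; the takewhile version also checks fewer divisors for primes, since it stops at the square root of x.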

I also observed some interesting behavior when running the program with large ranges of inputs for prime number checking. Will analyze that a bit and write about my opinions on that in a future post.

Here is the code for is_prime.py:
# File: is_prime.py
# A port of a primality checking algorithm from the Q language to Python, 
# plus a few improvements / variations, using Python features.
# Ref: https://en.wikipedia.org/wiki/Q_(programming_language_from_Kx_Systems)
# Search for the word "prime" in that page to see the Q code.

# Author: Vasudev Ram
# Copyright 2018 Vasudev Ram
# Web site: https://vasudevram.github.io
# Blog: https://jugad2.blogspot.com
# Product store: https://gumroad.com/vasudevram

from __future__ import print_function
from debug1 import debug1

import sys

# Naive, mostly procedural port from the Q version.
def is_prime_1(x):
    # Guard against invalid argument.
    assert x > 2, "in is_prime_1: x > 2 failed"
    # The range of integers from 0 to x - 1
    til_x = range(x)
    # The range without the first 2 items, 0 and 1.
    divs = til_x[2:]
    # The remainders after dividing x by each integer in divs.
    mods = map(lambda d: x % d, divs)
    # x is prime if the minimum-valued remainder equals 1.
    return min(mods) == 1

# Shorter, more functional version, with nested calls 
# to min, map and range.
def is_prime_2(x):
    assert x > 2, "in is_prime_2: x > 2 failed"
    # Eliminate slicing used in is_prime_1, just do range(2, x).
    return min(map(lambda d: x % d, range(2, x))) == 1

# Early-terminating version, when 1st remainder equal to 0 found, 
# using a list for the range of divisors.
def is_prime_3(x):
    assert x > 2, "in is_prime_3: x > 2 failed"
    divs = range(2, x)
    # Check if x is divisible by any integer in divs; if so, 
    # x is not prime, so terminate early.
    debug1("in is_prime_3, x", x)
    for div in divs:
        debug1("  in loop, div", div)
        if x % div == 0:
            return False
    # If we reach here, x was not divisible by any integer in 
    # 2 to x - 1, so x is prime.
    return True

# Generator function to yield the divisors one at a time, to 
# avoid creating the whole list of divisors up front.
def gen_range(start, x):
    assert start > 0, "in gen_range, start > 0 failed"
    assert x > start, "in gen_range, x > start failed"
    i = start
    while i < x:
        yield i
        i += 1

# Early-terminating version, when 1st remainder equal to 0 found, 
# using a generator for the range of divisors.
def is_prime_4(x):
    assert x > 2, "in is_prime_4, x > 2 failed"
    divs = gen_range(2, x)
    debug1("in is_prime_4, x", x)
    for div in divs:
        debug1("  in loop, div", div)
        if x % div == 0:
            return False
    return True

def check_primes(low, high):
    assert low <= high, "in check_primes, low <= high failed"

    """
    print("\nWith a for loop:")
    for x in range(low, high + 1):
        print(x, "is" if is_prime_1(x) else "is not", "prime,", end=" ")
    print()
    """

    print("\nWith nested function calls:")
    output = [ str(x) + (" prime" if is_prime_2(x) else " not prime") \
        for x in range(low, high + 1) ]
    print(", ".join(output))

    print("\nWith a list of divisors and early termination:")
    output = [ str(x) + (" prime" if is_prime_3(x) else " not prime") \
        for x in range(low, high + 1) ]
    print(", ".join(output))

    print("\nWith a generator of divisors and early termination:")
    output = [ str(x) + (" prime" if is_prime_4(x) else " not prime") \
        for x in range(low, high + 1) ]
    print(", ".join(output))

def main():
    try:
        low = int(sys.argv[1])
        high = int(sys.argv[2])
        if low <= 2:
            print("Error: Low value must be > 2.")
            sys.exit(1)
        if high < low:
            print("Error: High value must be >= low value.")
            sys.exit(1)
        print("Checking primality of integers between {} and {}".format(low, high))
        check_primes(low, high)
        sys.exit(0)
    except ValueError as ve:
        print("Caught ValueError: {}".format(str(ve)))
    except IndexError as ie:
        print("Caught IndexError: {}".format(str(ie)))
    sys.exit(1)

if __name__ == '__main__':
    main()
And here are some runs of the program, below, for both normal and error cases. Note that I used my debug1() debugging utility function in a few places, to show what divisors are being used. This helps show that the early termination logic works. To turn off debugging output, simply use the -O option, like this example:

python -O is_prime.py other_args

This improved version of the debug1 function, unlike the earlier version that was shown in the blog post "A simple Python debugging function", does not require the user to set any environment variables like VR_DEBUG, since it uses Python's built-in __debug__ variable instead. So to enable debugging, nothing extra needs to be done, since that variable is set to True by default. To disable debugging, all we have to do is pass the -O option on the python command line.
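
The debug1 module itself is not listed in this post. A minimal sketch of a function gated on __debug__ might look like the following; the names format_debug and debug1_sketch are hypothetical, and the "message: value" layout is only inferred from the debugging output shown in the runs below, so the real debug1 may differ.

```python
def format_debug(message, *values):
    """Build the text 'message: v1 v2 ...' (hypothetical helper)."""
    return message + ": " + " ".join(str(v) for v in values)

def debug1_sketch(message, *values):
    # __debug__ is True by default and False when Python is run with -O,
    # so no environment variable is needed to switch the output on or off.
    if __debug__:
        print(format_debug(message, *values))

debug1_sketch("in is_prime_3, x", 4)
```

Run normally, the last line prints a debugging message; run with python -O, it prints nothing.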

Here is the prime program's output:

Try this one later (if you're trying out the program), since it takes 
longer to run. You may observe some interesting behavior:

$ python -O is_prime.py 3 10000 | less

where less is the Unix 'less' command, a text pager. Any command-line text pager 
that can read standard input (from a pipe) will work.

Try some of these below (both normal and error cases) before the one above:

Some error-handling cases:

$ python -O is_prime.py 0 0
Error: Low value must be > 2.

$ python -O is_prime.py 2 0
Error: Low value must be > 2.

$ python -O is_prime.py 3 0
Error: High value must be >= low value.

$ python -O is_prime.py 4 2
Error: High value must be >= low value.

Some normal cases:

$ python -O is_prime.py 3 3
Checking primality of integers between 3 and 3

With nested function calls:
3 prime

With a list of divisors and early termination:
3 prime

With a generator of divisors and early termination:
3 prime

To show that the early termination logic works, run the program without 
the -O option. 
Here is one such run. Due to more debugging output, I've only checked 
two numbers, 4 and 5. But you can try with any number of values if you 
page the output, or redirect it to a file.

$ python is_prime.py 4 5
Checking primality of integers between 4 and 5

With nested function calls:
4 not prime, 5 prime

With a list of divisors and early termination:
in is_prime_3, x: 4
  in loop, div: 2
in is_prime_3, x: 5
  in loop, div: 2
  in loop, div: 3
  in loop, div: 4
4 not prime, 5 prime

With a generator of divisors and early termination:
in is_prime_4, x: 4
  in loop, div: 2
in is_prime_4, x: 5
  in loop, div: 2
  in loop, div: 3
  in loop, div: 4
4 not prime, 5 prime

You can see from the above run that for 4, the checking stops early, 
at the first divisor (2), because it evenly divides 4.
But for 5, all divisors from 2 to 4 are checked, because 5 has 
no divisors other than 1 and itself.

And here is a run checking for primes between 3 and 30:

$ python -O is_prime.py 3 30
Checking primality of integers between 3 and 30

With nested function calls:
3 prime, 4 not prime, 5 prime, 6 not prime, 7 prime, 8 not prime, 9 not prime, 
10 not prime, 11 prime, 12 not prime, 13 prime, 14 not prime, 15 not prime, 
16 not prime, 17 prime, 18 not prime, 19 prime, 20 not prime, 21 not prime, 
22 not prime, 23 prime, 24 not prime, 25 not prime, 26 not prime, 27 not prime, 
28 not prime, 29 prime, 30 not prime

With a list of divisors and early termination:
3 prime, 4 not prime, 5 prime, 6 not prime, 7 prime, 8 not prime, 9 not prime, 
10 not prime, 11 prime, 12 not prime, 13 prime, 14 not prime, 15 not prime, 
16 not prime, 17 prime, 18 not prime, 19 prime, 20 not prime, 21 not prime, 
22 not prime, 23 prime, 24 not prime, 25 not prime, 26 not prime, 27 not prime, 
28 not prime, 29 prime, 30 not prime

With a generator of divisors and early termination:
3 prime, 4 not prime, 5 prime, 6 not prime, 7 prime, 8 not prime, 9 not prime, 
10 not prime, 11 prime, 12 not prime, 13 prime, 14 not prime, 15 not prime, 
16 not prime, 17 prime, 18 not prime, 19 prime, 20 not prime, 21 not prime, 
22 not prime, 23 prime, 24 not prime, 25 not prime, 26 not prime, 27 not prime, 
28 not prime, 29 prime, 30 not prime

You can see that all three functions (is_prime_2 to 4) give the same results. 
(I commented out the call to the naive function version, is_prime_1, after 
a few runs (not shown), so none of these outputs shows its results, but they 
are the same as the others, except for minor formatting differences, due to 
the slightly different output statements used.)

I also timed the program for finding primes up to 1000 and 10,000 
(using my own simple command timing program written in Python - not shown).

Command: python -O is_prime.py 3 1000
Time taken: 2.79 seconds
Return code: 0

Command: python -O is_prime.py 3 10000
Time taken: 66.28 seconds
Return code: 0
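
The timing program is not shown in the post. A minimal sketch of one way to time a command from Python is below, assuming the standard subprocess and time modules; the function name time_command is hypothetical and this is not the author's tool. The demo times a trivial Python command so that it runs anywhere.

```python
import subprocess
import sys
import time

def time_command(args):
    """Run a command (a list of arguments) and return (return_code, seconds)."""
    start = time.time()
    return_code = subprocess.call(args)   # Waits for the command to finish.
    return return_code, time.time() - start

# Demo: time a do-nothing Python process started with the current interpreter.
command = [sys.executable, "-c", "pass"]
rc, elapsed = time_command(command)
print("Command: {}".format(" ".join(command)))
print("Time taken: {:.2f} seconds".format(elapsed))
print("Return code: {}".format(rc))
```

To time the prime checker instead, the command list could be [sys.executable, "-O", "is_prime.py", "3", "1000"].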

Related links:

Q (programming language from Kx Systems)

Kx Systems

Kdb+

Column-oriented DBMS

In-memory database

If you want to try out Q, Kx Systems has a free version available for download for non-commercial use, here: Q downloads

Enjoy.

- Vasudev Ram - Online Python training and consulting



1 comment:

Vasudev Ram said...

Typo. This text fragment:

>If you want to try out Q, Kx Systems a free version available

should read:

If you want to try out Q, Kx Systems has a free version available