Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

In "The Practice of Programming", Brian Kernighan and Rob Pike use the Markov Chain Algorithm to illustrate a number of lessons about software design and implementation. I used it as my "Hello world" program when learning a new programming language.

In Python, the implementation is surprisingly succinct:

    import sys
    import random
    import collections

    MAXGEN = 10000
    NONWORD = ""
    suffixchain = collections.defaultdict(list)

    w1 = w2 = NONWORD
    for line in sys.stdin:
        for word in line.split():
            suffixchain[(w1, w2)].append(word)
            w1, w2 = w2, word
    suffixchain[(w1, w2)].append(NONWORD)

    w1 = w2 = NONWORD
    for i in range(MAXGEN):
        suffixes = suffixchain[(w1, w2)]
        next = random.choice(suffixes)
        if next is NONWORD:
            break
        print next,
        w1, w2 = w2, next
To use it, just run:

  $ python markov.py < english_corpus.txt


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: