Statistics-Based Lorem Ipsum Generator

Several months ago I wondered whether you could detect patterns in a list of words and then use those patterns to generate a new list of fake “words”: a kind of flavored Lorem Ipsum generator, where I could change the feel of the sentences by switching out the data set. I’m sure thousands of developers have done this before me, but it was a fun thought experiment.
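To make the idea concrete, here is a minimal sketch of one way such a generator could work. It is written in Python and is not the PHP script from this post. It assumes the "patterns" are simple letter-to-letter statistics: for each character, record which characters follow it in the word list, then build fake words by walking those transitions at random. The names `build_model`, `fake_word`, and `fake_sentence`, and the `^`/`$` markers, are made up for illustration.

```python
import random
from collections import defaultdict

# Hypothetical markers for "start of word" and "end of word".
START, END = "^", "$"

def build_model(words):
    """Record, for each character, every character that follows it."""
    model = defaultdict(list)
    for word in words:
        if not word:
            continue  # skip empty entries so START never leads straight to END
        chars = [START] + list(word.lower()) + [END]
        for current, following in zip(chars, chars[1:]):
            # Duplicates are kept on purpose: frequent pairs get picked more often.
            model[current].append(following)
    return model

def fake_word(model, rng, max_len=12):
    """Walk the transitions from START until END is drawn or max_len is hit."""
    out = []
    current = START
    while len(out) < max_len:
        current = rng.choice(model[current])
        if current == END:
            break
        out.append(current)
    return "".join(out)

def fake_sentence(model, rng, n_words=8):
    """Join several fake words into a capitalized sentence with a period."""
    text = " ".join(fake_word(model, rng) for _ in range(n_words))
    return text[0].upper() + text[1:] + "."

if __name__ == "__main__":
    sample = ["lorem", "ipsum", "dolor", "sit", "amet", "consectetur"]
    model = build_model(sample)
    print(fake_sentence(model, random.Random()))
```

Swapping the `sample` list for a different word list changes the flavor of the output, since every letter pair in a generated word comes from the source words. Looking back two or three letters instead of one would probably give more convincing words, at the cost of needing a larger data set.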

I threw together a little script in PHP to test the idea, and the results were interesting enough that I figured I'd put it on my site. Here are some examples in action; you can hit reload for a random ipsum! The code and a sample dataset are below.

Faux English Demo
Faux Latin-y English Demo
Faux Japanese Demo

You need to supply a set of words for it to base the patterns on. Here is my sample dataset.

