"The goal is to make at least one full pass over a document, and classify every character into some meaningful
category, with a high level of robustness when faced with bad human editing." Interesting, but the "It" that does all this work sounds more like a human than a computer. If someone had figured out how to make computers "very flexible at reading dates in human formats", we wouldn't have thousands of programmers sifting through legacy code, hunting for places where the year could mean either 1900 or 2000.
Via kottke.org.