Thursday, February 26, 2009

Hacking Up Statistics for Fun and Profit

The more I learn about inference and learning, the more I'm convinced that statistics is the "correct" way to approach them. The point goes beyond statistics, too. For example, if you want to maximize a differentiable function, you take the derivative, set it to 0, and check the second derivative to be sure you have a max and not a min or saddle point.
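To make the derivative recipe concrete, here is a minimal sketch using an example of my own choosing, not one from any particular textbook: estimating a coin's heads probability p from k heads in n flips by maximizing the log-likelihood k log(p) + (n - k) log(1 - p). The function names and the numbers k = 7, n = 10 are just assumptions for illustration.

```python
import math

def log_likelihood(p, k, n):
    # Log-likelihood of heads probability p given k heads in n flips
    # (dropping the binomial coefficient, which does not depend on p).
    return k * math.log(p) + (n - k) * math.log(1 - p)

def first_derivative(p, k, n):
    # d/dp of the log-likelihood.
    return k / p - (n - k) / (1 - p)

def second_derivative(p, k, n):
    # d^2/dp^2 of the log-likelihood; negative whenever 0 < k < n.
    return -k / p**2 - (n - k) / (1 - p) ** 2

# Illustrative data (an assumption, not real measurements).
k, n = 7, 10

# Setting the first derivative to 0 and solving by hand gives p = k / n.
p_hat = k / n

slope_at_p_hat = first_derivative(p_hat, k, n)       # should be ~0
curvature_at_p_hat = second_derivative(p_hat, k, n)  # should be < 0, so a max

print("candidate maximum:", p_hat)
print("slope there:", slope_at_p_hat)
print("curvature there:", curvature_at_p_hat)
```

The slope comes out at zero up to floating-point rounding, and the negative curvature is what rules out a min or a saddle point.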

In contrast, a lot of machine learning is full of hacks that seemed like good ideas at the time: neural networks, decision trees, and so on. Many algorithms seem ad hoc. Now, I'm being somewhat over-the-top in my assessment. Lots of those folks have been more formal than I ever really want to be. But the field is also far removed from traditional techniques at times.

Well, my current conjecture is that if there is a "right" way to do something (and that's a strong statement in algorithms), you should still be able to see how well your "hack" approximates it. Meanwhile, the hack might be fun. And it might still do well enough while being faster or having other desirable properties that the right way lacks.
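As a rough sketch of what "seeing how well your hack approximates it" could look like, here is a toy comparison I made up for illustration. The "right" way is the closed-form maximum-likelihood estimate of a coin's heads probability, k / n. The "hack" just tries a handful of candidate values and keeps the best one. The candidate list and the data (k = 6, n = 10) are assumptions, not anything canonical.

```python
import math

def log_likelihood(p, k, n):
    # Log-likelihood of heads probability p given k heads in n flips.
    return k * math.log(p) + (n - k) * math.log(1 - p)

def exact_estimate(k, n):
    # The "right" way: the closed-form maximizer of the log-likelihood.
    return k / n

def hack_estimate(k, n, candidates=(0.1, 0.3, 0.5, 0.7, 0.9)):
    # The "hack": score a few hand-picked guesses and keep the best.
    return max(candidates, key=lambda p: log_likelihood(p, k, n))

# Illustrative data (an assumption, not real measurements).
k, n = 6, 10

p_exact = exact_estimate(k, n)
p_hack = hack_estimate(k, n)

# How much log-likelihood the hack gives up relative to the exact answer.
gap = log_likelihood(p_exact, k, n) - log_likelihood(p_hack, k, n)

print("exact:", p_exact, "hack:", p_hack, "log-likelihood gap:", gap)
```

The gap can never be negative, since the exact answer is the true maximizer, so it gives a simple yardstick for how much the shortcut costs on a given problem.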

Well, I'm speaking from ignorance anyway. Just some thoughts on my mind.
