Tuesday, August 30, 2011

Delving into Theory!

Disclaimer: theory is not my forte, since I did not do a math undergrad. This post may be biased toward what I think theory means in the specific area of Machine Learning.


Do you like that iPod you have? Do your smartphone apps keep you busy and entertained? How much do you use a computer for your daily work? Do you know whom we owe the computer and all these electronic devices to?
Quantum Mechanics (QM)! Materials engineers have to take several courses on QM to create transistors, which electrical engineers then use to create computers (roughly speaking), and computer scientists use these computers to create algorithms and spawn companies like Google, Facebook or Foursquare, creating worldwide connectivity and interaction. All of this because of QM.

As you can see in that example, a pure theory like QM spawned an enormous amount of wealth and applications. We can say the same about Machine Learning. Let's take an example: Support Vector Machines (SVMs).

There are two ways to do research with SVMs: one is to do research on SVMs, and the other is to do research using SVMs. In the latter case, you will probably be an application man. But in the former, you will take care of the deepest construction of the SVM. You'll care not only about how it works, but also about how to improve it: which kind of kernel you have to use and, in the extreme, how to design an entirely new kernel to work with the data you have.

Kernel? What's a kernel? It's safe to assume that some people who use SVMs have no idea what a kernel is, or what it means. But they do know SVMs are pretty good at classification. SVMs are one of the (few) algorithms you can take off the shelf and run over your data without pre-processing or knowledge of the algorithm... a magic box that classifies.
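To make the "magic box" concrete, here is a minimal sketch of that off-the-shelf use with scikit-learn; the toy iris dataset and the default parameters are illustrative assumptions, not tied to any particular application:

```python
# A minimal sketch of the "magic box" use of an SVM with scikit-learn.
# The iris dataset stands in for whatever data you actually have.
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

clf = SVC()  # off the shelf: default (RBF) kernel, default parameters
clf.fit(X_train, y_train)
print("Test accuracy:", clf.score(X_test, y_test))
```

Fit, score, done, and no idea of what happened inside the box.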

But if you knew what a kernel was, you'd have incredible tools at your disposal. Knowing that, you can now choose the best one (I said choose, not guess) for the data you are using. You can also go as far as trying to increase the classification capabilities of the SVM itself, obtaining incredibly good results. The catch... you'll probably spend more time modifying the SVM than analyzing your data. But any good journal or conference on Learning Theory will care little about the test data and a lot about the algorithm. (Note to self: there is no use mentioning these qualities of your algorithm at an application conference; most of them will not care about the algorithm, only the results.)
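Here is what "choose, not guess" can look like in practice: a minimal sketch, again with scikit-learn on the same toy data, that compares a few standard kernels by cross-validation and keeps the best one. The kernel list and parameter grid are illustrative assumptions:

```python
# Choosing a kernel by cross-validation instead of guessing.
from sklearn import datasets
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = datasets.load_iris(return_X_y=True)

# Candidate kernels and regularization strengths to compare.
param_grid = {"kernel": ["linear", "poly", "rbf"], "C": [0.1, 1.0, 10.0]}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)

print("Best kernel and parameters:", search.best_params_)
print("Cross-validated accuracy:", search.best_score_)
```

A true kernel person would go further and design a kernel around the structure of the data; cross-validating over a menu of standard kernels is just the first step past guessing.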

As you can see, when you do research ON an ML topic, you do not care deeply about the application at hand. Rather, you care about the algorithm's performance, or about the best way to represent specific data: images, characters, sounds, etc.

Doing research in ML theory, as I mentioned in the last post, is deep and hard work. You will rarely delve into a specific application, but you'll know tons of different theories. It's also likely that you'll spend most of your time reading books rather than writing actual programs or simulations. At best, you'll be doing it 50/50.

Note: In a previous post, someone asked me for references for the claim: "While they (the theory man) go over 200-year-old proofs, you (the application man) will go over 10-year-old proofs, and while their proofs are 20 pages long, yours will be about 1 page (on average)." I really do not have time to go over different applications, their theories and proofs. But bear in mind that while most of the proofs in Machine Learning papers fit in a 20-page paper, Wiles's proof of Fermat's Last Theorem, announced in 1993, runs over 100 pages. If you read Bishop's book on Machine Learning (Pattern Recognition and Machine Learning), you'll find most of the basic principles have "easy" proofs of about half a page.


Next time, we'll go over the dreadful task of writing papers: when to do it, and why to do it.

Until next time

Monday, August 15, 2011

The Application Man!!!

Ask any company how much they would pay for software that would let them predict more reliably how their products are going to sell, or how far their competitors can push prices down before losing all profit. Let me tell you... a lot.

Application: definitely the sweet spot of Machine Learning. It is where most of the money is made, and it is what most people will relate to when they hear you do something AI-related. Hardly anyone has heard of PageRank, yet everyone knows Google. The same happens with pretty much every machine learning application out there, or with any other theory; for example, Quantum Mechanics is the basis for most of the modern electronics we have.

So, if you don't really care about how the algorithms work, and just care about applying them for the fame and glory, developing an application might be right for you. That's not to say you must entirely disregard the math, just that the math you'll have to go over won't be as dense as what statisticians have to use. While they go over 200-year-old proofs, you'll go over 10-year-old proofs, and while their proofs are 20 pages long, yours will be about 1 page (on average).

I can't emphasize enough how important it is for you to learn the math behind the applications. Even if you do not know exactly how an algorithm works, it is good to at least have an idea of what it is doing. This way, if you run into errors, you can look for solutions in the right places instead of changing variables and praying for something good to happen. You also get bragging rights for knowing more math than your undergrad peers working as software engineers.

There are many different machine learning applications: ranking, natural language processing, image processing, activity recognition, etc. However, each of these problems has its own difficulties and challenges, and different people may be suited to different applications.

How do you find the application that best suits you? From my personal point of view, follow your passion. If you like dinosaurs, maybe you could apply recognition algorithms to detect structures and patterns in X-ray scans of bones. If, however, you like financial data, there is some work using game theory and probability to increase profits.

It would be crazy to try to list every laboratory that has an application and uses machine learning to solve it. Instead, I'll list some of the most common machine learning algorithms and how you can use them.
Of course this list is not exhaustive, and there are thousands of different algorithms for the different applications; this is just to give you a head start on where to look and which algorithms you may find interesting to go over. I've chosen some of the most recent algorithms used to solve these problems, those published in ICML and NIPS over the last 10 years.
  • Financial Data: Markov Chains, Learning, Regression, Gaussian Processes, SVM
  • Robotics: Kalman Filter, Markov Chain Monte Carlo Methods, Markov Decision Process (Reinforcement Learning), SVM
  • Biology: Network Structure, Clustering, Network Parameters, Dirichlet Processes, Indian Buffet Processes, SVM
  • Vision: Markov Random Fields, Belief Networks, Neural Networks, SVM
  • Natural Language Processing: Conditional Random Fields, Latent Dirichlet Allocation, Mixture Models, SVM
Note: SVMs can solve everything, from a biology inference problem to your dishwasher, and are very good out-of-the-box algorithms.
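To give you a taste of the list, here is a minimal sketch of one entry from the robotics row: a one-dimensional Kalman filter tracking a constant value from noisy measurements. All the numbers are illustrative assumptions:

```python
# A minimal 1-D Kalman filter: estimate a constant from noisy readings.
import numpy as np

rng = np.random.default_rng(0)
true_value = 5.0
measurements = true_value + rng.normal(0.0, 1.0, size=50)  # noisy sensor

x, P = 0.0, 1.0    # state estimate and its variance
Q, R = 1e-5, 1.0   # process and measurement noise variances

for z in measurements:
    P = P + Q            # predict: constant model, uncertainty grows
    K = P / (P + R)      # Kalman gain: how much to trust the measurement
    x = x + K * (z - x)  # update: blend prediction and measurement
    P = (1.0 - K) * P    # shrink the uncertainty accordingly

print("Estimate:", x, "(true value:", true_value, ")")
```

Each of the other entries deserves the same treatment, which is exactly what the upcoming posts are for.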

In future posts we will go over these algorithms and explain them, so if you have a particular interest, stay tuned, because this is just getting interesting.

If you wish to contact me, you can always do so at my Twitter account @leonpalafox, my Google+ account, or my personal webpage www.leonpalafox.com

Take care, and see you next time