Machine Learning Strategy

October 18th, 2017

In the modern big data era, machine learning strategy plays a significant role in the ultimate fate of a machine learning project. Let's say we have trained a classifier with 90% accuracy on test examples, but that accuracy is not good enough for the application. To improve the classifier’s accuracy, we can try a range of measures, for instance:

collect more data
collect a more diverse training set
train the algorithm for longer with gradient descent
try Adam instead of plain gradient descent
try bigger network, try smaller network
try dropout regularization
try $l_2$ regularization
try different network architectures - activation functions, number of hidden units, etc.

There could be lots of good ideas for improving a deep learning algorithm. The problem is that we might pick the wrong idea and spend months without any substantial improvement in the algorithm's accuracy. For instance, spending months collecting more data is one of the most common wrong choices in machine learning. Therefore, it is worth having a machine learning strategy with which we can evaluate our options and pick the most promising one.

Orthogonalization

Parameter tuning is one of the areas where we have to pick the right parameter to tune first from among many possible ones.

Orthogonalization is about knowing what to tune to achieve one effect - it is a trait of successful machine learning practitioners.

That means we need a separate tuning knob for each required effect, rather than one combined knob that changes several aspects at once. For instance, a machine learning algorithm needs to perform well on the following four fronts, i.e. four different effects:

Training set (fit well on the cost function, to roughly human-level performance)
Dev set
Test set
Real world

According to orthogonalization, we should achieve these effects one at a time and in order - the training set first and the real world last.
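As a rough illustration, the ordering of the four effects can be sketched as a small check that walks through the stages and reports the first one that misses its target, together with the knobs meant for that stage only. Everything here is hypothetical: the `Stage` class, the `first_failing_stage` helper, the error numbers, and the targets are made up for the example. The knobs attached to each stage are assumptions based on common practice, not something this article prescribes.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Stage:
    """One of the four effects, with a hypothetical measured error and target."""
    name: str
    error: float             # measured error at this stage (made-up numbers)
    target: float            # acceptable error at this stage (made-up numbers)
    knobs: Tuple[str, ...]   # knobs assumed to mainly affect this stage


def first_failing_stage(stages: Sequence[Stage]) -> Optional[Stage]:
    """Return the earliest stage whose error exceeds its target, or None."""
    for stage in stages:
        if stage.error > stage.target:
            return stage
    return None


# Stages are listed in the order they should be fixed.
stages = [
    Stage("training set", 0.02, 0.03, ("bigger network", "Adam instead of plain gradient descent")),
    Stage("dev set", 0.10, 0.05, ("regularization", "bigger training set")),
    Stage("test set", 0.11, 0.06, ("bigger dev set",)),
    Stage("real world", 0.15, 0.08, ("change the dev set or the cost function",)),
]

failing = first_failing_stage(stages)
if failing is None:
    print("all stages meet their targets")
else:
    print(f"tune for the {failing.name}: {', '.join(failing.knobs)}")
```

With these made-up numbers the training set already meets its target, so the sketch points at the dev set and leaves the later stages alone until that one is fixed.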

Note: this article is inspired by an Andrew Ng lecture.