A vast amount of literature in machine learning served the purpose of showing the superiority of some learning methods over the others. To support this claim, qualitative considerations and thousands of experimental simulations have been submitted to the scientific community.
If there was a universally best learning machine, research on machine learning would be unnecessary: we would use it all the time. Unfortunately, the theoretical results on this subject are not encouraging . For any number $N$ of samples, there exists a distribution of the input/output samples for which the estimate of generalization error is arbitrarily poor. At the same time, for any learning machine LM1 there exists a data distribution and another learning machine LM2 such that for all $N$, LM2 is better than LM1.
The nonexistence of the universal best learning approach implies that a debate based on experiments would never end and that simulations should never be used to prove the superiority of one method over another.
These considerations warn the designer not to prefer unconditionally a data analysis tool but to take always in consideration a range of possible alternatives. If it is not realistic to find the approximator which works nicely for any kind of experiment, it is however advisable to have an understanding of available methods, their rationale, properties and use.
We would like then to end this manuscript not by suggesting a unique and superior way of proceeding in front of data but by proposing some golden rules for anyone who would like to adventure in the world of statistical modeling and data analysis:
Each approach has its own assumptions! Be aware of them before using it.
Simpler things first!
Reality is probably most of the time nonlinear but a massive amount of (theoretical, algorithmic) results exists only for linear methods.
Expert knowledge MATTERS...
But data too :-)
Better be confident with a number of alternative techniques (preferably linear and nonlinear) and use them in parallel on the same task.
Resampling techniques make few assumptions and appear as powerful nonparametric tools.
Resampling and combing are at the forefront of the data analysis technology. Do not forget to test them when you have a data analysis problem.
Do not be religious about learning/modeling techniques. The best learning algorithm does NOT exist.
Statistical dependency does not imply causality.
The latter point will be discussed in the following section.