Thoughts about the free lunch

This post started when I was reviewing the no free lunch theorem in machine learning. A machine learning model that predicts unseen data with high accuracy should, in some sense, have knowledge about the data. Then what is knowledge? I started to feel that knowledge should be defined in a broader sense than it is in (intro) epistemology textbooks. Namely, it should be closely tied to the creation of intelligent systems, which might be more important than defining what knowledge is in human language.

  • A brief overview of the no free lunch theorem
  • Does learning something indicate having knowledge over it?
  • Gödel’s idea…?
  • What should we do then? (If knowledge is nothing, do the hard tasks at school still make sense?)

No Free Lunch Theorem: a machine learning point of view

If you impose no restrictions on the data-generating process, then, averaged over all possible tasks (labelings), all learning algorithms perform equally well.
Intuitively this makes sense: if you have a model well fitted to a dataset, an adversarially labeled test set can drive its accuracy to zero. However, this intuition is by no means rigorous. A more rigorous formulation [1] states that for every learner there exists a task on which it fails, even though that task can be successfully learned by another learner.
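For reference, here is my paraphrase of the formal statement from the cited textbook [1] (written from memory, so check the book for the exact constants):

```latex
% Paraphrase of the No-Free-Lunch theorem (cf. [1], Theorem 5.1).
\begin{theorem}[No Free Lunch]
Let $A$ be any learning algorithm for binary classification with
respect to the 0--1 loss over a domain $\mathcal{X}$, and let the
training-set size $m$ satisfy $m < |\mathcal{X}|/2$. Then there exists
a distribution $\mathcal{D}$ over $\mathcal{X} \times \{0,1\}$ such that
(1) some labeling function $f$ achieves $L_{\mathcal{D}}(f) = 0$, yet
(2) with probability at least $1/7$ over the draw of the training set
$S \sim \mathcal{D}^m$, it holds that $L_{\mathcal{D}}(A(S)) \ge 1/8$.
\end{theorem}
```

So the failure is not merely on some weird test set: there is a whole distribution on which the learner does badly with constant probability, even though a perfect predictor for that distribution exists.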
Takeaways from this theorem?

  • Any learning algorithm is context-dependent. A model trained on one dataset cannot be expected to perform well on another dataset without some guarantee (e.g., that the two datasets come from the same distribution). However, if the model generalizes well to the other dataset, then it has genuinely learned something common to both (see the sketch after this list).
  • Do not be too frustrated if a model does not work well on a dataset; it might work well on another. On the other hand, it is important to figure out on which datasets your model performs well.
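As a minimal sketch of the first takeaway, on synthetic data (everything here is illustrative and my own invention, not from any benchmark): train on one dataset, then check accuracy on a second dataset drawn from a shifted distribution.

```python
# Minimal sketch: cross-dataset evaluation on synthetic Gaussian data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_dataset(n, shift):
    """Two Gaussian classes in 2D; `shift` moves both class means."""
    X0 = rng.normal(loc=-1.0 + shift, scale=1.0, size=(n, 2))
    X1 = rng.normal(loc=+1.0 + shift, scale=1.0, size=(n, 2))
    X = np.vstack([X0, X1])
    y = np.array([0] * n + [1] * n)
    return X, y

# Dataset A: the training distribution.
X_train, y_train = make_dataset(500, shift=0.0)
# Dataset B: the same task under a mild distribution shift.
X_other, y_other = make_dataset(500, shift=0.5)

model = LogisticRegression().fit(X_train, y_train)
print("accuracy on A:", model.score(X_train, y_train))
print("accuracy on B:", model.score(X_other, y_other))
# If accuracy on B stays high, the model learned structure shared by
# both datasets; the NFL theorem says we cannot expect this for an
# arbitrary dataset B.
```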

Epistemology point of view

This section notes down some attempts to describe knowledge from a natural language point of view. A lot of the information comes from the PHL232 recommended reading [2]. As far as I can tell, people try to give a definition of knowledge using human languages.
A classical analysis of knowledge defines it as justified true belief. Later, people held either belief-first (internalist) or knowledge-first (externalist) points of view. There are also arguments saying that you have to believe a true proposition "in the right way". Either way, knowledge is closely related to humans' subjective perceptual systems.
I don't have an exact definition of knowledge, but I feel that the definition should cover non-human objects and creatures as well. If a robot, for example, can clean the room, then it knows how to clean the room; this, too, could be a form of "knowledge". More specifically, the ability to create intelligent systems should indicate mastery of (at least part of) what knowledge is. In this way, machine learning researchers are actually tackling one of the most interesting problems I have ever encountered: addressing what knowledge is, using pragmatic and generative approaches.

Gödel’s idea…?

A further thought about why the pragmatic and generative approach (instead of the literal approach taken by philosophers) works might be related to Gödel's idea, or more precisely, to my bold extension of it.
Gödel's incompleteness theorems show that no consistent formal system of axioms (rich enough to encode arithmetic) can demonstrate its own consistency, and that such a system contains true statements it cannot prove. (A tricky part comes in here: the Gödel sentence asserts its own non-provability, so if the system is consistent, the sentence is unprovable and therefore true. It is like saying "S1: this sentence is false. S2: you cannot prove S1 true or false." S2 evaluates to true.) In other words, there will be some statements of a system that need external means to settle (prove or refute).
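To make the analogy concrete, here is my informal sketch of the standard construction behind the first theorem, the so-called Gödel sentence:

```latex
% For a consistent formal system F strong enough to encode arithmetic,
% the diagonal lemma yields a sentence G_F such that
G_F \;\leftrightarrow\; \neg\,\mathrm{Prov}_F(\ulcorner G_F \urcorner)
% G_F asserts its own unprovability in F. If F is consistent, then F
% cannot prove G_F; but that is exactly what G_F claims, so G_F is
% true, yet unprovable within F.
```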
My bold extension: there are some statements in a knowledge system that cannot be established using the knowledge inside it. This might or might not be addressed as more knowledge is discovered. The definition of knowledge in human language, in my opinion, is one of the things that cannot even be well defined using purely literal information: recursively asking "what is X" is always (I think) possible, so there is no point descending level by level to clarify what knowledge is (see the section below for what I mean by "no point"). As long as we can build intelligent systems, defining knowledge in human language is just meaningless (given that human languages, or formal logic systems, are limited in capacity). We might need to resort to the prowess of something we cannot fully understand (but are able to control) to address what knowledge is in a pragmatic way.

Pragmatic opinions

This ultimately comes back to talking about a problem that cannot even be clearly defined. In the above "reasoning", I could only describe, not define, the terms. I think it does not matter whether there is an ultimately discoverable truth ("matter" itself needs to be defined, by the way). What matters, however, is how you respond to that. Realizing your values while making the world a better place could be a good metric for evaluating whether something matters. If holding a nihilistic point of view pushes you further from making the world better, it is not a good view to hold.

References


  1. Understanding Machine Learning (Shai Shalev-Shwartz and Shai Ben-David)
  2. Knowledge: A Very Short Introduction (Jennifer Nagel)