Reinforcement learning | Arash Golmohammadi

I recently gave an interview talk. It failed, but it took me on a journey of thought that rewarded me with a neat realization about academia, and possibly about careers in general.

Our societal and educational organizations have evolved in such a way that the evaluation of people has been reduced to a binary decision. Of course, when recruiting, a (binary) decision has to be made. But couldn’t the output of this process be richer? More informative for the applicant?

For whatever set of reasons, this is not so in the majority of cases. This means that to advance in your career you simply must be more “fit” in the genetic-algorithm sense, which always brings a chicken-and-egg problem. No one knows what fit is, but if you succeed, you’re definitely fitter. In such a world, learning in the sense of mismatch minimization is really out of the question, since the target function is hard to find anyway. But even in cases where fitness can be defined, a person doesn’t get any feedback upon rejection, except the rejection itself.

To me, this sounds like we should not really expect to learn in academia (or in a career) in a supervised manner. Instead, we should try to reinforce our learning through environmental cues (rewards/punishments). In other words, self-supervised reinforcement learning is what you need.
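
To make the analogy a bit more concrete, here is a minimal sketch of learning from accept/reject feedback alone. It is only an illustration under made-up assumptions: the function name `epsilon_greedy`, the three "application strategies", and their acceptance probabilities are all hypothetical. The learner is never told which strategy is right (no supervised target); it only sees a 1 (accepted) or a 0 (rejected) and keeps a running estimate of how often each strategy pays off.

```python
import random


def epsilon_greedy(accept_prob, n_trials=5000, epsilon=0.1, seed=0):
    """Learn which strategy works best from binary accept/reject feedback only.

    accept_prob: hypothetical acceptance probability of each strategy,
                 hidden from the learner and used only to simulate the world.
    """
    rng = random.Random(seed)
    n_arms = len(accept_prob)
    counts = [0] * n_arms    # how often each strategy was tried
    values = [0.0] * n_arms  # running estimate of each strategy's acceptance rate

    for _ in range(n_trials):
        if rng.random() < epsilon:
            # Explore: occasionally try a random strategy.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: use the strategy that currently looks best.
            arm = max(range(n_arms), key=lambda a: values[a])

        # The environment answers with a bare accept (1) or reject (0).
        reward = 1 if rng.random() < accept_prob[arm] else 0

        # Incremental mean update of the estimate for the chosen strategy.
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]

    return values, counts


# Assumed, made-up acceptance probabilities for three strategies.
values, counts = epsilon_greedy([0.1, 0.2, 0.6])
best = max(range(len(values)), key=lambda a: values[a])
print("best strategy:", best)
print("estimated acceptance rates:", [round(v, 2) for v in values])
```

The point of the sketch is that a useful preference can still emerge from nothing but rejections and acceptances, as long as you keep trying different things and keep score. It just takes many more trials than it would if someone told you what went wrong.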