Researchers have long drawn inspiration from the human mind when building artificial intelligence (AI), and AI research remains closely tied to the study of the brain. Now, however, AI is teaching us how the human brain learns.
Will Dabney and his colleagues at DeepMind found that a recent development in machine learning called distributional reinforcement learning provides a new explanation of how the reward pathways within the brain work. These pathways govern our responses to pleasurable events and are mediated by neurons that release the brain chemical dopamine.
According to the researchers, “Dopamine works as a surprise signal in the brain. When events turn out to be better than expectations, more dopamine is released in the brain.”
It was previously thought that these dopamine neurons all respond identically, but the team found that they actually vary: each one is tuned to a different level of optimism.
The findings drew inspiration from distributional reinforcement learning (DRL), one of the notable techniques AI has used to master games like Go and StarCraft II. Put simply, reinforcement learning is the idea that a reward reinforces the behaviour that led to its acquisition. For instance, a dog may learn to follow the command “sit” because it receives a pat from its owner when it does so.
“Previously, models of reinforcement learning in both AI and neuroscience focused on learning to predict an ‘average’ future reward,” says Dabney. “But this doesn’t reflect reality as we experience it.”
For example, someone who plays the lottery expects either to win or to lose; they do not expect the halfway, average outcome, which may never actually occur. When the future is uncertain, the possible outcomes can be represented as a probability distribution, some of them positive and some negative, rather than as a single average. AIs that use DRL algorithms can predict the full spectrum of possible rewards and outcomes.
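The lottery example can be made concrete with a short Python sketch. The prize size and win probability below are invented purely for illustration and do not come from the study. The sketch compares the single average reward with the estimated distribution of rewards.

```python
import random
from collections import Counter

# Hypothetical lottery (illustrative numbers, not from the study):
# a ticket pays 100 with probability 0.01 and nothing otherwise.
PRIZE = 100.0
WIN_PROBABILITY = 0.01


def play(rng):
    """Return the reward from one ticket."""
    return PRIZE if rng.random() < WIN_PROBABILITY else 0.0


def summarize(n_draws=100_000, seed=0):
    """Estimate the average reward and the reward distribution by sampling."""
    rng = random.Random(seed)
    outcomes = [play(rng) for _ in range(n_draws)]
    mean_reward = sum(outcomes) / n_draws
    counts = Counter(outcomes)
    # Map each observed reward to the fraction of draws that produced it.
    distribution = {value: count / n_draws for value, count in sorted(counts.items())}
    return mean_reward, distribution


mean_reward, distribution = summarize()
print("average reward:", mean_reward)
print("estimated distribution:", distribution)
```

The average lands close to 1, a reward that no single ticket ever pays, while the distribution shows the two outcomes a player can actually experience and how likely each one is.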
To test whether the brain’s dopamine reward pathways also work via a distribution, the team recorded responses from individual dopamine neurons in mice. The mice were trained to perform a task and were given rewards of varying and unpredictable sizes.
The researchers found that different dopamine cells showed reliably different levels of surprise.
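One simple way to picture units with different levels of optimism is to let each one weigh good and bad surprises differently. The Python sketch below is a toy illustration under that assumption, not the researchers' actual model: the reward sizes, the learning rates and the function name `learn_value` are all hypothetical.

```python
import random


def learn_value(rewards, rate_up, rate_down, n_steps=20_000, seed=0):
    """Learn a reward prediction from repeated random rewards.

    rate_up scales updates after better-than-expected rewards and
    rate_down scales updates after worse-than-expected rewards.
    """
    rng = random.Random(seed)
    value = 0.0
    for _ in range(n_steps):
        reward = rng.choice(rewards)
        error = reward - value  # prediction error, i.e. the "surprise"
        rate = rate_up if error > 0 else rate_down
        value += rate * error
    return value


# Hypothetical reward sizes, each equally likely (average 3.6).
REWARDS = [0.0, 1.0, 2.0, 5.0, 10.0]

pessimist = learn_value(REWARDS, rate_up=0.002, rate_down=0.018)
neutral = learn_value(REWARDS, rate_up=0.010, rate_down=0.010)
optimist = learn_value(REWARDS, rate_up=0.018, rate_down=0.002)

print("pessimistic unit:", round(pessimist, 2))
print("neutral unit:", round(neutral, 2))
print("optimistic unit:", round(optimist, 2))
```

All three units see the same rewards, yet the unit that learns more from good surprises settles on a higher prediction and the one that learns more from bad surprises settles on a lower one. Taken together, a population of such units carries information about the spread of possible rewards, not just their average.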
“Associating rewards to certain stimuli or actions is of critical importance for survival,” says Raul Vicente at the University of Tartu, Estonia. “The brain cannot afford to throw away any valuable information about rewards.”
“At a large scale, the study is in line with the current view that to operate efficiently the brain has to represent not only the average value of a variable but also how often a variable takes different values,” says Vicente. “It is a good example of how computational algorithms can guide us in what to look for in neural responses.” However, the researchers said more work is needed to establish whether the findings apply to other species and to other regions of the brain.