Reinforcement Learning

Pathway Intelligence believes that Reinforcement Learning, the sub-field of Machine Learning concerned with intelligent agents learning sequential decision-making, is a watershed technology which will ultimately transform the economy, politics, health care, transportation, education, and most other fields of human endeavour.

With expertise in top existing RL frameworks, a myriad of component technologies that can complement RL, and direct experience with contemporary RL theory and applications, Pathway Intelligence is well positioned at the forefront of this revolution.

Sutton and Barto, Reinforcement Learning: An Introduction

Why Reinforcement Learning?

Reinforcement Learning can combine the best of other kinds of machine learning and AI (unsupervised and supervised learning, including deep learning and Bayesian methods, plus classical algorithms and data structures) to solve difficult problems requiring intelligent behaviour:

learned control
learned intelligent action, co-ordination, planning
learned tactics
learned strategy

AI Agents and Economics

Robin Chauhan, Pathway Intelligence
May 2021

Presented at the VanTech Data Science Lightning Talk Series.

What does the presence of truly intelligent agents (likely to incorporate RL) do to basic assumptions of economics?

Call to action: AI will completely transform the economy, so it's critical that we ensure it serves its ultimate purpose: to increase the well-being of its living participants.

A brief talk drawing from various sources, and my own conjecture.

Amii Tea Time Talks

Robin Chauhan, Pathway Intelligence
August 2020

I presented some of my reinforcement learning work at the Amii Tea Time series of talks at the Alberta Machine Intelligence Institute at the University of Alberta.

Notes on Deep RL, Self-Play, AlphaZero and DQN

Presented to the Vancouver Kaggle group, April 30, 2020, 6pm.

Robin Chauhan, Pathway Intelligence
April 2020

Part of a presentation given by myself and Alexey Iskrov to the Vancouver Kaggle group on the Kaggle ConnectX competition.

I drew from a wide variety of primary and secondary sources, as well as personal experience training systems based on both algorithms (AlphaZero and DQN), to answer:

How does AlphaZero work, and why was it designed that way?
How does it differ from “traditional” Q-learning?
What is self-play?
How does Atari differ from Go?
How complex are board games?
How do the properties of these problems relate to the design decisions in the algorithms?

Once again, I’ve never seen these issues addressed in quite the way I wished, so this is the presentation I always wanted on the subject.

Reinforcement Learning for a Digital Health Product

Oct 2019 - present

Pathway is pleased to help develop a tailored reinforcement learning approach for an established digital health product at a leading US digital health company.

We work closely with their internal data scientists to help optimize health outcomes with deep reinforcement learning.

Past phases of this project included problem framing, exploration design, simulation, agent prototyping and benchmarking.

Future phases will involve further simulation, modelling, agent design, and off-policy learning with reinforcement learning agents and other machine learning approaches.

TalkRL: Reinforcement Learning Podcast

Aug 2019 - present

Pathway is pleased to sponsor TalkRL Podcast.

TalkRL Podcast is all Reinforcement Learning, all the time. In-depth interviews with brilliant people at the forefront of RL research and practice.

Pommitron: Pommerman NeurIPS RL Competition Agent

Robin Chauhan, Pathway Intelligence
Nov 2018

Submission to the 2018 NeurIPS Pommerman Deep Reinforcement Learning Competition. I used TensorFlow, Keras, and a modified version of OpenAI Baselines to tackle this challenging problem. The network design was largely based on a design from Ross Wightman.

For this project I explored many RL frameworks, gained strong experience in practical deep learning performance, throughput, and real-time AI, and produced original solutions combining supervised learning, model-based RL, and model-free RL. The agent design includes at least two aspects that may end up as research papers (specifically the approach to model-based RL in a multi-agent setting), although writing papers was not the original intent.

For this RL agent I opted for an ambitious design combining supervised learning, model-based RL, model-free RL, and self-play, with unique innovations.

I attended the competition at NeurIPS 2018 in Montreal, where I met with and learned from top researchers and practitioners in reinforcement learning from around the world.

OpenAI Five Dota 2 Agent Teardown

Robin Chauhan, Pathway Intelligence
Sept 2018

Simon Fraser University VentureLabs in Harbourfront Center, Vancouver BC Canada

Part of a set of lightning talks. My talk starts at about 58:00 below; the others are worth checking out.

I present a combination of OpenAI’s content plus original analysis and explanations. Studying OpenAI Five has been influential on how I look at agent design.

A technical, biased, incomplete analysis of OpenAI's awesome Five Agent design.

Intro to Reinforcement Learning + Deep Q-Networks

Robin Chauhan, Pathway Intelligence
Jun 14, 2018
HiVE, Gastown, Vancouver BC Canada

These two talks combined for 3 hours of content, designed for a technical audience not familiar with Reinforcement Learning.

After a broad overview of RL, I focused on the fundamentals of Q-learning, and then Deep Q-Networks, the seminal research from DeepMind that ignited the field of Deep Reinforcement Learning. I then explained Rainbow DQN, the state of the art in value-function-based agents at the time of writing.

I combine well-cited authoritative sources, the best content from research papers and the internet, my own original explanations and simplifying diagrams, plus insight based on my own work with RL systems. I couldn’t find the exact RL fundamentals talk I always wanted, so I made it!

We introduce RL concepts, methods, and applications. We look at tools and frameworks for posing and solving RL problems, including OpenAI Gym. We then more closely examine Q-learning and Deep Q-Networks, a popular contemporary deep reinforcement learning algorithm. We aim to share the best insights from the top researchers in a lucid and entertaining way.
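
For readers who want a concrete picture of the Q-learning update these talks build on, here is a minimal sketch in plain Python. It is not taken from the talk materials: the tiny chain environment, the function name `q_learning_chain`, and every hyperparameter value are assumptions chosen purely for illustration. The only part that is standard is the update rule itself, Q(s, a) ← Q(s, a) + α [r + γ max Q(s', ·) − Q(s, a)].

```python
import random


def q_learning_chain(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain.

    The agent starts in state 0. Action 0 moves left (blocked at state 0),
    action 1 moves right. Reaching the last state gives reward 1 and ends
    the episode. Returns the learned table q[state][action].
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    terminal = n_states - 1

    for _ in range(episodes):
        state = 0
        while state != terminal:
            # Epsilon-greedy action selection; ties are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == terminal else 0.0

            # Q-learning target: reward plus discounted best next value.
            # The terminal row is never updated, so its max stays 0.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state

    return q


if __name__ == "__main__":
    for s, (left, right) in enumerate(q_learning_chain()):
        print(f"state {s}: left={left:.3f} right={right:.3f}")
```

Deep Q-Networks keep the same target but replace the table with a neural network trained on replayed transitions; that part is deliberately left out of this sketch.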