Deep reinforcement learning has been a rising field in the last few years. A good place to start is the value-based method, where the state (or state-action) values are learned. In this post, we provide a comprehensive review focused on Q-learning and its extensions.


A Short Introduction to Reinforcement Learning (RL)

There are three common types of machine learning approaches: 1) supervised learning, where a learning system learns a latent mapping from labeled examples, 2) unsupervised learning, where a learning system builds a model of the data distribution from unlabeled examples, and 3) reinforcement learning, where a decision-making system is trained to make optimal decisions. From the designer’s point of view, every kind of learning is supervised in some sense: the source of supervision must be defined by humans, and one way to do this is through the loss function.


Machine Learning

The random forest model is considered one of the most promising ML ensemble models and has recently become highly popular. In this post, we review the latest trends in random forests.


Ensemble Models-Intro

An ensemble takes multiple learning models and combines them to obtain a more powerful model. Combining different models into an ensemble tends to generalize better to unseen data, reducing the chance of overfitting. A random forest is an example of an ensemble model, in which multiple decision trees are combined. As this post covers the latest trends in random forests, it is assumed the reader has a background in decision trees (if not, please refer to decision-trees-in-machine-learning, a great post by Prashant Gupta).
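The idea of combining several models can be illustrated with a minimal majority-vote sketch. The three rule-based "models" below are hypothetical stand-ins for trained decision trees, and plain majority voting is only one possible way to combine them; it is not meant as the random forest algorithm itself.

```python
from collections import Counter


def majority_vote(models, x):
    """Return the label predicted by most of the models for input x."""
    votes = [model(x) for model in models]
    # most_common(1) gives [(label, count)] for the most frequent label.
    return Counter(votes).most_common(1)[0][0]


# Hypothetical weak classifiers on a 2-D input; each returns 0 or 1.
models = [
    lambda x: int(x[0] > 0.5),         # looks only at the first feature
    lambda x: int(x[1] > 0.5),         # looks only at the second feature
    lambda x: int(x[0] + x[1] > 1.0),  # looks at the sum of both features
]

prediction = majority_vote(models, (0.9, 0.2))
```

With an odd number of binary voters there are no ties, so the combined prediction is always well defined.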

Random Forest-Background

The random forest was introduced by Leo Breiman [1] in 2001. The motivation lies in…


The Kalman filter is one of the most influential ideas used in Engineering, Economics, and Computer Science for real-time applications. This year marks 60 years since its original publication. This post is the first in the series “Kalman filter celebrates 60”.


I first came across the Kalman filter during my undergraduate studies, when I took the navigation systems class. It was the last lecture, and the professor said it was outside the course syllabus, but that anyone who would deal with real-time applications should expect to meet it again. He was right. I went on to study for a master’s degree in the field of Guidance, Control, and Navigation (GCN) at the Aerospace Engineering Faculty of the Technion, where I came across the Kalman filter again and used it to filter noisy measurements from various sensors in real-time navigation problems. Later…


Getting Started

A fundamental problem in geometry was solved using a Deep Neural Network (DNN). We learned a geometric property from examples using a supervised learning approach. As the simplest geometric object is a curve, we focused on learning the length of planar curves. To this end, the fundamental length axioms were reconstructed and the ArcLengthNet was established.

Introduction

The calculation of curve length is a major component in many modern and classical problems. For example, handwritten signature analysis involves computing the length along the curve (Ooi et al.). When handling length computation in real-life problems, one faces several constraints, such as additive noise, discretization error, and partial information. In this post, we review our work; a preprint is available online:

https://www.researchgate.net/publication/345435009_Length_Learning_for_Planar_Euclidean_Curves

In this work, we address a fundamental question in the field of geometry, where we aim to reconstruct a basic property using a DNN. The simplest geometric object…


It is very common to use the F1 measure for binary classification. It is the harmonic mean of precision and recall. However, the more generic F_beta score might better evaluate model performance. So, what about F2, F3, and F_beta? In this post, we will review the F measures.

Intro

According to many data scientists, the most reliable model performance measure is accuracy. It is not the only model metric, however; there are many others. Sometimes the accuracy is high, but the number of false negatives (defined below) is also high. Another key measure, common in machine learning these days for evaluating model performance, is the F-measure. It combines the precision and recall measures in a chosen proportion. In this post, we explore variants in which the two are weighted unequally.
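As a minimal sketch of how precision and recall are combined, the function below computes the standard F_beta score, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), from confusion-matrix counts. The function name and the example counts are illustrative assumptions, not taken from the post.

```python
def f_beta(tp, fp, fn, beta=1.0):
    """F_beta score from true positives, false positives, false negatives.

    beta > 1 weights recall more heavily; beta < 1 favors precision.
    Returns 0.0 when the score is undefined (no positive predictions or labels).
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative counts: 8 true positives, 2 false positives, 4 false negatives.
f1 = f_beta(8, 2, 4, beta=1.0)  # harmonic mean of precision and recall
f2 = f_beta(8, 2, 4, beta=2.0)  # recall counts more than precision
```

Here recall (8/12) is lower than precision (8/10), so F2, which emphasizes recall, comes out lower than F1.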

Preliminary: Confusion matrix, Precision, and Recall

Confusion matrix (Image by author)

The confusion matrix summarizes the performance of a supervised learning algorithm in ML. It is more…


Intro

The Kalman Filter (KF) is widely used for vehicle navigation tasks, and in particular for vehicle trajectory smoothing. One of the problems in applying the KF to navigation tasks is modeling the vehicle trajectory. For simplicity, it is convenient to choose a Constant Velocity (CV) model or a Constant Acceleration (CA) model for a wide range of tracking problems, where the position derivative is the velocity and the velocity is (nearly) constant (for the CV model). The advantage is that the system remains linear and stable, as one demands of this type of tracking problem. Aside from…


COVID-19 has affected the worldwide economy, politics, education, tourism, and actually EVERYTHING. Many academic papers use the power of Artificial Intelligence to predict trends in various fields affected by COVID-19.

Background

The COVID-19 pandemic has affected the entire world. Many people have lost their jobs, kids stay at home, and the economic crisis is disastrous. The question of “how will the world be after COVID-19” is of high interest. Many futurists predict a different world, where we should rethink public spaces, and believe that the memory of the COVID-19 lockdown will remain for a long time (Del Bello, 2020). Google collects and arranges worldwide data, including the daily new cases and deaths:

Google COVID-19


Hands-on Tutorials

In this post, we deal with exploding and vanishing gradients in time series, and in particular in Recurrent Neural Networks (RNNs), using Truncated Backpropagation Through Time and gradient clipping.
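The two techniques named above can be sketched in plain Python. This is a simplified illustration under stated assumptions: gradients are represented as lists of floats, clipping is done by global L2 norm (one common variant), and truncation is shown only as splitting a sequence into fixed windows within which gradients would be propagated. Both function names are hypothetical.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale all gradient vectors so their joint L2 norm is at most max_norm."""
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    if total_norm <= max_norm:
        return [list(vec) for vec in grads]  # already small enough; copy unchanged
    scale = max_norm / total_norm
    return [[g * scale for g in vec] for vec in grads]


def truncate_sequence(sequence, window):
    """Split a sequence into consecutive chunks of at most `window` steps.

    In truncated BPTT, backpropagation runs only inside each chunk.
    """
    return [sequence[i:i + window] for i in range(0, len(sequence), window)]


clipped = clip_by_global_norm([[3.0, 4.0]], max_norm=1.0)  # norm 5 scaled to 1
chunks = truncate_sequence(list(range(7)), window=3)
```

Clipping keeps the gradient direction and only shrinks its magnitude, which is why it addresses exploding gradients but not vanishing ones.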

Intro

In this post, we focus on deep learning techniques for sequential data. All of us are familiar with this kind of data. For example, text is a sequence of words, and video is a sequence of images. More challenging examples come from the branch of time-series data, such as medical information (heart rate, blood pressure, etc.) or finance (stock price information). The most common deep learning approach for time-series tasks is the Recurrent Neural Network (RNN). The motivation for using an RNN lies in the generalization of the solution with respect to time. As sequences have different…


On October 5, 2020, Python released version 3.9. In this post, we review several amazing features and point out relevant sources for further reading.


Introduction

On Monday, October 5, 2020, Python released a new stable version, 3.9.0. If you are interested in the official release notes, they are available at this link: whatsnew/3.9. In this post, we review the release highlights, new features, new modules, and optimizations, and provide some source code to try in your own environment. We also point to additional reading and implementation sources.
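As a quick taste to try in your own environment, here is a small sketch touching a few Python 3.9 additions: the dict union operator, str.removeprefix and str.removesuffix, built-in generic type hints, and the new graphlib module. The choice of features and the variable names are illustrative; the snippet needs Python 3.9 or later to run.

```python
from graphlib import TopologicalSorter  # new module in Python 3.9

# Dict union operator: values from the right-hand operand win on key clashes.
defaults = {"host": "localhost", "port": 8080}
overrides = {"port": 9090}
merged = defaults | overrides

# New string methods that strip an exact prefix or suffix if present.
name = "test_parser.py".removeprefix("test_").removesuffix(".py")


# Built-in collections can be used directly as generic type hints.
def total(values: list[int]) -> int:
    return sum(values)


# Topological ordering of a dependency graph: each key maps to its predecessors.
order = list(TopologicalSorter({"b": {"a"}, "c": {"b"}}).static_order())
```

The merge leaves both input dictionaries unchanged; the in-place variant is `defaults |= overrides`.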


Reinforcement learning is used in many robotics problems and has a unique mechanism, where rewards are accumulated through actions. But what about the time between these actions?


This post deals with a key parameter that I found to be highly influential: the discount factor. It discusses time-based penalization to achieve better performance, with the discount factor modified accordingly.

I assume that if you have landed on this post, you are already familiar with RL terminology. If that is not the case, I highly recommend these blog posts, which provide a great background, before you continue: Intro1 and Intro2.

What is the role of the discount factor in RL?

The discount factor, 𝛾, is a real value in [0, 1] that determines how much the agent cares about rewards achieved in the past, present, and future. In other words, it relates the…

Barak Or

Founder @ ALMA, PhD Candidate, AI Researcher.
