Biological Particle Identification and Tracking with Jay Newby – TWiML Talk #179

In today’s episode we’re joined by Jay Newby, Assistant Professor in the Department of Mathematical and Statistical Sciences at the University of Alberta. Jay joins us to discuss his work applying deep learning to biology, including his paper “Deep neural networks automate detection for tracking…

15 Statistical Hypothesis Tests in Python (Cheat Sheet)

Quick-reference guide to the 15 statistical hypothesis tests that you need in applied machine learning, with sample code in Python. Although there are hundreds of statistical hypothesis tests that you could use, there is only a small subset that you may need to use in a…

How to Code the Student’s t-Test from Scratch in Python

Perhaps one of the most widely used statistical hypothesis tests is the Student’s t test. Because you may use this test yourself someday, it is important to have a deep understanding of how the test works. As a developer, this understanding is best achieved by…

How to Calculate McNemar’s Test to Compare Two Machine Learning Classifiers

The choice of a statistical hypothesis test is a challenging open problem for interpreting machine learning results. In his widely cited 1998 paper, Thomas Dietterich recommended McNemar’s test in those cases where it is expensive or impractical to train multiple copies of classifier models….

A Gentle Introduction to Statistical Power and Power Analysis in Python

The statistical power of a hypothesis test is the probability of detecting an effect, if there is a true effect present to detect. Power can be calculated and reported for a completed experiment to comment on the confidence one might have in the conclusions drawn…
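As a rough illustration of that definition, the sketch below estimates power for a two-sided comparison of two equal-sized groups using a normal approximation. The function name `approx_power_two_sample` and the example inputs are assumptions made for illustration, not code from the post, and an exact calculation would use the noncentral t distribution instead.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_sample(effect_size, n_per_group, alpha=0.05):
    """Normal-approximation power for a two-sided, two-sample test.

    effect_size is a standardized mean difference (Cohen's d);
    both groups are assumed to have n_per_group observations.
    """
    std_normal = NormalDist()
    # Critical value for a two-sided test at significance level alpha.
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # Expected value of the test statistic when the effect is real.
    shift = effect_size * sqrt(n_per_group / 2.0)
    # Probability that the statistic lands in either rejection region.
    upper = 1.0 - std_normal.cdf(z_crit - shift)
    lower = std_normal.cdf(-z_crit - shift)
    return upper + lower


# Hypothetical inputs: a large effect (d = 0.8) with 25 samples per group.
power = approx_power_two_sample(effect_size=0.8, n_per_group=25)
print(f"approximate power: {power:.3f}")
```

With an effect size of zero the function returns alpha, which matches the idea that the test then rejects only by chance.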

A Gentle Introduction to Effect Size Measures in Python

Statistical hypothesis tests report on the likelihood of the observed results given an assumption, such as no association between variables or no difference between groups. Hypothesis tests do not comment on the size of the effect if the association or difference is statistically significant. This…
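One common effect size measure for a difference between two groups is Cohen's d, the difference in means divided by the pooled standard deviation. The sketch below is a minimal standard-library version; the function name `cohens_d` and the sample data are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(sample_a, sample_b):
    """Cohen's d: difference in means scaled by the pooled standard deviation."""
    n_a, n_b = len(sample_a), len(sample_b)
    # statistics.variance uses the sample (n - 1) denominator.
    pooled_var = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / (n_a + n_b - 2)
    return (mean(sample_a) - mean(sample_b)) / sqrt(pooled_var)


# Hypothetical measurements for two groups.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.2, 4.8, 4.4, 5.0, 4.6]
d = cohens_d(group_a, group_b)
print(f"Cohen's d: {d:.2f}")
```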

Statistics for Evaluating Machine Learning Models

Tom Mitchell’s classic 1997 book “Machine Learning” provides a chapter dedicated to statistical methods for evaluating machine learning models. Statistics provides an important set of tools used at each step of a machine learning project. A practitioner cannot effectively evaluate the skill of a machine…

Critical Values for Statistical Hypothesis Testing and How to Calculate Them in Python

It is common, if not standard, to interpret the results of statistical hypothesis tests using a p-value. Not all implementations of statistical tests return p-values. In some cases, you must use alternatives, such as critical values. In addition, critical values are used when estimating the…
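To make the idea of a critical value concrete, the sketch below computes one for a standard normal (z) test statistic with Python's standard library and compares a statistic against it. The helper name `z_critical` and the example statistic are assumptions for illustration; other distributions, such as Student's t or chi-squared, would need a library like SciPy.

```python
from statistics import NormalDist


def z_critical(alpha=0.05, two_sided=True):
    """Critical value of the standard normal distribution for a given alpha."""
    # A two-sided test splits alpha evenly between the two tails.
    tail = alpha / 2.0 if two_sided else alpha
    return NormalDist().inv_cdf(1.0 - tail)


crit = z_critical(0.05)
statistic = 2.3  # hypothetical test statistic
reject = abs(statistic) >= crit
print(f"critical value: {crit:.3f}, reject null hypothesis: {reject}")
```

Here the decision is made by comparing the statistic to the threshold directly, without computing a p-value.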

A Gentle Introduction to Estimation Statistics for Machine Learning

Statistical hypothesis tests can be used to indicate whether the difference between two samples is due to random chance, but cannot comment on the size of the difference. A group of methods referred to as “new statistics” are seeing increased use instead of or in…

A Gentle Introduction to k-fold Cross-Validation

Cross-validation is a statistical method used to estimate the skill of machine learning models. It is commonly used in applied machine learning to compare and select a model for a given predictive modeling problem because it is easy to understand, easy to implement, and results…
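The splitting step behind k-fold cross-validation can be sketched with the standard library alone: shuffle the sample indices, divide them into k folds, and hold out each fold once as the test set. The generator name `k_fold_indices` and its parameters are hypothetical, and in practice a library routine such as scikit-learn's `KFold` would typically be used.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


splits = list(k_fold_indices(n_samples=10, k=3))
for fold_number, (train_idx, test_idx) in enumerate(splits, start=1):
    print(f"fold {fold_number}: train={len(train_idx)} test={len(test_idx)}")
```

Each sample appears in exactly one test fold, so a model trained and scored on every split is evaluated once on each observation.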