Fundamental concepts in supervised machine learning

This is a memo to share what I have learnt in Machine Learning with Tree-Based Models (using Python), capturing the learning objectives as well as my personal notes. The course is taught by Elie Kawerk from DataCamp.

Photo by Keith Jonson on Unsplash

Decision trees are supervised learning models used for problems involving classification and regression.


A beginner’s guide to the basic concepts of Apache Airflow

This is a memo to share what I have learnt in Apache Airflow, capturing the learning objectives as well as my personal notes. The course is taught by Mike Metzger from DataCamp.

Photo by Jacek Dylag on Unsplash

A data engineer’s job includes writing scripts, adding complex CRON tasks, and trying various ways to meet an…


Multiple Linear Regression, R², Adjusted R², MSE, p-value

Statistics and coding are fundamentally important in the data science field. Since a lot of data science work is carried out with code, I would highly recommend learning statistics with a heavy focus on coding, preferably in Python or R.
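As a rough illustration of the metrics named in the title, here is a minimal sketch in plain Python using their standard textbook definitions. The function names and the sample values are hypothetical, chosen only for illustration, and are not taken from the article itself.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Adjusted R^2 penalises R^2 for the number of predictors used."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Made-up observations and predictions, for illustration only.
y_true = [3.0, 5.0, 7.0, 9.0, 11.0]
y_pred = [2.5, 5.5, 7.0, 8.5, 11.5]

r2 = r_squared(y_true, y_pred)
print(mse(y_true, y_pred))
print(r2)
# Assumes the predictions came from a model with 2 predictors.
print(adjusted_r_squared(r2, n_samples=len(y_true), n_predictors=2))
```

Adjusted R² is never larger than R², and the gap widens as predictors are added without a matching gain in fit.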

Photo by Michael Dziedzic on Unsplash

In my previous article, I shared how to…


When your ecommerce business grows

Photo by Mark König on Unsplash

If your ecommerce business is progressing to the Cloud, you need to be familiar with these three main types of cloud computing:

  • IaaS — Infrastructure as a Service
  • PaaS — Platform as a Service
  • SaaS — Software as a Service

These are all experiencing a surge in popularity as more…


Machine Learning from labelled data to make predictions

This is a tutorial to share what I have learnt in Supervised Learning with scikit-learn, capturing the learning objectives as well as my personal notes. The course is taught by Hugo Bowne-Anderson from DataCamp.

Photo by Andy Kelly on Unsplash

Is a particular email spam?
Will a tumor be benign or malignant? …


Continue to speak the statistical language of your data

Previous tutorial: Statistical Thinking in Python (Part 1)

This is a tutorial to share what I have learnt in Statistical Thinking in Python (Part 2), capturing the learning objectives as well as my personal notes. The course is taught by Justin Bois from DataCamp.

Photo by ThisisEngineering RAEng on Unsplash

To build the probabilistic mindset and…


Speak the statistical language of your data

This is a tutorial to share what I have learnt in Statistical Thinking in Python (Part 1), capturing the learning objectives as well as my personal notes. The course is taught by Justin Bois from DataCamp, and it includes 4 chapters.

Photo by Chris Liverani on Unsplash

The end goal of gathering data is to make…


How feature extraction techniques can reduce dimensionality

This is a tutorial to share what I have learnt in Dimensionality Reduction in Python, capturing the learning objectives as well as my personal notes. The course is taught by Jeroen Boeye from DataCamp, and it includes 4 chapters.

Photo by Aditya Chinchure on Unsplash

High-dimensional datasets have high complexity and can be computationally expensive to…


Using Tableau version 2020.x onwards, with World Bank GDP data prepared in Python 3

Bar chart race in action (music added): https://youtu.be/QQ9dw7gpbIM

Bar chart races have become very popular recently. At the beginning of 2020, Tableau released version 2020.x with a new Animations feature for dynamic parameters. This means that the bar chart race below can now be built easily in 6 minutes.

https://public.tableau.com/profile/blackraven#!/vizhome/Top10CountriesHistoricalGDPByYear/Top10CountriesHistoricalGDPByYear

Black_Raven (James Ng)

perpetual student, fitness enthusiast, passionate explorer https://www.linkedin.com/in/jnyh/
