Machine Learning Demo

Machine Learning Demo


  Machine Learning in Python

Summary The purpose of this demo is just to illustrate the power of Machine Learning and compare it with the traditional programming models – not teach machine learning.

Contents

The purpose of these demos is to just illustrate the power or Machine Learning. Some of these demos are solved behind the scenes using Deep Learning ( which is a specialization in Machine Learning that uses a technology called Neural Networks ) which we will not be learning in this course. However, most of the fundamental principles remain the same

Google Vision

Google vision is based on Google’s proprietary image recognition algorithms and can easily identify all the objects in an image.Google

Quickdraw

Google Quickdraw is an experimental project hosted by Google that recognizes pictures based on your doodles. You can view many more such AI experiments here.

Housing Price prediction

Predicting house prices is a daily job for real estate consultants. There are a whole lot of parameters that are used in determining the price of a house – like

  • Number of rooms
  • pollution
  • traffic
  • crime
  • etc
# Ⓒ Ajaytech - https://ajaytech.co
#
# This program is not intended to teach you Machine Learning. 
# Instead it is intended to show what Machine Learning can solve 
# that conventional programming cannot solve

import numpy                as np
import pandas               as pd
import matplotlib.pyplot    as plt
import seaborn              as sns
import sklearn
from   sklearn.model_selection      import train_test_split
from   sklearn.linear_model         import LinearRegression

# Import the Boston Housing dataset
from sklearn.datasets import load_boston
boston = load_boston()

# Convert the boston object into a dataframe
df_boston = pd.DataFrame(boston.data)

# Set the columns of the data frame
df_boston.columns = boston.feature_names

# Set the target price in the data frame
df_boston['PRICE'] = boston.target

# Set the predictors and response variables

# Number of rooms ( RM ) is the predictor 
predictor = pd.DataFrame(df_boston['RM'])

# The price of the house ( PRICE ) is the response variable
response   = df_boston['PRICE']

# Fit the model
lm = LinearRegression().fit(predictor, response)


# Slope and intercept
slope       = lm.coef_
intercept   = lm.intercept_

# get the line points
rm_sorted = np.array(df_boston['RM'] )
rm_sorted.sort()
line = rm_sorted * slope[0] + intercept

# Predict how good the fit is based on the test data
predict = lm.predict(predictor)

plt.scatter(df_boston['RM'],df_boston['PRICE'], marker='.')
plt.title(" Number of rooms vs Price")
plt.xlabel("Number of Rooms")
plt.ylabel("Price")
plt.plot ( rm_sorted, line, 'r')
plt.show()

If you run this file you should be able to see a prediction of the house prices based on just 1 parameter – Number of rooms. For example, here is a quick snapshot of some of the rows of data.

      CRIM    ZN  INDUS   CHAS  NOX     RM  ...    RAD    TAX    PTRATIO      B  LSTAT  PRICE
0  0.00632  18.0   2.31   0.0  0.538  6.575  ...    1.0  296.0     15.3  396.90   4.98   24.0
1  0.02731   0.0   7.07   0.0  0.469  6.421  ...    2.0  242.0     17.8  396.90   9.14   21.6
2  0.02729   0.0   7.07   0.0  0.469  7.185  ...    2.0  242.0     17.8  392.83   4.03   34.7
3  0.03237   0.0   2.18   0.0  0.458  6.998  ...    3.0  222.0     18.7  394.63   2.94   33.4
4  0.06905   0.0   2.18   0.0  0.458  7.147  ...    3.0  222.0     18.7  396.90   5.33   36.2

As you can see, there are so many columns here and the last column is the house price in thousands of dollars. This is data from the 1970’s. So the actual prices are very low compared to modern day prices. However, we can still learn a lot from this simple dataset.

Here is a plot of the number of rooms vs the price of the house.

Number of Rooms vs Price
Number of Rooms vs Price

As you can see from the plot, there is no relationship between the number of rooms and the price that we can identify. It seems random. However, if we were to approximate the relationship, we can see 2 things.

  • As the number of rooms go up, the price of the house goes up too.
  • The relationship seems linear in nature

Under these assumptions, we can run the program to determine a straight line that approximates the relationship between the number of rooms and the house price. This is how that prediction would look like.

Number of Rooms vs Price
Number of Rooms vs Price

The red line is what Machine Learning has predicted. Of course it does not account for all the data points – it is just an approximation. However, it is an intelligent approximation. It is not just random line. There is a bit of math behind it.

How did ML solve the problem ? In this case, ML uses a statistical method called Linear Regression that approximates a linear correlation between the predictors and response variables. The predictor being the number of rooms and the response being the house price. We are not limited to just one predictor – we can use any or all of the predictors that we have seen above – crime rate, pollution, taxes etc.

The point I want you to take away is that fundamentally this is a fuzzy problem to solve using traditional IF ELSE computing model. In this specific case, we are taking the help of statistics to solve the problem using approximations. Statistics just happens to be a major tool used in the field of ML. However, ML is not limited to using statistics. The key being, data. As long as there is an algorithm that can use existing data to predict new data points, the field of ML is more than happy to use it. While the goal of traditional statistics is to summarize and analyze existing data, the goal of machine learning is to learn from data.

%d bloggers like this: