Machine Learning Demo
Summary The purpose of this demo is just to illustrate the power of Machine Learning and compare it with the traditional programming models – not teach machine learning.
The purpose of these demos is to just illustrate the power or Machine Learning. Some of these demos are solved behind the scenes using Deep Learning ( which is a specialization in Machine Learning that uses a technology called Neural Networks ) which we will not be learning in this course. However, most of the fundamental principles remain the same
Google vision is based on Google’s proprietary image recognition algorithms and can easily identify all the objects in an image.Google
Housing Price prediction
Predicting house prices is a daily job for real estate consultants. There are a whole lot of parameters that are used in determining the price of a house – like
- Number of rooms
# Ⓒ Ajaytech - https://ajaytech.co # # This program is not intended to teach you Machine Learning. # Instead it is intended to show what Machine Learning can solve # that conventional programming cannot solve import numpy as np import pandas as pd import matplotlib.pyplot as plt import seaborn as sns import sklearn from sklearn.model_selection import train_test_split from sklearn.linear_model import LinearRegression # Import the Boston Housing dataset from sklearn.datasets import load_boston boston = load_boston() # Convert the boston object into a dataframe df_boston = pd.DataFrame(boston.data) # Set the columns of the data frame df_boston.columns = boston.feature_names # Set the target price in the data frame df_boston['PRICE'] = boston.target # Set the predictors and response variables # Number of rooms ( RM ) is the predictor predictor = pd.DataFrame(df_boston['RM']) # The price of the house ( PRICE ) is the response variable response = df_boston['PRICE'] # Fit the model lm = LinearRegression().fit(predictor, response) # Slope and intercept slope = lm.coef_ intercept = lm.intercept_ # get the line points rm_sorted = np.array(df_boston['RM'] ) rm_sorted.sort() line = rm_sorted * slope + intercept # Predict how good the fit is based on the test data predict = lm.predict(predictor) plt.scatter(df_boston['RM'],df_boston['PRICE'], marker='.') plt.title(" Number of rooms vs Price") plt.xlabel("Number of Rooms") plt.ylabel("Price") plt.plot ( rm_sorted, line, 'r') plt.show()
If you run this file you should be able to see a prediction of the house prices based on just 1 parameter – Number of rooms. For example, here is a quick snapshot of some of the rows of data.
CRIM ZN INDUS CHAS NOX RM ... RAD TAX PTRATIO B LSTAT PRICE 0 0.00632 18.0 2.31 0.0 0.538 6.575 ... 1.0 296.0 15.3 396.90 4.98 24.0 1 0.02731 0.0 7.07 0.0 0.469 6.421 ... 2.0 242.0 17.8 396.90 9.14 21.6 2 0.02729 0.0 7.07 0.0 0.469 7.185 ... 2.0 242.0 17.8 392.83 4.03 34.7 3 0.03237 0.0 2.18 0.0 0.458 6.998 ... 3.0 222.0 18.7 394.63 2.94 33.4 4 0.06905 0.0 2.18 0.0 0.458 7.147 ... 3.0 222.0 18.7 396.90 5.33 36.2
As you can see, there are so many columns here and the last column is the house price in thousands of dollars. This is data from the 1970’s. So the actual prices are very low compared to modern day prices. However, we can still learn a lot from this simple dataset.
Here is a plot of the number of rooms vs the price of the house.
As you can see from the plot, there is no relationship between the number of rooms and the price that we can identify. It seems random. However, if we were to approximate the relationship, we can see 2 things.
- As the number of rooms go up, the price of the house goes up too.
- The relationship seems linear in nature
Under these assumptions, we can run the program to determine a straight line that approximates the relationship between the number of rooms and the house price. This is how that prediction would look like.
The red line is what Machine Learning has predicted. Of course it does not account for all the data points – it is just an approximation. However, it is an intelligent approximation. It is not just random line. There is a bit of math behind it.
How did ML solve the problem ? In this case, ML uses a statistical method called Linear Regression that approximates a linear correlation between the predictors and response variables. The predictor being the number of rooms and the response being the house price. We are not limited to just one predictor – we can use any or all of the predictors that we have seen above – crime rate, pollution, taxes etc.
The point I want you to take away is that fundamentally this is a fuzzy problem to solve using traditional IF ELSE computing model. In this specific case, we are taking the help of statistics to solve the problem using approximations. Statistics just happens to be a major tool used in the field of ML. However, ML is not limited to using statistics. The key being, data. As long as there is an algorithm that can use existing data to predict new data points, the field of ML is more than happy to use it. While the goal of traditional statistics is to summarize and analyze existing data, the goal of machine learning is to learn from data.