(1) How to compute the cost function for univariate/multivariate linear regression;
(2) How to compute batch gradient descent for univariate/multivariate linear regression;
(3) How to scale features by the mean value and standard deviation;
(4) How to calculate theta with the normal equation;
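For reference, these are the standard formulas the functions below implement (shown in vectorized form; X is the m-by-(n+1) design matrix with a leading column of ones):

\[ h_\theta(x) = \theta^T x, \qquad J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)^2 \]
\[ \text{gradient descent: } \theta := \theta - \frac{\alpha}{m}\, X^T (X\theta - y) \]
\[ \text{feature scaling: } x_j := \frac{x_j - \mu_j}{\sigma_j}, \qquad \text{normal equation: } \theta = (X^T X)^{-1} X^T y \]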
Data1
ex1data1.txt — 97 training examples, one per line, with two comma-separated columns: city population (in 10,000s) and food-truck profit (in $10,000s). The file begins:

6.1101,17.592
5.5277,9.1302
8.5186,13.662
7.0032,11.854
5.8598,6.8233
...

(This is the file loaded by ex1.m via load('ex1data1.txt').)
1. ex1.m
%% Machine Learning Online Class - Exercise 1: Linear Regression

%  Instructions
%  ------------
%
%  This file contains code that helps you get started on the
%  linear exercise. You will need to complete the following functions
%  in this exercise:
%
%     warmUpExercise.m
%     plotData.m
%     gradientDescent.m
%     computeCost.m
%     gradientDescentMulti.m
%     computeCostMulti.m
%     featureNormalize.m
%     normalEqn.m
%
%  For this exercise, you will not need to change any code in this file,
%  or any other files other than those mentioned above.
%
%  x refers to the population size in 10,000s
%  y refers to the profit in $10,000s
%

%% Initialization
clear ; close all; clc

%% ==================== Part 1: Basic Function ====================
% Complete warmUpExercise.m
fprintf('Running warmUpExercise ... \n');
fprintf('5x5 Identity Matrix: \n');
warmUpExercise()

fprintf('Program paused. Press enter to continue.\n');
pause;


%% ======================= Part 2: Plotting =======================
fprintf('Plotting Data ...\n')
data = load('ex1data1.txt');
X = data(:, 1); y = data(:, 2);
m = length(y); % number of training examples

% Plot Data
% Note: You have to complete the code in plotData.m
plotData(X, y);

fprintf('Program paused. Press enter to continue.\n');
pause;

%% =================== Part 3: Gradient descent ===================
fprintf('Running Gradient Descent ...\n')

X = [ones(m, 1), data(:,1)]; % Add a column of ones to x
theta = zeros(2, 1); % initialize fitting parameters

% Some gradient descent settings
iterations = 1500;
alpha = 0.01;

% compute and display initial cost
computeCost(X, y, theta)

% run gradient descent
theta = gradientDescent(X, y, theta, alpha, iterations);

% print theta to screen
fprintf('Theta found by gradient descent: ');
fprintf('%f %f \n', theta(1), theta(2));

% Plot the linear fit
hold on; % keep previous plot visible
plot(X(:,2), X*theta, '-')
legend('Training data', 'Linear regression')
hold off % don't overlay any more plots on this figure

% Predict values for population sizes of 35,000 and 70,000
predict1 = [1, 3.5] * theta;
fprintf('For population = 35,000, we predict a profit of %f\n',...
    predict1*10000);
predict2 = [1, 7] * theta;
fprintf('For population = 70,000, we predict a profit of %f\n',...
    predict2*10000);

fprintf('Program paused. Press enter to continue.\n');
pause;

%% ============= Part 4: Visualizing J(theta_0, theta_1) =============
fprintf('Visualizing J(theta_0, theta_1) ...\n')

% Grid over which we will calculate J
theta0_vals = linspace(-10, 10, 100);
theta1_vals = linspace(-1, 4, 100);

% initialize J_vals to a matrix of 0's
J_vals = zeros(length(theta0_vals), length(theta1_vals));

% Fill out J_vals
for i = 1:length(theta0_vals)
    for j = 1:length(theta1_vals)
        t = [theta0_vals(i); theta1_vals(j)];
        J_vals(i,j) = computeCost(X, y, t);
    end
end


% Because of the way meshgrids work in the surf command, we need to
% transpose J_vals before calling surf, or else the axes will be flipped
J_vals = J_vals';
% Surface plot
figure;
surf(theta0_vals, theta1_vals, J_vals)
xlabel('\theta_0'); ylabel('\theta_1');

% Contour plot
figure;
% Plot J_vals as 20 contours spaced logarithmically between 0.01 and 1000
contour(theta0_vals, theta1_vals, J_vals, logspace(-2, 3, 20))
xlabel('\theta_0'); ylabel('\theta_1');
hold on;
plot(theta(1), theta(2), 'rx', 'MarkerSize', 10, 'LineWidth', 2);
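As a quick sanity check of the univariate pipeline, the snippet below (a sketch, run from the folder containing ex1data1.txt) recomputes the two headline numbers of this part; the expected values are the well-known ones for this exercise, so treat them as approximate:

data = load('ex1data1.txt');
m = size(data, 1);
X = [ones(m, 1), data(:, 1)];
y = data(:, 2);
J0 = computeCost(X, y, zeros(2, 1));                     % should be about 32.07
theta = gradientDescent(X, y, zeros(2, 1), 0.01, 1500);  % should land near [-3.63; 1.17]
fprintf('J(0,0) = %.2f, theta = [%.4f; %.4f]\n', J0, theta(1), theta(2));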
2. warmUpExercise.m
function A = warmUpExercise()
%WARMUPEXERCISE Example function in octave
%   A = WARMUPEXERCISE() is an example function that returns the 5x5 identity matrix

A = [];
% ============= YOUR CODE HERE ==============
% Instructions: Return the 5x5 identity matrix
%               In octave, we return values by defining which variables
%               represent the return values (at the top of the file)
%               and then set them accordingly.
A = eye(5);

% ===========================================

end
3. computeCost.m
function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
%   J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.
hypothesis = X*theta;
J = 1/(2*m)*(sum((hypothesis-y).^2));

% =========================================================================

end
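The sum of squares above can also be written as an inner product; this fully vectorized variant (a sketch, not part of the submitted file) returns the same J:

m = length(y);
err = X * theta - y;        % residual vector h_theta(x) - y
J = (err' * err) / (2*m);   % same value as sum(err.^2)/(2*m)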
4. gradientDescent.m
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %
    hypothesis = X*theta;
    delta = X'*(hypothesis-y);
    theta = theta - alpha/m*delta;

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);

end

end
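A minimal usage sketch (assuming X already carries the leading column of ones): plotting J_history after the call is the quickest way to confirm that alpha is small enough for the cost to decrease steadily.

[theta, J_history] = gradientDescent(X, y, zeros(size(X, 2), 1), 0.01, 1500);
figure;
plot(1:numel(J_history), J_history, '-b', 'LineWidth', 2);
xlabel('Number of iterations'); ylabel('Cost J');
% The curve should fall steadily; if it oscillates or grows, reduce alpha.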
Data2
ex1data2.txt — 47 training examples, one per line, with three comma-separated columns: house size (sq-ft), number of bedrooms, and price (in $). The file begins:

2104,3,399900
1600,3,329900
2400,3,369000
1416,2,232000
3000,4,539900
...

(This is the file loaded by ex1_multi.m via load('ex1data2.txt').)
0. ex1_multi.m
%% Machine Learning Online Class
%  Exercise 1: Linear regression with multiple variables
%
%  Instructions
%  ------------
%
%  This file contains code that helps you get started on the
%  linear regression exercise.
%
%  You will need to complete the following functions in this
%  exercise:
%
%     warmUpExercise.m
%     plotData.m
%     gradientDescent.m
%     computeCost.m
%     gradientDescentMulti.m
%     computeCostMulti.m
%     featureNormalize.m
%     normalEqn.m
%
%  For this part of the exercise, you will need to change some
%  parts of the code below for various experiments (e.g., changing
%  learning rates).
%

%% Initialization

%% ================ Part 1: Feature Normalization ================

%% Clear and Close Figures
clear ; close all; clc

fprintf('Loading data ...\n');

%% Load Data
data = load('ex1data2.txt');
X = data(:, 1:2);
y = data(:, 3);
m = length(y);

% Print out some data points
fprintf('First 10 examples from the dataset: \n');
fprintf(' x = [%.0f %.0f], y = %.0f \n', [X(1:10,:) y(1:10,:)]');

fprintf('Program paused. Press enter to continue.\n');
pause;

% Scale features and set them to zero mean
fprintf('Normalizing Features ...\n');

[X mu sigma] = featureNormalize(X);

% Add intercept term to X
X = [ones(m, 1) X];


%% ================ Part 2: Gradient Descent ================

% ====================== YOUR CODE HERE ======================
% Instructions: We have provided you with the following starter
%               code that runs gradient descent with a particular
%               learning rate (alpha).
%
%               Your task is to first make sure that your functions -
%               computeCost and gradientDescent already work with
%               this starter code and support multiple variables.
%
%               After that, try running gradient descent with
%               different values of alpha and see which one gives
%               you the best result.
%
%               Finally, you should complete the code at the end
%               to predict the price of a 1650 sq-ft, 3 br house.
%
% Hint: By using the 'hold on' command, you can plot multiple
%       graphs on the same figure.
%
% Hint: At prediction, make sure you do the same feature normalization.
%

fprintf('Running gradient descent ...\n');

% Choose some alpha value
alpha = 0.01;
num_iters = 400;

% Init Theta and Run Gradient Descent
theta = zeros(3, 1);
[theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters);

% Plot the convergence graph
figure;
plot(1:numel(J_history), J_history, '-b', 'LineWidth', 2);
xlabel('Number of iterations');
ylabel('Cost J');

% Display gradient descent's result
fprintf('Theta computed from gradient descent: \n');
fprintf(' %f \n', theta);
fprintf('\n');

% Estimate the price of a 1650 sq-ft, 3 br house
% ====================== YOUR CODE HERE ======================
% Recall that the first column of X is all-ones. Thus, it does
% not need to be normalized.
price = 0; % You should change this


% ============================================================

fprintf(['Predicted price of a 1650 sq-ft, 3 br house ' ...
         '(using gradient descent):\n $%f\n'], price);

fprintf('Program paused. Press enter to continue.\n');
pause;

%% ================ Part 3: Normal Equations ================

fprintf('Solving with normal equations...\n');

% ====================== YOUR CODE HERE ======================
% Instructions: The following code computes the closed form
%               solution for linear regression using the normal
%               equations. You should complete the code in
%               normalEqn.m
%
%               After doing so, you should complete this code
%               to predict the price of a 1650 sq-ft, 3 br house.
%

%% Load Data
data = csvread('ex1data2.txt');
X = data(:, 1:2);
y = data(:, 3);
m = length(y);

% Add intercept term to X
X = [ones(m, 1) X];

% Calculate the parameters from the normal equation
theta = normalEqn(X, y);

% Display normal equation's result
fprintf('Theta computed from the normal equations: \n');
fprintf(' %f \n', theta);
fprintf('\n');


% Estimate the price of a 1650 sq-ft, 3 br house
% ====================== YOUR CODE HERE ======================
price = 0; % You should change this


% ============================================================

fprintf(['Predicted price of a 1650 sq-ft, 3 br house ' ...
         '(using normal equations):\n $%f\n'], price);
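One way to fill in the two price placeholders above (a sketch; the essential point is that the gradient-descent prediction must normalize the new example with the same mu and sigma returned by featureNormalize, while the normal-equation prediction uses the raw features):

% In the gradient-descent section (theta was learned on normalized features):
x_house = ([1650 3] - mu) ./ sigma;   % apply the training-set mu and sigma
price = [1, x_house] * theta;

% In the normal-equations section (theta was learned on raw features):
price = [1, 1650, 3] * theta;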
1. featureNormalize.m
function [X_norm, mu, sigma] = featureNormalize(X)
%FEATURENORMALIZE Normalizes the features in X
%   FEATURENORMALIZE(X) returns a normalized version of X where
%   the mean value of each feature is 0 and the standard deviation
%   is 1. This is often a good preprocessing step to do when
%   working with learning algorithms.

% You need to set these values correctly
X_norm = X;
mu = zeros(1, size(X, 2));
sigma = zeros(1, size(X, 2));

% ====================== YOUR CODE HERE ======================
% Instructions: First, for each feature dimension, compute the mean
%               of the feature and subtract it from the dataset,
%               storing the mean value in mu. Next, compute the
%               standard deviation of each feature and divide
%               each feature by its standard deviation, storing
%               the standard deviation in sigma.
%
%               Note that X is a matrix where each column is a
%               feature and each row is an example. You need
%               to perform the normalization separately for
%               each feature.
%
% Hint: You might find the 'mean' and 'std' functions useful.
%
mu = mean(X);
sigma = std(X);
X_norm = (X_norm - mu) ./ sigma;   % mu and sigma broadcast across the rows

% ============================================================

end
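A quick check that the normalization behaves as intended (a sketch, assuming ex1data2.txt is on the path):

data = load('ex1data2.txt');
[X_norm, mu, sigma] = featureNormalize(data(:, 1:2));
disp(mean(X_norm));   % each column mean should be ~0
disp(std(X_norm));    % each column standard deviation should be ~1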
2. computeCostMulti.m
function J = computeCostMulti(X, y, theta)
%COMPUTECOSTMULTI Compute cost for linear regression with multiple variables
%   J = COMPUTECOSTMULTI(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.
hypothesis = X*theta;
J = 1/(2*m)*(sum((hypothesis-y).^2));

% =========================================================================

end
3. gradientDescentMulti.m
function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters)
%GRADIENTDESCENTMULTI Performs gradient descent to learn theta
%   theta = GRADIENTDESCENTMULTI(x, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCostMulti) and gradient here.
    %
    hypothesis = X*theta;
    delta = X'*(hypothesis-y);
    theta = theta - alpha/m*delta;

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCostMulti(X, y, theta);

end

end
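ex1_multi.m asks you to experiment with the learning rate; below is a sketch of how one might overlay several convergence curves (it assumes the normalized X with the bias column and y from Part 1 are already in the workspace):

alphas = [0.3, 0.1, 0.03, 0.01];
styles = {'-b', '-r', '-g', '-k'};
figure; hold on;
for k = 1:numel(alphas)
    [~, J_hist] = gradientDescentMulti(X, y, zeros(3, 1), alphas(k), 50);
    plot(1:numel(J_hist), J_hist, styles{k}, 'LineWidth', 2);
end
xlabel('Number of iterations'); ylabel('Cost J');
legend('alpha = 0.3', 'alpha = 0.1', 'alpha = 0.03', 'alpha = 0.01');
hold off;
% Larger rates converge in fewer iterations here; a diverging curve means alpha is too large.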
4. normalEqn.m
function [theta] = normalEqn(X, y)
%NORMALEQN Computes the closed-form solution to linear regression
%   NORMALEQN(X,y) computes the closed-form solution to linear
%   regression using the normal equations.

theta = zeros(size(X, 2), 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Complete the code to compute the closed form solution
%               to linear regression and put the result in theta.
%

% ---------------------- Sample Solution ----------------------

theta = pinv(X'*X)*X'*y;

% -------------------------------------------------------------


% ============================================================

end
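A small end-to-end check (a sketch; the commonly quoted answer for the 1650 sq-ft, 3-bedroom house in this exercise is roughly $293,081, so treat that number as approximate). Note that the normal equation needs no feature scaling and no learning rate, but pinv(X'*X) costs O(n^3), so gradient descent scales better when the number of features is large.

data = load('ex1data2.txt');
X = [ones(size(data, 1), 1), data(:, 1:2)];
y = data(:, 3);
theta = normalEqn(X, y);
price = [1, 1650, 3] * theta;   % expected to come out near $293,081
fprintf('Normal-equation prediction: $%.2f\n', price);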