(1) How to compute the cost function for univariate/multivariate linear regression;
(2) How to compute batch gradient descent for univariate/multivariate linear regression;
(3) How to scale features by the mean value and standard deviation;
(4) How to calculate theta with the normal equation;
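For reference, these are the standard formulas the functions below implement (shown in vectorized form; X is the m-by-(n+1) design matrix with a leading column of ones):

\[ h_\theta(x) = \theta^T x, \qquad J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)^2 \]
\[ \text{gradient descent: } \theta := \theta - \frac{\alpha}{m}\, X^T (X\theta - y) \]
\[ \text{feature scaling: } x_j := \frac{x_j - \mu_j}{\sigma_j}, \qquad \text{normal equation: } \theta = (X^T X)^{-1} X^T y \]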
Data1
ex1data1.txt — 97 training examples, one per line, with two comma-separated columns: city population (in 10,000s) and food-truck profit (in $10,000s). The file begins:

6.1101,17.592
5.5277,9.1302
8.5186,13.662
7.0032,11.854
5.8598,6.8233
...

(This is the file loaded by ex1.m via load('ex1data1.txt').)
1. ex1.m
%% Machine Learning Online Class - Exercise 1: Linear Regression

%  Instructions
%  ------------
%
%  This file contains code that helps you get started on the
%  linear exercise. You will need to complete the following functions
%  in this exercise:
%
%     warmUpExercise.m
%     plotData.m
%     gradientDescent.m
%     computeCost.m
%     gradientDescentMulti.m
%     computeCostMulti.m
%     featureNormalize.m
%     normalEqn.m
%
%  For this exercise, you will not need to change any code in this file,
%  or any other files other than those mentioned above.
%
%  x refers to the population size in 10,000s
%  y refers to the profit in $10,000s
%

%% Initialization
clear ; close all; clc

%% ==================== Part 1: Basic Function ====================
% Complete warmUpExercise.m
fprintf('Running warmUpExercise ... \n');
fprintf('5x5 Identity Matrix: \n');
warmUpExercise()

fprintf('Program paused. Press enter to continue.\n');
pause;


%% ======================= Part 2: Plotting =======================
fprintf('Plotting Data ...\n')
data = load('ex1data1.txt');
X = data(:, 1); y = data(:, 2);
m = length(y); % number of training examples

% Plot Data
% Note: You have to complete the code in plotData.m
plotData(X, y);

fprintf('Program paused. Press enter to continue.\n');
pause;

%% =================== Part 3: Gradient descent ===================
fprintf('Running Gradient Descent ...\n')

X = [ones(m, 1), data(:,1)]; % Add a column of ones to x
theta = zeros(2, 1); % initialize fitting parameters

% Some gradient descent settings
iterations = 1500;
alpha = 0.01;

% compute and display initial cost
computeCost(X, y, theta)

% run gradient descent
theta = gradientDescent(X, y, theta, alpha, iterations);

% print theta to screen
fprintf('Theta found by gradient descent: ');
fprintf('%f %f \n', theta(1), theta(2));

% Plot the linear fit
hold on; % keep previous plot visible
plot(X(:,2), X*theta, '-')
legend('Training data', 'Linear regression')
hold off % don't overlay any more plots on this figure

% Predict values for population sizes of 35,000 and 70,000
predict1 = [1, 3.5] * theta;
fprintf('For population = 35,000, we predict a profit of %f\n',...
    predict1*10000);
predict2 = [1, 7] * theta;
fprintf('For population = 70,000, we predict a profit of %f\n',...
    predict2*10000);

fprintf('Program paused. Press enter to continue.\n');
pause;

%% ============= Part 4: Visualizing J(theta_0, theta_1) =============
fprintf('Visualizing J(theta_0, theta_1) ...\n')

% Grid over which we will calculate J
theta0_vals = linspace(-10, 10, 100);
theta1_vals = linspace(-1, 4, 100);

% initialize J_vals to a matrix of 0's
J_vals = zeros(length(theta0_vals), length(theta1_vals));

% Fill out J_vals
for i = 1:length(theta0_vals)
    for j = 1:length(theta1_vals)
        t = [theta0_vals(i); theta1_vals(j)];
        J_vals(i,j) = computeCost(X, y, t);
    end
end


% Because of the way meshgrids work in the surf command, we need to
% transpose J_vals before calling surf, or else the axes will be flipped
J_vals = J_vals';
% Surface plot
figure;
surf(theta0_vals, theta1_vals, J_vals)
xlabel('\theta_0'); ylabel('\theta_1');

% Contour plot
figure;
% Plot J_vals as 20 contours spaced logarithmically between 0.01 and 1000
contour(theta0_vals, theta1_vals, J_vals, logspace(-2, 3, 20))
xlabel('\theta_0'); ylabel('\theta_1');
hold on;
plot(theta(1), theta(2), 'rx', 'MarkerSize', 10, 'LineWidth', 2);
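As a quick sanity check of the univariate pipeline, the snippet below (a sketch, run from the folder containing ex1data1.txt) recomputes the two headline numbers of this part; the expected values are the well-known ones for this exercise, so treat them as approximate:

data = load('ex1data1.txt');
m = size(data, 1);
X = [ones(m, 1), data(:, 1)];
y = data(:, 2);
J0 = computeCost(X, y, zeros(2, 1));                     % should be about 32.07
theta = gradientDescent(X, y, zeros(2, 1), 0.01, 1500);  % should land near [-3.63; 1.17]
fprintf('J(0,0) = %.2f, theta = [%.4f; %.4f]\n', J0, theta(1), theta(2));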
2. warmUpExercise.m
function A = warmUpExercise()
%WARMUPEXERCISE Example function in octave
%   A = WARMUPEXERCISE() is an example function that returns the 5x5 identity matrix

A = [];
% ============= YOUR CODE HERE ==============
% Instructions: Return the 5x5 identity matrix
%               In octave, we return values by defining which variables
%               represent the return values (at the top of the file)
%               and then set them accordingly.
A = eye(5);

% ===========================================

end
3. computeCost.m
function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
%   J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.
hypothesis = X*theta;
J = 1/(2*m)*(sum((hypothesis-y).^2));

% =========================================================================

end
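The sum of squares above can also be written as an inner product; this fully vectorized variant (a sketch, not part of the submitted file) returns the same J:

m = length(y);
err = X * theta - y;        % residual vector h_theta(x) - y
J = (err' * err) / (2*m);   % same value as sum(err.^2)/(2*m)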
4. gradientDescent.m
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %
    hypothesis = X*theta;
    delta = X'*(hypothesis-y);
    theta = theta - alpha/m*delta;

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);

end

end
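A minimal usage sketch (assuming X already carries the leading column of ones): plotting J_history after the call is the quickest way to confirm that alpha is small enough for the cost to decrease steadily.

[theta, J_history] = gradientDescent(X, y, zeros(size(X, 2), 1), 0.01, 1500);
figure;
plot(1:numel(J_history), J_history, '-b', 'LineWidth', 2);
xlabel('Number of iterations'); ylabel('Cost J');
% The curve should fall steadily; if it oscillates or grows, reduce alpha.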
Data2
ex1data2.txt — 47 training examples, one per line, with three comma-separated columns: house size (sq-ft), number of bedrooms, and price (in $). The file begins:

2104,3,399900
1600,3,329900
2400,3,369000
1416,2,232000
3000,4,539900
...

(This is the file loaded by ex1_multi.m via load('ex1data2.txt').)
0. ex1_multi.m
%% Machine Learning Online Class
%  Exercise 1: Linear regression with multiple variables
%
%  Instructions
%  ------------
%
%  This file contains code that helps you get started on the
%  linear regression exercise.
%
%  You will need to complete the following functions in this
%  exercise:
%
%     warmUpExercise.m
%     plotData.m
%     gradientDescent.m
%     computeCost.m
%     gradientDescentMulti.m
%     computeCostMulti.m
%     featureNormalize.m
%     normalEqn.m
%
%  For this part of the exercise, you will need to change some
%  parts of the code below for various experiments (e.g., changing
%  learning rates).
%

%% Initialization

%% ================ Part 1: Feature Normalization ================

%% Clear and Close Figures
clear ; close all; clc

fprintf('Loading data ...\n');

%% Load Data
data = load('ex1data2.txt');
X = data(:, 1:2);
y = data(:, 3);
m = length(y);

% Print out some data points
fprintf('First 10 examples from the dataset: \n');
fprintf(' x = [%.0f %.0f], y = %.0f \n', [X(1:10,:) y(1:10,:)]');

fprintf('Program paused. Press enter to continue.\n');
pause;

% Scale features and set them to zero mean
fprintf('Normalizing Features ...\n');

[X mu sigma] = featureNormalize(X);

% Add intercept term to X
X = [ones(m, 1) X];


%% ================ Part 2: Gradient Descent ================

% ====================== YOUR CODE HERE ======================
% Instructions: We have provided you with the following starter
%               code that runs gradient descent with a particular
%               learning rate (alpha).
%
%               Your task is to first make sure that your functions -
%               computeCost and gradientDescent already work with
%               this starter code and support multiple variables.
%
%               After that, try running gradient descent with
%               different values of alpha and see which one gives
%               you the best result.
%
%               Finally, you should complete the code at the end
%               to predict the price of a 1650 sq-ft, 3 br house.
%
% Hint: By using the 'hold on' command, you can plot multiple
%       graphs on the same figure.
%
% Hint: At prediction, make sure you do the same feature normalization.
%

fprintf('Running gradient descent ...\n');

% Choose some alpha value
alpha = 0.01;
num_iters = 400;

% Init Theta and Run Gradient Descent
theta = zeros(3, 1);
[theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters);

% Plot the convergence graph
figure;
plot(1:numel(J_history), J_history, '-b', 'LineWidth', 2);
xlabel('Number of iterations');
ylabel('Cost J');

% Display gradient descent's result
fprintf('Theta computed from gradient descent: \n');
fprintf(' %f \n', theta);
fprintf('\n');

% Estimate the price of a 1650 sq-ft, 3 br house
% ====================== YOUR CODE HERE ======================
% Recall that the first column of X is all-ones. Thus, it does
% not need to be normalized.
price = 0; % You should change this


% ============================================================

fprintf(['Predicted price of a 1650 sq-ft, 3 br house ' ...
         '(using gradient descent):\n $%f\n'], price);

fprintf('Program paused. Press enter to continue.\n');
pause;

%% ================ Part 3: Normal Equations ================

fprintf('Solving with normal equations...\n');

% ====================== YOUR CODE HERE ======================
% Instructions: The following code computes the closed form
%               solution for linear regression using the normal
%               equations. You should complete the code in
%               normalEqn.m
%
%               After doing so, you should complete this code
%               to predict the price of a 1650 sq-ft, 3 br house.
%

%% Load Data
data = csvread('ex1data2.txt');
X = data(:, 1:2);
y = data(:, 3);
m = length(y);

% Add intercept term to X
X = [ones(m, 1) X];

% Calculate the parameters from the normal equation
theta = normalEqn(X, y);

% Display normal equation's result
fprintf('Theta computed from the normal equations: \n');
fprintf(' %f \n', theta);
fprintf('\n');


% Estimate the price of a 1650 sq-ft, 3 br house
% ====================== YOUR CODE HERE ======================
price = 0; % You should change this


% ============================================================

fprintf(['Predicted price of a 1650 sq-ft, 3 br house ' ...
         '(using normal equations):\n $%f\n'], price);
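One way to fill in the two price placeholders above (a sketch; the essential point is that the gradient-descent prediction must normalize the new example with the same mu and sigma returned by featureNormalize, while the normal-equation prediction uses the raw features):

% In the gradient-descent section (theta was learned on normalized features):
x_house = ([1650 3] - mu) ./ sigma;   % apply the training-set mu and sigma
price = [1, x_house] * theta;

% In the normal-equations section (theta was learned on raw features):
price = [1, 1650, 3] * theta;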
1. featureNormalize.m
function [X_norm, mu, sigma] = featureNormalize(X)
%FEATURENORMALIZE Normalizes the features in X
%   FEATURENORMALIZE(X) returns a normalized version of X where
%   the mean value of each feature is 0 and the standard deviation
%   is 1. This is often a good preprocessing step to do when
%   working with learning algorithms.

% You need to set these values correctly
X_norm = X;
mu = zeros(1, size(X, 2));
sigma = zeros(1, size(X, 2));

% ====================== YOUR CODE HERE ======================
% Instructions: First, for each feature dimension, compute the mean
%               of the feature and subtract it from the dataset,
%               storing the mean value in mu. Next, compute the
%               standard deviation of each feature and divide
%               each feature by its standard deviation, storing
%               the standard deviation in sigma.
%
%               Note that X is a matrix where each column is a
%               feature and each row is an example. You need
%               to perform the normalization separately for
%               each feature.
%
% Hint: You might find the 'mean' and 'std' functions useful.
%
mu = mean(X);
sigma = std(X);
X_norm = (X_norm - mu) ./ sigma;   % mu and sigma broadcast across the rows

% ============================================================

end
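A quick check that the normalization behaves as intended (a sketch, assuming ex1data2.txt is on the path):

data = load('ex1data2.txt');
[X_norm, mu, sigma] = featureNormalize(data(:, 1:2));
disp(mean(X_norm));   % each column mean should be ~0
disp(std(X_norm));    % each column standard deviation should be ~1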
2. computeCostMulti.m
function J = computeCostMulti(X, y, theta)
%COMPUTECOSTMULTI Compute cost for linear regression with multiple variables
%   J = COMPUTECOSTMULTI(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.
hypothesis = X*theta;
J = 1/(2*m)*(sum((hypothesis-y).^2));

% =========================================================================

end
3. gradientDescentMulti.m
function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters)
%GRADIENTDESCENTMULTI Performs gradient descent to learn theta
%   theta = GRADIENTDESCENTMULTI(x, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCostMulti) and gradient here.
    %
    hypothesis = X*theta;
    delta = X'*(hypothesis-y);
    theta = theta - alpha/m*delta;

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCostMulti(X, y, theta);

end

end
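ex1_multi.m asks you to experiment with the learning rate; below is a sketch of how one might overlay several convergence curves (it assumes the normalized X with the bias column and y from Part 1 are already in the workspace):

alphas = [0.3, 0.1, 0.03, 0.01];
styles = {'-b', '-r', '-g', '-k'};
figure; hold on;
for k = 1:numel(alphas)
    [~, J_hist] = gradientDescentMulti(X, y, zeros(3, 1), alphas(k), 50);
    plot(1:numel(J_hist), J_hist, styles{k}, 'LineWidth', 2);
end
xlabel('Number of iterations'); ylabel('Cost J');
legend('alpha = 0.3', 'alpha = 0.1', 'alpha = 0.03', 'alpha = 0.01');
hold off;
% Larger rates converge in fewer iterations here; a diverging curve means alpha is too large.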
4. normalEqn.m
function [theta] = normalEqn(X, y)
%NORMALEQN Computes the closed-form solution to linear regression
%   NORMALEQN(X,y) computes the closed-form solution to linear
%   regression using the normal equations.

theta = zeros(size(X, 2), 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Complete the code to compute the closed form solution
%               to linear regression and put the result in theta.
%

% ---------------------- Sample Solution ----------------------

theta = pinv(X'*X)*X'*y;

% -------------------------------------------------------------


% ============================================================

end
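A small end-to-end check (a sketch; the commonly quoted answer for the 1650 sq-ft, 3-bedroom house in this exercise is roughly $293,081, so treat that number as approximate). Note that the normal equation needs no feature scaling and no learning rate, but pinv(X'*X) costs O(n^3), so gradient descent scales better when the number of features is large.

data = load('ex1data2.txt');
X = [ones(size(data, 1), 1), data(:, 1:2)];
y = data(:, 3);
theta = normalEqn(X, y);
price = [1, 1650, 3] * theta;   % expected to come out near $293,081
fprintf('Normal-equation prediction: $%.2f\n', price);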