Week 1 - Machine Learning - Andrew Ng - Coursera
2021/12/6 23:20:35
week1
Table of Contents
- 1. week 1
- 1.1. intro
- 1.1.1. what is ML?
- 1.1.2. supervised learning
- 1.1.3. unsupervised learning
- 1.1.4. test 1
- 1.2. Linear Regression with One Variable
- 1.2.1. model representation
- 1.2.2. cost function–J(θ)
- 1.2.3. gradient descent
- 1.2.4. gradient descent for linear regression
- 1.3. quiz 1
- 1.4. linear algebra review
- 1.4.1. matrices and vectors
- 1.4.2. addition and scalar multiplication
- 1.4.3. matrix vector multiplication
- 1.4.4. matrix matrix multiplication
- 1.4.5. matrix multiplication properties
- 1.4.6. inverse and transpose
1 week 1
1.1 intro
1.1.1 what is ML?
- definition
- the field of study that gives computers the ability to learn without being explicitly programmed (Arthur Samuel, 1959).
Tom Mitchell (1998), well-posed learning problem: a computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.
Example: playing checkers.
E = the experience of playing many games of checkers
T = the task of playing checkers.
P = the probability that the program will win the next game.
E | experience |
T | task |
P | performance |
- classifications:
- Supervised learning
- Unsupervised learning.
1.1.2 supervised learning
categories:
- regression problem - continuous output
- classification problem- discrete output
- example: email spam/not spam
In a regression problem, we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function. In a classification problem, we are instead trying to predict results in a discrete output. In other words, we are trying to map input variables into discrete categories.
examples of regression
- housing price prediction
Example 2:
(a) Regression - Given a picture of a person, we have to predict their age on the basis of the given picture
(b) Classification - Given a patient with a tumor, we have to predict whether the tumor is malignant or benign.
1.1.3 unsupervised learning
example: google news, clustering
applications:
- organize computing clusters
- social network
- market segmentation
- astronomical data analysis
example 2 cocktail party
Non-clustering: the "Cocktail Party Algorithm" allows you to find structure in a chaotic environment (i.e. identifying individual voices and music from a mesh of sounds).
1.1.4 test 1
I answered the following question incorrectly:
desc: Some of the problems below are best addressed using a supervised learning algorithm, and the others with an unsupervised learning algorithm. Which of the following would you apply supervised learning to? (Select all that apply.) In each case, assume some appropriate dataset is available for your algorithm to learn from.
1.2 Linear Regression with One Variable
1.2.1 model representation
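- m
- number of training examples
- univariate
- one variable
- x(i)
- the i-th element of the input variables (the input of the i-th training example)
- y(i)
- the i-th element of the output variables (the target of the i-th training example)
- h
- hypothesis, h(x) = a + bx; in the θ notation used below, hθ(x) = θ0 + θ1x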
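As a minimal sketch of the hypothesis above, the following Python function evaluates h(x) = a + bx for a few inputs. The function name `hypothesis` and the parameter values are illustrative assumptions, not part of the course material.

```python
def hypothesis(a, b, x):
    """Univariate linear hypothesis h(x) = a + b*x."""
    return a + b * x

# Hypothetical parameters: intercept a = 2.0, slope b = 0.5.
predictions = [hypothesis(2.0, 0.5, x) for x in (0, 2, 4)]
print(predictions)  # [2.0, 3.0, 4.0]
```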
house predicting, regression vs classification
target variable | type |
continuous | regression |
discrete values | classification |
When the target variable that we’re trying to predict is continuous, such as in our housing example, we call the learning problem a regression problem. When y can take on only a small number of discrete values (such as if, given the living area, we wanted to predict if a dwelling is a house or an apartment, say), we call it a classification problem.
Linear regression predicts a real-valued output based on an input value. We discuss the application of linear regression to housing price prediction, present the notion of a cost function, and introduce the gradient descent method for learning.
1.2.2 cost function–J(θ)
Octave code: J = sum((X*theta - y).^2) / (2*m)
An example is the least-squares error (least-squares estimation).
- m
- number of training examples
J(θ) | cost function |
goal: minimize the cost function. We can measure the accuracy of our hypothesis function by using a cost function. This takes an average difference (actually a fancier version of an average) of all the results of the hypothesis with inputs from x's and the actual output y's.
\[ J(\theta_0, \theta_1) = \dfrac{1}{2m} \sum_{i=1}^{m} \left( \hat{y}_i - y_i \right)^2 = \dfrac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x_i) - y_i \right)^2 \]
To break it apart, it is \(\frac{1}{2}\bar{x}\), where \(\bar{x}\) is the mean of the squares of \(h_\theta(x_i) - y_i\), that is, of the differences between the predicted values and the actual values.
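The cost can be sketched in plain Python as follows. This is an illustration under assumptions: the name `compute_cost` and the tiny training set are made up for the example, and the hypothesis is taken to be hθ(x) = θ0 + θ1x.

```python
def compute_cost(theta0, theta1, xs, ys):
    """Squared-error cost J(theta0, theta1) = 1/(2m) * sum((h(x_i) - y_i)^2)."""
    m = len(xs)  # number of training examples
    squared_errors = [(theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)]
    return sum(squared_errors) / (2 * m)

# Hypothetical training set lying exactly on the line y = x.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

perfect_fit = compute_cost(0.0, 1.0, xs, ys)   # the line y = x fits exactly
zero_line = compute_cost(0.0, 0.0, xs, ys)     # every prediction is 0
print(perfect_fit, zero_line)
```

With a perfect fit every error is zero, so the cost is 0; for the all-zero hypothesis the cost is (1 + 4 + 9) / 6.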
This function is otherwise called the "squared error function" or "mean squared error". The mean is halved (\(\frac{1}{2}\)) as a convenience for the computation of gradient descent, as the derivative of the square function cancels out the \(\frac{1}{2}\) term. The following image summarizes what the cost function does:
- contour plot
- a graph that contains many contour lines
- feature
- a contour line of a two-variable function has a constant value at all points on the same line.
1.2.3 gradient descent
goal: minimize the cost function J(θ0, θ1)
Gradient descent is an algorithm for automatically finding the values of θ0 and θ1 that minimize the cost function.
To visualize it, we put θ0 on the x axis and θ1 on the y axis, with the value of the cost function on the vertical z axis.
https://www.hackerearth.com/blog/developers/gradient-descent-algorithm-linear-regression/
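These notes do not write out the update rule, so the standard form of gradient descent is given here for reference. The learning rate symbol \(\alpha\) is not defined elsewhere in the notes and is introduced only for this formula. Both parameters are updated simultaneously:

```latex
\[
\text{repeat until convergence:} \quad
\theta_j := \theta_j - \alpha \, \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)
\qquad \text{for } j = 0 \text{ and } j = 1
\]
```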
1.2.4 gradient descent for linear regression
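This section of the notes is empty, so what follows is only a hedged sketch of batch gradient descent applied to the squared-error cost with hθ(x) = θ0 + θ1x. The function name `gradient_descent`, the learning rate `alpha`, the iteration count and the data are all assumptions chosen for illustration. The partial derivatives of the cost are (1/m) Σ (hθ(xi) − yi) for θ0 and (1/m) Σ (hθ(xi) − yi) xi for θ1.

```python
def gradient_descent(xs, ys, alpha=0.1, iterations=2000):
    """Batch gradient descent for h(x) = theta0 + theta1 * x."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of J = 1/(2m) * sum(errors^2).
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update of both parameters.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1

# Hypothetical data generated from y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
theta0, theta1 = gradient_descent(xs, ys)
print(round(theta0, 3), round(theta1, 3))
```

On this noise-free data the parameters approach θ0 = 1 and θ1 = 2. A learning rate that is too large would make the updates diverge, so `alpha` usually needs tuning for other data.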
1.3 quiz 1
1.4 linear algebra review
This optional module provides a refresher on linear algebra concepts. Basic understanding of linear algebra is necessary for the rest of the course, especially as we begin to cover models with multiple variables.
1.4.1 matrices and vectors
- Aij refers to the element in the ith row and jth column of matrix A.
- A vector with 'n' rows is referred to as an 'n'-dimensional vector.
- vi refers to the element in the ith row of the vector.
- In general, all our vectors and matrices will be 1-indexed. Note that for some programming languages, the arrays are 0-indexed.
- Matrices are usually denoted by uppercase names while vectors are lowercase.
- "Scalar" means that an object is a single value, not a vector or matrix.
- R refers to the set of scalar real numbers.
- Rn refers to the set of n-dimensional vectors of real numbers.
% The ; denotes we are going back to a new row.
A = [1, 2, 3; 4, 5, 6; 7, 8, 9; 10, 11, 12]
A =
    1    2    3
    4    5    6
    7    8    9
   10   11   12

% Get the dimension of the matrix A where m = rows and n = columns
[m, n] = size(A)
m = 4
n = 3

% You could also store it this way
dim_A = size(A)
dim_A =
   4   3

% Let's index into the 2nd row 3rd column of matrix A
A_23 = A(2,3)
A_23 = 6

% Initialize a vector
v = [1; 2; 3]
v =
   1
   2
   3

% Get the dimension of the vector v
dim_v = size(v)
dim_v =
   3   1
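To illustrate the note about 0-indexed languages, here is the same matrix in plain Python, stored as a list of rows. This representation is an assumption made for the sketch; it is not how the course stores matrices.

```python
# A 4 x 3 matrix stored as a list of rows (same values as the Octave example).
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9],
     [10, 11, 12]]

m, n = len(A), len(A[0])   # rows, columns

# Octave's A(2,3) is 1-indexed; Python lists are 0-indexed,
# so the element in the 2nd row, 3rd column is A[1][2].
A_23 = A[2 - 1][3 - 1]

# A 3-dimensional column vector stored as a 3 x 1 matrix.
v = [[1], [2], [3]]
dim_v = (len(v), len(v[0]))

print(m, n, A_23, dim_v)   # 4 3 6 (3, 1)
```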
1.4.2 addition and scalar multiplication
+ | addition |
- | subtraction |
note: To add or subtract two matrices, their dimensions must be the same.
[a b; c d] + [w x; y z] = [a+w b+x; c+y d+z]
[a b; c d] − [w x; y z] = [a−w b−x; c−y d−z]
% Initialize matrix A and B
A = [1, 2, 4; 5, 3, 2]
A =
   1   2   4
   5   3   2
B = [1, 3, 4; 1, 1, 1]

% Initialize constant s
s = 2

% See how element-wise addition works
addAB = A + B

% See how element-wise subtraction works
subAB = A - B

% See how scalar multiplication works
multAs = A * s

% Divide A by s
divAs = A / s

% What happens if we have a Matrix + scalar?
addAs = A + s
addAs =
   3   4   6
   7   5   4
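The same element-wise operations can be sketched in plain Python. The helper names `mat_add` and `scalar_mul` are hypothetical, and matrices are again assumed to be lists of rows.

```python
def mat_add(A, B):
    """Element-wise sum; A and B must have the same dimensions."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrix dimensions must match")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_mul(A, s):
    """Multiply every element of A by the scalar s."""
    return [[a * s for a in row] for row in A]

A = [[1, 2, 4], [5, 3, 2]]
B = [[1, 3, 4], [1, 1, 1]]

print(mat_add(A, B))      # [[2, 5, 8], [6, 4, 3]]
print(scalar_mul(A, 2))   # [[2, 4, 8], [10, 6, 4]]
```

Unlike Octave, this sketch has no matrix + scalar shortcut; adding a scalar to every element would need its own helper.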
1.4.3 matrix vector multiplication
To multiply a matrix by a vector, we map the column of the vector onto each row of the matrix, multiplying each element and summing the result.
[a b; c d; e f] ∗ [x; y] = [a∗x + b∗y; c∗x + d∗y; e∗x + f∗y]
The result is a vector. The number of columns of the matrix must equal the number of rows of the vector.
An m x n matrix multiplied by an n x 1 vector results in an m x 1 vector.
exercise
% Initialize matrix A
A = [1, 2, 3; 4, 5, 6; 7, 8, 9]

% Initialize vector v
v = [1; 1; 1]

% Multiply A * v
Av = A * v
example of an application: house prediction
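A plain-Python sketch of the matrix-vector product, followed by the house prediction idea: each row of the matrix is [1, size], so multiplying by the parameter vector evaluates the hypothesis for every house at once. The helper name `mat_vec`, the house sizes and the parameters −40 and 0.25 are illustrative assumptions.

```python
def mat_vec(A, v):
    """Multiply an m x n matrix A by an n-dimensional vector v."""
    if any(len(row) != len(v) for row in A):
        raise ValueError("columns of A must equal the length of v")
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# The exercise above: with v all ones, each result is the sum of a row of A.
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(mat_vec(A, [1, 1, 1]))          # [6, 15, 24]

# House price prediction with illustrative numbers: h(x) = -40 + 0.25 * x.
sizes = [2104, 1416, 1534, 852]
X = [[1, size] for size in sizes]     # the leading 1 multiplies the intercept
theta = [-40, 0.25]
print(mat_vec(X, theta))              # [486.0, 314.0, 343.5, 173.0]
```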
1.4.4 matrix matrix multiplication
% Initialize a 3 by 2 matrix
A = [1, 2; 3, 4; 5, 6]

% Initialize a 2 by 1 matrix
B = [1; 2]

% We expect a resulting matrix of (3 by 2)*(2 by 1) = (3 by 1)
multAB = A*B

% Make sure you understand why we got that result
example of an application: house prediction
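A matrix-matrix product can be viewed as one matrix-vector product per column of the second matrix. The following plain-Python sketch uses a hypothetical helper `mat_mul` and reuses the matrices from the Octave snippet above.

```python
def mat_mul(A, B):
    """Multiply an m x n matrix A by an n x o matrix B, giving an m x o matrix."""
    if any(len(row) != len(B) for row in A):
        raise ValueError("columns of A must equal rows of B")
    cols_B = list(zip(*B))  # columns of B as tuples
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_B] for row in A]

A = [[1, 2], [3, 4], [5, 6]]   # 3 x 2
B = [[1], [2]]                 # 2 x 1
mult_AB = mat_mul(A, B)        # 3 x 1
print(mult_AB)                 # [[5], [11], [17]]
```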
1.4.5 matrix multiplication properties
- not commutative: in general, A∗B ≠ B∗A
- associative: (A∗B)∗C = A∗(B∗C)
- identity matrix: multiplying a matrix A by an identity matrix I of matching size leaves it unchanged, A∗I = I∗A = A
exercise
% Initialize matrices A and B
A = [1,2;4,5]
B = [1,1;0,2]

% Initialize a 2 by 2 identity matrix
I = eye(2)
% The above notation is the same as I = [1,0;0,1]

% What happens when we multiply I*A ?
IA = I*A

% How about A*I ?
AI = A*I

% Compute A*B
AB = A*B

% Is it equal to B*A?
BA = B*A

% Note that IA = AI but AB != BA
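The three properties can be checked numerically with a short plain-Python sketch. The helper `mat_mul` and the third matrix `C` are hypothetical additions for the example; `A`, `B` and the identity match the Octave exercise.

```python
def mat_mul(A, B):
    """Multiply an m x n matrix A by an n x o matrix B."""
    if any(len(row) != len(B) for row in A):
        raise ValueError("columns of A must equal rows of B")
    cols_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_B] for row in A]

A = [[1, 2], [4, 5]]
B = [[1, 1], [0, 2]]
I = [[1, 0], [0, 1]]
C = [[2, 0], [1, 3]]   # hypothetical third matrix for the associativity check

AB = mat_mul(A, B)
BA = mat_mul(B, A)
print(AB)   # [[1, 5], [4, 14]]
print(BA)   # [[5, 7], [8, 10]]

identity_ok = mat_mul(I, A) == A == mat_mul(A, I)
associative_ok = mat_mul(mat_mul(A, B), C) == mat_mul(A, mat_mul(B, C))
print(identity_ok, associative_ok)   # True True
```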
1.4.6 inverse and transpose
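The inverse of a matrix A is denoted A⁻¹. Multiplying a matrix by its inverse results in the identity matrix: A ∗ A⁻¹ = A⁻¹ ∗ A = I.
A non-square matrix does not have an inverse. Square matrices that don't have an inverse are called singular or degenerate.
In both Octave and MATLAB, inv(A) computes the inverse of a square, non-singular matrix, while pinv(A) computes the pseudoinverse, which is also defined for singular and non-square matrices and equals the inverse whenever the inverse exists.
A = [1 2 3; 4 5 6];
pinv(A)        % pseudoinverse; A is 2 x 3, so inv(A) would raise an error
transpose(A)   % same as A'
Transposing a matrix swaps its rows and columns, so the element in row i, column j of A becomes the element in row j, column i of its transpose. It is like rotating the matrix 90° clockwise and then reversing each row. We can compute the transpose in MATLAB or Octave with the transpose(A) function or with A'.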
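As a hedged illustration in plain Python, the sketch below transposes a matrix and inverts a 2 x 2 matrix with the closed-form formula. This is not what `inv` or `pinv` do internally; the formula is swapped in because it is short. The helper names are hypothetical, and integer entries are assumed so that `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction

def transpose(A):
    """Swap rows and columns: result[j][i] == A[i][j]."""
    return [list(col) for col in zip(*A)]

def inverse_2x2(A):
    """Inverse of a 2 x 2 integer matrix via the closed-form formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular and has no inverse")
    f = Fraction(1, det)
    return [[d * f, -b * f], [-c * f, a * f]]

def mat_mul(A, B):
    """Multiply an m x n matrix A by an n x o matrix B."""
    cols_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_B] for row in A]

print(transpose([[1, 2, 3], [4, 5, 6]]))   # [[1, 4], [2, 5], [3, 6]]

A = [[4, 7], [2, 6]]                       # hypothetical invertible matrix
A_inv = inverse_2x2(A)
product_is_identity = mat_mul(A, A_inv) == [[1, 0], [0, 1]]
print(product_is_identity)                 # True
```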
Created: 2021-12-06 Mon 22:14
Emacs 25.3.1 (Org mode 8.2.10)