Machine Learning: Logistic Regression

2019-11-14 17:44:03

Source: reposted

Contributed by: a site user

Reference: Machine Learning in Action (《机器学习实战》)

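The main idea of classification with Logistic regression:

Build a regression formula for the classification boundary from the existing data, and classify with it.

The Sigmoid function used for classification:

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$

[Figure: plot of the Sigmoid function]

What the Sigmoid function does:

Multiply every feature by a regression coefficient, add up all the products, and feed the sum into the Sigmoid function, which yields a value between 0 and 1. Any sample whose value is greater than 0.5 is assigned to class 1, and any sample whose value is below 0.5 is assigned to class 0.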
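The rule just described can be sketched in a few lines of Python. The weights and sample values below are hypothetical, chosen only to make the arithmetic easy to follow; they do not come from the book's data.

```python
import numpy as np

def sigmoid(z):
    """Map any real number (or array) into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def classify(features, weights):
    """Return 1 if sigmoid(w . x) exceeds 0.5, otherwise 0."""
    z = np.dot(weights, features)  # weighted sum of the features
    return 1 if sigmoid(z) > 0.5 else 0

# Hypothetical weights and samples, for illustration only.
weights = np.array([1.0, -2.0])
label_a = classify(np.array([3.0, 1.0]), weights)  # z = 3 - 2 = 1
label_b = classify(np.array([1.0, 2.0]), weights)  # z = 1 - 4 = -3
print(label_a, label_b)
```

Since the Sigmoid equals 0.5 exactly at z = 0, the threshold at 0.5 is the same as checking the sign of the weighted sum.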

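In summary, the input to the Sigmoid function can be written as z:

$$z = w_0 x_0 + w_1 x_1 + \cdots + w_n x_n = w^T x$$

So the vector w holds the coefficients that we need to find with an optimization method.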

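Solving for the vector w:

1) Gradient ascent (idea: move along the gradient of a function to find its maximum)

Gradient ascent pseudocode:

    initialize every regression coefficient to 1
    repeat R times:
        compute the gradient over the whole data set
        update the coefficient vector by alpha * gradient
    return the regression coefficients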
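As a toy illustration of the idea (not the book's listing), the sketch below runs gradient ascent on an assumed concave function f(w) = -(w0 - 1)^2 - (w1 + 2)^2, whose maximum is at (1, -2). The function, the starting point, and the step size alpha are all assumptions made for the example.

```python
import numpy as np

def gradient(w):
    """Gradient of the toy function f(w) = -(w[0] - 1)**2 - (w[1] + 2)**2."""
    return np.array([-2.0 * (w[0] - 1.0), -2.0 * (w[1] + 2.0)])

w = np.zeros(2)   # arbitrary starting point
alpha = 0.1       # step size
for _ in range(200):
    w = w + alpha * gradient(w)  # step uphill along the gradient

print(w)
```

Each step moves w a small distance in the direction in which f grows fastest, so the iterates approach the maximizer.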

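The code that implements the update of the coefficients w:

Note that the book's dataMatrix here has one more dimension than the actual feature vector. The data used are two-dimensional, [x1, x2], while the program adds a dimension to give [x0=1, x1, x2]. Oddly, x0 is added at the front rather than at the end. In addition, the book does not explain where the formula underlined in red in the figure above comes from. After searching online I found a blog post that explains it fairly well, and I summarize it briefly here.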
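A short aside on the x0 column: whether the constant 1 goes first or last is only a convention, as long as the weight vector is ordered the same way. The sketch below checks this on two made-up samples with hypothetical weights.

```python
import numpy as np

X = np.array([[0.5, 1.5],
              [2.0, -1.0]])           # two samples with features [x1, x2]
ones = np.ones((X.shape[0], 1))

X_first = np.hstack([ones, X])        # rows are [x0=1, x1, x2]
X_last = np.hstack([X, ones])         # rows are [x1, x2, 1]

w_first = np.array([4.0, -1.0, 2.0])  # hypothetical [w0, w1, w2]
w_last = np.array([-1.0, 2.0, 4.0])   # same weights, bias moved to the end

z_first = X_first @ w_first
z_last = X_last @ w_last
print(z_first, z_last)
```

Both layouts give the same z for every sample, so the constant column simply lets the intercept be learned as one more coefficient.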

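The derivation goes as follows (reference: http://blog.csdn.net/yangliuy/article/details/18504921?reload):

The parametric probability model:

$$P(y=1 \mid x;\theta) = h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}, \qquad P(y=0 \mid x;\theta) = 1 - h_\theta(x)$$

where x is the training feature vector, y is the class of x, and θ is the parameter to be estimated.

Because y takes only the values 0 or 1, the two cases combine neatly into:

$$P(y \mid x;\theta) = h_\theta(x)^{y}\,\bigl(1 - h_\theta(x)\bigr)^{1-y}$$

The likelihood function over the m training samples (this is the objective function of Logistic Regression; the book never states it, so without looking up material on logistic regression online or having studied machine learning before, the book's approach is hard to follow):

$$L(\theta) = \prod_{i=1}^{m} h_\theta(x^{(i)})^{y^{(i)}}\,\bigl(1 - h_\theta(x^{(i)})\bigr)^{1-y^{(i)}}$$

The log-likelihood function:

$$\ell(\theta) = \log L(\theta) = \sum_{i=1}^{m} \Bigl[ y^{(i)} \log h_\theta(x^{(i)}) + \bigl(1 - y^{(i)}\bigr) \log\bigl(1 - h_\theta(x^{(i)})\bigr) \Bigr]$$

So the maximum likelihood estimate maximizes ℓ(θ), whose partial derivatives are:

$$\frac{\partial \ell(\theta)}{\partial \theta_j} = \sum_{i=1}^{m} \bigl(y^{(i)} - h_\theta(x^{(i)})\bigr)\, x_j^{(i)}$$

This gives the iterative update formula of gradient ascent, with step size α:

$$\theta_j := \theta_j + \alpha \sum_{i=1}^{m} \bigl(y^{(i)} - h_\theta(x^{(i)})\bigr)\, x_j^{(i)}$$

This is where the formula I underlined in red in the figure above comes from.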
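One way to gain confidence in this result is to compare the closed-form gradient of the log-likelihood, X^T (y - h), with a finite-difference approximation. The sketch below does so on synthetic data; the data, the seed, and the helper names are assumptions made for the check, not part of the book or the referenced post.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(theta, X, y):
    """Sum over samples of y*log(h) + (1 - y)*log(1 - h)."""
    h = sigmoid(X @ theta)
    return np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

def analytic_gradient(theta, X, y):
    """Closed-form gradient of the log-likelihood: X^T (y - h)."""
    return X.T @ (y - sigmoid(X @ theta))

def numeric_gradient(theta, X, y, eps=1e-6):
    """Central finite differences, one coordinate at a time."""
    grad = np.zeros_like(theta)
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = eps
        grad[j] = (log_likelihood(theta + step, X, y)
                   - log_likelihood(theta - step, X, y)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
X = np.hstack([rng.normal(size=(20, 2)), np.ones((20, 1))])  # synthetic features plus constant column
y = (rng.random(20) > 0.5).astype(float)                     # synthetic 0/1 labels
theta = rng.normal(size=3)

g_analytic = analytic_gradient(theta, X, y)
g_numeric = numeric_gradient(theta, X, y)
print(np.max(np.abs(g_analytic - g_numeric)))
```

The two gradients should agree up to the small error of the finite-difference approximation.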

Here is my own code (an unoptimized logistic algorithm). The data still come from the data set supplied with Machine Learning in Action:

# -*- coding: utf-8 -*-
import numpy as np
import matplotlib.pyplot as plt


class Log_REG():
    def __init__(self):
        self._closed = False

    def loadData(self, dataFile='testSet.txt'):
        # Each line holds the feature values followed by the class label.
        with open(dataFile) as f_file:
            lines = f_file.readlines()
        line_data = lines[0].strip().split()
        self.num_feature = len(line_data) - 1
        self.xData = np.zeros((len(lines), self.num_feature + 1))
        self.label = np.zeros((len(lines), 1))
        self.num_label = len(lines)
        for line_cnt, iLine in enumerate(lines):
            line_data = iLine.strip().split()
            for i in range(self.num_feature):
                self.xData[line_cnt][i] = float(line_data[i])
            # Constant 1 stored in the last column (the bias term).
            self.xData[line_cnt][self.num_feature] = 1
            self.label[line_cnt] = float(line_data[-1])

    def _sigmoid(self, z):
        return 1.0 / (1 + np.exp(-z))

    def gradAscendClass(self):
        maxIter = 500
        self.omiga = np.ones((1, self.num_feature + 1))
        alpha = 0.01
        self.omiga_record = []
        for i in range(maxIter):
            # Matrix product; h has shape (number of samples, 1).
            h = self._sigmoid(self.xData @ self.omiga.T)
            error = self.label - h
            self.omiga = self.omiga + alpha * (self.xData.T @ error).T
            self.omiga_record.append(self.omiga)
            if np.sum(np.abs(error)) < self.num_label * 0.05:
                print("error very low", i)
                break

    def stochasticGradAscend(self):
        pass  # placeholder, left unimplemented

    def plotResult(self):
        self._close()
        if self.num_feature != 2:
            print("Only plot data with 2 features!")
            return
        label0x, label0y, label1x, label1y = [], [], [], []
        for i in range(self.num_label):
            if int(self.label[i, 0]) == 1:
                label1x.append(self.xData[i][0])
                label1y.append(self.xData[i][1])
            else:
                label0x.append(self.xData[i][0])
                label0y.append(self.xData[i][1])
        fig = plt.figure()
        ax = fig.add_subplot(111)
        ax.scatter(label0x, label0y, c='b', marker='o')
        ax.scatter(label1x, label1y, c='r', marker='s')

        minx = min(min(label0x), min(label1x))
        maxx = max(max(label0x), max(label1x))
        wx = np.arange(minx, maxx, 0.1)
        # Decision boundary: omiga[0,0]*x + omiga[0,1]*y + omiga[0,2] = 0.
        wy = (-self.omiga[0, 2] - self.omiga[0, 0] * wx) / self.omiga[0, 1]
        ax.plot(wx, wy)

    def plotIteration(self):
        self._close()
        iterTimes = len(self.omiga_record)
        # One curve per coefficient, showing its value at every iteration.
        w0 = [w[0, 0] for w in self.omiga_record]
        w1 = [w[0, 1] for w in self.omiga_record]
        w2 = [w[0, 2] for w in self.omiga_record]
        fig = plt.figure()
        ax1 = fig.add_subplot(3, 1, 1)
        ax1.plot(range(iterTimes), w0, c='b')
        plt.xlabel('w0')
        ax2 = fig.add_subplot(3, 1, 2)
        ax2.plot(range(iterTimes), w1, c='r')
        plt.xlabel('w1')
        ax3 = fig.add_subplot(3, 1, 3)
        ax3.plot(range(iterTimes), w2, c='g')
        plt.xlabel('w2')

    def show(self):
        plt.show()

    def _close(self):
        pass


if __name__ == '__main__':
    testclass = Log_REG()
    testclass.loadData()
    testclass.gradAscendClass()
    testclass.plotResult()
    testclass.plotIteration()
    testclass.show()

Results:

[Figure: classification result]

[Figure: convergence of the classification parameters]

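Improving the gradient ascent (or descent) algorithm:

When the data set is large, the gradient ascent algorithm above has to process every sample on each iteration, which makes the computation extremely expensive. The remedy is to bring in the idea of a stochastic gradient.

The basic principle of stochastic gradient descent is to avoid computing the exact gradient and to use an unbiased estimate g(w) of the gradient in its place:

$$w \leftarrow w - \alpha\, g(w), \qquad \mathbb{E}[g(w)] = \nabla f(w)$$

where f is the objective being minimized and α is the step size. For gradient ascent the sign of the update is reversed.

In practice, a single randomly chosen sample, rather than the whole data set, takes part in each iteration. For the detailed derivation see: http://www.52ml.net/2024.html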
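The unbiasedness claim can be checked numerically for the logistic log-likelihood: if one sample i is drawn uniformly at random, the expected value of m times its single-sample gradient equals the full batch gradient. The sketch below uses synthetic data and an arbitrary weight vector, both assumptions made for the check.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
m = 50
X = np.hstack([rng.normal(size=(m, 2)), np.ones((m, 1))])  # synthetic data
y = (rng.random(m) > 0.5).astype(float)                    # synthetic 0/1 labels
w = np.ones(3)                                             # arbitrary weights

residual = y - sigmoid(X @ w)        # (y_i - h_i) for every sample
full_gradient = X.T @ residual       # batch gradient over all m samples
per_sample = X * residual[:, None]   # row i is the gradient from sample i alone

# Under uniform sampling of i, the expectation of m * per_sample[i]
# is m times the mean of the rows.
expected_estimate = m * per_sample.mean(axis=0)
print(np.allclose(expected_estimate, full_gradient))
```

A single-sample step is therefore noisy but correct on average, which is what lets the stochastic variant trade accuracy per step for much cheaper iterations.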

Improved stochastic gradient ascent:

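    # Method of the Log_REG class; relies on `import numpy as np`.
    def stochasticGradAscend2(self):
        maxIter = 150
        self.omiga = np.ones((1, self.num_feature + 1))
        self.omiga_record = []

        for j in range(maxIter):
            # Indices not yet used in this pass; a list so entries can be deleted.
            randRange = list(range(self.xData.shape[0]))
            for i in range(self.xData.shape[0]):
                # The step size shrinks as iterations proceed but never reaches 0.
                alpha = 4 / (1.0 + i + j) + 0.01
                # np.random.randint replaces the original random.uniform call,
                # whose module was never imported.
                randIndex = np.random.randint(len(randRange))
                index = randRange[randIndex]
                x = self.xData[index, :]
                h = self._sigmoid(np.dot(self.omiga[0], x))
                error = self.label[index, 0] - h

                # Update from this single sample; omiga keeps shape (1, n + 1).
                self.omiga = self.omiga + alpha * error * x
                self.omiga_record.append(self.omiga)
                del randRange[randIndex]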