Caffe Notes

caffe basic

For installation, see yhlleo’s blog post: Ubuntu14.04 安装CUDA7.5 + Caffe + cuDNN.

BVLC/caffe: Caffe: a fast open framework for deep learning.

caffe code

My fork: district10/caffe-rc3: Play with caffe.

Notebooks I have annotated:1

  • 00-classification.ipynb @
    filters ([n, k, h, w]) --transpose(0, 2, 3, 1)--> filters ([n, h, w, k]) --feed into--> vis_square
    # the parameters are a list of [weights, biases]
    filters = net.params['conv1'][0].data
    vis_square(filters.transpose(0, 2, 3, 1))

    refer to …
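A minimal NumPy sketch of why the transpose is needed: matplotlib's imshow expects channels last, while Caffe stores filters as [n, k, h, w] (channels first). The shapes below assume CaffeNet's conv1 (96 filters, 3 input channels, 11x11); no Caffe needed to check the shape logic.

```python
import numpy as np

# Caffe stores conv weights as [n, k, h, w] (channels first);
# imshow wants [h, w, channels], so move k to the end before vis_square.
filters = np.zeros((96, 3, 11, 11))   # shape of CaffeNet's conv1 weights
vis = filters.transpose(0, 2, 3, 1)   # -> [n, h, w, k]
assert vis.shape == (96, 11, 11, 3)
```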

  • 01-learning-lenet.ipynb @


    blobs and params @

    blob = {data, diff}, shape: (batch size, feature dim, spatial dim)

    # each output is (batch size, feature dim, spatial dim)
    [(k, v.data.shape) for k, v in solver.net.blobs.items()]
    
    #       [('data', (64, 1, 28, 28)),
    #        ('label', (64,)),
    #        ('conv1', (64, 20, 24, 24)),
    #        ('pool1', (64, 20, 12, 12)),
    #        ('conv2', (64, 50, 8, 8)),
    #        ('pool2', (64, 50, 4, 4)),
    #        ('ip1', (64, 500)),
    #        ('ip2', (64, 10)),
    #        ('loss', ())]

    params = [weights, biases]

    # weights
    [(k, v[0].data.shape) for k, v in solver.net.params.items()]
    #       [('conv1', (20, 1, 5, 5)),
    #        ('conv2', (50, 20, 5, 5)),
    #        ('ip1', (500, 800)),
    #        ('ip2', (10, 500))]
    
    # biases
    [(k, v[1].data.shape) for k, v in solver.net.params.items()]
    #       [('conv1', (20,)),
    #       ('conv2', (50,)),
    #       ('ip1', (500,)),
    #       ('ip2', (10,))]
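The blob shapes above can be checked by hand. A quick sketch of the LeNet size arithmetic (stride-1 unpadded 5x5 convs, 2x2 stride-2 pooling), showing where ip1's input dimension of 800 comes from:

```python
def conv_out(s, k=5):      # stride 1, no padding
    return s - k + 1

def pool_out(s):           # 2x2 max pooling, stride 2
    return s // 2

s = conv_out(28)           # conv1: 28 -> 24
s = pool_out(s)            # pool1: 24 -> 12
s = conv_out(s)            # conv2: 12 -> 8
s = pool_out(s)            # pool2: 8 -> 4
assert s == 4
assert 50 * s * s == 800   # pool2 flattened -> ip1 weights are (500, 800)
```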

    Layer weight dimensions

    fc layer (e.g. CaffeNet's fc6, as in the net surgery notebook)

    • weights: (4096, 9216) -> [output, input]
    • biases: (4096,) -> [output]

    equivalent fc-conv layer

    • weights: (4096, 256, 6, 6) -> [output, input, h, w]
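The fc-to-conv conversion is just a reinterpretation of the same parameters, possible because 9216 == 256 * 6 * 6. A sketch with random weights standing in for the real ones:

```python
import numpy as np

# An fc layer's (output, input) weight matrix can be viewed as an
# (output, input_channels, h, w) conv kernel when
# input == input_channels * h * w.
fc_w = np.random.randn(4096, 9216)
conv_w = fc_w.reshape(4096, 256, 6, 6)
assert conv_w.shape == (4096, 256, 6, 6)
assert np.shares_memory(fc_w, conv_w)   # same parameters, new view
```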

    train net & test net

    # train net
    solver.net.forward()
    
    # test net (there can be more than one)
    solver.test_nets[0].forward()
    #       {'loss': array(2.4466583728790283, dtype=float32)}
    Dimension manipulation @

    For this part you need to understand the following dimension operations: [n, k, h, w] -> [n, k=1, h, w] -> [n1, n2, h, w] -> [n1, h, n2, w]; see my notebook for a detailed explanation.

    Show all 20 filters in a grid of 4 rows and 5 columns:

    imshow(solver.net.params['conv1'][0].diff[:, 0].reshape(4, 5, 5, 5)
           .transpose(0, 2, 1, 3).reshape(4*5, 5*5), cmap='gray')

    Show only the first row:

    imshow(solver.net.params['conv1'][0].diff[:5, 0].reshape(1, 5, 5, 5)
        .transpose(0, 2, 1, 3).reshape(1*5, 5*5), cmap='gray')

    Show the 2x3 block of filters in the bottom-right corner:

    imshow(solver.net.params['conv1'][0].diff[[12,13,14,17,18,19], 0].reshape(2, 3, 5, 5)
        .transpose(0, 2, 1, 3).reshape(2*5, 3*5), cmap='gray')
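The reshape/transpose trick above can be verified with plain NumPy (no Caffe needed): lay 20 toy 5x5 filters out as a 4x5 grid, where tile (r, c) of the grid is filter 5r + c.

```python
import numpy as np

f = np.arange(20 * 5 * 5).reshape(20, 5, 5)   # 20 "filters" of 5x5
grid = (f.reshape(4, 5, 5, 5)                 # [row, col, h, w]
         .transpose(0, 2, 1, 3)               # [row, h, col, w]
         .reshape(4 * 5, 5 * 5))              # 20x25 image of 4x5 tiles
assert (grid[:5, :5] == f[0]).all()           # tile (0, 0) is filter 0
assert (grid[5:10, 5:10] == f[6]).all()       # tile (1, 1) is filter 6
```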
  • 02-brewing-logreg.ipynb

  • 03-fine-tuning.ipynb

  • net_surgery.ipynb

  • detection.ipynb @

    Let’s run detection on an image of a bicyclist riding a fish bike in the desert (from the ImageNet challenge—no joke).

    This example no longer runs.


HED edge detection

My fork: district10/hed

Two annotated notebooks:

  • HED-tutorial.ipynb

    Runs a pretrained model on test images to produce edge maps. This example runs quickly.

  • solve.ipynb

    Trains the model. Training is of course slow: it takes days, not hours.

    # make a bilinear interpolation kernel
    # credit @longjon
    def upsample_filt(size):
        factor = (size + 1) // 2  # '//' is integer (floor) division, unlike '/'
        if size % 2 == 1:
            center = factor - 1
        else:
            center = factor - 0.5
        og = np.ogrid[:size, :size]
        return (1 - abs(og[0] - center) / factor) * \
               (1 - abs(og[1] - center) / factor)
    
    # set parameters s.t. deconvolutional layers compute bilinear interpolation
    # N.B. this is for deconvolution without groups
    # N.B. = nota bene (Latin for "note well"): used to draw attention to a point
    def interp_surgery(net, layers):
        for l in layers:
            m, k, h, w = net.params[l][0].data.shape
            if m != k:
                raise ValueError('input and output channels need to be the same')
            if h != w:
                raise ValueError('filters need to be square')
            filt = upsample_filt(h)
            # set layer l's weights: the bilinear filter on each (i, i) channel pair
            net.params[l][0].data[range(m), range(k), :, :] = filt
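A quick check of what upsample_filt produces. The kernel is a separable bilinear "tent"; for size 4 we have factor 2 and center 1.5, so the 1D profile is [0.25, 0.75, 0.75, 0.25] and the 2D kernel is its outer product:

```python
import numpy as np

def upsample_filt(size):                  # same as in solve.ipynb
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    og = np.ogrid[:size, :size]
    return (1 - abs(og[0] - center) / factor) * \
           (1 - abs(og[1] - center) / factor)

k = upsample_filt(4)
assert np.allclose(k[0], [0.0625, 0.1875, 0.1875, 0.0625])
assert np.allclose(k, k.T)                # symmetric
```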
    The ImageData layer configuration from the training prototxt:

    image_data_param {
        root_folder: "../../data/HED-BSDS/"
        source: "../../data/HED-BSDS/train_pair.lst"
        batch_size: 1
        shuffle: true
        new_height: 192
        new_width: 193
    }


  1. Although GitHub now renders .ipynb files, I still prefer the nbviewer links provided by Jupyter.