DL QuickStart

- SSD_MobileNet_v1 300x300 (1055fps)
- YOLOv5m 640x640 (218fps)
- YOLOv4_tiny (610fps)
- YOLOv6n (345fps)
- YOLOv7 (45fps)
- YOLOx_s_wide (75fps)
- YOLOv3 (26fps)
- YOLOv8n (270fps)
- YOLOv8s (128fps)
- YOLOv8m (55fps)
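
The throughput figures above can be turned into a rough per-frame time budget. The sketch below assumes batch size 1 and no pipelining, neither of which the list states, so the results are estimates rather than measured latencies. The name `frame_time_ms` is illustrative.

```python
# Benchmark throughput in frames per second, copied from the list above.
BENCHMARK_FPS = {
    "SSD_MobileNet_v1": 1055,
    "YOLOv4_tiny": 610,
    "YOLOv8n": 270,
    "YOLOv5m": 218,
    "YOLOv8m": 55,
    "YOLOv3": 26,
}


def frame_time_ms(fps):
    """Average time per frame in milliseconds for a given throughput."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Print the models from fastest to slowest.
for name, fps in sorted(BENCHMARK_FPS.items(), key=lambda kv: -kv[1]):
    print("{:<18} {:>5} fps  ~{:.2f} ms/frame".format(name, fps, frame_time_ms(fps)))
```

A 30 fps video source leaves about 33.3 ms per frame. By this estimate YOLOv8m (about 18.2 ms) fits within that budget, while YOLOv3 (about 38.5 ms) does not.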

3.2 Segmentation

stdc1 1024x1920 34fps

3.3 Multi-stream object detection (8 streams)

YOLOv3 608x608 66fps

3.4 Classification

ResNet-50v1 224x224, 1331fps
MobileNet_v2_1.0 224x224, 2444fps
EfficientNet_M 240x240, 890fps

4 RNN

http://blog.csdn.net/heyongluoyao8/article/details/48636251

http://rayz0620.github.io/2015/05/14/rnn_note_1/

https://www.zhihu.com/question/34681168

5 Framework

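Stanford's CS231n course is recommended.

In Lecture 12, JJ walks through the mainstream libraries one by one, covering how each is used and its strengths and weaknesses: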

5.1 TensorFlow

5.1.1 Install

$ sudo apt-get install python-pip python-dev   # for Python 2.7
$ sudo apt-get install python3-pip python3-dev # for Python 3.n

$ pip install tensorflow      # Python 2.7; CPU support (no GPU support)
$ pip3 install tensorflow     # Python 3.n; CPU support (no GPU support)
$ pip install tensorflow-gpu  # Python 2.7; GPU support
$ pip3 install tensorflow-gpu # Python 3.n; GPU support

$ pip install -i https://pypi.tuna.tsinghua.edu.cn/simple tensorflow-gpu==2.5.0

5.1.2 Uninstall

$ sudo pip uninstall tensorflow  # for Python 2.7
$ sudo pip3 uninstall tensorflow # for Python 3.n

5.1.3 Build from Git in Windows

The latest guide: https://tensorflow.google.cn/install/source_windows

5.1.4 Build from Git in Linux

https://tensorflow.google.cn/install/source

$ git clone https://github.com/tensorflow/tensorflow 
$ cd tensorflow
$ git checkout r1.0

Install Bazel: https://bazel.build/versions/master/docs/install.html
- https://docs.bazel.build/versions/main/install-ubuntu.html
- https://github.com/bazelbuild/bazel/releases?expanded=true&page=2&q=0.25

$ sudo apt-get install python-numpy python-dev python-pip python-wheel

$ sudo apt-get install python3-numpy python3-dev python3-pip python3-wheel

$ sudo apt-get install libcupti-dev

$ cd tensorflow # cd to the top-level directory created
$ ./configure
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python2.7
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Do you wish to use jemalloc as the malloc implementation? [Y/n]
jemalloc enabled
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N]
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N]
No Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N]
No XLA JIT support will be enabled for TensorFlow
Found possible Python library paths:
/usr/local/lib/python2.7/dist-packages
/usr/lib/python2.7/dist-packages
Please input the desired Python library path to use. Default is [/usr/local/lib/python2.7/dist-packages]
Using python library path: /usr/local/lib/python2.7/dist-packages
Do you wish to build TensorFlow with OpenCL support? [y/N] N
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] Y
CUDA support will be enabled for TensorFlow
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Please specify the Cuda SDK version you want to use, e.g. 7.0. [Leave empty to use system default]: 8.0
Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the cuDNN version you want to use. [Leave empty to use system default]: 5
Please specify the location where cuDNN 5 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,5.2"]: 3.0
Setting up Cuda include
Setting up Cuda lib
Setting up Cuda bin
Setting up Cuda nvvm
Setting up CUPTI include
Setting up CUPTI lib64
Configuration finished

$ bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

$ sudo pip install /tmp/tensorflow_pkg/tensorflow-1.2.0-py2-none-any.whl

5.1.5 Validate your installation

$ python
# Python (TensorFlow 1.x API; in TensorFlow 2.x, tf.Session is only available as tf.compat.v1.Session)
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))

Hello, TensorFlow!
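
The session-based check above targets TensorFlow 1.x. A version-tolerant variant is sketched below. It assumes Python 3, and the helper name `check_tensorflow` is illustrative. It reports a missing install instead of raising an error.

```python
import importlib.util


def check_tensorflow():
    """Return a one-line status string for the local TensorFlow install."""
    if importlib.util.find_spec("tensorflow") is None:
        return "tensorflow is not installed"
    import tensorflow as tf

    hello = tf.constant("Hello, TensorFlow!")
    if int(tf.__version__.split(".")[0]) >= 2:
        # TensorFlow 2.x executes eagerly, so the value can be read directly.
        value = hello.numpy()
    else:
        # TensorFlow 1.x builds a graph that must be run inside a session.
        with tf.Session() as sess:
            value = sess.run(hello)
    # Under Python 3 the string tensor comes back as bytes.
    return "tensorflow {}: {}".format(tf.__version__, value.decode("utf-8"))


print(check_tensorflow())
```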

5.2 DarkNet

Tiny DarkNet

5.3 Caffe

Caffe Installation on Ubuntu 14.04 (CPU) with PYTHON support

6 Hardware

6.1 GPU Architecture

6.2 TPU Architecture

Tensor Processing Units (TPUs) are application-specific integrated circuits (ASICs) developed specifically for machine learning.

Paper
Blog: Quantifying the performance of the TPU, our first machine learning chip (2017-04-05)

7 Notes

7.1 NVIDIA drivers on Windows

Once the driver is installed, the nvidia-smi tool is available:

DELL@BH1RBH MINGW64 ~
$ nvidia-smi
Fri Feb 28 10:10:24 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.94                 Driver Version: 560.94         CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 980       WDDM  |   00000000:01:00.0 Off |                  N/A |
| 33%   56C    P0             49W /  180W |       0MiB /   4096MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A     11688      C   ...win-pre1.0\system\python\python.exe      N/A      |
+-----------------------------------------------------------------------------------------+
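
The table layout is awkward to read from a script. nvidia-smi also has a CSV mode, via its `--query-gpu` and `--format=csv,noheader,nounits` options, and the sketch below parses that output. The helper names are illustrative, and the sample line is assembled from the table above rather than captured from the tool.

```python
import shutil
import subprocess

QUERY_FIELDS = ["name", "driver_version", "memory.total", "memory.used"]


def parse_gpu_csv(text):
    """Parse `--format=csv,noheader,nounits` output into a list of dicts."""
    gpus = []
    for line in text.strip().splitlines():
        values = [v.strip() for v in line.split(",")]
        # Skip lines that do not have exactly one value per queried field.
        if len(values) == len(QUERY_FIELDS):
            gpus.append(dict(zip(QUERY_FIELDS, values)))
    return gpus


def query_gpus():
    """Run nvidia-smi if it is on PATH; return [] when it is missing."""
    if shutil.which("nvidia-smi") is None:
        return []
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=" + ",".join(QUERY_FIELDS),
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_csv(out)


# Sample line assembled from the table above (not captured tool output).
sample = "NVIDIA GeForce GTX 980, 560.94, 4096, 0"
print(parse_gpu_csv(sample))
```

Calling `query_gpus()` on a machine with the driver installed returns one dict per GPU, with memory values in MiB because of `nounits`.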

7.2 Install the NVIDIA driver on Ubuntu 16.04

Blacklist the following modules by adding these lines to /etc/modprobe.d/blacklist.conf:

#this might not be required for x86 32 bit users
blacklist amd76x_edac
blacklist vga16fb
blacklist nouveau
blacklist rivafb
blacklist nvidiafb
blacklist rivatv

$ sudo update-initramfs -u

Reboot the system, then:

$ sudo apt-get install gcc libc-dev make 
$ sudo ./NVIDIA-Linux-x86_64-343.22.run # for GeForce 980

$ sudo ./NVIDIA-Linux-x86_64-381.22.run # for GeForce 1050M
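
After the reboot, and before running the installer above, it may be worth confirming that none of the blacklisted modules is still loaded. The sketch below checks the text of /proc/modules against the module names listed earlier. The function names and the sample excerpt are illustrative.

```python
BLACKLISTED = ["amd76x_edac", "vga16fb", "nouveau", "rivafb", "nvidiafb", "rivatv"]


def loaded_blacklisted(proc_modules_text, blacklist=BLACKLISTED):
    """Return blacklisted module names that still appear in /proc/modules text."""
    # Each /proc/modules line starts with the module name followed by a space.
    loaded = {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}
    return [name for name in blacklist if name in loaded]


def check_running_system(path="/proc/modules"):
    """Check the live module list; returns None where the file is unavailable."""
    try:
        with open(path) as fh:
            return loaded_blacklisted(fh.read())
    except OSError:
        return None


# Hypothetical /proc/modules excerpt used only to exercise the parser.
sample = (
    "nouveau 1716224 1 - Live 0x0000000000000000\n"
    "snd_hda_intel 40960 3 - Live 0x0000000000000000\n"
)
print(loaded_blacklisted(sample))
```

An empty list from `check_running_system()` means the blacklist took effect.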

7.3 Install CUDA Toolkit

Download: https://developer.nvidia.com/cuda-downloads

$ sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb
$ sudo apt-get update
$ sudo apt-get install cuda

Add the following lines to /etc/bash.bashrc:

export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
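
Entries in PATH and LD_LIBRARY_PATH must be separated by colons. A value that uses a space instead is treated as a single, nonexistent directory. The sketch below checks the environment of a new shell, assuming the default /usr/local/cuda install location used above. The helper name `missing_entries` is illustrative.

```python
import os
import shutil


def missing_entries(env_value, required, sep=os.pathsep):
    """Return the required directories that are absent from a PATH-style value."""
    entries = [e for e in env_value.split(sep) if e]
    return [d for d in required if d not in entries]


# Check the current environment (an unset variable is treated as empty).
print("PATH missing:",
      missing_entries(os.environ.get("PATH", ""), ["/usr/local/cuda/bin"]))
print("LD_LIBRARY_PATH missing:",
      missing_entries(os.environ.get("LD_LIBRARY_PATH", ""), ["/usr/local/cuda/lib64"]))
# shutil.which returns None when nvcc cannot be found on PATH.
print("nvcc:", shutil.which("nvcc"))
```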

8 OpenAI

The frontier of deep learning research as seen from OpenAI

9 Applications

9.1 Computer Vision

9.1.1 Face detection

Face Detection and Recognition (Theory and Practice)

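CNN binarization: XNOR-Net

RAISR (Rapid and Accurate Image Super-Resolution) uses machine learning to convert low-resolution images into high-resolution ones. The technique is reported to match or even exceed the quality of the original image while saving 75% of the bandwidth, and to run roughly 10 to 100 times faster.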

9.1.2 OpenFace

http://shamangary.logdown.com/posts/800267-openface-installation

9.1.3 Facial Keypoints Detection

https://www.kaggle.com/c/facial-keypoints-detection Facial Keypoints Detection - Kaggle project
dnouri's Kaggle Facial Keypoints Detection tutorial
https://github.com/saber1988/facial-keypoints-detection

9.1.4 Facial Emotion Recognition

9.1.5 Texture Matching

Texture Matching using Local Binary Patterns (LBP), OpenCV, scikit-learn and Python

10 Future

Deep Reinforcement Learning resources (continuously updated): https://zhuanlan.zhihu.com/p/20885568?refer=intelligentunit

Keeping up with the latest AI developments: https://zhuanlan.zhihu.com/p/21263408?refer=intelligentunit

11 Reference

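Q: What do you think of Jeff Hawkins's criticism of deep learning? Hawkins is the author of On Intelligence, published in 2004, a book about how the brain works and how to build intelligent machines with the brain as a reference. He claims that deep learning does not model time series. The human brain thinks on the basis of a stream of sensory data, and human learning consists mainly of memorizing sequential patterns. For example, when you watch a funny cat video, it is the cat's motion that makes you laugh, not a static picture of the kind Google uses. See this link.

A: There is actually a lot of work on time-dependent neural networks. Recurrent neural network models capture temporal relations implicitly and are commonly applied to speech recognition, as in the following two works.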

[1] http://www.cs.toronto.edu/~hinton/absps/RNN13.pdf

[2] http://papers.nips.cc/paper/5166 ... neural-networks.pdf

There is also this paper: http://arxiv.org/abs/1312.6026

Sequences in natural language processing have also been considered: http://arxiv.org/abs/1306.2795
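
To make the point about implicit temporal modeling concrete, here is a minimal sketch of a single-layer tanh recurrent network in plain Python. It is a generic Elman-style recurrence, not the architecture of any paper cited above, and the weights are hand-picked toy values. The hidden state is fed back at each step, so the output for the same final input depends on what came before it.

```python
import math


def rnn_forward(inputs, w_xh, w_hh, b_h):
    """Run a single-layer tanh RNN over a sequence and return all hidden states.

    inputs: list of input vectors (lists of floats), one per time step.
    w_xh:   hidden-by-input weight matrix; w_hh: hidden-by-hidden weight matrix.
    """
    hidden = [0.0] * len(b_h)  # initial hidden state h_0 = 0
    states = []
    for x in inputs:
        new_hidden = []
        for i in range(len(b_h)):
            total = b_h[i]
            total += sum(w * xj for w, xj in zip(w_xh[i], x))
            total += sum(w * hj for w, hj in zip(w_hh[i], hidden))
            new_hidden.append(math.tanh(total))
        hidden = new_hidden  # carries information to the next time step
        states.append(hidden)
    return states


# Toy weights chosen by hand for illustration (not trained values).
w_xh = [[1.0], [0.5]]
w_hh = [[0.0, 0.5], [0.5, 0.0]]
b_h = [0.0, 0.0]

# Same final input, different history: the final hidden states differ.
a = rnn_forward([[1.0], [0.0]], w_xh, w_hh, b_h)[-1]
b = rnn_forward([[0.0], [0.0]], w_xh, w_hh, b_h)[-1]
print(a, b)
```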

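Q: As I understand it, success in training deep neural networks depends on choosing the right hyperparameters, such as network depth, hidden layer size, and sparsity constraint values. Some papers use random search to find them. Well-written code probably matters too. Is there a place where researchers can find reasonable hyperparameters for specific tasks? Starting from those, it might be easier to find better ones.

A: See the section on hyperparameters above. James Bergstra has continued this line of work. I think a database storing many recommended hyperparameter settings would be very helpful for neural network training. The Hyperopt project on GitHub does something similar. It focuses on neural networks and convolutional networks and gives suggested hyperparameter settings in the form of a simple factorized distribution: for example, the number of hidden layers should be 1 to 3, and the number of hidden units per layer should be 50 to 5000. There are many more hyperparameters, as well as better hyperparameter search algorithms. More reference papers are listed in the original answer.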
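
The factorized ranges mentioned in this answer can be sketched as a plain random search using only the standard library. This is not Hyperopt's API. Sampling layer widths log-uniformly is an assumption, and `toy_objective` is a placeholder for a real train-and-validate run.

```python
import math
import random


def sample_config(rng):
    """Draw one configuration from a factorized (independent per-parameter) prior."""
    n_layers = rng.randint(1, 3)  # inclusive on both ends
    # Assumption: sample layer widths log-uniformly between 50 and 5000.
    units = [
        int(round(math.exp(rng.uniform(math.log(50), math.log(5000)))))
        for _ in range(n_layers)
    ]
    return {"n_layers": n_layers, "units": units}


def random_search(objective, n_trials, seed=0):
    """Return the (score, config) pair with the lowest objective value."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        config = sample_config(rng)
        trials.append((objective(config), config))
    return min(trials, key=lambda t: t[0])


def toy_objective(config):
    """Placeholder objective for illustration: prefers about 500 units per layer."""
    return sum(abs(math.log(u / 500.0)) for u in config["units"]) / config["n_layers"]


best_score, best_config = random_search(toy_objective, n_trials=20)
print(best_score, best_config)
```

In practice the objective would train a network with the sampled configuration and return its validation error.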

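Q: Professor Bengio, there is a class of approaches in deep learning that uses fairly advanced mathematics such as algebra and topology. A few years ago John Healy claimed to have improved a neural network (ART1) using category theory. What do you think of such attempts? Are they child's play, or very promising?

A: Take a look at the work of Morton and Montufar; see the supplementary material:

Tropical geometry, and tropical geometry in probabilistic models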

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.242.9890