
关于Ubuntu18.04/20.04安装后的一系列环境配置过程的总结
关于Ubuntu18.04/20.04安装后的一系列环境配置过程的总结
ZZU-RoboCup环境配置
Updating(最速更新链接【支持一些扩展语法,观感良好】):博客)…
[!IMPORTANT]
ZZU-SR的童鞋配置环境前可以给鄙人发邮件:E-mail,另外:
如果装了
Anaconda3/Miniconda3
,最好设置auto_activate_base: false
,或者需要编译时conda deactivate
,否则会影响编译。本文根据重要程度对各个步骤进行分类:
- ESSENTIAL (必需)
- RECOMMENDED (推荐)
- OPTIONAL (可选)
- NOT RECOMMENDED (不宜)
- NOT REQUIRED (无需)
- EOL (停更)
目录
本文用到的部分文件打包供无法访问部分网站的童鞋下载 (EOL):
链接:
https://pan.baidu.com/s/1PgmWHKl8oyX_cWYx_uZJrg?pwd=zwz4
提取码:
zwz4
ESSENTIAL
刚进入系统一段时间,系统会通知是否更新到新版本的系统(比如Ubuntu20.04 → Ubuntu22.04 or Later),选择否,之后会询问是否更新系统组件,选择否。
阻止软件更新弹窗:
打开终端输入:
sudo chmod a-x /usr/bin/update-notifier
将关机时间从90秒换为5秒:
sudo gedit /etc/systemd/system.conf
将:
#DefaultTimeoutStopSec=90s
改为:
DefaultTimeoutStopSec=5s
保存退出,打开终端输入:
sudo systemctl daemon-reload
更换国内源(可替换为自己中意的镜像源)
sudo gedit /etc/apt/sources.list
将原本的注释掉,在最下方加入:
# 华科源(Ubuntu 18.04)【默认注释了源码仓库,如有需要可自行取消注释】
deb https://mirrors.hust.edu.cn/ubuntu bionic main restricted universe multiverse
# deb-src https://mirrors.hust.edu.cn/ubuntu bionic main restricted universe multiverse
deb https://mirrors.hust.edu.cn/ubuntu bionic-updates main restricted universe multiverse
# deb-src https://mirrors.hust.edu.cn/ubuntu bionic-updates main restricted universe multiverse
deb https://mirrors.hust.edu.cn/ubuntu bionic-backports main restricted universe multiverse
# deb-src https://mirrors.hust.edu.cn/ubuntu bionic-backports main restricted universe multiverse
deb https://mirrors.hust.edu.cn/ubuntu bionic-security main restricted universe multiverse
# deb-src https://mirrors.hust.edu.cn/ubuntu bionic-security main restricted universe multiverse
# 预发布软件源,不建议启用
# deb https://mirrors.hust.edu.cn/ubuntu bionic-proposed main restricted universe multiverse
# deb-src https://mirrors.hust.edu.cn/ubuntu bionic-proposed main restricted universe multiverse
# 华科源(Ubuntu 20.04)【默认注释了源码仓库,如有需要可自行取消注释】
deb https://mirrors.hust.edu.cn/ubuntu focal main restricted universe multiverse
# deb-src https://mirrors.hust.edu.cn/ubuntu focal main restricted universe multiverse
deb https://mirrors.hust.edu.cn/ubuntu focal-updates main restricted universe multiverse
# deb-src https://mirrors.hust.edu.cn/ubuntu focal-updates main restricted universe multiverse
deb https://mirrors.hust.edu.cn/ubuntu focal-backports main restricted universe multiverse
# deb-src https://mirrors.hust.edu.cn/ubuntu focal-backports main restricted universe multiverse
deb https://mirrors.hust.edu.cn/ubuntu focal-security main restricted universe multiverse
# deb-src https://mirrors.hust.edu.cn/ubuntu focal-security main restricted universe multiverse
# 预发布软件源,不建议启用
# deb https://mirrors.hust.edu.cn/ubuntu focal-proposed main restricted universe multiverse
# deb-src https://mirrors.hust.edu.cn/ubuntu focal-proposed main restricted universe multiverse
sudo apt update -y && sudo apt upgrade -y
anaconda
镜像源(~/.condarc
):
[!TIP]
注意替换
envs_dirs
中的绝对路径!
custom_channels
中鄙人只填了自己可能用到的,其它第三方源列表可参考清华源的文档自行添加。
channels:
- defaults
show_channel_urls: true
default_channels:
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/pro
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
- https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/msys2
custom_channels:
msys2: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
menpo: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
numba: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
pyviz: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
omnia: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
ohmeta: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
plotly: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
fastai: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
caffe2: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
Paddle: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
dglteam: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
pytorch: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
rapidsai: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
MindSpore: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
pytorch3d: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
pytorch-lts: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
conda-forge: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
pytorch-test: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
nvidia: https://mirrors.sustech.edu.cn/anaconda-extra/cloud
envs_dirs:
- /home/m0rtzz/Programs/anaconda3/envs
auto_activate_base: false
pip
设置镜像源:
mkdir -p ${HOME}/.config/pip/ && cd ${HOME}/.config/pip/ && \
tee ${HOME}/.config/pip/pip.conf > /dev/null << EOF
[global]
index-url = https://mirrors.hust.edu.cn/pypi/web/simple
extra-index-url =
https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
https://mirrors.bfsu.edu.cn/pypi/web/simple
# https://pypi.nvidia.com
# https://pypi.ngc.nvidia.com
trusted-host =
mirrors.hust.edu.cn
mirrors.tuna.tsinghua.edu.cn
mirrors.bfsu.edu.cn
pypi.nvidia.com
pypi.ngc.nvidia.com
no-cache-dir = true
EOF
禁用Nouveau驱动
sudo tee -a /etc/modprobe.d/blacklist.conf > /dev/null << EOF
# 禁用Nouveau驱动
blacklist nouveau
options nouveau modeset=0
EOF
sudo update-initramfs -u
reboot
NVIDIA驱动
[!CAUTION]
由于众所周知的原因,安装
NVIDIA
显卡驱动有可能会损坏系统,如果损坏可以重装并看看网上的其他教程,鄙人曾经尝试过多种方法,认定这种方法最快捷且最不容易损坏系统。
打开终端,输入:
sudo apt install -y gcc g++ make zlib1g
sudo ubuntu-drivers devices
寻找带有recommended
的版本,输入:
# `your_version`是你的版本号
sudo apt install -y nvidia-driver-your_version nvidia-settings nvidia-prime
sudo apt update -y
sudo apt upgrade -y
reboot
验证版本:
nvidia-smi
CUDA
https://developer.nvidia.com/cuda-toolkit-archive
选择≤上一步nvidia-smi
显示的CUDA Version
进行安装,官方有教程。
安装好之后打开终端输入:
sudo tee -a /etc/profile > /dev/null << 'EOF'
# CUDA
export PATH=${PATH}:/usr/local/cuda/bin
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda/lib64
export CUDA_HOME=/usr/local/cuda # 通过设置软链接`/usr/local/cuda`,可以做到多版本CUDA共存
EOF
source /etc/profile
接下来验证CUDA
版本:
nvcc --version
安装成功!
cuDNN
https://developer.nvidia.com/rdp/cudnn-archive
[!TIP]
官方安装教程(选择合适版本的NVIDIA cuDNN Installation Guide,鄙人一般来说会安装和已安装
CUDA
的发布时间相近的版本,之前安装PaddlePaddle
的时候发现GPU
版PaddlePaddle
依赖库要求的CUDA
工具包版本和cuDNN
版本貌似也是这样对应的):https://docs.nvidia.com/deeplearning/cudnn/archives/index.html
tar -xvf cudnn-linux-x86_64-8.x.x.x_cudaX.Y-archive.tar.xz
sudo cp cudnn-*-archive/include/cudnn*.h /usr/local/cuda/include
sudo cp -P cudnn-*-archive/lib/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*
验证是否安装成功:
cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
ROS-noetic(有些图忘记截了)
sudo gpg --keyserver 'hkp://keyserver.ubuntu.com:80' --recv-key C1CF6E31E6BADE8868B172B4F42ED6FBAB17C654
sudo gpg --export C1CF6E31E6BADE8868B172B4F42ED6FBAB17C654 | sudo tee /usr/share/keyrings/ros.gpg > /dev/null
sudo tee /etc/apt/sources.list.d/ros-latest.list > /dev/null << EOF
deb [signed-by=/usr/share/keyrings/ros.gpg] https://mirrors.hust.edu.cn/ros/ubuntu/ $(lsb_release -sc) main
EOF
sudo apt update -y && sudo apt install -y ros-noetic-desktop-full
echo 'source /opt/ros/noetic/setup.bash' >> ~/.bashrc
source ~/.bashrc
sudo apt install -y python3-rosdep python3-rosinstall python3-rosinstall-generator python3-wstool build-essential
sudo apt install -y python3-pip
使用镜像源加速pip
下载:
sudo pip3 install rosdepc -i https://mirrors.hust.edu.cn/pypi/web/simple
sudo rosdepc init
rosdepc update
sudo chmod 777 -R ~/.ros/
roscore
再新建两个终端,分别输入:
rosrun turtlesim turtlesim_node
rosrun turtlesim turtle_teleop_key
在 rosrun turtlesim turtle_teleop_key
所在终端点击一下任意位置,然后使用←↕→
小键盘控制,看小海龟会不会动,如果会动则安装成功。
OpenCV-4.2.0及其扩展模块
经尝试多版本Ubuntu和OpenCV,装Ubuntu20.04,ROS noetic和OpenCV4.2.0及其扩展模块才能解决将彩色图像转换为网络所需的输入Blob后前馈时抛出的raised OpenCV exception
和error: (-215:Assertion failed)
等错误。
cmake命令
以下为几次成功安装的命令(注意替换命令中的绝对路径),安装过程可以参考NOT RECOMMENDED中的OpenCV3
安装步骤:
cmake \
-D CMAKE_BUILD_TYPE=RELEASE \
-D WITH_GTK=ON \
-D WITH_VTK=ON \
-D WITH_ADE=OFF \
-D WITH_CUDA=ON \
-D WITH_CUDNN=ON \
-D WITH_OPENMP=ON \
-D WITH_LAPACK=OFF \
-D OPENCV_DNN_CUDA=ON \
-D CUDA_GENERATION=Auto \
-D CUDA_CUDA_LIBRARY=ON \
-D OPENCV_ENABLE_NONFREE=ON \
-D OPENCV_GENERATE_PKGCONFIG=ON \
-D ENABLE_PRECOMPILED_HEADERS=OFF \
-D OPENCV_EXTRA_MODULES_PATH=/home/m0rtzz/Programs/opencv-4.2.0/opencv_contrib-4.2.0/modules \
..
或:
cmake \
-D CMAKE_BUILD_TYPE=RELEASE \
-D WITH_GTK=ON \
-D WITH_VTK=ON \
-D WITH_ADE=OFF \
-D WITH_CUDA=ON \
-D WITH_CUDNN=ON \
-D WITH_OPENMP=ON \
-D WITH_LAPACK=OFF \
-D OPENCV_DNN_CUDA=ON \
-D CUDA_GENERATION=Auto \
-D CUDA_CUDA_LIBRARY=ON \
-D OPENCV_ENABLE_NONFREE=ON \
-D OPENCV_GENERATE_PKGCONFIG=ON \
-D ENABLE_PRECOMPILED_HEADERS=OFF \
-D CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda \
-D CUDNN_LIBRARY=/usr/local/cuda/lib64/libcudnn.so \
-D CUDA_CUDA_LIBRARY=/usr/local/cuda/lib64/stubs/libcuda.so \
-D OPENCV_EXTRA_MODULES_PATH=/home/m0rtzz/Programs/opencv-4.2.0/opencv_contrib-4.2.0/modules \
..
或:
cmake \
-D CMAKE_BUILD_TYPE=RELEASE \
-D WITH_GTK=ON \
-D WITH_VTK=ON \
-D WITH_ADE=OFF \
-D WITH_CUDA=ON \
-D WITH_CUDNN=ON \
-D WITH_OPENMP=ON \
-D WITH_LAPACK=OFF \
-D CUDA_ARCH_BIN=8.6 \
-D OPENCV_DNN_CUDA=ON \
-D CUDA_GENERATION=Auto \
-D CUDA_CUDA_LIBRARY=ON \
-D OPENCV_ENABLE_NONFREE=ON \
-D OPENCV_GENERATE_PKGCONFIG=ON \
-D ENABLE_PRECOMPILED_HEADERS=OFF \
-D CUDA_HOST_COMPILER:FILEPATH=/usr/bin/gcc \
-D CUDNN_LIBRARY=/usr/local/cuda/lib64/libcudnn.so \
-D OPENCV_EXTRA_MODULES_PATH=/home/m0rtzz/Programs/opencv-4.2.0/opencv_contrib-4.2.0/modules \
..
CUDA_ARCH_BIN
查看命令:
sudo apt install -y mlocate
sudo updatedb.mlocate
mlocate deviceQuery | grep cuda | head -n 1 | xargs -r bash -c | grep 'CUDA Capability Major/Minor version number:'
部分报错解决办法
cuDNN8.X相关
无法识别cuDNN版本
解决办法:
# @file: $(git rev-parse --show-toplevel)/cmake/FindCUDNN.cmake
# @line: 66左右
# extract version from the include
if(CUDNN_INCLUDE_DIR)
file(READ "${CUDNN_INCLUDE_DIR}/cudnn.h" CUDNN_H_CONTENTS) # [!code --]
if(EXISTS "${CUDNN_INCLUDE_DIR}/cudnn_version.h") # [!code ++]
file(READ "${CUDNN_INCLUDE_DIR}/cudnn_version.h" CUDNN_H_CONTENTS) # [!code ++]
else() # [!code ++]
file(READ "${CUDNN_INCLUDE_DIR}/cudnn.h" CUDNN_H_CONTENTS) # [!code ++]
endif() # [!code ++]
string(REGEX MATCH "define CUDNN_MAJOR ([0-9]+)" _ "${CUDNN_H_CONTENTS}")
set(CUDNN_MAJOR_VERSION ${CMAKE_MATCH_1} CACHE INTERNAL "")
message("CUDNN_MAJOR_VERSION:" ${CUDNN_MAJOR_VERSION}) # [!code ++]
string(REGEX MATCH "define CUDNN_MINOR ([0-9]+)" _ "${CUDNN_H_CONTENTS}")
set(CUDNN_MINOR_VERSION ${CMAKE_MATCH_1} CACHE INTERNAL "")
message("CUDNN_MINOR_VERSION:" ${CUDNN_MINOR_VERSION}) # [!code ++]
string(REGEX MATCH "define CUDNN_PATCHLEVEL ([0-9]+)" _ "${CUDNN_H_CONTENTS}")
set(CUDNN_PATCH_VERSION ${CMAKE_MATCH_1} CACHE INTERNAL "")
message("CUDNN_PATCH_VERSION:" ${CUDNN_PATCH_VERSION}) # [!code ++]
set(CUDNN_VERSION
"${CUDNN_MAJOR_VERSION}.${CUDNN_MINOR_VERSION}.${CUDNN_PATCH_VERSION}"
CACHE
STRING # [!code --]
INTERNAL # [!code ++]
"cuDNN version"
)
unset(CUDNN_H_CONTENTS)
endif()
Reference:
添加cuDNN8.X支持
打补丁:
.patch
文件:
cd $(git rev-parse --show-toplevel)/ && \
wget -q --show-progress https://raw.gitcode.com/M0rtzz/opencv4-cudnn8-support/raw/master/opencv_pr_17685.patch -O opencv_pr_17685.patch && \
git apply opencv_pr_17685.patch
或者手动加入PR代码:
.diff
文件:
cd $(git rev-parse --show-toplevel)/ && \
wget -q --show-progress https://raw.gitcode.com/M0rtzz/opencv4-cudnn8-support/raw/master/opencv_pr_17685.diff -O opencv_pr_17685.diff
// @brief: 行号是按照从上到下添加的顺序依次排列的
// @file: $(git rev-parse --show-toplevel)/modules/dnn/src/cuda4dnn/csl/cudnn/convolution.hpp
// @line: 226左右
CUDA4DNN_CHECK_CUDNN(cudnnSetConvolutionGroupCount(descriptor, group_count));
/**/// [!code ++]
#if CUDNN_MAJOR >= 8 // [!code ++]
/* cuDNN 7 and below use FMA math by default. cuDNN 8 includes TF32 Tensor Ops // [!code ++]
* in the default setting. TF32 convolutions have lower precision than FP32. // [!code ++]
* Hence, we set the math type to CUDNN_FMA_MATH to reproduce old behavior. // [!code ++]
*/ // [!code ++]
CUDA4DNN_CHECK_CUDNN(cudnnSetConvolutionMathType(descriptor, CUDNN_FMA_MATH)); // [!code ++]
#endif // [!code ++]
/**/// [!code ++]
if (std::is_same<T, half>::value)
/********************************分割线********************************/
// @line: 263左右
ConvolutionAlgorithm(
const Handle& handle, // [!code --]
const ConvolutionDescriptor<T>& conv, // [!code --]
const FilterDescriptor<T>& filter, // [!code --]
const TensorDescriptor<T>& input, // [!code --]
const TensorDescriptor<T>& output) // [!code --]
const ConvolutionDescriptor<T>& convDesc, // [!code ++]
const FilterDescriptor<T>& filterDesc, // [!code ++]
const TensorDescriptor<T>& inputDesc, // [!code ++]
const TensorDescriptor<T>& outputDesc) // [!code ++]
/********************************分割线********************************/
// @line: 269左右
{
#if CUDNN_MAJOR >= 8 // [!code ++]
int requestedAlgoCount = 0, returnedAlgoCount = 0; // [!code ++]
CUDA4DNN_CHECK_CUDNN(cudnnGetConvolutionForwardAlgorithmMaxCount(handle.get(), &requestedAlgoCount)); // [!code ++]
std::vector<cudnnConvolutionFwdAlgoPerf_t> results(requestedAlgoCount); // [!code ++]
CUDA4DNN_CHECK_CUDNN( // [!code ++]
cudnnGetConvolutionForwardAlgorithm_v7( // [!code ++]
handle.get(), // [!code ++]
inputDesc.get(), filterDesc.get(), convDesc.get(), outputDesc.get(), // [!code ++]
requestedAlgoCount, // [!code ++]
&returnedAlgoCount, // [!code ++]
&results[0] // [!code ++]
) // [!code ++]
); // [!code ++]
/**/// [!code ++]
size_t free_memory, total_memory; // [!code ++]
CUDA4DNN_CHECK_CUDA(cudaMemGetInfo(&free_memory, &total_memory)); // [!code ++]
/**/// [!code ++]
bool found_conv_algorithm = false; // [!code ++]
for (int i = 0; i < returnedAlgoCount; i++) // [!code ++]
{ // [!code ++]
if (results[i].status == CUDNN_STATUS_SUCCESS && // [!code ++]
results[i].algo != CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD_NONFUSED && // [!code ++]
results[i].memory < free_memory) // [!code ++]
{ // [!code ++]
found_conv_algorithm = true; // [!code ++]
algo = results[i].algo; // [!code ++]
workspace_size = results[i].memory; // [!code ++]
break; // [!code ++]
} // [!code ++]
} // [!code ++]
/**/// [!code ++]
if (!found_conv_algorithm) // [!code ++]
CV_Error (cv::Error::GpuApiCallError, "cuDNN did not return a suitable algorithm for convolution."); // [!code ++]
#else // [!code ++]
CUDA4DNN_CHECK_CUDNN(
/********************************分割线********************************/
// @line: 304左右
cudnnGetConvolutionForwardAlgorithm(
handle.get(),
input.get(), filter.get(), conv.get(), output.get(), // [!code --]
inputDesc.get(), filterDesc.get(), convDesc.get(), outputDesc.get(), // [!code ++]
CUDNN_CONVOLUTION_FWD_PREFER_FASTEST,
/********************************分割线********************************/
// @line: 314左右
cudnnGetConvolutionForwardWorkspaceSize(
handle.get(),
input.get(), filter.get(), conv.get(), output.get(), // [!code --]
inputDesc.get(), filterDesc.get(), convDesc.get(), outputDesc.get(), // [!code ++]
algo, &workspace_size
/********************************分割线********************************/
// @line: 320左右
);
#endif // [!code ++]
}
ConvolutionAlgorithm& operator=(const ConvolutionAlgorithm&) = default;
// @brief: 行号是按照从上到下添加的顺序依次排列的
// @file: $(git rev-parse --show-toplevel)/modules/dnn/src/cuda4dnn/csl/cudnn/transpose_convolution.hpp
// @line: 31左右
TransposeConvolutionAlgorithm(
const Handle& handle, // [!code --]
const ConvolutionDescriptor<T>& conv, // [!code --]
const FilterDescriptor<T>& filter, // [!code --]
const TensorDescriptor<T>& input, // [!code --]
const TensorDescriptor<T>& output) // [!code --]
const ConvolutionDescriptor<T>& convDesc, // [!code ++]
const FilterDescriptor<T>& filterDesc, // [!code ++]
const TensorDescriptor<T>& inputDesc, // [!code ++]
const TensorDescriptor<T>& outputDesc) // [!code ++]
/********************************分割线********************************/
// @line: 31左右
{
#if CUDNN_MAJOR >= 8 // [!code ++]
int requestedAlgoCount = 0, returnedAlgoCount = 0; // [!code ++]
CUDA4DNN_CHECK_CUDNN(cudnnGetConvolutionBackwardDataAlgorithmMaxCount(handle.get(), &requestedAlgoCount)); // [!code ++]
std::vector<cudnnConvolutionBwdDataAlgoPerf_t> results(requestedAlgoCount); // [!code ++]
CUDA4DNN_CHECK_CUDNN( // [!code ++]
cudnnGetConvolutionBackwardDataAlgorithm_v7( // [!code ++]
handle.get(), // [!code ++]
filterDesc.get(), inputDesc.get(), convDesc.get(), outputDesc.get(), // [!code ++]
requestedAlgoCount, // [!code ++]
&returnedAlgoCount, // [!code ++]
&results[0] // [!code ++]
) // [!code ++]
); // [!code ++]
/**/// [!code ++]
size_t free_memory, total_memory; // [!code ++]
CUDA4DNN_CHECK_CUDA(cudaMemGetInfo(&free_memory, &total_memory)); // [!code ++]
/**/// [!code ++]
bool found_conv_algorithm = false; // [!code ++]
for (int i = 0; i < returnedAlgoCount; i++) // [!code ++]
{ // [!code ++]
if (results[i].status == CUDNN_STATUS_SUCCESS && // [!code ++]
results[i].algo != CUDNN_CONVOLUTION_BWD_DATA_ALGO_WINOGRAD_NONFUSED && // [!code ++]
results[i].memory < free_memory) // [!code ++]
{ // [!code ++]
found_conv_algorithm = true; // [!code ++]
dalgo = results[i].algo; // [!code ++]
workspace_size = results[i].memory; // [!code ++]
break; // [!code ++]
} // [!code ++]
} // [!code ++]
/**/// [!code ++]
if (!found_conv_algorithm) // [!code ++]
CV_Error (cv::Error::GpuApiCallError, "cuDNN did not return a suitable algorithm for transpose convolution."); // [!code ++]
#else // [!code ++]
CUDA4DNN_CHECK_CUDNN(
/********************************分割线********************************/
// @line: 73左右
cudnnGetConvolutionBackwardDataAlgorithm(
handle.get(),
filter.get(), input.get(), conv.get(), output.get(), // [!code --]
filterDesc.get(), inputDesc.get(), convDesc.get(), outputDesc.get(), // [!code ++]
CUDNN_CONVOLUTION_BWD_DATA_PREFER_FASTEST,
/********************************分割线********************************/
// @line: 83左右
cudnnGetConvolutionBackwardDataWorkspaceSize(
handle.get(),
filter.get(), input.get(), conv.get(), output.get(), // [!code --]
filterDesc.get(), inputDesc.get(), convDesc.get(), outputDesc.get(), // [!code ++]
dalgo, &workspace_size
/********************************分割线********************************/
// @line: 88左右
);
#endif // [!code ++]
}
TransposeConvolutionAlgorithm& operator=(const TransposeConvolutionAlgorithm&) = default;
Reference:
CUDA11.X相关
因为CUDA11.X
不再支持CUDA_nppicom_LIBRARY
而报错:
解决办法:
# @file: $(git rev-parse --show-toplevel)/cmake/OpenCVDetectCUDA.cmake
# @line: 29左右
if(CUDA_FOUND)
set(HAVE_CUDA 1)
if(CUDA_VERSION VERSION_GREATER_EQUAL "11.0") # [!code ++]
# CUDA 11.X removes nppicom # [!code ++]
ocv_list_filterout(CUDA_npp_LIBRARY "nppicom") # [!code ++]
ocv_list_filterout(CUDA_nppi_LIBRARY "nppicom") # [!code ++]
endif() # [!code ++]
if(WITH_CUFFT)
Reference:
Python相关
可能是cmake
找不到合适的Python
解释器来执行脚本:
解决办法:
# 手动执行脚本
cd $(git rev-parse --show-toplevel)/ && \
python3 ./modules/python/src2/gen2.py \
./build/modules/python_bindings_generator \
./build/modules/python_bindings_generator/headers.txt
Reference:
https://github.com/opencv/opencv/issues/10771#issuecomment-376861139
处理配置文件
如果不执行以下几步,编译darknet_ros
会报错: error: 'IplImage'
之类的:
sudo cp /usr/local/lib/pkgconfig/opencv4.pc /usr/lib/pkgconfig/opencv4.pc
sudo cp /usr/lib/pkgconfig/opencv4.pc /usr/lib/pkgconfig/opencv.pc
百度智能云
sudo apt install -y curl libjsoncpp-dev
jsoncpp
库的头文件改为:
#include <jsoncpp/json/json.h>
g++
编译:
g++ test.cpp -o test.out -lcurl -ljsoncpp
运行:
./test.out
darknet、yolov3及darknet_ros工作空间
git clone https://github.com/AlexeyAB/darknet.git darknet
或公益加速源:
git clone https://ghp.ci/https://github.com/AlexeyAB/darknet.git darknet
cd darknet/
sudo gedit Makefile
修改以下前几行为:
GPU=1
CUDNN=1
CUDNN_HALF=1
OPENCV=1
AVX=0
OPENMP=1
LIBSO=1
ZED_CAMERA=0
ZED_CAMERA_v2_8=0
然后修改NVCC=
后边为nvcc
路径:
NVCC=/usr/local/cuda/bin/nvcc
打开终端,输入:
sudo tee /etc/ld.so.conf.d/cuda.conf > /dev/null << EOF
/usr/local/cuda/lib64
EOF
sudo ldconfig
sudo make -j$(nproc)
./darknet
输出为:
usage: ./darknet <function>
之后我们下载yolov3
权重文件:
cd $(git rev-parse --show-toplevel)/ && \
mkdir weights && cd weights/ && \
wget -q --show-progress https://pjreddie.com/media/files/yolov3.weight
到此为止darknet
就配置好了。
下面我们测试一下:
cd $(git rev-parse --show-toplevel)/ && \
./darknet detect cfg/yolov3.cfg weights/yolov3.weights data/dog.jpg
输出以下就证明配置没有问题:
输出的最后一行报错:
Gtk-Message: 15:22:30.610: Failed to load module "canberra-gtk-module"
解决方法:
sudo apt install -y 'libcanberra-gtk*'
安装之后重新运行就不会报错了。
darknet_ros
工作空间(OpenCV-4.2.0
):
mkdir -p catkin_ws/src && cd catkin_ws/src/ && catkin_init_workspace
cd .. && catkin_make
cd src/
git clone -b opencv4 --recursive https://github.com/M0rtzz/darknet_ros.git darknet_ros
cd darknet_ros/
git submodule update --recursive
如果视频流只有第一帧是RGB8
编码格式,阅读源码后发现在show_image
之前调用image.cpp
中的rgbgr_image
函数循环转换图像编码格式即可解决此问题:
// @file: image.cpp
void rgbgr_image(image im)
{
int i;
for(i = 0; i < im.w*im.h; ++i){
float swap = im.data[i];
im.data[i] = im.data[i+im.w*im.h*2];
im.data[i+im.w*im.h*2] = swap;
}
}
// @file: YoloObjectDetector.cpp
void *YoloObjectDetector::displayInThread(void *ptr)
{
// NOTE: Modified by M0rtzz, solved the problem of displaying video stream as bgr8
rgbgr_image(buff_[(buffIndex_ + 1) % 3]);
int c = show_image(buff_[(buffIndex_ + 1) % 3], "YOLO V3", waitKeyDelay_);
if (c != -1)
c = c % 256;
if (c == 27)
{
demoDone_ = 1;
return 0;
}
else if (c == 82)
{
demoThresh_ += .02;
}
else if (c == 84)
{
demoThresh_ -= .02;
if (demoThresh_ <= .02)
demoThresh_ = .02;
}
else if (c == 83)
{
demoHier_ += .02;
}
else if (c == 81)
{
demoHier_ -= .02;
if (demoHier_ <= .0)
demoHier_ = .0;
}
return 0;
}
之后:
catkin_make
catkin_make
如果编译不过的话(error: 'IplImage'
之类的,之前装OpenCV
提到过避免报错的方法),注意以下命令是只编译darknet_ros
一个包,若工作空间下有多个包需要一起编译那么把命令中的darknet_ros
删除重新执行即可:
catkin_make darknet_ros \
--cmake-args \
-D CMAKE_CXX_FLAGS='-D CV__ENABLE_C_API_CTORS'
如果报错nvcc fatal : Unsupported gpu architecture 'compute_30'
之类的,是因为CUDA11.X
已经不支持compute_30
了,我们将darknet_ros/darknet_ros/CMakeLists.txt
中含有 compute_30
的行进行注释后重新catkin_make
:
Azure Kinect SDK-v1.4.2(软件包)
[!NOTE]
鄙人在
Ubuntu18.04
下是通过源码编译安装的,在Ubuntu20.04
下是通过deb
包直接安装的。
下载软件包:
wget -q --show-progress https://packages.microsoft.com/ubuntu/18.04/prod/pool/main/libk/libk4a1.4/libk4a1.4_1.4.2_amd64.deb -O ./libk4a1.4_1.4.2_amd64.deb && \
wget -q --show-progress https://packages.microsoft.com/ubuntu/18.04/prod/pool/main/libk/libk4a1.4-dev/libk4a1.4-dev_1.4.2_amd64.deb -O ./libk4a1.4-dev_1.4.2_amd64.deb && \
wget -q --show-progress https://packages.microsoft.com/ubuntu/18.04/prod/pool/main/k/k4a-tools/k4a-tools_1.4.2_amd64.deb -O ./k4a-tools_1.4.2_amd64.deb
安装:
sudo apt install -y ./libk4a1.4_1.4.2_amd64.deb && \
sudo cp /usr/lib/x86_64-linux-gnu/libk4a1.4/libdepthengine.so.2.0 /usr/lib/ && \
sudo cp /usr/lib/libdepthengine.so.2.0 /usr/lib/x86_64-linux-gnu/
sudo apt install -y ./libk4a1.4-dev_1.4.2_amd64.deb ./k4a-tools_1.4.2_amd64.deb
配置udev
规则:
sudo tee /etc/udev/rules.d/99-k4a.rules > /dev/null << EOF
# Bus 002 Device 116: ID 045e:097a Microsoft Corp. - Generic Superspeed USB Hub
# Bus 001 Device 015: ID 045e:097b Microsoft Corp. - Generic USB Hub
# Bus 002 Device 118: ID 045e:097c Microsoft Corp. - Azure Kinect Depth Camera
# Bus 002 Device 117: ID 045e:097d Microsoft Corp. - Azure Kinect 4K Camera
# Bus 001 Device 016: ID 045e:097e Microsoft Corp. - Azure Kinect Microphone Array
BUS!="usb", ACTION!="add", SUBSYSTEM!=="usb_device", GOTO="k4a_logic_rules_end"
ATTRS{idVendor}=="045e", ATTRS{idProduct}=="097a", MODE="0666", GROUP="plugdev"
ATTRS{idVendor}=="045e", ATTRS{idProduct}=="097b", MODE="0666", GROUP="plugdev"
ATTRS{idVendor}=="045e", ATTRS{idProduct}=="097c", MODE="0666", GROUP="plugdev"
ATTRS{idVendor}=="045e", ATTRS{idProduct}=="097d", MODE="0666", GROUP="plugdev"
ATTRS{idVendor}=="045e", ATTRS{idProduct}=="097e", MODE="0666", GROUP="plugdev"
LABEL="k4a_logic_rules_end"
EOF
科大讯飞语音
https://www.xfyun.cn/sdk/dispatcher
sudo apt install -y sox libsox-fmt-all pavucontrol
# 如果编译时有相关warning再修改
sudo gedit /usr/include/pcl-1.8/pcl/visualization/cloud_viewer.h
修改一下:
// @line: 199左右
private:
/** \brief Private implementation. */
struct CloudViewer_impl;
// std::auto_ptr<CloudViewer_impl> impl_;
std::shared_ptr<CloudViewer_impl> impl_;
boost::signals2::connection
registerMouseCallback (boost::function<void (const pcl::visualization::MouseEvent&)>);
下载所需SDK
,将libs/x64/libmsc.so
文件拷贝至工作空间根目录/lib/your-appid/libmsc.so
。
cmake_minimum_required(VERSION 3.0.2)
project(tts_voice_test)
set(CMAKE_CXX_FLAGS "-std=c++11")
find_package(
catkin REQUIRED
COMPONENTS roscpp
rospy
std_msgs
genmsg
actionlib_msgs
actionlib
rostime
sensor_msgs
message_filters
cv_bridge
image_transport
compressed_image_transport
compressed_depth_image_transport
tf
tf2
tf2_ros
tf2_geometry_msgs
geometry_msgs
message_generation
nodelet
kinova_msgs)
find_package(k4a REQUIRED)
find_package(realsense2 REQUIRED)
set(OpenCV_DIR "/usr/local/lib/cmake/opencv4")
find_package(OpenCV REQUIRED)
find_package(OpenMP)
find_package(PCL REQUIRED)
add_message_files(FILES person_msgs.msg BoundingBoxes.msg BoundingBox.msg)
generate_messages(DEPENDENCIES std_msgs)
catkin_package(
INCLUDE_DIRS
CATKIN_DEPENDS
include
roscpp
rospy
std_msgs
message_runtime
actionlib_msgs)
if(OPENMP_FOUND)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OpenMP_CXX_FLAGS}")
endif()
include_directories(/home/m0rtzz/Workspaces/tts_test_ws/include ${catkin_INCLUDE_DIRS}
${OpenCV_INCLUDE_DIRS} ${PCL_INCLUDE_DIRS})
link_directories(/home/m0rtzz/Workspaces/tts_test_ws/lib/)
add_executable(tts_voice_test src/tts_voice_test.cc)
add_dependencies(tts_voice_test ${${PROJECT_NAME}_EXPORTED_TARGETS}
${catkin_EXPORTED_TARGETS})
target_link_libraries(
tts_voice_test
PRIVATE
k4a::k4a
${PCL_LIBRARIES}
${catkin_LIBRARIES}
${OpenCV_LIBRARIES}
${realsense2_LIBRARY}
-lrt
-ldl
-lhpdf
-lcurl
-pthread
-lasound
-ljsoncpp
/home/m0rtzz/Workspaces/tts_voice_test_ws/lib/your-appid/libmsc.so # 替换为你的appid
)
打开终端:
catkin_make
若找不到asoundlib.h
文件打开终端输入:
sudo apt install -y libasound2-dev
编译通过~
librealsense及realsense-ros工作空间
sudo apt install -y ros-${ROS_DISTRO}-realsense2-camera ros-${ROS_DISTRO}-rgbd-launch
安装realsense sdk
:
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-key F6E65AC044F831AC80A06380C8B3A55A6F3EFCDE || sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-key F6E65AC044F831AC80A06380C8B3A55A6F3EFCDE
sudo add-apt-repository "deb https://librealsense.intel.com/Debian/apt-repo $(lsb_release -cs) main" -u
sudo apt update -y
安装realsense lib
:
sudo apt install -y librealsense2-dkms librealsense2-utils
安装gcc-4.9
和g++-4.9
:
sudo tee -a /etc/apt/sources.list > /dev/null << EOF
# gcc-4.9, g++4.9
deb https://mirrors.hust.edu.cn/ubuntu/ xenial universe
EOF
sudo apt update -y
sudo apt install -y gcc-4.9 g++-4.9
# 注释掉xenial软件源
sudo sed -i '/^deb https:\/\/mirrors.hust.edu.cn\/ubuntu\/ xenial universe/s/^/# /' /etc/apt/sources.list && sudo apt update -y
测试:
realsense-viewer
克隆librealsense
源码并指定版本为v2.50.0
:
git clone -b v2.50.0 https://github.com/IntelRealSense/librealsense.git librealsense-2.50.0
或公益加速源:
git clone -b v2.50.0 https://ghp.ci/https://github.com/IntelRealSense/librealsense.git librealsense-2.50.0
安装依赖:
sudo apt install -y libssl-dev libgtk-3-dev libusb-1.0-0-dev libglfw3-dev libgl1-mesa-dev libglu1-mesa-dev
进入刚才克隆的librealsense
文件夹内:
cd librealsense-2.50.0/
./scripts/setup_udev_rules.sh
# The Bionic patches are maintained for Bionic Beaver LTS kernels 4.1[5/8], 5.[0/3/4] (Ubuntu18.04,`uname -r`查看自己的内核版本)
# The Focal patches are maintained for Ubuntu LTS with kernel 5.4, 5.8, 5.11 (Ubuntu20.04,`uname -r`查看自己的内核版本)
# 貌似不执行也不影响
./scripts/patch-realsense-ubuntu-lts.sh
注意:上面的命令可能执行过慢,请耐心等待,或者科学的上网~
完成结果如下:
之后输入:
mkdir build && cd build/
# @file: $(git rev-parse --show-toplevel)/CMakeLists.txt
# @line: 3左右
project(librealsense2 LANGUAGES CXX C)
# NOTE: Modified by M0rtzz # [!code ++]
LINK_LIBRARIES(-lcurl -lcrypto) # [!code ++]
include(CMake/lrs_options.cmake)
cmake \
-D CMAKE_BUILD_TYPE=Release \
-D BUILD_EXAMPLES=true \
..
以下编译过慢,使用CPU
最大线程进行make
,速度会快很多:
sudo make -j$(nproc)
sudo make -j$(nproc) install
测试:
cd examples/capture/
./rs-capture
接下来我们配置realsense-ros
工作空间:
cd catkin_ws/src/
下载功能包:
git clone -b ros1-legacy https://github.com/IntelRealSense/realsense-ros.git realsense-ros
或公益加速源:
git clone -b ros1-legacy https://ghp.ci/https://github.com/IntelRealSense/realsense-ros.git realsense-ros
cd ..
catkin_make \
-D CMAKE_BUILD_TYPE=Release \
-D CATKIN_ENABLE_TESTING=False
catkin_make install
测试:
roslaunch realsense2_camera rs_camera.launch
还没安摄像头~
kinova-ros机械臂工作空间
cd catkin_ws/src/
catkin_init_workspace
cd ..
catkin_make
cd src/
git clone https://github.com/Kinovarobotics/kinova-ros.git kinova-ros
或公益加速源:
git clone https://ghp.ci/https://github.com/Kinovarobotics/kinova-ros.git kinova-ros
cd ..
安装缺少的moveit
中相应的功能包 :
sudo apt install -y ros-${ROS_DISTRO}-moveit-visual-tools ros-${ROS_DISTRO}-moveit-ros-planning-interface
catkin_make
sudo cp src/kinova-ros/kinova_driver/udev/10-kinova-arm.rules /etc/udev/rules.d/
安装Moveit和pr2:
sudo apt install -y $(apt-cache search ros-${ROS_DISTRO}-pr2- | grep -v "ros-${ROS_DISTRO}-pr2-apps" | cut -d' ' -f1)
完成~
机器人导航
[!CAUTION]
ZZU-SR的童鞋请注意,此小节只需安装软件包,其他内容是之前听课做的笔记,导航相关代码直接
Copy
比赛电脑的catkin_ws
中的mrobot
即可。
Dependency (ESSENTIAL)
sudo apt install -y "ros-${ROS_DISTRO}-move-base*" "ros-${ROS_DISTRO}-turtlebot3-*"
sudo apt install -y ros-${ROS_DISTRO}-dwa-local-planner ros-${ROS_DISTRO}-joy ros-${ROS_DISTRO}-teleop-twist-joy ros-${ROS_DISTRO}-teleop-twist-keyboard ros-${ROS_DISTRO}-laser-proc ros-${ROS_DISTRO}-rgbd-launch ros-${ROS_DISTRO}-depthimage-to-laserscan ros-${ROS_DISTRO}-rosserial-arduino ros-${ROS_DISTRO}-rosserial-python ros-${ROS_DISTRO}-rosserial-server ros-${ROS_DISTRO}-rosserial-client ros-${ROS_DISTRO}-rosserial-msgs ros-${ROS_DISTRO}-amcl ros-${ROS_DISTRO}-map-server ros-${ROS_DISTRO}-move-base ros-${ROS_DISTRO}-urdf ros-${ROS_DISTRO}-xacro ros-${ROS_DISTRO}-compressed-image-transport ros-${ROS_DISTRO}-rqt-image-view ros-${ROS_DISTRO}-gmapping ros-${ROS_DISTRO}-navigation ros-${ROS_DISTRO}-interactive-markers
安装gmapping
包(用于构建地图):
sudo apt install -y ros-${ROS_DISTRO}-gmapping
安装地图服务包(用于保存与读取地图):
sudo apt install -y ros-${ROS_DISTRO}-map-server
安装navigation
包(用于定位以及路径规划):
sudo apt install -y ros-${ROS_DISTRO}-navigation
因tf
和tf2
迁移问题,需将工作空间内的所有global_costmap_params.yaml
和local_costmap_params.yaml
文件里的头几行去掉/
,返回工作空间根目录下重新编译。
Reference:
http://wiki.ros.org/tf2/Migration#Removal_of_support_for_tf_prefix
Arduino IDE (NOT REQUIRED) (EOL)
https://www.arduino.cc/en/software
下载安装包:
tar -xvf arduino-1.8.19-linux64.tar.xz && cd arduino-1.8.19/
sudo chmod +x install.sh
sudo ./install.sh
听课笔记 (NOT REQUIRED) (EOL)
首先创建实体导航工作空间:
mkdir -p navigation_entity_test_ws/src && cd navigation_entity_test_ws/src/
catkin_create_pkg entity_test roscpp rospy std_msgs gmapping map_server amcl move_base
cd .. && catkin_make
cd src/ && catkin_create_pkg robot_start_test roscpp rospy std_msgs ros_arduino_python usb_cam rplidar_ros
cd robot_start_test/ && mkdir launch && cd launch/ && touch start_test.launch
<!-- @file: start_test.launch
@brief: 机器人启动文件:
1.启动底盘
2.启动激光雷达
3.启动摄像头
-->
<launch>
<include file="$(find ros_arduino_python)/launch/arduino.launch" />
<include file="$(find usb_cam)/launch/usb_cam-test.launch" />
<include file="$(find rplidar_ros)/launch/rplidar.launch" />
</launch>
接下来创建机器人模型相关的功能包:
cd ../../src/
catkin_create_pkg robot_description_test urdf xacro
在功能包下新建urdf
目录,编写具体的urdf
文件:
cd robot_description_test/ && mkdir urdf
cd urdf/ && touch {robot.urdf.xacro,robot_base.urdf.xacro,robot_camera.urdf.xacro,robot_laser.urdf.xacro} && code robot.urdf.xacro
将下列代码粘贴进去:
<!-- @file: robot.urdf.xacro -->
<robot name="robot_test" xmlns:xacro="http://wiki.ros.org/xacro">
<xacro:include filename="robot_base.urdf.xacro" />
<xacro:include filename="robot_camera.urdf.xacro" />
<xacro:include filename="robot_laser.urdf.xacro" />
</robot>
保存退出,打开终端输入:
code robot_base.urdf.xacro
将下列代码粘贴进去:
robot_base.urdf.xacro
<!-- @file: robot_base.urdf.xacro -->
<robot name="robot_test" xmlns:xacro="http://wiki.ros.org/xacro">
<xacro:property name="footprint_radius" value="0.001" />
<link name="base_footprint">
<visual>
<geometry>
<sphere radius="${footprint_radius}" />
</geometry>
</visual>
</link>
<xacro:property name="base_radius" value="0.1" />
<xacro:property name="base_length" value="0.08" />
<xacro:property name="lidi" value="0.015" />
<xacro:property name="base_joint_z" value="${base_length / 2 + lidi}" />
<link name="base_link">
<visual>
<geometry>
<cylinder radius="0.1" length="0.08" />
</geometry>
<origin xyz="0 0 0" rpy="0 0 0" />
<material name="baselink_color">
<color rgba="1.0 0.5 0.2 0.5" />
</material>
</visual>
</link>
<joint name="link2footprint" type="fixed">
<parent link="base_footprint" />
<child link="base_link" />
<origin xyz="0 0 0.055" rpy="0 0 0" />
</joint>
<xacro:property name="wheel_radius" value="0.0325" />
<xacro:property name="wheel_length" value="0.015" />
<xacro:property name="PI" value="3.1415927" />
<xacro:property name="wheel_joint_z" value="${(base_length / 2 + lidi - wheel_radius) * -1}" />
<xacro:macro name="wheel_func" params="wheel_name flag">
<link name="${wheel_name}_wheel">
<visual>
<geometry>
<cylinder radius="${wheel_radius}" length="${wheel_length}" />
</geometry>
<origin xyz="0 0 0" rpy="${PI / 2} 0 0" />
<material name="wheel_color">
<color rgba="0 0 0 0.3" />
</material>
</visual>
</link>
<joint name="${wheel_name}2link" type="continuous">
<parent link="base_link" />
<child link="${wheel_name}_wheel" />
<origin xyz="0 ${0.1 * flag} ${wheel_joint_z}" rpy="0 0 0" />
<axis xyz="0 1 0" />
</joint>
</xacro:macro>
<xacro:wheel_func wheel_name="left" flag="1" />
<xacro:wheel_func wheel_name="right" flag="-1" />
<xacro:property name="small_wheel_radius" value="0.0075" />
<xacro:property name="small_joint_z" value="${(base_length / 2 + lidi - small_wheel_radius) * -1}" />
<xacro:macro name="small_wheel_func" params="small_wheel_name flag">
<link name="${small_wheel_name}_wheel">
<visual>
<geometry>
<sphere radius="${small_wheel_radius}" />
</geometry>
<origin xyz="0 0 0" rpy="0 0 0" />
<material name="wheel_color">
<color rgba="0 0 0 0.3" />
</material>
</visual>
</link>
<joint name="${small_wheel_name}2link" type="continuous">
<parent link="base_link" />
<child link="${small_wheel_name}_wheel" />
<origin xyz="${0.08 * flag} 0 ${small_joint_z}" rpy="0 0 0" />
<axis xyz="0 1 0" />
</joint>
</xacro:macro >
<xacro:small_wheel_func small_wheel_name="front" flag="1"/>
<xacro:small_wheel_func small_wheel_name="back" flag="-1"/>
</robot>
保存退出,打开终端输入:
code robot_camera.urdf.xacro
将下列代码粘贴进去:
<!-- @file: robot_camera.urdf.xacro -->
<robot name="robot_test" xmlns:xacro="http://wiki.ros.org/xacro">
<xacro:property name="camera_length" value="0.02" />
<xacro:property name="camera_width" value="0.05" />
<xacro:property name="camera_height" value="0.05" />
<xacro:property name="joint_camera_x" value="0.08" />
<xacro:property name="joint_camera_y" value="0" />
<xacro:property name="joint_camera_z" value="${base_length / 2 + camera_height / 2}" />
<link name="camera">
<visual>
<geometry>
<box size="${camera_length} ${camera_width} ${camera_height}" />
</geometry>
<origin xyz="0 0 0" rpy="0 0 0" />
<material name="black">
<color rgba="0 0 0 0.8" />
</material>
</visual>
</link>
<joint name="camera2base" type="fixed">
<parent link="base_link" />
<child link="camera" />
<origin xyz="${joint_camera_x} ${joint_camera_y} ${joint_camera_z}" rpy="0 0 0" />
</joint>
</robot>
保存退出,打开终端输入:
code robot_laser.urdf.xacro
将下列代码粘贴进去:
<!-- @file: robot_laser.urdf.xacro -->
<robot name="robot_test" xmlns:xacro="http://wiki.ros.org/xacro">
<xacro:property name="support_radius" value="0.01" />
<xacro:property name="support_length" value="0.15" />
<xacro:property name="laser_radius" value="0.03" />
<xacro:property name="laser_length" value="0.05" />
<xacro:property name="joint_support_x" value="0" />
<xacro:property name="joint_support_y" value="0" />
<xacro:property name="joint_support_z" value="${base_length / 2 + support_length / 2}" />
<xacro:property name="joint_laser_x" value="0" />
<xacro:property name="joint_laser_y" value="0" />
<xacro:property name="joint_laser_z" value="${support_length / 2 + laser_length / 2}" />
<link name="support">
<visual>
<geometry>
<cylinder radius="${support_radius}" length="${support_length}" />
</geometry>
<material name="yellow">
<color rgba="0.8 0.5 0.0 0.5" />
</material>
</visual>
</link>
<joint name="support2base" type="fixed">
<parent link="base_link" />
<child link="support"/>
<origin xyz="${joint_support_x} ${joint_support_y} ${joint_support_z}" rpy="0 0 0" />
</joint>
<link name="laser">
<visual>
<geometry>
<cylinder radius="${laser_radius}" length="${laser_length}" />
</geometry>
<material name="black">
<color rgba="0 0 0 0.5" />
</material>
</visual>
</link>
<joint name="laser2support" type="fixed">
<parent link="support" />
<child link="laser"/>
<origin xyz="${joint_laser_x} ${joint_laser_y} ${joint_laser_z}" rpy="0 0 0" />
</joint>
</robot>
保存退出,打开终端:
cd .. && mkdir launch
cd launch/ && touch robot_test.launch && code robot_test.launch
将下列代码粘贴进去:
<!-- @file: robot_test.launch -->
<launch>
<param name="robot_description" command="$(find xacro)/xacro $(find robot_description_test)/urdf/robot.urdf.xacro" />
<node pkg="joint_state_publisher" name="joint_state_publisher" type="joint_state_publisher" />
<node pkg="robot_state_publisher" name="robot_state_publisher" type="robot_state_publisher" />
</launch>
保存退出,打开终端:
cd ../../../ && echo 'source /home/m0rtzz/Workspaces/navigation_entity_test_ws/devel/setup.bash' >> ~/.bashrc && source ~/.bashrc
测试一下:
roslaunch robot_description_test robot_test.launch
之后Ctrl+Alt+T
打开一个新的终端,输入:
rviz
将Fixed Frame
设置为base_footprint
:
Add
一个RobotModel
:
Add
一个TF
:
cd src/entity_test/ && mkdir launch && cd launch/
touch gmapping.launch && code gmapping.launch
将下列代码粘贴进去:
<!-- @file: gmapping.launch -->
<launch>
<node pkg="gmapping" type="slam_gmapping" name="slam_gmapping" output="screen">
<remap from="scan" to="scan"/>
<param name="base_frame" value="base_footprint"/><!--底盘坐标系-->
<param name="odom_frame" value="odom"/> <!--里程计坐标系-->
<param name="map_update_interval" value="5.0"/>
<param name="maxUrange" value="16.0"/>
<param name="sigma" value="0.05"/>
<param name="kernelSize" value="1"/>
<param name="lstep" value="0.05"/>
<param name="astep" value="0.05"/>
<param name="iterations" value="5"/>
<param name="lsigma" value="0.075"/>
<param name="ogain" value="3.0"/>
<param name="lskip" value="0"/>
<param name="srr" value="0.1"/>
<param name="srt" value="0.2"/>
<param name="str" value="0.1"/>
<param name="stt" value="0.2"/>
<param name="linearUpdate" value="1.0"/>
<param name="angularUpdate" value="0.5"/>
<param name="temporalUpdate" value="3.0"/>
<param name="resampleThreshold" value="0.5"/>
<param name="particles" value="30"/>
<param name="xmin" value="-50.0"/>
<param name="ymin" value="-50.0"/>
<param name="xmax" value="50.0"/>
<param name="ymax" value="50.0"/>
<param name="delta" value="0.05"/>
<param name="llsamplerange" value="0.01"/>
<param name="llsamplestep" value="0.01"/>
<param name="lasamplerange" value="0.005"/>
<param name="lasamplestep" value="0.005"/>
</node>
</launch>
cd .. && mkdir map
cd launch/ && touch map_save.launch && code map_save.launch
将下列代码粘贴进去:
<!-- @file: map_save.launch -->
<launch>
<arg name="filename" value="$(find entity_test)/map/nav" />
<node name="map_save" pkg="map_server" type="map_saver" args="-f $(arg filename)" />
</launch>
touch map_server.launch && code map_server.launch
将下列代码粘贴进去:
<!-- @file: map_server.launch -->
<launch>
<!-- 设置地图的配置文件 -->
<arg name="map" default="nav.yaml" />
<!-- 运行地图服务器,并且加载设置的地图-->
<node name="map_server" pkg="map_server" type="map_server" args="$(find entity_test)/map/$(arg map)"/>
</launch>
touch amcl.launch && code amcl.launch
将下列代码粘贴进去:
<!-- @file: amcl.launch -->
<launch>
<node pkg="amcl" type="amcl" name="amcl" output="screen">
<!-- Publish scans from best pose at a max of 10 Hz -->
<param name="odom_model_type" value="diff"/><!-- 里程计模式为差分 -->
<param name="odom_alpha5" value="0.1"/>
<param name="transform_tolerance" value="0.2" />
<param name="gui_publish_rate" value="10.0"/>
<param name="laser_max_beams" value="30"/>
<param name="min_particles" value="500"/>
<param name="max_particles" value="5000"/>
<param name="kld_err" value="0.05"/>
<param name="kld_z" value="0.99"/>
<param name="odom_alpha1" value="0.2"/>
<param name="odom_alpha2" value="0.2"/>
<!-- translation std dev, m -->
<param name="odom_alpha3" value="0.8"/>
<param name="odom_alpha4" value="0.2"/>
<param name="laser_z_hit" value="0.5"/>
<param name="laser_z_short" value="0.05"/>
<param name="laser_z_max" value="0.05"/>
<param name="laser_z_rand" value="0.5"/>
<param name="laser_sigma_hit" value="0.2"/>
<param name="laser_lambda_short" value="0.1"/>
<param name="laser_lambda_short" value="0.1"/>
<param name="laser_model_type" value="likelihood_field"/>
<!-- <param name="laser_model_type" value="beam"/> -->
<param name="laser_likelihood_max_dist" value="2.0"/>
<param name="update_min_d" value="0.2"/>
<param name="update_min_a" value="0.5"/>
<param name="odom_frame_id" value="odom"/><!-- 里程计坐标系 -->
<param name="base_frame_id" value="base_footprint"/><!-- 添加机器人基坐标系 -->
<param name="global_frame_id" value="map"/><!-- 添加地图坐标系 -->
</node>
</launch>
cd .. && mkdir param && cd param/ && touch {costmap_common_params.yaml,local_costmap_params.yaml,global_costmap_params.yaml,base_local_planner_params.yaml} && code .
将下列几个代码分别粘贴进去:
# @file: base_local_planner_params.yaml
TrajectoryPlannerROS:
# Robot Configuration Parameters
max_vel_x: 0.5 # X 方向最大速度
min_vel_x: 0.1 # X 方向最小速速
max_vel_theta: 1.0 #
min_vel_theta: -1.0
min_in_place_vel_theta: 1.0
acc_lim_x: 1.0 # X 加速限制
acc_lim_y: 0.0 # Y 加速限制
acc_lim_theta: 0.6 # 角速度加速限制
# Goal Tolerance Parameters,目标公差
xy_goal_tolerance: 0.10
yaw_goal_tolerance: 0.05
# Differential-drive robot configuration
# 是否是全向移动机器人
holonomic_robot: false
# Forward Simulation Parameters,前进模拟参数
sim_time: 0.8
vx_samples: 18
vtheta_samples: 20
sim_granularity: 0.05
# @file: cost_common_params.yaml
#机器人几何参,如果机器人是圆形,设置 robot_radius,如果是其他形状设置 footprint
robot_radius: 0.12 #圆形
# footprint: [[-0.12, -0.12], [-0.12, 0.12], [0.12, 0.12], [0.12, -0.12]] #其他形状
obstacle_range: 3.0 # 用于障碍物探测,比如: 值为 3.0,意味着检测到距离小于 3 米的障碍物时,就会引入代价地图
raytrace_range: 3.5 # 用于清除障碍物,比如:值为 3.5,意味着清除代价地图中 3.5 米以外的障碍物
#膨胀半径,扩展在碰撞区域以外的代价区域,使得机器人规划路径避开障碍物
inflation_radius: 0.2
#代价比例系数,越大则代价值越小
cost_scaling_factor: 3.0
#地图类型
map_type: costmap
#导航包所需要的传感器
observation_sources: scan
#对传感器的坐标系和数据进行配置。这个也会用于代价地图添加和清除障碍物。例如,你可以用激光雷达传感器用于在代价地图添加障碍物,再添加kinect用于导航和清除障碍物。
scan:
{
sensor_frame: laser,
data_type: LaserScan,
topic: scan,
marking: true,
clearing: true,
}
# @file: global_costmap_params.yaml
global_costmap:
global_frame: map #地图坐标系
robot_base_frame: base_footprint #机器人坐标系
# 以此实现坐标变换
update_frequency: 1.0 #代价地图更新频率
publish_frequency: 1.0 #代价地图的发布频率
transform_tolerance: 0.5 #等待坐标变换发布信息的超时时间
static_map: true # 是否使用一个地图或者地图服务器来初始化全局代价地图,如果不使用静态地图,这个参数为false.
# @file: local_costmap_params.yaml
local_costmap:
global_frame: odom #里程计坐标系
robot_base_frame: base_footprint #机器人坐标系
update_frequency: 10.0 #代价地图更新频率
publish_frequency: 10.0 #代价地图的发布频率
transform_tolerance: 0.5 #等待坐标变换发布信息的超时时间
static_map: false #不需要静态地图,可以提升导航效果
rolling_window: true #是否使用动态窗口,默认为false,在静态的全局地图中,地图不会变化
width: 3 # 局部地图宽度 单位是 m
height: 3 # 局部地图高度 单位是 m
resolution: 0.05 # 局部地图分辨率 单位是 m,一般与静态地图分辨率保持一致
cd ../launch/ && touch move_base.launch && code move_base.launch
将下列代码粘贴进去:
<!-- @file: move_base.launch -->
<launch>
<node pkg="move_base" type="move_base" respawn="false" name="move_base" output="screen" clear_params="true">
<rosparam file="$(find nav)/param/costmap_common_params.yaml" command="load" ns="global_costmap" />
<rosparam file="$(find nav)/param/costmap_common_params.yaml" command="load" ns="local_costmap" />
<rosparam file="$(find nav)/param/local_costmap_params.yaml" command="load" />
<rosparam file="$(find nav)/param/global_costmap_params.yaml" command="load" />
<rosparam file="$(find nav)/param/base_local_planner_params.yaml" command="load" />
</node>
</launch>
touch auto_slam.launch && code auto_slam.launch
将下列代码粘贴进去:
<!-- @file: auto_slam.launch -->
<launch>
<!-- 启动SLAM节点 -->
<include file="$(find entity_test)/launch/gmapping.launch" />
<!-- 运行move_base节点 -->
<include file="$(find entity_test)/launch/move_base.launch" />
</launch>
RECOMMENDED
自定义函数及别名
打开终端输入:
tee -a ~/.bashrc >> /dev/null << 'EOF'
# 在最后(WARNING:如果安装了Anaconda3,需要加在__conda_setup之前)加入如下代码段:
# 显示git分支
function customizePrompt()
{
local none='\[\033[00m\]' # 重置所有属性到默认状态
local green='\[\033[0;32m\]' # 绿色,用于用户名和主机名
local user_at_host="${green}\[\033[1m\]\u@\h${none}" # 用户名和主机名显示为绿色并加粗
local blue='\[\033[0;34m\]' # 蓝色,用于当前工作目录
local git_branch_color='\[\033[1;43;37m\]' # 黄色背景,白色字体,用于git分支
local command_prompt='$' # 命令提示符
if [ $UID -eq 0 ]; then
command_prompt='#'
fi
local working_directory="${blue}\[\033[1m\]\w${none}" # 当前工作目录显示为蓝色并加粗
# 使用__git_ps1函数来显示当前git分支,使用自定义的颜色
echo "${user_at_host}:${working_directory}\$(__git_ps1 \" ${git_branch_color}[%s]${none} \")${command_prompt} "
}
export PS1="$(customizePrompt)"
# 复制上一个命令到系统剪切板,sudo apt install -y xsel
function copyLastCommand()
{
# fc获取最后执行的命令,echo发送给xsel复制到剪切板
# -nl以列表形式显示命令历史,但不包括命令编号,-1只获取最近一条命令
echo -n $(fc -nl -1) | xsel --clipboard --input
}
# 创建一个别名,若与你的其他软件包内置命令冲突,请自行更换别名
alias clc='copyLastCommand'
# 计划关机(为了使得`.zshrc`也能用此函数,就没有使用`read -p`)
function powerOff() {
sudo shutdown -c # 取消之前的计划关机
echo "已取消之前的计划关机,下面进行新的计划~"
echo "请输入关机的年份(YYYY):"
read shutdown_year
echo "请输入关机的月份(MM):"
read shutdown_month
echo "请输入关机的日期(DD):"
read shutdown_day
echo "请输入关机的小时(24小时制,HH):"
read shutdown_hour
echo "请输入关机的分钟(MM):"
read shutdown_minute
shutdown_datetime="${shutdown_year}-${shutdown_month}-${shutdown_day} ${shutdown_hour}:${shutdown_minute}"
# 检查日期时间是否合法
date -d "${shutdown_datetime}" &>/dev/null
if [ $? -ne 0 ]; then
echo "Error:请输入合法的日期和时间格式(YYYY-MM-DD HH:MM)"
return 1
fi
# 计算等待时间(秒数)
current_timestamp=$(date +%s)
shutdown_timestamp=$(date -d "${shutdown_datetime}" +%s)
wait_time=$(echo "(${shutdown_timestamp} - ${current_timestamp}) / 60" | bc)
# 检查是否为未来时间
if [ ${wait_time} -le 0 ]; then
echo "Error:请输入未来的日期和时间"
return 1
fi
echo "计划在 ${shutdown_datetime} 关机"
sudo shutdown -h +${wait_time}
}
alias po='powerOff'
# 有效解决Anaconda3激活虚拟环境后使用`pip install`或`pip3 install`会安装到其他虚拟环境的问题
alias pip='python3 -m pip'
alias pip3='python3 -m pip'
EOF
source ~/.bashrc
这样就可以更清晰的显示git
分支~
设置${HOME}下的文件夹为英文
export LANG=en_US
xdg-user-dirs-gtk-update
编辑选择右边的Update Names
:
之后执行以下语句:
export LANG=zh_CN
reboot
勾选不要再次询问我,并选择保留旧的名称:
同步双系统时间
sudo apt install -y ntpdate
sudo ntpdate time.windows.com
timedatectl set-local-rtc 1 --adjust-system-clock
Software
推荐一些Linux
办公常用的软件(包括wine
环境下,全部下载deb
格式的安装包,系统架构可通过命令uname -a
查看):
搜狗输入法(下载安装包后,官方会跳转至安装教程,严格按照步骤执行)
Visual Studio Code(推荐打开Settings Sync
,换电脑时设置可以同步)
仅支持Ubuntu20.04
,需安装依赖包:
Linux原生微信_需要安装星火应用商店(最近的版本好像使用Fcitx输入法时中文输入有问题)
Ubuntu20.04
以下版本或者不想安装星火应用商店的用户可安装Flatpak
打包的微信(需安装Flatpak
,下文有Flatpak
安装教程):
解决基于Fcitx5
的搜狗输入法无法在Flatpak
版微信中进行中文输入的问题:https://github.com/web1n/wechat-universal-flatpak/issues/33#issuecomment-2222259823
wget -q --show-progress https://github.com/web1n/wechat-universal-flatpak/releases/latest/download/com.tencent.WeChat-x86_64.flatpak -O com.tencent.WeChat-x86_64.flatpak && sudo flatpak install ./com.tencent.WeChat-x86_64.flatpak
可以水平和垂直分割的终端:
sudo apt install -y terminator
Neovim
:
sudo apt install -y neovim && \
echo '/usr/bin/nvim' | sudo update-alternatives --config editor
trash
命令:
sudo apt install -y trash-cli
tree
命令:
sudo apt install -y tree
查看系统信息:
sudo apt install -y neofetch
或安装用C语言写的更快的fastfetch
:
wget -q --show-progress https://github.com/fastfetch-cli/fastfetch/releases/latest/download/fastfetch-linux-amd64.deb -O fastfetch-linux-amd64.deb && sudo apt install -y ./fastfetch-linux-amd64.deb
rar
文件解压工具:
sudo apt install -y unrar
解决不能观看MP4
文件:
sudo apt update -y
sudo apt install -y libdvdnav-dev libdvdread-dev gstreamer1.0-plugins-bad gstreamer1.0-plugins-ugly libdvd-pkg
sudo apt install -y ubuntu-restricted-extras
sudo dpkg-reconfigure libdvd-pkg
系统优化:
sudo apt update -y
sudo apt install -y gnome-tweak-tool
命令启动:
gnome-tweaks
或:
剪贴板管理工具:
sudo add-apt-repository ppa:diodon-team/stable
sudo apt update -y && sudo apt install -y diodon
然后使用刚才安装的优化工具将diodon
设置为开机自启动:
这样就实现了类似于Windows
下Win + V
的剪贴板功能:
另外可使用中科大源反向代理的Canonical PPA仓库
:
[!TIP]
将
/etc/apt/sources.list.d
下.list
文件中的http://ppa.launchpad.net
替换为https://launchpad.proxy.ustclug.org
即可,建议替换前先sudo cp /etc/apt/sources.list.d/your-file.list /etc/apt/sources.list.d/your-file.list.save
备份一下(请自行替换文件名)
Flatpak
:
Ubuntu 18.10 (Cosmic Cuttlefish) or later:
sudo apt install -y flatpak
Older Ubuntu versions:
sudo add-apt-repository ppa:flatpak/stable
sudo apt update -y
sudo apt install -y flatpak
FlatHub
上交镜像源:
https://mirror.sjtu.edu.cn/docs/flathub
切换之后更新Flatpak
应用将加速:
sudo flatpak update
火狐浏览器优化:
地址栏输入:
about:config
full-screen-api.warning.timeout
设置为0
~
full-screen-api.transition-duration.enter
和
full-screen-api.transition-duration.leave
都设置为0 0
~
browser.search.openintab
browser.urlbar.openintab
browser.tabs.loadBookmarksInTabs
都设置为true
~
browser.urlbar.trimURLs
设置为false
~
OPTIONAL
启动菜单的默认项
sudo gedit /etc/default/grub
改一下GRUB_DEFAULT=
后边的数字,默认是0
,Windows
是第n
个就设置为n-1
保存后关闭,打开终端,输入:
sudo update-grub
reboot
重启后问题解决~
使在桌面上右键打开终端时进入Desktop目录(Ubuntu18.04)
https://launchpad.net/ubuntu/+source/gnome-terminal/3.28.1-1ubuntu1
下载源码包:
wget -q --show-progress https://launchpad.net/ubuntu/+archive/primary/+sourcefiles/gnome-terminal/3.28.1-1ubuntu1/gnome-terminal_3.28.1.orig.tar.xz -O gnome-terminal_3.28.1.orig.tar.xz && \
wget -q --show-progress https://launchpad.net/ubuntu/+archive/primary/+sourcefiles/gnome-terminal/3.28.1-1ubuntu1/gnome-terminal_3.28.1-1ubuntu1.debian.tar.xz -O gnome-terminal_3.28.1-1ubuntu1.debian.tar.xz
解压:
tar -xf gnome-terminal_3.28.1.orig.tar.xz && \
tar -xf gnome-terminal_3.28.1-1ubuntu1.debian.tar.xz
ls debian/ gnome-terminal-3.28.1/
cp -r debian/* gnome-terminal-3.28.1/
cd gnome-terminal-3.28.1/ && git apply patches/*.patch
安装依赖:
sudo apt install -y intltool libvte-2.91-dev gsettings-desktop-schemas-dev uuid-dev libdconf-dev libpcre2-dev libgconf2-dev libxml2-utils gnome-shell libnautilus-extension-dev itstool yelp-tools pcre2-utils
打开src/
下的terminal-nautilus.c
,找到:
static inline gboolean
desktop_opens_home_dir (TerminalNautilus *nautilus)
{
#if 0
return _client_get_bool (gconf_client,
"/apps/nautilus-open-terminal/desktop_opens_home_dir",
NULL);
#endif
return TRUE;
}
改为
static inline gboolean
desktop_opens_home_dir (TerminalNautilus *nautilus)
{
#if 0
return _client_get_bool (gconf_client,
"/apps/nautilus-open-terminal/desktop_opens_home_dir",
NULL);
#endif
return FALSE; // here
}
src/
下打开终端
cd ..
autoreconf --install
autoconf
./configure --prefix='/usr'
sudo make -j$(nproc)
sudo make check -j$(nproc)
sudo make -j$(nproc) install
sudo cp /usr/lib/nautilus/extensions-3.0/libterminal-nautilus.so /usr/lib/x86_64-linux-gnu/nautilus/extensions-3.0/
reboot
问题解决!
protobuf-2.6.1
sudo apt install -y libtool
https://github.com/protocolbuffers/protobuf/releases/download/v2.6.1/protobuf-2.6.1.tar.gz
或镜像:
wget -q --show-progress https://raw.gitcode.com/M0rtzz/protobuf-2.6.1/assets/199 -O protobuf-2.6.1.tar.gz
tar -zxvf protobuf-2.6.1.tar.gz && cd protobuf-2.6.1
# 可有可无(Reference: https://github.com/protocolbuffers/protobuf/blob/v2.6.1/README.md?plain=1#L11-L21)
./autogen.sh
./configure --prefix=/usr/local/protobuf
sudo make -j$(nproc)
养成有make check/test
就执行的好习惯:
sudo make check -j$(nproc)
sudo make -j$(nproc) install
sudo tee -a /etc/profile > /dev/null << 'EOF'
# protobuf
export PATH=${PATH}:/usr/local/protobuf/bin
export LIBRARY_PATH=${LIBRARY_PATH}:/usr/local/protobuf/lib
export PKG_CONFIG_PATH=${PKG_CONFIG_PATH}:/usr/local/protobuf/lib/pkgconfig
EOF
source /etc/profile
sudo tee /etc/ld.so.conf.d/protobuf.conf > /dev/null << EOF
/usr/local/protobuf/lib
EOF
sudo ldconfig
OpenBLAS
软件源安装 (RECOMMENDED)
sudo apt update -y && sudo apt install -y libopenblas-dev
源码编译安装 (NOT RECOMMENDED)
sudo apt install -y gfortran gcc-arm-linux-gnueabihf libnewlib-arm-none-eabi libc6-dev-i386
git clone https://github.com/OpenMathLib/OpenBLAS.git OpenBLAS
或公益加速源:
git clone https://gitclone.com/github.com/OpenMathLib/OpenBLAS.git OpenBLAS
cd OpenBLAS/
sudo make -j$(nproc)
sudo make PREFIX=/usr/local install
查看版本:
grep OPENBLAS_VERSION /usr/local/include/openblas_config.h
seetaface2工作空间
echo 'source /home/m0rtzz/Workspaces/catkin_ws/devel/setup.bash' >> ~/.bashrc
source ~/.bashrc
解决办法:
加入工作空间下lib文件夹的路径:
echo 'export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/home/m0rtzz/Workspaces/catkin_ws/lib' >> ~/.bashrc
source ~/.bashrc
解决!
报错:
Gtk-Message: 15:22:30.610: Failed to load module "canberra-gtk-module"
解决方法:
sudo apt install -y 'libcanberra-gtk*'
caffe (EOL)
Reference:
https://blog.csdn.net/weixin_39161727/article/details/120136500
首先安装依赖:
sudo apt install -y libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler libatlas-base-dev libgflags-dev libgoogle-glog-dev liblmdb-dev libboost-all-dev
git clone https://github.com/BVLC/caffe.git caffe
或公益加速源:
git clone https://ghp.ci/https://github.com/BVLC/caffe.git caffe
cd caffe/ && sudo cp Makefile.config.example Makefile.config
gedit Makefile.config
Makefile.config
## Refer to http://caffe.berkeleyvision.org/installation.html
# Contributions simplifying and improving our build system are welcome!
# cuDNN acceleration switch (uncomment to build with cuDNN).
USE_CUDNN := 1
# CPU-only switch (uncomment to build without GPU support).
# CPU_ONLY := 1
# uncomment to disable IO dependencies and corresponding data layers
# USE_OPENCV := 0
# USE_LEVELDB := 0
# USE_LMDB := 0
# This code is taken from https://github.com/sh1r0/caffe-android-lib
# USE_HDF5 := 0
# uncomment to allow MDB_NOLOCK when reading LMDB files (only if necessary)
# You should not set this flag if you will be reading LMDBs with any
# possibility of simultaneous read and write
# ALLOW_LMDB_NOLOCK := 1
# Uncomment if you're using OpenCV 3
OPENCV_VERSION := 3
# To customize your choice of compiler, uncomment and set the following.
# N.B. the default for Linux is g++ and the default for OSX is clang++
CUSTOM_CXX := g++
# CUDA directory contains bin/ and lib/ directories that we need.
CUDA_DIR := /usr/local/cuda
# On Ubuntu 14.04, if cuda tools are installed via
# "sudo apt-get install nvidia-cuda-toolkit" then use this instead:
# CUDA_DIR := /usr
# CUDA architecture setting: going with all of them.
# For CUDA < 6.0, comment the *_50 through *_61 lines for compatibility.
# For CUDA < 8.0, comment the *_60 and *_61 lines for compatibility.
# For CUDA >= 9.0, comment the *_20 and *_21 lines for compatibility.
CUDA_ARCH := #-gencode arch=compute_20,code=sm_20 \
#-gencode arch=compute_20,code=sm_21 \
#-gencode arch=compute_30,code=sm_30 \
-gencode arch=compute_35,code=sm_35 \
-gencode arch=compute_50,code=sm_50 \
-gencode arch=compute_52,code=sm_52 \
-gencode arch=compute_60,code=sm_60 \
-gencode arch=compute_61,code=sm_61 \
-gencode arch=compute_61,code=compute_61
# BLAS choice:
# atlas for ATLAS (default)
# mkl for MKL
# open for OpenBlas
BLAS := open
# Custom (MKL/ATLAS/OpenBLAS) include and lib directories.
# Leave commented to accept the defaults for your choice of BLAS
# (which should work)!
# BLAS_INCLUDE := /path/to/your/blas
# BLAS_LIB := /path/to/your/blas
# Homebrew puts openblas in a directory that is not on the standard search path
# BLAS_INCLUDE := $(shell brew --prefix openblas)/include
# BLAS_LIB := $(shell brew --prefix openblas)/lib
# This is required only if you will compile the matlab interface.
# MATLAB directory should contain the mex binary in /bin.
# MATLAB_DIR := /usr/local
# MATLAB_DIR := /Applications/MATLAB_R2012b.app
# NOTE: this is required only if you will compile the python interface.
# We need to be able to find Python.h and numpy/arrayobject.h.
PYTHON_INCLUDE := /usr/include/python2.7 \
/usr/lib/python2.7/dist-packages/numpy/core/include
# Anaconda Python distribution is quite popular. Include path:
# Verify anaconda location, sometimes it's in root.
# ANACONDA_HOME := $(HOME)/anaconda
# PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
# $(ANACONDA_HOME)/include/python2.7 \
# $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include
# Uncomment to use Python 3 (default is Python 2)
PYTHON_LIBRARIES := boost_python3 python3.6m
PYTHON_INCLUDE := /usr/include/python3.6m \
/usr/lib/python3.6/dist-packages/numpy/core/include
# We need to be able to find libpythonX.X.so or .dylib.
PYTHON_LIB := /usr/lib
# PYTHON_LIB := $(ANACONDA_HOME)/lib
# Homebrew installs numpy in a non standard path (keg only)
# PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include
# PYTHON_LIB += $(shell brew --prefix numpy)/lib
# Uncomment to support layers written in Python (will link against Python libs)
WITH_PYTHON_LAYER := 1
# Whatever else you find you need goes here.
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial/
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu/hdf5/serial/
# If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies
# INCLUDE_DIRS += $(shell brew --prefix)/include
# LIBRARY_DIRS += $(shell brew --prefix)/lib
# NCCL acceleration switch (uncomment to build with NCCL)
# https://github.com/NVIDIA/nccl (last tested version: v1.2.3-1+cuda8.0)
# USE_NCCL := 1
# Uncomment to use `pkg-config` to specify OpenCV library paths.
# (Usually not necessary -- OpenCV libraries are normally installed in one of the above $LIBRARY_DIRS.)
# USE_PKG_CONFIG := 1
# N.B. both build and distribute dirs are cleared on `make clean`
BUILD_DIR := build
DISTRIBUTE_DIR := distribute
# Uncomment for debugging. Does not work on OSX due to https://github.com/BVLC/caffe/issues/171
# DEBUG := 1
# The ID of the GPU that 'make runtest' will use to run unit tests.
TEST_GPUID := 0
# enable pretty build (comment to see full commands)
Q ?= @
gedit Makefile
Makefile
PROJECT := caffe
CONFIG_FILE := Makefile.config
# Explicitly check for the config file, otherwise make -k will proceed anyway.
ifeq ($(wildcard $(CONFIG_FILE)),)
$(error $(CONFIG_FILE) not found. See $(CONFIG_FILE).example.)
endif
include $(CONFIG_FILE)
BUILD_DIR_LINK := $(BUILD_DIR)
ifeq ($(RELEASE_BUILD_DIR),)
RELEASE_BUILD_DIR := .$(BUILD_DIR)_release
endif
ifeq ($(DEBUG_BUILD_DIR),)
DEBUG_BUILD_DIR := .$(BUILD_DIR)_debug
endif
DEBUG ?= 0
ifeq ($(DEBUG), 1)
BUILD_DIR := $(DEBUG_BUILD_DIR)
OTHER_BUILD_DIR := $(RELEASE_BUILD_DIR)
else
BUILD_DIR := $(RELEASE_BUILD_DIR)
OTHER_BUILD_DIR := $(DEBUG_BUILD_DIR)
endif
# All of the directories containing code.
SRC_DIRS := $(shell find * -type d -exec bash -c "find {} -maxdepth 1 \
\( -name '*.cpp' -o -name '*.proto' \) | grep -q ." \; -print)
# The target shared library name
LIBRARY_NAME := $(PROJECT)
LIB_BUILD_DIR := $(BUILD_DIR)/lib
STATIC_NAME := $(LIB_BUILD_DIR)/lib$(LIBRARY_NAME).a
DYNAMIC_VERSION_MAJOR := 1
DYNAMIC_VERSION_MINOR := 0
DYNAMIC_VERSION_REVISION := 0
DYNAMIC_NAME_SHORT := lib$(LIBRARY_NAME).so
#DYNAMIC_SONAME_SHORT := $(DYNAMIC_NAME_SHORT).$(DYNAMIC_VERSION_MAJOR)
DYNAMIC_VERSIONED_NAME_SHORT := $(DYNAMIC_NAME_SHORT).$(DYNAMIC_VERSION_MAJOR).$(DYNAMIC_VERSION_MINOR).$(DYNAMIC_VERSION_REVISION)
DYNAMIC_NAME := $(LIB_BUILD_DIR)/$(DYNAMIC_VERSIONED_NAME_SHORT)
COMMON_FLAGS += -DCAFFE_VERSION=$(DYNAMIC_VERSION_MAJOR).$(DYNAMIC_VERSION_MINOR).$(DYNAMIC_VERSION_REVISION)
##############################
# Get all source files
##############################
# CXX_SRCS are the source files excluding the test ones.
CXX_SRCS := $(shell find src/$(PROJECT) ! -name "test_*.cpp" -name "*.cpp")
# CU_SRCS are the cuda source files
CU_SRCS := $(shell find src/$(PROJECT) ! -name "test_*.cu" -name "*.cu")
# TEST_SRCS are the test source files
TEST_MAIN_SRC := src/$(PROJECT)/test/test_caffe_main.cpp
TEST_SRCS := $(shell find src/$(PROJECT) -name "test_*.cpp")
TEST_SRCS := $(filter-out $(TEST_MAIN_SRC), $(TEST_SRCS))
TEST_CU_SRCS := $(shell find src/$(PROJECT) -name "test_*.cu")
GTEST_SRC := src/gtest/gtest-all.cpp
# TOOL_SRCS are the source files for the tool binaries
TOOL_SRCS := $(shell find tools -name "*.cpp")
# EXAMPLE_SRCS are the source files for the example binaries
EXAMPLE_SRCS := $(shell find examples -name "*.cpp")
# BUILD_INCLUDE_DIR contains any generated header files we want to include.
BUILD_INCLUDE_DIR := $(BUILD_DIR)/src
# PROTO_SRCS are the protocol buffer definitions
PROTO_SRC_DIR := src/$(PROJECT)/proto
PROTO_SRCS := $(wildcard $(PROTO_SRC_DIR)/*.proto)
# PROTO_BUILD_DIR will contain the .cc and obj files generated from
# PROTO_SRCS; PROTO_BUILD_INCLUDE_DIR will contain the .h header files
PROTO_BUILD_DIR := $(BUILD_DIR)/$(PROTO_SRC_DIR)
PROTO_BUILD_INCLUDE_DIR := $(BUILD_INCLUDE_DIR)/$(PROJECT)/proto
# NONGEN_CXX_SRCS includes all source/header files except those generated
# automatically (e.g., by proto).
NONGEN_CXX_SRCS := $(shell find \
src/$(PROJECT) \
include/$(PROJECT) \
python/$(PROJECT) \
matlab/+$(PROJECT)/private \
examples \
tools \
-name "*.cpp" -or -name "*.hpp" -or -name "*.cu" -or -name "*.cuh")
LINT_SCRIPT := scripts/cpp_lint.py
LINT_OUTPUT_DIR := $(BUILD_DIR)/.lint
LINT_EXT := lint.txt
LINT_OUTPUTS := $(addsuffix .$(LINT_EXT), $(addprefix $(LINT_OUTPUT_DIR)/, $(NONGEN_CXX_SRCS)))
EMPTY_LINT_REPORT := $(BUILD_DIR)/.$(LINT_EXT)
NONEMPTY_LINT_REPORT := $(BUILD_DIR)/$(LINT_EXT)
# PY$(PROJECT)_SRC is the python wrapper for $(PROJECT)
PY$(PROJECT)_SRC := python/$(PROJECT)/_$(PROJECT).cpp
PY$(PROJECT)_SO := python/$(PROJECT)/_$(PROJECT).so
PY$(PROJECT)_HXX := include/$(PROJECT)/layers/python_layer.hpp
# MAT$(PROJECT)_SRC is the mex entrance point of matlab package for $(PROJECT)
MAT$(PROJECT)_SRC := matlab/+$(PROJECT)/private/$(PROJECT)_.cpp
ifneq ($(MATLAB_DIR),)
MAT_SO_EXT := $(shell $(MATLAB_DIR)/bin/mexext)
endif
MAT$(PROJECT)_SO := matlab/+$(PROJECT)/private/$(PROJECT)_.$(MAT_SO_EXT)
##############################
# Derive generated files
##############################
# The generated files for protocol buffers
PROTO_GEN_HEADER_SRCS := $(addprefix $(PROTO_BUILD_DIR)/, \
$(notdir ${PROTO_SRCS:.proto=.pb.h}))
PROTO_GEN_HEADER := $(addprefix $(PROTO_BUILD_INCLUDE_DIR)/, \
$(notdir ${PROTO_SRCS:.proto=.pb.h}))
PROTO_GEN_CC := $(addprefix $(BUILD_DIR)/, ${PROTO_SRCS:.proto=.pb.cc})
PY_PROTO_BUILD_DIR := python/$(PROJECT)/proto
PY_PROTO_INIT := python/$(PROJECT)/proto/__init__.py
PROTO_GEN_PY := $(foreach file,${PROTO_SRCS:.proto=_pb2.py}, \
$(PY_PROTO_BUILD_DIR)/$(notdir $(file)))
# The objects corresponding to the source files
# These objects will be linked into the final shared library, so we
# exclude the tool, example, and test objects.
CXX_OBJS := $(addprefix $(BUILD_DIR)/, ${CXX_SRCS:.cpp=.o})
CU_OBJS := $(addprefix $(BUILD_DIR)/cuda/, ${CU_SRCS:.cu=.o})
PROTO_OBJS := ${PROTO_GEN_CC:.cc=.o}
OBJS := $(PROTO_OBJS) $(CXX_OBJS) $(CU_OBJS)
# tool, example, and test objects
TOOL_OBJS := $(addprefix $(BUILD_DIR)/, ${TOOL_SRCS:.cpp=.o})
TOOL_BUILD_DIR := $(BUILD_DIR)/tools
TEST_CXX_BUILD_DIR := $(BUILD_DIR)/src/$(PROJECT)/test
TEST_CU_BUILD_DIR := $(BUILD_DIR)/cuda/src/$(PROJECT)/test
TEST_CXX_OBJS := $(addprefix $(BUILD_DIR)/, ${TEST_SRCS:.cpp=.o})
TEST_CU_OBJS := $(addprefix $(BUILD_DIR)/cuda/, ${TEST_CU_SRCS:.cu=.o})
TEST_OBJS := $(TEST_CXX_OBJS) $(TEST_CU_OBJS)
GTEST_OBJ := $(addprefix $(BUILD_DIR)/, ${GTEST_SRC:.cpp=.o})
EXAMPLE_OBJS := $(addprefix $(BUILD_DIR)/, ${EXAMPLE_SRCS:.cpp=.o})
# Output files for automatic dependency generation
DEPS := ${CXX_OBJS:.o=.d} ${CU_OBJS:.o=.d} ${TEST_CXX_OBJS:.o=.d} \
${TEST_CU_OBJS:.o=.d} $(BUILD_DIR)/${MAT$(PROJECT)_SO:.$(MAT_SO_EXT)=.d}
# tool, example, and test bins
TOOL_BINS := ${TOOL_OBJS:.o=.bin}
EXAMPLE_BINS := ${EXAMPLE_OBJS:.o=.bin}
# symlinks to tool bins without the ".bin" extension
TOOL_BIN_LINKS := ${TOOL_BINS:.bin=}
# Put the test binaries in build/test for convenience.
TEST_BIN_DIR := $(BUILD_DIR)/test
TEST_CU_BINS := $(addsuffix .testbin,$(addprefix $(TEST_BIN_DIR)/, \
$(foreach obj,$(TEST_CU_OBJS),$(basename $(notdir $(obj))))))
TEST_CXX_BINS := $(addsuffix .testbin,$(addprefix $(TEST_BIN_DIR)/, \
$(foreach obj,$(TEST_CXX_OBJS),$(basename $(notdir $(obj))))))
TEST_BINS := $(TEST_CXX_BINS) $(TEST_CU_BINS)
# TEST_ALL_BIN is the test binary that links caffe dynamically.
TEST_ALL_BIN := $(TEST_BIN_DIR)/test_all.testbin
##############################
# Derive compiler warning dump locations
##############################
WARNS_EXT := warnings.txt
CXX_WARNS := $(addprefix $(BUILD_DIR)/, ${CXX_SRCS:.cpp=.o.$(WARNS_EXT)})
CU_WARNS := $(addprefix $(BUILD_DIR)/cuda/, ${CU_SRCS:.cu=.o.$(WARNS_EXT)})
TOOL_WARNS := $(addprefix $(BUILD_DIR)/, ${TOOL_SRCS:.cpp=.o.$(WARNS_EXT)})
EXAMPLE_WARNS := $(addprefix $(BUILD_DIR)/, ${EXAMPLE_SRCS:.cpp=.o.$(WARNS_EXT)})
TEST_WARNS := $(addprefix $(BUILD_DIR)/, ${TEST_SRCS:.cpp=.o.$(WARNS_EXT)})
TEST_CU_WARNS := $(addprefix $(BUILD_DIR)/cuda/, ${TEST_CU_SRCS:.cu=.o.$(WARNS_EXT)})
ALL_CXX_WARNS := $(CXX_WARNS) $(TOOL_WARNS) $(EXAMPLE_WARNS) $(TEST_WARNS)
ALL_CU_WARNS := $(CU_WARNS) $(TEST_CU_WARNS)
ALL_WARNS := $(ALL_CXX_WARNS) $(ALL_CU_WARNS)
EMPTY_WARN_REPORT := $(BUILD_DIR)/.$(WARNS_EXT)
NONEMPTY_WARN_REPORT := $(BUILD_DIR)/$(WARNS_EXT)
##############################
# Derive include and lib directories
##############################
CUDA_INCLUDE_DIR := $(CUDA_DIR)/include
CUDA_LIB_DIR :=
# add <cuda>/lib64 only if it exists
ifneq ("$(wildcard $(CUDA_DIR)/lib64)","")
CUDA_LIB_DIR += $(CUDA_DIR)/lib64
endif
CUDA_LIB_DIR += $(CUDA_DIR)/lib
INCLUDE_DIRS += $(BUILD_INCLUDE_DIR) ./src ./include
ifneq ($(CPU_ONLY), 1)
INCLUDE_DIRS += $(CUDA_INCLUDE_DIR)
LIBRARY_DIRS += $(CUDA_LIB_DIR)
LIBRARIES := cudart cublas curand
endif
LIBRARIES += glog gflags protobuf boost_system boost_filesystem m hdf5_serial_hl hdf5_serial
# handle IO dependencies
USE_LEVELDB ?= 1
USE_LMDB ?= 1
# This code is taken from https://github.com/sh1r0/caffe-android-lib
USE_HDF5 ?= 1
USE_OPENCV ?= 1
ifeq ($(USE_LEVELDB), 1)
LIBRARIES += leveldb snappy
endif
ifeq ($(USE_LMDB), 1)
LIBRARIES += lmdb
endif
# This code is taken from https://github.com/sh1r0/caffe-android-lib
ifeq ($(USE_HDF5), 1)
LIBRARIES += hdf5_hl hdf5
endif
ifeq ($(USE_OPENCV), 1)
LIBRARIES += opencv_core opencv_highgui opencv_imgproc
ifeq ($(OPENCV_VERSION), 3)
LIBRARIES += opencv_imgcodecs
endif
endif
PYTHON_LIBRARIES ?= boost_python python2.7
WARNINGS := -Wall -Wno-sign-compare
##############################
# Set build directories
##############################
DISTRIBUTE_DIR ?= distribute
DISTRIBUTE_SUBDIRS := $(DISTRIBUTE_DIR)/bin $(DISTRIBUTE_DIR)/lib
DIST_ALIASES := dist
ifneq ($(strip $(DISTRIBUTE_DIR)),distribute)
DIST_ALIASES += distribute
endif
ALL_BUILD_DIRS := $(sort $(BUILD_DIR) $(addprefix $(BUILD_DIR)/, $(SRC_DIRS)) \
$(addprefix $(BUILD_DIR)/cuda/, $(SRC_DIRS)) \
$(LIB_BUILD_DIR) $(TEST_BIN_DIR) $(PY_PROTO_BUILD_DIR) $(LINT_OUTPUT_DIR) \
$(DISTRIBUTE_SUBDIRS) $(PROTO_BUILD_INCLUDE_DIR))
##############################
# Set directory for Doxygen-generated documentation
##############################
DOXYGEN_CONFIG_FILE ?= ./.Doxyfile
# should be the same as OUTPUT_DIRECTORY in the .Doxyfile
DOXYGEN_OUTPUT_DIR ?= ./doxygen
DOXYGEN_COMMAND ?= doxygen
# All the files that might have Doxygen documentation.
DOXYGEN_SOURCES := $(shell find \
src/$(PROJECT) \
include/$(PROJECT) \
python/ \
matlab/ \
examples \
tools \
-name "*.cpp" -or -name "*.hpp" -or -name "*.cu" -or -name "*.cuh" -or \
-name "*.py" -or -name "*.m")
DOXYGEN_SOURCES += $(DOXYGEN_CONFIG_FILE)
##############################
# Configure build
##############################
# Determine platform
UNAME := $(shell uname -s)
ifeq ($(UNAME), Linux)
LINUX := 1
else ifeq ($(UNAME), Darwin)
OSX := 1
OSX_MAJOR_VERSION := $(shell sw_vers -productVersion | cut -f 1 -d .)
OSX_MINOR_VERSION := $(shell sw_vers -productVersion | cut -f 2 -d .)
endif
# Linux
ifeq ($(LINUX), 1)
CXX ?= /usr/bin/g++
GCCVERSION := $(shell $(CXX) -dumpversion | cut -f1,2 -d.)
# older versions of gcc are too dumb to build boost with -Wuninitalized
ifeq ($(shell echo | awk '{exit $(GCCVERSION) < 4.6;}'), 1)
WARNINGS += -Wno-uninitialized
endif
# boost::thread is reasonably called boost_thread (compare OS X)
# We will also explicitly add stdc++ to the link target.
LIBRARIES += boost_thread stdc++
VERSIONFLAGS += -Wl,-soname,$(DYNAMIC_VERSIONED_NAME_SHORT) -Wl,-rpath,$(ORIGIN)/../lib
endif
# OS X:
# clang++ instead of g++
# libstdc++ for NVCC compatibility on OS X >= 10.9 with CUDA < 7.0
ifeq ($(OSX), 1)
CXX := /usr/bin/clang++
ifneq ($(CPU_ONLY), 1)
CUDA_VERSION := $(shell $(CUDA_DIR)/bin/nvcc -V | grep -o 'release [0-9.]*' | tr -d '[a-z ]')
ifeq ($(shell echo | awk '{exit $(CUDA_VERSION) < 7.0;}'), 1)
CXXFLAGS += -stdlib=libstdc++
LINKFLAGS += -stdlib=libstdc++
endif
# clang throws this warning for cuda headers
WARNINGS += -Wno-unneeded-internal-declaration
# 10.11 strips DYLD_* env vars so link CUDA (rpath is available on 10.5+)
OSX_10_OR_LATER := $(shell [ $(OSX_MAJOR_VERSION) -ge 10 ] && echo true)
OSX_10_5_OR_LATER := $(shell [ $(OSX_MINOR_VERSION) -ge 5 ] && echo true)
ifeq ($(OSX_10_OR_LATER),true)
ifeq ($(OSX_10_5_OR_LATER),true)
LDFLAGS += -Wl,-rpath,$(CUDA_LIB_DIR)
endif
endif
endif
# gtest needs to use its own tuple to not conflict with clang
COMMON_FLAGS += -DGTEST_USE_OWN_TR1_TUPLE=1
# boost::thread is called boost_thread-mt to mark multithreading on OS X
LIBRARIES += boost_thread-mt
# we need to explicitly ask for the rpath to be obeyed
ORIGIN := @loader_path
VERSIONFLAGS += -Wl,-install_name,@rpath/$(DYNAMIC_VERSIONED_NAME_SHORT) -Wl,-rpath,$(ORIGIN)/../../build/lib
else
ORIGIN := \$$ORIGIN
endif
# Custom compiler
ifdef CUSTOM_CXX
CXX := $(CUSTOM_CXX)
endif
# Static linking
ifneq (,$(findstring clang++,$(CXX)))
STATIC_LINK_COMMAND := -Wl,-force_load $(STATIC_NAME)
else ifneq (,$(findstring g++,$(CXX)))
STATIC_LINK_COMMAND := -Wl,--whole-archive $(STATIC_NAME) -Wl,--no-whole-archive
else
# The following line must not be indented with a tab, since we are not inside a target
$(error Cannot static link with the $(CXX) compiler)
endif
# Debugging
ifeq ($(DEBUG), 1)
COMMON_FLAGS += -DDEBUG -g -O0
NVCCFLAGS += -G
else
COMMON_FLAGS += -DNDEBUG -O2
endif
# cuDNN acceleration configuration.
ifeq ($(USE_CUDNN), 1)
LIBRARIES += cudnn
COMMON_FLAGS += -DUSE_CUDNN
endif
# NCCL acceleration configuration
ifeq ($(USE_NCCL), 1)
LIBRARIES += nccl
COMMON_FLAGS += -DUSE_NCCL
endif
# configure IO libraries
ifeq ($(USE_OPENCV), 1)
COMMON_FLAGS += -DUSE_OPENCV
endif
ifeq ($(USE_LEVELDB), 1)
COMMON_FLAGS += -DUSE_LEVELDB
endif
ifeq ($(USE_LMDB), 1)
COMMON_FLAGS += -DUSE_LMDB
ifeq ($(ALLOW_LMDB_NOLOCK), 1)
COMMON_FLAGS += -DALLOW_LMDB_NOLOCK
endif
endif
# This code is taken from https://github.com/sh1r0/caffe-android-lib
ifeq ($(USE_HDF5), 1)
COMMON_FLAGS += -DUSE_HDF5
endif
# CPU-only configuration
ifeq ($(CPU_ONLY), 1)
OBJS := $(PROTO_OBJS) $(CXX_OBJS)
TEST_OBJS := $(TEST_CXX_OBJS)
TEST_BINS := $(TEST_CXX_BINS)
ALL_WARNS := $(ALL_CXX_WARNS)
TEST_FILTER := --gtest_filter="-*GPU*"
COMMON_FLAGS += -DCPU_ONLY
endif
# Python layer support
ifeq ($(WITH_PYTHON_LAYER), 1)
COMMON_FLAGS += -DWITH_PYTHON_LAYER
LIBRARIES += $(PYTHON_LIBRARIES)
endif
# BLAS configuration (default = ATLAS)
BLAS ?= atlas
ifeq ($(BLAS), mkl)
# MKL
LIBRARIES += mkl_rt
COMMON_FLAGS += -DUSE_MKL
MKLROOT ?= /opt/intel/mkl
BLAS_INCLUDE ?= $(MKLROOT)/include
BLAS_LIB ?= $(MKLROOT)/lib $(MKLROOT)/lib/intel64
else ifeq ($(BLAS), open)
# OpenBLAS
LIBRARIES += openblas
else
# ATLAS
ifeq ($(LINUX), 1)
ifeq ($(BLAS), atlas)
# Linux simply has cblas and atlas
LIBRARIES += cblas atlas
endif
else ifeq ($(OSX), 1)
# OS X packages atlas as the vecLib framework
LIBRARIES += cblas
# 10.10 has accelerate while 10.9 has veclib
XCODE_CLT_VER := $(shell pkgutil --pkg-info=com.apple.pkg.CLTools_Executables | grep 'version' | sed 's/[^0-9]*\([0-9]\).*/\1/')
XCODE_CLT_GEQ_7 := $(shell [ $(XCODE_CLT_VER) -gt 6 ] && echo 1)
XCODE_CLT_GEQ_6 := $(shell [ $(XCODE_CLT_VER) -gt 5 ] && echo 1)
ifeq ($(XCODE_CLT_GEQ_7), 1)
BLAS_INCLUDE ?= /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/$(shell ls /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/ | sort | tail -1)/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/Headers
else ifeq ($(XCODE_CLT_GEQ_6), 1)
BLAS_INCLUDE ?= /System/Library/Frameworks/Accelerate.framework/Versions/Current/Frameworks/vecLib.framework/Headers/
LDFLAGS += -framework Accelerate
else
BLAS_INCLUDE ?= /System/Library/Frameworks/vecLib.framework/Versions/Current/Headers/
LDFLAGS += -framework vecLib
endif
endif
endif
INCLUDE_DIRS += $(BLAS_INCLUDE)
LIBRARY_DIRS += $(BLAS_LIB)
LIBRARY_DIRS += $(LIB_BUILD_DIR)
# Automatic dependency generation (nvcc is handled separately)
CXXFLAGS += -MMD -MP
# Complete build flags.
COMMON_FLAGS += $(foreach includedir,$(INCLUDE_DIRS),-I$(includedir))
CXXFLAGS += -pthread -fPIC $(COMMON_FLAGS) $(WARNINGS)
NVCCFLAGS += -ccbin=$(CXX) -Xcompiler -fPIC $(COMMON_FLAGS)
NVCCFLAGS += -D_FORCE_INLINES -ccbin=$(CXX) -Xcompiler -fPIC $(COMMON_FLAGS)
# mex may invoke an older gcc that is too liberal with -Wuninitalized
MATLAB_CXXFLAGS := $(CXXFLAGS) -Wno-uninitialized
LINKFLAGS += -pthread -fPIC $(COMMON_FLAGS) $(WARNINGS)
USE_PKG_CONFIG ?= 0
ifeq ($(USE_PKG_CONFIG), 1)
PKG_CONFIG := $(shell pkg-config opencv --libs)
else
PKG_CONFIG :=
endif
LDFLAGS += $(foreach librarydir,$(LIBRARY_DIRS),-L$(librarydir)) $(PKG_CONFIG) \
$(foreach library,$(LIBRARIES),-l$(library))
PYTHON_LDFLAGS := $(LDFLAGS) $(foreach library,$(PYTHON_LIBRARIES),-l$(library))
# 'superclean' target recursively* deletes all files ending with an extension
# in $(SUPERCLEAN_EXTS) below. This may be useful if you've built older
# versions of Caffe that do not place all generated files in a location known
# to the 'clean' target.
#
# 'supercleanlist' will list the files to be deleted by make superclean.
#
# * Recursive with the exception that symbolic links are never followed, per the
# default behavior of 'find'.
SUPERCLEAN_EXTS := .so .a .o .bin .testbin .pb.cc .pb.h _pb2.py .cuo
# Set the sub-targets of the 'everything' target.
EVERYTHING_TARGETS := all py$(PROJECT) test warn lint
# Only build matcaffe as part of "everything" if MATLAB_DIR is specified.
ifneq ($(MATLAB_DIR),)
EVERYTHING_TARGETS += mat$(PROJECT)
endif
##############################
# Define build targets
##############################
.PHONY: all lib test clean docs linecount lint lintclean tools examples $(DIST_ALIASES) \
py mat py$(PROJECT) mat$(PROJECT) proto runtest \
superclean supercleanlist supercleanfiles warn everything
all: lib tools examples
lib: $(STATIC_NAME) $(DYNAMIC_NAME)
everything: $(EVERYTHING_TARGETS)
linecount:
cloc --read-lang-def=$(PROJECT).cloc \
src/$(PROJECT) include/$(PROJECT) tools examples \
python matlab
lint: $(EMPTY_LINT_REPORT)
lintclean:
@ $(RM) -r $(LINT_OUTPUT_DIR) $(EMPTY_LINT_REPORT) $(NONEMPTY_LINT_REPORT)
docs: $(DOXYGEN_OUTPUT_DIR)
@ cd ./docs ; ln -sfn ../$(DOXYGEN_OUTPUT_DIR)/html doxygen
$(DOXYGEN_OUTPUT_DIR): $(DOXYGEN_CONFIG_FILE) $(DOXYGEN_SOURCES)
$(DOXYGEN_COMMAND) $(DOXYGEN_CONFIG_FILE)
$(EMPTY_LINT_REPORT): $(LINT_OUTPUTS) | $(BUILD_DIR)
@ cat $(LINT_OUTPUTS) > $@
@ if [ -s "$@" ]; then \
cat $@; \
mv $@ $(NONEMPTY_LINT_REPORT); \
echo "Found one or more lint errors."; \
exit 1; \
fi; \
$(RM) $(NONEMPTY_LINT_REPORT); \
echo "No lint errors!";
$(LINT_OUTPUTS): $(LINT_OUTPUT_DIR)/%.lint.txt : % $(LINT_SCRIPT) | $(LINT_OUTPUT_DIR)
@ mkdir -p $(dir $@)
@ python $(LINT_SCRIPT) $< 2>&1 \
| grep -v "^Done processing " \
| grep -v "^Total errors found: 0" \
> $@ \
|| true
test: $(TEST_ALL_BIN) $(TEST_ALL_DYNLINK_BIN) $(TEST_BINS)
tools: $(TOOL_BINS) $(TOOL_BIN_LINKS)
examples: $(EXAMPLE_BINS)
py$(PROJECT): py
py: $(PY$(PROJECT)_SO) $(PROTO_GEN_PY)
$(PY$(PROJECT)_SO): $(PY$(PROJECT)_SRC) $(PY$(PROJECT)_HXX) | $(DYNAMIC_NAME)
@ echo CXX/LD -o $@ $<
$(Q)$(CXX) -shared -o $@ $(PY$(PROJECT)_SRC) \
-o $@ $(LINKFLAGS) -l$(LIBRARY_NAME) $(PYTHON_LDFLAGS) \
-Wl,-rpath,$(ORIGIN)/../../build/lib
mat$(PROJECT): mat
mat: $(MAT$(PROJECT)_SO)
$(MAT$(PROJECT)_SO): $(MAT$(PROJECT)_SRC) $(STATIC_NAME)
@ if [ -z "$(MATLAB_DIR)" ]; then \
echo "MATLAB_DIR must be specified in $(CONFIG_FILE)" \
"to build mat$(PROJECT)."; \
exit 1; \
fi
@ echo MEX $<
$(Q)$(MATLAB_DIR)/bin/mex $(MAT$(PROJECT)_SRC) \
CXX="$(CXX)" \
CXXFLAGS="\$$CXXFLAGS $(MATLAB_CXXFLAGS)" \
CXXLIBS="\$$CXXLIBS $(STATIC_LINK_COMMAND) $(LDFLAGS)" -output $@
@ if [ -f "$(PROJECT)_.d" ]; then \
mv -f $(PROJECT)_.d $(BUILD_DIR)/${MAT$(PROJECT)_SO:.$(MAT_SO_EXT)=.d}; \
fi
runtest: $(TEST_ALL_BIN)
$(TOOL_BUILD_DIR)/caffe
$(TEST_ALL_BIN) $(TEST_GPUID) --gtest_shuffle $(TEST_FILTER)
pytest: py
cd python; python -m unittest discover -s caffe/test
mattest: mat
cd matlab; $(MATLAB_DIR)/bin/matlab -nodisplay -r 'caffe.run_tests(), exit()'
warn: $(EMPTY_WARN_REPORT)
$(EMPTY_WARN_REPORT): $(ALL_WARNS) | $(BUILD_DIR)
@ cat $(ALL_WARNS) > $@
@ if [ -s "$@" ]; then \
cat $@; \
mv $@ $(NONEMPTY_WARN_REPORT); \
echo "Compiler produced one or more warnings."; \
exit 1; \
fi; \
$(RM) $(NONEMPTY_WARN_REPORT); \
echo "No compiler warnings!";
$(ALL_WARNS): %.o.$(WARNS_EXT) : %.o
$(BUILD_DIR_LINK): $(BUILD_DIR)/.linked
# Create a target ".linked" in this BUILD_DIR to tell Make that the "build" link
# is currently correct, then delete the one in the OTHER_BUILD_DIR in case it
# exists and $(DEBUG) is toggled later.
$(BUILD_DIR)/.linked:
@ mkdir -p $(BUILD_DIR)
@ $(RM) $(OTHER_BUILD_DIR)/.linked
@ $(RM) -r $(BUILD_DIR_LINK)
@ ln -s $(BUILD_DIR) $(BUILD_DIR_LINK)
@ touch $@
$(ALL_BUILD_DIRS): | $(BUILD_DIR_LINK)
@ mkdir -p $@
$(DYNAMIC_NAME): $(OBJS) | $(LIB_BUILD_DIR)
@ echo LD -o $@
$(Q)$(CXX) -shared -o $@ $(OBJS) $(VERSIONFLAGS) $(LINKFLAGS) $(LDFLAGS)
@ cd $(BUILD_DIR)/lib; rm -f $(DYNAMIC_NAME_SHORT); ln -s $(DYNAMIC_VERSIONED_NAME_SHORT) $(DYNAMIC_NAME_SHORT)
$(STATIC_NAME): $(OBJS) | $(LIB_BUILD_DIR)
@ echo AR -o $@
$(Q)ar rcs $@ $(OBJS)
$(BUILD_DIR)/%.o: %.cpp $(PROTO_GEN_HEADER) | $(ALL_BUILD_DIRS)
@ echo CXX $<
$(Q)$(CXX) $< $(CXXFLAGS) -c -o $@ 2> $@.$(WARNS_EXT) \
|| (cat $@.$(WARNS_EXT); exit 1)
@ cat $@.$(WARNS_EXT)
$(PROTO_BUILD_DIR)/%.pb.o: $(PROTO_BUILD_DIR)/%.pb.cc $(PROTO_GEN_HEADER) \
| $(PROTO_BUILD_DIR)
@ echo CXX $<
$(Q)$(CXX) $< $(CXXFLAGS) -c -o $@ 2> $@.$(WARNS_EXT) \
|| (cat $@.$(WARNS_EXT); exit 1)
@ cat $@.$(WARNS_EXT)
$(BUILD_DIR)/cuda/%.o: %.cu | $(ALL_BUILD_DIRS)
@ echo NVCC $<
$(Q)$(CUDA_DIR)/bin/nvcc $(NVCCFLAGS) $(CUDA_ARCH) -M $< -o ${@:.o=.d} \
-odir $(@D)
$(Q)$(CUDA_DIR)/bin/nvcc $(NVCCFLAGS) $(CUDA_ARCH) -c $< -o $@ 2> $@.$(WARNS_EXT) \
|| (cat $@.$(WARNS_EXT); exit 1)
@ cat $@.$(WARNS_EXT)
$(TEST_ALL_BIN): $(TEST_MAIN_SRC) $(TEST_OBJS) $(GTEST_OBJ) \
| $(DYNAMIC_NAME) $(TEST_BIN_DIR)
@ echo CXX/LD -o $@ $<
$(Q)$(CXX) $(TEST_MAIN_SRC) $(TEST_OBJS) $(GTEST_OBJ) \
-o $@ $(LINKFLAGS) $(LDFLAGS) -l$(LIBRARY_NAME) -Wl,-rpath,$(ORIGIN)/../lib
$(TEST_CU_BINS): $(TEST_BIN_DIR)/%.testbin: $(TEST_CU_BUILD_DIR)/%.o \
$(GTEST_OBJ) | $(DYNAMIC_NAME) $(TEST_BIN_DIR)
@ echo LD $<
$(Q)$(CXX) $(TEST_MAIN_SRC) $< $(GTEST_OBJ) \
-o $@ $(LINKFLAGS) $(LDFLAGS) -l$(LIBRARY_NAME) -Wl,-rpath,$(ORIGIN)/../lib
$(TEST_CXX_BINS): $(TEST_BIN_DIR)/%.testbin: $(TEST_CXX_BUILD_DIR)/%.o \
$(GTEST_OBJ) | $(DYNAMIC_NAME) $(TEST_BIN_DIR)
@ echo LD $<
$(Q)$(CXX) $(TEST_MAIN_SRC) $< $(GTEST_OBJ) \
-o $@ $(LINKFLAGS) $(LDFLAGS) -l$(LIBRARY_NAME) -Wl,-rpath,$(ORIGIN)/../lib
# Target for extension-less symlinks to tool binaries with extension '*.bin'.
$(TOOL_BUILD_DIR)/%: $(TOOL_BUILD_DIR)/%.bin | $(TOOL_BUILD_DIR)
@ $(RM) $@
@ ln -s $(notdir $<) $@
$(TOOL_BINS): %.bin : %.o | $(DYNAMIC_NAME)
@ echo CXX/LD -o $@
$(Q)$(CXX) $< -o $@ $(LINKFLAGS) -l$(LIBRARY_NAME) $(LDFLAGS) \
-Wl,-rpath,$(ORIGIN)/../lib
$(EXAMPLE_BINS): %.bin : %.o | $(DYNAMIC_NAME)
@ echo CXX/LD -o $@
$(Q)$(CXX) $< -o $@ $(LINKFLAGS) -l$(LIBRARY_NAME) $(LDFLAGS) \
-Wl,-rpath,$(ORIGIN)/../../lib
proto: $(PROTO_GEN_CC) $(PROTO_GEN_HEADER)
$(PROTO_BUILD_DIR)/%.pb.cc $(PROTO_BUILD_DIR)/%.pb.h : \
$(PROTO_SRC_DIR)/%.proto | $(PROTO_BUILD_DIR)
@ echo PROTOC $<
$(Q)protoc --proto_path=$(PROTO_SRC_DIR) --cpp_out=$(PROTO_BUILD_DIR) $<
$(PY_PROTO_BUILD_DIR)/%_pb2.py : $(PROTO_SRC_DIR)/%.proto \
$(PY_PROTO_INIT) | $(PY_PROTO_BUILD_DIR)
@ echo PROTOC \(python\) $<
$(Q)protoc --proto_path=src --python_out=python $<
$(PY_PROTO_INIT): | $(PY_PROTO_BUILD_DIR)
touch $(PY_PROTO_INIT)
clean:
@- $(RM) -rf $(ALL_BUILD_DIRS)
@- $(RM) -rf $(OTHER_BUILD_DIR)
@- $(RM) -rf $(BUILD_DIR_LINK)
@- $(RM) -rf $(DISTRIBUTE_DIR)
@- $(RM) $(PY$(PROJECT)_SO)
@- $(RM) $(MAT$(PROJECT)_SO)
supercleanfiles:
$(eval SUPERCLEAN_FILES := $(strip \
$(foreach ext,$(SUPERCLEAN_EXTS), $(shell find . -name '*$(ext)' \
-not -path './data/*'))))
supercleanlist: supercleanfiles
@ \
if [ -z "$(SUPERCLEAN_FILES)" ]; then \
echo "No generated files found."; \
else \
echo $(SUPERCLEAN_FILES) | tr ' ' '\n'; \
fi
superclean: clean supercleanfiles
@ \
if [ -z "$(SUPERCLEAN_FILES)" ]; then \
echo "No generated files found."; \
else \
echo "Deleting the following generated files:"; \
echo $(SUPERCLEAN_FILES) | tr ' ' '\n'; \
$(RM) $(SUPERCLEAN_FILES); \
fi
$(DIST_ALIASES): $(DISTRIBUTE_DIR)
$(DISTRIBUTE_DIR): all py | $(DISTRIBUTE_SUBDIRS)
# add proto
cp -r src/caffe/proto $(DISTRIBUTE_DIR)/
# add include
cp -r include $(DISTRIBUTE_DIR)/
mkdir -p $(DISTRIBUTE_DIR)/include/caffe/proto
cp $(PROTO_GEN_HEADER_SRCS) $(DISTRIBUTE_DIR)/include/caffe/proto
# add tool and example binaries
cp $(TOOL_BINS) $(DISTRIBUTE_DIR)/bin
cp $(EXAMPLE_BINS) $(DISTRIBUTE_DIR)/bin
# add libraries
cp $(STATIC_NAME) $(DISTRIBUTE_DIR)/lib
install -m 644 $(DYNAMIC_NAME) $(DISTRIBUTE_DIR)/lib
cd $(DISTRIBUTE_DIR)/lib; rm -f $(DYNAMIC_NAME_SHORT); ln -s $(DYNAMIC_VERSIONED_NAME_SHORT) $(DYNAMIC_NAME_SHORT)
# add python - it's not the standard way, indeed...
cp -r python $(DISTRIBUTE_DIR)/
-include $(DEPS)
cd python/
使用镜像源安装依赖库:
python3 -m pip install -r requirements.txt -i https://mirrors.hust.edu.cn/pypi/web/simple
cd .. && sudo make clean
由于caffe
不支持cuDNN8.X
,为了能在cuDNN8.X
的环境下编译通过,需要修改两个cpp
文件,路径为src/caffe/layers/
下的cudnn_conv_layer.cpp
和cudnn_deconv_layer.cpp
两个文件,分别将他们内容替换为:
/**
* @@file: cudnn_conv_layer.cpp
*/
#ifdef USE_CUDNN
#include <algorithm>
#include <vector>
#include "caffe/layers/cudnn_conv_layer.hpp"
namespace caffe
{
// Set to three for the benefit of the backward pass, which
// can use separate streams for calculating the gradient w.r.t.
// bias, filter weights, and bottom data for each group independently
#define CUDNN_STREAMS_PER_GROUP 3
/**
* TODO(dox) explain cuDNN interface
*/
template <typename Dtype>
void CuDNNConvolutionLayer<Dtype>::LayerSetUp(
const vector<Blob<Dtype> *> &bottom, const vector<Blob<Dtype> *> &top)
{
ConvolutionLayer<Dtype>::LayerSetUp(bottom, top);
// Initialize CUDA streams and cuDNN.
stream_ = new cudaStream_t[this->group_ * CUDNN_STREAMS_PER_GROUP];
handle_ = new cudnnHandle_t[this->group_ * CUDNN_STREAMS_PER_GROUP];
// Initialize algorithm arrays
fwd_algo_ = new cudnnConvolutionFwdAlgo_t[bottom.size()];
bwd_filter_algo_ = new cudnnConvolutionBwdFilterAlgo_t[bottom.size()];
bwd_data_algo_ = new cudnnConvolutionBwdDataAlgo_t[bottom.size()];
// initialize size arrays
workspace_fwd_sizes_ = new size_t[bottom.size()];
workspace_bwd_filter_sizes_ = new size_t[bottom.size()];
workspace_bwd_data_sizes_ = new size_t[bottom.size()];
// workspace data
workspaceSizeInBytes = 0;
workspaceData = NULL;
workspace = new void *[this->group_ * CUDNN_STREAMS_PER_GROUP];
for (size_t i = 0; i < bottom.size(); ++i)
{
// initialize all to default algorithms
fwd_algo_[i] = (cudnnConvolutionFwdAlgo_t)0;
bwd_filter_algo_[i] = (cudnnConvolutionBwdFilterAlgo_t)0;
bwd_data_algo_[i] = (cudnnConvolutionBwdDataAlgo_t)0;
// default algorithms don't require workspace
workspace_fwd_sizes_[i] = 0;
workspace_bwd_data_sizes_[i] = 0;
workspace_bwd_filter_sizes_[i] = 0;
}
for (int g = 0; g < this->group_ * CUDNN_STREAMS_PER_GROUP; g++)
{
CUDA_CHECK(cudaStreamCreate(&stream_[g]));
CUDNN_CHECK(cudnnCreate(&handle_[g]));
CUDNN_CHECK(cudnnSetStream(handle_[g], stream_[g]));
workspace[g] = NULL;
}
// Set the indexing parameters.
bias_offset_ = (this->num_output_ / this->group_);
// Create filter descriptor.
const int *kernel_shape_data = this->kernel_shape_.cpu_data();
const int kernel_h = kernel_shape_data[0];
const int kernel_w = kernel_shape_data[1];
cudnn::createFilterDesc<Dtype>(&filter_desc_,
this->num_output_ / this->group_, this->channels_ / this->group_,
kernel_h, kernel_w);
// Create tensor descriptor(s) for data and corresponding convolution(s).
for (int i = 0; i < bottom.size(); i++)
{
cudnnTensorDescriptor_t bottom_desc;
cudnn::createTensor4dDesc<Dtype>(&bottom_desc);
bottom_descs_.push_back(bottom_desc);
cudnnTensorDescriptor_t top_desc;
cudnn::createTensor4dDesc<Dtype>(&top_desc);
top_descs_.push_back(top_desc);
cudnnConvolutionDescriptor_t conv_desc;
cudnn::createConvolutionDesc<Dtype>(&conv_desc);
conv_descs_.push_back(conv_desc);
}
// Tensor descriptor for bias.
if (this->bias_term_)
{
cudnn::createTensor4dDesc<Dtype>(&bias_desc_);
}
handles_setup_ = true;
}
template <typename Dtype>
void CuDNNConvolutionLayer<Dtype>::Reshape(
const vector<Blob<Dtype> *> &bottom, const vector<Blob<Dtype> *> &top)
{
ConvolutionLayer<Dtype>::Reshape(bottom, top);
CHECK_EQ(2, this->num_spatial_axes_)
<< "CuDNNConvolution input must have 2 spatial axes "
<< "(e.g., height and width). "
<< "Use 'engine: CAFFE' for general ND convolution.";
bottom_offset_ = this->bottom_dim_ / this->group_;
top_offset_ = this->top_dim_ / this->group_;
const int height = bottom[0]->shape(this->channel_axis_ + 1);
const int width = bottom[0]->shape(this->channel_axis_ + 2);
const int height_out = top[0]->shape(this->channel_axis_ + 1);
const int width_out = top[0]->shape(this->channel_axis_ + 2);
const int *pad_data = this->pad_.cpu_data();
const int pad_h = pad_data[0];
const int pad_w = pad_data[1];
const int *stride_data = this->stride_.cpu_data();
const int stride_h = stride_data[0];
const int stride_w = stride_data[1];
#if CUDNN_VERSION_MIN(8, 0, 0)
int RetCnt;
bool found_conv_algorithm;
size_t free_memory, total_memory;
cudnnConvolutionFwdAlgoPerf_t fwd_algo_pref_[4];
cudnnConvolutionBwdDataAlgoPerf_t bwd_data_algo_pref_[4];
// get memory sizes
cudaMemGetInfo(&free_memory, &total_memory);
#else
// Specify workspace limit for kernels directly until we have a
// planning strategy and a rewrite of Caffe's GPU memory mangagement
size_t workspace_limit_bytes = 8 * 1024 * 1024;
#endif
for (int i = 0; i < bottom.size(); i++)
{
cudnn::setTensor4dDesc<Dtype>(&bottom_descs_[i],
this->num_,
this->channels_ / this->group_, height, width,
this->channels_ * height * width,
height * width, width, 1);
cudnn::setTensor4dDesc<Dtype>(&top_descs_[i],
this->num_,
this->num_output_ / this->group_, height_out, width_out,
this->num_output_ * this->out_spatial_dim_,
this->out_spatial_dim_, width_out, 1);
cudnn::setConvolutionDesc<Dtype>(&conv_descs_[i], bottom_descs_[i],
filter_desc_, pad_h, pad_w,
stride_h, stride_w);
#if CUDNN_VERSION_MIN(8, 0, 0)
// choose forward algorithm for filter
// in forward filter the CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD_NONFUSED is not implemented in cuDNN 8
CUDNN_CHECK(cudnnGetConvolutionForwardAlgorithm_v7(handle_[0],
bottom_descs_[i],
filter_desc_,
conv_descs_[i],
top_descs_[i],
4,
&RetCnt,
fwd_algo_pref_));
found_conv_algorithm = false;
for (int n = 0; n < RetCnt; n++)
{
if (fwd_algo_pref_[n].status == CUDNN_STATUS_SUCCESS &&
fwd_algo_pref_[n].algo != CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD_NONFUSED &&
fwd_algo_pref_[n].memory < free_memory)
{
found_conv_algorithm = true;
fwd_algo_[i] = fwd_algo_pref_[n].algo;
workspace_fwd_sizes_[i] = fwd_algo_pref_[n].memory;
break;
}
}
if (!found_conv_algorithm)
LOG(ERROR) << "cuDNN did not return a suitable algorithm for convolution.";
else
{
// choose backward algorithm for filter
// for better or worse, just a fixed constant due to the missing
// cudnnGetConvolutionBackwardFilterAlgorithm in cuDNN version 8.0
bwd_filter_algo_[i] = CUDNN_CONVOLUTION_BWD_FILTER_ALGO_0;
// twice the amount of the forward search to be save
workspace_bwd_filter_sizes_[i] = 2 * workspace_fwd_sizes_[i];
}
// choose backward algo for data
CUDNN_CHECK(cudnnGetConvolutionBackwardDataAlgorithm_v7(handle_[0],
filter_desc_,
top_descs_[i],
conv_descs_[i],
bottom_descs_[i],
4,
&RetCnt,
bwd_data_algo_pref_));
found_conv_algorithm = false;
for (int n = 0; n < RetCnt; n++)
{
if (bwd_data_algo_pref_[n].status == CUDNN_STATUS_SUCCESS &&
bwd_data_algo_pref_[n].algo != CUDNN_CONVOLUTION_BWD_DATA_ALGO_WINOGRAD &&
bwd_data_algo_pref_[n].algo != CUDNN_CONVOLUTION_BWD_DATA_ALGO_WINOGRAD_NONFUSED &&
bwd_data_algo_pref_[n].memory < free_memory)
{
found_conv_algorithm = true;
bwd_data_algo_[i] = bwd_data_algo_pref_[n].algo;
workspace_bwd_data_sizes_[i] = bwd_data_algo_pref_[n].memory;
break;
}
}
if (!found_conv_algorithm)
LOG(ERROR) << "cuDNN did not return a suitable algorithm for convolution.";
#else
// choose forward and backward algorithms + workspace(s)
CUDNN_CHECK(cudnnGetConvolutionForwardAlgorithm(handle_[0],
bottom_descs_[i],
filter_desc_,
conv_descs_[i],
top_descs_[i],
CUDNN_CONVOLUTION_FWD_SPECIFY_WORKSPACE_LIMIT,
workspace_limit_bytes,
&fwd_algo_[i]));
CUDNN_CHECK(cudnnGetConvolutionForwardWorkspaceSize(handle_[0],
bottom_descs_[i],
filter_desc_,
conv_descs_[i],
top_descs_[i],
fwd_algo_[i],
&(workspace_fwd_sizes_[i])));
// choose backward algorithm for filter
CUDNN_CHECK(cudnnGetConvolutionBackwardFilterAlgorithm(handle_[0],
bottom_descs_[i], top_descs_[i], conv_descs_[i], filter_desc_,
CUDNN_CONVOLUTION_BWD_FILTER_SPECIFY_WORKSPACE_LIMIT,
workspace_limit_bytes, &bwd_filter_algo_[i]));
// get workspace for backwards filter algorithm
CUDNN_CHECK(cudnnGetConvolutionBackwardFilterWorkspaceSize(handle_[0],
bottom_descs_[i], top_descs_[i], conv_descs_[i], filter_desc_,
bwd_filter_algo_[i], &workspace_bwd_filter_sizes_[i]));
// choose backward algo for data
CUDNN_CHECK(cudnnGetConvolutionBackwardDataAlgorithm(handle_[0],
filter_desc_, top_descs_[i], conv_descs_[i], bottom_descs_[i],
CUDNN_CONVOLUTION_BWD_DATA_SPECIFY_WORKSPACE_LIMIT,
workspace_limit_bytes, &bwd_data_algo_[i]));
// get workspace size
CUDNN_CHECK(cudnnGetConvolutionBackwardDataWorkspaceSize(handle_[0],
filter_desc_, top_descs_[i], conv_descs_[i], bottom_descs_[i],
bwd_data_algo_[i], &workspace_bwd_data_sizes_[i]));
#endif
}
// reduce over all workspace sizes to get a maximum to allocate / reallocate
size_t total_workspace_fwd = 0;
size_t total_workspace_bwd_data = 0;
size_t total_workspace_bwd_filter = 0;
for (size_t i = 0; i < bottom.size(); i++)
{
total_workspace_fwd = std::max(total_workspace_fwd,
workspace_fwd_sizes_[i]);
total_workspace_bwd_data = std::max(total_workspace_bwd_data,
workspace_bwd_data_sizes_[i]);
total_workspace_bwd_filter = std::max(total_workspace_bwd_filter,
workspace_bwd_filter_sizes_[i]);
}
// get max over all operations
size_t max_workspace = std::max(total_workspace_fwd,
total_workspace_bwd_data);
max_workspace = std::max(max_workspace, total_workspace_bwd_filter);
// ensure all groups have enough workspace
size_t total_max_workspace = max_workspace *
(this->group_ * CUDNN_STREAMS_PER_GROUP);
// this is the total amount of storage needed over all groups + streams
if (total_max_workspace > workspaceSizeInBytes)
{
DLOG(INFO) << "Reallocating workspace storage: " << total_max_workspace;
workspaceSizeInBytes = total_max_workspace;
// free the existing workspace and allocate a new (larger) one
cudaFree(this->workspaceData);
cudaError_t err = cudaMalloc(&(this->workspaceData), workspaceSizeInBytes);
if (err != cudaSuccess)
{
// force zero memory path
for (int i = 0; i < bottom.size(); i++)
{
workspace_fwd_sizes_[i] = 0;
workspace_bwd_filter_sizes_[i] = 0;
workspace_bwd_data_sizes_[i] = 0;
fwd_algo_[i] = CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM;
bwd_filter_algo_[i] = CUDNN_CONVOLUTION_BWD_FILTER_ALGO_0;
bwd_data_algo_[i] = CUDNN_CONVOLUTION_BWD_DATA_ALGO_0;
}
// NULL out all workspace pointers
for (int g = 0; g < (this->group_ * CUDNN_STREAMS_PER_GROUP); g++)
{
workspace[g] = NULL;
}
// NULL out underlying data
workspaceData = NULL;
workspaceSizeInBytes = 0;
}
// if we succeed in the allocation, set pointer aliases for workspaces
for (int g = 0; g < (this->group_ * CUDNN_STREAMS_PER_GROUP); g++)
{
workspace[g] = reinterpret_cast<char *>(workspaceData) + g * max_workspace;
}
}
// Tensor descriptor for bias.
if (this->bias_term_)
{
cudnn::setTensor4dDesc<Dtype>(&bias_desc_,
1, this->num_output_ / this->group_, 1, 1);
}
}
template <typename Dtype>
CuDNNConvolutionLayer<Dtype>::~CuDNNConvolutionLayer()
{
// Check that handles have been setup before destroying.
if (!handles_setup_)
{
return;
}
for (int i = 0; i < bottom_descs_.size(); i++)
{
cudnnDestroyTensorDescriptor(bottom_descs_[i]);
cudnnDestroyTensorDescriptor(top_descs_[i]);
cudnnDestroyConvolutionDescriptor(conv_descs_[i]);
}
if (this->bias_term_)
{
cudnnDestroyTensorDescriptor(bias_desc_);
}
cudnnDestroyFilterDescriptor(filter_desc_);
for (int g = 0; g < this->group_ * CUDNN_STREAMS_PER_GROUP; g++)
{
cudaStreamDestroy(stream_[g]);
cudnnDestroy(handle_[g]);
}
cudaFree(workspaceData);
delete[] stream_;
delete[] handle_;
delete[] fwd_algo_;
delete[] bwd_filter_algo_;
delete[] bwd_data_algo_;
delete[] workspace_fwd_sizes_;
delete[] workspace_bwd_data_sizes_;
delete[] workspace_bwd_filter_sizes_;
}
INSTANTIATE_CLASS(CuDNNConvolutionLayer);
} // namespace caffe
#endif
cudnn_deconv_layer.cpp
/**
* @@file: cudnn_deconv_layer.cpp
*/
#ifdef USE_CUDNN
#include <algorithm>
#include <vector>
#include "caffe/layers/cudnn_deconv_layer.hpp"
namespace caffe
{
// Set to three for the benefit of the backward pass, which
// can use separate streams for calculating the gradient w.r.t.
// bias, filter weights, and bottom data for each group independently
#define CUDNN_STREAMS_PER_GROUP 3
/**
* TODO(dox) explain cuDNN interface
*/
template <typename Dtype>
void CuDNNDeconvolutionLayer<Dtype>::LayerSetUp(
const vector<Blob<Dtype> *> &bottom, const vector<Blob<Dtype> *> &top)
{
DeconvolutionLayer<Dtype>::LayerSetUp(bottom, top);
// Initialize CUDA streams and cuDNN.
stream_ = new cudaStream_t[this->group_ * CUDNN_STREAMS_PER_GROUP];
handle_ = new cudnnHandle_t[this->group_ * CUDNN_STREAMS_PER_GROUP];
// Initialize algorithm arrays
fwd_algo_ = new cudnnConvolutionFwdAlgo_t[bottom.size()];
bwd_filter_algo_ = new cudnnConvolutionBwdFilterAlgo_t[bottom.size()];
bwd_data_algo_ = new cudnnConvolutionBwdDataAlgo_t[bottom.size()];
// initialize size arrays
workspace_fwd_sizes_ = new size_t[bottom.size()];
workspace_bwd_filter_sizes_ = new size_t[bottom.size()];
workspace_bwd_data_sizes_ = new size_t[bottom.size()];
// workspace data
workspaceSizeInBytes = 0;
workspaceData = NULL;
workspace = new void *[this->group_ * CUDNN_STREAMS_PER_GROUP];
for (size_t i = 0; i < bottom.size(); ++i)
{
// initialize all to default algorithms
fwd_algo_[i] = (cudnnConvolutionFwdAlgo_t)0;
bwd_filter_algo_[i] = (cudnnConvolutionBwdFilterAlgo_t)0;
bwd_data_algo_[i] = (cudnnConvolutionBwdDataAlgo_t)0;
// default algorithms don't require workspace
workspace_fwd_sizes_[i] = 0;
workspace_bwd_data_sizes_[i] = 0;
workspace_bwd_filter_sizes_[i] = 0;
}
for (int g = 0; g < this->group_ * CUDNN_STREAMS_PER_GROUP; g++)
{
CUDA_CHECK(cudaStreamCreate(&stream_[g]));
CUDNN_CHECK(cudnnCreate(&handle_[g]));
CUDNN_CHECK(cudnnSetStream(handle_[g], stream_[g]));
workspace[g] = NULL;
}
// Set the indexing parameters.
bias_offset_ = (this->num_output_ / this->group_);
// Create filter descriptor.
const int *kernel_shape_data = this->kernel_shape_.cpu_data();
const int kernel_h = kernel_shape_data[0];
const int kernel_w = kernel_shape_data[1];
cudnn::createFilterDesc<Dtype>(&filter_desc_,
this->channels_ / this->group_,
this->num_output_ / this->group_,
kernel_h,
kernel_w);
// Create tensor descriptor(s) for data and corresponding convolution(s).
for (int i = 0; i < bottom.size(); i++)
{
cudnnTensorDescriptor_t bottom_desc;
cudnn::createTensor4dDesc<Dtype>(&bottom_desc);
bottom_descs_.push_back(bottom_desc);
cudnnTensorDescriptor_t top_desc;
cudnn::createTensor4dDesc<Dtype>(&top_desc);
top_descs_.push_back(top_desc);
cudnnConvolutionDescriptor_t conv_desc;
cudnn::createConvolutionDesc<Dtype>(&conv_desc);
conv_descs_.push_back(conv_desc);
}
// Tensor descriptor for bias.
if (this->bias_term_)
{
cudnn::createTensor4dDesc<Dtype>(&bias_desc_);
}
handles_setup_ = true;
}
template <typename Dtype>
void CuDNNDeconvolutionLayer<Dtype>::Reshape(
const vector<Blob<Dtype> *> &bottom, const vector<Blob<Dtype> *> &top)
{
DeconvolutionLayer<Dtype>::Reshape(bottom, top);
CHECK_EQ(2, this->num_spatial_axes_)
<< "CuDNNDeconvolutionLayer input must have 2 spatial axes "
<< "(e.g., height and width). "
<< "Use 'engine: CAFFE' for general ND convolution.";
bottom_offset_ = this->bottom_dim_ / this->group_;
top_offset_ = this->top_dim_ / this->group_;
const int height = bottom[0]->shape(this->channel_axis_ + 1);
const int width = bottom[0]->shape(this->channel_axis_ + 2);
const int height_out = top[0]->shape(this->channel_axis_ + 1);
const int width_out = top[0]->shape(this->channel_axis_ + 2);
const int *pad_data = this->pad_.cpu_data();
const int pad_h = pad_data[0];
const int pad_w = pad_data[1];
const int *stride_data = this->stride_.cpu_data();
const int stride_h = stride_data[0];
const int stride_w = stride_data[1];
#if CUDNN_VERSION_MIN(8, 0, 0)
int RetCnt;
bool found_conv_algorithm;
size_t free_memory, total_memory;
cudnnConvolutionFwdAlgoPerf_t fwd_algo_pref_[4];
cudnnConvolutionBwdDataAlgoPerf_t bwd_data_algo_pref_[4];
// get memory sizes
cudaMemGetInfo(&free_memory, &total_memory);
#else
// Specify workspace limit for kernels directly until we have a
// planning strategy and a rewrite of Caffe's GPU memory mangagement
size_t workspace_limit_bytes = 8 * 1024 * 1024;
#endif
for (int i = 0; i < bottom.size(); i++)
{
cudnn::setTensor4dDesc<Dtype>(&bottom_descs_[i],
this->num_,
this->channels_ / this->group_,
height,
width,
this->channels_ * height * width,
height * width,
width,
1);
cudnn::setTensor4dDesc<Dtype>(&top_descs_[i],
this->num_,
this->num_output_ / this->group_,
height_out,
width_out,
this->num_output_ * height_out * width_out,
height_out * width_out,
width_out,
1);
cudnn::setConvolutionDesc<Dtype>(&conv_descs_[i],
top_descs_[i],
filter_desc_,
pad_h,
pad_w,
stride_h,
stride_w);
#if CUDNN_VERSION_MIN(8, 0, 0)
// choose forward algorithm for filter
// in forward filter the CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD_NONFUSED is not implemented in cuDNN 8
CUDNN_CHECK(cudnnGetConvolutionForwardAlgorithm_v7(handle_[0],
top_descs_[i],
filter_desc_,
conv_descs_[i],
bottom_descs_[i],
4,
&RetCnt,
fwd_algo_pref_));
found_conv_algorithm = false;
for (int n = 0; n < RetCnt; n++)
{
if (fwd_algo_pref_[n].status == CUDNN_STATUS_SUCCESS &&
fwd_algo_pref_[n].algo != CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD_NONFUSED &&
fwd_algo_pref_[n].memory < free_memory)
{
found_conv_algorithm = true;
fwd_algo_[i] = fwd_algo_pref_[n].algo;
workspace_fwd_sizes_[i] = fwd_algo_pref_[n].memory;
break;
}
}
if (!found_conv_algorithm)
LOG(ERROR) << "cuDNN did not return a suitable algorithm for convolution.";
else
{
// choose backward algorithm for filter
// for better or worse, just a fixed constant due to the missing
// cudnnGetConvolutionBackwardFilterAlgorithm in cuDNN version 8.0
bwd_filter_algo_[i] = CUDNN_CONVOLUTION_BWD_FILTER_ALGO_0;
// twice the amount of the forward search to be save
workspace_bwd_filter_sizes_[i] = 2 * workspace_fwd_sizes_[i];
}
// choose backward algo for data
CUDNN_CHECK(cudnnGetConvolutionBackwardDataAlgorithm_v7(handle_[0],
filter_desc_,
bottom_descs_[i],
conv_descs_[i],
top_descs_[i],
4,
&RetCnt,
bwd_data_algo_pref_));
found_conv_algorithm = false;
for (int n = 0; n < RetCnt; n++)
{
if (bwd_data_algo_pref_[n].status == CUDNN_STATUS_SUCCESS &&
bwd_data_algo_pref_[n].algo != CUDNN_CONVOLUTION_BWD_DATA_ALGO_WINOGRAD &&
bwd_data_algo_pref_[n].algo != CUDNN_CONVOLUTION_BWD_DATA_ALGO_WINOGRAD_NONFUSED &&
bwd_data_algo_pref_[n].memory < free_memory)
{
found_conv_algorithm = true;
bwd_data_algo_[i] = bwd_data_algo_pref_[n].algo;
workspace_bwd_data_sizes_[i] = bwd_data_algo_pref_[n].memory;
break;
}
}
if (!found_conv_algorithm)
LOG(ERROR) << "cuDNN did not return a suitable algorithm for convolution.";
#else
// choose forward and backward algorithms + workspace(s)
CUDNN_CHECK(cudnnGetConvolutionForwardAlgorithm(
handle_[0],
top_descs_[i],
filter_desc_,
conv_descs_[i],
bottom_descs_[i],
CUDNN_CONVOLUTION_FWD_SPECIFY_WORKSPACE_LIMIT,
workspace_limit_bytes,
&fwd_algo_[i]));
// We have found that CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM is
// buggy. Thus, if this algo was chosen, choose winograd instead. If
// winograd is not supported or workspace is larger than threshold, choose
// implicit_gemm instead.
if (fwd_algo_[i] == CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM)
{
size_t winograd_workspace_size;
cudnnStatus_t status = cudnnGetConvolutionForwardWorkspaceSize(
handle_[0],
top_descs_[i],
filter_desc_,
conv_descs_[i],
bottom_descs_[i],
CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD,
&winograd_workspace_size);
if (status != CUDNN_STATUS_SUCCESS ||
winograd_workspace_size >= workspace_limit_bytes)
{
fwd_algo_[i] = CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM;
}
else
{
fwd_algo_[i] = CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD;
}
}
CUDNN_CHECK(cudnnGetConvolutionForwardWorkspaceSize(
handle_[0],
top_descs_[i],
filter_desc_,
conv_descs_[i],
bottom_descs_[i],
fwd_algo_[i],
&(workspace_fwd_sizes_[i])));
// choose backward algorithm for filter
CUDNN_CHECK(cudnnGetConvolutionBackwardFilterAlgorithm(
handle_[0],
top_descs_[i],
bottom_descs_[i],
conv_descs_[i],
filter_desc_,
CUDNN_CONVOLUTION_BWD_FILTER_SPECIFY_WORKSPACE_LIMIT,
workspace_limit_bytes,
&bwd_filter_algo_[i]));
// get workspace for backwards filter algorithm
CUDNN_CHECK(cudnnGetConvolutionBackwardFilterWorkspaceSize(
handle_[0],
top_descs_[i],
bottom_descs_[i],
conv_descs_[i],
filter_desc_,
bwd_filter_algo_[i],
&workspace_bwd_filter_sizes_[i]));
// choose backward algo for data
CUDNN_CHECK(cudnnGetConvolutionBackwardDataAlgorithm(
handle_[0],
filter_desc_,
bottom_descs_[i],
conv_descs_[i],
top_descs_[i],
CUDNN_CONVOLUTION_BWD_DATA_SPECIFY_WORKSPACE_LIMIT,
workspace_limit_bytes,
&bwd_data_algo_[i]));
// get workspace size
CUDNN_CHECK(cudnnGetConvolutionBackwardDataWorkspaceSize(
handle_[0],
filter_desc_,
bottom_descs_[i],
conv_descs_[i],
top_descs_[i],
bwd_data_algo_[i],
&workspace_bwd_data_sizes_[i]));
#endif
}
// reduce over all workspace sizes to get a maximum to allocate / reallocate
size_t total_workspace_fwd = 0;
size_t total_workspace_bwd_data = 0;
size_t total_workspace_bwd_filter = 0;
for (size_t i = 0; i < bottom.size(); i++)
{
total_workspace_fwd = std::max(total_workspace_fwd,
workspace_fwd_sizes_[i]);
total_workspace_bwd_data = std::max(total_workspace_bwd_data,
workspace_bwd_data_sizes_[i]);
total_workspace_bwd_filter = std::max(total_workspace_bwd_filter,
workspace_bwd_filter_sizes_[i]);
}
// get max over all operations
size_t max_workspace = std::max(total_workspace_fwd,
total_workspace_bwd_data);
max_workspace = std::max(max_workspace, total_workspace_bwd_filter);
// ensure all groups have enough workspace
size_t total_max_workspace = max_workspace *
(this->group_ * CUDNN_STREAMS_PER_GROUP);
// this is the total amount of storage needed over all groups + streams
if (total_max_workspace > workspaceSizeInBytes)
{
DLOG(INFO) << "Reallocating workspace storage: " << total_max_workspace;
workspaceSizeInBytes = total_max_workspace;
// free the existing workspace and allocate a new (larger) one
cudaFree(this->workspaceData);
cudaError_t err = cudaMalloc(&(this->workspaceData), workspaceSizeInBytes);
if (err != cudaSuccess)
{
// force zero memory path
for (int i = 0; i < bottom.size(); i++)
{
workspace_fwd_sizes_[i] = 0;
workspace_bwd_filter_sizes_[i] = 0;
workspace_bwd_data_sizes_[i] = 0;
fwd_algo_[i] = CUDNN_CONVOLUTION_FWD_ALGO_FFT_TILING;
bwd_filter_algo_[i] = CUDNN_CONVOLUTION_BWD_FILTER_ALGO_0;
bwd_data_algo_[i] = CUDNN_CONVOLUTION_BWD_DATA_ALGO_0;
}
// NULL out all workspace pointers
for (int g = 0; g < (this->group_ * CUDNN_STREAMS_PER_GROUP); g++)
{
workspace[g] = NULL;
}
// NULL out underlying data
workspaceData = NULL;
workspaceSizeInBytes = 0;
}
// if we succeed in the allocation, set pointer aliases for workspaces
for (int g = 0; g < (this->group_ * CUDNN_STREAMS_PER_GROUP); g++)
{
workspace[g] = reinterpret_cast<char *>(workspaceData) + g * max_workspace;
}
}
// Tensor descriptor for bias.
if (this->bias_term_)
{
cudnn::setTensor4dDesc<Dtype>(
&bias_desc_, 1, this->num_output_ / this->group_, 1, 1);
}
}
template <typename Dtype>
CuDNNDeconvolutionLayer<Dtype>::~CuDNNDeconvolutionLayer()
{
// Check that handles have been setup before destroying.
if (!handles_setup_)
{
return;
}
for (int i = 0; i < bottom_descs_.size(); i++)
{
cudnnDestroyTensorDescriptor(bottom_descs_[i]);
cudnnDestroyTensorDescriptor(top_descs_[i]);
cudnnDestroyConvolutionDescriptor(conv_descs_[i]);
}
if (this->bias_term_)
{
cudnnDestroyTensorDescriptor(bias_desc_);
}
cudnnDestroyFilterDescriptor(filter_desc_);
for (int g = 0; g < this->group_ * CUDNN_STREAMS_PER_GROUP; g++)
{
cudaStreamDestroy(stream_[g]);
cudnnDestroy(handle_[g]);
}
cudaFree(workspaceData);
delete[] workspace;
delete[] stream_;
delete[] handle_;
delete[] fwd_algo_;
delete[] bwd_filter_algo_;
delete[] bwd_data_algo_;
delete[] workspace_fwd_sizes_;
delete[] workspace_bwd_data_sizes_;
delete[] workspace_bwd_filter_sizes_;
}
INSTANTIATE_CLASS(CuDNNDeconvolutionLayer);
} // namespace caffe
#endif
由于cuDNN
修改了API
,在cudnn.h
文件中不再指出cuDNN
的版本号,而是放在了cudnn_version.h
文件中,所以,将cudnn_version.h
中对于版本段的代码复制到cudnn.h
文件中,代码如下:
locate cudnn_version.h
sudo vi /usr/local/cuda/targets/x86_64-linux/include/cudnn_version.h
复制其中的非注释部分:
sudo vi /usr/local/cuda/targets/x86_64-linux/include/cudnn.h
粘贴到最开头:
然后打开caffe/include/caffe/util/cudnn.hpp
文件并指定cudnn.h
路径:
vi include/caffe/util/cudnn.hpp
之后重新执行编译:
sudo make all -j$(nproc)
生成以下静态库和共享库文件:
测试,时间较慢,耐心等待~
sudo make test -j$(nproc)
sudo make runtest -j$(nproc)
sudo make pycaffe -j$(nproc)
可能会有报错,但问题不大,我们只是需要那些库文件~
VTK-8.2.0及PCL-1.9.1 (EOL)
下载VTK-8.2.0.zip
:
解压之后,进入文件夹打开终端:
sudo apt install -y cmake-gui && mkdir build && cd build/ && cmake-gui ..
单击Configure
:
勾选以下两项后单击Configure
和Generate
:
Module/Module_vtkGUISupportQt
:
VTK/VTK_Group_Qt
:
sudo make -j$(nproc)
sudo make -j$(nproc) install
接下来安装pcl
:
git clone -b pcl-1.9.1 https://github.com/PointCloudLibrary/pcl.git pcl-1.9.1
或公益加速源:
git clone -b pcl-1.9.1 https://ghp.ci/https://github.com/PointCloudLibrary/pcl.git pcl-1.9.1
之后进入文件夹打开终端输入:
mkdir build && cd build/
cmake \
-D CMAKE_BUILD_TYPE=None \
-D CMAKE_INSTALL_PREFIX=/usr \
-D BUILD_GPU=ON-DBUILD_apps=ON \
-D BUILD_examples=ON \
-D CMAKE_INSTALL_PREFIX=/usr \
..
sudo make -j$(nproc)
sudo make -j$(nproc) install
NOT RECOMMENDED (EOL)
ROS-melodic(有些图忘记截了)
sudo gpg --keyserver 'hkp://keyserver.ubuntu.com:80' --recv-key C1CF6E31E6BADE8868B172B4F42ED6FBAB17C654
sudo gpg --export C1CF6E31E6BADE8868B172B4F42ED6FBAB17C654 | sudo tee /usr/share/keyrings/ros.gpg > /dev/null
sudo tee /etc/apt/sources.list.d/ros-latest.list > /dev/null << EOF
deb [signed-by=/usr/share/keyrings/ros.gpg] https://mirrors.hust.edu.cn/ros/ubuntu/ $(lsb_release -sc) main
EOF
sudo apt update -y
sudo apt install -y ros-melodic-desktop-full
echo 'source /opt/ros/melodic/setup.bash' >> ~/.bashrc
source ~/.bashrc
sudo apt install -y python-rosdep python-rosinstall python-rosinstall-generator python-wstool build-essential
sudo apt install -y python3-pip
使用镜像源加速pip
下载:
sudo pip3 install rosdepc -i https://mirrors.hust.edu.cn/pypi/web/simple
sudo rosdepc init
rosdepc update
sudo chmod 777 -R ~/.ros/
roscore
再新建两个终端,分别输入:
rosrun turtlesim turtlesim_node
rosrun turtlesim turtle_teleop_key
在rosrun turtlesim turtle_teleop_key
所在终端点击一下任意位置,然后使用←↕→
小键盘控制,看小海龟会不会动,如果会动则安装成功。
OpenCV-3.4.16及其扩展模块(Ubuntu18.04)
安装所需依赖库,打开终端,输入:
sudo tee -a /etc/apt/sources.list > /dev/null << EOF
# libjasper1, libjasper-dev
deb https://mirrors.hust.edu.cn/ubuntu/ xenial-security main
EOF
sudo apt update -y
sudo apt install -y libjasper1 libjasper-dev
# 注释掉xenial软件源
sudo sed -i '/^deb https:\/\/mirrors.hust.edu.cn\/ubuntu\/ xenial-security main/s/^/# /' /etc/apt/sources.list && sudo apt update -y
sudo apt install -y cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev python-dev python-numpy libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libdc1394-22-dev liblapacke-dev checkinstall
git clone -b 3.4.16 https://github.com/opencv/opencv.git opencv-3.4.16
或公益加速源:
git clone -b 3.4.16 https://ghp.ci/https://github.com/opencv/opencv.git opencv-3.4.16
cd opencv-3.4.16/
git clone -b 3.4.16 https://github.com/opencv/opencv_contrib.git opencv-3.4.16
或公益加速源:
git clone -b 3.4.16 https://ghp.ci/https://github.com/opencv/opencv_contrib.git opencv_contrib-3.4.16
mkdir build && cd build/
接下来编译安装,注意此命令的OPENCV_EXTRA_MODULES_PATH=
后边的路径是你电脑下的绝对路径,请自行修改:
cmake \
-D CMAKE_BUILD_TYPE=RELEASE \
-D WITH_GTK=ON \
-D WITH_VTK=ON \
-D WITH_ADE=OFF \
-D WITH_CUDA=ON \
-D WITH_CUDNN=ON \
-D WITH_OPENMP=ON \
-D WITH_LAPACK=OFF \
-D OPENCV_DNN_CUDA=ON \
-D CUDA_GENERATION=Auto \
-D CUDA_CUDA_LIBRARY=ON \
-D OPENCV_ENABLE_NONFREE=ON \
-D OPENCV_GENERATE_PKGCONFIG=ON \
-D ENABLE_PRECOMPILED_HEADERS=OFF \
-D OPENCV_EXTRA_MODULES_PATH=/home/m0rtzz/Programs/opencv-3.4.16/opencv_contrib-3.4.16/modules \
..
过程中会出现IPPICV: Download: ippicv_2020_lnx_intel64_20191018_general.tgz
。
解决方法:
cd $(git rev-parse --show-toplevel)/ && mkdir downloads && realpath downloads/
复制绝对路径后:
打开这个ippicv.cmake
:
把绝对路径复制进去:
然后把下面网址下载的文件cp
进去就行了(或者开头百度云分享链接中自取~)。
https://github.com/opencv/opencv_3rdparty
然后重新打开终端,再次输入(别忘了改路径):
cmake \
-D CMAKE_BUILD_TYPE=RELEASE \
-D WITH_GTK=ON \
-D WITH_VTK=ON \
-D WITH_ADE=OFF \
-D WITH_CUDA=ON \
-D WITH_CUDNN=ON \
-D WITH_OPENMP=ON \
-D WITH_LAPACK=OFF \
-D OPENCV_DNN_CUDA=ON \
-D CUDA_GENERATION=Auto \
-D CUDA_CUDA_LIBRARY=ON \
-D OPENCV_ENABLE_NONFREE=ON \
-D OPENCV_GENERATE_PKGCONFIG=ON \
-D ENABLE_PRECOMPILED_HEADERS=OFF \
-D OPENCV_EXTRA_MODULES_PATH=/home/m0rtzz/Programs/opencv-3.4.16/opencv_contrib-3.4.16/modules \
..
这些.i
文件需要在国外服务器上下载,网上说下载好文件直接把他们放进相对应的目录下就行,实测不行(建议科学的上网,想试试网上说法的:
https://blog.csdn.net/curious_undergather/article/details/111639199
文件的话,开头百度云分享链接里都有)
sudo make -j$(nproc)
打开那个头文件,把报错所在行改为:
#include "lapacke.h"
sudo make -j$(nproc)
sudo make -j$(nproc) install
sudo tee /etc/ld.so.conf.d/opencv.conf > /dev/null << EOF
/usr/local/lib
EOF
sudo ldconfig
sudo tee -a /etc/profile > /dev/null << 'EOF'
export PKG_CONFIG_PATH=${PKG_CONFIG_PATH}:/usr/local/lib/pkgconfig
EOF
source /etc/profile
测试:
cd ../samples/cpp/example_cmake/
cmake -j$(nproc) .
sudo make -j$(nproc)
./opencv_example
安装成功!
设置cv_bridge
的版本(ROS-melodic
,经实践发现毫无效果):
sudo gedit /opt/ros/melodic/share/cv_bridge/cmake/cv_bridgeConfig.cmake
修改其中的以下内容:
# @line: 94行左右
if(NOT "include;/usr/include;/usr/include/opencv " STREQUAL " ") # [!code --]
set(cv_bridge_INCLUDE_DIRS "") # [!code --]
set(_include_dirs "include;/usr/include;/usr/include/opencv") # [!code --]
if(NOT "include;/usr/local/include/opencv;/usr/local/include/opencv2 " STREQUAL " ") # [!code ++]
set(cv_bridge_INCLUDE_DIRS "") # [!code ++]
set(_include_dirs "include;/usr/local/include/opencv;/usr/local/include/opencv;/usr/local/include/;/usr/include") # [!code ++]
#################################分割线#################################
# @line: 119行左右
set(libraries "cv_bridge;/usr/lib/x86_64-linux-gnu/libopencv_core.so.3.2.0;/usr/lib/x86_64-linux-gnu/libopencv_imgproc.so.3.2.0;/usr/lib/x86_64-linux-gnu/libopencv_imgcodecs.so.3.2.0") # [!code --]
set(libraries "cv_bridge;/usr/local/lib/libopencv_core.so.3.4.16;/usr/local/lib/libopencv_imgproc.so.3.4.16;/usr/local/lib/libopencv_imgcodecs.so.3.4.16") # [!code ++]
opencv-3.4.4
的cmake
命令:
cmake \
-D CMAKE_BUILD_TYPE=RELEASE \
-D WITH_GTK_2_X=ON \
-D WITH_CUDA=ON \
-D WITH_CUDNN=ON \
-D WITH_OPENMP=ON \
-D WITH_FFMPEG=ON \
-D WITH_OPENGL=ON \
-D WITH_LAPACK=OFF \
-D BUILD_TESTS=OFF \
-D WITH_NVCUVID=ON \
-D CUDA_ARCH_BIN=8.6 \
-D OPENCV_DNN_CUDA=ON \
-D CUDA_GENERATION=Auto \
-D OPENCV_ENABLE_NONFREE=ON \
-D BUILD_opencv_xfeatures2d=ON \
-D OPENCV_GENERATE_PKGCONFIG=YES \
-D ENABLE_PRECOMPILED_HEADERS=OFF \
-D CMAKE_EXE_LINKER_FLAGS=-lcblas \
-D CMAKE_INSTALL_PREFIX=/usr/local \
-D CUDA_HOST_COMPILER:FILEPATH=/usr/bin/gcc-7 \
-D OPENCV_EXTRA_MODULES_PATH=/home/m0rtzz/Programs/opencv-3.4.4/opencv_contrib-3.4.4/modules \
..
OpenCV3配置darknet_ros工作空间(OpenCV3)
git clone --recursive https://github.com/leggedrobotics/darknet_ros.git darknet_ros
或公益加速源:
git clone --recursive https://ghp.ci/https://github.com/leggedrobotics/darknet_ros.git darknet_ros
后边内容和ESSENTIAL部分中的步骤类似。
Azure Kinect SDK-v1.4.0(源码编译)
Reference:
git clone -b v1.4.0 https://github.com/microsoft/Azure-Kinect-Sensor-SDK.git Azure-Kinect-Sensor-SDK-v1.4.0
或公益加速源:
git clone -b v1.4.0 https://ghp.ci/https://github.com/microsoft/Azure-Kinect-Sensor-SDK.git Azure-Kinect-Sensor-SDK-v1.4.0
sudo dpkg --add-architecture amd64
sudo apt update -y
sudo apt install -y ninja-build doxygen clang gcc-multilib g++-multilib python3 nasm libsoundio-dev libvulkan-dev libx11-dev libxcursor-dev libxinerama-dev libxrandr-dev libudev-dev mesa-common-dev uuid-dev
wget -q --show-progress https://packages.microsoft.com/ubuntu/18.04/prod/pool/main/libk/libk4a1.4/libk4a1.4_1.4.2_amd64.deb -O ./libk4a1.4_1.4.2_amd64.deb
解压.deb
文件,再解压内部的data.tar.gz
文件,并进入data/usr/lib/x86_64-linux-gnu/
文件夹,打开终端输入:
sudo cp libdepthengine.so.2.0 /usr/lib/x86_64-linux-gnu/
sudo cp /usr/lib/x86_64-linux-gnu/libdepthengine.so.2.0 /usr/lib/
随后进入下载好的Azure-Kinect-Sensor-SDK-v1.4.0/
文件夹下打开终端输入:
cmake -GNinja ..
注意此步过程中extern/libyuv/src
克隆较慢原因是使用了google
的网站,我们把对应文件的克隆url
改为GitHub
的就能正常克隆了,在Azure-Kinect-Sensor-SDK-v1.4.0/
文件夹下键盘Ctrl+H
显示隐藏文件,打开.gitmodules
文件,修改libyuv
的部分为:
[submodule "extern/libyuv/src"]
path = extern/libyuv/src
url = https://github.com/lemenkov/libyuv.git
保存后关闭
之后打开.git/
文件夹下的config
文件,修改libyuv
的部分为:
[submodule "extern/libyuv/src"]
active = true
url = https://github.com/lemenkov/libyuv.git
接下来就能正常克隆了,但是速度还是很慢,请耐心等待~
保存后关闭,打开终端,输入:
cmake -GNinja ..
克隆完成后为如图所示:
之后输入:
sudo ninja -j$(nproc)
最后输入:
sudo ninja install
之后测试一下:
sudo ./bin/k4aviewer
授予权限:
cd .. && sudo cp scripts/99-k4a.rules /etc/udev/rules.d
NOT REQUIRED
libfreenect2 (EOL)
git clone https://github.com/OpenKinect/libfreenect2.git libfreenect2
或公益加速源:
git clone https://ghp.ci/https://github.com/OpenKinect/libfreenect2.git libfreenect2
cd libfreenect2/ && mkdir build && cd build/
cmake \
-D ENABLE_CXX11=ON \
..
sudo make -j$(nproc)
sudo make -j$(nproc) install
sudo cp ../platform/linux/udev/90-kinect2.rules /etc/udev/rules.d/
[!WARNING]
以下为科研项目所需,比赛无需安装。
CarlaUE4
[!IMPORTANT]
必须是CarlaUnreal的UE仓库中的carla分支才可以通过安装Carla时的编译。
find . -name "*.sh" -exec dos2unix {} +
find . -name "*.sh" -exec chmod +x {} +
sudo chown -R m0rtzz: *
若报错:
因Epic
更新了gitdeps
,但Github
上却没有更新,所以需要进入GitHub
官方仓库Release
界面寻找对应版本的Commit.gitdeps.xml
替换原来的文件即可:
https://github.com/EpicGames/UnrealEngine/releases/tag
若报错:
鄙人认为是因执行Setup.sh
脚本未赋予root
权限导致依赖未安装完整,所以再次执行:
sudo ./Setup.sh
// @file: CubemapUnwrapUtils.cpp
// Use
RHICmdList.GetBoundVertexShader() instead of GetVertexShader()
RHICmdList.GetBoundPixelShader() instead of GetPixelShader()
// Instead of the given macros, use code as below.
GraphicsPSOInit.BoundShaderState.VertexShaderRHI = VertexShader.GetVertexShader();
GraphicsPSOInit.BoundShaderState.PixelShaderRHI = PixelShader.GetPixelShader();
http://cdn.unrealengine.com/Toolchain_Linux/native-linux-v17_clang-10.0.10centos.tar.gz
cd your-path/UnrealEngine_4.26/Engine/Extras/ThirdPartyNotUE/SDKs/HostLinux/Linux_x64/
tar -zxvf native-linux-v17_clang-10.0.1-centos7.tar.gz
CARLA-0.9.14(添加fisheye sensor模块)
修改Update.sh
下载网址为南方科技大学镜像站的网址:
# CONTENT_LINK=http://carla-assets.s3.amazonaws.com/${CONTENT_ID}.tar.gz
CONTENT_LINK=https://mirrors.sustech.edu.cn/carla/carla_content/${CONTENT_ID}.tar.gz
// @file: test_streaming.cpp
// Line 58
carla::streaming::low_level::Server<tcp::Server> srv(io.service, TESTING_PORT);
// Line 63
carla::streaming::low_level::Client<tcp::Client> c;
// Line 93
carla::streaming::low_level::Server<tcp::Server> srv(io.service, TESTING_PORT);
// Line 96
carla::streaming::low_level::Client<tcp::Client> c;
# @file: Package.sh(https://github.com/annaornatskaya/carla/tree/fisheye-sensor)
# copy_if_changed "./Plugins/" "${DESTINATION}/Plugins/"
copy_if_changed "./Unreal/CarlaUE4/Content/Carla/HDMaps/*.pcd" "${DESTINATION}/HDMaps/"
copy_if_changed "./Unreal/CarlaUE4/Content/Carla/HDMaps/Readme.md" "${DESTINATION}/HDMaps/README"
# NOTE: Modified by M0rtzz
if [ -d "./Plugins/" ] ; then
copy_if_changed "./Plugins/" "${DESTINATION}/Plugins/"
fi
更多推荐
所有评论(0)