Building faiss from source

qq_40231381 · published 2023-08-09 21:13:49

Downloading the faiss source

The v1.7.4 source tree:
https://github.com/facebookresearch/faiss/tree/v1.7.4

Installing faiss

Official installation guide (INSTALL.md):
https://github.com/facebookresearch/faiss/blob/v1.7.4/INSTALL.md

Prerequisites

1. Install cmake

1) Download version 3.23 or later from https://cmake.org/download/
2) Unpack the archive
3) Symlink the binaries into /usr/bin
sudo ln -sf /home/xxx/cmake-3.27.1-linux-x86_64/bin/* /usr/bin/
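
After linking, it can be worth confirming that the cmake now on PATH meets the 3.23 minimum. A minimal sketch, assuming GNU sort is available (for its -V version ordering); the helper name version_ge is made up for this example:

```shell
# version_ge A B: succeeds when version A >= version B (hypothetical helper).
version_ge() {
    # sort -V orders version strings; if B comes out first, then A >= B.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

if command -v cmake >/dev/null 2>&1; then
    # The first line of `cmake --version` reads "cmake version X.Y.Z".
    found="$(cmake --version | head -n1 | awk '{print $3}')"
    if version_ge "$found" "3.23"; then
        echo "cmake $found is new enough"
    else
        echo "cmake $found is older than 3.23"
    fi
else
    echo "cmake not found on PATH"
fi
```

If an older cmake is reported, the symlinks probably did not replace the copy that appears first on PATH.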

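2. Install the OpenBLAS dependency

git clone https://github.com/xianyi/OpenBLAS.git
cd OpenBLAS
make
# Install; the steps below assume the prefix is /opt/OpenBLAS (use sudo if the prefix is not writable)
make PREFIX=/path/to/your/installation install
# Then link the built shared library into /usr/lib
sudo ln -s /opt/OpenBLAS/lib/libopenblas.so /usr/lib/libopenblas.so
# Add the following line to ~/.bashrc
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/OpenBLAS/lib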
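
Appending the export line by hand is easy to repeat by accident when these steps are rerun. A small sketch that appends a line only when it is not already in the file; add_line_once is a hypothetical helper, and the /opt/OpenBLAS/lib path is the one used above:

```shell
# add_line_once LINE FILE: append LINE to FILE unless an identical line exists.
# (hypothetical helper, not part of OpenBLAS or faiss)
add_line_once() {
    # -x: match whole lines, -F: treat LINE as a literal string, -q: quiet.
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Usage (commented out so that pasting this block changes nothing):
# add_line_once 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/OpenBLAS/lib' ~/.bashrc
```

The new setting takes effect in shells started afterwards, or after running source ~/.bashrc in the current one.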

3. Install gfortran (needed to compile LAPACK in step 4)

On Ubuntu: sudo apt-get install gfortran

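4. Install the LAPACK dependency

1. Download version 3.10.0 (note that the anchor in this link points at the 3.11.0 entry of the release list)
https://www.netlib.org/lapack/#_lapack_version_3_11_0_2
2. Edit lapack-3.10.0/Makefile. LAPACK depends on the BLAS library, so comment out the first lib: line and uncomment the second:

#lib: lapacklib tmglib
lib: blaslib variants lapacklib tmglib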

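3. Build:

# Build all of LAPACK (run from the lapack-3.10.0 directory)
# If make complains about a missing make.inc, copy the shipped template first: cp make.inc.example make.inc
make

# Enter the lapacke directory, which holds LAPACK's C interface
cd lapacke

# Build lapacke
make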

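4. The LAPACK Makefile has no make install target, so install by hand:

# Copy the lapacke headers into the system include directory
sudo cp include/*.h /usr/include

# Go back up to the lapack-3.10.0 directory
cd ..

# Copy all generated static libraries into the system library directory
sudo cp *.a /usr/lib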
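
The Makefile edit from step 2 of the LAPACK instructions can also be scripted. This is a sketch that assumes the unmodified Makefile contains the lines "lib: lapacklib tmglib" and "#lib: blaslib variants lapacklib tmglib"; enable_bundled_blas is a hypothetical helper name. It leaves a Makefile.bak backup next to the edited file.

```shell
# enable_bundled_blas MAKEFILE: swap the two `lib:` targets so that LAPACK's
# bundled BLAS is built as well (hypothetical helper).
enable_bundled_blas() {
    # 1st expression: comment out the default target line.
    # 2nd expression: uncomment the variant that also builds blaslib.
    sed -i.bak \
        -e 's/^lib: lapacklib tmglib/#&/' \
        -e 's/^#\(lib: blaslib variants lapacklib tmglib\)/\1/' \
        "$1"
}

# Usage, from the directory that contains lapack-3.10.0
# (commented out so that pasting this block changes nothing):
# enable_bundled_blas lapack-3.10.0/Makefile
```

Running it a second time changes nothing, because neither pattern matches the already-swapped lines.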

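Building faiss

1. Run cmake

$ cmake -B build . -DFAISS_ENABLE_GPU=OFF -DFAISS_ENABLE_PYTHON=OFF -DBUILD_SHARED_LIBS=ON -DBUILD_TESTING=ON

2. Invoke make

$ make -C build -j faiss

When this finishes, the libraries are under build/faiss; in the author's tree both libfaiss.a (static) and libfaiss.so (shared) were present. Note that -DBUILD_SHARED_LIBS=ON selects the shared library, and a static libfaiss.a normally comes from a configure with BUILD_SHARED_LIBS=OFF. The build is now complete.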
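
A quick way to see which library files the build actually produced. The sketch below only inspects the build/faiss directory named above; report_faiss_libs is a hypothetical helper.

```shell
# report_faiss_libs BUILD_DIR: list the faiss libraries found in BUILD_DIR/faiss.
# Succeeds if at least one of them exists (hypothetical helper).
report_faiss_libs() {
    local dir="$1/faiss" found=0 lib
    for lib in libfaiss.so libfaiss.a; do
        if [ -e "$dir/$lib" ]; then
            echo "found $dir/$lib"
            found=1
        fi
    done
    [ "$found" -eq 1 ]
}

# Run from the faiss source root, where the cmake step created ./build.
report_faiss_libs build || echo "no faiss library under build/faiss yet"
```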

Demo test

1. Get the C++ demo program from the official repository

https://github.com/facebookresearch/faiss/blob/main/demos/demo_ivfpq_indexing.cpp
From INSTALL.md: "It creates a small index, stores it and performs some searches. A normal runtime is around 20s. With a fast machine and Intel MKL's BLAS it runs in 2.5s."

2. Build

$ make -C build demo_ivfpq_indexing

3. Run

$ ./build/demos/demo_ivfpq_indexing

Output:

xxx:~/faiss-project/faiss-1.7.4/build$ ./demos/demo_ivfpq_indexing
[0.003 s] Generating 100000 vectors in 128D for training
[1.381 s] Training the index
Training level-1 quantizer
Training level-1 quantizer on 100000 vectors in 128D
Training IVF residual
  Input training set too big (max size is 65536), sampling 65536 / 100000 vectors
computing residuals
training 4x256 product quantizer on 65536 vectors in 128D
Training PQ slice 0/4
Clustering 65536 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.02 s
  Iteration 24 (15.29 s, search 14.91 s): objective=113089 imbalance=1.004 nsplit=0
Training PQ slice 1/4
Clustering 65536 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.02 s
  Iteration 24 (15.73 s, search 15.39 s): objective=113306 imbalance=1.003 nsplit=0
Training PQ slice 2/4
Clustering 65536 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.02 s
  Iteration 24 (15.66 s, search 15.33 s): objective=113170 imbalance=1.004 nsplit=0
Training PQ slice 3/4
Clustering 65536 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.02 s
  Iteration 24 (15.24 s, search 14.91 s): objective=113194 imbalance=1.003 nsplit=0
precomputing IVFPQ tables type 1
[111.629 s] storing the pre-trained index to /tmp/index_trained.faissindex
[111.632 s] Building a dataset of 200000 vectors to index
[115.911 s] Adding the vectors to the index
IndexIVFPQ::add_core_o: adding 0:32768 / 200000
 add_core times: 0.004 686.171 26.808
IndexIVFPQ::add_core_o: adding 32768:65536 / 200000
 add_core times: 0.002 534.957 20.650
IndexIVFPQ::add_core_o: adding 65536:98304 / 200000
 add_core times: 0.003 593.247 25.669
IndexIVFPQ::add_core_o: adding 98304:131072 / 200000
 add_core times: 0.003 632.615 26.246
IndexIVFPQ::add_core_o: adding 131072:163840 / 200000
 add_core times: 0.004 532.333 24.529
IndexIVFPQ::add_core_o: adding 163840:196608 / 200000
 add_core times: 0.003 585.496 23.380
IndexIVFPQ::add_core_o: adding 196608:200000 / 200000
 add_core times: 0.002 30.934 1.796
[131.263 s] imbalance factor: 1.23044
[131.286 s] Searching the 5 nearest neighbors of 9 vectors in the index
[131.294 s] Query results (vector ids, then distances):
query  0:    1234  151960  126804   77560   57662
     dis: 7.43317  10.134 10.2898 10.6181 10.6383
query  1:    1235   62664  121471  103066   50589
     dis: 8.46871 11.0466 11.1016 11.3478 11.5615
query  2:    1236  136538  118385  149577  160936
     dis: 8.04413 10.1901 10.6975 10.7608  10.848
query  3:    1237   94345  115499  158406   18452
     dis: 8.17936  10.499 10.9742  11.096 11.1009
query  4:    1238   61048  144192  166318  120405
     dis: 6.68831 9.85044 9.89698 9.96831 10.2239
query  5:    1239  138293   82900  157864  146625
     dis: 7.97947 10.3305 10.3393 10.6917 10.7038
query  6:    1240   38741   26375  166753  120791
     dis: 8.54106  11.187 11.4251 11.6029 11.6392
query  7:    1241   57306  180584    6840  133189
     dis: 7.54973 10.7408 10.9239 10.9464 11.0586
query  8:    1242   71537  115685   18620  191950
     dis: 7.45555 10.1865 10.4495 10.7086 10.7492
note that the nearest neighbor is not at distance 0 due to quantization errors
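
In the output above, the first id returned for query N is 1234 + N in every case, which suggests that the queries are vectors that were themselves added to the index; their distances are nonzero only because of the quantization error the demo mentions. A small sketch that checks this pattern on demo output, assuming the "query N: id id ..." line format shown above; check_top1 and its default base id of 1234 are choices made for this example:

```shell
# check_top1 [BASE]: read demo output on stdin and verify that the first
# neighbor id of "query N" equals BASE + N (hypothetical helper, BASE defaults to 1234).
check_top1() {
    awk -v base="${1:-1234}" '
        $1 == "query" {
            n = $2; sub(/:$/, "", n)          # "0:" -> "0"
            if ($3 != base + n) {             # $3 is the top-1 neighbor id
                bad++
                print "query " n ": top-1 is " $3
            }
            total++
        }
        END {
            printf "%d queries checked, %d mismatches\n", total, bad
            exit (bad > 0)
        }
    '
}

# Usage (commented out because the demo takes a while to run):
# ./build/demos/demo_ivfpq_indexing | check_top1
```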