faiss: a fast library for face search



  • What is faiss?

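    Many people don't know how face recognition actually works. Given a gallery of 1000 known faces, working out who each face detected in the current video frame belongs to is a hard task. Why?
    From an algorithmic point of view, you have to loop over the detected faces and compute the distance from each one to all 1000 gallery faces, then take the smallest distance. If that smallest distance is below some threshold, we say we have recognized that person. Now suppose there are 5 people in the frame: every single frame then costs 5000 distance computations. That is a lot of computation.
    faiss is a library built to solve exactly this problem.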
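
    To make the cost concrete, here is a minimal sketch of the naive approach described above, in plain C++ with no faiss involved. The names identify and max_sq_dist are made up for illustration, and the threshold value is something you would have to tune for your own face embeddings.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Squared Euclidean (L2) distance between two dim-dimensional vectors.
float squared_l2(const float* a, const float* b, std::size_t dim) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < dim; ++i) {
        const float diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}

// Hypothetical helper: returns the index of the gallery vector closest to
// `query`, or -1 when even the best match is farther away than `max_sq_dist`
// (i.e. an unknown face). `gallery` stores its vectors back to back,
// `dim` floats per vector.
long identify(const std::vector<float>& gallery, std::size_t dim,
              const float* query, float max_sq_dist) {
    assert(dim > 0 && gallery.size() % dim == 0);
    const std::size_t n = gallery.size() / dim;
    long best = -1;
    float best_dist = std::numeric_limits<float>::max();
    for (std::size_t i = 0; i < n; ++i) {  // one distance per enrolled face
        const float d = squared_l2(&gallery[i * dim], query, dim);
        if (d < best_dist) {
            best_dist = d;
            best = static_cast<long>(i);
        }
    }
    return best_dist <= max_sq_dist ? best : -1;
}
```

    Calling identify once per detected face gives the 5 x 1000 = 5000 distance computations per frame mentioned above.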

    What problem does it solve?

    Let's restate the problem above in more general terms. It should be described like this:

    • I have 10000 data points, each 128-dimensional (think of them as vectors);
    • I have 3 query vectors, also 128-dimensional.

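    So here is the problem. What I want to do is: out of the 10000 data points above, find the vectors closest to (most similar to) each of my 3 vectors. In essence, this is a similarity search.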
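
    Stated this way, the problem can be sketched as an exhaustive k-nearest-neighbour search over flat, row-major float arrays. The code below is plain C++ and only illustrates the problem statement, not how faiss implements it. The function name brute_force_knn is made up, and the D / I output layout is simply chosen to mirror the search call used later in this post.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Exhaustive k-nearest-neighbour search over row-major float arrays.
// xb holds nb database vectors, xq holds nq query vectors, each of length dim.
// On return, I[q * k + j] is the index of the j-th closest database vector
// to query q, and D[q * k + j] is the squared L2 distance to it.
void brute_force_knn(const std::vector<float>& xb, std::size_t nb,
                     const std::vector<float>& xq, std::size_t nq,
                     std::size_t dim, std::size_t k,
                     std::vector<float>& D, std::vector<long>& I) {
    assert(k <= nb && xb.size() == nb * dim && xq.size() == nq * dim);
    D.assign(nq * k, 0.0f);
    I.assign(nq * k, -1);
    std::vector<std::pair<float, long>> scored(nb);
    for (std::size_t q = 0; q < nq; ++q) {
        for (std::size_t i = 0; i < nb; ++i) {
            float sum = 0.0f;
            for (std::size_t d = 0; d < dim; ++d) {
                const float diff = xb[i * dim + d] - xq[q * dim + d];
                sum += diff * diff;
            }
            scored[i] = {sum, static_cast<long>(i)};
        }
        // Only the k smallest distances need to end up in sorted order.
        std::partial_sort(scored.begin(),
                          scored.begin() + static_cast<std::ptrdiff_t>(k),
                          scored.end());
        for (std::size_t j = 0; j < k; ++j) {
            D[q * k + j] = scored[j].first;
            I[q * k + j] = scored[j].second;
        }
    }
}
```

    With nb = 10000, nq = 3 and dim = 128 this does 30000 distance computations of 128 terms each, which is the workload a similarity search library is meant to speed up.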

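    faiss handles exactly this kind of problem very well. Next time you run into a face-matching problem, don't say you know nothing about it. This site is here to give you that moment of enlightenment: the things you won't learn at school or at work, unorthodox tricks included, are all here.
    One last thing: in practice, many companies' face-recognition attendance machines, banks' risk-control systems, and some closely guarded homegrown algorithms are said to be built on faiss. Once you've learned it, you might even be able to pass yourself off in an interview as having worked on some large-scale intelligent search project over massive data..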

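    Getting started

    I won't go into how to build faiss. You can use the GPU or the CPU; in practice the CPU version is already very fast and copes fine with everyday face matching.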

    #include "faiss/IndexFlat.h"
    #include <cstdio>
    #include <cstdlib>
    #include <iostream>
    #include <string>
    
    using namespace std;
    
    int main()
    {
    
        // Database: 10000 vectors. This demo uses 64 dimensions;
        // set n_dim to 128 to match the description above.
        int n_dim = 64;
        float *xb = new float[10000 * n_dim];
        // Queries: 3 vectors whose nearest neighbours we want to find
        float *xq = new float[3 * n_dim];
    
        for (int i = 0; i < 10000; i++)
        {
            for (int j = 0; j < n_dim; j++)
            {
                xb[n_dim * i + j] = drand48();
            }
            xb[n_dim * i] += i / 1000.;
        }
    
        // Print the first 4 database vectors as a sanity check
        for (int i = 0; i < 4; i++)
        {
            for (int j = 0; j < n_dim; j++)
            {
                cout << xb[n_dim * i + j] << " ";
            }
            cout << endl;
        }
    
        for (int i = 0; i < 3; i++)
        {
            for (int j = 0; j < n_dim; j++)
            {
                xq[n_dim * i + j] = drand48();
            }
            xq[n_dim * i] += i / 1000.;
        }
    
        cout << "\n\n to query data: \n";
        for (int i = 0; i < 3; i++)
        {
            for (int j = 0; j < n_dim; j++)
            {
                cout << xq[n_dim * i + j] << " ";
            }
            cout << endl;
        }
        cout << "data init done.\n";
    
        // Create an exact (brute-force) L2 index with n_dim dimensions
        faiss::IndexFlatL2 index(n_dim);
        // Add the 10000 database vectors to the index
        index.add(10000, xb);
    
        cout << "ntotal: " << index.ntotal << endl;
    
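        const int k = 5;
        // I holds the indices of the k nearest neighbours of each of the 3 queries.
        // faiss::idx_t is a 64-bit integer (older releases spell it faiss::Index::idx_t).
        faiss::idx_t *I = new faiss::idx_t[3 * k];
        // D holds the matching squared L2 distances
        float *D = new float[3 * k];
        // Find the k closest database vectors for each of the 3 query vectors
        index.search(3, xq, k, D, I);
    
        // Print the neighbour indices, one row per query
        cout << "I: \n";
        for (int i = 0; i < 3; i++)
        {
            for (int j = 0; j < k; j++)
            {
                printf("%5ld ", static_cast<long>(I[i * k + j]));
            }
            cout << endl;
        }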
    
        // Print each neighbour's vector and its squared distance to the query
        cout << "Data: \n";
        for (int i = 0; i < 3; i++)
        {
            cout << "to query: " << to_string(i) << ": \n";
            for (int j = 0; j < k; j++)
            {
                cout << "find at " << I[i*k+j] << " row\n";
                for(int r = 0; r < n_dim; r++)
                {
                    cout << xb[I[i*k+j]*n_dim + r] << " ";
                }
                cout << "\ndistance: " << D[i*k+j] << endl;
                cout << endl;
            }
            cout << endl;
        }
    
        // Release the buffers allocated with new[]
        delete[] xb;
        delete[] xq;
        delete[] I;
        delete[] D;
        return 0;
    }
    

    That is the entire program.

    From the "I:" line onward, the output looks like this:

    I: 
      225    65   861   419    55 
      273   798   843   248   442 
      218   234  1036   666   462 
    Data: 
    to query: 0: 
    find at 225 row
    0.466021 0.157434 0.358225 0.925175 0.767356 0.947012 0.426969 0.643445 0.738131 0.728681 0.314912 0.606148 0.734014 0.061149 0.52515 0.278242 0.410853 0.575736 0.476858 0.716448 0.50381 0.735936 0.711614 0.726204 0.79187 0.532111 0.249043 0.495299 0.48978 0.614022 0.223693 0.907543 0.0861676 0.330432 0.339118 0.531724 0.497999 0.489021 0.570725 0.116967 0.153322 0.270842 0.472179 0.243972 0.0404449 0.671394 0.255326 0.286576 0.119363 0.52595 0.962982 0.749311 0.73053 0.681197 0.808226 0.349587 0.660384 0.160472 0.46143 0.408649 0.438705 0.338531 0.404171 0.470306 
    distance: 6.27085
    
    find at 65 row
    0.505255 0.28922 0.631373 0.918576 0.181592 0.588332 0.775476 0.771283 0.0793275 0.0847964 0.315319 0.457572 0.224771 0.68705 0.926852 0.529911 0.739419 0.801049 0.290878 0.515319 0.409873 0.803053 0.356623 0.800135 0.20588 0.337549 0.828714 0.880484 0.693433 0.33952 0.226299 0.812698 0.125389 0.490504 0.235795 0.0835198 0.562164 0.505707 0.443148 0.0495714 0.335637 0.114078 0.42344 0.50769 0.314323 0.874968 0.449576 0.556408 0.00430368 0.504196 0.199993 0.413889 0.478442 0.230585 0.068269 0.254272 0.369448 0.832987 0.353961 0.913728 0.79681 0.52022 0.420583 0.766039 
    distance: 6.4031
    
    find at 861 row
    0.873469 0.750993 0.857495 0.47256 0.782863 0.476003 0.349614 0.16134 0.344077 0.440439 0.867413 0.116205 0.171824 0.629315 0.713136 0.260927 0.967486 0.329049 0.28973 0.461988 0.661987 0.554029 0.603161 0.257836 0.201252 0.928316 0.446082 0.876206 0.614237 0.349172 0.832989 0.635202 0.271704 0.97169 0.485299 0.455559 0.63825 0.581464 0.072068 0.324728 0.792392 0.16425 0.034549 0.0866274 0.333672 0.478069 0.0246297 0.588204 0.258855 0.11482 0.710857 0.715049 0.921586 0.976496 0.221699 0.178565 0.194623 0.247428 0.15257 0.577608 0.537408 0.921213 0.537721 0.514654 
    distance: 6.47911
    
    find at 419 row
    0.693233 0.36702 0.463789 0.302791 0.984381 0.798606 0.906685 0.64936 0.0597382 0.439838 0.716839 0.842966 0.506091 0.833202 0.584998 0.932343 0.911039 0.919794 0.232314 0.546304 0.0473816 0.0195894 0.626377 0.205462 0.737214 0.979008 0.729008 0.884622 0.301808 0.451535 0.563585 0.958041 0.546344 0.120554 0.381652 0.531712 0.808454 0.425854 0.0982931 0.864027 0.226094 0.415601 0.989373 0.537119 0.34752 0.254235 0.265532 0.469958 0.260998 0.604158 0.500142 0.980621 0.0147639 0.136788 0.761824 0.394391 0.631748 0.278178 0.53514 0.157486 0.105327 0.558782 0.282595 0.819856 
    distance: 6.49069
    
    find at 55 row
    0.257709 0.0859596 0.00411785 0.823344 0.965238 0.336131 0.316234 0.899132 0.213677 0.97542 0.556926 0.0236263 0.574166 0.493997 0.623751 0.57421 0.683779 0.610097 0.420989 0.864925 0.623292 0.580468 0.386176 0.413755 0.26925 0.924793 0.676345 0.805018 0.2051 0.206813 0.469955 0.489271 0.927807 0.877865 0.461383 0.285862 0.701429 0.77285 0.705987 0.413759 0.633338 0.0535123 0.551255 0.0130009 0.676162 0.759635 0.827795 0.53618 0.0360931 0.585662 0.715104 0.306481 0.346177 0.620009 0.511843 0.441329 0.344913 0.0284955 0.777338 0.229621 0.29236 0.634718 0.146859 0.692429 
    distance: 6.49767
    
    
    to query: 1: 
    find at 273 row
    0.677296 0.585997 0.0553738 0.864789 0.11853 0.827149 0.297492 0.408547 0.451095 0.45367 0.615399 0.612543 0.60031 0.406916 0.458278 0.481905 0.700731 0.90727 0.408412 0.206795 0.823983 0.566926 0.59377 0.655704 0.638159 0.445796 0.902688 0.653724 0.129332 0.134317 0.967511 0.759879 0.574875 0.46029 0.897517 0.471399 0.793974 0.0578429 0.996367 0.790528 0.176969 0.704368 0.732001 0.829726 0.905669 0.0661185 0.901356 0.307939 0.368865 0.725505 0.69113 0.123736 0.000904269 0.164443 0.449403 0.66592 0.529893 0.710248 0.799975 0.63966 0.0221227 0.416043 0.928486 0.212836 
    distance: 6.14798
    
    find at 798 row
    1.15928 0.543615 0.224626 0.916172 0.90518 0.659441 0.629947 0.537924 0.392073 0.941471 0.678197 0.965487 0.375038 0.749908 0.534098 0.72416 0.0459989 0.523811 0.047497 0.290382 0.520448 0.744092 0.222655 0.898367 0.122647 0.521324 0.713795 0.836127 0.107169 0.505237 0.360213 0.851549 0.7897 0.682583 0.140249 0.429699 0.412399 0.0805599 0.921543 0.215152 0.0774941 0.573476 0.801407 0.926418 0.791248 0.19095 0.780052 0.697448 0.565905 0.227274 0.683294 0.853651 0.830031 0.374929 0.170605 0.921973 0.251035 0.262592 0.927703 0.937581 0.83409 0.52176 0.67702 0.359982 
    distance: 7.00169
    
    find at 843 row
    1.10931 0.359975 0.801474 0.728754 0.416373 0.778199 0.736728 0.122723 0.951517 0.880154 0.411319 0.0441091 0.0909334 0.596108 0.899065 0.226211 0.851917 0.839522 0.478784 0.623026 0.487713 0.907234 0.684378 0.838186 0.0877218 0.997805 0.762191 0.782982 0.0210723 0.363053 0.948904 0.623911 0.0989273 0.61534 0.858033 0.178676 0.60604 0.254504 0.660599 0.468336 0.798998 0.952731 0.70202 0.570841 0.381107 0.0833921 0.0768369 0.45147 0.589095 0.331361 0.428048 0.288646 0.314958 0.549034 0.618986 0.454374 0.540626 0.786055 0.791562 0.748264 0.890162 0.224111 0.851973 0.489301 
    distance: 7.12636
    
    find at 248 row
    0.914648 0.305961 0.495205 0.874023 0.458822 0.943957 0.123607 0.133754 0.907432 0.911847 0.363161 0.314546 0.202649 0.797917 0.437428 0.503002 0.262552 0.976047 0.784578 0.62072 0.691318 0.266727 0.324286 0.473234 0.679304 0.286232 0.618868 0.0670787 0.529058 0.418717 0.312213 0.859206 0.467795 0.116242 0.738621 0.284527 0.729952 0.488854 0.51562 0.11724 0.903801 0.54024 0.556299 0.648388 0.777709 0.724234 0.87198 0.759778 0.391615 0.924164 0.977455 0.212833 0.397751 0.312046 0.679121 0.585377 0.815007 0.539857 0.619751 0.530813 0.679936 0.203051 0.599092 0.419228 
    distance: 7.18848
    
    find at 442 row
    1.0593 0.958399 0.857755 0.508242 0.723734 0.845 0.94509 0.0920004 0.567447 0.681435 0.850301 0.365953 0.510369 0.75197 0.782022 0.022814 0.521615 0.261907 0.370121 0.737841 0.363219 0.766053 0.350891 0.469102 0.227089 0.296823 0.472253 0.626026 0.208972 0.558516 0.252179 0.418061 0.554042 0.701471 0.819957 0.956868 0.160271 0.355167 0.986034 0.256701 0.220898 0.932148 0.708058 0.346972 0.169928 0.00512246 0.362594 0.232563 0.761832 0.48826 0.386363 0.380883 0.368631 0.741925 0.43805 0.0155159 0.567612 0.779756 0.850531 0.505262 0.177619 0.84065 0.72049 0.0713463 
    distance: 7.37195
    
    
    to query: 2: 
    find at 218 row
    0.789956 0.127211 0.350287 0.484015 0.820771 0.461537 0.850525 0.660183 0.323255 0.473371 0.439952 0.570391 0.405359 0.830808 0.880974 0.497624 0.7853 0.99512 0.59151 0.969219 0.305285 0.502814 0.413822 0.859267 0.512544 0.774588 0.463374 0.0351958 0.938322 0.272807 0.294392 0.643575 0.378597 0.757543 0.371534 0.423003 0.438085 0.83781 0.409435 0.534455 0.90046 0.128706 0.319977 0.500031 0.176647 0.396892 0.405317 0.893858 0.656467 0.786754 0.745251 0.403624 0.447224 0.889079 0.971002 0.0132697 0.879131 0.254961 0.392535 0.0976032 0.648378 0.322591 0.679886 0.684816 
    distance: 6.59329
    
    find at 234 row
    1.17942 0.76557 0.759288 0.291068 0.189831 0.529297 0.131956 0.0411172 0.376996 0.0831702 0.784108 0.650577 0.109366 0.147236 0.319566 0.446089 0.189489 0.874222 0.369946 0.776055 0.188854 0.781561 0.278282 0.275609 0.926888 0.811937 0.033379 0.00249271 0.604275 0.454331 0.219904 0.625784 0.868002 0.301576 0.705079 0.28289 0.371428 0.632659 0.808288 0.213335 0.272214 0.737722 0.58116 0.0190753 0.627956 0.0410134 0.676979 0.344506 0.300955 0.302649 0.801941 0.884695 0.969953 0.61508 0.437414 0.152995 0.321124 0.766734 0.000507648 0.120141 0.809633 0.175838 0.796751 0.192669 
    distance: 6.62435
    
    find at 1036 row
    1.64188 0.552404 0.11612 0.661468 0.488598 0.852641 0.787818 0.178657 0.280292 0.0725081 0.984331 0.478286 0.0196322 0.632602 0.875309 0.24 0.601203 0.0437491 0.370092 0.719343 0.313233 0.649225 0.851744 0.186134 0.689183 0.810082 0.0880837 0.596978 0.460979 0.153177 0.222988 0.19667 0.475848 0.91624 0.51381 0.100563 0.421953 0.638504 0.719034 0.608129 0.44102 0.108308 0.0407314 0.725532 0.732938 0.380898 0.903016 0.585978 0.0749376 0.000878291 0.793043 0.620847 0.486585 0.125832 0.92559 0.187296 0.724833 0.762369 0.714592 0.150298 0.747136 0.645827 0.371833 0.665408 
    distance: 6.70976
    
    find at 666 row
    0.930705 0.0620238 0.735576 0.953759 0.495099 0.339989 0.204248 0.118341 0.329635 0.669598 0.594073 0.501553 0.0600229 0.250359 0.533784 0.233426 0.288357 0.553333 0.108614 0.543632 0.379357 0.45851 0.774319 0.322116 0.965752 0.264567 0.631253 0.872114 0.429717 0.48243 0.088136 0.384168 0.925825 0.690061 0.385414 0.89012 0.656293 0.609912 0.707917 0.496024 0.598189 0.122142 0.48215 0.731042 0.758647 0.435827 0.0392148 0.839645 0.795122 0.787899 0.822416 0.756708 0.683459 0.146435 0.249589 0.574608 0.200415 0.673407 0.802742 0.289194 0.665659 0.504368 0.838461 0.0141903 
    distance: 6.78529
    
    find at 462 row
    0.521379 0.306417 0.0181092 0.444188 0.338208 0.555521 0.346579 0.233428 0.301085 0.672265 0.856702 0.345093 0.112477 0.983967 0.726925 0.198079 0.442038 0.150574 0.112896 0.963589 0.775355 0.474437 0.980524 0.547615 0.134789 0.138315 0.56777 0.639174 0.840585 0.751075 0.393751 0.619752 0.838916 0.622288 0.834107 0.316815 0.585204 0.559258 0.76408 0.893745 0.338895 0.832426 0.421515 0.788684 0.0436307 0.577539 0.575335 0.499603 0.231649 0.976224 0.993221 0.13217 0.803416 0.401813 0.605905 0.216065 0.246232 0.546508 0.263178 0.105177 0.225365 0.711612 0.495208 0.824331 
    distance: 6.83551
    
    

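    Out of the 10000 database vectors, we found the ones most similar to the 3 query vectors: for each query, the 5 nearest neighbours, together with their distances.

    Note that faiss does not take the square root of the results: IndexFlatL2 returns squared L2 distances. The reason is simple. The squared distance already preserves the ordering, so taking a square root for every candidate would be wasted work. If you need the true Euclidean distance, take the square root of the final results yourself.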
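
    If you do want true Euclidean distances, a small post-processing step over the returned distances is enough. A minimal sketch follows; the helper name to_euclidean is made up, and it assumes you have copied the k distances per query into a std::vector.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Converts squared L2 distances into true Euclidean distances, in place.
// The square root is monotonic for non-negative inputs, so the order of the
// results does not change.
void to_euclidean(std::vector<float>& distances) {
    for (float& d : distances) {
        assert(d >= 0.0f);  // squared distances can never be negative
        d = std::sqrt(d);
    }
}
```

    This takes k square roots per query (5 here) instead of one per database vector, which is why it is cheap to do at the end.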

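    So the next question is: how do you extract the face features in the first place?

    Keep an eye on our algorithm platform: http://codes.strangeai.pro , where we will keep posting the fastest face detection and matching algorithms.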

