Friday, 29 April 2016

Content based image retrieval(CBIR) 02--Flow of CBIR, part B

    This is the second part of the flow of CBIR. In this post I record step 6 and step 7. There are only two steps, but the last one is a little complicated.

Step 6 : Build inverted index


void cbir_bovw::build_code_book(size_t code_size)
{
   hist_type hist;
   hist.load(setting_["hist"].GetString() +
             std::string("_") +
             std::to_string(code_size));

   invert_index invert;
   ocv::cbir::build_inverted_index(hist, invert);
   invert.save(setting_["inverted_index"].GetString() +
              std::string("_") +
              std::to_string(code_size));
}

  This part is quite straightforward: the invert_index is simply an encapsulation of std::map and std::vector. Applying an inverted index may improve the accuracy of the CBIR system, but I still need to measure this.
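
  To make the idea concrete, here is a minimal sketch of an inverted index built from std::map and std::vector. It is not the real invert_index of the library: the class name simple_inverted_index and its member functions are hypothetical, and I assume each histogram is a plain vector of visual word counts.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <vector>

//hypothetical stand-in for invert_index, it maps every
//visual word to the ids of the images which contain it
class simple_inverted_index
{
public:
    //hists[i][w] is how often visual word w occurs in image i
    void build(std::vector<std::vector<float>> const &hists)
    {
        index_.clear();
        for(std::size_t img = 0; img != hists.size(); ++img){
            for(std::size_t word = 0; word != hists[img].size(); ++word){
                if(hists[img][word] > 0){
                    index_[word].push_back(img);
                }
            }
        }
    }

    //return the ids (sorted, unique) of the images which share
    //at least one visual word with the query histogram
    std::vector<std::size_t>
    candidates(std::vector<float> const &query) const
    {
        std::set<std::size_t> found;
        for(std::size_t word = 0; word != query.size(); ++word){
            if(query[word] > 0){
                auto const it = index_.find(word);
                if(it != std::end(index_)){
                    found.insert(std::begin(it->second),
                                 std::end(it->second));
                }
            }
        }
        return std::vector<std::size_t>(std::begin(found),
                                        std::end(found));
    }

private:
    std::map<std::size_t, std::vector<std::size_t>> index_;
};
```

  With an index like this, a query only has to be compared with the images returned by candidates, not with every histogram of the dataset.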

Step 7 : Search image

  After step 6, I have prepared most of the tools of this CBIR system, so it is time to start searching. I have four ways to search for an image, as shown in pic00.
pic00

  As usual, a picture is worth a thousand words. The first solution (pic01) is the easiest one: it uses neither IDF (inverse document frequency) nor spatial information.
pic01



//The api of this function is poor, but I think it is
//acceptable in this small example. However, in real projects
//we should not let this kind of code exist: bad code
//attracts more bad code and, in the end, your project
//becomes very hard to maintain
double measure_impl(Searcher &searcher,
                    ocv::cbir::f2d_detector &f2d,
                    BOVW const &bovw,
                    hist_type const &hist,
                    arma::Mat<cbir_bovw::feature_type> const &code_book,                    
                    rapidjson::Document const &doc,
                    rapidjson::Document const &setting)
{
    //total_score saves the number of "hit" images of ukbench
    double total_score = 0;
    auto const folder =
            std::string(setting["img_folder"].GetString());
    
    auto const files = ocv::file::get_directory_files(folder);
    for(size_t i = 0; i != files.size(); ++i){
        cv::Mat gray =
                cv::imread(folder + "/" + files[i],
                           cv::IMREAD_GRAYSCALE);
        //f2d.get_descriptor is the bottleneck
        //of the program, more than 85% of the computation
        //time is spent in it
        auto describe =
                f2d.get_descriptor(gray);
        //transfer cv::Mat to arma::Mat without copy
        arma::Mat<cbir_bovw::feature_type> const
                arma_features(describe.second.ptr<cbir_bovw::feature_type>(0),
                              describe.second.cols,
                              describe.second.rows,
                              false);
        //build the histogram of the image we want to search
        auto const target_hist =
                bovw.describe(arma_features,
                              code_book);   
        //search the image     
        auto const result =
                searcher.search(target_hist, hist);        

        //find the relevant files of the image "files[i]"
        auto const &value = doc[files[i].c_str()];
        std::set<std::string> relevant;
        for(rapidjson::SizeType j = 0;
            j != value.Size(); ++j){
            relevant.insert(value[j].GetString());
        }
        //increment total_score for each of the first 4 images
        //of the search result that belongs to the relevant images
        for(size_t j = 0; j != relevant.size(); ++j){
            auto it = relevant.find(files[result[j]]);
            if(it != std::end(relevant)){
                ++total_score;         
            }
        }        
    }

    return total_score;
}

  That is it. I wrote down how to apply IDF and spatial info on github.
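
  For readers who are not familiar with IDF, here is a minimal sketch of the usual weighting, idf(w) = log(N / n_w), where N is the number of images and n_w is the number of images which contain visual word w. This is not the code on github: the function names compute_idf and apply_idf are hypothetical, and I assume every histogram has the same length.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

//compute the idf weight of every visual word,
//hists[i][w] is how often word w occurs in image i
std::vector<double>
compute_idf(std::vector<std::vector<double>> const &hists)
{
    if(hists.empty()){
        return {};
    }

    std::size_t const words = hists.front().size();
    //doc_freq[w] counts the images which contain word w
    std::vector<double> doc_freq(words, 0.0);
    for(auto const &hist : hists){
        for(std::size_t w = 0; w != words; ++w){
            if(hist[w] > 0){
                ++doc_freq[w];
            }
        }
    }

    auto const total = static_cast<double>(hists.size());
    std::vector<double> idf(words, 0.0);
    for(std::size_t w = 0; w != words; ++w){
        //words which never occur keep weight 0
        if(doc_freq[w] > 0){
            idf[w] = std::log(total / doc_freq[w]);
        }
    }

    return idf;
}

//scale every bin of the histogram by the idf weight of its word
std::vector<double> apply_idf(std::vector<double> hist,
                              std::vector<double> const &idf)
{
    std::size_t const size = std::min(hist.size(), idf.size());
    for(std::size_t w = 0; w != size; ++w){
        hist[w] *= idf[w];
    }
    return hist;
}
```

  A word which appears in every image gets weight log(1) = 0, so it no longer contributes to the comparison, while rare words count more.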


Results

Without inverse document frequency(IDF) and spatial verification(pic01) : 3.044
With inverse document frequency : 3.035
With spatial verification : 3.082
With inverse document frequency and spatial verification : 3.13

  In conclusion, I get the best results when I apply both IDF and spatial verification. The results could be improved if I invested more time in tuning the parameters, such as the number of code books or the parameters of kaze, or in trying another feature extractor.
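
  The scores above can be read as follows, assuming they are the usual ukbench score, i.e. the average number of relevant images among the first 4 search results of every query (4.0 would be perfect). The sketch below only illustrates that assumption; the names count_hits and average_score are hypothetical and are not part of the measuring code.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

//count how many of the first top_n results are relevant
std::size_t count_hits(std::vector<std::string> const &ranked,
                       std::set<std::string> const &relevant,
                       std::size_t top_n = 4)
{
    std::size_t hits = 0;
    //never read beyond the end of the result list
    std::size_t const limit = std::min(top_n, ranked.size());
    for(std::size_t i = 0; i != limit; ++i){
        if(relevant.count(ranked[i]) != 0){
            ++hits;
        }
    }
    return hits;
}

//average the hits of all queries, this is the number
//I assume to be reported as the score
double average_score(std::vector<std::size_t> const &hits_per_query)
{
    if(hits_per_query.empty()){
        return 0.0;
    }

    double total = 0.0;
    for(auto const hits : hits_per_query){
        total += static_cast<double>(hits);
    }
    return total / static_cast<double>(hits_per_query.size());
}
```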

Problems of this solution

1 : It is slow. It takes about 300ms~500ms to extract kaze keypoints and features from a single channel 640x480 image.
2 : It consumes a lot of memory. kaze uses about 150MB to extract keypoints and features.

  If your application only runs on a local machine, this is not a problem, but if you want to develop a web app, it is a serious one. We need a much faster yet still quite accurate CBIR system if we want to deploy it in a high traffic web app, as TinEye and Google did.