Introduction

FMIndex Collection is written in modern C++20 and provides a set of concepts, classes and functions for building and querying FM-Indices and their internal support structures.

Functionality

  • (bidirectional) FM-Indices
  • search schemes generator
  • several search algorithms (with edit distance or Hamming distance)
  • some bit vector implementations
  • many more vectors with rank support (for vectors over type uint8_t)
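The first and last items in this list belong together: an FM-Index answers queries by repeatedly asking a rank structure "how many times does symbol c occur before position i?". The following is a minimal sketch of that idea, not the library's implementation. The name `ToyFMIndex` and all of its members are hypothetical, and it assumes a small alphabet in which 0 is reserved as the end marker. It uses a naive suffix sort and a full rank table, which is only practical for tiny inputs.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Toy FM-index for symbols in [1, Sigma); 0 is reserved as the end marker.
// Hypothetical illustration only, not the fmindex_collection API.
template <std::size_t Sigma>
struct ToyFMIndex {
    std::vector<std::uint8_t> bwt;                    // Burrows-Wheeler transform of the text
    std::array<std::size_t, Sigma + 1> C{};           // C[c] = number of symbols smaller than c
    std::vector<std::array<std::size_t, Sigma>> occ;  // occ[i][c] = occurrences of c in bwt[0, i)

    explicit ToyFMIndex(std::vector<std::uint8_t> text) {
        text.push_back(0); // end marker, smaller than every other symbol
        auto const n = text.size();

        // Naive suffix array: sort all suffix start positions lexicographically.
        std::vector<std::size_t> sa(n);
        std::iota(sa.begin(), sa.end(), std::size_t{0});
        std::sort(sa.begin(), sa.end(), [&](std::size_t a, std::size_t b) {
            return std::lexicographical_compare(
                text.begin() + static_cast<std::ptrdiff_t>(a), text.end(),
                text.begin() + static_cast<std::ptrdiff_t>(b), text.end());
        });

        // BWT: the symbol that (cyclically) precedes each sorted suffix.
        bwt.resize(n);
        for (std::size_t i = 0; i < n; ++i) {
            bwt[i] = text[(sa[i] + n - 1) % n];
        }

        // Naive rank support: one row of prefix counts per position.
        occ.resize(n + 1); // rows are value-initialized to zero
        for (std::size_t i = 0; i < n; ++i) {
            occ[i + 1] = occ[i];
            ++occ[i + 1][bwt[i]];
        }

        // Cumulative symbol counts.
        for (std::size_t c = 0; c < Sigma; ++c) {
            C[c + 1] = C[c] + occ[n][c];
        }
    }

    // Number of occurrences of c in bwt[0, i).
    std::size_t rank(std::size_t i, std::uint8_t c) const {
        return occ[i][c];
    }

    // Backward search: narrows the suffix array interval [lb, rb),
    // consuming the pattern from its last symbol to its first.
    std::size_t count(std::vector<std::uint8_t> const& pattern) const {
        std::size_t lb = 0;
        std::size_t rb = bwt.size();
        for (auto it = pattern.rbegin(); it != pattern.rend() && lb < rb; ++it) {
            lb = C[*it] + rank(lb, *it);
            rb = C[*it] + rank(rb, *it);
        }
        return rb - lb;
    }
};
```

Constructing `ToyFMIndex<5>` from the text `{1, 1, 1, 2, 2, 2, 3, 2, 4, 1, 1, 1}` and calling `count({1, 1})` reports four occurrences. Each step of `count` costs two rank queries, which is why compact and fast rank structures take up so much of the list above.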

Dependencies

Usage

CPM

CPMAddPackage(
  NAME fmindex_collection
  GITHUB_REPOSITORY SGSSGene/fmindex_collection
  GIT_TAG v0.0.0 # put newest version here
  SYSTEM YES
)
...
target_link_libraries(${PROJECT_NAME}
    fmindex_collection::fmindex_collection
)

C++

#include <fmindex-collection/fmindex-collection.h>
#include <fmt/format.h>
#include <cstddef>
#include <cstdint>
#include <vector>
void someFunction() {
    // your database/the data you want to search through
    auto reference = std::vector<std::vector<uint8_t>> {
        {1, 1, 1, 2, 2, 2, 3, 2, 4, 1, 1, 1},
        {1, 2, 1, 2, 3, 4, 3},
    };
    // Creating an FM-Index
    using OccTable = occtable::Interleaved_16<5>; // alphabet size: largest character + 1 (the data must not contain 0)
    auto index = FMIndex<OccTable>{reference, /*samplingRate*/16, /*threadNbr*/1};

    // The stuff you are searching for
    auto queries = std::vector<std::vector<uint8_t>>{{1, 3}, {4, 2}};

    search_backtracking::search(index, queries, /*.numberOfAllowedErrors=*/0, [&](size_t queryId, auto cursor, size_t errors) {
        (void)errors;
        fmt::print("found something {} {}\n", queryId, cursor.count());
        for (auto i : cursor) {
            auto [chr, pos] = index.locate(i);
            fmt::print("chr/pos: {}/{}\n", chr, pos);
        }
    });