Take two character strings in C or C++,
s2. It is rather trivial to check if one contains the other exactly.
The following will return
true in case
s2 is a substring of
#include <string> str.find(str2) != string::npos
#include <boost/algorithm/string.hpp> boost::algorithm::contains(s1, s2)
What I am looking for is an analogous way of efficiently (in terms of speed, not memory) finding whether a string is approximately contained in / is approximately a substring of another up to a given difference threshold. Pretty much like the
agrep bash command does in Unix systems for finding files.
For example, suppose the function was called
approx_contains(s1, s2, 4) could return true/false in case
s2 is contained within
s1 up to a Levenshtein distance of 4.
Searching online, I only found numerous references and questions on how to calculate Levenshtein distance between two strings, or theoretical algorithms for approximate string pattern matching that go beyond simply checking whether or not one string contains another – which here becomes wasteful.
Trying very hard to avoid reinventing the wheel, how could someone do such a checking in C or C++?
Source: Windows Questions C++