Hi,
I am trying to build a regular expression from a series of keywords so that it matches any of them in a body of text. It needs to match each keyword either as a whole word or as a substring within a word, depending on a setting for each one. So far I have been using:
tempKeywordStr = replaceSpecialChars(keywords_to_search(j).keyword)
If keyword_substring_search = False Then
    ' Add into string to use in regular expression. \b is used to indicate whole words only
    keywordListstr = keywordListstr & "\b" & tempKeywordStr & "\b" & "|"
Else
    ' Add into string to use in regular expression, search within words
    keywordListstr = keywordListstr & tempKeywordStr & "|"
End If
Function replaceSpecialChars(ByVal input As String) As String
    Dim temp As String = input
    'Escape any special characters apart from the out two
    temp = Replace(temp, "\", "\\")
    temp = Replace(temp, "+", "\+")
    temp = Replace(temp, "*", "\*")
    temp = Replace(temp, "?", "\?")
    temp = Replace(temp, "|", "\|")
    temp = Replace(temp, "{", "\{")
    temp = Replace(temp, "}", "\}")
    temp = Replace(temp, "[", "\[")
    temp = Replace(temp, "]", "\]")
    temp = Replace(temp, "(", "\(")
    temp = Replace(temp, ")", "\)")
    temp = Replace(temp, "^", "\^")
    temp = Replace(temp, "$", "\$")
    temp = Replace(temp, ".", "\.")
    temp = Replace(temp, "#", "\#")
    Return temp
End Function
The problem with this is that it doesn't pick up keywords such as "(test)", where the special characters are on the outside of the keyword being searched, when keyword_substring_search = False. It is almost as if it doesn't like \b being next to an escaped special character, since it finds these keywords without \b.
Does anyone know how to get around this? I need to be able to find every keyword I am passed, no matter how many special characters it contains or where those characters sit in the keyword.
Thanks in advance for any help!
Cheers,
Tom
Update:
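I've just read that \b only matches at a position between a \w character and a non-\w character (or the start/end of the text), so it cannot match next to a keyword that begins or ends with a special character such as "(". It seems for my functionality I might need to write my own lookahead / lookbehind... any tips on what I might need?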
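To illustrate the difference, here is a minimal sketch of what such a custom boundary might look like. It assumes the .NET System.Text.RegularExpressions.Regex class is available (not the VBScript.RegExp object), and the helper name BuildWholeWordPattern is hypothetical. It swaps \b for the lookarounds (?<!\w) and (?!\w), which only require that no word character sits directly before or after the keyword, and uses Regex.Escape in place of a hand-written escaping routine.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module KeywordBoundaryDemo
    ' Hypothetical helper (not from the original code): wraps an escaped
    ' keyword in lookarounds so it is not glued to a word character.
    Function BuildWholeWordPattern(ByVal keyword As String) As String
        Return "(?<!\w)" & Regex.Escape(keyword) & "(?!\w)"
    End Function

    Sub Main()
        Dim text As String = "run the (test) now, not the (test)s or a(test)"

        ' \b needs a word character on exactly one side, so it never
        ' matches between a space and "(" or after a trailing ")".
        Dim withWordBoundary As String = "\b" & Regex.Escape("(test)") & "\b"
        Console.WriteLine(Regex.Matches(text, withWordBoundary).Count)   ' prints 0

        ' The lookaround version matches only the standalone "(test)".
        Dim withLookarounds As String = BuildWholeWordPattern("(test)")
        Console.WriteLine(Regex.Matches(text, withLookarounds).Count)    ' prints 1
    End Sub
End Module
```

Whether "(test)s" or "a(test)" should count as a whole-word hit is a design choice; this sketch treats both as non-matches.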
Update 2:
These two links suggest a solution using a customised lookahead / lookbehind, but I can't seem to get this working with the regex object I am using in VB:
objRegEx = CreateObject("VBscript.RegExp")
Link 1
Link 2
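One likely reason is that the VBScript.RegExp engine supports lookahead, (?=...) and (?!...), but not lookbehind, so a pattern containing (?<!\w) is rejected. A possible workaround, sketched below under that assumption, is to consume the leading delimiter with a capturing group and read the keyword from the second submatch. The sketch assumes Windows with the VBScript regex COM object registered, and Option Strict Off for late binding; the sample text is made up for illustration.

```vbnet
Option Strict Off

Imports System

Module VbScriptRegexDemo
    Sub Main()
        ' Late-bound COM object, as in the original code.
        Dim objRegEx As Object = CreateObject("VBScript.RegExp")
        objRegEx.Global = True

        ' (^|\W)     start of text or a non-word character, standing in for lookbehind
        ' (\(test\)) the escaped keyword, captured as the second group
        ' (?!\w)     lookahead: no word character directly after the keyword
        objRegEx.Pattern = "(^|\W)(\(test\))(?!\w)"

        Dim matches As Object = objRegEx.Execute("run the (test) now, not the (test)s or a(test)")
        For Each m As Object In matches
            ' SubMatches is zero-based: index 0 is the delimiter, index 1 the keyword.
            Console.WriteLine(m.SubMatches(1))
        Next
    End Sub
End Module
```

Because the leading delimiter is consumed as part of the match, two keywords separated by a single character may not both be found in one pass, and the match position includes that extra character.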