This sounds like a question a programmer might ask after one medicinal cigarette too many. The computer science equivalent of “what is the sound of one hand clapping?”. But it is a question I have to decide the answer to.
I am adding indexOf() and lastIndexOf() operations to the Calculate transform of my data wrangling (ETL) software (Easy Data Transform). This will allow users to find the offset of one string inside another, counting from the start or the end of the string. Easy Data Transform is written in C++ and uses the Qt QString class for strings. There are indexOf() and lastIndexOf() methods for QString, so I thought it would be an easy job to wrap that functionality. Maybe 15 minutes to program it, write a test case and document it.
Obviously it wasn’t that easy, otherwise I wouldn’t be writing this blog post.
First of all, what is the index of “a” in “abc”? 0, obviously. QString( “abc” ).indexOf( “a” ) returns 0. Duh. Well, only if you are a (non-Fortran) programmer. Ask a non-programmer (such as my wife) and they will say: 1, obviously. It is the first character. Duh. Excel FIND( “a”, “abc” ) returns 1.
OK, most of my customers aren’t programmers. I can use 1-based indexing.
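As a rough sketch of that 0-based to 1-based shift, here is a small helper written against std::string rather than QString, so that it stands alone. The name firstIndexOneBased() is made up for illustration and is not part of Qt or Easy Data Transform. It keeps -1 as the "not found" marker, as QString::indexOf() does.

```cpp
#include <string>

// Hypothetical helper, not part of Qt or Easy Data Transform.
// Returns the 1-based position of the first occurrence of v1 in v2,
// or -1 if v1 does not occur in v2.
long firstIndexOneBased(const std::string& v1, const std::string& v2)
{
    const std::string::size_type pos = v2.find(v1); // 0-based offset, or npos
    if (pos == std::string::npos)
        return -1;
    return static_cast<long>(pos) + 1; // shift to 1-based
}
```

With this, firstIndexOneBased("a", "abc") gives 1, the answer a spreadsheet user would expect, rather than 0.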
But then things get trickier.
What is the index of an empty string in “abc”? 1 maybe, using 1-based indexing, or maybe empty is not a valid value to pass.
What is the index of an empty string in an empty string? Hmm. I guess the empty string does contain an empty string, but at what index? 1 maybe, using 1-based indexing, except there isn’t a first position in the string. Again, maybe empty is not a valid value to pass.
I looked at the Qt C++ QString, Javascript string and Excel FIND() function for answers. But they each give different answers and some of them aren’t even internally consistent. This is a simple comparison of the first index or last index of text v1 in text v2 in each (Excel doesn’t have an equivalent of lastIndexOf() that I am aware of):

Changing these to make all the valid results 1-based and setting invalid results to -1, for easy comparison:

So:
- Javascript disagrees with C++ QString and Excel on whether the first index of an empty string in an empty string is valid.
- Javascript disagrees with C++ QString on whether the last index of an empty string in a non-empty string is the index of the last character or 1 after the last character.
- C++ QString thinks the first index of an empty string in an empty string is the first character, but the last index of an empty string in an empty string is invalid.
It seems surprisingly difficult to come up with something intuitive and consistent! I think I am probably going to return an error message if either or both values are empty. This seems to me to be the only unambiguous and consistent approach.

I could return a 0 for a non-match or when one or both values are empty, but I think it is important to return different results in these 2 different cases. Also, not found and invalid feel qualitatively different to a calculated index to me, so shouldn’t be just another number. What do you think?
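To make the idea at this stage concrete, here is a minimal sketch of a result that keeps "found", "not found" and "invalid input" apart instead of folding them into one number. It uses std::string rather than QString, and IndexStatus, IndexResult and strictIndexOf() are hypothetical names invented for this example, not Easy Data Transform code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical types for illustration only.
enum class IndexStatus { Found, NotFound, InvalidInput };

struct IndexResult {
    IndexStatus status;
    std::size_t index; // 1-based; meaningful only when status == Found
};

// Treats an empty v1 or v2 as invalid input, and reports a non-match
// separately from a calculated index.
IndexResult strictIndexOf(const std::string& v1, const std::string& v2)
{
    if (v1.empty() || v2.empty())
        return { IndexStatus::InvalidInput, 0 };

    const std::size_t pos = v2.find(v1); // 0-based offset, or npos
    if (pos == std::string::npos)
        return { IndexStatus::NotFound, 0 };

    return { IndexStatus::Found, pos + 1 };
}
```

A caller can then show an error message for InvalidInput and something else for NotFound, without either being mistaken for a real index.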
*** Replace 14-Dec-2023 ***
I’ve been around the houses a bit more following feedback on this blog, the Easy Data Transform forum and Hacker News and this is what I have decided:
IndexOf() v1 in v2:
| v1 | v2 | IndexOf(v1,v2) |
|---|---|---|
|  |  | 1 |
| aba |  |  |
|  | aba | 1 |
| a | a | 1 |
| a | aba | 1 |
| x | y |  |
| world | hello world | 7 |
This is the same as Excel FIND() and differs from Javascript indexOf() (ignoring the difference in 0 or 1 based indexing) only for “”.indexOf(“”) which returns -1 in Javascript.
LastIndexOf() v1 in v2:
| v1 | v2 | LastIndexOf(v1,v2) |
|---|---|---|
|  |  | 1 |
| aba |  |  |
|  | aba | 4 |
| a | a | 1 |
| a | aba | 3 |
| x | y |  |
| world | hello world | 7 |
This differs from Javascript lastIndexOf() (ignoring the difference in 0 or 1 based indexing) only for “”.indexOf(“”) which returns -1 in Javascript.
Conceptually the index is the 1-based index of the first (IndexOf) or last (LastIndexOf) position where, if V1 is removed from the found position, it would have to be re-inserted in order to revert to V2. Thanks to layer8 on Hacker News for clarifying this.
Javascript and C++ QString return an integer and both use -1 as a placeholder value. But Easy Data Transform is returning a string (that can be interpreted as a number, depending on the transform) so we aren’t bound to using a numeric value. So I have left it blank where there is no valid result.
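The two tables above can be sketched in a few lines of standalone C++. This is an illustration under stated assumptions, not the Easy Data Transform implementation: it uses std::string instead of QString, the names indexOf1() and lastIndexOf1() are made up, and a blank result is modelled as std::nullopt. In this sketch, std::string::find() and rfind() shifted by one reproduce every row, including the empty-string cases.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical helpers for illustration; not Easy Data Transform code.

// 1-based index of the first occurrence of v1 in v2,
// or std::nullopt (the "blank" result) when there is no match.
std::optional<std::size_t> indexOf1(const std::string& v1, const std::string& v2)
{
    const std::size_t pos = v2.find(v1); // 0-based offset, or npos
    if (pos == std::string::npos)
        return std::nullopt;
    return pos + 1;
}

// 1-based index of the last occurrence of v1 in v2,
// or std::nullopt (the "blank" result) when there is no match.
// For an empty v1, rfind() returns v2.size(), i.e. one past the last character.
std::optional<std::size_t> lastIndexOf1(const std::string& v1, const std::string& v2)
{
    const std::size_t pos = v2.rfind(v1); // 0-based offset, or npos
    if (pos == std::string::npos)
        return std::nullopt;
    return pos + 1;
}
```

For example, lastIndexOf1("", "aba") holds 4, the position where an empty string would be re-inserted at the end, while indexOf1("x", "y") is empty.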
Now I have spent enough time down this rabbit hole and need to get on with something else! If you don’t like it you can always add an If with Calculate or use a Javascript transform to get the result you prefer.
*** Replace 15-Dec-2023 ***
Quite a bit of debate on this topic on Hacker News.