by Ricardo “Rudxain” Fernández Serrata
Version 2 (May 18, 2022)
All the different lengths that a string can have. Because Unicode is not ASCII.
CU: Code-Units
CP: Code-Points
B: Bytes
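As a rough illustration of how the three lengths can differ, here is a minimal Java sketch. It is not part of the flow: the class and method names are hypothetical, and UTF-8 is only an assumed encoding for the byte count.

```java
import java.nio.charset.StandardCharsets;

class StringLengths {
    // CU: number of UTF-16 code units, which is what Java's length() reports.
    static int codeUnits(String s) {
        return s.length();
    }

    // CP: number of Unicode code points (a surrogate pair counts once).
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // B: number of bytes, assuming the text is encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // "a" (U+0061), "é" (U+00E9), and U+1F600 written as a surrogate pair.
        String s = "a\u00E9\uD83D\uDE00";
        System.out.println(codeUnits(s));  // 4
        System.out.println(codePoints(s)); // 3
        System.out.println(utf8Bytes(s));  // 7
    }
}
```

Only for pure ASCII text do all three numbers agree.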
The length operator (`#`) returns its value in constant time, because Java stores each string's length as metadata, so there is no need to scan the string. `findAll()`, by contrast, ALWAYS has a linear runtime in the best case, and no fixed upper bound in the worst case. This means that `#` is always fast, while `findAll()` takes time at least proportional to the size of its input (and can get even worse if the regex backtracks, which can lead to EXPONENTIAL runtime).
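The difference can be sketched in plain Java. This is an illustration only, not the flow's actual implementation: `CodePointCounter` and `countByRegex` are hypothetical names, and the single-dot pattern is an assumed stand-in for whatever regex the flow passes to `findAll`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CodePointCounter {
    // DOTALL lets "." match line terminators too, so every code point is matched.
    private static final Pattern ANY_CODE_POINT = Pattern.compile(".", Pattern.DOTALL);

    // Linear time: the matcher has to walk the whole string, one code point per match.
    static int countByRegex(String s) {
        Matcher m = ANY_CODE_POINT.matcher(s);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDE00\n"; // "a", U+1F600, and a newline
        System.out.println(s.length());      // 4, read from the stored length
        System.out.println(countByRegex(s)); // 3, found by scanning the string
    }
}
```

This pattern cannot backtrack, so the scan stays linear. A pattern with nested quantifiers would not give that guarantee.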
If `s` is a text string then `#split(s) = #s` is always true, because `split(text, null)` works at the CU level.
If your flow has to check how many CPs a text has, and does so repeatedly on the same text, store the result of `findAll` in a variable once, and code your flow to read that variable instead of calling `findAll` again. Your flow will run faster and use less energy.
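The same idea, sketched in Java under the assumption that the text does not change after it is measured. `MeasuredText` is a hypothetical helper, not something the flow provides.

```java
final class MeasuredText {
    final String text;
    // Computed once in the constructor, then only read afterwards.
    final int codePoints;

    MeasuredText(String text) {
        this.text = text;
        this.codePoints = text.codePointCount(0, text.length());
    }

    public static void main(String[] args) {
        MeasuredText m = new MeasuredText("\uD83D\uDE00!"); // U+1F600 followed by "!"
        // Every later check reads the stored field; the string is not scanned again.
        for (int i = 0; i < 3; i++) {
            System.out.println(m.codePoints); // 2
        }
    }
}
```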
In general, `char(x)[0] != x`, not just because `x` might be a non-integer, but because `char` can return a surrogate pair, while `[0]` selects only the 1st CU (ignoring the 2nd surrogate CU of the pair).
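A rough Java analogue of this pitfall, assuming `char(x)` behaves like building the text for code point `x` and `[0]` behaves like reading the first UTF-16 code unit. The names `FirstCodeUnit` and `firstCodeUnitOf` are hypothetical.

```java
class FirstCodeUnit {
    // Builds the text for one code point, then reads back only its first code unit.
    static int firstCodeUnitOf(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.charAt(0);
    }

    public static void main(String[] args) {
        // A code point in the Basic Multilingual Plane round-trips unchanged.
        System.out.println(Integer.toHexString(firstCodeUnitOf(0x41)));    // 41
        // A supplementary code point comes back as its high surrogate only.
        System.out.println(Integer.toHexString(firstCodeUnitOf(0x1F600))); // d83d
    }
}
```

In Java, reading the whole code point back takes `codePointAt(0)` rather than `charAt(0)`.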
Related: hsivonen.fi/string-length
LICENSE: https://unlicense.org