ONLINE HELP
 WINDEV, WEBDEV AND WINDEV MOBILE

WINDEV
Windows, Linux, Universal Windows 10 App, Java, Reports and Queries, User code (UMC)
WEBDEV
Windows, Linux, PHP, WEBDEV - Browser code
WINDEV Mobile
Android, Android Widget, iPhone/iPad, iOS Widget, Apple Watch, Mac Catalyst, Universal Windows 10 App
Others
Stored procedures
ExtractStringBetween (Function)
In French: ExtraitChaineEntre
Allows you to:
  • extract a substring between two given separators from a character string.
  • search for substrings between two given separators in a character string.
For example, this function allows you to extract data between two tags in HTML, XML or JSON.
Remarks:
  • Searching for substrings takes less time than extracting substrings.
  • You can use arrays of separators. This allows you to use several pairs of separators at the same time.
Example
Country is string = [
<country>France</country><country>Italy</country>
<country>Germany</country><country>Spain</country>
]
ExtractStringBetween(Country, 1, "<country>", "</country>")   // Returns "France"
ExtractStringBetween(Country, 2, "<country>", "</country>")   // Returns "Italy"
ExtractStringBetween(Country, 3, "<country>", "</country>")   // Returns "Germany"
ExtractStringBetween(Country, 4, "<country>", "</country>")   // Returns "Spain"
ExtractStringBetween(Country, 5, "<country>", "</country>")   // Returns EOT
MyString is string = [
<red fruit>Strawberry</red fruit>
<red fruit>Raspberry</red fruit>
<exotic fruit>Cacao</exotic fruit>
<exotic fruit>Banana</exotic fruit>
]
ExtractStringBetween(MyString, 1, ["<red fruit>" , "<exotic fruit>"], ...
["</red fruit>" , "</exotic fruit>"]) // Returns "Strawberry"
ExtractStringBetween(MyString, 2, ["<red fruit>" , "<exotic fruit>"], ...
 ["</red fruit>" , "</exotic fruit>"]) // Returns "Raspberry"
ExtractStringBetween(MyString, 3, ["<red fruit>" , "<exotic fruit>"], ...
 ["</red fruit>" , "</exotic fruit>"]) // Returns "Cacao"
ExtractStringBetween(MyString, 4, ["<red fruit>" , "<exotic fruit>"], ...
 ["</red fruit>" , "</exotic fruit>"]) // Returns "Banana"
// Search for all the substrings
Country is string = [
<country>France</country><country>Italy</country>
<country>Germany</country><country>Spain</country>
]
SubString is string = ExtractStringBetween(Country, firstRank, "<country>", "</country>")
WHILE SubString <> EOT
    Trace(SubString) // Displays "France", "Italy", "Germany", "Spain"
    SubString = ExtractStringBetween(Country, nextRank, "<country>", "</country>")
END
// Search for all the substrings
// The separators are in arrays
sString is string = [
<red fruit>Strawberry</red fruit>
<red fruit>Raspberry</red fruit>
<exotic fruit>Cacao</exotic fruit>
<exotic fruit>Banana</exotic fruit>
]
sResult is string = ExtractStringBetween(sString, firstRank, ["<red fruit>" , ...
 "<exotic fruit>"], ["</red fruit>" , "</exotic fruit>"])
WHILE sResult <> EOT
    Trace(sResult)
    sResult = ExtractStringBetween(sString, nextRank, ["<red fruit>" , "<exotic fruit>"], ...
     ["</red fruit>" , "</exotic fruit>"])
END
Syntax

Extracting a substring between two given separators from a string

<Result> = ExtractStringBetween(<Initial string> , <Index> , <Start separator> [, <End separator> [, <Options>]])
<Result>: Character string
Corresponds to:
  • The substring between <Start separator> at <Index> and <End separator> if the FromEnd constant is not specified.
  • The substring between <Start separator> at <Index> from the end of the string and the next end separator if the FromEnd constant is specified.
  • The EOT constant in one of the following cases:
    • if <Index> is greater than the number of start separators followed by end separators in the string,
    • if all separators are empty strings ("").
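Because EOT is returned when no matching pair of separators exists, the result can be tested before it is used. A minimal sketch, reusing the tags from the example above (the variable names and the trace messages are illustrative only):

```wlanguage
Country is string = "<country>France</country><country>Italy</country>"
// Index 3 is greater than the number of <country>...</country> pairs (2)
SubString is string = ExtractStringBetween(Country, 3, "<country>", "</country>")
IF SubString = EOT THEN
    Trace("No third country found")
ELSE
    Trace(SubString)
END
```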
<Initial string>: Character string
Character string (up to 2 GB) from which the substring is extracted.
<Index>: Integer
Rank of the start separator (followed by an end separator) that begins the substring to extract. Index 1 corresponds to the first such pair.
Remark: a start separator is not taken into account if there is no end separator between it and the previous start separator, unless the following conditions are met:
  • <Index> is set to 1,
  • the FromEnd option is not specified.
<Start separator>: String or Array of strings
This parameter can correspond to:
  • The string that delimits the beginning of substrings. This string is not included in the result.
  • An array of strings. The different strings in the array allow delimiting the beginning of substrings. The separators are not included in the result.
<End separator>: Optional character string or optional array of strings
This parameter can correspond to:
  • The string that delimits the end of substrings. This string is not included in the result.
  • An array of strings. The different strings in the array delimit the end of substrings. The separators are not included in the result.
If this parameter is not specified, the end separator will be identical to <Start separator>.
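The following sketch illustrates a call without <End separator>, on a string whose fields are wrapped in a single repeated character. The sample string and variable names are assumptions made for the illustration:

```wlanguage
Fields is string = "|alpha|beta|"
// Only the start separator is given: "|" is also used as end separator,
// so the call targets the text between the first "|" and the following "|"
FirstField is string = ExtractStringBetween(Fields, 1, "|")
Trace(FirstField)
```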
<Options>: Optional Integer constant
Search direction and characteristics:
  • FromBeginning (default value): Search performed from the first character of the string to the last one.
  • FromEnd: Search performed from the last character of the string to the first one.
  • IgnoreCase: Case- and accent-insensitive search (ignores uppercase/lowercase differences).
    Linux: This constant has no effect.
  • WholeWord: Searches for a whole word (between punctuation characters or spaces).
    Linux: This constant has no effect.
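As an illustration of the <Options> parameter, the sketch below passes FromEnd and IgnoreCase in two separate calls. The sample string and variable names are hypothetical, and the outputs are deliberately not stated:

```wlanguage
Page is string = "<B>one</B><b>two</b>"
// FromEnd: index 1 is counted from the end of the string
LastBold is string = ExtractStringBetween(Page, 1, "<b>", "</b>", FromEnd)
Trace(LastBold)
// IgnoreCase: "<b>" and "</b>" may also match "<B>" and "</B>"
// (this constant has no effect on Linux)
FirstBold is string = ExtractStringBetween(Page, 1, "<b>", "</b>", IgnoreCase)
Trace(FirstBold)
```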

Searching for substrings between two given separators in a character string

<Result> = ExtractStringBetween(<Initial string> , <Search options> , <Start separator> [, <End separator> [, <Options>]])
<Result>: Character string
Corresponds to:
  • the next or previous substring according to the specified search direction. <Result> contains no separator.
  • the EOT constant at the end of the search.
<Initial string>: Character string
Character string (up to 2 GB) in which the substrings are searched.
<Search options>: Integer constant
Search direction:
  • firstRank: Starts searching for substrings separated by the specified separators from the beginning of the string.
  • lastRank: Starts searching for substrings separated by the specified separators from the end of the string.
  • nextRank: Continues the search started with firstRank.
  • previousRank: Continues the search started with lastRank.
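The examples at the top of this page browse the string forward with firstRank and nextRank. A backward search follows the same pattern with lastRank and previousRank. A minimal sketch, assuming the same <country> tags as in those examples:

```wlanguage
Country is string = "<country>France</country><country>Italy</country>"
// Start from the end of the string
SubString is string = ExtractStringBetween(Country, lastRank, "<country>", "</country>")
WHILE SubString <> EOT
    Trace(SubString)
    // Continue toward the beginning of the string
    SubString = ExtractStringBetween(Country, previousRank, "<country>", "</country>")
END
```

As in the forward loop, the search runs until EOT is returned.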
<Start separator>: String or Array of strings
This parameter can correspond to:
  • The string that delimits the beginning of substrings. This string is not included in the result.
  • An array of strings. The different strings in the array allow delimiting the beginning of substrings. The separators are not included in the result.
<End separator>: Optional character string or optional array of strings
This parameter can correspond to:
  • The string that delimits the end of substrings. This string is not included in the result.
  • An array of strings. The different strings in the array delimit the end of substrings. The separators are not included in the result.
If this parameter is not specified, the end separator will be identical to <Start separator>.
<Options>: Optional Integer constant
Search characteristics:
  • IgnoreCase: Case- and accent-insensitive search (ignores uppercase/lowercase differences).
  • WholeWord: Searches for a whole word (between punctuation characters or spaces).
Linux: This parameter is not available. The search is case sensitive, and the search string does not necessarily correspond to a whole word: it can be part of a word.
Remarks

Remark on the syntax "Searching for substrings between two given separators"

  • This type of search can only be used on constant strings. Therefore, an element of the project (variable, control, item, etc.) must be used as initial string.
  • When a search is started with the firstRank or lastRank constants, the search information is stored in memory until all the substrings have been examined. Therefore, this type of search should be used only when all the substrings are to be examined.

ExtractStringBetween and UNICODE

<Initial string>, <Start separator> and <End separator> can correspond to:
  • ANSI strings.
  • UNICODE strings.
  • buffers.
These types can be mixed across the different parameters of the function.
The following conversion rule is used for Ansi systems (Windows or Linux):
  • If at least one of the strings is a buffer, all the strings are converted to buffers and the operation is performed with buffers.
  • If the first condition is not met and there is at least one Unicode string, all the strings are converted to Unicode and the operation is performed in Unicode (the conversion is performed with the current character set, if necessary).
  • Otherwise, the operation is performed in Ansi.
The conversion rule used for Unicode systems is as follows:
  • If at least one of the strings is a buffer, all the strings are converted to buffers and the operation is performed with buffers.
  • Otherwise, the operation is performed in Unicode.
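The sketch below mixes string types in a single call. It assumes that "ANSI string" and "UNICODE string" are the WLanguage type names for these strings. The variable names and sample text are illustrative:

```wlanguage
// Assumption: type names "UNICODE string" and "ANSI string"
sSource is UNICODE string = "<name>Zoe</name>"
sOpen is ANSI string = "<name>"
sClose is ANSI string = "</name>"
// No parameter is a buffer and at least one is Unicode:
// according to the rules above, the operation is performed in Unicode
Trace(ExtractStringBetween(sSource, 1, sOpen, sClose))
```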
For more details on UNICODE, see Managing UNICODE.
Reminder: The linguistic parameters used are defined during the call to ChangeCharset.
Component: wd290vm.dll
Minimum version required
  • Version 25

Last update: 07/12/2022
