ExtractStringBetween (Function)

In French: ExtraitChaineEntre

Allows you to:

extract a substring between two given separators from a character string.
search for substrings between two given separators in a character string.

For example, this function allows you to extract data between two tags in HTML, XML or JSON.

Remarks:

Searching for substrings takes less time than extracting substrings.
You can use arrays of separators. This allows you to use several pairs of separators at the same time.

Example

Pays is string = [
			<pays>France</pays><pays>Italie</pays>
			<pays>Allemagne</pays><pays>Espagne</pays>
			]
ExtractStringBetween(Pays, 1, "<pays>", "</pays>")   // Returns "France"
ExtractStringBetween(Pays, 2, "<pays>", "</pays>")   // Returns "Italie"
ExtractStringBetween(Pays, 3, "<pays>", "</pays>")   // Returns "Allemagne"
ExtractStringBetween(Pays, 4, "<pays>", "</pays>")   // Returns "Espagne"
ExtractStringBetween(Pays, 5, "<pays>", "</pays>")   // Returns EOT

MaChaîne is string = [
				<fruit rouge>Fraise</fruit rouge>
				<fruit rouge>Framboise</fruit rouge>
				<fruit exotique>Cacao</fruit exotique>
				<fruit exotique>Banane</fruit exotique>
				]
ExtractStringBetween(MaChaîne, 1, ["<fruit rouge>" , "<fruit exotique>"], ...
				["</fruit rouge>" , "</fruit exotique>"]) // Returns "Fraise"
ExtractStringBetween(MaChaîne, 2, ["<fruit rouge>" , "<fruit exotique>"], ...
				["</fruit rouge>" , "</fruit exotique>"]) // Returns "Framboise"
ExtractStringBetween(MaChaîne, 3, ["<fruit rouge>" , "<fruit exotique>"], ...
				["</fruit rouge>" , "</fruit exotique>"]) // Returns "Cacao"
ExtractStringBetween(MaChaîne, 4, ["<fruit rouge>" , "<fruit exotique>"], ...
				["</fruit rouge>" , "</fruit exotique>"]) // Returns "Banane"

// Iterate over all the substrings
Pays is string = [
				<pays>France</pays><pays>Italie</pays>
				<pays>Allemagne</pays><pays>Espagne</pays>
			]
SousChaîne is string = ExtractStringBetween(Pays, firstRank, "<pays>", "</pays>")
WHILE SousChaîne <> EOT
	Trace(SousChaîne) // Traces "France", "Italie", "Allemagne", "Espagne" in turn
	SousChaîne = ExtractStringBetween(Pays, nextRank, "<pays>", "</pays>")
END

// Iterate over all the substrings
// The separators are supplied in arrays
sChaîne is string = [
				<fruit rouge>Fraise</fruit rouge>
				<fruit rouge>Framboise</fruit rouge>
				<fruit exotique>Cacao</fruit exotique>
				<fruit exotique>Banane</fruit exotique>
			]
sRésultat is string = ExtractStringBetween(sChaîne, firstRank, ["<fruit rouge>" , ...
					 "<fruit exotique>"], ["</fruit rouge>" , "</fruit exotique>"])
WHILE sRésultat <> EOT
	Trace(sRésultat)
	sRésultat = ExtractStringBetween(sChaîne, nextRank, ["<fruit rouge>" , "<fruit exotique>"], ...
			 ["</fruit rouge>" , "</fruit exotique>"])
END

Syntax

Extracting a substring between two given separators from a string

<Result> = ExtractStringBetween(<Initial string> , <Index> , <Start separator> [, <End separator> [, <Options>]])

<Result>: Character string

Corresponds to:
The substring between the start separator of rank <Index> and the end separator that follows it, if the FromEnd constant is not specified.
The substring between the start separator of rank <Index>, counted from the end of the string, and the end separator that follows it, if the FromEnd constant is specified.
The EOT constant in one of the following cases:
if <Index> is greater than the number of start separators followed by end separators in the string,
if all separators are empty strings ("").
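
The out-of-range case can be sketched as follows. This is a minimal illustration, not an official sample: the variable names sCountries and sResult are hypothetical, and the data is reduced to two pairs so that rank 3 has nothing to return.

```wlanguage
// Hypothetical data: only two <pays>...</pays> pairs
sCountries is string = "<pays>France</pays><pays>Italie</pays>"
// Rank 3 is greater than the number of pairs in the string
sResult is string = ExtractStringBetween(sCountries, 3, "<pays>", "</pays>")
IF sResult = EOT THEN
	// No third pair exists, so the function returned EOT
	Trace("No third substring")
ELSE
	Trace(sResult)
END
```

Testing against EOT before using the result avoids treating the end-of-search marker as extracted data.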

<Initial string>: Character string

Character string (up to 2 GB) from which the substring is extracted.

<Index>: Integer

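Rank of the start separator (followed by an end separator) that begins the substring to extract: 1 for the first such separator, 2 for the second, and so on.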
Note: a start separator is ignored if there is no end separator between it and the previous start separator, unless the following conditions are met:
<Index> is set to 1,
the FromEnd option is not specified.

<Start separator>: String or Array of strings

This parameter can correspond to:
The string that delimits the beginning of substrings. This string is not included in the result.
An array of strings. Each string in the array can delimit the beginning of a substring. The separators are not included in the result.

<End separator>: Optional character string or optional array of strings

This parameter can correspond to:
The string that delimits the end of substrings. This string is not included in the result.
An array of strings. Each string in the array can delimit the end of a substring. The separators are not included in the result.
If this parameter is not specified, the end separator will be identical to <Start separator>.
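
A short sketch of the case where <End separator> is omitted. The variable names and the pipe-delimited data are assumptions made for illustration. Going by the description above, the single separator "|" acts as both start and end separator, so rank 1 should yield the text between the first two "|" characters.

```wlanguage
// Hypothetical pipe-delimited line
sLine is string = "id|name|city"
// <End separator> is omitted: "|" is used as start and end separator
sField is string = ExtractStringBetween(sLine, 1, "|")
Trace(sField)
```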

<Options>: Optional Integer constant

Search direction and characteristics:
FromBeginning (Default value) Searches from the first to the last character of the string.
FromEnd Searches from the last to the first character of the string.
IgnoreCase Case and accent insensitive search (ignores uppercase/lowercase differences).
This constant has no effect.
WholeWord Searches for a whole word (between punctuation characters or spaces).
This constant has no effect.
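
The FromEnd constant can be illustrated with a sketch such as the one below. The variable name sCountries is hypothetical. Based on the <Result> description, ranks are then counted from the end of the string, so rank 1 should correspond to the last pair and rank 2 to the one before it.

```wlanguage
// Hypothetical data: three <pays>...</pays> pairs
sCountries is string = "<pays>France</pays><pays>Italie</pays><pays>Allemagne</pays>"
// Rank 1 counted from the end of the string
Trace(ExtractStringBetween(sCountries, 1, "<pays>", "</pays>", FromEnd))
// Rank 2 counted from the end of the string
Trace(ExtractStringBetween(sCountries, 2, "<pays>", "</pays>", FromEnd))
```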

Searching for substrings between two given separators in a character string

<Result> = ExtractStringBetween(<Initial string> , <Search options> , <Start separator> [, <End separator> [, <Options>]])

<Result>: Character string

Corresponds to:
the next or previous substring according to the specified search direction. <Result> contains no separator.
the EOT constant at the end of the search.

<Initial string>: Character string

Character string (up to 2 GB) in which the substrings are searched for.

<Search options>: Integer constant

Search direction:
firstRank Starts searching for substrings separated by the specified separators from the beginning of the string.
lastRank Starts searching for substrings separated by the specified separators from the end of the string.
nextRank Continues the search started with firstRank.
previousRank Continues the search started with lastRank.
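
A backward traversal can be sketched by pairing lastRank with previousRank, in the same way the forward examples pair firstRank with nextRank. The variable names sCountries and sItem are hypothetical. The substrings should be traced starting from the end of the string.

```wlanguage
// Hypothetical data, same shape as the forward-search example
sCountries is string = [
	<pays>France</pays><pays>Italie</pays>
	<pays>Allemagne</pays><pays>Espagne</pays>
]
// Start the search from the end of the string
sItem is string = ExtractStringBetween(sCountries, lastRank, "<pays>", "</pays>")
WHILE sItem <> EOT
	Trace(sItem)
	// Continue the search started with lastRank
	sItem = ExtractStringBetween(sCountries, previousRank, "<pays>", "</pays>")
END
```

The loop runs until EOT is returned, which matches the recommendation to use this syntax only when every substring is to be examined.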

<Start separator>: String or Array of strings

This parameter can correspond to:
The string that delimits the beginning of substrings. This string is not included in the result.
An array of strings. Each string in the array can delimit the beginning of a substring. The separators are not included in the result.

<End separator>: Optional character string or optional array of strings

This parameter can correspond to:
The string that delimits the end of substrings. This string is not included in the result.
An array of strings. Each string in the array can delimit the end of a substring. The separators are not included in the result.
If this parameter is not specified, the end separator will be identical to <Start separator>.

<Options>: Optional Integer constant

Search characteristics:
IgnoreCase Case and accent insensitive search (ignores uppercase/lowercase differences).
WholeWord Searches for a whole word (between punctuation characters or spaces).

This parameter is not available. The search is case sensitive. The search string doesn't have to be a complete word: it can be part of a word.

Remarks

Remark on the syntax "Searching for substrings between two given separators"

This type of search can only be used on a string that remains unchanged for the duration of the search. The initial string must therefore be an element of the project (variable, control, item, etc.).
When a search is started with the firstRank or lastRank constant, the search information is kept in memory until all the substrings have been examined. This type of search should therefore be used only when all the substrings are to be examined.

ExtractStringBetween and UNICODE

<Initial string>, <Start separator> and <End separator> can correspond to:

ANSI strings,
Unicode strings,
buffers.

ANSI strings, Unicode strings and buffers can be mixed across the parameters of the function.

The following conversion rule is used on ANSI systems (Windows or Linux):

If at least one of the strings is a buffer, all the strings are converted to buffers and the operation is performed with buffers.
If the first condition is not met and there is at least one Unicode string, all the strings are converted to Unicode and the operation is performed in Unicode (the conversion is performed with the current character set, if necessary).
Otherwise, the operation is performed in ANSI.

The conversion rule used for Unicode systems is as follows:

If at least one of the strings is a buffer, all the strings are converted to buffers and the operation is performed with buffers.
Otherwise, the operation is performed in Unicode.
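
As an illustration of these rules, the sketch below mixes a Unicode source string with ANSI separators. The variable names and explicit type declarations are assumptions chosen for the example. Since one parameter is Unicode and none is a buffer, the rules above indicate that the operation is performed in Unicode on both ANSI and Unicode systems.

```wlanguage
// Hypothetical variables with explicit string types
sSource is UNICODE string = "<pays>Espagne</pays>"
sStart is ANSI string = "<pays>"
sEnd is ANSI string = "</pays>"
// One Unicode string, no buffer: the separators are converted to Unicode
Trace(ExtractStringBetween(sSource, 1, sStart, sEnd))
```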

For more details on Unicode, see Unicode management.

Reminder: The language parameters used are defined when the ChangeCharset function is called.

Component: wd300vm.dll