MATLAB Function Reference
regexp, regexpi

Match regular expression

Syntax

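Each of these syntaxes applies to both regexp and regexpi. The regexp function is case sensitive when matching a regular expression against a string; regexpi is case insensitive:

    regexp('str', 'expr')
    [start end extents match tokens names] = regexp('str', 'expr')
    [v1 v2 ...] = regexp('str', 'expr', q1, q2, ...)
    [v1 v2 ...] = regexp('str', 'expr', 'q1', 'q2', ..., 'once')
    regexp 'str' 'expr' 'q1' 'q2' ... 'once'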

Description

The following descriptions apply to both regexp and regexpi:

regexp('str', 'expr') returns a row vector containing the starting index of each substring of str that matches the regular expression string expr. If no matches are found, regexp returns an empty array. The str and expr arguments can also be cell arrays of strings. See the guidelines listed below under Multiple Strings and Expressions.
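The single-output form can be sketched as follows. The sample string and patterns are assumptions chosen for illustration; they do not come from the reference page.

```matlab
% Illustrative input (an assumption, not part of the reference).
str = 'one two three';

% Start index of every substring made of 't' followed by lowercase letters.
idx = regexp(str, 't[a-z]+');    % matches 'two' and 'three' -> [5 9]

% A pattern with no match in str yields an empty array.
none = regexp(str, '\d+');

disp(idx)
disp(isempty(none))              % displays 1 (true)
```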

[start end extents match tokens names] = regexp('str', 'expr') returns up to six values, one for each output variable you specify, and in the default order (as shown in the table below).

[v1 v2 ...] = regexp('str', 'expr', q1, q2, ...) returns up to six values, one for each output variable you specify, and ordered according to the order of the qualifier arguments, q1, q2, etc.

Return Values for Regular Expressions

| Default Order | Description | Qualifier |
|---|---|---|
| 1 | Row vector containing the starting index of each substring of str that matches expr | start |
| 2 | Row vector containing the ending index of each substring of str that matches expr | end |
| 3 | Cell array containing the starting and ending indices of each substring of str that matches a token in expr | tokenExtents |
| 4 | Cell array containing the text of each substring of str that matches expr | match |
| 5 | Cell array containing the text of each token captured by regexp | tokens |
| 6 | Structure array containing the name and text of each named token captured by regexp. If there are no named tokens in expr, regexp returns a structure array with no fields. Field names of the returned structure are set to the token names, and field values are the text of those tokens. Named tokens are generated by the expression (?<tokenname>). | names |
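A minimal sketch of both multi-output forms, first in the default order and then in qualifier order. The input string, the expression, and the variable names are assumptions for illustration. The output names are shortened here because end is a reserved word in MATLAB and cannot be used as a variable name.

```matlab
% Illustrative input and expression (assumptions, not from the reference).
str  = 'cat hat';
expr = '(\w)at';          % one token: the letter in front of 'at'

% Default order: start, end, token extents, match, tokens, names.
[s, e, te, m, t, n] = regexp(str, expr);

disp(s)                   % 1 5
disp(e)                   % 3 7
disp(te{1})               % 1 1  (extent of the token in the first match)
disp(m)                   % 'cat'  'hat'
disp(t{1}{1})             % c
disp(isempty(fieldnames(n)))   % 1, since expr has no named tokens

% Qualifier order: outputs follow the order of the qualifiers given.
[m2, s2] = regexp(str, expr, 'match', 'start');
```

Requesting outputs by qualifier avoids placeholder variables when only one or two of the six values are needed.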

[v1 v2 ...] = regexp('str', 'expr', 'q1', 'q2', ..., 'once') returns just the first match found. The keyword once must come last in the argument list. Output and qualifier arguments are not required.
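A short sketch of the once keyword, using an assumed sample string. With once, the match qualifier returns the matched text itself rather than a cell array of matches.

```matlab
% Illustrative input (an assumption, not from the reference).
str = 'cat hat';

% Only the first match is returned; 'once' is the last argument.
firstText = regexp(str, '\wat', 'match', 'once');

% Without a qualifier, 'once' returns the start index of the first match.
firstIdx = regexp(str, '\wat', 'once');

disp(firstText)    % cat
disp(firstIdx)     % 1
```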

regexp 'str' 'expr' 'q1' 'q2' ... 'once' is the command syntax for this function. Only the 'str' and 'expr' arguments are required.

Remarks

Multiple Strings and Expressions

Either the str or expr argument, or both, can be a cell array of strings, according to the following guidelines:

See Regular Expressions in the MATLAB documentation for a listing of all regular expression elements supported by MATLAB.

regexp does not support international character sets.

Examples

Example 1

Return a row vector of indices that match words that start with c, end with t, and contain one or more vowels between them. Make the matches insensitive to letter case (by using regexpi):
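One way to write this example is sketched below. The sample string is an assumption chosen so that it contains both matching and non-matching words.

```matlab
% Sample input (an assumption for illustration).
str = 'bat cat can car COAT court cut ct CAT-scan';

% c, then one or more vowels, then t; regexpi ignores letter case.
idx = regexpi(str, 'c[aeiou]+t');

disp(idx)    % 5 17 28 35  (cat, COAT, cut, CAT)
```

Note that court and ct do not match: the first has an r between the vowels and the t, and the second has no vowel at all.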

Example 2

Return a cell array of row vectors of indices that match capital letters and white spaces in the cell array of strings str:

Capital letters, '[A-Z]', were found at these str indices:

Space characters, '\s', were found at these str indices:
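The example can be sketched as follows, assuming a three-element cell array for str (the strings themselves are illustrative). Because str is a cell array, each result is a cell array with one row vector of indices per string.

```matlab
% Sample cell array of strings (an assumption for illustration).
str = {'Madrid, Spain', 'Romeo and Juliet', 'MATLAB is great'};

capIdx   = regexp(str, '[A-Z]');   % capital letters
spaceIdx = regexp(str, '\s');      % white-space characters

% Capital letters: [1 9], [1 11], [1 2 3 4 5 6]
celldisp(capIdx)

% Spaces: [8], [6 10], [7 10]
celldisp(spaceIdx)
```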

Example 3

Return the text and the starting and ending indices of words containing the letter x:
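A minimal sketch of this example, using qualifiers to request only the three values needed. The sample string is an assumption.

```matlab
% Sample input (an assumption for illustration).
str = 'regexp helps you relax';

% A word containing x: any word characters, an x, any word characters.
[words, s, e] = regexp(str, '\w*x\w*', 'match', 'start', 'end');

disp(words)    % 'regexp'  'relax'
disp(s)        % 1 18
disp(e)        % 6 22
```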

Example 4

Search a string for opening and closing HTML tags. Use the expression <(\w+) to find the opening tag (e.g., '<tagname') and to create a token for it. Use the expression </\1> to find another occurrence of the same token, but formatted as a closing tag (e.g., '</tagname>'):
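The two expression fragments can be combined as sketched below. The sample string and the lazy .*? pieces that join the fragments are assumptions for illustration.

```matlab
% Sample input containing three tag pairs (an assumption for illustration).
str = 'if <code>A</code> == x<sup>2</sup>, <em>disp(x)</em>';

% <(\w+)  opening tag name, captured as token 1
% .*?>    rest of the opening tag (lazy)
% .*?     element content (lazy)
% </\1>   closing tag that repeats token 1
expr = '<(\w+).*?>.*?</\1>';

[tok, mat] = regexp(str, expr, 'tokens', 'match');

% tok holds one inner cell per match; tok{k}{1} is the tag name.
for k = 1:numel(mat)
    fprintf('%s: %s\n', tok{k}{1}, mat{k});
end
% Tag names found: code, sup, em
```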

See "Tokens" in the MATLAB Programming documentation for information on using tokens.

Example 5

Enter a string containing two names, the first and last names being in a different order:

Create an expression that generates first and last name tokens, assigning the names first and last to the tokens. Call regexp to get the text and names of each token found:

Examine the tokens cell array that was returned. The first and last name tokens appear in the order in which they were generated: first name-last name, then last name-first name:

Now examine the names structure that was returned. First and last names appear in a more usable order:
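The steps of this example can be sketched in one listing. The sample names and the exact expression are assumptions, and the sketch assumes that the same token name may appear in both alternatives of the expression, as the description implies.

```matlab
% Two names in different orders, one per line (sample data is an assumption).
str = sprintf('John Davis\nRogers, James');

% First alternative: "first last". Second alternative: "last, first".
expr = '(?<first>\w+)\s+(?<last>\w+)|(?<last>\w+),\s+(?<first>\w+)';

[tokens, names] = regexp(str, expr, 'tokens', 'names');

% Tokens are listed in the order in which they were captured.
celldisp(tokens)

% The names structure labels each token regardless of capture order.
for k = 1:numel(names)
    fprintf('first: %s, last: %s\n', names(k).first, names(k).last);
end
```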

See Also

regexprep, strfind, findstr, strmatch, strcmp, strcmpi, strncmp, strncmpi



© 1994-2005 The MathWorks, Inc.