MATLAB Function Reference
Syntax
Each of these syntaxes applies to both regexp and regexpi. The regexp function is case sensitive in matching regular expressions to a string, and regexpi is case insensitive:
regexp('str', 'expr')
[start end extents match tokens names] = regexp('str', 'expr')
[v1 v2 ...] = regexp('str', 'expr', 'q1', 'q2', ...)
[v1 v2 ...] = regexp('str', 'expr', 'q1', 'q2', ..., 'once')
regexp 'str' 'expr' 'q1' 'q2' ... 'once'
Description
The following descriptions apply to both regexp and regexpi:
regexp('str', 'expr') returns a row vector containing the starting index of each substring of str that matches the regular expression string expr. If no matches are found, regexp returns an empty array. The str and expr arguments can also be cell arrays of strings. See the guidelines listed below under Multiple Strings and Expressions.
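As a brief illustration of the default behavior described above, the following sketch uses sample strings that are assumptions chosen for this illustration, not taken from this page:

```matlab
% Assumed sample text; any string works here.
str = 'one two three';

% Default output: the starting index of every match.
idx = regexp(str, 't\w+')      % 'two' starts at 5, 'three' at 9

% When nothing matches, the result is an empty array.
none = regexp(str, 'z+');
isempty(none)                  % true

% regexpi applies the same expression without regard to letter case.
idxi = regexpi('One ONE one', 'one')   % starts at 1, 5, and 9
```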
[start end extents match tokens names] = regexp('str', 'expr') returns up to six values, one for each output variable you specify, and in the default order (as shown in the table below).
[v1 v2 ...] = regexp('str', 'expr', 'q1', 'q2', ...) returns up to six values, one for each output variable you specify, and ordered according to the order of the qualifier arguments, q1, q2, etc.
Default Order | Description | Qualifier
1 | Row vector containing the starting index of each substring of str that matches expr | start
2 | Row vector containing the ending index of each substring of str that matches expr | end
3 | Cell array containing the starting and ending indices of each substring of str that matches a token in expr | tokenExtents
4 | Cell array containing the text of each substring of str that matches expr | match
5 | Cell array containing the text of each token captured by regexp | tokens
6 | Structure array containing the name and text of each named token captured by regexp. If there are no named tokens in expr, regexp returns a structure array with no fields. Field names of the returned structure are set to the token names, and field values are the text of those tokens. Named tokens are generated by the expression (?<tokenname>). | names
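The effect of the qualifiers on output order can be sketched as follows. The first string and expression are borrowed from Example 3 on this page. The 'key=value' string is an assumption introduced only to show tokenExtents.

```matlab
str = 'regexp helps you relax';
expr = '\w*x\w*';

% Outputs follow the order of the qualifiers, not the default order.
[m, s] = regexp(str, expr, 'match', 'start');
[s2, m2] = regexp(str, expr, 'start', 'match');
% m and m2 hold the matched text; s and s2 hold the starting indices.

% Assumed sample string with two tokens per match.
[te, tok] = regexp('key=value', '(\w+)=(\w+)', 'tokenExtents', 'tokens');
% te{1} has one row of [start end] indices per token: [1 3; 5 9]
% tok{1} holds the token text: {'key', 'value'}
```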
[v1 v2 ...] = regexp('str', 'expr', 'q1', 'q2', ..., 'once') returns just the first match found. The keyword once must come last in the argument list. Output and qualifier arguments are not required.
regexp 'str' 'expr' 'q1' 'q2' ... 'once' is the command syntax for this function. Only the 'str' and 'expr' arguments are required.
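A minimal sketch of the 'once' keyword, reusing the string from Example 3 on this page. That the matched text comes back as a plain string rather than a cell array is how current MATLAB releases behave; treat it as an assumption for the release documented here.

```matlab
str = 'regexp helps you relax';
expr = '\w*x\w*';

% Without 'once', the match output is a cell array with one cell per match.
allMatches = regexp(str, expr, 'match');

% With 'once' (always the last argument), only the first match is returned.
firstMatch = regexp(str, expr, 'match', 'once');   % 'regexp'
firstStart = regexp(str, expr, 'start', 'once');   % 1
```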
Multiple Strings and Expressions
Either the str or expr argument, or both, can be a cell array of strings, according to the following guidelines:
If str is a cell array of strings, then each of the regexp outputs is a cell array having the same dimensions as str.
If str is a single string but expr is a cell array of strings, then each of the regexp outputs is a cell array having the same dimensions as expr.
If both str and expr are cell arrays of strings, these two cell arrays must contain the same number of elements.
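The three cases above can be sketched as follows, using strings from Example 2 on this page. The comment on the third case, that elements are paired one by one, is an assumption; this page states only that the element counts must match.

```matlab
strs = {'Madrid, Spain' 'Romeo and Juliet'};   % 1-by-2 cell array

% str is a cell array: the output is a cell array of the same size.
a = regexp(strs, '[A-Z]');                     % a{1} is [1 9], a{2} is [1 11]

% str is a single string, expr is a cell array: output is sized like expr.
b = regexp('Madrid, Spain', {'[A-Z]' '\s'});   % b{1} is [1 9], b{2} is 8

% Both are cell arrays with the same number of elements
% (assumed here: each string is paired with the corresponding expression).
c = regexp(strs, {'[A-Z]' '\s'});
```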
See Regular Expressions in the MATLAB documentation for a listing of all regular expression elements supported by MATLAB.
regexp does not support international character sets.
Example 1
Return a row vector of indices that match words that start with c, end with t, and contain one or more vowels between them. Make the matches insensitive to letter case (by using regexpi):
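The listing for this example is not present in this copy of the page. A minimal sketch of what the description asks for, assuming a sample input string chosen for illustration, might look like this:

```matlab
% Assumed sample text (not recoverable from this page).
str = 'bat cat can car COAT court cut ct CAT-scan';

% c, then one or more vowels, then t; regexpi ignores letter case.
idx = regexpi(str, 'c[aeiou]+t')
% idx = 5 17 28 35, the starts of 'cat', 'COAT', 'cut', and 'CAT'
```

Note that 'court' and 'ct' are not matched: the first has a consonant between the vowels and the t, and the second has no vowel at all.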
Example 2
Return a cell array of row vectors of indices that match capital letters and white-space characters in the cell array of strings str:
str = {'Madrid, Spain' 'Romeo and Juliet' 'MATLAB is great'};
s1 = regexp(str, '[A-Z]');
s2 = regexp(str, '\s');
Capital letters, '[A-Z]', were found at these str indices:
s1{:}
ans =
     1     9
ans =
     1    11
ans =
     1     2     3     4     5     6
Space characters, '\s', were found at these str indices:
s2{:}
ans =
     8
ans =
     6    10
ans =
     7    10
Example 3
Return the text and the starting and ending indices of words containing the letter x:
str = 'regexp helps you relax';
[m s e] = regexp(str, '\w*x\w*', 'match', 'start', 'end')
m =
    'regexp'    'relax'
s =
     1    18
e =
     6    22
Example 4
Search a string for opening and closing HTML tags. Use the expression <(\w+) to find the opening tag (e.g., '<tagname') and to create a token for it. Use the expression </\1> to find another occurrence of the same token, but formatted as a closing tag (e.g., '</tagname>'):
str = 'if <code>A</code> == x<sup>2</sup>, <em>disp(x)</em>';
expr = '<(\w+).*?>.*?</\1>';
[tok mat] = regexp(str, expr, 'tokens', 'match');
tok{:}
ans =
    'code'
ans =
    'sup'
ans =
    'em'
mat{:}
ans =
<code>A</code>
ans =
<sup>2</sup>
ans =
<em>disp(x)</em>
See "Tokens" in the MATLAB Programming documentation for information on using tokens.
Example 5
Enter a string containing two names, with the first and last names given in a different order in each:
Create an expression that generates first and last name tokens, assigning the names first and last to the tokens. Call regexp to get the text and names of each token found:
expr = ...
    '(?<first>\w+)\s+(?<last>\w+)|(?<last>\w+),\s+(?<first>\w+)';
[tokens names] = regexp(str, expr, 'tokens', 'names');
Examine the tokens cell array that was returned. The first and last name tokens appear in the order in which they were generated: first name-last name, then last name-first name:
Now examine the names structure that was returned. First and last names appear in a more usable order:
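The input string and the output listings for this example are not present in this copy of the page. The sketch below assumes a hypothetical input string with one name written first-last and one written last, first. The expression is the one given above, and no output is shown because it depends on the string chosen.

```matlab
% Hypothetical input: two names, the second written last-name-first.
str = 'John Davis; Rogers, James';

expr = ...
    '(?<first>\w+)\s+(?<last>\w+)|(?<last>\w+),\s+(?<first>\w+)';
[tokens names] = regexp(str, expr, 'tokens', 'names');

% Tokens are listed in the order in which they occur in the text.
tokens{:}

% The named fields give uniform access regardless of that order.
for k = 1:numel(names)
    fprintf('%s %s\n', names(k).first, names(k).last);
end
```

Reading the fields by name avoids having to track which alternative of the expression produced each match.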
See Also
regexprep, strfind, findstr, strmatch, strcmp, strcmpi, strncmp, strncmpi
© 1994-2005 The MathWorks, Inc.