regex

# $EPIC: regex.txt,v 1.8 2009/03/21 05:49:29 zwhite Exp $

Synopsis:

$regcomp(<regex pattern>)
$regcomp_cs(<regex pattern>)
$regexec(<compiled pattern> <string>)
$regmatches(<compiled pattern> <matches> <string>)
$regerror(<compiled pattern>)
$regfree(<compiled pattern>)

Technical:

These functions are an interface to “regular expression” pattern matching:

$regcomp() is used to “compile” a case-insensitive regular expression. The return value is suitable for /assigning to an ircII variable. Note that the return value must eventually be passed to $regfree() to release the resources allocated for the compiled pattern. Compilation can fail: pass the return value to $regerror() to fetch the error code and check whether it did.

$regcomp_cs() is the same as $regcomp(), but the pattern is case sensitive.

$regexec() is used to match a previously compiled pattern against a text string. The function returns 0 if the string is matched by the pattern, and a non-zero value if it is not.
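
As a rough illustration of the difference between the two compile functions, the sketch below compiles the same pattern both ways and matches each against an upper-case string. The variable names ci and cs are arbitrary, and error checking with $regerror() is omitted for brevity. Going by the descriptions above, the first line should report 0 and the second a non-zero value.

   @ci = regcomp(hello);
   @cs = regcomp_cs(hello);
   xecho -b regcomp: $regexec($ci HELLO world);
   xecho -b regcomp_cs: $regexec($cs HELLO world);
   @regfree($ci);
   @regfree($cs);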

$regmatches() is used to find substrings within the string. It returns pairs of numbers which can be applied to $mid() to extract the respective substring. If the string doesn't match, it returns the empty string.
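
A minimal sketch of feeding the result to $mid() follows. It rests on two assumptions that are not spelled out here: that the <matches> argument is the maximum number of matches to report, and that each returned pair is in the order $mid() expects. The variable names str, pat, and pair are arbitrary.

   @str = [abc def];
   @pat = regcomp(d.f);
   @pair = regmatches($pat 1 $str);
   if (strlen($pair)) {
      xecho -b Matched text: $mid($pair $str);
   } else {
      xecho -b No match;
   };
   @regfree($pat);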

$regerror() is used to fetch the error code for the most recently attempted action on a previously compiled pattern.

$regfree() is used to return the resources allocated to a compiled pattern. Attempting to use a compiled pattern after it has been passed to regfree is an error and may crash the client. The function returns the FALSE value.

If you neglect to pass a value returned by regcomp to regfree, the result is a memory leak. The client cannot track this for you, so if you use these functions, it is your duty to free every compiled pattern.
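
One way to make leaks harder to introduce is to keep the compile, match, and free steps together in a single alias. The following is only a sketch: the alias name rematch is hypothetical, the return value of -1 for a failed compile is a convention made up for this example, and the pattern cannot contain spaces because it is taken from $0.

   alias rematch {
      @:pat = regcomp($0);
      if (regerror($pat)) {
         @:ret = -1;
      } else {
         @:ret = regexec($pat $1-);
      };
      @regfree($pat);
      return $ret;
   };

It could then be called as $rematch(abc some text here), at the cost of recompiling the pattern on every call.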

Passing a value to regexec, regmatches, regerror, or regfree that was not previously returned from regcomp is an error and may crash the client.

In at least some implementations of the regex calls, regexec will happily succeed for any pattern that regcomp failed to compile. It is necessary to check regerror after every regcomp and regexec call if you need to know about errors.

Practical:

These functions can be very fast for pattern matching (depending on implementation), and the notation available is also more powerful than the client's standard wildcard pattern notation. If you need to scan all incoming data for all variations of your name and performance is important to you, then these functions may meet your needs.
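
The name-scanning case might look something like the sketch below. The pattern, the variable name nickpat, and the serial number 100 are all made up for illustration, and it assumes the underlying regex implementation accepts alternation with |. The pattern is compiled once and kept for the life of the hook, so it should be passed to $regfree() when the hook is removed.

   @nickpat = regcomp(joe|joey|joseph);
   on #-public 100 "*" {
      if (regexec($nickpat $2-) == 0) {
         xecho -b $0 mentioned you on $1: $2-;
      };
   };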

Returns:

$regcomp() and $regcomp_cs() return an opaque value suitable for passing to the other functions. The return value must be passed later to $regfree().
$regexec() returns 0 if the string matched the pattern, and non-zero if it did not.
$regmatches() returns a pair of numbers for each substring match, or the empty string if the string does not match.
$regerror() returns the current error condition for a pattern.
$regfree() returns the FALSE value.

Example:

   @orig_string = "abc def";
   @pattern = regcomp(abc);
   if (regerror($pattern)) {
      xecho -b Error compiling regex: $regerror($pattern);
   } else {
      if (regexec($pattern $orig_string) == 0) {
         xecho -b It matched!;
      } else {
         xecho -b No match, or regex error: $regerror($pattern);
      };
   };
   @regfree($pattern);

When run, this will display:

*** It matched!

History:

All the functions except regmatches first appeared in EPIC4-1.029. The regmatches function, and its attendant support of subexpressions, first appeared in EPIC4-1.1.3.

regex.txt · Last modified: 2009/06/02 15:52 by 127.0.0.1