Difference between revisions of "Regular expressions"
AlizaLorenzo (Talk | contribs) (Created page with "<center>{{Expand}}</center> Regular Expressions (regex) are essentially a search engine for finding patterns in a text. While the syntax is a bit tricky to learn, regex will save...") |
AlizaLorenzo (Talk | contribs) (→Literals) |
||
Line 1: | Line 1: | ||
<center>{{Expand}}</center> | <center>{{Expand}}</center> | ||
Regular expressions (regex) are essentially a compact language for describing and finding patterns in text. While the syntax is a bit tricky to learn, regex will save a great deal of time and effort in the long run. Many of you are probably familiar with the idea behind regex, even if only through the use of wildcards. Wildcard notation, such as <code>*.html</code>, matches all HTML files in the given search directory. Regex takes this idea and expands on it dramatically, allowing for very complicated search patterns. A regular expression that matches the names of all HTML files in a given directory would be <code>.*\.html$</code> | Regular expressions (regex) are essentially a compact language for describing and finding patterns in text. While the syntax is a bit tricky to learn, regex will save a great deal of time and effort in the long run. Many of you are probably familiar with the idea behind regex, even if only through the use of wildcards. Wildcard notation, such as <code>*.html</code>, matches all HTML files in the given search directory. Regex takes this idea and expands on it dramatically, allowing for very complicated search patterns. A regular expression that matches the names of all HTML files in a given directory would be <code>.*\.html$</code> | ||
+ | ==Syntax== | ||
+ | ===Characters=== | ||
+ | ====Literals==== | ||
+ | The most basic regex is a literal character. A literal character, such as <code>a</code>, matches the <code>a</code> in the string <code>alex</code>. In a string such as <code>adam</code>, however, it will match only the first <code>a</code>, before the 'd', unless you tell the regex engine to keep searching. This is similar to text editors, where a 'find' function is usually paired with a 'find next' function. | ||
+ | |||
+ | Similarly, a regex search for <code>hat</code> in the string <code>Blackhat Academy</code> will return the 'hat' at the end of the first word. This pattern is merely a sequence of literal characters, and the regex engine handles it the same way it handles a single literal character. | ||
+ | |||
+ | ====Specials==== | ||
+ | ====Non-Printable==== | ||
+ | ===Character Classes (Sets)=== | ||
+ | ====Negated Character Classes==== | ||
+ | ====Metacharacters==== | ||
+ | ====Shorthand==== | ||
+ | ====Negated Shorthand==== | ||
+ | ====Repeating Character Classes==== | ||
+ | ===Dot=== | ||
+ | |||
+ | ===Anchors=== | ||
+ | |||
+ | ===Word Boundaries=== | ||
+ | |||
+ | ===Alternation=== | ||
+ | |||
+ | ===Quantifiers=== | ||
+ | |||
+ | ==Tools== | ||
+ | ===Utilities=== | ||
+ | *[http://unixhelp.ed.ac.uk/CGI/man-cgi?grep grep] | ||
+ | *[http://tools.tortoisesvn.net/grepWin.html grepWin] | ||
+ | *[http://regexpal.com/ RegexPal] | ||
+ | ===Programming Languages=== | ||
+ | *Gnulib | ||
+ | *Java | ||
+ | *JavaScript | ||
+ | *.NET | ||
+ | *Perl | ||
+ | *PHP | ||
+ | *PowerShell | ||
+ | *Python | ||
+ | *Ruby | ||
+ | ===Databases=== | ||
+ | *[http://dev.mysql.com/doc/refman/5.1/en/regexp.html MySQL] | ||
+ | *[http://docs.oracle.com/cd/B19306_01/appdev.102/b14251/adfns_regexp.htm Oracle] | ||
+ | *[http://www.postgresql.org/docs/9.0/static/functions-matching.html PostgreSQL] |
Revision as of 00:02, 1 July 2012
Regular expressions (regex) are essentially a compact language for describing and finding patterns in text. While the syntax is a bit tricky to learn, regex will save a great deal of time and effort in the long run. Many of you are probably familiar with the idea behind regex, even if only through the use of wildcards. Wildcard notation, such as '*.html', matches all HTML files in the given search directory. Regex takes this idea and expands on it dramatically, allowing for very complicated search patterns. A regular expression that matches the names of all HTML files in a given directory would be '.*\.html$'
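The comparison between wildcards and regex can be tried out with a minimal Python sketch. The file names below are hypothetical and exist only for illustration, and Python's standard fnmatch and re modules are used here as stand-ins for a shell wildcard and a regex engine; other tools may behave slightly differently.

```python
import fnmatch
import re

# Hypothetical file names, used only for illustration.
names = ["index.html", "notes.txt", "about.html", "html.bak"]

# Wildcard (glob) matching, roughly what a shell does for *.html.
glob_hits = [n for n in names if fnmatch.fnmatchcase(n, "*.html")]

# The regex from the text: any characters, a literal dot,
# then "html" anchored at the end of the name.
pattern = re.compile(r".*\.html$")
regex_hits = [n for n in names if pattern.search(n)]

print(glob_hits)
print(regex_hits)
```

Both lists should contain the same names, since the two patterns describe the same set of file names in this case.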
Syntax
Characters
Literals
The most basic regex is a literal character. A literal character, such as 'a', matches the 'a' in the string 'alex'. In a string such as 'adam', however, it will match only the first 'a', before the 'd', unless you tell the regex engine to keep searching. This is similar to text editors, where a 'find' function is usually paired with a 'find next' function.
Similarly, a regex search for 'hat' in the string 'Blackhat Academy' will return the 'hat' at the end of the first word. This pattern is merely a sequence of literal characters, and the regex engine handles it the same way it handles a single literal character.
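The behaviour described above can be sketched in Python, assuming the standard re module as the regex engine. Here re.search plays the role of 'find' and re.finditer the role of repeated 'find next'; the variable names are illustrative only.

```python
import re

# re.search stops at the first match, like 'find' in a text editor.
first = re.search(r"a", "adam")
first_pos = first.start()

# re.finditer keeps scanning, like pressing 'find next' repeatedly.
all_pos = [m.start() for m in re.finditer(r"a", "adam")]

# A sequence of literal characters is matched the same way.
hat = re.search(r"hat", "Blackhat Academy")
hat_span = hat.span()

print(first_pos, all_pos, hat_span)
```

Positions are zero-based offsets into the string, so the span for 'hat' marks where it starts and where it ends inside 'Blackhat'.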
Specials
Non-Printable
Character Classes (Sets)
Negated Character Classes
Metacharacters
Shorthand
Negated Shorthand
Repeating Character Classes
Dot
Anchors
Word Boundaries
Alternation
Quantifiers
Tools
Utilities
Programming Languages
- Gnulib
- Java
- JavaScript
- .NET
- Perl
- PHP
- PowerShell
- Python
- Ruby