Friday, April 8, 2011

Regular Expressions

Regular expressions are used by adding the regexp: prefix to your pattern: regexp:pattern
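
As a rough illustration of what the prefix means, the sketch below strips a leading regexp: and tests the rest of the pattern as a JavaScript regular expression. The helper name matchesPattern is hypothetical and this is not Selenium's own code; comparing unprefixed patterns literally is an assumption made only to keep the example short.

```javascript
// Hypothetical helper, not part of Selenium: shows what the regexp: prefix means.
function matchesPattern(pattern, text) {
  const prefix = "regexp:";
  if (pattern.startsWith(prefix)) {
    // Everything after the prefix is treated as a JavaScript regular expression.
    const re = new RegExp(pattern.slice(prefix.length));
    return re.test(text);
  }
  // Assumption for this sketch only: unprefixed patterns are compared literally.
  return pattern === text;
}

console.log(matchesPattern("regexp:Welcome, [a-z]+", "Welcome, alice")); // true
console.log(matchesPattern("Welcome, [a-z]+", "Welcome, alice"));        // false
```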

Selenium uses the JavaScript implementation of regular expressions (Perl-based):

· any ordinary character (one with no special meaning) matches itself: abc will match abc

· [ starts a character class, which matches exactly one character out of those listed in the class

o [a-z] will match a single lowercase letter; add a quantifier to match a whole word without spaces: [a-z]+ matches hello, pizza, world

o [A-Z] does the same with uppercase letters

o [a-zA-Z] will do the same with either uppercase or lowercase letters

o [abc] will match either a, b or c

o [0-9] will match numeric characters

o ^ negates the character class if it is placed just after [: [^a-z] will match any character except a lowercase letter

o \d, \w and \s are shortcuts that match, respectively, digits, word characters (letters, digits and underscores) and whitespace (spaces, tabs, line breaks)

o \D, \W and \S are the negations of the previous shortcuts
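
The character class rules above can be checked directly in any JavaScript console. This is a minimal sketch; the sample strings and the classDemo name are made up for illustration.

```javascript
// Each class matches exactly one character; add + to match a run of them.
const classDemo = {
  singleLetter: "hello world".match(/[a-z]/)[0],   // "h"
  wholeWord: "hello world".match(/[a-z]+/)[0],     // "hello"
  mixedCase: /^[a-zA-Z]+$/.test("Pizza"),          // true
  oneOfThree: /^[abc]$/.test("b"),                 // true
  negated: "abc123".match(/[^a-z]+/)[0],           // "123"
  digits: "room 101".match(/\d+/)[0],              // "101"
  wordChars: "id_42 ok".match(/\w+/)[0],           // "id_42"
  nonDigits: "room 101".match(/\D+/)[0],           // "room "
};

console.log(classDemo);
```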

· . matches any single character except the line breaks \r and \n

· ^ matches the start of the string the pattern is applied to: ^. matches a in abcdef

· $ is like ^ but for the end of the string: .$ matches f in abcdef

· | is equivalent to a logical OR: abc|def|xyz matches abc, def or xyz

· | has the lowest priority, so use parentheses to limit its scope: abc(def|xyz) will match either abcdef or abcxyz
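
A short sketch of the dot, the anchors and alternation, using the same sample strings as the bullets above. The extra ^ and $ in some patterns and the anchorDemo name are additions for this example only.

```javascript
const anchorDemo = {
  firstChar: "abcdef".match(/^./)[0],             // "a"
  lastChar: "abcdef".match(/.$/)[0],              // "f"
  dotStopsAtLineBreak: "ab\ncd".match(/.+/)[0],   // "ab"
  // Keeps only the strings that are exactly one of the three alternatives.
  anyOfThree: ["abc", "def", "xyz", "uvw"].filter((s) => /^(abc|def|xyz)$/.test(s)),
  grouped: /^abc(def|xyz)$/.test("abcxyz"),       // true
  groupedRejects: /^abc(def|xyz)$/.test("xyz"),   // false
  // Without parentheses the alternatives are abcdef and xyz.
  ungrouped: /abcdef|xyz/.test("xyz"),            // true
};

console.log(anchorDemo);
```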

· ? makes the preceding item optional: abc? matches ab or abc

· ? works in a greedy way: it will include the optional item if possible

· * repeats the preceding item zero or more times and is greedy: ".*" matches "def" "ghi" (everything from the first quote to the last) in abc "def" "ghi" jkl

· *? is the lazy star: ".*?" matches only "def" in the previous example

· + repeats the preceding item one or more times.

· {n} will match the previous item exactly n times: a{3} will match aaa

o {n,m} will match the previous item between n and m times (with m >= n). It is greedy, so it tries to match m items first: a{2,4} will match aaaa, aaa or aa

o {n,m}? is the same but lazy: it starts by matching n times and only extends toward m if the rest of the pattern requires it

o {n,} will match the previous item at least n times
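
The difference between greedy and lazy quantifiers is easiest to see by running them. A minimal sketch, reusing the abc "def" "ghi" jkl example from above; the quantifierDemo name and the aaaaa input are made up for illustration.

```javascript
const sample = 'abc "def" "ghi" jkl';

const quantifierDemo = {
  // c is optional, so ab and abc match but abd does not.
  optional: ["ab", "abc", "abd"].map((s) => /^abc?$/.test(s)), // [true, true, false]
  greedyStar: sample.match(/".*"/)[0],        // '"def" "ghi"'
  lazyStar: sample.match(/".*?"/)[0],         // '"def"'
  plusNeedsOne: /^a+$/.test(""),              // false
  exact: "aaaaa".match(/a{3}/)[0],            // "aaa"
  greedyRange: "aaaaa".match(/a{2,4}/)[0],    // "aaaa"
  lazyRange: "aaaaa".match(/a{2,4}?/)[0],     // "aa"
  atLeast: "aaaaa".match(/a{2,}/)[0],         // "aaaaa"
};

console.log(quantifierDemo);
```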

· Common regexps:

o \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3} matches an IP address but will also match invalid ones such as 999.999.999.999.

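o (25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?) will match only a valid IP address, with every number between 0 and 255. Can be shortened, if the four capture groups are not needed, to: (?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)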

o [A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4} matches an email address, provided it is applied case-insensitively (as written, it only accepts uppercase letters)
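
The common patterns can be compared side by side. This is a sketch under a few assumptions: the ^ and $ anchors are added here so that the whole string must match, the strict IP pattern is assembled from one octet sub-pattern repeated with {3}, and the email pattern is given the i flag. The variable names are made up for this example.

```javascript
// Loose pattern: any one to three digits per group.
const looseIp = /^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$/;

// Strict pattern: each group must be a number from 0 to 255.
const octet = "(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)";
const strictIp = new RegExp("^(?:" + octet + "\\.){3}" + octet + "$");

// Email pattern with the i flag, so lowercase letters are accepted too.
const emailRe = /^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/i;

console.log(looseIp.test("999.999.999.999"));      // true
console.log(strictIp.test("999.999.999.999"));     // false
console.log(strictIp.test("192.168.0.1"));         // true
console.log(emailRe.test("John.Doe@example.com")); // true
```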
