Date post: | 05-Jan-2016 |
Upload: | crystal-edwards |
C# Strings
C# Regular Expressions
CNS 3260
C# .NET Software Development

Introducing Regular Expressions
• String pattern matching tool
• Regular expressions constitute a language
  • C# regular expressions are a language inside a language
• Used in many languages (Perl most notably)
• There’s a whole class on the theory
  • CNS 3240: Computational Theory

Pattern Matching
• []  Match any of the characters in brackets once
  • [a-zA-Z]
• Anything not in brackets is matched exactly
  • Except for special characters
  • abc[a-zA-Z]
• *  Match preceding pattern zero or more times
  • [a-zA-Z]*
• +  Match preceding pattern one or more times
  • [a-zA-Z]+
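
The four patterns above can be tried out with Regex.IsMatch. This is a minimal sketch; the helper name MatchesWhole is an assumption made for illustration, and it wraps each pattern in ^ and $ so that the whole input has to match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternBasics
{
    // Hypothetical helper: true only when the entire input matches the pattern.
    public static bool MatchesWhole(string input, string pattern)
    {
        return Regex.IsMatch(input, "^(?:" + pattern + ")$");
    }

    public static void Main()
    {
        Console.WriteLine(MatchesWhole("q", "[a-zA-Z]"));        // True: exactly one letter
        Console.WriteLine(MatchesWhole("abcZ", "abc[a-zA-Z]"));  // True: literal abc, then one letter
        Console.WriteLine(MatchesWhole("", "[a-zA-Z]*"));        // True: zero letters is allowed
        Console.WriteLine(MatchesWhole("", "[a-zA-Z]+"));        // False: at least one letter is required
    }
}
```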

Language Elements
• ()   groups patterns
• |    “or”, choose between patterns
• []   defines a set or range of characters
• {}   used as a quantifier
• \    escape character
• .    matches any character (except a newline, unless RegexOptions.Singleline is set)
• ^    beginning of the input (of each line with RegexOptions.Multiline)
• $    end of the input (of each line with RegexOptions.Multiline)
• [^]  matches any character not specified

Quantifiers
• *      Matches zero or more
• +      Matches one or more
• ?      Matches zero or one
• {n}    Matches exactly n
• {n,}   Matches at least n
• {n,m}  Matches at least n, up to m
• These quantifiers always take the largest pattern they can match
• Lazy quantifiers always take the smallest pattern they can match
  • The lazy quantifiers are the same as those listed above, except followed by a ?
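
The difference between the largest and the smallest match is easiest to see side by side. The sketch below uses an input string and a helper name (FirstMatch) chosen only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazy
{
    // Hypothetical helper: returns the text of the first match, or "" if none.
    public static string FirstMatch(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        string input = "<b>bold</b>";

        // Greedy: .+ runs to the last '>' in the input.
        Console.WriteLine(FirstMatch(input, "<.+>"));   // <b>bold</b>

        // Lazy: .+? stops at the first '>' it can.
        Console.WriteLine(FirstMatch(input, "<.+?>"));  // <b>
    }
}
```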

Character Classes
• \w  Matches any word character
  • For ASCII text, same as: [a-zA-Z_0-9]
• \W  Matches any non-word character
  • For ASCII text, same as: [^a-zA-Z_0-9]
• \s  Matches any white-space character
  • For ASCII text, same as: [ \f\n\r\t\v]
• \S  Matches any non-white-space character
  • For ASCII text, same as: [^ \f\n\r\t\v]
• \d  Matches any digit character
  • For ASCII text, same as: [0-9]
• \D  Matches any non-digit character
  • For ASCII text, same as: [^0-9]
• By default .NET also matches Unicode letters, digits, and white space; RegexOptions.ECMAScript restricts these classes to the ASCII sets above
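
A short sketch of the character classes in use. The sample sentence and the FindAll helper are assumptions for illustration. The last two lines show that \w is wider than [a-zA-Z_0-9] unless RegexOptions.ECMAScript is given.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CharacterClasses
{
    // Hypothetical helper: collects the text of every match of pattern in input.
    public static string[] FindAll(string input, string pattern)
    {
        List<string> found = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            found.Add(m.Value);
        }
        return found.ToArray();
    }

    public static void Main()
    {
        string input = "Room 42, Floor 7";

        Console.WriteLine(string.Join("|", FindAll(input, @"\d+")));  // 42|7
        Console.WriteLine(string.Join("|", FindAll(input, @"\w+")));  // Room|42|Floor|7
        Console.WriteLine(FindAll(input, @"\s").Length);              // 3

        // \u00E9 is a non-ASCII letter (e with acute accent).
        Console.WriteLine(Regex.IsMatch("\u00E9", @"^\w$"));                           // True
        Console.WriteLine(Regex.IsMatch("\u00E9", @"^\w$", RegexOptions.ECMAScript));  // False
    }
}
```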

Putting It Together
• Regular Expression for C# identifiers (simplified, ASCII only; C# identifiers may not contain $):
  • [a-zA-Z_][a-zA-Z0-9_]*
• Floating Point Numbers:
  • (0|([1-9][0-9]*))?\.[0-9]+
• C# Hexadecimal numbers:
  • 0[xX][0-9a-fA-F]+
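
These patterns can be checked against a few sample inputs. This is a sketch under stated assumptions: the class and method names are made up for illustration, each pattern is anchored with ^ and $ so the whole input must match, and the identifier pattern is the ASCII-only form without $, which C# identifiers do not allow.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TokenPatterns
{
    // Each pattern is anchored so that it must describe the entire input.
    private static readonly Regex Identifier = new Regex(@"^[a-zA-Z_][a-zA-Z0-9_]*$");
    private static readonly Regex FloatingPoint = new Regex(@"^(0|([1-9][0-9]*))?\.[0-9]+$");
    private static readonly Regex Hex = new Regex(@"^0[xX][0-9a-fA-F]+$");

    public static bool IsIdentifier(string s) { return Identifier.IsMatch(s); }
    public static bool IsFloat(string s) { return FloatingPoint.IsMatch(s); }
    public static bool IsHex(string s) { return Hex.IsMatch(s); }

    public static void Main()
    {
        Console.WriteLine(IsIdentifier("_count1"));  // True
        Console.WriteLine(IsIdentifier("1abc"));     // False: may not start with a digit
        Console.WriteLine(IsFloat(".5"));            // True: the integer part is optional
        Console.WriteLine(IsFloat("012.5"));         // False: no leading zero before other digits
        Console.WriteLine(IsHex("0x1F"));            // True
        Console.WriteLine(IsHex("0xG1"));            // False: G is not a hex digit
    }
}
```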

C# Regular Expressions
• System.Text.RegularExpressions
  • Regex
  • Match
  • MatchCollection
  • Capture
  • CaptureCollection
  • Group

Regex Class
• Exposes static methods for doing Regular Expression matching
• Or, holds a Regular Expression as an object
  • Compiles the expression to make it faster
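
Both styles side by side, as a minimal sketch. The hexadecimal pattern and the method names are chosen only for illustration. The static call takes the pattern on every call, while the instance keeps the pattern in a reusable object.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StaticVsInstance
{
    private const string HexPattern = @"^0[xX][0-9a-fA-F]+$";

    // Instance style: the pattern is constructed once and reused for every call.
    private static readonly Regex HexRegex = new Regex(HexPattern, RegexOptions.Compiled);

    // Static style: the pattern is passed in on each call.
    public static bool IsHexStatic(string s) { return Regex.IsMatch(s, HexPattern); }

    public static bool IsHexInstance(string s) { return HexRegex.IsMatch(s); }

    public static void Main()
    {
        Console.WriteLine(IsHexStatic("0xFF"));    // True
        Console.WriteLine(IsHexInstance("0xFF"));  // True
        Console.WriteLine(IsHexInstance("FF"));    // False: the 0x prefix is missing
    }
}
```

The instance form tends to be the better fit when the same pattern is applied many times, for example inside a loop.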

Regex Members
• The non-static methods echo the static methods
• Options
• Escape()
• GetGroupNames()
• GetGroupNumbers()
• GroupNameFromNumber()
• GroupNumberFromName()
• IsMatch()
• Match()
• Matches()
• Replace()
• Split()
• Unescape()

Regex.Options
• Options
  • RegexOptions Enum
• Compiled: speeds up the searches
• IgnoreCase
• Multiline
• None
• RightToLeft
• Singleline
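
A small sketch of how three of these options change matching. The inputs and the CountMatches helper are assumptions for illustration. Options can be combined with the | operator.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionsDemo
{
    // Hypothetical helper: number of matches of pattern in input under the given options.
    public static int CountMatches(string input, string pattern, RegexOptions options)
    {
        return Regex.Matches(input, pattern, options).Count;
    }

    public static void Main()
    {
        // IgnoreCase: letters match regardless of case.
        Console.WriteLine(Regex.IsMatch("HELLO", "hello"));                           // False
        Console.WriteLine(Regex.IsMatch("HELLO", "hello", RegexOptions.IgnoreCase));  // True

        // Multiline: ^ and $ match at the start and end of every line.
        string lines = "one\ntwo\nthree";
        Console.WriteLine(CountMatches(lines, @"^\w+$", RegexOptions.None));       // 0
        Console.WriteLine(CountMatches(lines, @"^\w+$", RegexOptions.Multiline));  // 3

        // Singleline: . also matches the newline character.
        Console.WriteLine(Regex.IsMatch("a\nb", "a.b"));                           // False
        Console.WriteLine(Regex.IsMatch("a\nb", "a.b", RegexOptions.Singleline));  // True
    }
}
```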

Regex.Escape()
• Not sure what needs to be escaped?
  • Regex.Escape(string pattern)
  • Returns a new string with the necessary characters escaped
• Need to undo it?
  • Regex.Unescape(string pattern)
  • Returns a new string with the escaped characters converted back to their literal form
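
Escaping matters most when text from outside the program has to be matched literally. The sketch below assumes a hypothetical helper, ContainsLiteral, and a sample string chosen for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // Hypothetical helper: searches for literal text by escaping it first.
    public static bool ContainsLiteral(string input, string literal)
    {
        return Regex.IsMatch(input, Regex.Escape(literal));
    }

    public static void Main()
    {
        string literal = "1+1=2?";
        string escaped = Regex.Escape(literal);

        Console.WriteLine(escaped);                  // 1\+1=2\?
        Console.WriteLine(Regex.Unescape(escaped));  // 1+1=2?

        // Unescaped, + and ? act as quantifiers, so the text does not match itself.
        Console.WriteLine(Regex.IsMatch("is 1+1=2? yes", literal));   // False
        Console.WriteLine(ContainsLiteral("is 1+1=2? yes", literal)); // True
    }
}
```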

Matches
    private Regex re1 = new Regex(@"(([2-9]\d{2})-)?(\d{3})-(\d{4})");
    private string input1 = "801-224-6707";
• Match members:
  • Value
  • Index
  • Length
  • Success
  • NextMatch()
  • Captures
  • Groups
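
The Match members can be read off a single match like this. It is a minimal sketch; the Describe helper and the second sample sentence are assumptions for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchMembers
{
    private static readonly Regex Phone = new Regex(@"(([2-9]\d{2})-)?(\d{3})-(\d{4})");

    // Hypothetical helper: summarizes Success, Value, Index, and Length of the first match.
    public static string Describe(string input)
    {
        Match m = Phone.Match(input);
        if (!m.Success)
        {
            return "no match";
        }
        return string.Format("{0} at index {1}, length {2}", m.Value, m.Index, m.Length);
    }

    public static void Main()
    {
        Console.WriteLine(Describe("801-224-6707"));             // 801-224-6707 at index 0, length 12
        Console.WriteLine(Describe("Call 801-224-6707 today"));  // 801-224-6707 at index 5, length 12
        Console.WriteLine(Describe("no digits here"));           // no match
    }
}
```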

Linked Matches
• Follow links using NextMatch()
• Last link: Success == false

    Match1 (Success == true)  -> NextMatch() ->
    Match2 (Success == true)  -> NextMatch() ->
    Match3 (Success == true)  -> NextMatch() ->
    Match4 (Success == false)
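
Walking the chain is a loop that stops at the first match whose Success is false. The sketch below assumes a sample input with three phone numbers and a helper named AllMatches.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkedMatches
{
    private static readonly Regex Phone = new Regex(@"(([2-9]\d{2})-)?(\d{3})-(\d{4})");

    // Hypothetical helper: follows NextMatch() until Success is false.
    public static List<string> AllMatches(string input)
    {
        List<string> values = new List<string>();
        Match m = Phone.Match(input);
        while (m.Success)
        {
            values.Add(m.Value);
            m = m.NextMatch();  // the match after the last real one has Success == false
        }
        return values;
    }

    public static void Main()
    {
        foreach (string value in AllMatches("224-6707, 801-224-6707, 555-1234"))
        {
            Console.WriteLine(value);
        }
        // Prints 224-6707, then 801-224-6707, then 555-1234, one per line.
    }
}
```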

Groups
    private Regex re1 = new Regex(@"(([2-9]\d{2})-)?(\d{3})-(\d{4})");
• Group members: Value, Index, Length, Success, Captures
• Captures a matching substring for future use
  • Captures in a Capture object
• Group 0 represents the entire match
• The other groups are numbered by the position of their opening parenthesis:
  • 1: (([2-9]\d{2})-)
  • 2: ([2-9]\d{2})
  • 3: (\d{3})
  • 4: (\d{4})
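
Reading the groups of a match by number, as a minimal sketch. GroupValues is a hypothetical helper. The second call shows that an optional group that did not take part in the match has Success == false and an empty Value.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupsDemo
{
    private static readonly Regex Phone = new Regex(@"(([2-9]\d{2})-)?(\d{3})-(\d{4})");

    // Hypothetical helper: the Value of every group of the first match, indexed by group number.
    public static string[] GroupValues(string input)
    {
        Match m = Phone.Match(input);
        string[] values = new string[m.Groups.Count];
        for (int i = 0; i < m.Groups.Count; i++)
        {
            values[i] = m.Groups[i].Value;
        }
        return values;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join("|", GroupValues("801-224-6707")));  // 801-224-6707|801-|801|224|6707
        Console.WriteLine(string.Join("|", GroupValues("224-6707")));      // 224-6707|||224|6707

        // Groups 1 and 2 did not participate when the area code is missing.
        Console.WriteLine(Phone.Match("224-6707").Groups[1].Success);      // False
    }
}
```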

Named Groups
• Named group
  • (?<name>expression)
• Non-capturing group
  • (?:expression)
• Example pattern:
    @"(?:(?<areaCode>[2-9]\d{2})-)?(?<prefix>\d{3})-(?<lastFour>\d{4})"
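
With named groups, the parts of a match can be looked up by name instead of by number. A minimal sketch, assuming a hypothetical helper called Part:

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroups
{
    private static readonly Regex Phone = new Regex(
        @"(?:(?<areaCode>[2-9]\d{2})-)?(?<prefix>\d{3})-(?<lastFour>\d{4})");

    // Hypothetical helper: the value of the named group in the first match ("" if it did not participate).
    public static string Part(string input, string groupName)
    {
        return Phone.Match(input).Groups[groupName].Value;
    }

    public static void Main()
    {
        Console.WriteLine(Part("801-224-6707", "areaCode"));  // 801
        Console.WriteLine(Part("801-224-6707", "prefix"));    // 224
        Console.WriteLine(Part("801-224-6707", "lastFour"));  // 6707
        Console.WriteLine(Part("224-6707", "areaCode") == ""); // True: no area code present
    }
}
```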

Captures
• Capture members: Value, Index, Length
    private Regex re1 = new Regex(@"(([2-9]\d{2})-)?(\d{3})-(\d{4})");
    private string input1 = "801-224-6707";
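
Captures become interesting when a group sits inside a quantifier, because the group then captures more than once. The sketch below deliberately uses a different pattern from the slide, (?:(\d+)-?)+, chosen for illustration so that group 1 repeats. Group.Value holds only the last capture, while Group.Captures keeps all of them.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CapturesDemo
{
    // Assumed pattern for illustration: group 1 is repeated by the outer + quantifier.
    private static readonly Regex Digits = new Regex(@"(?:(\d+)-?)+");

    // Hypothetical helper: every capture made by group 1 in the first match.
    public static List<string> CapturesOfGroupOne(string input)
    {
        List<string> values = new List<string>();
        Match m = Digits.Match(input);
        foreach (Capture c in m.Groups[1].Captures)
        {
            values.Add(c.Value);
        }
        return values;
    }

    public static void Main()
    {
        Match m = Digits.Match("801-224-6707");
        Console.WriteLine(m.Groups[1].Value);  // 6707 (the last capture only)

        foreach (Capture c in m.Groups[1].Captures)
        {
            Console.WriteLine("{0} at index {1}, length {2}", c.Value, c.Index, c.Length);
        }
        // 801 at index 0, length 3
        // 224 at index 4, length 3
        // 6707 at index 8, length 4
    }
}
```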

Regex.Match()
• Regex.IsMatch(string input, string pattern)
  • Returns true if input matches pattern at least once
• Regex.Match(string input, string pattern)
  • Returns a Match object
  • Use Match.Value to get the string value of the match
• Regex.Matches(string input, string pattern)
  • Returns a MatchCollection of all the occurrences of pattern in input
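
The three static methods applied to one input, as a minimal sketch. The input string and the helper names are assumptions for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StaticMatching
{
    // Hypothetical helpers wrapping the static Regex methods with a digits pattern.
    public static bool HasNumber(string input) { return Regex.IsMatch(input, @"\d+"); }

    public static string FirstNumber(string input) { return Regex.Match(input, @"\d+").Value; }

    public static int CountNumbers(string input) { return Regex.Matches(input, @"\d+").Count; }

    public static void Main()
    {
        string input = "x=10, y=250, z=7";

        Console.WriteLine(HasNumber(input));     // True
        Console.WriteLine(FirstNumber(input));   // 10
        Console.WriteLine(CountNumbers(input));  // 3

        foreach (Match m in Regex.Matches(input, @"\d+"))
        {
            Console.WriteLine(m.Value);  // 10, then 250, then 7
        }
    }
}
```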

Regex Groups
• GetGroupNames()
  • Returns all the group names in a string[]
• GetGroupNumbers()
  • Returns the group numbers in an int[]
• GroupNameFromNumber()
• GroupNumberFromName()
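
A sketch of the group lookup methods on a pattern with three named groups. The pattern is an assumed phone-number example. In .NET the two lookup methods are spelled GroupNameFromNumber and GroupNumberFromName, and group 0 (the whole match) is reported under the name "0".

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupLookup
{
    public static readonly Regex Phone = new Regex(
        @"(?:(?<areaCode>[2-9]\d{2})-)?(?<prefix>\d{3})-(?<lastFour>\d{4})");

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Phone.GetGroupNames()));    // 0,areaCode,prefix,lastFour
        Console.WriteLine(string.Join(",", Phone.GetGroupNumbers()));  // 0,1,2,3

        Console.WriteLine(Phone.GroupNameFromNumber(2));           // prefix
        Console.WriteLine(Phone.GroupNumberFromName("lastFour"));  // 3
    }
}
```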

Regex.Split()
• Splits the string on a Regular Expression Pattern
    string input = "one%%two%%%%three%%%four";
    Console.WriteLine();
    Console.WriteLine("Split...");
    // Each run of one or more % characters is a separator.
    Console.WriteLine(string.Join(",", Regex.Split(input, @"[%]+")));
    Console.ReadLine();

Regex.Replace
• Refer to a captured group in the replacement string using $ followed by the group number
• Replace(string input, string replacement, int count)
    string input2 = "aaabbbccc:aaabbbccc:aaabbbccc";
    Regex re2 = new Regex("(aaa)(bbb)(ccc)");
    Console.WriteLine();
    Console.WriteLine("Replace...");
    // count is 1, so only the first match is rewritten.
    Console.WriteLine(re2.Replace(input2, "$3$2$1", 1));
    Console.ReadLine();
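
Named groups can be referenced in the replacement string as well, using ${name}. The sketch below assumes a phone-number pattern with a required area code and a hypothetical helper named Reformat. Without a count argument, every match is replaced.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplaceDemo
{
    private static readonly Regex Phone = new Regex(
        @"(?<areaCode>[2-9]\d{2})-(?<prefix>\d{3})-(?<lastFour>\d{4})");

    // Hypothetical helper: rewrites every 801-224-6707 style number as (801) 224-6707.
    public static string Reformat(string input)
    {
        return Phone.Replace(input, "(${areaCode}) ${prefix}-${lastFour}");
    }

    public static void Main()
    {
        Console.WriteLine(Reformat("Call 801-224-6707 or 435-555-0100"));
        // Call (801) 224-6707 or (435) 555-0100
    }
}
```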

Constructing Strings
• Because strings are immutable, building them is slow
  • Each change creates a new string
• Use StringBuilder to speed things up
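
The two approaches produce the same text; the difference is in how many intermediate strings are created along the way. A minimal sketch, with method names chosen for illustration:

```csharp
using System;
using System.Text;

public static class BuildingStrings
{
    // Every += allocates a new string and copies the old contents into it.
    public static string BuildWithConcat(int count)
    {
        string result = "";
        for (int i = 0; i < count; i++)
        {
            result += i + ",";
        }
        return result;
    }

    // StringBuilder appends into one mutable buffer and creates the string once at the end.
    public static string BuildWithBuilder(int count)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            sb.Append(i).Append(',');
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(BuildWithConcat(3));   // 0,1,2,
        Console.WriteLine(BuildWithBuilder(3));  // 0,1,2,
    }
}
```

For a handful of concatenations the difference is unlikely to matter; it grows with the number of changes made to the string.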

StringBuilder
• In System.Text
• Contains a mutable string
• Allocates space as needed
• Build the string then call:
  • myStringBuilder.ToString()

StringBuilder Members
• Capacity
• Indexer
• Length
• Append()
• AppendFormat()
• Insert()
• Remove()
• Replace()
• ToString()
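
Most of these members fit into one short walk-through. The sketch below is illustrative only; the text being built and the Build helper are assumptions, and the comments show the buffer contents after each call.

```csharp
using System;
using System.Text;

public static class StringBuilderMembers
{
    // Hypothetical helper: exercises the members listed above and returns the builder.
    public static StringBuilder Build()
    {
        StringBuilder sb = new StringBuilder(32);  // initial Capacity of 32 characters
        sb.Append("Hello");                  // Hello
        sb.AppendFormat(", {0}!", "world");  // Hello, world!
        sb.Insert(0, ">> ");                 // >> Hello, world!
        sb.Replace("world", "C#");           // >> Hello, C#!
        sb.Remove(0, 3);                     // Hello, C#!
        sb[0] = 'J';                         // Jello, C#! (the indexer is writable)
        return sb;
    }

    public static void Main()
    {
        StringBuilder sb = Build();
        Console.WriteLine(sb.ToString());             // Jello, C#!
        Console.WriteLine(sb.Length);                 // 10
        Console.WriteLine(sb.Capacity >= sb.Length);  // True
    }
}
```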