C# Strings 1 C# Regular Expressions CNS 3260 C#.NET Software Development.

Date post: 05-Jan-2016
Upload: crystal-edwards
Transcript
Page 1: C# Strings 1 C# Regular Expressions CNS 3260 C#.NET Software Development.

C# Strings

C# Regular Expressions

CNS 3260

C# .NET Software Development

Introducing Regular Expressions

String pattern matching tool

Regular expressions constitute a language
• C# regular expressions are a language inside a language

Used in many languages (Perl most notably)

There's a whole class on the theory
• CNS 3240: Computational Theory

Pattern Matching

Match any one of the characters in brackets [] once
• [a-zA-Z]

Anything not in brackets is matched exactly
• Except for special characters
• abc[a-zA-Z]

* Match preceding pattern zero or more times
• [a-zA-Z]*

+ Match preceding pattern one or more times
• [a-zA-Z]+
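
The patterns on this slide can be tried out with a small sketch like the one below. The class name PatternBasics, the helper FirstMatch, and the sample inputs are made up for illustration; only Regex.Match and Match.Value come from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternBasics
{
    // Hypothetical helper: returns the first substring of input that
    // matches pattern, or "" when nothing matches.
    public static string FirstMatch(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        // Literal "abc" followed by exactly one letter.
        Console.WriteLine(FirstMatch("xxabcdxx", "abc[a-zA-Z]"));         // abcd

        // One or more letters: the first run of letters is returned.
        Console.WriteLine(FirstMatch("123hello456", "[a-zA-Z]+"));        // hello

        // Zero or more letters: succeeds immediately with an empty match.
        Console.WriteLine(FirstMatch("123hello456", "[a-zA-Z]*").Length); // 0
    }
}
```

The last call shows why * needs care: matching zero letters at the very start of the input already counts as success.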

Language Elements

()   groups patterns
|    "or", choose between patterns
[]   defines a set or range of characters
{}   used as a quantifier
\    escape character
.    matches any character (except a newline, by default)
^    beginning of the input (beginning of each line with RegexOptions.Multiline)
$    end of the input (end of each line with RegexOptions.Multiline)
[^]  matches any character not listed in the brackets
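
A short sketch of several of these elements working together. The class name ElementsDemo, the helper Test, and the sample words are assumptions chosen for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ElementsDemo
{
    // Hypothetical helper: thin wrapper over Regex.IsMatch.
    public static bool Test(string input, string pattern)
    {
        return Regex.IsMatch(input, pattern);
    }

    public static void Main()
    {
        // Grouping, alternation and anchors: "gray" or "grey", nothing else.
        Console.WriteLine(Test("gray", "^gr(a|e)y$"));  // True
        Console.WriteLine(Test("grey", "^gr(a|e)y$"));  // True
        Console.WriteLine(Test("groy", "^gr(a|e)y$"));  // False

        // "." matches any character; "\." matches only a literal dot.
        Console.WriteLine(Test("axb", "a.b"));          // True
        Console.WriteLine(Test("axb", @"a\.b"));        // False
        Console.WriteLine(Test("a.b", @"a\.b"));        // True

        // Negated class: the input may not contain any digit.
        Console.WriteLine(Test("cat", "^[^0-9]+$"));    // True
        Console.WriteLine(Test("c4t", "^[^0-9]+$"));    // False
    }
}
```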

Quantifiers

*      Matches zero or more
+      Matches one or more
?      Matches zero or one
{n}    Matches exactly n
{n,}   Matches at least n
{n,m}  Matches at least n, up to m

These quantifiers are greedy: they always take the largest pattern they can match

Lazy quantifiers always take the smallest pattern they can match
• The lazy quantifiers are the same as those listed above, except followed by a ?
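
The difference between greedy and lazy quantifiers is easiest to see side by side. This is a minimal sketch; the class name GreedyVsLazy, the helper FirstMatch, and the sample strings are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazy
{
    // Hypothetical helper: first substring of input matching pattern.
    public static string FirstMatch(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        string html = "<b>bold</b> and <i>italic</i>";

        // Greedy: .* runs on to the last '>' in the input.
        Console.WriteLine(FirstMatch(html, "<.*>"));      // <b>bold</b> and <i>italic</i>

        // Lazy: .*? stops at the first '>' that completes a match.
        Console.WriteLine(FirstMatch(html, "<.*?>"));     // <b>

        // The same applies to counted quantifiers.
        Console.WriteLine(FirstMatch("aaaaa", "a{2,4}"));  // aaaa
        Console.WriteLine(FirstMatch("aaaaa", "a{2,4}?")); // aa
    }
}
```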

Character Classes

The bracket equivalents below hold for ASCII input. By default .NET's \w, \s and \d also match the corresponding Unicode characters; RegexOptions.ECMAScript restricts them to the ASCII sets.

\w  Matches any word character
• Same as: [a-zA-Z_0-9]

\W  Matches any non-word character
• Same as: [^a-zA-Z_0-9]

\s  Matches any white-space character
• Same as: [ \f\n\r\t\v]

\S  Matches any non-white-space character
• Same as: [^ \f\n\r\t\v]

\d  Matches any digit character
• Same as: [0-9]

\D  Matches any non-digit character
• Same as: [^0-9]
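
The bracket equivalents listed for these classes describe ASCII input. As a hedged illustration of where .NET's defaults go further, the sketch below tests a non-ASCII digit and letter. The class name ClassDemo and the helper Test are made up; the code points are U+0663 (Arabic-Indic digit three) and U+00E9 (é).

```csharp
using System;
using System.Text.RegularExpressions;

public static class ClassDemo
{
    // Hypothetical helper: IsMatch with explicit options.
    public static bool Test(string input, string pattern, RegexOptions options)
    {
        return Regex.IsMatch(input, pattern, options);
    }

    public static void Main()
    {
        string arabicThree = "\u0663";

        // \d is Unicode-aware by default, [0-9] is not.
        Console.WriteLine(Test(arabicThree, @"\d", RegexOptions.None));       // True
        Console.WriteLine(Test(arabicThree, "[0-9]", RegexOptions.None));     // False

        // ECMAScript mode restricts \d to the ASCII digits.
        Console.WriteLine(Test(arabicThree, @"\d", RegexOptions.ECMAScript)); // False

        // \w matches accented letters; \s matches a plain space.
        Console.WriteLine(Test("\u00E9", @"\w", RegexOptions.None));          // True
        Console.WriteLine(Test(" ", @"\s", RegexOptions.None));               // True
    }
}
```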

Putting It Together

Regular expression for C# identifiers (ASCII-only simplification; C# does not allow $ in identifiers):
• [a-zA-Z_][a-zA-Z0-9_]*

Floating point numbers:
• (0|([1-9][0-9]*))?\.[0-9]+

C# hexadecimal numbers:
• 0[xX][0-9a-fA-F]+
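
A sketch of how these three patterns might be used to validate whole tokens. The class name TokenPatterns and the helper IsToken are hypothetical, and the ^ and $ anchors are added here so that the entire input has to match. The identifier pattern is an ASCII-only simplification that leaves out $, since C# does not accept $ in identifiers.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TokenPatterns
{
    // ^ and $ anchor each pattern to the start and end of the input.
    public const string Identifier = @"^[a-zA-Z_][a-zA-Z0-9_]*$";
    public const string Float = @"^(0|([1-9][0-9]*))?\.[0-9]+$";
    public const string Hex = @"^0[xX][0-9a-fA-F]+$";

    // Hypothetical helper: true when text as a whole fits the pattern.
    public static bool IsToken(string text, string pattern)
    {
        return Regex.IsMatch(text, pattern);
    }

    public static void Main()
    {
        Console.WriteLine(IsToken("_count1", Identifier)); // True
        Console.WriteLine(IsToken("1abc", Identifier));    // False: starts with a digit

        Console.WriteLine(IsToken("3.14", Float));         // True
        Console.WriteLine(IsToken(".5", Float));           // True: integer part is optional
        Console.WriteLine(IsToken("03.14", Float));        // False: leading zero
        Console.WriteLine(IsToken("42", Float));           // False: no decimal point

        Console.WriteLine(IsToken("0xFF", Hex));           // True
        Console.WriteLine(IsToken("0xG1", Hex));           // False: G is not a hex digit
    }
}
```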

C# Regular Expressions

System.Text.RegularExpressions
• Regex
• Match
• MatchCollection
• Capture
• CaptureCollection
• Group

Regex Class

Exposes static methods for doing Regular Expression matching

Or, holds a Regular Expression as an object
• Compiles the expression to make it faster
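
The two ways of using the class can be sketched as follows. The class name StaticVsInstance, the field Digits, and both helper methods are assumptions made for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StaticVsInstance
{
    // Instance form: the pattern is parsed once and the object is reused.
    private static readonly Regex Digits = new Regex(@"\d+");

    // Static form: the pattern is passed on every call.
    public static bool HasDigitsStatic(string input)
    {
        return Regex.IsMatch(input, @"\d+");
    }

    public static bool HasDigitsInstance(string input)
    {
        return Digits.IsMatch(input);
    }

    public static void Main()
    {
        Console.WriteLine(HasDigitsStatic("room 101"));     // True
        Console.WriteLine(HasDigitsInstance("room 101"));   // True
        Console.WriteLine(HasDigitsStatic("no numbers"));   // False
        Console.WriteLine(HasDigitsInstance("no numbers")); // False
    }
}
```

Both forms give the same answers. Holding a Regex object is usually the better fit when the same pattern is applied many times.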

Regex Members

The non-static methods echo the static methods

• Options
• Escape()
• GetGroupNames()
• GetGroupNumbers()
• GroupNameFromNumber()
• GroupNumberFromName()
• IsMatch()
• Match()
• Matches()
• Replace()
• Split()
• Unescape()

Regex.Options

Options
• RegexOptions Enum

Compiled: speeds up the searches, at the cost of slower construction
IgnoreCase
Multiline
None
RightToLeft
Singleline
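
A sketch of what a few of these options change. The class name OptionsDemo, the helper CountMatches, and the sample text are invented for illustration. Options are combined with the | operator.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionsDemo
{
    // Hypothetical helper: number of matches of pattern under the given options.
    public static int CountMatches(string input, string pattern, RegexOptions options)
    {
        return Regex.Matches(input, pattern, options).Count;
    }

    public static void Main()
    {
        string text = "Cat\ncat\nCAT";

        // Case-sensitive by default.
        Console.WriteLine(CountMatches(text, "cat", RegexOptions.None));        // 1
        Console.WriteLine(CountMatches(text, "cat", RegexOptions.IgnoreCase));  // 3

        // Without Multiline, ^ and $ anchor to the whole input.
        Console.WriteLine(CountMatches(text, "^cat$", RegexOptions.IgnoreCase)); // 0
        Console.WriteLine(CountMatches(text, "^cat$",
            RegexOptions.IgnoreCase | RegexOptions.Multiline));                  // 3

        // Singleline lets "." match a newline as well.
        Console.WriteLine(Regex.IsMatch("a\nb", "a.b"));                          // False
        Console.WriteLine(Regex.IsMatch("a\nb", "a.b", RegexOptions.Singleline)); // True
    }
}
```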

Regex.Escape()

Not sure what needs to be escaped?
Regex.Escape(string pattern)
• Returns a new string with the necessary characters escaped

Need to undo it?
Regex.Unescape(string pattern)
• Returns a new string with the escaped characters converted back to their unescaped form
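
A small sketch of why escaping matters when the text to search for contains special characters. The class name EscapeDemo and the helper ContainsLiteral are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // Hypothetical helper: true if input contains literalText verbatim.
    public static bool ContainsLiteral(string input, string literalText)
    {
        return Regex.IsMatch(input, Regex.Escape(literalText));
    }

    public static void Main()
    {
        // Unescaped, "1+1" means "one or more 1s followed by a 1".
        Console.WriteLine(Regex.IsMatch("1+1=2", "1+1"));   // False

        // Escaped, the + is matched as an ordinary character.
        Console.WriteLine(ContainsLiteral("1+1=2", "1+1")); // True

        Console.WriteLine(Regex.Escape("1+1=2"));           // 1\+1=2
        Console.WriteLine(Regex.Unescape(@"1\+1=2"));       // 1+1=2
    }
}
```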

Matches

private Regex re1 = new Regex(@"(([2-9]\d{2})-)?(\d{3})-(\d{4})");
private string input1 = "801-224-6707";

Match members:
• Value
• Index
• Length
• Success
• NextMatch()
• Captures
• Groups
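
The Match members listed above can be inspected with a sketch like this one. It reuses the slide's pattern and phone number. The class name MatchDemo, the helper FindPhone, and the surrounding sentence are made up so that Index is not simply 0.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchDemo
{
    private static readonly Regex Phone =
        new Regex(@"(([2-9]\d{2})-)?(\d{3})-(\d{4})");

    // Hypothetical helper: first phone-number match in input.
    public static Match FindPhone(string input)
    {
        return Phone.Match(input);
    }

    public static void Main()
    {
        Match m = FindPhone("Call 801-224-6707 today");

        Console.WriteLine(m.Success);      // True
        Console.WriteLine(m.Value);        // 801-224-6707
        Console.WriteLine(m.Index);        // 5
        Console.WriteLine(m.Length);       // 12
        Console.WriteLine(m.Groups.Count); // 5 (group 0 plus four numbered groups)

        // A failed match is still a Match object, with Success == false.
        Console.WriteLine(FindPhone("no phone here").Success); // False
    }
}
```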

Linked Matches

Follow links using NextMatch()
The last link has Success == false

Match1 (Success == true) -> NextMatch() -> Match2 (Success == true) -> NextMatch() -> Match3 (Success == true) -> NextMatch() -> Match4 (Success == false)
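
Walking the chain of matches can be written as a loop that stops at the first Match whose Success is false. This is a minimal sketch; the class name LinkedMatches, the helper AllMatches, and the sample numbers are invented.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkedMatches
{
    // Hypothetical helper: follows NextMatch() until Success is false.
    public static List<string> AllMatches(Regex re, string input)
    {
        List<string> results = new List<string>();
        for (Match m = re.Match(input); m.Success; m = m.NextMatch())
        {
            results.Add(m.Value);
        }
        return results;
    }

    public static void Main()
    {
        Regex re = new Regex(@"\d{3}-\d{4}");
        string input = "224-6707, 555-1234 and 867-5309";

        foreach (string value in AllMatches(re, input))
        {
            Console.WriteLine(value);
        }
        // Prints:
        // 224-6707
        // 555-1234
        // 867-5309
    }
}
```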

Groups

private Regex re1 = new Regex(@"(([2-9]\d{2})-)?(\d{3})-(\d{4})");

Group numbers in the pattern above:
• 0: the entire match
• 1: (([2-9]\d{2})-)
• 2: ([2-9]\d{2})
• 3: (\d{3})
• 4: (\d{4})

Group members:
• Value
• Index
• Length
• Success
• Captures

Captures a matching substring for future use
• Captures in a Capture object

Group 0 represents the entire match
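
To see which substring each group holds, the slide's pattern can be run against a number with and without an area code. The class name GroupsDemo and the helper GroupValues are assumptions for this sketch.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupsDemo
{
    private static readonly Regex Phone =
        new Regex(@"(([2-9]\d{2})-)?(\d{3})-(\d{4})");

    // Hypothetical helper: values of groups 0..4 for the first match.
    public static string[] GroupValues(string input)
    {
        Match m = Phone.Match(input);
        string[] values = new string[m.Groups.Count];
        for (int i = 0; i < m.Groups.Count; i++)
        {
            values[i] = m.Groups[i].Value;
        }
        return values;
    }

    public static void Main()
    {
        // Groups 0..4: 801-224-6707 | 801- | 801 | 224 | 6707
        Console.WriteLine(string.Join(" | ", GroupValues("801-224-6707")));

        // Without an area code, groups 1 and 2 do not take part in the
        // match: their Value is "" and their Success is false.
        Match m = Phone.Match("224-6707");
        Console.WriteLine(m.Groups[1].Success); // False
        Console.WriteLine(m.Groups[3].Value);   // 224
        Console.WriteLine(m.Groups[4].Value);   // 6707
    }
}
```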

Named Groups

Named group
• (?<name>expression)

Non-capturing group
• (?:expression)

(@"(?:(?<areaCode>[2-9]\d{2})-)?(?<prefix>\d{3})-(?<lastFour>\d{4})")
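
Named groups are read back through the Groups indexer using the group name. The sketch below uses the slide's pattern. The class name NamedGroupsDemo and the helper Part are made up.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupsDemo
{
    private static readonly Regex Phone = new Regex(
        @"(?:(?<areaCode>[2-9]\d{2})-)?(?<prefix>\d{3})-(?<lastFour>\d{4})");

    // Hypothetical helper: value of the named group in the first match.
    public static string Part(string input, string groupName)
    {
        return Phone.Match(input).Groups[groupName].Value;
    }

    public static void Main()
    {
        Console.WriteLine(Part("801-224-6707", "areaCode")); // 801
        Console.WriteLine(Part("801-224-6707", "prefix"));   // 224
        Console.WriteLine(Part("801-224-6707", "lastFour")); // 6707

        // The (?:...) wrapper groups "areaCode-" without creating a
        // numbered group, so only group 0 and the three named groups exist.
        Console.WriteLine(Phone.Match("801-224-6707").Groups.Count); // 4
    }
}
```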

Captures

Capture members:
• Value
• Index
• Length

private Regex re1 = new Regex(@"(([2-9]\d{2})-)?(\d{3})-(\d{4})");
private string input1 = "801-224-6707";
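
Captures become interesting when a group sits inside a quantifier and therefore matches more than once. The sketch below uses a different, made-up pattern, ((\d+)-?)+, on the slide's phone number to show this. The class name CapturesDemo and the helper DigitRuns are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CapturesDemo
{
    // Illustrative pattern (not from the slides): group 2 repeats
    // once per run of digits.
    private static readonly Regex Runs = new Regex(@"((\d+)-?)+");

    // Hypothetical helper: every capture recorded by group 2.
    public static List<string> DigitRuns(string input)
    {
        List<string> values = new List<string>();
        Group digits = Runs.Match(input).Groups[2];
        foreach (Capture c in digits.Captures)
        {
            values.Add(c.Value);
        }
        return values;
    }

    public static void Main()
    {
        // Group.Value only reports the last capture...
        Console.WriteLine(Runs.Match("801-224-6707").Groups[2].Value); // 6707

        // ...while Captures keeps all of them, in order: 801, 224, 6707
        Console.WriteLine(string.Join(", ", DigitRuns("801-224-6707")));
    }
}
```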

Regex.Match()

Regex.IsMatch(string input, string pattern)
• Returns true if input matches pattern at least once

Regex.Match(string input, string pattern)
• Returns a Match object
• Use Match.Value to get the string value of the match

Regex.Matches(string input, string pattern)
• Returns a MatchCollection of all the occurrences of pattern in input
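
The three static methods side by side, as a minimal sketch. The class name StaticMatching, the helper CountOccurrences, and the sample input and pattern are invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StaticMatching
{
    // Hypothetical helper: how many times pattern occurs in input.
    public static int CountOccurrences(string input, string pattern)
    {
        return Regex.Matches(input, pattern).Count;
    }

    public static void Main()
    {
        string input = "801-224-6707 or 555-1234";
        string pattern = @"\d{3}-\d{4}";

        // IsMatch: is there at least one occurrence?
        Console.WriteLine(Regex.IsMatch(input, pattern));     // True

        // Match: the first occurrence only.
        Console.WriteLine(Regex.Match(input, pattern).Value); // 224-6707

        // Matches: every occurrence.
        foreach (Match m in Regex.Matches(input, pattern))
        {
            Console.WriteLine(m.Value);                       // 224-6707, then 555-1234
        }
        Console.WriteLine(CountOccurrences(input, pattern));  // 2
    }
}
```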

Regex Groups

GetGroupNames()
• Returns all the group names in a string[]

GetGroupNumbers()
• Returns the group numbers in an int[]

GroupNameFromNumber()
GroupNumberFromName()
• Convert between a group's number and its name
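
A sketch of the group lookup methods applied to the named-group phone pattern. In the .NET API the two conversion methods are called GroupNameFromNumber and GroupNumberFromName. The class name GroupInfoDemo and the field Phone are assumptions.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupInfoDemo
{
    public static readonly Regex Phone = new Regex(
        @"(?:(?<areaCode>[2-9]\d{2})-)?(?<prefix>\d{3})-(?<lastFour>\d{4})");

    public static void Main()
    {
        // Group 0 (the whole match) is always present and is named "0".
        Console.WriteLine(string.Join(", ", Phone.GetGroupNames()));
        // 0, areaCode, prefix, lastFour

        Console.WriteLine(string.Join(", ", Phone.GetGroupNumbers()));
        // 0, 1, 2, 3

        Console.WriteLine(Phone.GroupNumberFromName("prefix")); // 2
        Console.WriteLine(Phone.GroupNameFromNumber(3));        // lastFour
    }
}
```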

Regex.Split()

Splits the string on a Regular Expression Pattern

string input = "one%%two%%%%three%%%four";
Console.WriteLine();
Console.WriteLine("Split...");
Console.WriteLine(string.Join(",", Regex.Split(input, @"[%]+")));
Console.ReadLine();

Regex.Replace

Refer to a captured group in the replacement string using $ and the group number (for example, $1)
Replace(string input, string replacement, int count)

string input2 = "aaabbbccc:aaabbbccc:aaabbbccc";
Regex re2 = new Regex("(aaa)(bbb)(ccc)");
Console.WriteLine();
Console.WriteLine("Replace...");
Console.WriteLine(re2.Replace(input2, "$3$2$1", 1));
Console.ReadLine();

Constructing Strings

Because strings are immutable, building them is slow
• Each change creates a new string

Use StringBuilder to speed things up

StringBuilder

In System.Text
Contains a mutable string
Allocates space as needed
Build the string then call:
• myStringBuilder.ToString()

StringBuilder Members

• Capacity
• Indexer
• Length
• Append()
• AppendFormat()
• Insert()
• Remove()
• Replace()
• ToString()
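
A brief sketch that exercises most of these members. The class name StringBuilderDemo, the helpers BuildCsv and EditDemo, and the sample text are invented for illustration.

```csharp
using System;
using System.Text;

public static class StringBuilderDemo
{
    // Hypothetical helper: "1,2,...,count" built without creating
    // a new string on every step.
    public static string BuildCsv(int count)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i <= count; i++)
        {
            if (sb.Length > 0)
            {
                sb.Append(',');
            }
            sb.Append(i);
        }
        return sb.ToString();
    }

    // Hypothetical helper: edits one buffer in place, step by step.
    public static string EditDemo()
    {
        StringBuilder sb = new StringBuilder("Hello World");
        sb.Insert(0, ">> ");                        // >> Hello World
        sb.Replace("World", "C#");                  // >> Hello C#
        sb.Remove(0, 3);                            // Hello C#
        sb[0] = 'J';                                // Jello C#
        sb.AppendFormat(" ({0} chars)", sb.Length); // Jello C# (8 chars)
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(BuildCsv(5)); // 1,2,3,4,5
        Console.WriteLine(EditDemo());  // Jello C# (8 chars)
    }
}
```

All edits happen inside the same buffer. A string is only created when ToString() is called.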

