From the course: Learning Regular Expressions
Define a character set - Regular Expressions Tutorial
- In this chapter, we'll learn about character sets. We'll begin with two more metacharacters: the opening and closing square brackets. These let us define a custom character set. A character set matches any one of the characters listed in the set, but it matches only one character. Be careful about that. The order in which we put the characters into the set does not matter. If we had a character set with a, e, i, o, and u inside the square brackets, [aeiou], it would match any one lowercase vowel. An example of where this is useful: suppose we want to search a text for the word "gray," and we want to find it whether it's spelled "g-r-e-y" or "g-r-a-y." We can use a character set, so that the regular expression looks for "gr," then, inside square brackets, "e" and "a," followed by "y": gr[ea]y. Now be careful: the set stands for a single character, so "gr," then "ea" in square brackets, followed by "t," gr[ea]t, would not match the word "great." Even though it looks the same visually, it doesn't have the same meaning.
Let's try out a few in our RegEx tool. "Apples and bananas and peaches" is the text we're going to search. For our regular expression, let's start with an opening square bracket, then a, e, i, o, and u, and then we'll close the square bracket. Now we've defined a character set, and the tool tells us it's going to match any one of these characters. You can see it matched the "e" in "Apples." It didn't match the "A" because the match is case sensitive. Let's go over to our flags and turn on global, and now you can see it matches all of the lowercase vowels. If we wanted it to match the uppercase ones as well, we'd need to add uppercase A, E, I, O, and U to the set. Now it matches the "A" in "Apples," too. As I said, it doesn't matter what order you write these in. If we swapped them around and made it u, o, i, e, a, it would match exactly the same thing.
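The demo so far can be reproduced outside the course's regex tool. The following is a minimal sketch that assumes Python's built-in re module as a stand-in for that tool. Python has no global flag, so re.search (first match only) and re.findall (every match) are used here to approximate the flag being off and on.

```python
import re

text = "Apples and bananas and peaches"

# re.search returns only the first match, like the tool with the global flag off.
first = re.search(r"[aeiou]", text)
first_vowel = first.group()  # the "e" in "Apples"; the "A" is skipped (case sensitive)

# re.findall returns every match, like turning the global flag on.
lowercase_vowels = re.findall(r"[aeiou]", text)

# Adding the uppercase vowels to the set lets it match the "A" in "Apples" too.
all_vowels = re.findall(r"[aeiouAEIOU]", text)

# The order of the characters inside the set does not change what it matches.
same_result = re.findall(r"[uoiea]", text) == lowercase_vowels

# A set stands for exactly one character: gr[ea]y matches both spellings,
# but gr[ea]t cannot match "great", which has two letters between "gr" and "t".
spellings = re.findall(r"gr[ea]y", "grey gray")
great = re.search(r"gr[ea]t", "great")

print(first_vowel, lowercase_vowels, all_vowels[0], same_result, spellings, great)
```

The variable names are only illustrative. The patterns themselves are the ones described in the lesson.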
Let's try another example. Let's erase our fruits and write in "grey," space, "gray." This time we don't want to find just the vowels. Let's find anything that matches "gr," then either "e" or "a," followed by "y." You can see that it now matches both spellings, and as we discussed, the set matches only a single character, either the "e" or the "a." If we changed the text to "great" and ended the expression with "t" instead of "y," it would not match. It would match "gret" and "grat." Those would both match because the set stands for a single character, not two characters.
Now, we've only been using character sets with letters, but that's not the only use case for them. For example, we could look for digits. Let's type a number sign, then an opening square bracket, then 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, and close the square bracket. Let's say we were searching text that had "contestant #1" and "contestant #2." If we also had "contestant #99," we'd notice that it doesn't match both digits. It matches only the first one, because the set matches any one of those digits.
We can also do that with punctuation. Let's say we had some text and we wanted to find either "Notice: Keep off the grass" or "Notice! Keep off the grass." We could make our regular expression look for "Notice" followed by a character set that holds a colon, an exclamation point, a semicolon, a comma, et cetera. Any of those literal characters can be put into a character set. A character set is just a way of defining a custom set of values that are allowed in a single character position.
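The digit and punctuation examples can be sketched the same way. This again assumes Python's re module rather than the course's regex tool, and the sample strings are reconstructions of what is read aloud in the lesson, not text copied from the exercise files.

```python
import re

# "#" is a literal character here; the set after it matches exactly one digit.
contestant = re.compile(r"#[0123456789]")
contestants = "contestant #1, contestant #2, contestant #99"
numbers = contestant.findall(contestants)  # "#99" only yields "#9": one digit per set

# Punctuation characters can be listed in a set just like letters and digits.
notice = re.compile(r"Notice[:!;,]")
signs = [
    "Notice: Keep off the grass",
    "Notice! Keep off the grass",
    "Notice Keep off the grass",  # no punctuation after "Notice", so no match
]
matched = [bool(notice.search(sign)) for sign in signs]

# The set fills a single character position, so gr[ea]t accepts "gret" and
# "grat" but rejects "great" (two letters) and "grit" ("i" is not in the set).
words = ["gret", "grat", "great", "grit"]
accepted = [word for word in words if re.fullmatch(r"gr[ea]t", word)]

print(numbers, matched, accepted)
```

To match both digits of a two-digit number with what has been covered so far, the set would simply be written twice, once for each character position.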