Python Regular Expression Exercises
Let’s check out some exercises that will help you understand Regular Expressions better.
Exercise 6-a
From the list, keep only the lines that start with a number or a letter after the > sign.
Hint 1
You can use the findall function from Python's re module:
i.e.: re.findall()
Hint 2
\w+
can be a meaningful regular expression in this case: \w matches a letter, digit, or underscore, and + means one or more.
Solution
data = re.findall(r'>\w+', str)
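To see the pattern at work, here is a minimal sketch. The sample text is made up for illustration, since the exercise's own data is not shown here, and the variable names are assumptions.

```python
import re

# Hypothetical sample data; the exercise's real input is not shown here.
text = ">apple\n> pear\n>42nd street\n>-dash\nno marker"

# ">" must be followed directly by one or more word characters
# (\w = letter, digit, or underscore).
matches = re.findall(r'>\w+', text)
print(matches)  # ['>apple', '>42nd']
```

Note that re.findall returns the matched fragments rather than whole lines, and that each match stops at the first character that is not a word character.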
Exercise 6-b
Write a regex so that the full email addresses are extracted.
i.e.: mike@protonmail.com
Hint 1
One way to approach this problem is:
1- include everything that’s non-space before the “@” sign
2- add the “@” sign
3- include everything that’s non-space after the “@” sign
This example shows the versatility of regex: with this pattern you catch the emails regardless of their suffix (.co.uk, .gov.fr, .co.jp etc.).
Hint 2
The regular expression for everything except whitespace is:
\S
: Non-whitespace characters
Hint 3
By combining +
with \S
you can match one or more non-whitespace characters
i.e.: \S+
Solution
regex = r'\S+@\S+'
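As a quick check, the pattern can be tried on a short string. This is only a sketch: the sample sentence and the second address are invented for illustration.

```python
import re

# Hypothetical sample text, not taken from the exercise data.
text = "Contact mike@protonmail.com or anna@example.co.uk for details"

# One or more non-whitespace characters, "@", then one or more again.
regex = r'\S+@\S+'
emails = re.findall(regex, text)
print(emails)  # ['mike@protonmail.com', 'anna@example.co.uk']
```

Because \S matches any non-whitespace character, punctuation that touches an address (such as a trailing comma) would be captured as well, so real-world data may need a stricter pattern.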
Note: The part inside the quotes is the regex itself. The r in front marks a raw string: in r'text here', Python keeps backslashes exactly as typed, which avoids conflicts such as backslash misinterpretation in regex patterns or in directory paths.
Thinking of r as "raw" can help you remember what it does.
It’s good practice to use raw strings for regex patterns; if you type the string without the r, backslashes will be treated as the start of escape sequences.
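The difference is easiest to see with \b, which means "backspace" in an ordinary Python string but "word boundary" in a regex. The snippet below is a small illustration with a made-up test string.

```python
import re

# In a normal string, '\b' is a single backspace character.
print(len('\b'))   # 1
# In a raw string, r'\b' is two characters: a backslash and the letter b.
print(len(r'\b'))  # 2

text = "cat concat"  # hypothetical sample string

# Raw string: the regex engine receives \b and treats it as a word boundary.
raw_hits = re.findall(r'\bcat\b', text)
# Normal string: the pattern contains literal backspace characters instead.
plain_hits = re.findall('\bcat\b', text)

print(raw_hits)    # ['cat']
print(plain_hits)  # []
```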
Exercise 6-c
This time write a regex to get only the part of the email before the “@” sign and include the “@” sign.
i.e.: only the mike@ part from mike@protonmail.com
Hint 1
One way to approach this problem is:
1- include everything that’s non-space before the “@” sign
2- add the “@” sign
Hint 2
The regular expression for everything except whitespace is:
\S
: Non-whitespace characters
Hint 3
By combining +
with \S
you can match one or more non-whitespace characters
i.e.: \S+
Solution
regex = r'\S+@'
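A short sketch of this pattern in use, again on an invented sample string:

```python
import re

# Hypothetical sample text, not taken from the exercise data.
text = "Contact mike@protonmail.com or anna@example.co.uk"

# One or more non-whitespace characters, ending with the "@" sign.
regex = r'\S+@'
prefixes = re.findall(regex, text)
print(prefixes)  # ['mike@', 'anna@']
```

\S+ is greedy, so it first runs to the end of the address and then backtracks to the last "@" it can end on, which is why each match stops right after the "@" sign.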