Home
Search
Python re.sub, subn MethodsUse the re.sub and re.subn methods to invoke a method for matching strings.
Re.sub. In regular expressions, sub stands for substitution. The re.sub method applies a method to all matches. It evaluates a pattern, and for each match calls a method (or lambda).
re.match, search
This method can modify strings in complex ways. We can apply transformations, like change numbers within a string. The syntax can be hard to follow.
This example introduces a method "multiply" that receives a match. It accesses group(0) and converts it into an integer. It multiplies that number by two, and converts it to a string.
Sub In the main program body, we call the re.sub method. The first argument is "\d+" which means one or more digit chars.
And The second argument is the multiply method name. We also pass a sample string for processing.
Result The re.sub method matched each group of digits (each number) and the multiply method doubled it.
Python program that uses re.sub
import re def multiply(m): # Convert group 0 to an integer. v = int(m.group(0)) # Multiply integer by 2. # ... Convert back into string and return it. return str(v * 2) # Use pattern of 1 or more digits. # ... Use multiply method as second argument. result = re.sub("\d+", multiply, "10 20 30 40 50") print(result)
20 40 60 80 100
String. Re.sub can replace a pattern match with a simple string. No method call or lambda is required. Here we replace a pattern with the string "ring."
Python program that uses string replacement
import re # An example string. v = "running eating reading" # Replace words starting with "r" and ending in "ing" # ... with a new string. v = re.sub(r"r.*?ing", "ring", v) print(v)
ring eating ring
Subn. Usually re.sub() is sufficient. But another option exists. The re.subn method has an extra feature. It returns a tuple with a count of substitutions in the second element.
Tuple
Tip If you must know the number of substitutions made by re.sub, using re.subn is an ideal choice.
However If your program has no use of this information, using re.sub is probably best. It is simpler and more commonly used.
Python program that calls re.subn
import re def add(m): # Convert. v = int(m.group(0)) # Add 2. return str(v + 1) # Call re.subn. result = re.subn("\d+", add, "1 2 3 4 5") print("Result string:", result[0]) print("Number of substitutions:", result[1])
Result string: 11 21 31 41 51 Number of substitutions: 5
Lambda. A def method name can be used in re.sub. But a lambda offers a more terse alternative. Here we specify a lambda expression directly within the re.sub argument list.
Here We add the string "ing" to the end of all words within the input string. Additional logic could be used to make the results better.
Tip A gerund form of a verb cannot be made this way all the time. Sometimes other spelling changes are needed.
Python program that uses re.sub, lambda
import re # The input string. input = "laugh eat sleep think" # Use lambda to add "ing" to all words. result = re.sub("\w+", lambda m: m.group(0) + "ing", input) # Display result. print(result)
laughing eating sleeping thinking
Dictionary example. The re.sub method can be used with a dictionary. In the method provided to re.sub, we access a dictionary to influence our action.
Here We replace all known "plant" strings with the string PLANT. On other words, modify() takes no action.
Python program that uses re.sub with dictionary
import re plants = {"flower": 1, "tree": 1, "grass": 1} def modify(m): v = m.group(0) # If string is in dictionary, return different string. if v in plants: return "PLANT" # Do not change anything. return v # Modify to remove all strings within the dictionary. result = re.sub("\w+", modify, "bird flower dog fish tree") print(result)
bird PLANT dog fish PLANT
A summary. Re.sub, and its friend re.subn, can replace substrings in arbitrary ways. A method can test the contents of a match and change it using any algorithm.
And with a pattern, we can specify nearly any textual sequence to match. We can change a string to any other string (with a sufficient algorithm).
Home
© 2007-2022 sam allen.
see site info on the changelog.