Remove HTML tags
When developing in Swift 5.8, we may need to remove HTML markup from a string to enable further processing. This can be done with a loop.
By tracking when an angle bracket opens and closes, we can detect markup tags. This allows us to write a simple markup-removing function.
Consider this example and the stripHtml function. The function receives a String and returns a new String containing all data except the HTML tags.
Step 1: We loop over the characters of the source String and append each one that is not inside a markup tag. In newer versions of Swift, this can be done directly.
Step 2: We create a new String from the collected characters and return it.

func stripHtml(source: String) -> String {
    var data = [Character]()
    var inside = false
    // Step 1: loop over string, and append chars not inside markup tags starting and ending with brackets.
    for c in source {
        if c == "<" {
            inside = true
            continue
        }
        if c == ">" {
            inside = false
            continue
        }
        if !inside {
            data.append(c)
        }
    }
    // Step 2: return new string.
    return String(data)
}

// Use the strip html function on this string.
let input = "<p>Hello <b>world</b>!</p>"
let result = stripHtml(source: input)
print(input)
print(result)

<p>Hello <b>world</b>!</p>
Hello world!
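The simple parser we created removes the HTML markup from our Swift string. Note that this approach can fail on comments that contain HTML tags: the first closing bracket inside the comment ends the "inside" state, so the rest of the comment text leaks into the result.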
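To make the comment limitation concrete, the sketch below repeats the same bracket-tracking loop and runs it on a comment that contains a tag. It then shows one possible workaround: a hypothetical helper named stripHtmlSkippingComments (not part of the original example) that skips whole `<!-- ... -->` spans before applying the bracket logic. It assumes comments are well formed and is not a full HTML parser.

```swift
// Same bracket-tracking loop as the stripHtml example.
func stripHtml(source: String) -> String {
    var data = [Character]()
    var inside = false
    for c in source {
        if c == "<" { inside = true; continue }
        if c == ">" { inside = false; continue }
        if !inside { data.append(c) }
    }
    return String(data)
}

// Hypothetical variant (an assumption, not from the original example):
// skip entire "<!-- ... -->" comments, then apply the same bracket logic.
func stripHtmlSkippingComments(source: String) -> String {
    let chars = Array(source)
    let open = Array("<!--")
    let close = Array("-->")
    var data = [Character]()
    var inside = false
    var i = 0
    while i < chars.count {
        // If a comment starts here, jump past its closing "-->".
        if chars[i...].starts(with: open) {
            var j = i + open.count
            while j < chars.count && !chars[j...].starts(with: close) {
                j += 1
            }
            // An unterminated comment swallows the rest of the input.
            i = min(j + close.count, chars.count)
            continue
        }
        let c = chars[i]
        i += 1
        if c == "<" { inside = true; continue }
        if c == ">" { inside = false; continue }
        if !inside { data.append(c) }
    }
    return String(data)
}

let commented = "<p>Hi<!-- <b>x</b> --> there</p>"
print(stripHtml(source: commented))                  // Hix -- there
print(stripHtmlSkippingComments(source: commented))  // Hi there
```

The first call leaks "x --" because the closing bracket of the inner tag ends the "inside" state early. The variant avoids that at the cost of copying the string into an array for indexed access.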
Swift provides string-processing abilities, like looping over Characters, that can be used to remove HTML tags. Other similar functions can convert strings, for example with the ROT13 cipher.
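As an illustration of the same character-by-character style applied to conversion, here is a minimal ROT13 sketch. The function name rot13 and its ASCII-only handling are assumptions made for this example.

```swift
// Hypothetical helper: rotate ASCII letters by 13 places and
// leave every other character unchanged.
func rot13(_ source: String) -> String {
    var result = ""
    for c in source {
        // Non-ASCII characters pass through untouched.
        guard let ascii = c.asciiValue else {
            result.append(c)
            continue
        }
        let base: UInt8
        switch ascii {
        case 65...90: base = 65     // "A" through "Z"
        case 97...122: base = 97    // "a" through "z"
        default:
            // Digits, punctuation and whitespace are kept as-is.
            result.append(c)
            continue
        }
        let rotated = (ascii - base + 13) % 26 + base
        result.append(Character(UnicodeScalar(rotated)))
    }
    return result
}

let encoded = rot13("Hello world!")
print(encoded)          // Uryyb jbeyq!
print(rot13(encoded))   // Hello world!
```

Applying the function twice returns the original text, since 13 is half of the 26-letter alphabet.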