How can we count words in a String? Words are delimited by spaces and some types of punctuation. The easiest way to count words is with a regular expression. We use a Regex with a pattern that matches non-word characters.
First, this program imports the System.Text.RegularExpressions namespace. This allows us to use the Regex type directly in the CountWords function. In CountWords, we use the Regex.Matches shared function with the pattern "\S+".
Regex pattern:This means each match consists of one or more non-whitespace characters.
Program that counts words: VB.NET Imports System.Text.RegularExpressions Module Module1 Sub Main() ' Count words in this string. Dim value As String = "To be or not to be, that is the question." Dim count1 As Integer = CountWords(value) ' Count words again. value = "Mary had a little lamb." Dim count2 As Integer = CountWords(value) ' Display counts. Console.WriteLine(count1) Console.WriteLine(count2) End Sub ''' <summary> ''' Use regular expression to count words. ''' </summary> Public Function CountWords(ByVal value As String) As Integer ' Count matches. Dim collection As MatchCollection = Regex.Matches(value, "\S+") Return collection.Count End Function End Module Output 10 5
Testing the function. In the Main subroutine, we test the CountWords Function. We use two String literals as arguments to CountWords. Then, we write the word counts of these String literals that were received.
Results:The first sentence, from Shakespeare's play Hamlet, has ten words. The second sentence, from a nursery rhyme, has five words.
The critical issue with word-counting functions is their closeness to standard functionality found in programs like Microsoft Word. In most schools, for example, the count from Microsoft Word is acceptable.
Note:The function here was run through a battery of tests against Microsoft Word. It was found to have a difference of 0.022%.Word Count
We implemented a word count function in the VB.NET language using the Regex type. By using the Count property on the MatchCollection, we can count the number of words in a way that is close to other programs such as Microsoft Word.