VB.NET Word Count

How can we count words in a String? Words are delimited by spaces and other whitespace. The easiest way to count words is with a regular expression. We use a Regex with a pattern that matches runs of non-whitespace characters, and each run is counted as one word.

Example

First, this program imports the System.Text.RegularExpressions namespace. This allows us to use the Regex type directly in the CountWords function. In CountWords, we use the Regex.Matches shared function with the pattern "\S+".

Regex pattern: This means each match consists of one or more non-whitespace characters.
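
To see what the pattern matches, the short sketch below prints each match from a sample string. The module name MatchDemo and the sample text are illustrative only and are not part of the original program.

Program that prints each match: VB.NET

Imports System
Imports System.Text.RegularExpressions

Module MatchDemo
    Sub Main()
        ' Each match is one run of non-whitespace characters.
        For Each m As Match In Regex.Matches("to be, or not", "\S+")
            Console.WriteLine("[" & m.Value & "]")
        Next
    End Sub
End Module

Output

[to]
[be,]
[or]
[not]

The comma stays attached to "be" because it is not whitespace. Punctuation that touches a word is therefore counted as part of that word, not as a separate one.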

Program that counts words: VB.NET

Imports System.Text.RegularExpressions

Module Module1
    Sub Main()
        ' Count words in this string.
        Dim value As String = "To be or not to be, that is the question."
        Dim count1 As Integer = CountWords(value)

        ' Count words again.
        value = "Mary had a little lamb."
        Dim count2 As Integer = CountWords(value)

        ' Display counts.
        Console.WriteLine(count1)
        Console.WriteLine(count2)
    End Sub

    ''' <summary>
    ''' Use regular expression to count words.
    ''' </summary>
    Public Function CountWords(ByVal value As String) As Integer
        ' Count matches.
        Dim collection As MatchCollection = Regex.Matches(value, "\S+")
        Return collection.Count
    End Function
End Module

Output

10
5

Testing the function. In the Main subroutine, we test the CountWords Function. We pass two String literals as arguments to CountWords. Then, we write the word count returned for each one to the console.

Results: The first sentence, from Shakespeare's play Hamlet, has ten words. The second sentence, from a nursery rhyme, has five words.
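
The two test strings are well-formed sentences. As a rough sketch of how the same approach behaves on less tidy input, the program below tries an empty string, extra whitespace and a hyphenated term. The module name EdgeCaseDemo and the test strings are illustrative assumptions. CountWords is repeated here only so the sketch compiles on its own.

Program that tests edge cases: VB.NET

Imports System
Imports System.Text.RegularExpressions

Module EdgeCaseDemo
    Sub Main()
        ' An empty string has no matches.
        Console.WriteLine(CountWords(""))
        ' Leading, trailing and repeated whitespace is ignored.
        Console.WriteLine(CountWords("  Mary   had a lamb  "))
        ' A hyphenated term contains no whitespace, so it is one match.
        Console.WriteLine(CountWords("well-known fact"))
    End Sub

    ' Same approach as the CountWords function in the article.
    Function CountWords(ByVal value As String) As Integer
        Return Regex.Matches(value, "\S+").Count
    End Function
End Module

Output

0
4
2

Passing Nothing instead of a String makes Regex.Matches throw an ArgumentNullException, so callers that may receive Nothing should check for it first.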

Discussion

The critical issue with a word-counting function is how closely its results agree with the word count in widely used programs like Microsoft Word. In most schools, for example, the count from Microsoft Word is acceptable.

Note: The function here was run through a battery of tests against Microsoft Word. It was found to have a difference of 0.022%.
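
The test data behind that figure is not reproduced here. As a rough cross-check that needs no other software, the sketch below compares the Regex count with a count obtained from String.Split on whitespace. The names CompareDemo and SplitCount are hypothetical, and this comparison is an assumption of the sketch, not the test method described in the note.

Program that compares two counting methods: VB.NET

Imports System
Imports System.Text.RegularExpressions

Module CompareDemo
    Sub Main()
        Dim value As String = "To be or not to be, that is the question."
        ' Both lines should print the same number for this input.
        Console.WriteLine(Regex.Matches(value, "\S+").Count)
        Console.WriteLine(SplitCount(value))
    End Sub

    ' Hypothetical helper: count tokens separated by whitespace.
    Function SplitCount(ByVal value As String) As Integer
        ' An empty separator array makes Split use whitespace as the delimiter.
        Dim parts As String() = value.Split(New Char() {}, StringSplitOptions.RemoveEmptyEntries)
        Return parts.Length
    End Function
End Module

Output

10
10

When two independent methods agree on a sample, it is more likely that a remaining difference from another program comes from how that program defines a word, not from a bug in the counting code.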

Summary

We implemented a word count function in the VB.NET language using the Regex type. By using the Count property on the MatchCollection, we can count the number of words in a way that is close to other programs such as Microsoft Word.

