String split Examples
This page was last reviewed on Feb 23, 2023.
Dot Net Perls
Split. Often strings are read in from lines of a file. And these lines have many parts, separated by delimiters. We use split() to break them apart.
Regex. Split in Java treats its delimiter argument as a regular expression. A single character (like a comma) can be split upon. Or a more complex pattern (with character classes) can be used.
A simple example. Let's begin with this example. We introduce a string that has 2 commas in it, separating 3 strings (cat, dog, bird). We split on a comma.
Return: Split returns a String array. We then loop over that array's elements with a for-each loop. We display them.

public class Program {
    public static void main(String[] args) {
        // This string has 3 words separated by commas.
        String value = "cat,dog,bird";
        // Split on a comma.
        String[] parts = value.split(",");
        // Display result parts.
        for (String part : parts) {
            System.out.println(part);
        }
    }
}

cat
dog
bird
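Because the delimiter is a regular expression, characters such as a period have a special meaning. The sketch below shows one way to split on a literal delimiter with Pattern.quote. The class name LiteralSplit and the helper splitLiteral are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class LiteralSplit {
    // Hypothetical helper: treats the delimiter as literal text, not as a regex.
    public static String[] splitLiteral(String value, String delimiter) {
        return value.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String value = "cat.dog.bird";
        // An unescaped dot matches every character, so all pieces are empty
        // and the trailing empty strings are dropped.
        System.out.println(value.split(".").length); // 0
        // Quoting the delimiter splits on the literal period.
        System.out.println(Arrays.toString(splitLiteral(value, "."))); // [cat, dog, bird]
    }
}
```

Escaping by hand, as in split("\\."), gives the same result. Pattern.quote is convenient when the delimiter comes from a variable.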
Split lines in file. Here we use BufferedReader and FileReader to read in a text file. Then, while looping over it, we split each line. In this way we parse a CSV file with split.
Detail: Finally we use the System.out.println method to display each part from each line to the screen.

carrot,squash,turnip
potato,spinach,kale

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class Program {
    public static void main(String[] args) throws IOException {
        // Open this file; try-with-resources closes the reader for us.
        try (BufferedReader reader = new BufferedReader(
                new FileReader("C:\\programs\\file.txt"))) {
            // Read lines from file until readLine returns null.
            String line;
            while ((line = reader.readLine()) != null) {
                // Split line on comma.
                String[] parts = line.split(",");
                for (String part : parts) {
                    System.out.println(part);
                }
                // Blank line after each line's parts.
                System.out.println();
            }
        }
    }
}

carrot
squash
turnip

potato
spinach
kale
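Real CSV lines sometimes have empty fields. By default split() drops trailing empty strings, which can change the column count. The sketch below keeps them with a negative limit. The class EmptyFields, the helper splitKeepingEmpty, and the sample line are assumptions made for illustration.

```java
import java.util.Arrays;

public class EmptyFields {
    // Hypothetical helper: a negative limit keeps trailing empty strings.
    public static String[] splitKeepingEmpty(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        String line = "carrot,,turnip,,";
        // Default split: the two trailing empty fields are removed.
        System.out.println(line.split(",").length); // 3
        // Negative limit: all five fields are kept.
        System.out.println(splitKeepingEmpty(line).length); // 5
        System.out.println(Arrays.toString(splitKeepingEmpty(line))); // [carrot, , turnip, , ]
    }
}
```

This only handles plain comma-separated text. Quoted fields that contain commas need a real CSV parser.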
Either character. Often data is inconsistent. Sometimes we need to split on a range or set of characters. With split, this is possible. Here we split on either a comma or a colon.

Tip: With square brackets, we specify the possible characters to split upon. So we split on all colons and commas, with one call.

public class Program {
    public static void main(String[] args) {
        String line = "carrot:orange,apple:red";
        // Split on comma or colon.
        String[] parts = line.split("[,:]");
        for (String part : parts) {
            System.out.println(part);
        }
    }
}

carrot
orange
apple
red
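Splitting on both characters at once flattens the data into one array. When the pairs need to stay grouped, one option is to split twice: first on the comma, then on the colon. This is a minimal sketch; the class PairSplit and the helper parsePairs are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PairSplit {
    // Hypothetical helper: turns "key:value,key:value" into an ordered map.
    public static Map<String, String> parsePairs(String line) {
        Map<String, String> result = new LinkedHashMap<>();
        // Outer split separates the pairs.
        for (String pair : line.split(",")) {
            // Inner split separates key from value (at most 2 parts).
            String[] keyValue = pair.split(":", 2);
            if (keyValue.length == 2) {
                result.put(keyValue[0], keyValue[1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parsePairs("carrot:orange,apple:red")); // {carrot=orange, apple=red}
    }
}
```

Pairs without a colon are skipped here. Whether to skip or reject them is a design choice that depends on the data.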
Count, separate words. We can use more advanced character patterns in split. Here we separate a String based on non-word characters. We use "\W+" to mean this.
Detail: The pattern means "one or more non-word characters." A plus means "one or more" and a W means non-word.

Note: The comma and its following space are treated as a single delimiter. So two characters are matched as one delimiter.

public class Program {
    public static void main(String[] args) {
        String line = "hello, how are you?";
        // Split on 1+ non-word characters.
        String[] words = line.split("\\W+");
        // Count words.
        System.out.println(words.length);
        // Display words.
        for (String word : words) {
            System.out.println(word);
        }
    }
}

4
hello
how
are
you
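One detail to watch when counting words this way: if the string begins with a delimiter, the result array starts with an empty string, so the length is one too high. The sketch below filters out empty pieces before counting. The class WordCount and the helper countWords are hypothetical names.

```java
import java.util.Arrays;

public class WordCount {
    // Hypothetical helper: counts only the non-empty pieces.
    public static long countWords(String line) {
        return Arrays.stream(line.split("\\W+"))
                .filter(word -> !word.isEmpty())
                .count();
    }

    public static void main(String[] args) {
        String line = "  hello, how are you?";
        // The leading spaces produce an empty first element.
        System.out.println(line.split("\\W+").length); // 5
        // Filtering out the empty element gives the expected count.
        System.out.println(countWords(line)); // 4
    }
}
```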
Numbers. This example splits a string apart and then uses parseInt to convert those parts into ints. It splits on a two-char sequence. Then in a loop, it calls parseInt on each String.
public class Program {
    public static void main(String[] args) {
        String line = "1, 2, 3";
        // Split on two-char sequence.
        String[] numbers = line.split(", ");
        // Display numbers.
        for (String number : numbers) {
            int value = Integer.parseInt(number);
            System.out.println(value + " * 20 = " + value * 20);
        }
    }
}

1 * 20 = 20
2 * 20 = 40
3 * 20 = 60
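The two-char delimiter works only when the spacing is exactly one space after each comma. If the spacing varies, parseInt throws a NumberFormatException on parts such as " 3". A possible approach, sketched below, is to let the pattern absorb optional whitespace around the comma. The class ParseNumbers and the helper parseInts are hypothetical names.

```java
public class ParseNumbers {
    // Hypothetical helper: parses comma-separated ints with any spacing.
    // Assumes every part is a valid int; otherwise parseInt throws.
    public static int[] parseInts(String line) {
        // Trim the ends, then split on a comma with optional whitespace around it.
        String[] parts = line.trim().split("\\s*,\\s*");
        int[] values = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            values[i] = Integer.parseInt(parts[i]);
        }
        return values;
    }

    public static void main(String[] args) {
        int sum = 0;
        for (int value : parseInts(" 1,2 ,  3 ")) {
            sum += value;
        }
        System.out.println(sum); // 6
    }
}
```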
Limit. Split accepts an optional second parameter, a limit int. If we provide a positive limit, the result array has (at most) that many elements. Any extra parts remain part of the last element.

Info: The delimiter is a Regex whether or not a limit is given. Here we escape the vertical bar, which otherwise means "or" in a Regex, so it is treated like a normal char.

Here: We get the first 2 parts split apart correctly, and the third part has all the remaining (unsplit) parts.

public class Program {
    public static void main(String[] args) {
        String value = "a|b|c|d|e";
        // Use limit of just 3 parts.
        // ... Escape the bar for a Regex.
        String[] parts = value.split("\\|", 3);
        // Only 3 elements are in the result array.
        for (String part : parts) {
            System.out.println(part);
        }
    }
}

a
b
c|d|e
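A common use for the limit is separating a key from a value that may itself contain the delimiter. The sketch below uses a limit of 2 so that only the first equals sign splits the string. The class KeyValueSplit, the helper splitKeyValue, and the sample entry are assumptions for illustration.

```java
public class KeyValueSplit {
    // Hypothetical helper: splits on the first '=' only.
    public static String[] splitKeyValue(String entry) {
        return entry.split("=", 2);
    }

    public static void main(String[] args) {
        String[] parts = splitKeyValue("filter=color=red");
        System.out.println(parts[0]); // filter
        System.out.println(parts[1]); // color=red
    }
}
```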
Pattern.compile, split. A split method is available on the Pattern class, found in java.util.regex. We can compile a Pattern and reuse it many times. This can enhance performance.
Note: Calling Pattern.compile once means the Regex is not compiled again on each split() call. But this only helps if many splits are done.

import java.util.regex.Pattern;

public class Program {
    public static void main(String[] args) {
        // Separate based on number delimiters.
        Pattern p = Pattern.compile("\\d+");
        String value = "abc100defgh9ij";
        String[] elements = p.split(value);
        // Display our results.
        for (String element : elements) {
            System.out.println(element);
        }
    }
}

abc
defgh
ij
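A compiled Pattern also offers splitAsStream (Java 8 and later), which yields the parts as a Stream instead of an array. The sketch below reuses one Pattern stored in a static field and transforms the parts in a pipeline. The class SplitStream and the helper upperCaseParts are hypothetical names.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class SplitStream {
    // Compiled once and reused by every call.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Hypothetical helper: splits on runs of digits and uppercases each part.
    public static List<String> upperCaseParts(String value) {
        return DIGITS.splitAsStream(value)
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(upperCaseParts("abc100defgh9ij")); // [ABC, DEFGH, IJ]
    }
}
```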
Benchmark, pattern split. We can improve the speed of splitting strings based on regular expressions by using Pattern.compile. We create a delimiter pattern. Then we call split() with it.
Version 1: This version of the code uses Pattern split(): it reuses the same Pattern instance many times.

Version 2: This code uses split() with a Regex argument, so it does not reuse the same Regex.

Result: When many Strings are split, calling Pattern.compile once and then using the Pattern's split method is faster in this test.

import java.util.regex.Pattern;

public class Program {
    public static void main(String[] args) {
        // ... Create a delimiter pattern.
        Pattern pattern = Pattern.compile("\\W+");
        String line = "cat; dog--ABC";

        long t1 = System.currentTimeMillis();

        // Version 1: use split method on Pattern.
        for (int i = 0; i < 1000000; i++) {
            String[] values = pattern.split(line);
            if (values.length != 3) {
                System.out.println(false);
            }
        }

        long t2 = System.currentTimeMillis();

        // Version 2: use String split method.
        for (int i = 0; i < 1000000; i++) {
            String[] values = line.split("\\W+");
            if (values.length != 3) {
                System.out.println(false);
            }
        }

        long t3 = System.currentTimeMillis();

        // ... Benchmark results.
        System.out.println(t2 - t1);
        System.out.println(t3 - t2);
    }
}

471 ms, Pattern split
549 ms, String split
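Timings like these depend on the machine and on JIT warm-up, and the version that runs first can be at a disadvantage. The sketch below is one way to make the comparison a little more even: it warms up both versions before timing and uses System.nanoTime. The class SplitTiming and the helper timeNanos are hypothetical names, and the iteration counts are arbitrary. A dedicated harness would be more reliable than any hand-written loop.

```java
import java.util.function.Function;
import java.util.regex.Pattern;

public class SplitTiming {
    private static final Pattern DELIMITER = Pattern.compile("\\W+");

    // Hypothetical helper: times repeated calls to one splitting strategy.
    public static long timeNanos(Function<String, String[]> splitter,
                                 String line, int iterations) {
        long checksum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            checksum += splitter.apply(line).length;
        }
        long elapsed = System.nanoTime() - start;
        // Reading the checksum keeps the loop's results in use.
        if (checksum < 0) {
            throw new IllegalStateException("unexpected checksum");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        String line = "cat; dog--ABC";
        Function<String, String[]> viaPattern = s -> DELIMITER.split(s);
        Function<String, String[]> viaString = s -> s.split("\\W+");

        // Warm up both versions before measuring.
        timeNanos(viaPattern, line, 100_000);
        timeNanos(viaString, line, 100_000);

        // Measure and report in milliseconds.
        System.out.println(timeNanos(viaPattern, line, 1_000_000) / 1_000_000 + " ms, Pattern split");
        System.out.println(timeNanos(viaString, line, 1_000_000) / 1_000_000 + " ms, String split");
    }
}
```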
Join. This method combines Strings together—we specify our desired delimiter String. Join is sophisticated. It can handle a String array or individual Strings.
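As a small illustration of split and join together, the sketch below splits a string and joins the parts again with a different delimiter using String.join (Java 8 and later). The class JoinDemo and the helper replaceDelimiter are hypothetical names.

```java
public class JoinDemo {
    // Hypothetical helper: splits on a regex, then joins with a new delimiter.
    public static String replaceDelimiter(String value, String oldDelimiterRegex,
                                          String newDelimiter) {
        String[] parts = value.split(oldDelimiterRegex);
        // String.join accepts a String array.
        return String.join(newDelimiter, parts);
    }

    public static void main(String[] args) {
        System.out.println(replaceDelimiter("cat,dog,bird", ",", "|")); // cat|dog|bird
        // String.join also accepts individual Strings.
        System.out.println(String.join("-", "a", "b", "c")); // a-b-c
    }
}
```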
Word count. We can count the words in a string by splitting the string on non-word (or space) characters. This is not the fastest method, but it tends to be a fairly accurate one.
With split, we use a regular expression-based pattern. But for simple cases, we provide the delimiter itself as the pattern. This too works. Split is elegant and powerful.
This page was last updated on Feb 23, 2023.
© 2007-2024 Sam Allen.