I often use Java's built-in java.util.Scanner to retrieve information from string streams. What I was not able to do with it is extract only the tokens that match a predefined regex while consuming the character sequence until the end.
With the built-in Scanner, the regex is used only to define the delimiters: for example, tokens are separated by whitespace such as a space, \t, or \n.
I could not tell it to retrieve well-defined information, for example 123a4567-e89b-12d3-a456-123442445670, especially when it sits inside text mixed with other tokens.
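To make the delimiter-based behaviour concrete, here is a minimal sketch using only java.util.Scanner with its default whitespace delimiter. The class name DelimiterScannerDemo and the helper tokens are made up for illustration. Every chunk between delimiters comes back as a token, so the labels and the colon are returned alongside the value you actually want.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Illustrative helper class (hypothetical name), not part of any library.
class DelimiterScannerDemo {

    // Collects every token java.util.Scanner returns for the given text.
    // With the default delimiter, a token is whatever sits between runs of whitespace.
    static List<String> tokens(String text) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints: [uuid, :, 6d0a3538-9760-41ae-965d-7aad70021f81, next, line]
        System.out.println(tokens("uuid : 6d0a3538-9760-41ae-965d-7aad70021f81\nnext line"));
    }
}
```

Picking out only the UUID from that list would require filtering each token by hand.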
With RegexScanner, all I have to do now is provide the regex, and I get all the matching tokens from the given text.
Example
// Assumes JUnit 4 (org.junit.Test, org.junit.Assert) and java.util.ArrayList, Arrays, List.
@Test
public void testUUID() {
    // Matches the 8-4-4-4-12 lowercase hexadecimal layout of a UUID.
    final String regex = "[0-9abcdef]{8}(-[0-9abcdef]{4}){3}-[0-9abcdef]{12}";
    final String text = "uuid : 6d0a3538-9760-41ae-965d-7aad70021f81\n" +
            "uuid : d7d97fb3-3676-4109-9a94-7acc5f593ace\n" +
            "uuid : 02e87dd3-10ff-43cf-9572-bd9d151bb439\n" +
            "uuid : 632a4c31-8dfe-43a3-8f8d-15b472292cc9";
    final List<String> expectedUUIDs = Arrays.asList("6d0a3538-9760-41ae-965d-7aad70021f81",
            "d7d97fb3-3676-4109-9a94-7acc5f593ace",
            "02e87dd3-10ff-43cf-9572-bd9d151bb439",
            "632a4c31-8dfe-43a3-8f8d-15b472292cc9");
    final List<String> foundUUIDs = new ArrayList<>();
    final RegexScanner regexScanner = new RegexScanner(text, regex);
    // Collect every token that matches the regex, in order of appearance.
    while (regexScanner.hasNext()) {
        foundUUIDs.add(regexScanner.next());
    }
    Assert.assertArrayEquals(expectedUUIDs.toArray(), foundUUIDs.toArray());
}
This way, the while loop consumes the scanner until the end of the text is reached, which means all the matching tokens are read and processed.
It is also possible to provide a function that maps each found token to another object, using the next(Function<String, R> mapper) method.
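As a rough illustration of the mapper variant, the sketch below shows one way such a scanner could be built on java.util.regex.Matcher. It is not the implementation from the repository: the class name RegexScannerSketch and its internals are assumptions, and only the hasNext(), next() and next(mapper) shapes are taken from the text above.

```java
import java.util.NoSuchElementException;
import java.util.UUID;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the author's RegexScanner: the same idea built on Matcher.find().
class RegexScannerSketch {
    private final Matcher matcher;
    private boolean lookedAhead; // true when find() was called and its result is not consumed yet
    private boolean hasMatch;    // result of the pending find()

    public RegexScannerSketch(String text, String regex) {
        this.matcher = Pattern.compile(regex).matcher(text);
    }

    // Returns true if another match exists; safe to call several times in a row.
    public boolean hasNext() {
        if (!lookedAhead) {
            hasMatch = matcher.find();
            lookedAhead = true;
        }
        return hasMatch;
    }

    // Returns the next matching token and moves past it.
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException("no more matches");
        }
        lookedAhead = false;
        return matcher.group();
    }

    // Returns the next matching token converted by the given mapper.
    public <R> R next(Function<String, R> mapper) {
        return mapper.apply(next());
    }

    public static void main(String[] args) {
        String regex = "[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}";
        RegexScannerSketch scanner =
                new RegexScannerSketch("uuid : 6d0a3538-9760-41ae-965d-7aad70021f81", regex);
        while (scanner.hasNext()) {
            // Map each token straight to a java.util.UUID instead of a String.
            UUID uuid = scanner.next(UUID::fromString);
            System.out.println(uuid);
        }
    }
}
```

Passing the mapper to next keeps the conversion next to the scanning loop, so the caller never handles the intermediate String.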
The code is available on GitHub via this link.