Regular Expressions interview Q&As for Java developers

Q1 What’s the difference between a wildcard and a regular expression?
A1 A wildcard is a generic term for something that can stand in for all possibilities. In computing, a simple “wildcard” is usually just a * that matches zero or more characters, possibly along with a ? that matches any single character.

A regular expression (aka regex) can do everything a wildcard can, and is a much more powerful pattern matcher because it lets you restrict the type and number of characters matched.

To match something like 123-456, 123-245, etc.

A wildcard does

A regular expression does

Here “\” is the escape character, “\d” means a digit, and {3} means exactly 3 times.
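As a rough Java sketch of this difference: the patterns used here, the glob ???-??? and the regex \d{3}-\d{3}, are assumptions based on the description above, and the class and method names are made up for illustration.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.regex.Pattern;

class WildcardVsRegex {

    // Regex (assumed pattern): exactly three digits, a hyphen, exactly three digits.
    static final Pattern REGEX = Pattern.compile("\\d{3}-\\d{3}");

    // Wildcard/glob (assumed pattern): any three characters, a hyphen, any three characters.
    static final PathMatcher GLOB =
            FileSystems.getDefault().getPathMatcher("glob:???-???");

    static boolean matchesRegex(String s) {
        return REGEX.matcher(s).matches();
    }

    static boolean matchesWildcard(String s) {
        // Glob matchers work on paths, so the text is treated as a file name.
        return GLOB.matches(Paths.get(s));
    }

    public static void main(String[] args) {
        for (String s : new String[] {"123-456", "abc-def"}) {
            System.out.println(s + " regex=" + matchesRegex(s)
                    + " wildcard=" + matchesWildcard(s));
        }
    }
}
```

The wildcard accepts both inputs because ? places no restriction on the kind of character, whereas the regex accepts only the one made of digits.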

The Unix command shown below uses the wildcard character “?” to find file names like job1.log, job2.log, job3.log, etc, and the grep command uses a regex to find contents within a file like “123-456”.

The output of the above command is

Q2 Why is regex very powerful? Where can we use them?
A2 Regexes are very powerful and are widely used in JavaScript, Unix scripting, development tools, monitoring tools like Tivoli, and programming languages. So, it really pays to have a good knowledge of regexes. You can use regex tools like regexpal.com or RegexBuddy to craft your regular expressions.

#1. Java APIs take regexes as arguments. For example, String.split can split a string on “,” with 0 or more leading and trailing spaces.
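A minimal sketch of such a split, assuming String.split is the API in question; the sample text and the class and method names are made up for illustration.

```java
import java.util.Arrays;

class SplitOnComma {

    // Split on a comma surrounded by zero or more whitespace characters.
    static String[] splitCsv(String line) {
        return line.split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        String[] parts = splitCsv("apple , banana,cherry ,  date");
        System.out.println(Arrays.toString(parts)); // [apple, banana, cherry, date]
    }
}
```

Note that this only removes the spaces around each comma; whitespace at the very start or end of the whole line would still need a trim().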

#2. Tools like Notepad++ provide regex based find and find/replace. The following comma separated text

can be converted to quoted CSV text as shown below.

Now let’s see how to use the Notepad++ find/replace function with regular expressions to add quotes around each entry, as shown below.

regex in Notepad++


As you can see:

The “Find what” regular expression is ([^,]*)(,?), which means 0 or more characters other than “,” captured as the first group (\1), followed by 0 or 1 “,” (i.e. an optional comma) captured as the second group (\2).

The “Replace with” expression is "\1"\2, which emits a double quote, then \1 (the captured value, like BA555), then a closing double quote, and finally the optional “,” held in \2.

Similar approaches can be used for formatting other bulk records like removing spaces, adding quotes, replacing new line characters with tabs, replacing tabs with new lines, replacing commas with new lines, etc.
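The same find/replace idea can be sketched in Java with String.replaceAll, where groups are referenced as $1 and $2 instead of \1 and \2. This is only an illustration: the sample values other than BA555 are invented, and the first group uses + instead of *, because with * Java's replaceAll also matches the empty string at the end of the input and appends an extra pair of quotes.

```java
class QuoteCsvEntries {

    // Wrap every comma separated entry in double quotes.
    // Group 1 is the entry itself, group 2 is the optional comma after it.
    static String quote(String line) {
        return line.replaceAll("([^,]+)(,?)", "\"$1\"$2");
    }

    public static void main(String[] args) {
        System.out.println(quote("BA555,CX123,DE789"));
        // "BA555","CX123","DE789"
    }
}
```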

#3. Notepad++ with regex to construct SQL clause or any other data conversion.

Say you have data as shown below. This data could come from an Excel spreadsheet, a Word document, or be copied from a Confluence or wiki page.

The SQL we need is:

This is how you go about converting it:

Step 1: Copy the data to Notepad++ and delete the header row by highlighting it and pressing the delete button.

Step 2: You now need to remove all the columns except the rule_name column. To do this, place the cursor to the left of the first two columns, hold down the “ALT + SHIFT” keys together, highlight the columns you want to remove with the mouse, and delete them. Do the same for the last column as well.

Notepad++ crop columns


Step 3: The next step is to remove any leading or trailing spaces. Use the regex based find and replace command. Pressing CTRL+F will bring up the Find dialog. You can also select it from the “Search” menu at the top.

In the pop-up Find dialog, select the “Replace” tab. Enter the find and replace values as shown below. Make sure the “Regular expression” option and the “Wrap around” check box are ticked.

Notepad++ regex


Step 4: Remove the new line characters or carriage return by finding and replacing with the “Extended …” option turned on as shown below.


Replace new line with nothing.

Step 5: You need to put a single quote (') around each entry for the SQL query. Regex comes to the rescue again.

The parentheses “( )” capture values, and \1 and \2 refer to the two captured groups: \1 is a value like “asx100_rule” and \2 is the “,” that follows it. A single quote (') is added before \1 and before \2, so each value ends up wrapped in quotes. The * means 0 or more, and ? means 0 or 1.

You can now take the single line text and put it in your where clause. This is very handy when you have to work with larger data.
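For comparison, steps 3 to 5 can be sketched as three regex replacements in Java. This assumes each line already ends with a comma, as the \2 group above suggests; the rule names other than asx100_rule, and the class and method names, are made up for illustration.

```java
class SqlInClause {

    // Turn a column of comma terminated values into a quoted, single line list.
    static String toInList(String column) {
        return column
                // Step 3: strip leading and trailing spaces/tabs on every line.
                .replaceAll("(?m)^[ \\t]+|[ \\t]+$", "")
                // Step 4: replace line breaks with nothing.
                .replaceAll("\\R", "")
                // Step 5: put single quotes around every entry.
                .replaceAll("([^,]+)(,?)", "'$1'$2");
    }

    public static void main(String[] args) {
        String column = "  asx100_rule,  \n  asx200_rule,\n asx300_rule  ";
        System.out.println("rule_name IN (" + toInList(column) + ")");
        // rule_name IN ('asx100_rule','asx200_rule','asx300_rule')
    }
}
```

For values that come from users rather than a trusted spreadsheet, a parameterised query would be the safer choice than pasting text into the where clause.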

#4. Harnessing the power of Unix scripts and regex.

For example, if you have a pipe delimited text as shown below

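You can remove the last pipe with the regex “|$”, where “$” anchors the match to the end of the line. “-e” means expression.

Remove the starting pipe with the regex “^|”, where “^” anchors the match to the start of the line. “g” means global change.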

To break the commands across multiple lines, use “|”.
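If the same clean-up is needed inside a Java program, a rough equivalent looks like the sketch below. The sample text is made up, and note one difference from the Unix expressions above: in Java regex the pipe is the alternation operator, so it has to be escaped as \| to match a literal pipe.

```java
class StripPipes {

    // Remove one trailing pipe, then one leading pipe, from a delimited line.
    static String strip(String line) {
        return line
                .replaceAll("\\|$", "")   // trailing pipe at the end of the text
                .replaceAll("^\\|", "");  // leading pipe at the start of the text
    }

    public static void main(String[] args) {
        System.out.println(strip("|a|b|c|")); // a|b|c
    }
}
```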

#5. Validating user input in Java and JavaScript with regex. For example, you can validate that the name entered is alphanumeric.
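A minimal Java sketch of such a check, assuming “alphanumeric” means ASCII letters and digits only with at least one character; the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

class NameValidator {

    // One or more ASCII letters or digits, and nothing else.
    private static final Pattern ALPHANUMERIC = Pattern.compile("[A-Za-z0-9]+");

    static boolean isAlphanumeric(String name) {
        // matches() requires the whole input to fit the pattern.
        return name != null && ALPHANUMERIC.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAlphanumeric("Java8"));      // true
        System.out.println(isAlphanumeric("John Smith")); // false, contains a space
    }
}
```

Compiling the pattern once as a constant avoids recompiling it on every call, which matters if the validation runs for every request.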

#6. XSLT 2.0 makes text handling in XML documents a whole lot easier with its regular expression support.

#7. Regex can be used in Maven pom.xml files.

There are a lot more examples.

