
How to replace case-insensitive literal substrings in Java

Using the method replace(CharSequence target, CharSequence replacement) in String, how can I make the target case-insensitive?

For example, the way it works right now:

String target = "FooBar";
target.replace("Foo", "") // would return "Bar"

String target = "fooBar";
target.replace("Foo", "") // would return "fooBar"

How can I make it so replace (or if there is a more suitable method) is case-insensitive so that both examples return "Bar"?


Community
String target = "FOOBar";
target = target.replaceAll("(?i)foo", "");
System.out.println(target);

Output:

Bar

It's worth mentioning that replaceAll treats the first argument as a regex pattern, which can cause unexpected results. To solve this, also use Pattern.quote as suggested in the comments.
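To make the Pattern.quote advice concrete, here is a minimal sketch. The class and the helper name replaceIgnoreCase are made up for illustration (they are not a JDK API); only Pattern.quote, Matcher.quoteReplacement and String.replaceAll are real library calls. It assumes non-null arguments.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LiteralIgnoreCaseReplace {

    // Hypothetical helper: replaces every case-insensitive occurrence of a
    // literal search string. Arguments are assumed to be non-null.
    static String replaceIgnoreCase(String source, String search, String replacement) {
        // Pattern.quote makes regex metacharacters in `search` match literally.
        String regex = "(?i)" + Pattern.quote(search);
        // Matcher.quoteReplacement keeps '$' and '\' in `replacement` literal.
        return source.replaceAll(regex, Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(replaceIgnoreCase("fooBar", "Foo", ""));             // Bar
        // Quoted: the dot is a literal dot, so nothing matches here.
        System.out.println(replaceIgnoreCase("Sentence!End", "sentence.", "")); // Sentence!End
        // Unquoted: the dot matches '!', so more is removed than intended.
        System.out.println("Sentence!End".replaceAll("(?i)sentence.", ""));     // End
    }
}
```

Note that a plain "(?i)" only folds case for ASCII letters, so this sketch alone does not cover characters like á.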


What if target contains $ or diacritical characters like á?
I mean two things: 1. "blÁÜ123".replaceAll("(?i)bláü", "") does not replace anything. 2. "Sentence!End".replaceAll("(?i)Sentence.", "") may replace more than anticipated.
You can't turn a string into a regex that matches it that simply. It's not correct in general; it works only for specific cases.
Use Pattern.quote() to protect the search string from being interpreted as a regex. This does not address the Unicode quirks listed above, but should be fine for basic character sets, e.g. target.replaceAll("(?i)"+Pattern.quote("foo"), "");
Just making sure. Pattern.quote("foo") is not necessary if the string is "foo" right? Only if it is something more fancy, right?
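For the diacritics case raised in the comments above, a small sketch of the difference between the "(?i)" and "(?iu)" inline flags (the u flag turns on Unicode-aware case folding, the same thing as Pattern.UNICODE_CASE). The class and method names are made up for illustration, and the strings use Unicode escapes so the example does not depend on the source file encoding.

```java
import java.util.regex.Pattern;

final class UnicodeCaseDemo {

    // Hypothetical helper: removes `search` using ASCII-only case folding.
    static String removeAsciiOnly(String source, String search) {
        return source.replaceAll("(?i)" + Pattern.quote(search), "");
    }

    // Hypothetical helper: removes `search` using Unicode-aware case folding.
    static String removeUnicodeAware(String source, String search) {
        return source.replaceAll("(?iu)" + Pattern.quote(search), "");
    }

    public static void main(String[] args) {
        String source = "bl\u00C1\u00DC123"; // "blÁÜ123"
        String search = "bl\u00E1\u00FC";    // "bláü"
        System.out.println(removeAsciiOnly(source, search));    // unchanged: blÁÜ123
        System.out.println(removeUnicodeAware(source, search)); // 123
    }
}
```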
Hovercraft Full Of Eels

If you don't care about case, then perhaps it doesn't matter if the result comes back all upper case:

target.toUpperCase().replace("FOO", "");

You can also pass a Locale into toUpperCase(locale) if you're dealing with characters like á.
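A quick sketch of what the upper-casing approach actually returns. The class and helper names are made up for illustration, and using Locale.ROOT for locale-independent behaviour is my assumption rather than something the answer specifies. The main point is that the whole result is upper-cased, so you get "BAR" rather than the "Bar" the question asks for.

```java
import java.util.Locale;

final class UpperCaseReplaceDemo {

    // Hypothetical helper: upper-cases both strings with the given locale,
    // then does a plain literal replace. The original casing is lost.
    static String removeIgnoringCase(String source, String search, Locale locale) {
        return source.toUpperCase(locale).replace(search.toUpperCase(locale), "");
    }

    public static void main(String[] args) {
        System.out.println(removeIgnoringCase("fooBar", "Foo", Locale.ROOT)); // BAR
        System.out.println(removeIgnoringCase("FooBar", "Foo", Locale.ROOT)); // BAR

        // Why the locale matters: in Turkish, 'i' upper-cases to dotted 'İ' (U+0130).
        System.out.println("title".toUpperCase(Locale.forLanguageTag("tr"))); // TİTLE
        System.out.println("title".toUpperCase(Locale.ROOT));                 // TITLE
    }
}
```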
ilmassa

Regular expressions are quite complex to manage because some characters are reserved: for example, "foo.bar".replaceAll(".", "") produces an empty string, because the dot means "any character". To replace only the literal dot, pass "\\." as the pattern.
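The dot pitfall is easy to reproduce. This is only an illustration; the class and the two helper names are made up, while replaceAll and replace are the standard String methods.

```java
final class RegexDotDemo {

    // Hypothetical helper: removes every match of a regex.
    static String removeRegex(String source, String regex) {
        return source.replaceAll(regex, "");
    }

    // Hypothetical helper: removes every occurrence of a literal string.
    static String removeLiteral(String source, String literal) {
        return source.replace(literal, "");
    }

    public static void main(String[] args) {
        // An unescaped dot matches every character, so everything is removed.
        System.out.println("[" + removeRegex("foo.bar", ".") + "]"); // []
        // An escaped dot matches only the literal '.'.
        System.out.println(removeRegex("foo.bar", "\\."));           // foobar
        // String.replace never interprets its target as a regex.
        System.out.println(removeLiteral("foo.bar", "."));           // foobar
    }
}
```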

A simpler solution is to use StringBuilder objects to search and replace text. It takes two: one holds a lowercase copy of the text and the other holds the original. The search runs on the lowercase copy, and each index found there is used to replace the same range in the original as well.

public class LowerCaseReplace 
{
    public static String replace(String source, String target, String replacement)
    {
        StringBuilder sbSource = new StringBuilder(source);
        StringBuilder sbSourceLower = new StringBuilder(source.toLowerCase());
        String searchString = target.toLowerCase();

        int idx = 0;
        while((idx = sbSourceLower.indexOf(searchString, idx)) != -1) {
            sbSource.replace(idx, idx + searchString.length(), replacement);
            sbSourceLower.replace(idx, idx + searchString.length(), replacement);
            idx+= replacement.length();
        }
        sbSourceLower.setLength(0);
        sbSourceLower.trimToSize();
        sbSourceLower = null;

        return sbSource.toString();
    }


    public static void main(String[] args)
    {
        System.out.println(replace("xXXxyyyXxxuuuuoooo", "xx", "**"));
        System.out.println(replace("FOoBaR", "bar", "*"));
    }
}

Works great! Note that "target" must not be null. Clearing sbSourceLower should not be necessary (any more).
Thanks for the concise solution and thanks to @msteiger for the correction. I wonder why nobody has added a similar solution to a well-known library like Guava or Apache Commons.
Is this better (in performance) than the regex-based solutions?
This function is good and easy to understand.
Danubian Sailor

Not as elegant perhaps as other approaches but it's pretty solid and easy to follow, esp. for people newer to Java. One thing that gets me about the String class is this: It's been around for a very long time and while it supports a global replace with regexp and a global replace with Strings (via CharSequences), that last doesn't have a simple boolean parameter: 'isCaseInsensitive'. Really, you'd've thought that just by adding that one little switch, all the trouble its absence causes for beginners especially could have been avoided. Now on JDK 7, String still doesn't support this one little addition!

Well anyway, I'll stop griping. For everyone in particular newer to Java, here's your cut-and-paste deus ex machina. As I said, not as elegant and won't win you any slick coding prizes, but it works and is reliable. Any comments, feel free to contribute. (Yes, I know, StringBuffer is probably a better choice of managing the two character string mutation lines, but it's easy enough to swap the techniques.)

public String replaceAll(String findtxt, String replacetxt, String str, 
        boolean isCaseInsensitive) {
    if (str == null) {
        return null;
    }
    if (findtxt == null || findtxt.length() == 0) {
        return str;
    }
    if (findtxt.length() > str.length()) {
        return str;
    }
    int counter = 0;
    String thesubstr = "";
    while ((counter < str.length()) 
            && (str.substring(counter).length() >= findtxt.length())) {
        thesubstr = str.substring(counter, counter + findtxt.length());
        if (isCaseInsensitive) {
            if (thesubstr.equalsIgnoreCase(findtxt)) {
                str = str.substring(0, counter) + replacetxt 
                    + str.substring(counter + findtxt.length());
                // Failing to increment counter by replacetxt.length() leaves you open
                // to an infinite-replacement loop scenario: Go to replace "a" with "aa" but
                // increment counter by only 1 and you'll be replacing 'a's forever.
                counter += replacetxt.length();
            } else {
                counter++; // No match so move on to the next character from
                           // which to check for a findtxt string match.
            }
        } else {
            if (thesubstr.equals(findtxt)) {
                str = str.substring(0, counter) + replacetxt 
                    + str.substring(counter + findtxt.length());
                counter += replacetxt.length();
            } else {
                counter++;
            }
        }
    }
    return str;
}

This method is very slow, as its complexity is O(size_str * size_findtext).
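On the performance point, here is one possible variant, sketched under my own assumptions rather than taken from the answer: it scans the source once and appends to a StringBuilder, comparing in place with String.regionMatches(ignoreCase, ...). The class and method names are made up. Each position may still compare up to search.length() characters, but the repeated substring and string concatenation calls are gone.

```java
final class RegionMatchesReplace {

    // Hypothetical helper: case-insensitive literal replace in a single pass.
    // `replacement` is assumed to be non-null.
    static String replaceIgnoreCase(String source, String search, String replacement) {
        if (source == null || search == null || search.isEmpty()) {
            return source; // nothing to search in, or nothing to search for
        }
        int len = search.length();
        StringBuilder out = new StringBuilder(source.length());
        int i = 0;
        while (i <= source.length() - len) {
            // Compare in place, ignoring case, without creating substrings.
            if (source.regionMatches(true, i, search, 0, len)) {
                out.append(replacement);
                i += len; // skip past the match; the replacement is never rescanned
            } else {
                out.append(source.charAt(i));
                i++;
            }
        }
        // Copy the tail that is too short to contain another match.
        out.append(source, i, source.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceIgnoreCase("fooBar", "Foo", ""));               // Bar
        System.out.println(replaceIgnoreCase("xXXxyyyXxxuuuuoooo", "xx", "**"));  // ****yyy**xuuuuoooo
        System.out.println(replaceIgnoreCase("a", "a", "aa"));                    // aa
    }
}
```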
gouessej

Just make it simple without third party libraries:

    final String source = "FooBar";
    final String target = "Foo";
    final String replacement = "";
    final String result = Pattern
            .compile(target, Pattern.LITERAL | Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE)
            .matcher(source)
            .replaceAll(Matcher.quoteReplacement(replacement));

Michał Piątkowski

For non-ASCII (Unicode) characters:

String result = Pattern.compile("(?i)препарат", 
Pattern.UNICODE_CASE).matcher(source).replaceAll("БАД");

Michael

org.apache.commons.lang3.StringUtils:

public static String replaceIgnoreCase(String text, String searchString, String replacement)

Case insensitively replaces all occurrences of a String within another String.


Community

I like smas's answer that uses replaceAll with a regular expression. If you are going to be doing the same replacement many times, it makes sense to pre-compile the regular expression once:

import java.util.regex.Pattern;

public class Test { 

    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern fooPattern = Pattern.compile("(?i)foo");

    private static String removeFoo(String s) {
        if (s != null) s = fooPattern.matcher(s).replaceAll("");
        return s;
    }

    public static void main(String[] args) {
        System.out.println(removeFoo("FOOBar"));
    }
}

Asaf Magen
String newstring  = "";
String target2 = "fooBar";
newstring = target2.substring("foo".length()).trim();   
logger.debug("target2: {}",newstring); 
// output: target2: Bar
    
String target3 = "FooBar";
newstring = target3.substring("foo".length()).trim();
logger.debug("target3: {}",newstring); 
// output: target3: Bar