ChatGPT解决这个技术问题 Extra ChatGPT

Is there an alternative to string.Replace that is case-insensitive?

I need to search a string and replace all occurrences of %FirstName% and %PolicyAmount% with a value pulled from a database. The problem is the capitalization of FirstName varies. That prevents me from using the String.Replace() method. I've seen web pages on the subject that suggest

Regex.Replace(strInput, strToken, strReplaceWith, RegexOptions.IgnoreCase);

However for some reason when I try and replace %PolicyAmount% with $0, the replacement never takes place. I assume that it has something to do with the dollar sign being a reserved character in regex.

Is there another method I can use that doesn't involve sanitizing the input to deal with regex special characters?

If "$0" is the variable going in that doesn't impact the regex at all.
As Markus points out, it appears "modern" versions of .NET now have this baked in with good ole StringComparison.OrdinalIgnoreCase as a third parameter.

N
Neuron

Seems like string.Replace should have an overload that takes a StringComparison argument. Since it doesn't, you could try something like this:

public static string ReplaceString(string str, string oldValue, string newValue, StringComparison comparison)
{
    StringBuilder sb = new StringBuilder();

    int previousIndex = 0;
    int index = str.IndexOf(oldValue, comparison);
    while (index != -1)
    {
        sb.Append(str.Substring(previousIndex, index - previousIndex));
        sb.Append(newValue);
        index += oldValue.Length;

        previousIndex = index;
        index = str.IndexOf(oldValue, index, comparison);
    }
    sb.Append(str.Substring(previousIndex));

    return sb.ToString();
}

Nice. I would change ReplaceString to Replace.
Agree with the comments above. This can be made into an extension method with the same method name. Just pop it in a static class with the method signature: public static string Replace(this String str, string oldValue, string newValue, StringComparison comparison)
@Helge, in general, that may be fine, but I have to take arbitrary strings from the user and can not risk the input being meaningful to regex. Of course, I guess I could write a loop and put a backslash in front of each and every character... At that point, I might as well do the above (IMHO).
While unit testing this I ran into the case where it would never return when oldValue == newValue == "".
This is buggy; ReplaceString("œ", "oe", "", StringComparison.InvariantCulture) throws ArgumentOutOfRangeException.
T
Todd White

From MSDN
$0 - "Substitutes the last substring matched by group number number (decimal)."

In .NET Regular expressions group 0 is always the entire match. For a literal $ you need to

string value = Regex.Replace("%PolicyAmount%", "%PolicyAmount%", @"$$0", RegexOptions.IgnoreCase);

in this particular case this is fine, but in cases where the strings are input from outside, one cannot be sure that they do not contain characters which mean something special in regular expressions
You should escape special characters like this: string value = Regex.Replace("%PolicyAmount%", Regex.Escape("%PolicyAmount%"), Regex.Escape("$0"), RegexOptions.IgnoreCase);
Please watch out when using Regex.Escape in Regex.Replace. You'll have to escape all of the three strings passed and call Regex.Unescape on the result!
According to msdn: "Character escapes are recognized in regular expression patterns but not in replacement patterns." ( msdn.microsoft.com/en-us/library/4edbef7e.aspx )
It's best to use: string value = Regex.Replace("%PolicyAmount%", Regex.Escape("%PolicyAmount%"), "$0".Replace("$", "$$"), RegexOptions.IgnoreCase); as replacement recognizes only dolar signs.
C
Community

Kind of a confusing group of answers, in part because the title of the question is actually much larger than the specific question being asked. After reading through, I'm not sure any answer is a few edits away from assimilating all the good stuff here, so I figured I'd try to sum.

Here's an extension method that I think avoids the pitfalls mentioned here and provides the most broadly applicable solution.

public static string ReplaceCaseInsensitiveFind(this string str, string findMe,
    string newValue)
{
    return Regex.Replace(str,
        Regex.Escape(findMe),
        Regex.Replace(newValue, "\\$[0-9]+", @"$$$0"),
        RegexOptions.IgnoreCase);
}

So...

This is an extension method @MarkRobinson

This doesn't try to skip Regex @Helge (you really have to do byte-by-byte if you want to string sniff like this outside of Regex)

Passes @MichaelLiu 's excellent test case, "œ".ReplaceCaseInsensitiveFind("oe", ""), though he may have had a slightly different behavior in mind.

Unfortunately, @HA 's comment that you have to Escape all three isn't correct. The initial value and newValue doesn't need to be.

Note: You do, however, have to escape $s in the new value that you're inserting if they're part of what would appear to be a "captured value" marker. Thus the three dollar signs in the Regex.Replace inside the Regex.Replace [sic]. Without that, something like this breaks...

"This is HIS fork, hIs spoon, hissssssss knife.".ReplaceCaseInsensitiveFind("his", @"he$0r")

Here's the error:

An unhandled exception of type 'System.ArgumentException' occurred in System.dll

Additional information: parsing "The\hisr\ is\ he\HISr\ fork,\ he\hIsr\ spoon,\ he\hisrsssssss\ knife\." - Unrecognized escape sequence \h.

Tell you what, I know folks that are comfortable with Regex feel like their use avoids errors, but I'm often still partial to byte sniffing strings (but only after having read Spolsky on encodings) to be absolutely sure you're getting what you intended for important use cases. Reminds me of Crockford on "insecure regular expressions" a little. Too often we write regexps that allow what we want (if we're lucky), but unintentionally allow more in (eg, Is $10 really a valid "capture value" string in my newValue regexp, above?) because we weren't thoughtful enough. Both methods have value, and both encourage different types of unintentional errors. It's often easy to underestimate complexity.

That weird $ escaping (and that Regex.Escape didn't escape captured value patterns like $0 as I would have expected in replacement values) drove me mad for a while. Programming Is Hard (c) 1842


r
rboarman

Here's an extension method. Not sure where I found it.

public static class StringExtensions
{
    public static string Replace(this string originalString, string oldValue, string newValue, StringComparison comparisonType)
    {
        int startIndex = 0;
        while (true)
        {
            startIndex = originalString.IndexOf(oldValue, startIndex, comparisonType);
            if (startIndex == -1)
                break;

            originalString = originalString.Substring(0, startIndex) + newValue + originalString.Substring(startIndex + oldValue.Length);

            startIndex += newValue.Length;
        }

        return originalString;
    }

}

You may need to handle empty/null string cases.
Mutiple errors in this solution: 1. Check originalString, oldValue and newValue for null. 2. Do not give orginalString back (does not work, simple types are not passed by reference), but assign the value of orginalValue first to a new string and modify it and give it back.
D
DeveloperDan

Seems the easiest method is simply to use the Replace method that ships with .Net and has been around since .Net 1.0:

string res = Microsoft.VisualBasic.Strings.Replace(res, 
                                   "%PolicyAmount%", 
                                   "$0", 
                                   Compare: Microsoft.VisualBasic.CompareMethod.Text);

In order to use this method, you have to add a Reference to the Microsoft.VisualBasic assemblly. This assembly is a standard part of the .Net runtime, it is not an extra download or marked as obsolete.


It works. You need to add a reference to the Microsoft.VisualBasic assembly.
Strange that this method had some problems when I used it (characters at the beginning of line went missing). The most popular answer here from C. Dragon 76 worked as expected.
The problem with this is it returns a NEW string even if a replacement isn't made, where the string.replace( ) returns a pointer to the same string. Can get inefficient if you're doing something like a form letter merge.
Brain2000, you are wrong. All strings in .NET are immutable.
Der_Meister, whilst what you say is correct, that doesn't make what Brain2000 said wrong.
K
Karl Glennon
    /// <summary>
    /// A case insenstive replace function.
    /// </summary>
    /// <param name="originalString">The string to examine.(HayStack)</param>
    /// <param name="oldValue">The value to replace.(Needle)</param>
    /// <param name="newValue">The new value to be inserted</param>
    /// <returns>A string</returns>
    public static string CaseInsenstiveReplace(string originalString, string oldValue, string newValue)
    {
        Regex regEx = new Regex(oldValue,
           RegexOptions.IgnoreCase | RegexOptions.Multiline);
        return regEx.Replace(originalString, newValue);
    }

Which is better way ? what's about stackoverflow.com/a/244933/206730 ? better performance?
V
Vitoc

Inspired by cfeduke's answer, I made this function which uses IndexOf to find the old value in the string and then replaces it with the new value. I used this in an SSIS script processing millions of rows, and the regex-method was way slower than this.

public static string ReplaceCaseInsensitive(this string str, string oldValue, string newValue)
{
    int prevPos = 0;
    string retval = str;
    // find the first occurence of oldValue
    int pos = retval.IndexOf(oldValue, StringComparison.InvariantCultureIgnoreCase);

    while (pos > -1)
    {
        // remove oldValue from the string
        retval = retval.Remove(pos, oldValue.Length);

        // insert newValue in it's place
        retval = retval.Insert(pos, newValue);

        // check if oldValue is found further down
        prevPos = pos + newValue.Length;
        pos = retval.IndexOf(oldValue, prevPos, StringComparison.InvariantCultureIgnoreCase);
    }

    return retval;
}

+1 for not using regex when its not necessary. Sure, you use a few more lines of code, but its much more efficient than regex-based replace unless you need the $ functionality.
C
Community

Expanding on C. Dragon 76's popular answer by making his code into an extension that overloads the default Replace method.

public static class StringExtensions
{
    public static string Replace(this string str, string oldValue, string newValue, StringComparison comparison)
    {
        StringBuilder sb = new StringBuilder();

        int previousIndex = 0;
        int index = str.IndexOf(oldValue, comparison);
        while (index != -1)
        {
            sb.Append(str.Substring(previousIndex, index - previousIndex));
            sb.Append(newValue);
            index += oldValue.Length;

            previousIndex = index;
            index = str.IndexOf(oldValue, index, comparison);
        }
        sb.Append(str.Substring(previousIndex));
        return sb.ToString();
     }
}

M
Mark Cranness

Based on Jeff Reddy's answer, with some optimisations and validations:

public static string Replace(string str, string oldValue, string newValue, StringComparison comparison)
{
    if (oldValue == null)
        throw new ArgumentNullException("oldValue");
    if (oldValue.Length == 0)
        throw new ArgumentException("String cannot be of zero length.", "oldValue");

    StringBuilder sb = null;

    int startIndex = 0;
    int foundIndex = str.IndexOf(oldValue, comparison);
    while (foundIndex != -1)
    {
        if (sb == null)
            sb = new StringBuilder(str.Length + (newValue != null ? Math.Max(0, 5 * (newValue.Length - oldValue.Length)) : 0));
        sb.Append(str, startIndex, foundIndex - startIndex);
        sb.Append(newValue);

        startIndex = foundIndex + oldValue.Length;
        foundIndex = str.IndexOf(oldValue, startIndex, comparison);
    }

    if (startIndex == 0)
        return str;
    sb.Append(str, startIndex, str.Length - startIndex);
    return sb.ToString();
}

M
Markus Hartmair

Since .NET Core 2.0 or .NET Standard 2.1 respectively, this is baked into the .NET runtime [1]:

"hello world".Replace("World", "csharp", StringComparison.CurrentCultureIgnoreCase); // "hello csharp"

[1] https://docs.microsoft.com/en-us/dotnet/api/system.string.replace#System_String_Replace_System_String_System_String_System_StringComparison_


A
Allanrbo

a version similar to C. Dragon's, but for if you only need a single replacement:

int n = myText.IndexOf(oldValue, System.StringComparison.InvariantCultureIgnoreCase);
if (n >= 0)
{
    myText = myText.Substring(0, n)
        + newValue
        + myText.Substring(n + oldValue.Length);
}

B
Brandon

Here is another option for executing Regex replacements, since not many people seem to notice the matches contain the location within the string:

    public static string ReplaceCaseInsensative( this string s, string oldValue, string newValue ) {
        var sb = new StringBuilder(s);
        int offset = oldValue.Length - newValue.Length;
        int matchNo = 0;
        foreach (Match match in Regex.Matches(s, Regex.Escape(oldValue), RegexOptions.IgnoreCase))
        {
            sb.Remove(match.Index - (offset * matchNo), match.Length).Insert(match.Index - (offset * matchNo), newValue);
            matchNo++;
        }
        return sb.ToString();
    }

Could you explain why you're multiplying by MatchNo?
If there is a difference in length between the oldValue and newValue, the string will get longer or shorter as you replace values. match.Index refers to the original location within the string, we need to adjust for that positions movement due to our replacement. Another approach would be to execute the Remove/Insert from right to left.
I get that. That's what the "offset" variable is for. What I don't understand is why you are multiplying by matchNo. My intuition tells me that the location of a match within a string would have no relation to the actual count of previous occurrences.
Never mind, I get it now. The offset needs to be scaled based on the # of occurrences. If you are losing 2 characters each time you need to do a replace, you need to account for that when computing the parameters to the remove method
J
Joel Coehoorn
Regex.Replace(strInput, strToken.Replace("$", "[$]"), strReplaceWith, RegexOptions.IgnoreCase);

This doesn't work. The $ is not in the token. It's in the strReplace With string.
And you can't adapt it for that?
This site is supposed to be a repository for correct answers. Not answers that are almost correct.
c
cfeduke

The regular expression method should work. However what you can also do is lower case the string from the database, lower case the %variables% you have, and then locate the positions and lengths in the lower cased string from the database. Remember, positions in a string don't change just because its lower cased.

Then using a loop that goes in reverse (its easier, if you do not you will have to keep a running count of where later points move to) remove from your non-lower cased string from the database the %variables% by their position and length and insert the replacement values.


By reverse, I mean process the found locations in reverse from furthest to shortest, not traverse the string from the database in reverse.
You could, or you could just use the Regex :)
F
Fredrik Johansson

(Since everyone is taking a shot at this). Here's my version (with null checks, and correct input and replacement escaping) ** Inspired from around the internet and other versions:

using System;
using System.Text.RegularExpressions;

public static class MyExtensions {
    public static string ReplaceIgnoreCase(this string search, string find, string replace) {
        return Regex.Replace(search ?? "", Regex.Escape(find ?? ""), (replace ?? "").Replace("$", "$$"), RegexOptions.IgnoreCase);          
    }
}

Usage:

var result = "This is a test".ReplaceIgnoreCase("IS", "was");

S
Simon Hewitt

Let me make my case and then you can tear me to shreds if you like.

Regex is not the answer for this problem - too slow and memory hungry, relatively speaking.

StringBuilder is much better than string mangling.

Since this will be an extension method to supplement string.Replace, I believe it important to match how that works - therefore throwing exceptions for the same argument issues is important as is returning the original string if a replacement was not made.

I believe that having a StringComparison parameter is not a good idea. I did try it but the test case originally mentioned by michael-liu showed a problem:-

[TestCase("œ", "oe", "", StringComparison.InvariantCultureIgnoreCase, Result = "")]

Whilst IndexOf will match, there is a mismatch between the length of the match in the source string (1) and oldValue.Length (2). This manifested itself by causing IndexOutOfRange in some other solutions when oldValue.Length was added to the current match position and I could not find a way around this. Regex fails to match the case anyway, so I took the pragmatic solution of only using StringComparison.OrdinalIgnoreCase for my solution.

My code is similar to other answers but my twist is that I look for a match before going to the trouble of creating a StringBuilder. If none is found then a potentially large allocation is avoided. The code then becomes a do{...}while rather than a while{...}

I have done some extensive testing against other Answers and this came out fractionally faster and used slightly less memory.

    public static string ReplaceCaseInsensitive(this string str, string oldValue, string newValue)
    {
        if (str == null) throw new ArgumentNullException(nameof(str));
        if (oldValue == null) throw new ArgumentNullException(nameof(oldValue));
        if (oldValue.Length == 0) throw new ArgumentException("String cannot be of zero length.", nameof(oldValue));

        var position = str.IndexOf(oldValue, 0, StringComparison.OrdinalIgnoreCase);
        if (position == -1) return str;

        var sb = new StringBuilder(str.Length);

        var lastPosition = 0;

        do
        {
            sb.Append(str, lastPosition, position - lastPosition);

            sb.Append(newValue);

        } while ((position = str.IndexOf(oldValue, lastPosition = position + oldValue.Length, StringComparison.OrdinalIgnoreCase)) != -1);

        sb.Append(str, lastPosition, str.Length - lastPosition);

        return sb.ToString();
    }