
How do I replace multiple spaces with a single space in C#?

How can I replace multiple spaces in a string with only one space in C#?

Example:

1 2 3  4    5

would be:

1 2 3 4 5
A state machine can do it easily, but it's probably overkill if you only need it to remove spaces.
I've added a benchmark on the different ways to do this in a duplicate question stackoverflow.com/a/37592018/582061 . Regex was not the fastest way to do this.
Unless maybe it's a regional thing where people abbreviate "whitespace" as "space", I don't understand why so many answers are seeking to replace anything other than multiple consecutive space (i.e., ' ', '\u0020', '\x20', (char) 32) characters.

Matt

I like to use:

myString = Regex.Replace(myString, @"\s+", " ");

Since it will catch runs of any kind of whitespace (e.g. tabs, newlines, etc.) and replace them with a single space.
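
To make the difference concrete, here is a small sketch contrasting \s+, which collapses any whitespace run, with " {2,}", which only touches runs of plain space characters. The class name, method names, and sample string are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceVsSpaceDemo
{
    // Collapses every run of whitespace (spaces, tabs, newlines, ...) to one space.
    public static string CollapseWhitespace(string s) => Regex.Replace(s, @"\s+", " ");

    // Collapses only runs of two or more U+0020 space characters.
    public static string CollapseSpaces(string s) => Regex.Replace(s, " {2,}", " ");

    public static void Main()
    {
        string sample = "1 2\t3  4\n5";
        Console.WriteLine(CollapseWhitespace(sample)); // 1 2 3 4 5
        // Escape the tab and newline so they are visible in the output.
        Console.WriteLine(CollapseSpaces(sample).Replace("\t", "\\t").Replace("\n", "\\n")); // 1 2\t3 4\n5
    }
}
```

Which of the two is appropriate depends on whether tabs and line breaks in the input should survive.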


Slight modification: Regex.Replace(source, @"(\s)\s+", "$1"); This keeps the first whitespace character of each run, so if you have 5 tabs, it will return a single tab. In case someone prefers this.
@radistao Your link is for Javascript string replace, not for C#.
@Shiva, /\s\s+/ is a standard POSIX regex statement and may be converted/used in any language using own syntax
In the spirit of @F.B.tenKate's solution: Regex.Replace(source, @"(\s)\1+", "$1"); will replace multiple identical consecutive characters by a single one.
To also remove leading and trailing whitespace, combine this with Trim(), like so: myString = Regex.Replace(myString, @"\s+", " ").Trim();
chindirala sampath kumar
string sentence = "This is a sentence with multiple    spaces";
RegexOptions options = RegexOptions.None;
Regex regex = new Regex("[ ]{2,}", options);     
sentence = regex.Replace(sentence, " ");

I copied and pasted that and it works. I really do not like regex, but this time it saved my life.
@Craig a comment would suffice, IMO. // This block replaces multiple spaces with one... :)
Really, RegEx is overkill for this.
@Joel: Can't agree. I'm actually sure that this way is more efficient than yours for large enough strings and can be done in one single line. Where's the overkill?
@Oscar Joel’s code isn’t a simple loop through all characters! It’s a hidden nested loop that has a quadratic worst case. This regular expression, by contrast, is linear, only builds up a single string (= drastically reduced allocation costs compared to Joel’s code) and furthermore the engine can optimize the hell out of it (to be honest, I doubt the .NET regex is smart enough for this but in theory this regular expression can be implemented so cheaply that it’s not even funny any more; it only needs a DFA with three states, one transition each, and no additional information).
tvanfosson
string xyz = "1   2   3   4   5";
xyz = string.Join( " ", xyz.Split( new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries ));

This is more readable than regex. I prefer it because I don't need to learn another syntax.
I like it because it doesn't need Regex.
This would be inefficient for large strings.
This also removes leading and trailing spaces.
I prefer this answer as well. My old mentor used to say "anytime you have a problem you think you need Regex to solve, well...now you've got TWO problems"
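
As a hedged illustration of the trimming side effect mentioned in the comments, the sketch below compares the Split/Join approach with a space-only regex on the same input. The class and method names are hypothetical, and the brackets in the output are there only to make leading and trailing spaces visible.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitJoinDemo
{
    // Split on spaces, drop the empty entries, and rejoin with single spaces.
    public static string SplitJoin(string s) =>
        string.Join(" ", s.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));

    // Regex version that only collapses runs of two or more spaces.
    public static string RegexCollapse(string s) => Regex.Replace(s, " {2,}", " ");

    public static void Main()
    {
        string sample = "  1  2   3  ";
        Console.WriteLine("[" + SplitJoin(sample) + "]");     // [1 2 3]
        Console.WriteLine("[" + RegexCollapse(sample) + "]"); // [ 1 2 3 ]
    }
}
```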
Brenda Bell

I think Matt's answer is the best, but I don't believe it's quite right. If you want to replace newlines, you must use:

myString = Regex.Replace(myString, @"\s+", " ", RegexOptions.Multiline);

RegexOptions.Multiline changes the meaning of ^ and $ so they match the beginning and end of every line ($ = \n), instead of the whole multi-line string. Because \s is equivalent to [ \f\n\r\t\v] the newlines should be replaced even if Multiline option is off.
Matt's answer has already covered this. I 'believe' 30 people just blindly up-voted this answer :)
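
The point made in the comments, that RegexOptions.Multiline has no effect on a pattern without ^ or $, can be checked with a short sketch like the following. The class name, method names, and input string are assumptions for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MultilineOptionDemo
{
    public static string Plain(string s) => Regex.Replace(s, @"\s+", " ");

    public static string WithMultiline(string s) =>
        Regex.Replace(s, @"\s+", " ", RegexOptions.Multiline);

    public static void Main()
    {
        string input = "line one  \r\n   line two";
        Console.WriteLine(Plain(input));                         // line one line two
        Console.WriteLine(Plain(input) == WithMultiline(input)); // True
    }
}
```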
cuongle

Another approach which uses LINQ:

 var list = str.Split(' ').Where(s => !string.IsNullOrWhiteSpace(s));
 str = string.Join(" ", list);

Love this solution! Is there a downside to this in 2022, or why is it not more popular?
Fahim Parkar

It's much simpler than all that:

while(str.Contains("  ")) str = str.Replace("  ", " ");

This will be far less efficient than the regex " {2,}" if the string contains sequences of 3 or more spaces.
@JanGoyvaerts: Even with 10 spaces, the regex was slower when I made a quick and dirty test. That being said, it only takes one giant substring full of spaces to completely kill performance of the while loop. For fairness, I used RegexOptions.Compiled, rather than the slower Regex.Replace.
RegexOptions.Compiled adds a lot of overhead compiling the regex into IL. Don't use it unless your application will use the regex often enough or on large enough strings that the increased matching speed offsets the decreased compilation speed.
This is an example of extremely inefficient code. LOL.
@pcbabu It's not as bad as it seems for many cases. The Replace() method will handle all occurrences of two spaces in a given string, so we're not looping (and re-allocating a whole string) for every instance of paired spaces in the string. One new allocation will handle all of them. We only re-run the loop when there were 3 or more spaces together, which is likely to be a rarer occurrence for many input sources. If you can show it becomes a problem for your data, then go write the state machine to push character by character into a new stringbuilder.
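
A rough way to see how many Replace passes the loop needs is to count them. The helper below is a hypothetical instrumented version of the one-liner above, not part of the original answer.

```csharp
using System;

public static class ReplaceLoopDemo
{
    // Runs the Contains/Replace loop and reports how many Replace passes were needed.
    public static string Collapse(string s, out int passes)
    {
        passes = 0;
        while (s.Contains("  "))
        {
            s = s.Replace("  ", " ");
            passes++;
        }
        return s;
    }

    public static void Main()
    {
        Collapse("a  b  c  d", out int pairsOnly);                  // only double spaces
        Collapse("a" + new string(' ', 8) + "b", out int longRun);  // one run of 8 spaces
        Console.WriteLine(pairsOnly); // 1
        Console.WriteLine(longRun);   // 3
    }
}
```

Each pass roughly halves the longest run, so the number of passes grows with the logarithm of the longest run of spaces, while each pass still allocates a new string.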
ScubaSteve

Regex can be rather slow even with simple tasks. This creates an extension method that can be used off of any string.

    public static class StringExtension
    {
        public static String ReduceWhitespace(this String value)
        {
            var newString = new StringBuilder();
            bool previousIsWhitespace = false;
            for (int i = 0; i < value.Length; i++)
            {
                if (Char.IsWhiteSpace(value[i]))
                {
                    if (previousIsWhitespace)
                    {
                        continue;
                    }

                    previousIsWhitespace = true;
                }
                else
                {
                    previousIsWhitespace = false;
                }

                newString.Append(value[i]);
            }

            return newString.ToString();
        }
    }

It would be used as such:

string testValue = "This contains     too          much  whitespace.";
testValue = testValue.ReduceWhitespace();
// testValue = "This contains too much whitespace."

I like the idea of the extension method although the routine could be optimised.
Jan Goyvaerts
myString = Regex.Replace(myString, " {2,}", " ");

Nolonar

For those who don't like Regex, here is a method that uses a StringBuilder:

    public static string FilterWhiteSpaces(string input)
    {
        if (input == null)
            return string.Empty;

        StringBuilder stringBuilder = new StringBuilder(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (i == 0 || c != ' ' || (c == ' ' && input[i - 1] != ' '))
                stringBuilder.Append(c);
        }
        return stringBuilder.ToString();
    }

In my tests, this method was 16 times faster on average with a very large set of small-to-medium sized strings, compared to a static compiled Regex. Compared to a non-compiled or non-static Regex, this should be even faster.

Keep in mind that it does not remove leading or trailing spaces; it only collapses runs of multiple spaces.


If you want to check if the character is whitespace, and not just a space see my answer below.
Aleks Andreev

This is a shorter version, which should only be used if you are only doing this once, as it creates a new instance of the Regex class every time it is called.

temp = new Regex(" {2,}").Replace(temp, " "); 

If you are not too acquainted with regular expressions, here's a short explanation:

The {2,} quantifier makes the regex match the character preceding it (here, a space) two or more times in a row.
The .Replace(temp, " ") replaces all matches in the string temp with a space.

If you want to use this multiple times, here is a better option, as it compiles the regex to IL once, when the Regex object is constructed, and reuses it on every call:

Regex singleSpacify = new Regex(" {2,}", RegexOptions.Compiled);
temp = singleSpacify.Replace(temp, " ");
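
One common way to make sure the Regex really is constructed only once is to hold it in a static readonly field. This is a minimal sketch under that assumption; the class and member names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SpaceCollapser
{
    // Constructed once per process. With RegexOptions.Compiled the IL-generation
    // cost is paid here, at type initialization, not on every call.
    private static readonly Regex MultipleSpaces = new Regex(" {2,}", RegexOptions.Compiled);

    public static string Collapse(string input) => MultipleSpaces.Replace(input, " ");

    public static void Main()
    {
        Console.WriteLine(Collapse("1 2 3  4    5")); // 1 2 3 4 5
    }
}
```

Whether RegexOptions.Compiled pays off depends on how often the pattern is used; for a handful of calls, the plain constructor is likely cheaper overall.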

ravish.hacker

You can do this with a one-line solution!

string s = "welcome to  london";
s = s.Replace(" ", "()").Replace(")(", "").Replace("()", " ");

You can choose other brackets (or even other characters) if you like.


You have to make sure your string doesn't have "()" or ")(" in it. Or "wel()come to london)(" becomes "wel come to london". You could try using lots of brackets. So use ((((())))) instead of () and )))))((((( instead of )(. It will still work. Still, if the string contains ((((())))) or )))))(((((, this will fail.
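
The failure case described in the comment above can be reproduced with a sketch like this one, which wraps the chained Replace calls in a hypothetical helper method.

```csharp
using System;

public static class BracketTrickDemo
{
    // The three-step replace trick from the answer, wrapped for reuse.
    public static string Collapse(string s) =>
        s.Replace(" ", "()").Replace(")(", "").Replace("()", " ");

    public static void Main()
    {
        Console.WriteLine(Collapse("welcome to  london"));    // welcome to london
        // Input that already contains the marker sequences is silently altered.
        Console.WriteLine(Collapse("wel()come to london)(")); // wel come to london
    }
}
```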
Stephen du Buis

no Regex, no Linq... removes leading and trailing spaces as well as reducing any embedded multiple space segments to one space

string myString = "   0 1 2  3   4               5  ";
myString = string.Join(" ", myString.Split(new char[] { ' ' }, 
StringSplitOptions.RemoveEmptyEntries));

result:"0 1 2 3 4 5"


A word of caution : The use of split, while very simple to understand indeed, can have surprisingly negative performance impact. As many strings could be created, you'll have to watch your memory usage in case you handle large strings with this method.
Jamshaid K.
// My sample string
string str = "hi you           are          a demo";

// Split on spaces and drop the empty entries
var demo = str.Split(' ').Where(s => !string.IsNullOrWhiteSpace(s));

// Join the values back and add a single space in between
str = string.Join(" ", demo);
// output: str == "hi you are a demo"

Jay Bazuzi

Consolidating other answers, per Joel, and hopefully improving slightly as I go:

You can do this with Regex.Replace():

string s = Regex.Replace (
    "   1  2    4 5", 
    @"[ ]{2,}", 
    " "
    );

Or with String.Split():

static class StringExtensions
{
    public static string Join(this IList<string> value, string separator)
    {
        return string.Join(separator, value.ToArray());
    }
}

//...

string s = "     1  2    4 5".Split (
    " ".ToCharArray(), 
    StringSplitOptions.RemoveEmptyEntries
    ).Join (" ");

Jay Bazuzi

I just wrote a new Join that I like, so I thought I'd re-answer, with it:

public static string Join<T>(this IEnumerable<T> source, string separator)
{
    return string.Join(separator, source.Select(e => e.ToString()).ToArray());
}

One of the cool things about this is that it works with collections that aren't strings, by calling ToString() on the elements. Usage is still the same:

//...

string s = "     1  2    4 5".Split (
    " ".ToCharArray(), 
    StringSplitOptions.RemoveEmptyEntries
    ).Join (" ");

why create an extension method? why not just use string.Join()?
The_Black_Smurf

Many answers provide the right output, but for those looking for the best performance, I improved Nolonar's answer (which was the best answer for performance) by about 10%.

public static string MergeSpaces(this string str)
{

    if (str == null)
    {
        return null;
    }
    else
    {
        StringBuilder stringBuilder = new StringBuilder(str.Length);

        int i = 0;
        foreach (char c in str)
        {
            if (c != ' ' || i == 0 || str[i - 1] != ' ')
                stringBuilder.Append(c);
            i++;
        }
        return stringBuilder.ToString();
    }

}

M.Hassan

Use the regex pattern

    [ ]+    #only space

   var text = Regex.Replace(inputString, @"[ ]+", " ");

Paul Easter

I know this is pretty old, but ran across this while trying to accomplish almost the same thing. Found this solution in RegEx Buddy. This pattern will replace all double spaces with single spaces and also trim leading and trailing spaces.

pattern: (?m:^ +| +$|( ){2,})
replacement: $1

It's a little difficult to read since we're dealing with empty space, so here it is again with the "spaces" replaced with a "_".

pattern: (?m:^_+|_+$|(_){2,})  <-- don't use this, just for illustration.

The "(?m:" construct enables the "multi-line" option. I generally like to include whatever options I can within the pattern itself so it is more self contained.
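
For reference, this is roughly how the pattern and replacement above could be wired up in C#. The class and method names are hypothetical, and the brackets in the output only make the trimmed ends visible.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TrimAndCollapseDemo
{
    // Group 1 captures a space only in the "two or more spaces" branch, so "$1"
    // is empty for leading/trailing matches and a single space for interior runs.
    private const string Pattern = @"(?m:^ +| +$|( ){2,})";

    public static string Clean(string input) => Regex.Replace(input, Pattern, "$1");

    public static void Main()
    {
        Console.WriteLine("[" + Clean("  1 2  3   4  ") + "]"); // [1 2 3 4]
    }
}
```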


Learner1947

I can remove whitespaces with this

while word.contains("  ")  //double space
   word = word.Replace("  "," "); //replace double space by single space.
word = word.trim(); //to remove single whitespaces from start & end.

Yes, but you would only replace two whitespaces with one. This would not help with X number of spaces.
The while loop takes care of removing all the double spaces.
In the loop you are replacing space characters, but then with Trim() you are removing all leading and trailing whitespace characters, not just spaces. After fixing that with Trim(' ') there is then the problem that the question never asked for leading and trailing (white)spaces to be removed. After fixing that by removing Trim(' ') entirely...you've now duplicated this old answer. Also, why post almost-C# code that's a few tweaks away from being made valid?
Tom Gullen

Without using regular expressions:

while (myString.IndexOf("  ", StringComparison.CurrentCulture) != -1)
{
    myString = myString.Replace("  ", " ");
}

OK to use on short strings, but will perform badly on long strings with lots of spaces.


Ahmed Aljaff

try this method

private string removeNestedWhitespaces(char[] st)
{
    StringBuilder sb = new StringBuilder(st.Length);
    int indx = 0, length = st.Length;
    while (indx < length)
    {
        // Copy the current character once, whether it is a space or not.
        sb.Append(st[indx]);
        // If it was a space, skip the remaining spaces in the same run.
        if (st[indx] == ' ')
        {
            while (indx + 1 < length && st[indx + 1] == ' ')
                indx++;
        }
        indx++;
    }
    return sb.ToString();
}

use it like this:

string test = removeNestedWhitespaces("1 2 3  4    5".ToCharArray());

This will remove the trailing spaces
Sorry for the mistake. I fixed the code, and now it works as expected. Tested string: " 1 2 3 4 9 " result string: " 1 2 3 4 9 "
Reap

Here is a slight modification of Nolonar's original answer.

To check whether the character is any whitespace, not just a space, use this:

It collapses each run of whitespace characters down to a single character (the first one in the run).

public static string FilterWhiteSpaces(string input)
{
    if (input == null)
        return string.Empty;

    var stringBuilder = new StringBuilder(input.Length);
    for (int i = 0; i < input.Length; i++)
    {
        char c = input[i];
        if (i == 0 || !char.IsWhiteSpace(c) || (char.IsWhiteSpace(c) && 
            !char.IsWhiteSpace(input[i - 1])))
            stringBuilder.Append(c);
    }
    return stringBuilder.ToString();
}

Thanks, this helped me out. Small error: strValue should probably be input. Also, IsWhiteSpace includes line breaking characters. You probably don't want to merge multiple line breaks, if only for the fact it will behave differently based on your environment (\r\n vs \n). In this case check for 'CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.SpaceSeparator'.
@OliverSchimmer that's right, thanks for the correction. The added unicode character info is a great addition. Feel free to make an edit! :)
Isn't this a rewrite of this answer? stackoverflow.com/a/33817748/56621
@AlexfromJitbit, it's actually a modification of my answer, which predates that other answer by about 2.5 years.
@Nolonar yes, and I acknowledge that in my answer, hope that's ok
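
The UnicodeCategory.SpaceSeparator check suggested in the comments could look roughly like this. It is a sketch, not the answer author's code; the class and helper names are hypothetical. Note that tabs and line breaks fall in the Control category, so they are left alone.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class SpaceSeparatorFilter
{
    // True for U+0020 and other "Zs" characters such as U+00A0 (no-break space).
    private static bool IsSpaceSeparator(char c) =>
        CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.SpaceSeparator;

    // Keeps the first space separator of each run and drops the rest.
    public static string Collapse(string input)
    {
        if (input == null)
            return string.Empty;

        var sb = new StringBuilder(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (!IsSpaceSeparator(c) || i == 0 || !IsSpaceSeparator(input[i - 1]))
                sb.Append(c);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // The double line break survives; only the run of spaces is collapsed.
        Console.WriteLine(Collapse("a  b\n\nc") == "a b\n\nc"); // True
    }
}
```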
Demetris Leptos

How about going rogue?

public static string MinimizeWhiteSpace(
    this string _this)
    {
        if (_this != null)
        {
            var returned = new StringBuilder();
            var inWhiteSpace = false;
            var length = _this.Length;
            for (int i = 0; i < length; i++)
            {
                var character = _this[i];
                if (char.IsWhiteSpace(character))
                {
                    if (!inWhiteSpace)
                    {
                        inWhiteSpace = true;
                        returned.Append(' ');
                    }
                }
                else
                {
                    inWhiteSpace = false;
                    returned.Append(character);
                }
            }
            return returned.ToString();
        }
        else
        {
            return null;
        }
    }

Patrick Artner

Mix of StringBuilder and Enumerable.Aggregate() as extension method for strings:

using System;
using System.Linq;
using System.Text;

public static class StringExtension
{
    public static string CondenseSpaces(this string s)
    {
        return s.Aggregate(new StringBuilder(), (acc, c) =>
        {
            if (c != ' ' || acc.Length == 0 || acc[acc.Length - 1] != ' ')
                acc.Append(c);
            return acc;
        }).ToString();
    }

    public static void Main()
    {
        const string input = "     (five leading spaces)     (five internal spaces)     (five trailing spaces)     ";
        
        Console.WriteLine(" Input: \"{0}\"", input);
        Console.WriteLine("Output: \"{0}\"", StringExtension.CondenseSpaces(input));
    }
}

Executing this program produces the following output:

 Input: "     (five leading spaces)     (five internal spaces)     (five trailing spaces)     "
Output: " (five leading spaces) (five internal spaces) (five trailing spaces) "

This is, at first glance, a good and short and straightforward usage of Aggregate(); however, there is a bug in it. Testing acc.Length > 0 clearly prevents an IndexOutOfRange exception for the acc[acc.Length-1] != ' ' condition that follows, but this prevents leading space characters from ever being emitted because acc is empty at that point. I have corrected this to acc.Length == 0 || acc[acc.Length - 1] != ' ' and also expanded the sample code to demonstrate that single and multiple consecutive spaces throughout s are handled correctly.
One optimization you might make is to initialize acc with new StringBuilder(s.Length) since the longest the result string will be — when no replacements are made because s contains no runs of consecutive space characters — is the same length as the input string. Also, I'd suggest a method name like CollapseSpaces() or CondenseSpaces() to more accurately describe what it's doing; "strip" sounds like it's removing all spaces.
@LanceU.Matthews thanks for reading and fixing, you are right. fixed the name.
onedaywhen

Old skool:

string oldText = "   1 2  3   4    5     ";
string newText = oldText
                    .Replace("  ", " " + (char)22 )
                    .Replace( (char)22 + " ", "" )
                    .Replace( (char)22 + "", "" );

Assert.That( newText, Is.EqualTo( " 1 2 3 4 5 " ) );

Giedrius

I looked over the proposed solutions and could not find one that handles a mix of whitespace characters in a way acceptable for my case. For example:

Regex.Replace(input, @"\s+", " ") - it will eat your line breaks if they are mixed with spaces; for example, a \n \n sequence will be replaced with a single space

Regex.Replace(source, @"(\s)\s+", "$1") - it will depend on whitespace first character, meaning that it again might eat your line breaks

Regex.Replace(source, @"[ ]{2,}", " ") - it won't work correctly when there's mix of whitespace characters - for example "\t \t "

Probably not perfect, but quick solution for me was:

Regex.Replace(input, @"\s+", 
(match) => match.Value.IndexOf('\n') > -1 ? "\n" : " ", RegexOptions.Multiline)

The idea is that a line break wins over spaces and tabs.

This won't handle Windows line breaks correctly, but it would be easy to adjust it to work with those too. I don't know regex that well; maybe it is possible to fit this into a single pattern.
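
One possible adjustment for Windows line breaks, offered only as a sketch: check the matched run for "\r\n" first, then for a bare "\n", and fall back to a space. The class and method names are hypothetical, and treating a lone "\r" like any other whitespace is an assumption.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LineBreakPreservingCollapse
{
    public static string Collapse(string input) =>
        Regex.Replace(input, @"\s+", m =>
            m.Value.Contains("\r\n") ? "\r\n"      // a Windows line break wins
            : m.Value.IndexOf('\n') > -1 ? "\n"    // then a bare line feed
            : " ");                                // otherwise a single space

    public static void Main()
    {
        Console.WriteLine(Collapse("a \r\n \r\n b\t\tc") == "a\r\nb c"); // True
        Console.WriteLine(Collapse("a \n b") == "a\nb");                 // True
    }
}
```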


I think this is the answer to a different question. Only spaces — not tabs or newlines or "mix of whitespace characters" — were mentioned in this question, so while this may be good analysis I don't see how this information is relevant here.
Downvotes indicate content that is "not useful" (however the voter chooses to define that) and pushes it down relative to other answers; I exercised mine because this answer, in my opinion, does not provide information that is relevant or useful to the question as asked and, thus, is one more answer to look through — a distraction — when trying to find one that does focus on the posed problem. As I indicated, I don't think this is a bad answer in its own right, I just don't think it belongs here; I'd be surprised if there isn't at least one C# merge-adjacent-whitespace Q somewhere on SO.
Bibin Gangadharan

The following code collapses each run of multiple spaces into a single space:

    public string RemoveMultipleSpacesToSingle(string str)
    {
        string text = str;
        do
        {
            //text = text.Replace("  ", " ");
            text = Regex.Replace(text, @"\s+", " ");
        } while (text.Contains("  "));
        return text;
    }

Why do you need the loop? Do you not trust Regex.Replace() to work the first time? Also, since performing the replace only really does anything when a character occurs two or more times in a row, that's what you should match: \s{2,}. Most importantly, though, this does not do what the method name suggests or this question asked: \s matches not just a space but any whitespace character.
Vasilis Plavos

You can create a StringsExtensions file with a method like RemoveDoubleSpaces().

StringsExtensions.cs

public static string RemoveDoubleSpaces(this string value)  
{
  Regex regex = new Regex("[ ]{2,}", RegexOptions.None);
  value = regex.Replace(value, " ");

  // this removes space at the end of the value (like "demo ")
  // and space at the start of the value (like " hi")
  value = value.Trim(' ');

  return value;
}

And then you can use it like this:

string stringInput =" hi here     is  a demo ";

string stringCleaned = stringInput.RemoveDoubleSpaces();

This is very inefficient. If the input contains 8 consecutive spaces then the first loop will run 3 times. The StartsWith in the first will have to search the whole string to get a false and if the string is large then that could take time. The second and third loops are unnecessary, the first loop means there can be at most one initial space and at most one final space.
It's one thing — and not at all a bad thing — to leave good code unoptimized in favor of clarity. Even at a glance, though, this is just needlessly inefficient code. Internally, both Contains() and Replace() must use IndexOf() (or something like it) to locate the specified string, so what you're saying is "scan for the specified string to see if it needs to be replaced, which, in turn, requires scanning for it again." This is analogous to if (dict.ContainsKey(key)) value = dict[key]; instead of found = dict.TryGetValue(key, out value);. If a one-size-fits-most (cont.)
(cont.) solution makes the code too hard to read or comprehend then that's where comments, not BCL-method-calls-as-self-documentation, should be used to describe what's going on. As for what you're doing with the *sWith() calls, that can be replaced with value = value.TrimEnd(' ').TrimStart(' '); or, simply, value = value.Trim(' ');, but then removing lone leading or trailing spaces isn't relevant to this question, anyways. If nothing else, there already are several answers that use string.Replace(), and this one is adding nothing new.
Recent changes to this answer mean it is extremely similar to many of the other answers and so it now adds nothing new to the question.
Trimming leading/trailing spaces wasn't part of the question, though, and the extension method is syntactic sugar; a reader can trivially incorporate those into their code, if needed. Ignoring those negligible changes, you've now duplicated the accepted answer, this answer, and two others that use the equivalent pattern " {2,}". I will echo @AdrianHHH's comment and say that this answer is not adding any new, useful information and is, thus, clutter on a question that already has too much of it.