Split a string by another string in C#

I've been using the Split() method to split strings, but this only appears to work if you are splitting a string by a character. Is there a way to split a string, with another string being the split by parameter?

I've tried converting the splitter into a character array, with no luck.

In other words, I'd like to split the string:

THExxQUICKxxBROWNxxFOX

by xx, and return an array with values:

THE, QUICK, BROWN, FOX

For future reference: one of the comments below interested me, so I decided to open a discussion on software engineering about the nonintuitive (but correct) way to do it in the accepted answer.

Adam Robinson

In order to split by a string you'll have to use the string array overload.

string data = "THExxQUICKxxBROWNxxFOX";

return data.Split(new string[] { "xx" }, StringSplitOptions.None);

I actually ended up changing my answer to this for 2 reasons: #1: To handle the splits I want to do I would need to use Regex.Escape, because my split string will often contain asterisks, etc. #2: While this program I'm writing needs no real optimization, there does appear to be additional overhead involved with using the Regex Split method.
@Peter: In that post Jon is suggesting it because the poster does not have a fixed delimiter; he is looking to split strings separated by "more than one space" (meaning 2+). For strings delimited by a pattern rather than a value, RegEx is a great (well, the only) option. For fixed-value delimiters, it introduces needless overhead. Try running a test; as the number of operations increases, RegEx ends up taking somewhere around ~10x as long as a corresponding string.Split.
I came to C# from Python, which supports splitting a string by another string. I frequently need to come back to this question for a simple answer to string[] Split(string pattern), which is the most natural usage I could think of, yet it isn't there. I wrote C before, so I am used to char arrays, but I still hate to see char[] popping up in C# code because it suddenly drags my attention from stream level to byte level. Does anybody know why the C# library designers built the Split method like this? If there is a good reason, I can probably try to appreciate it despite the inconvenience.
This snippet ranks very high on the list of things I'd be ashamed to show to non-C# developers.
Why the hell can't we just do data.Split("xx")?
Greg

edit: See @Danation's answer for a newer, less verbose overload

There is an overload of Split that takes strings.

"THExxQUICKxxBROWNxxFOX".Split(new [] { "xx" }, StringSplitOptions.None);

You can use either of these StringSplitOptions:

None - The return value includes array elements that contain an empty string

RemoveEmptyEntries - The return value does not include array elements that contain an empty string

So if the string is "THExxQUICKxxxxBROWNxxFOX", StringSplitOptions.None will return an empty entry in the array for the "xxxx" part while StringSplitOptions.RemoveEmptyEntries will not.
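
The difference between the two options can be checked with a minimal sketch like the one below. It assumes C# 9 or later (top-level statements); the variable names are only illustrative.

```csharp
using System;

// Input with a doubled delimiter ("xxxx") between QUICK and BROWN.
string input = "THExxQUICKxxxxBROWNxxFOX";
string[] separators = { "xx" };

// None keeps the empty string that sits between the two adjacent "xx" delimiters.
string[] withEmpties = input.Split(separators, StringSplitOptions.None);

// RemoveEmptyEntries drops that empty string.
string[] withoutEmpties = input.Split(separators, StringSplitOptions.RemoveEmptyEntries);

Console.WriteLine(withEmpties.Length);    // 5: THE, QUICK, "", BROWN, FOX
Console.WriteLine(withoutEmpties.Length); // 4: THE, QUICK, BROWN, FOX
```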


It does not quite "take" strings. It expects an array of chars, you simply used the literal constructor for this.
@SvenMawby Nah, it "literally" has an "overload" for an "array" of "strings". Split(String[], StringSplitOptions)
T.Todua
Regex.Split(input, "xx")

is the way I do it usually.

Of course you'll need:

using System.Text.RegularExpressions;

or:

System.Text.RegularExpressions.Regex.Split(input, "xx")

but then again I need that library all the time.


@Brandon: While I'm usually cautioning against premature optimization, you should be aware that a RegEx.Split is quite a bit more costly than a simple String.Split because of the regular expression overhead.
If you want to split by an arbitrary string, use Regex.Escape on the string first, this will escape any regex meta-characters.
One key advantage that may justify the overhead is the ability to specify string comparison settings (for example, RegexOptions.IgnoreCase).
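
The two points raised in these comments, escaping the delimiter and passing comparison options, might look like the sketch below. The "**" delimiter and the sample inputs are made up for illustration, and the snippet assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical delimiter made of regex metacharacters.
string delimiter = "**";
string input = "THE**QUICK**BROWN**FOX";

// Regex.Escape turns "**" into a pattern that matches the two asterisks literally.
// Passing "**" unescaped would be rejected as an invalid pattern.
string[] parts = Regex.Split(input, Regex.Escape(delimiter));
Console.WriteLine(string.Join(", ", parts)); // THE, QUICK, BROWN, FOX

// RegexOptions.IgnoreCase lets "xx" also match "XX".
string[] mixedCase = Regex.Split("THExxQUICKXXBROWN", Regex.Escape("xx"), RegexOptions.IgnoreCase);
Console.WriteLine(string.Join(", ", mixedCase)); // THE, QUICK, BROWN
```

For a fixed, case-sensitive delimiter, the string[] overload of String.Split shown in the other answers avoids the regex overhead mentioned above.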
bruno conde

There's an overload of String.Split for this:

"THExxQUICKxxBROWNxxFOX".Split(new [] {"xx"}, StringSplitOptions.None);

The only answer which removes the needless array type declaration.
Lorenz Lo Sauer

I generally like to use my own extension for that:

string data = "THExxQUICKxxBROWNxxFOX";
var dataspt = data.Split("xx");
//>THE  QUICK  BROWN  FOX 


//the extension class must be declared as static
public static class StringExtension
{   
    public static string[] Split(this string str, string splitter)
    {
        return str.Split(new[] { splitter }, StringSplitOptions.None);
    }
}

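Note, however, that if Microsoft decides to include this overload in a later version, the built-in instance method takes precedence over the extension method when the compiler resolves the call, so the extension is silently bypassed rather than throwing an exception. Compatibility with existing code is also a likely reason why Microsoft did not include this method for so long: at least one company I worked for used such an extension in all their C# projects.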

It may also be possible to conditionally define the method at runtime if it doesn't exist.


Alternatively, use params string[] splitter as the second parameter and change new[] {splitter} to splitter to support multiple delimiters.
Danation

As of .NET Core 2.0, there is an overload that takes a string.

So now you can do "THExxQUICKxxBROWNxxFOX".Split("xx").

See https://docs.microsoft.com/en-us/dotnet/api/system.string.split?view=netcore-2.0#System_String_Split_System_String_System_StringSplitOptions_
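
A minimal sketch of that overload, assuming a project that targets .NET Core 2.0 or later (as far as I know it is not available on .NET Framework) and C# 9 or later for top-level statements:

```csharp
using System;

// Split(string, StringSplitOptions) with the options argument left at its default (None).
string[] parts = "THExxQUICKxxBROWNxxFOX".Split("xx");
Console.WriteLine(string.Join(", ", parts)); // THE, QUICK, BROWN, FOX

// The same overload with explicit options, dropping the empty entry produced by "xxxx".
string[] compact = "THExxQUICKxxxxBROWN".Split("xx", StringSplitOptions.RemoveEmptyEntries);
Console.WriteLine(string.Join(", ", compact)); // THE, QUICK, BROWN
```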


Matt

The previous answers are all correct. I go one step further and make C# work for me by defining an extension method on String:

public static class Extensions
{
    public static string[] Split(this string toSplit, string splitOn) {
        return toSplit.Split(new string[] { splitOn }, StringSplitOptions.None);
    }
}

That way I can call it on any string in the simple way I naively expected the first time I tried to accomplish this:

"a big long string with stuff to split on".Split("g str");

SNag
string data = "THExxQUICKxxBROWNxxFOX";

return data.Replace("xx","|").Split('|');

Just choose the replace character carefully (choose one that isn't likely to be present in the string already)!


@MasoudHosseini: Please read the complete answer; there's already a disclaimer.
@kobe: Because it's a terrible hack.
Works fine, but it is dangerous for generic methods
Posting explanations like, "It's a terrible hack" or "a bad answer" are not helpful. It's simply an opinion without explanation. Instead, stating something like "It's unnecessary to both scan the string for replacements and then scan for split characters since it leads to poor performance." would be a better way to explain yourself. Too many programmers act this way. :(
What if the string already contains the | character? For this reason, I think it's dangerous to use.
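
The concern raised in these comments can be reproduced with a small sketch. The input "A|BxxC" is a made-up example in which the placeholder character already occurs in the data; the snippet assumes C# 9 or later for top-level statements.

```csharp
using System;

// Made-up input that already contains the placeholder character '|'.
string input = "A|BxxC";

// Replace-then-split cannot tell the original '|' from the substituted one.
string[] viaReplace = input.Replace("xx", "|").Split('|');
Console.WriteLine(viaReplace.Length); // 3: A, B, C

// Splitting directly on "xx" leaves the original '|' untouched.
string[] direct = input.Split(new[] { "xx" }, StringSplitOptions.None);
Console.WriteLine(direct.Length); // 2: A|B, C
```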
Mohammad

Create this function first.

string[] xSplit(string str, string sep) {
    return str.Split(new [] {sep}, StringSplitOptions.None);
}

Then use it like this.

xSplit("THExxQUICKxxBROWNxxFOX", "xx");

user890255

This is also easy:

string data = "THExxQUICKxxBROWNxxFOX";
string[] arr = data.Split("xx".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);

But this would also split "THExQUICK", where we do not want it to be split.
Thanks Rafalon: yes, Greg's is the best answer: data.Split(new string[] { "xx" }, StringSplitOptions.RemoveEmptyEntries)
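
The difference between the char-array and string-array overloads can be seen in a short sketch. The input "THExQUICKxxBROWN" is chosen only to illustrate the single-x case, and the snippet assumes C# 9 or later for top-level statements.

```csharp
using System;

string input = "THExQUICKxxBROWN";

// "xx".ToCharArray() is { 'x', 'x' }, so every single 'x' acts as a delimiter.
string[] byChars = input.Split("xx".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
Console.WriteLine(string.Join(", ", byChars));  // THE, QUICK, BROWN

// The string-array overload splits only on the full "xx" sequence.
string[] byString = input.Split(new[] { "xx" }, StringSplitOptions.RemoveEmptyEntries);
Console.WriteLine(string.Join(", ", byString)); // THExQUICK, BROWN
```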
Cole Tobin

The easiest way is to use String.Replace:

string myString = "THExxQUICKxxBROWNxxFOX";
myString = myString.Replace("xx", ", ");

Or more simply:

string myString = "THExxQUICKxxBROWNxxFOX".Replace("xx", ", ");

As it is, this won't return an array (as the question asks for), just a string with commas where the xx's were.
And not only that: if the string contained additional commas, you would not be able to split out the words correctly.
He is onto something, though, if you also chain it with a split. I doubt it is effective, but it is more readable: var myStrings = "THExxQUICKxxBROWNxxFOX".Replace("xx", "|").Split('|');
@Terje: What if there are already some "|" characters in the original string?