
How does String substring work in Swift

I've been updating some of my old code and answers with Swift 3 but when I got to Swift Strings and Indexing with substrings things got confusing.

Specifically I was trying the following:

let str = "Hello, playground"
let prefixRange = str.startIndex..<str.startIndex.advancedBy(5)
let prefix = str.substringWithRange(prefixRange)

where the last line was giving me the following error

Value of type 'String' has no member 'substringWithRange'

I see that String does have the following methods now:

str.substring(to: String.Index)
str.substring(from: String.Index)
str.substring(with: Range<String.Index>)

These were really confusing me at first, so I started playing around with index and range. This is a follow-up question and answer for substring. I am adding an answer below to show how they are used.

For those who want to get the substring from the string stackoverflow.com/q/32305891/468724
or subscript string or substring stackoverflow.com/questions/24092884/…

Yaobin Then

https://i.stack.imgur.com/IKS4o.png

All of the following examples use

var str = "Hello, playground"

Swift 4

Strings got a pretty big overhaul in Swift 4. When you get some substring from a String now, you get a Substring type back rather than a String. Why is this? Strings are value types in Swift. That means if you use one String to make a new one, then it has to be copied over. This is good for stability (no one else is going to change it without your knowledge) but bad for efficiency.

A Substring, on the other hand, is a reference back to the original String from which it came. Here is an image from the documentation illustrating that.

No copying is needed so it is much more efficient to use. However, imagine you got a ten character Substring from a million character String. Because the Substring is referencing the String, the system would have to hold on to the entire String for as long as the Substring is around. Thus, whenever you are done manipulating your Substring, convert it to a String.

let myString = String(mySubstring)

This will copy just the substring over and the memory holding old String can be reclaimed. Substrings (as a type) are meant to be short lived.
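As a rough illustration of the point above, here is a minimal sketch (the names text, slice and copy are placeholders of mine) showing that slicing yields a Substring that reuses the original string's indices, and that String(...) produces an independent copy:

```swift
let text = "Hello, playground"

// Slicing returns a Substring view onto text's storage, not a new String.
let slice = text.prefix(5)
print(type(of: slice))                        // Substring

// A Substring uses the same indices as the string it came from.
print(slice.startIndex == text.startIndex)    // true

// Converting copies only the five characters into their own storage.
let copy = String(slice)
print(type(of: copy), copy)                   // String Hello
```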

Another big improvement in Swift 4 is that Strings are Collections (again). That means that whatever you can do to a Collection, you can do to a String (use subscripts, iterate over the characters, filter, etc).
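A short sketch of what "Strings are Collections" means in practice. The sample string and variable names are only for illustration:

```swift
let text = "Hello, playground"

print(text.count)      // 17
print(text.first!)     // H

// filter on a String returns a String in Swift 4 and later.
let vowels = text.filter { "aeiou".contains($0) }
print(vowels)          // eoaou

// Iterating visits one Character at a time.
var lCount = 0
for character in text where character == "l" {
    lCount += 1
}
print(lCount)          // 3
```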

The following examples show how to get a substring in Swift.

Getting substrings

You can get a substring from a string by using subscripts or a number of other methods (for example, prefix, suffix, split). You still need to use String.Index and not an Int index for the range, though. (See my other answer if you need help with that.)

Beginning of a string

You can use a subscript (note the Swift 4 one-sided range):

let index = str.index(str.startIndex, offsetBy: 5)
let mySubstring = str[..<index] // Hello

or prefix:

let index = str.index(str.startIndex, offsetBy: 5)
let mySubstring = str.prefix(upTo: index) // Hello

or even easier:

let mySubstring = str.prefix(5) // Hello

End of a string

Using subscripts:

let index = str.index(str.endIndex, offsetBy: -10)
let mySubstring = str[index...] // playground

or suffix:

let index = str.index(str.endIndex, offsetBy: -10)
let mySubstring = str.suffix(from: index) // playground

or even easier:

let mySubstring = str.suffix(10) // playground

Note that when using the suffix(from: index) I had to count back from the end by using -10. That is not necessary when just using suffix(x), which just takes the last x characters of a String.

Range in a string

Again we simply use subscripts here.

let start = str.index(str.startIndex, offsetBy: 7)
let end = str.index(str.endIndex, offsetBy: -6)
let range = start..<end

let mySubstring = str[range]  // play
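Note that the offsets in these examples trap at runtime if they run past the ends of the string. If the offsets come from outside your control, one option is index(_:offsetBy:limitedBy:), which returns nil instead. The slice function below is a hypothetical helper of my own, not a standard library API:

```swift
let str = "Hello, playground"

// Hypothetical helper (not a standard library API): returns nil instead of
// trapping when the Int offsets do not fit inside the string.
func slice(_ s: String, from lower: Int, to upper: Int) -> Substring? {
    guard lower >= 0, upper >= lower,
          let start = s.index(s.startIndex, offsetBy: lower, limitedBy: s.endIndex),
          let end = s.index(start, offsetBy: upper - lower, limitedBy: s.endIndex)
    else { return nil }
    return s[start..<end]
}

print(slice(str, from: 7, to: 11) ?? "nil")    // play
print(slice(str, from: 7, to: 100) ?? "nil")   // nil
```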

Converting Substring to String

Don't forget, when you are ready to save your substring, you should convert it to a String so that the old string's memory can be cleaned up.

let myString = String(mySubstring)

Using an Int index extension?

I'm hesitant to use an Int based index extension after reading the article Strings in Swift 3 by Airspeed Velocity and Ole Begemann. Although in Swift 4, Strings are collections, the Swift team purposely hasn't used Int indexes. It is still String.Index. This has to do with Swift Characters being composed of varying numbers of Unicode codepoints. The actual index has to be uniquely calculated for every string.
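To see why a single Int cannot unambiguously identify a position, compare the different ways of counting the same string. This is a small sketch with sample strings I picked for illustration:

```swift
let flag = "🇨🇭"                  // one flag emoji
print(flag.count)                 // 1  (Characters)
print(flag.unicodeScalars.count)  // 2  (Unicode code points)
print(flag.utf16.count)           // 4
print(flag.utf8.count)            // 8

let cafe = "cafe\u{301}"          // "e" followed by a combining acute accent
print(cafe.count)                 // 4
print(cafe.unicodeScalars.count)  // 5
print(cafe == "caf\u{E9}")        // true
```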

I have to say, I hope the Swift team finds a way to abstract away String.Index in the future. But until then, I am choosing to use their API. It helps me to remember that String manipulations are not just simple Int index lookups.


Thanks for the description. Well-deserved upvotes. Apple overcomplicated this. Substring should be as easy as string.substring[from...to].
Really good explanation, except for one little thing: "garbage collected" ;-) I hope people here know that there is no garbage collection in Swift.
@ChristianAnchorDampf, Thanks for taking the time to comment. I took out garbage collecting. How is the new wording?
Thanks for the detailed explanation! Totally agree with @Teddy - Unicode width should be an implementation detail, as most people don't care what the bytes actually look like. The API design should be built around the 95% use cases, and provide low-level APIs for people who need to deal with the protocol stack etc.
Apple has really made an awful mess of strings. They should not keep changing them between versions of Swift and you should not have to create extensions to do simple things like substring(). This should be built into the language.
Juncheng Tang

I'm really frustrated at Swift's String access model: everything has to be an Index. All I want is to access the i-th character of the string using Int, not the clumsy index and advancing (which happens to change with every major release). So I made an extension to String:

extension String {
    func index(from: Int) -> Index {
        return self.index(startIndex, offsetBy: from)
    }

    func substring(from: Int) -> String {
        let fromIndex = index(from: from)
        return String(self[fromIndex...])
    }

    func substring(to: Int) -> String {
        let toIndex = index(from: to)
        return String(self[..<toIndex])
    }

    func substring(with r: Range<Int>) -> String {
        let startIndex = index(from: r.lowerBound)
        let endIndex = index(from: r.upperBound)
        return String(self[startIndex..<endIndex])
    }
}

let str = "Hello, playground"
print(str.substring(from: 7))         // playground
print(str.substring(to: 5))           // Hello
print(str.substring(with: 7..<11))    // play

The indexes are very useful because a character can be more than one byte. Try let str = "🇨🇭🇩🇪🇺🇸Hello" print(str.substring(to: 2))
Yes, I understand that a character (i.e. extended grapheme cluster) can take multiple bytes. My frustration is why we have to use the verbose index-advancing method to access the characters of a string. Why can't the Swift team just add some overloads to the Core Library to abstract it away. If I type str[5], I want to access the character at index 5, whatever that character appears to be or how many bytes it takes. Isn't Swift all about developer's productivity?
@RenniePet I believe Apple recognizes the problem and changes are coming. As per the Swift Evolution page on GitHub: "Swift 4 seeks to make strings more powerful and easier-to-use, while retaining Unicode correctness by default". It's vague but let's keep our hopes up
@CodeDifferent Why didn't Apple add subscript character access? So that people understand that it's a bad thing to do. Basically, if you did for i in 0..<string.count using subscripts, that would be a double loop, because under the hood the index has to go through each byte of the string to find out which is the next character. If you loop using an index, you iterate over the string only once. Btw, I hate this myself, but that's the reasoning behind Int subscripts not being available on String in Swift.
@RaimundasSakalauskas that argument doesn't fly by me. C# has both Unicode correctness and integer subscripting, which is really convenient. In Swift 1, Apple wanted developers to use countElements(str) to find the length. In Swift 3, Apple made String not conform to Sequence and forced everyone to use str.characters instead. These guys are not afraid of making changes. Their stubbornness on integer subscripting is really hard to understand.
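To make the cost discussed in these comments concrete, here is a small sketch comparing the two loop styles. The sample string and variable names are mine:

```swift
let text = "héllo 🇨🇭"

// Offsetting from startIndex on every pass re-walks the string each time,
// so this loop is quadratic in the length of the string.
var byOffset: [Character] = []
for i in 0..<text.count {
    byOffset.append(text[text.index(text.startIndex, offsetBy: i)])
}

// Walking the indices directly visits each Character once.
var byIndex: [Character] = []
for index in text.indices {
    byIndex.append(text[index])
}

print(byOffset == byIndex)   // true
print(byIndex.count)         // 7
```

Both loops produce the same Characters; only the amount of work differs, which starts to matter for long strings.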
Lou Zell

Swift 5 Extension:

extension String {
    subscript(_ range: CountableRange<Int>) -> String {
        let start = index(startIndex, offsetBy: max(0, range.lowerBound))
        let end = index(start, offsetBy: min(self.count - range.lowerBound, 
                                             range.upperBound - range.lowerBound))
        return String(self[start..<end])
    }

    subscript(_ range: CountablePartialRangeFrom<Int>) -> String {
        let start = index(startIndex, offsetBy: max(0, range.lowerBound))
        return String(self[start...])
    }
}

Usage:

let s = "hello"
s[0..<3] // "hel"
s[3...]  // "lo"

Or unicode:

let s = "😎🤣😋"
s[0..<1] // "😎"

So much better, thank you for posting this extension! I think coming from Python, Swift is much harder than necessary to get used to. It seems for people going in the other direction from Objective C to Swift there is more positive confirmation.
@Leon I just removed it. Prior to 4.1, count was only available on self.characters
Are there any gotchas to watch out with this particular extension? Why didn't Apple do something like this?
You'll also need to add an extension that takes a CountableClosedRange<Int> if you'd like to write e.g. s[0...2].
@ChrisFrederick and CountablePartialRangeFrom<Int> for s[2...].
Community

Swift 4 & 5:

extension String {
  subscript(_ i: Int) -> String {
    let idx1 = index(startIndex, offsetBy: i)
    let idx2 = index(idx1, offsetBy: 1)
    return String(self[idx1..<idx2])
  }

  subscript (r: Range<Int>) -> String {
    let start = index(startIndex, offsetBy: r.lowerBound)
    let end = index(startIndex, offsetBy: r.upperBound)
    return String(self[start ..< end])
  }

  subscript (r: CountableClosedRange<Int>) -> String {
    let startIndex =  self.index(self.startIndex, offsetBy: r.lowerBound)
    let endIndex = self.index(startIndex, offsetBy: r.upperBound - r.lowerBound)
    return String(self[startIndex...endIndex])
  }
}

How to use it:

"abcde"[0]     --> "a"
"abcde"[0...2] --> "abc"
"abcde"[2..<4] --> "cd"


Community

Swift 4

In Swift 4, String conforms to Collection. Instead of substring, we should now use a subscript. So if you want to cut out only the word "play" from "Hello, playground", you could do it like this:

var str = "Hello, playground"
let start = str.index(str.startIndex, offsetBy: 7)
let end = str.index(str.endIndex, offsetBy: -6)
let result = str[start..<end] // The result is of type Substring

It is worth knowing that doing so gives you a Substring instead of a String. This is fast and efficient, as a Substring shares its storage with the original String. However, sharing memory this way can also easily keep the whole original String alive in memory for longer than you intended.

This is why you should copy the result into a new String once you want the original String to be cleaned up. You can do this using the normal initializer:

let newString = String(result)

You can find more information on the new Substring type in the Apple documentation.

So, if you for example get an NSRange as the result of an NSRegularExpression, you could use the following extension:

extension String {

    subscript(_ range: NSRange) -> String {
        let start = self.index(self.startIndex, offsetBy: range.lowerBound)
        let end = self.index(self.startIndex, offsetBy: range.upperBound)
        let subString = self[start..<end]
        return String(subString)
    }

}

Your code will crash if range.upperBound is > length of string. Also, a sample usage would have been helpful as well, as I wasn't familiar with subscripts in Swift. You could include something like datePartOnly = "2018-01-04-08:00"[NSMakeRange(0, 10)]. Other than that, very nice answer, +1 :).
nowadays it is this weird thing: text[Range( nsRange , in: text)!]
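A sketch of the Range(_:in:) conversion mentioned in the last comment. NSRange counts UTF-16 code units rather than Characters, so converting it this way stays correct for text containing emoji. The function name firstDate and the pattern are assumptions of mine, chosen only for illustration:

```swift
import Foundation

// Hypothetical helper: returns the first yyyy-MM-dd style date in text.
func firstDate(in text: String) -> String? {
    guard let regex = try? NSRegularExpression(pattern: "[0-9]{4}-[0-9]{2}-[0-9]{2}") else {
        return nil
    }
    // NSRange counts UTF-16 code units, so build it from String indices.
    let whole = NSRange(text.startIndex..., in: text)
    guard let match = regex.firstMatch(in: text, options: [], range: whole),
          // Range(_:in:) converts back and returns nil if the NSRange does not fit.
          let range = Range(match.range, in: text) else {
        return nil
    }
    return String(text[range])
}

print(firstDate(in: "😎 2018-01-04-08:00") ?? "no match")   // 2018-01-04
```

Because the emoji occupies two UTF-16 code units but is a single Character, offsetting by Characters with the NSRange bounds would cut the wrong part of this string.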
Mahima Srivastava

Came across this fairly short and simple way of achieving this.

var str = "Hello, World"
let arrStr = Array(str)
print(arrStr[0..<5]) //["H", "e", "l", "l", "o"]
print(arrStr[7..<12]) //["W", "o", "r", "l", "d"]
print(String(arrStr[0..<5])) //Hello
print(String(arrStr[7..<12])) //World

Nikesh Jha

Here's a function that returns a substring of a given string when start and end indices are provided. For complete reference you can visit the links given below.

func substring(string: String, fromIndex: Int, toIndex: Int) -> String? {
    // toIndex is exclusive, so it may equal string.count (use string.characters.count for Swift 3).
    if fromIndex >= 0 && fromIndex < toIndex && toIndex <= string.count {
        let startIndex = string.index(string.startIndex, offsetBy: fromIndex)
        let endIndex = string.index(string.startIndex, offsetBy: toIndex)
        return String(string[startIndex..<endIndex])
    } else {
        return nil
    }
}

Here's a link to the blog post that I have created to deal with string manipulation in swift. String manipulation in swift (Covers swift 4 as well)

Or you can see this gist on github


Rio Bautista

I had the same initial reaction. I too was frustrated at how syntax and objects change so drastically in every major release.

However, I have learned from experience that I always end up suffering the consequences of fighting "change", such as having to deal with multi-byte characters, which is inevitable if you're targeting a global audience.

So I decided to recognize and respect the efforts exerted by Apple engineers and do my part by understanding their mindset when they came up with this "horrific" approach.

Instead of creating extensions, which are just a workaround to make your life easier (I'm not saying they're wrong or expensive), why not figure out how Strings are now designed to work?

For instance, I had this code which was working on Swift 2.2:

let rString = cString.substringToIndex(2)
let gString = (cString.substringFromIndex(2) as NSString).substringToIndex(2)
let bString = (cString.substringFromIndex(4) as NSString).substringToIndex(2)

and after giving up trying to get the same approach working e.g. using Substrings, I finally understood the concept of treating Strings as a bidirectional collection for which I ended up with this version of the same code:

let rString = String(cString.characters.prefix(2))
cString = String(cString.characters.dropFirst(2))
let gString = String(cString.characters.prefix(2))
cString = String(cString.characters.dropFirst(2))
let bString = String(cString.characters.prefix(2))

I hope this contributes...


Well, dealing with a complex problem does not mean that the solution cannot be elegant. Again, I also understand the problem, but the entire String class and dealing with it is just horrible.
t1ser

I think quite mechanically. Here are the basics...

Swift 4, Swift 5

  let t = "abracadabra"

  let start1 = t.index(t.startIndex, offsetBy:0)
  let   end1 = t.index(t.endIndex, offsetBy:-5)
  let start2 = t.index(t.endIndex, offsetBy:-5)
  let   end2 = t.index(t.endIndex, offsetBy:0)

  let t2 = t[start1 ..< end1]
  let t3 = t[start2 ..< end2]                

  //or a shorter form 

  let t4 = t[..<end1]
  let t5 = t[start2...]

  print("\(t2) \(t3) \(t)")
  print("\(t4) \(t5) \(t)")

  // result (printed by both lines):
  // abraca dabra abracadabra

The result is a substring, meaning that it is a part of the original string. To get a full blown separate string just use e.g.

    String(t3)
    String(t4)

This is what I use:

    let mid = t.index(t.endIndex, offsetBy:-5)
    let firstHalf = t[..<mid]
    let secondHalf = t[mid...]

Suragch

I am new to Swift 3, but looking at the String (index) syntax by analogy, I think that an index is like a "pointer" constrained to its string, and an Int can help as an independent offset. Using the base + offset syntax, we can get the i-th character from a string with the code below:

let s = "abcdefghi"
let i = 2
print (s[s.index(s.startIndex, offsetBy:i)])
// print c

For a range of characters (indexes) from a string using the String (range) syntax, we can get the i-th to f-th characters with the code below:

let f = 6
print (s[s.index(s.startIndex, offsetBy:i )..<s.index(s.startIndex, offsetBy:f+1 )])
//print cdefg

For a substring (range) from a string using String.substring (range), we can get the substring using the code below:

print (s.substring (with:s.index(s.startIndex, offsetBy:i )..<s.index(s.startIndex, offsetBy:f+1 ) ) )
//print cdefg

Notes:

Both i and f are zero-based. For the f-th position I use offsetBy: f + 1, because the range subscript uses ..< (the half-open range operator), which does not include the upper bound. Of course, real code must also validate against errors such as an invalid index.


Tall Dane

Same frustration, this should not be that hard...

I compiled this example of getting positions for substring(s) from larger text:

//
// Play with finding substrings returning an array of the non-unique words and positions in text
//
//

import Foundation  // range(of:) comes from Foundation; UIKit is not needed for this

let Bigstring = "Why is it so hard to find substrings in Swift3"
let searchStrs : Array<String>? = ["Why", "substrings", "Swift3"]

_ = FindSubString(inputStr: Bigstring, subStrings: searchStrs)


func FindSubString(inputStr : String, subStrings: Array<String>?) -> Array<(String, Int, Int)> {
    var resultArray : Array<(String, Int, Int)> = []
    // A nil or empty list simply produces no results instead of crashing.
    for subString in subStrings ?? [] {
        // range(of:) returns nil when the substring is not present.
        guard let range = inputStr.range(of: subString) else { continue }
        // Convert the String.Index bounds to Character offsets from the start.
        let lPos = inputStr.distance(from: inputStr.startIndex, to: range.lowerBound)
        let uPos = inputStr.distance(from: inputStr.startIndex, to: range.upperBound)
        resultArray.append((subString, lPos, uPos))
    }
    for words in resultArray {
        print(words)
    }
    return resultArray
}

returns ("Why", 0, 3) ("substrings", 26, 36) ("Swift3", 40, 46)


That is some code, but does not really explain how string indexing and substrings work in swift3.
Peter Kreinz

Swift 4+

extension String {
    func take(_ n: Int) -> String {
        guard n >= 0 else {
            fatalError("n should never be negative")
        }
        let index = self.index(self.startIndex, offsetBy: min(n, self.count))
        return String(self[..<index])
    }
}

Returns a subsequence of the first n characters, or the entire string if the string is shorter. (inspired by: https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.text/take.html)

Example:

let text = "Hello, World!"
let substring = text.take(5) //Hello
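For comparison, the standard library's prefix(_:) already clamps in the same way, so a similar result can be had without an extension. A minimal sketch, with variable names of my own choosing:

```swift
let text = "Hello, World!"

// prefix(_:) returns at most n Characters, so a short string is returned whole.
let firstFive = String(text.prefix(5))
let wholeThing = String("abc".prefix(10))

print(firstFive)    // Hello
print(wholeThing)   // abc
```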

Just a coder

Swift 4

extension String {
    subscript(_ i: Int) -> String {
        let idx1 = index(startIndex, offsetBy: i)
        let idx2 = index(idx1, offsetBy: 1)
        return String(self[idx1..<idx2])
    }
}

let s = "hello"

s[0]    // h
s[1]    // e
s[2]    // l
s[3]    // l
s[4]    // o

Now try this in a string of a million characters.
why? what happens?
Lucas Algarra

I created a simple extension for this (Swift 3)

extension String {
    func substring(location: Int, length: Int) -> String? {
        guard characters.count >= location + length else { return nil }
        let start = index(startIndex, offsetBy: location)
        let end = index(startIndex, offsetBy: location + length)
        return substring(with: start..<end)
    }
}

Leslie Godwin

Here's a more generic implementation:

This technique still uses String.Index to keep with Swift's standards, so each index refers to a full Character.

extension String
{
    func subString <R> (_ range: R) -> String? where R : RangeExpression, String.Index == R.Bound
    {
        return String(self[range])
    }

    func index(at: Int) -> Index
    {
        return self.index(self.startIndex, offsetBy: at)
    }
}

To sub string from the 3rd character:

let item = "Fred looks funny"
item.subString(item.index(at: 2)...) // "ed looks funny"

I've used camel-case subString to indicate that it returns a String and not a Substring.


Jeremy Andrews

Building on the above I needed to split a string at a non-printing character dropping the non-printing character. I developed two methods:

var str = "abc\u{1A}12345sdf"
let range1: Range<String.Index> = str.range(of: "\u{1A}")!
let index1: Int = str.distance(from: str.startIndex, to: range1.lowerBound)
let start = str.index(str.startIndex, offsetBy: index1)
let end = str.index(str.endIndex, offsetBy: -0)
let result = str[start..<end] // The result is of type Substring
let firstStr = str[str.startIndex..<range1.lowerBound]

which I put together using some of the answers above.

Because a String is a collection I then did the following:

var fString = String()
for (n, c) in str.enumerated() {
    if c == "\u{1A}" {
        print(fString)
        let lString = str.dropFirst(n + 1)
        print(lString)
        break
    }
    fString += String(c)
}

Which for me was more intuitive. Which one is best? I have no way of telling. They both work with Swift 5.


Thanks for your answer. Is anything different about Strings in Swift 5? I haven't had time to play around with it yet.
They say so but I have not had a chance to look into it.
Harshit Jain
var str = "VEGANISM"
print (str[str.index(str.startIndex, offsetBy:2)..<str.index(str.endIndex, offsetBy: -1)] )

//Output-> GANIS

Here, str.startIndex and str.endIndex are the starting index and ending index of your string.

Here the start is offset by 2 from startIndex, str.index(str.startIndex, offsetBy:2), so the trimmed string starts at index 2 (i.e. at the third character), and the end is offset by -1 from endIndex, str.index(str.endIndex, offsetBy: -1), so 1 character is trimmed from the end.

var str = "VEGANISM"
print (str[str.index(str.startIndex, offsetBy:0)..<str.index(str.endIndex, offsetBy: 0)] )

//Output-> VEGANISM

As the offsetBy value = 0 on both sides i.e., str.index(str.startIndex, offsetBy:0) and str.index(str.endIndex, offsetBy: 0) therefore, the complete string is being printed


Seungjun

I created a simple function like this:

func sliceString(str: String, start: Int, end: Int) -> String {
    let data = Array(str)
    return String(data[start..<end])
}

You can use it in the following way:

print(sliceString(str: "0123456789", start: 0, end: 3)) // -> prints 012

CAHbl463

Swift 4

"Substring" (https://developer.apple.com/documentation/swift/substring):

let greeting = "Hi there! It's nice to meet you! 👋"
let endOfSentence = greeting.index(of: "!")!
let firstSentence = greeting[...endOfSentence]
// firstSentence == "Hi there!"

Example of extension String:

private typealias HowDoYouLikeThatElonMusk = String
private extension HowDoYouLikeThatElonMusk {

    subscript(_ from: Character?, _ to: Character?, _ include: Bool) -> String? {
        if let _from: Character = from, let _to: Character = to {
            let dynamicSourceForEnd: String = (_from == _to ? String(self.reversed()) : self)
            guard let startOfSentence: String.Index = self.index(of: _from),
                let endOfSentence: String.Index = dynamicSourceForEnd.index(of: _to) else {
                return nil
            }

            let result: String = String(self[startOfSentence...endOfSentence])
            if include == false {
                guard result.count > 2 else {
                        return nil
                }
                return String(result[result.index(result.startIndex, offsetBy: 1)..<result.index(result.endIndex, offsetBy: -1)])
            }
            return result
        } else if let _from: Character = from {
            guard let startOfSentence: String.Index = self.index(of: _from) else {
                return nil
            }
            let result: String = String(self[startOfSentence...])
            if include == false {
                guard result.count > 1 else {
                    return nil
                }
                return String(result[result.index(result.startIndex, offsetBy: 1)...])
            }
            return result
        } else if let _to: Character = to {
            guard let endOfSentence: String.Index = self.index(of: _to) else {
                    return nil
            }
            let result: String = String(self[...endOfSentence])
            if include == false {
                guard result.count > 1 else {
                    return nil
                }
                return String(result[..<result.index(result.endIndex, offsetBy: -1)])
            }
            return result
        }
        return nil
    }
}

example of using the extension String:

let source =                                   ">>>01234..56789<<<"
// include = true
var from =          source["3", nil, true]  //       "34..56789<<<"
var to =            source[nil, "6", true]  // ">>>01234..56"
var fromTo =        source["3", "6", true]  //       "34..56"
let notFound =      source["a", nil, true]  // nil
// include = false
from =              source["3", nil, false] //        "4..56789<<<"
to =                source[nil, "6", false] // ">>>01234..5"
fromTo =            source["3", "6", false] //        "4..5"
let outOfBounds =   source[".", ".", false] // nil

let str = "Hello, playground"
let hello = str[nil, ",", false] // "Hello"

Louis Lac

The specificity of String has mostly been addressed in other answers. To paraphrase: String has a specific Index which is not of type Int because string elements do not have the same size in the general case. Hence, String does not conform to RandomAccessCollection and accessing a specific index implies the traversal of the collection, which is not an O(1) operation.

Many answers have proposed workarounds for using ranges, but they can lead to inefficient code as they use String methods (index(from:), index(_:offsetBy:), ...) that are not O(1).

To access string elements like in an array you should use an Array:

let array = Array("Hello, world!")
let letter = array[5]

This is a trade-off, the array creation is an O(n) operation but array accesses are then O(1). You can convert back to a String when you want with String(array).
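A slightly fuller sketch of that trade-off, using a sample string of my own that includes a flag emoji. Since the array elements are Characters, each element is still a whole grapheme cluster:

```swift
let characters = Array("Hello, 🇨🇭 world!")   // [Character], built once in O(n)

// After the one-time conversion, each Int lookup is O(1).
print(characters.count)              // 15
print(characters[5])                 // ,
print(characters[7])                 // 🇨🇭

// Slices of the array convert back to String.
print(String(characters[9..<14]))    // world
```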


This seems like it would be a good option if you are manipulating your own text. However, if it's coming from users, you run into problems with surrogate pairs and grapheme clusters.
Sure, this should be used with caution and the user must know what he is doing.
gobuzov

Swift 5

// suppose we need a substring starting at offset 2 with length 3

let s = "abcdef"    
let subs = s.suffix(s.count-2).prefix(3) 

// now subs = "cde"
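An alternative sketch of the same idea using dropFirst, which avoids computing s.count and does not trap when the string is shorter than the offset. The variable names are mine:

```swift
let s = "abcdef"

let viaSuffix = s.suffix(s.count - 2).prefix(3)
let viaDrop = s.dropFirst(2).prefix(3)

print(viaSuffix, viaDrop)                    // cde cde

// dropFirst and prefix clamp to the available characters instead of trapping.
print("a".dropFirst(2).prefix(3).isEmpty)    // true
```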


Wimukthi Rajapaksha

Swift 5

let desiredIndex: Int = 7
let substring = str[String.Index(encodedOffset: desiredIndex)...]

This substring variable will give you the result.
Here the Int is simply converted to an Index, and then you can split the string. Otherwise you will get errors.


This is wrong. A Character might consist of one or more bytes. It only works with ascii text.
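To illustrate the objection: as far as I know, encodedOffset was deprecated in Swift 5 in favour of String.Index(utf16Offset:in:), and either way the Int is an offset in code units, not Characters. A small sketch with sample strings of my own:

```swift
let ascii = "Hello, playground"
let flags = "🇨🇭🇩🇪 Hello"

// For ASCII text, UTF-16 offsets and Character offsets coincide.
let asciiIndex = String.Index(utf16Offset: 7, in: ascii)
print(ascii[asciiIndex...])                    // playground

// Each flag is one Character but four UTF-16 code units,
// so Character offset 2 corresponds to UTF-16 offset 8.
let afterFlags = flags.index(flags.startIndex, offsetBy: 2)
print(afterFlags.utf16Offset(in: flags))       // 8
```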