Advanced array slicing

Tip

21 September 2020

Whenever your work with numbers, you might want to define special ones that act a bit differently from others, while keeping their value. For example, it could be useful to define a first and last integers to be used in a range to target the first and last valid indexes of an array.

When working on array slicing with Scout, I had to find a way to do so: to specify to target the beginning or the end of the array, while letting the possibility to use integers. If you are interested in this topic, read on!

A playground with the overall implementation can be found here. This link is a direct download.

Positive slicing

To slice an array, let's define a Bounds object to specify the lower and upper bounds to target in the array:

struct Bounds {
  let lower: Int
  let upper: Int
}

A first solution would be to extend Int to define static lower and upper bounds:

extension Int {
  static let first = 0
  static let last = -1
}

This approach is perfectly fine when working with positive slicing, as the first index of an array can only be 0 (not talking about the ArraySlice from Foundation here), and the last is determined when the array is exposed. For example, to target all the elements but the first one, we could define our Bounds like so:

let bounds = Bounds(lower: 1, upper: .last)

When the array is exposed, we can have our Bounds object to replace the last bound with the array last valid index.
For example we could define a range function that takes the array last valid index as parameter, and returns a range to slice the array.

public func range(lastValidIndex: Int) throws -> ClosedRange<Int> {
  let upper = self.upper == .last ?
    lastValidIndex :
    self.upper

  guard 0 <= lower, lower <= upper, upper <= lastValidIndex else {
      // throw an error indicating that the bounds are not valid
  }
  return lower...upper
}

Finally, as the upper bound can be negative, it would not be relevant to let another object read the Bounds properties. To enforce that, they will marked as private. So here is the final Bounds structure.

struct Bounds {
  private let lower: Int
  private let upper: Int

  public init(lower: Int, upper: Int) {
    self.lower = lower
    self.upper = upper
  }

  public func range(lastValidIndex: Int) throws -> ClosedRange<Int> {
    let upper = self.upper == .last ?
      lastValidIndex :
      self.upper

    guard 0 <= lower, lower <= upper, upper <= lastValidIndex else {
        // throw an error indicating that the bounds are not valid
    }
    return lower...upper
  }
}

Negative slicing

Now, what if we want to be able to also target the elements from the end ? For example the last 5 elements? It's not possible yet because only positive integers can be specified, indicating the index to target counting from the beginning of the array. An interesting solution is then to use negative indexes as in Python to specify an index counting from the end of the array.
Here is an array to explain that.

["Daisy", "Donald", "Loulou", "Fifi", "Riri", "Fifi", "Loulou", "Donald", "Daisy"]
[  -4   ,    -3   ,    -2   ,   -1  ,    0  ,    1  ,    2    ,    3    ,    4   ]

The goal is to have something like that, here to specifiy the last 5 elements:

let bounds = Bounds(lower: -4, upper: .last)

This requires to change the range function to modify the lower bound if it is negative.

let lower = self.lower < 0 ?
    lastValidIndex + self.lower :
    self.lower

But now, we face another problem: as the index -1 is used to specify the index before the last valid index, there is a conflict with our last index! The bounds Bounds(lower: -4, upper: .last) cannot be differentiated from Bounds(lower: -4, upper: -1).

The problem is that once a bound is defined, whether with a standard integer or one of the static constants, there is no way to keep the information to reuse it in the other functions. We canot know if -1 targets the last index with positive slicing or the index before the last index with negative slicing.

🤔Let's scratch our head a bit.
(scratching...)
(scratching...)

So if we need a way to keep that information, maybe we could define a wrapper structure around the bound value. Let's try that.
For usage simplicity, we will make this new structure implement ExpressibleByIntegerLiteral and Equatable. Fortunately, most of the requirements will be provided automatically by the compiler. We'll just have to implement the init(integerLiteral:) initialiser.

public extension Bounds {

    struct Bound: ExpressibleByIntegerLiteral, Equatable {
        var value: Int

        public init(integerLiteral value: Int) {
            self.value = value
        }

        public init(_ value: Int) {
            self.value = value
        }
    }
}

Now, let's find a way to add a specifity to a Bound object, like being the first or last element. The best solution I came up with was to add an optional private String identifier, and add a new private initialiser to set its value. This way, we will be able to define static constants inside the Bound struct and only inside it.

public extension Bounds {

    struct Bound: ExpressibleByIntegerLiteral, Equatable {
        public static let first = Bound(0, identifier: "first")
        public static let last = Bound(0, identifier: "last")

        var value: Int
        private(set) var identifier: String?

        public init(integerLiteral value: Int) {
            self.value = value
        }

        public init(_ value: Int) {
            self.value = value
        }

        private init(_ value: Int, identifier: String) {
            self.value = value
            self.identifier = identifier
        }
    }
}

In the above, the first and last properties can be used anywhere, while it is not possible create new static Bound constants with a non-nil identifier. Finally, the bound Bound(-1) will be different from Bound.last since Bound.last has a non-nil identifier property. That's exactly our goal!

The 0 value chosen to initalise the static first and last constants is almost arbitrary. For the first constant, this will allow us to easily set the lower bound regardless if was specified as first or with a 0 value. Regarding last, since we do not need to find a specific value anymore, it is totally arbitrary to choose 0 as its value.

With the above in place, let's tweak the Bounds structure a bit in order to use it.

public struct Bounds: Equatable {

  private let lower: Bound
  private let upper: Bound

  public init(lower: Bound, upper: Bound) {
      self.lower = lower
      self.upper = upper
  }

  public func range(lastValidIndex: Int) throws -> ClosedRange<Int> {
      let lower = self.lower.value < 0 ?
        // count from the end of the array
        lastValidIndex + self.lower.value :
        self.lower.value

      let upper: Int
      if self.upper == .last {
          upper = lastValidIndex
      } else if self.upper.value < 0 {
          // count from the end of the array
          upper = lastValidIndex + self.upper.value
      } else {
          upper = self.upper.value
      }

      guard 0 <= lower, lower <= upper, upper <= lastValidIndex else {
          // throw an error indicating that the bounds are not valid
      }
      return lower...upper
  }
}

Now, when we will specify a Bounds, it will look like this.

// target all the elements but the first one
let bounds = Bounds(lower: 1, upper: .last)

// target all the elements but the last one
let bounds = Bounds(lower: .first, upper: -1)

// target the last 4 elements
let bounds = Bounds(lower: -3, upper: .last)

So to get the last 3 element of an array, we could write:

let array = ["Riri", "Fifi", "Loulou", "Donald", "Daisy"]
let bounds = Bounds(lower: -2, upper: .last)
let range = try bounds.range(lastValidIndex: array.count - 1)
let slice = array[range]
// slice = ["Loulou", "Donald", "Daisy"]

Pretty simple, right? 😊

Custom array subscript

Finally, we can push the idea a bit futher by offering a custom array subscript. We will loose the ability to catch an error if the bounds are incorrect, but that's the way subscripting work. If you give it an incorrect value, it will exit with a preconditionFailure() or a fatalError().
To do so, we first have to implement the following extension.

public extension Array {

    subscript(_ bounds: Bounds) -> ArraySlice<Element> {
        do {
            let range = try bounds.range(lastValidIndex: count - 1)
            return self[range]
        } catch {
            preconditionFailure("Incorrect bounds: \(error). \(error.localizedDescription)")
        }
    }
}

This way, we can now subscript an array by providing a Bound. If the bounds are incorrect, we will get a preconditionFailure() with a relevant error message.

To take advantage of this new subscript, I think that a good approach is to define a custom operator on two Bounds.Bound operands, which returns a Bounds. The good news is that there is not much work to do.

infix operator ~>

public func ~> (lhs: Bounds.Bound, rhs: Bounds.Bound) -> Bounds {
    Bounds(lower: lhs, upper: rhs)
}

The choice of the ~> operator is arguably. You are free to use your own operator. Here, I chose one that does the trick but I would have preferred to use :. It does not seem to be possible, unfortunately.

In the end, here are some examples of how the overall implementation could be used.

let array = ["Riri", "Fifi", "Loulou", "Donald", "Daisy"]
print(array[-1 ~> .last]) // ["Donald", "Daisy"]
print(array[1 ~> -1]) // ["Fifi", "Loulou", "Donald"]
print(array[-2 ~> -1]) // ["Loulou", "Donald"]

If you want to use that in your own code, you can use the playground for this tip.

📡

I hope you found this tip useful! If so, feel free to share it on Twitter. If you want to reach out to me or to know when new posts are available, you can find me on Twitter. Also you can send me an email.