[MS] Rotation revisited: Cycle decomposition in clang's libcxx - devamazonaws.blogspot.com
We got distracted by the rotation algorithm in gcc's libstdc++, but let's get back to the cycle decomposition algorithm in clang's libcxx.
The implementation in clang's libcxx performs the minimum number of swaps, roughly n/2, where n is the total number of elements. It does so by viewing the rotation as a permutation and walking through each of the cycles.
For notational convenience, let a be |A| and n be |A| + |B| (the total number of elements). The number of cycles is gcd(a, b), and the k'th cycle consists of the elements starting at first + k, and then stepping to the next element by moving forward another a elements, with wraparound, until you return back to the starting point.
For example, if you have |A| = 4 and |B| = 6, then the cycle that starts at A1 takes 4 steps forward to continues to B1; takes another 4 steps forward to B5; then takes 2 steps forward, wraps around, and then two more steps forward, landing on A3; then takes 4 steps forward to B3; and then takes 4 steps forward and wraps around to A1, which is the starting point.
| A1 | A2 | A3 | A4 | B1 | B2 | B3 | B4 | B5 | B6 |
| ↳ | → | → | → | ↴ | |||||
| A1 | A2 | A3 | A4 | B1 | B2 | B3 | B4 | B5 | B6 |
| ↳ | → | → | → | ↴ | |||||
| A1 | A2 | A3 | A4 | B1 | B2 | B3 | B4 | B5 | B6 |
| → | → | ↴ | ↳ | → | |||||
| A1 | A2 | A3 | A4 | B1 | B2 | B3 | B4 | B5 | B6 |
| ↳ | → | → | → | ↴ | |||||
| A1 | A2 | A3 | A4 | B1 | B2 | B3 | B4 | B5 | B6 |
| ↴ | ↳ | → | → | → | |||||
| A1 | A2 | A3 | A4 | B1 | B2 | B3 | B4 | B5 | B6 |
There's another cycle that starts at A2 and continues to B2, B6, A4, B4, then back to A2.
Now, we've been counting swaps, but a single-element rotation is not done as a sequence of swaps, but rather by picking up the first element, sliding all the other elements over, and then putting the original first element at the end. I've been informally calling an assignment "half of a swap", though a swap is really a constructor, two assignments, and a destructor. But let's stick with the "half a swap" accounting fiction.
The rotation algorithm goes like this:
auto a = std::distance(first, mid); // number of "A" elements
auto n = std::distance(first, last); // total elements
auto g = gcd(a, n); // number of cycles
for (auto k = 0; k < g; ++k) {
// Rotate the elements in the cycle starting at k
auto save = std::move(first[k]);
auto i, next = k;
while (i = next, next = (i + a) % n, next != k) {
first[i] = std::move(first[next]);
}
first[i] = std::move(save);
}
For example, if rotating A1, A2, B1, B2, B3, B4, there are two cycles: A1, B1, B3; and A2, B2, B4. The elements within each cycle rotate one position.
| ⮣ | → | → | → | → | → | ⮧ |
| ⮤ | ← | ← | ⮠ | |||
| A1 | A2 | B1 | B2 | B3 | B4 | |
| ⮦ | ← | ← | ⮢ | |||
| ⮡ | → | → | → | → | → | ⮥ |
And when you're done with all the cycles, you've rotated the entire A and B blocks.
| B1 | B2 | B3 | B4 | A1 | A2 |
This performs n/2 swaps, which is the fewest swaps of all the algorithms we've looked at so far. However, it has terrible locality because the elements in the cycle are all spread out.
Calculating the greated common divisor of two numbers can be done in O(log n) steps via Euclid's algorithm.
int gcd(int a, int b)
{
do {
auto r = a % b;
a = b;
b = r;
} while (r);
return a;
}
Commenter Brent thought that the cycle decomposition algorithm was obvious. Of course, the trick is the step they called "Repeat". How many times do you repeat?
The clang libcxx algorithm calculates the number of repeats by taking the gcd. But there's a trick so we don't have to calculated it at all. We'll look at that trick next time.
Bonus chatter: I think it's interesting that of the three major implementations of the C++ standard library, each one uses a different rotation algorithm when given random-access iterators!
Post Updated on June 4, 2026 at 03:00PM
Thanks for reading
from devamazonaws.blogspot.com
Comments
Post a Comment