(A new question of the week)
A few months ago, I wrote about Ranking a Word Among Its Permutations, that is, finding where a word would be found in an ordered list of all possible “words” made by permuting its letters. The problem in general requires a (sometimes lengthy) algorithm. A month or so later, we received a question that turned out to be a special case of that problem – simple enough, it turned out, to be able to write a formula for the answer.
Specifying the problem
Here is the September question, from Rarkyan:
Hello,
I want to create a pattern of sequences using combination formula:
nCr = n! / r! * (n – r)!
For example: I have 5 blank zeroes: 00000
I want to know how many patterns are formed if I fill with 3 symbols (in this case I’m using “1” as symbol).Using combination formula:
5C3 = 5! / 3! * (5 – 3)! = 10 patterns
This is the results:
a. 00111, b. 01011, c. 01101, d. 01110, e. 10011, f. 10101, g. 10110, h. 11001, i. 11010, j. 11100
As we can see on the ID “a” the “1” symbol from left to right is located in 3rd, 4th, and 5th places: 00111
On ID “b” the “1” symbol is located in 2nd, 4th, and 5th places: 01011, etc.
The question is:
1. Is there a formula for knowing the order of symbols?
2. On what ID do the patterns take the following sequence:
3. The symbols is in position : 2,3,4
4. The symbols is in position : 1,2,4, etc
Thanks
Rarkyan has listed the \(_5C_3 = \frac{5!}{3!(5-3)!} = 10\) permutations of “00111” in ascending order as binary numbers, and wants to find the index in this list of any particular number, given the locations of its three 1’s, in order to use this index as an identifier for the permutation.
I responded with a reference to the August post on the subject:
Hi, Rarkyan.
As I understand it, you are considering a list in lexical order of all sequences consisting of n symbols, of which r are 1 and n – r are 0. That is, they are n-digit binary numbers with r 1’s, arranged in numerical order. You want to find the rank (position in the list) of a particular binary number.
This is a relatively simple instance of the question posed here:
I discussed this in our blog at
Ranking a Word Among Its Permutations
Here, we’re thinking of each number as a permutation of 11100, in your example.
As you’ll see, this probably can’t be written as a formula taking the positions of the 1’s as variables, but rather as an algorithm that will take more steps as n increases. It might be interesting to consider whether this could be written somehow as, say, a series with n terms (or maybe r terms).
I hope the algorithm will meet your needs, either directly or as a start in thinking about a specific method for your case.
Knowing the general problem to be complex, I doubted that even this simple case could be put into a mere formula, but was open to the possibility of at least making a simpler algorithm. Knowing nothing about Rarkyan’s mathematical or programming skills, I left it there initially.
He replied:
Thank you for the reply Doctor Peterson.
Yes I would like to know if there is a formula for determining lexical order or maybe other way than lexical order. Where in the case that I experienced, actually I just wanted to know at what ID the symbol order of 2,4,5 will be formed etc. Thus I can find out the formation of a pattern using only the formula, so I don’t need to generate all of the pattern sequences.
The given links I am still unable to understand. (I’m sorry, I’m bad in English also I’m not mathematician.)
By using my sample of using 5 zeroes and 3 symbols (using “1” as symbol) could you give me a detailed example to determine or predict that pattern 01101 will occur at ID ……, etc
Thank you
The phrase “or maybe other way than lexical order” suggested that the requirement might be more flexible than the original request; perhaps all that was needed was some numerical identifier for any given arrangement.
To give myself time to look into the request, while hoping the need might be simpler, I answered:
I’ll see whether I can find a simpler way to describe the process for your specific case. That may take a while, because I am busy at the moment; possibly another of us will have ideas while I get my other work done.
But it’s also possible that you don’t really need that. Perhaps if you explain what you want to do with this (that is, your larger goal), it will either help us see an easier way to meet your real need, or will reveal that you need something more general. (The fact that you don’t necessarily need lexical order is the clue that something else might be sufficient.) Often context makes a big difference in the answer to a question.
Further discussion (which I will omit, as it didn’t contribute to the final answer) showed that Rarkyan would accept any reproducible ordering, but ultimately wanted to give consecutive names to each arrangement (by letter, which could be obtained from indexes), and the original request was indeed the most reasonable way to meet the goal.
Creating a formula
This discussion gave me five days over which to occasionally ponder the problem, and then I was able to reply:
I think I have the answer now; I had to gradually convince myself that it would be reasonably simple, then wait until I felt free to sit down and experiment with it. It’s actually rather pretty.
I took the idea from the pages I showed you at first, which involves taking each “letter” in a “word” and counting the number of ways to make “words” of lower value using the remaining places. In this problem, there are only two different “letters”, 0 and 1, which makes this as much simpler than the general problem as binary arithmetic is simpler than base 10 (or more so).
Sometimes the key to solving a problem is to first be convinced that it is solvable! This is one reason mathematicians like “existence proofs”, which without showing how to solve a problem, tell us that the effort will not be wasted. I had been convinced otherwise at first.
Start with an example
I stated my plan for developing the formula:
Let’s take an example, then I’ll state the formula for our case of 5 places and 3 ones, then generalize that to any number of ones, and finally to any size at all.
First, some notation. I’ll call the number of places n (5 here) and the number of ones r (3 here). So there are nCr [= 5C3 = 10] possible arrangements. Then I’ll call the locations of the ones d1, d2, d3 (and so on up to dr, if r > 3). The places are numbered 1 through n (5), from the left.
This is a good plan for many problems of this sort: first work with a specific, simple example, and generalize it step by step, rather than jumping into the full complexity of the problem all at once. This is also a good way to teach, or in this case to communicate the ideas.
Defining variables and other notation is another important step in problem-solving. Here, for \(r = 3\), we will be creating a function \(i = f_3(n, d_1, d_2, d_3)\) that produces the desired index (identifier), with the hope of being able to express a function of a variable number of arguments eventually. In the original example, we want to find that \(f_3(5, 1, 3, 4) = 7\) corresponding to 10110 being in the 7th position in the list, called “g”.
Now I applied the algorithm from the earlier post to this example:
Now, suppose our pattern is 10110, so that d1=1, d2=3, and d3=4. We want to count patterns that precede this:
How many are before 10000, that is, have pattern 0****, with 3 ones among 4 places? Answer: 4C3 = 4.
How many are from there to 10100, that is, have pattern 100**, with 2 ones among 2 places? Answer: 2C2 = 1.
How many are from there to 10110, that is, have pattern 1000*, with 1 ones among 1 place? Answer: 1C1 = 1.
Then our pattern is in the next position, (4+1+1)+1 = 7th.
Specifically, the numbers before 10000 (which has a 1 in place 1) must have the form 0 _ _ _ _, so they are 00111, 01011, 01101, and 01110, the permutations of 0111 (4 places, 3 chosen to hold a 1: 4 of them) in the last 4 places. The 1 number from 10000 to 10100 (which has its second 1 in place 3) must have the form 1 0 0 _ _, so it is 10011, the permutation of 11 (2 of 2 chosen to hold a 1: only 1 possible). And so on.
To check, here is our list:
1: 00111
2: 01011
3: 01101
4: 01110
5: 10011
6: 10101
7: 10110
8: 11001
9: 11010
10: 11100
The calculation was 4C3 + 2C2 + 1C1 + 1, where the first number in each combination is the number of remaining places (5-1, 5-3, 5-4), and the second is the number of remaining ones (3, 2, 1).
In more proper notation, we did \(_4C_3 + _2C_2 + _1C_1 + 1 = 4 + 1 + 1 + 1 = 7\), or \({4\choose 3} + {2\choose 2} + {1\choose 1} + 1\).
Replace numbers with variables (d1, d2, d3)
Now we want to turn that into a formula. By considering where each number in the calculation came from, I can do that:
Now, let’s generalize, keeping n=5 and r=3 but using variables for the three places:
How many have their first 1 in place d1, that is, have no ones before that place? There must be 3 ones among the remaining 5-d1 places. Answer: (5-d1)C3.
How many have their next 1 in place d2, that is, have no more ones before that place? There must be 2 ones among the remaining 5-d2 places. Answer: (5-d2)C2.
How many have their next 1 in place d3, that is, have no more ones before that place? There must be 1 one among the remaining 5-d3 places. Answer: (5-d3)C1.
So our pattern’s position is (5-d1)C3 + (5-d2)C2 + (5-d3)C1 + 1.
So our formula is \({{5 – d_1}\choose 3} + {{5 – d_2}\choose 2} + {{5 – d_3}\choose 1} + 1\).
Once I have a formula, I have to make sure it is correct in general, so applying it to another example is a good idea:
Let’s try another example, say 11001. Here d1=1, d2=2, and d3=5. The formula gives
(5-1)C3 + (5-2)C2 + (5-5)C1 + 1 = 4C3 + 3C2 + 0C1 + 1 = 4 + 3 + 0 + 1 = 8, which is correct.
That last term was odd, but it worked!
Because the third 1 was in the last place, I had to use \({0\choose 1} = 0\), meaning that there are no ways to choose 1 item from 0! This is a fact that is not regularly taught, but is reasonable, and necessary in problems like this. One source that mentions this case is Wikipedia, which says, “one might extend the definition beyond the above boundaries to include \({n\choose k} = 0\) when either k > n or k < 0.”
Since having a 1 in the last place was a good extreme test of the formula, I tried the other extreme, having all 1’s at the start:
How about an extreme 11100. Here d1=1, d2=2, and d3=3. The formula gives
(5-1)C3 + (5-2)C2 + (5-3)C1 + 1 = 4C3 + 3C2 + 2C1 + 1 = 4 + 3 + 2 + 1 = 10, which is correct again.
That gave us no trouble at all.
Generalize the formula (letting n, r vary too)
It’s time to generalize to other values of r, and even of n:
Now, if n is something other than 5, we just replace 5 with n; and for a different r, we will have a different number of terms, but they will follow the same pattern. Our formula becomes
[(n-d1)C(r-0) + (n-d2)C(r-1) + (n-d3)C(r-2) + … ]+ 1
Taking the index of each one as k and writing this as a summation, we have
[Σ{k=1 to r} (n-dk)C(r-k+1)] + 1
That’s our answer.
Written properly, the formula is $$\left[{{n – d_1}\choose {r – 0}} + {{n – d_2}\choose {r – 1}} + {{n – d_3}\choose {r – 2}} + \dots\right] + 1$$ or $$\sum_{k=1}^{r}{{n – d_k}\choose {r – k+1}} + 1$$
Do a final check
Now it’s time to check the complete formula, using different values for n and r, but still keeping the numbers small enough to manage:
Let’s try a slightly bigger example. I’ll take n=6, r=2, so our list has 6C2 = 15 entries:
1: 000011
2: 000101
3: 000110
4: 001001
5: 001010
6: 001100
7: 010001
8: 010010
9: 010100
10: 011000
11: 100001
12: 100010
13: 100100
14: 101000
15: 110000
Now, where in the list is 010100? We have d1=2 and d2=4, so the formula gives
[Σ{k=1 to 2} (6-dk)C(2-k+1)] + 1 =
(6-d1)C(2-1+1) + (6-d2)C(2-2+1) + 1 =
(6-2)C2 + (6-4)C1 + 1 = 4C2 + 2C1+ 1 = 6 + 2 + 1 = 9
And that’s correct.
That was fun.
Rarkyan answered:
Wooow hahaha amazing Doctor Peterson.
Many many thanks for your help. It’s really amazing.
Yes, that’s the formula I’m looking for. I think that formula fits in lexical order.
It seems like you also enjoy the process of finding the formula 🙂
Once again thank you.
Note:
I will bother you again if I’m stuck on another math problem :p
He was right:
Yes, I enjoyed the final result, after expecting not to be able to make a formula at all. Our purpose is to help others solve problems on their own, but occasionally there’s a problem that’s just interesting to solve in full.