Count The Repetitions
What is this problem about?
The Count The Repetitions interview question is a "Hard" string problem. You are given two strings s1 and s2, and two integers n1 and n2. Let S1=[s1,n1] (the string s1 repeated n1 times) and S2=[s2,n2] (the string s2 repeated n2 times).
You need to find the maximum integer M such that [S2,M] (the string S2 repeated M times) can be obtained from S1 by deleting some characters.
Why is this asked in interviews?
Google uses this to test a candidate's ability to find cycles in periodic data. S1 is extremely long, so you cannot construct it. You must simulate matching s2 characters against s1 and realize that because s1 and s2 are finite, the state (the index in s2 after a full s1 block) will eventually repeat. Finding this cycle allows you to jump ahead and solve the problem in O(∣s1∣imes∣s2∣) rather than O(n1).
Algorithmic pattern used
The problem uses Cycle Detection and Dynamic Programming (State Tracking).
- Maintain a pointer
j for s2.
- Iterate through s1 blocks one by one (up to n1).
- For each s1 block, count how many characters of s2 you can match.
- Record the state after each s1 block:
(index_in_s2, s2_blocks_completed).
- If you see the same
index_in_s2 twice, a cycle has been found.
- Calculate how many s2 blocks occur in the cycle and how many cycles fit in the remaining n1 blocks.
- Add the "pre-cycle" and "post-cycle" matches to get the total.
- Divide total s2 matches by n2 to find M.
Example explanation
s1="ab",n1=4,s2="ba",n2=1.
- Block 1 of s1 ("ab"): Matches 'b' of "ba".
j=1.
- Block 2 of s1 ("ab"): Matches 'a' of "ba". Completed 1 "ba".
j=0.
- State (0) repeated! Cycle is 2 blocks of s1 containing 1 block of s2.
- Total s1 blocks = 4. Total s2 blocks = 4/2=2.
- M=2/1=2.
Common mistakes candidates make
- Construction: Attempting to build the large string S1, which leads to memory overflow.
- Index management: Getting the cycle length or the number of blocks within the cycle wrong.
- Off-by-one: Miscalculating the total matches when the cycle doesn't end perfectly.
Interview preparation tip
In problems involving large repetitions, look for the Pigeonhole Principle. Since there are only |s2| possible starting indices in s2 when beginning a new s1 block, you are guaranteed to find a cycle within at most |s2| + 1 iterations.