Filter Characters by Frequency
What is this problem about?
The Filter Characters by Frequency interview question asks you to process a string and return only those characters that appear with a specific frequency or within a specific range of frequencies. For example, you might be asked to return all characters that appear exactly twice, or all characters that appear more than k times.
Why is this asked in interviews?
This is a fundamental Hash Table coding problem used by opentext and other companies to test your basic data structure proficiency. It evaluates whether you can use a frequency map (dictionary) to store character counts and then iterate through that map to filter results. It's a basic building block for more complex string analysis tasks.
Algorithmic pattern used
The problem uses a Two-Pass Frequency Counting pattern.
- Pass 1: Use a Hash Map or a fixed-size array (if characters are limited to lowercase English) to count the occurrences of each character in the string.
- Pass 2: Iterate through the original string (to preserve order) or through the frequency map.
- Filter: Check if the frequency of the current character meets the requirement.
- Append valid characters to a result string.
Example explanation
String: "aabbcde", Filter: frequency > 1.
- Count characters:
{a: 2, b: 2, c: 1, d: 1, e: 1}.
- Iterate through string:
- 'a': count is 2 (> 1). Keep.
- 'b': count is 2 (> 1). Keep.
- 'c', 'd', 'e': count is 1. Discard.
Result:
"aabb".
Common mistakes candidates make
- Order Preservation: Returning characters in alphabetical order rather than their original string order (unless the problem specifies otherwise).
- Inefficient Search: Doing a nested loop to count frequencies for each character, resulting in O(N^2) time.
- Case Sensitivity: Not clarifying if 'A' and 'a' should be counted as the same character.
Interview preparation tip
Always use an array of size 26 or 128 if you know the character set is ASCII. It's faster and more memory-efficient than a Hash Map, although a Hash Map is more generic.