Abstract
In 2022, Chakraborty et al. published a streaming algorithm (henceforth, the CVM algorithm) for the distinct elements problem, that deviated considerably from the state-of-the art, due to its simplicity and avoidance of standard derandomization techniques, while still maintaining a close to optimal logarithmic space complexity.
In this entry, we verify the CVM algorithm's correctness using a new technique which simplifies the analysis considerably compared to the orignal proof by Chakraborty et al. The main idea is based on a probabilistic invariant that allows us to derive concentration bounds using the Cramér-Chernoff method.
This new technique opens up the possible algorithm design space, and we introduce a new variant of the CVM algorithm, that is total, and also has an additional property in addition to concentration: unbiasedness. This means the expected result of the algorithm is exactly equal to the desired result. The latter is also a new property, that neither the original CVM algorithm nor classic algorithms for the distinct elements problem possess.