February 27, 2019

SE Radio 358: Probabilistic Data Structure for Big Data Problems

Andrii Gakhov, author of the book Probabilistic Data Structures and Algorithms for Big Data Applications, talks about probabilistic data structures (PDS) and their application to the big data domain. Host Robert Blumen spoke with Dr. Gakhov about how probabilistic data structures differ from their exact counterparts; cryptographic and non-cryptographic hash functions; space versus accuracy tradeoffs; space versus processing time tradeoffs; and the main problem domains: membership testing, cardinality, frequency, similarity, and rank. The discussion covers Bloom filters for membership testing, including their performance characteristics, use cases, implementation, and design patterns for lookup problems; LinearCount and HyperLogLog for cardinality, with use cases in web applications and implementation details; and CountMinSketch for frequency estimation. The episode closes with existing library support and the question of whether PDS should be taught in beginning courses.
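
For readers who have not met a Bloom filter before, the membership-testing idea mentioned above can be sketched roughly as follows. This is a minimal illustration, not code from the episode or the book: the class name `BloomFilter`, its parameters `capacity` and `error_rate`, and the choice of deriving indexes from a single SHA-256 digest are all assumptions made for this example. The sizing uses the standard formulas m = -n ln(p) / (ln 2)^2 for the number of bits and k = (m / n) ln 2 for the number of hash functions.

```python
import hashlib
import math


class BloomFilter:
    """Toy Bloom filter (hypothetical example): no false negatives,
    a tunable false-positive rate, and no support for deletion."""

    def __init__(self, capacity: int, error_rate: float = 0.01):
        # Bits needed for `capacity` items at the target false-positive rate.
        self.m = max(1, math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2))
        # Number of hash functions that minimizes the false-positive rate.
        self.k = max(1, round(self.m / capacity * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k indexes from two 64-bit halves of one digest.
        # A real implementation would likely use a faster non-cryptographic hash.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so steps vary
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


bf = BloomFilter(capacity=1000, error_rate=0.01)
for i in range(1000):
    bf.add(f"user-{i}")
```

A lookup such as `"user-42" in bf` always returns `True` for an item that was added, while an item that was never added usually returns `False` and occasionally returns `True`. That asymmetry is the space versus accuracy tradeoff: the filter above uses roughly 1.2 KB regardless of how long the stored strings are.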
