Home
Reading
Searching
Subscribe
Sponsors
Statistics
Posting
Contact
Spam
Lists
Links
About
Hosting
Filtering
Features Download
Marketing
Archives
FAQ
Blog
 
Gmane
From: =?utf-8?Q?J=C3=B6rg?= W Mittag <JoergWMittag+Usenet-EhRL+R9O3eV+rYvXrdT+RA <at> public.gmane.org>
Subject: Re: Bag/Multiset in scala and a touch of multimap
Newsgroups: gmane.comp.lang.scala.user
Date: Sunday 28th November 2010 03:17:41 UTC (over 6 years ago)
David Christiansen wrote:
> Is there a reason that Map[X, Int] isn't sufficient for your needs? It
> seems to cover the histogram need in any case, though it doesn't
> provide the set-like operations, of course.

Probably for the same reason that we use *any* abstraction instead of
a hand-rolled concrete implementation.

Personally, what I like about MultiSets is that they make some
algorithms completely disappear. Like in the histogram case, for
example. Sure, folding over the input and constructing the Map is a
one-liner, but the cool thing about a MultiSet is that it is a
*zero-liner*, since the MultiSet *is* already the answer.

You simply pass the input to the MultiSet constructor and you're done!

It's the ultimate example of how choosing the right data structure can
massively simplify your algorithms. In this case, the algorithm using
the MultiSet is (at least for some not very mathemetically strict
definition of infinity) actually infinitely simpler!

Say, I want to compute a histogram of letters in a word. The most
straightforward way would be something like

    word foldLeft(Map[Char, Int]())((m, c) => m updated(c, m.getOrElse(c,
0) + 1))

Although this one is much better IMHO:

    word groupBy(identity) map { case (c, cs) => (c, cs.size) }

But behold!

    MultiSet(word:_*)

Yep. That's it. That is the *entire* "algorithm" for computing
histograms.

jwm

PS:  Of course, all the *work* is still there, in the constructor of
MultiSet, but *I* neither have to write nor (more importantly) read
and maintain that.
 
CD: 2ms