Tuesday, July 24, 2007

Cupcake

http://mostlikelytosuck.blogspot.com/

seriously, SOME of you really need to come up with some less defeatist blog names!!! kids these days...

13 comments:

kangway said...

IS THAT DAVID CUPCAKE ROSEN?!!??!

kangway said...

But you're right. I mean, look at the trend, and my trend I mean the two examples provided: Garrett and Dave. I guess they're roommates this summer, but "runhardloseanyways" and "mostlikelytosuck" are pretty pessimistic.

Markkimarkkonnen said...

thank goodness cupcake is here. rosen figure this out for me:
with replacement, makes x draws from a bag of y balls. if x>y, what is the probability that each ball was drawn at least once?

there are y^x possible sequences of drawing the balls, but I don't know how to count how many of those have all the balls in them.

PS - hi

Ian said...

That sounds like a goddamn AIME question. The strategy is always something about pigeons and holes...

Dave said...

Yo Mark,

I've got a solution to this problem, but I don't have the time to fully explain it at the moment (I've got jury duty tomorrow, so I gotta hit the sack). Also, it requires some notation that can't really be expressed in the comment box here. I'll find some way of hosting the solution as a .pdf on my server tomorrow, and I'll put up a link to it here so that the entire Caltech XC community can partake of the joys of combinatorics :-).

Markkimarkkonnen said...

sweet, thanks dude. the original problem actually came up during dinner conversation, and rephrased to make sense out of context was,
"everyone in the world has a telephone and the telephone number of everyone else. a single person is seeded with a rumor. he calls two other people at random and tells them the rumor. they then call two people each at random, etc. what is the expectation value of the number of cycles necessary for everyone in the world to hear the rumor?"

Garrett said...

Hey Dave and I share our own little quirks as runners that unique allow us to have such names. For instance, Dave is fast, but spends his summer losing at robotics competitions instead of putting his running talent to use. I, on the other hand, have little talent, but train as though I have some hope of success. Hence our light-hearted blog names have a surprising amount of accuracy.

Dave said...

Ok, for your viewing pleasure, the solution to Mark's original question is available at http://bag.caltech.edu/documents/MarkProblem.pdf

If you're unfamiliar with multinomial coefficients or the Multinomial Distribution, Wikipedia has excellent articles on both.

Markkimarkkonnen said...

damn, that shit is HOT. i took an entire 100-level class on probability and stats and i didn't even know of such a distribution. obviously, i was aware that such a distribution must exist as a solution to the problem, but had never seen it before.

Markkimarkkonnen said...

although, now that i think about it a little more, i'm sorry to say that it didn't really help.

in fact, I had already figured out the answer given. what i don't know how to do is take "the summation evaluated all sequences" which satisfy the condition k1+k2+...kb = d

for example, how would i numerically evaluate the expression for k=6*10^9 and d=2^35?

Dave said...

"for example, how would i numerically evaluate the expression for k=6*10^9 and d=2^35?"

Hmm... Well, first let start by copping out. Try to keep in mind that I'm a mathematician, so my responsibilities only extend as far as saying "a solution exists", and sometimes (if I'm not too lazy) coming up with a complicated formula for the answer that looks good enough for someone to give me money for it. Actually doing USEFUL things (like calculating numbers) isn't really my field...

One way that you can try to simplify the problem is by considering a sequence to be a set equipped with an additional ordering. We have the advantage in this case that all of our sequences will have the same length. Instead of considering all sequences of positive integers of length b whose sum is d, we can simply consider NONDECREASING sequences of positive integers of length b whose sum is d; once we have found such a sequence, we look at the multinomial probability for that number, and multiply by b! (the number of ways of permuting b distinct elements) to capture all of the other sequences that utilize that same set of numbers (thus reducing the computational complexity of the problem by many, many orders of magnitude).

However, the basic difficulty that you're running up against here is that you're looking at a combinatorial problem involving huge numbers. I know that this frequently comes up in quantum physics and thermodynamics when considering large ensembles of particles, so I think the physicists may have a leg up when it comes to developing techniques to approximate these kinds of combinatorial problems (for example, Stirling's Approximation comes to mind). In any case, we've reduced to the case where we need only find distinct SETS of b positive integers whose sum is d.

I haven't looked in to this yet, but the first method of attack that comes to mind for trying to find such sets would be some kind of recurrence relation. I say this because the base case is trivial; if b = d, then clearly, every number in our set of b positive integers must be 1. If d = b + 1, then exactly one of the numbers must be 2, etc. You can see based on these examples how although calculating the possible sets directly might be difficult, it might not be so bad if you consider a recurrence relation (a recursive mathematical relation, like the Fibonacci numbers). Sometimes, if you're lucky, it turns out that the equation that specifies the recursion can actually be solved explicitly for the nth term, which then gives a closed-form solution (this approach can actually be done in the case of the Fibonacci numbers).

More generally, if I were going to do this problem myself (and the numbers were smaller) I would simply write a computer program that would just exhaustively calculate every such possible set of positive integers and then evaluate the multinomial probability. But again, you run into the whole computational complexity problem. I don't know if there is some clever way you'll be able to get around this. Katherine' a computer scientist, and they spend a lot of time worrying about "trivial" shit like this. You might consider asking her for some ideas....

Dave said...

Sorry it wasn't quite as helpful as you would have liked...

Markkimarkkonnen said...

it's cool. i'm pretty sure the answer to the telephone problem is 42