pocketaces
Joined: Nov 11, 2009
• Posts: 158
November 11th, 2009 at 3:17:04 PM permalink
First, love the new site! Fantastic job.

My question has to do with the governor's ridiculous but predictable response. The governor stated that it was just a 'wild coincidence'. Notwithstanding the overwhelming circumstantial evidence (the bill's sponsor and letter addressee was the person who had hurled insults at the governor a week earlier), do you have an estimate of the odds of an exactly seven-line letter spelling this phrase by chance? I think that, taking into account the letters used, it will be even more improbable than just assigning a 1 in 26 chance to each. It doesn't seem like U, Y, and especially K are common word-starting letters.

Thanks!
pocketaces
Joined: Nov 11, 2009
• Posts: 158
November 11th, 2009 at 3:24:07 PM permalink
Definitely should have posted this under math or general questions... sorry about that one
Wizard
Joined: Oct 14, 2009
• Posts: 22083
November 11th, 2009 at 4:23:54 PM permalink
For the benefit of other readers, I discuss this hidden message in my Oct. 31, 2009 newsletter.

A good place to start would be the frequency of each letter in the English language. Wikipedia has a nice table on that. Granted, the frequency of the first letter per word may not be the same as the frequency of all letters, but we're just making an educated guess here.

Using the Wikipedia table, the probability of seven random letters spelling out the governor of California's secret message would be prob(F) * prob(U) * prob(C) * prob(K) * prob(Y) * prob(O) * prob(U) = 0.02228 * 0.02758 * 0.02782 * 0.00772 * 0.01974 * 0.07507 * 0.02758, which is about 5.39 * 10^-12, or 1 in 185 billion! To that, I say "Nice try Arnold, but I don't believe you."
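
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the same multiplication. The six frequencies are the ones quoted above from the Wikipedia table; the variable names are just illustrative, and the calculation keeps the same simplifying assumption that each line's first letter is drawn independently from overall letter frequencies.

```python
from math import prod

# Overall English letter frequencies, as quoted in the post above
# (taken from the Wikipedia table; only the letters we need).
letter_freq = {
    "F": 0.02228,
    "U": 0.02758,
    "C": 0.02782,
    "K": 0.00772,
    "Y": 0.01974,
    "O": 0.07507,
}

# First letters of the seven lines, treated as independent draws.
message = "FUCKYOU"

p = prod(letter_freq[ch] for ch in message)
odds = 1 / p

print(f"probability = {p:.3e}, or about 1 in {odds:,.0f}")
```

The result should land near the 1 in 185 billion figure above; swapping in a different frequency table only means changing the dictionary.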
It's not whether you win or lose; it's whether or not you had a good bet.
teliot
Joined: Oct 19, 2009
• Posts: 1968
November 11th, 2009 at 5:35:03 PM permalink
That's pretty funny, Arnold.

Just for fun I downloaded TWL06 (official Scrabble dictionary) and counted the first letter of each word in the dictionary to get more accurate probabilities:

Based on this, the probability is 1 in 270,556,053,505.
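
The counting step can be sketched roughly as below. This is not the script actually used; `first_letter_frequencies` is a hypothetical helper, and the eight-word `sample` list is only a stand-in for the real TWL06 word list, which would have to be loaded from a file.

```python
from collections import Counter

def first_letter_frequencies(words):
    """Return {letter: (count, share)} for the first letter of each word."""
    counts = Counter(w[0].upper() for w in words if w and w[0].isalpha())
    total = sum(counts.values())
    return {letter: (n, n / total) for letter, n in sorted(counts.items())}

# Tiny illustrative word list, NOT the real dictionary.
sample = ["kick", "can", "alley", "street", "punt", "ball", "field", "key"]

freqs = first_letter_frequencies(sample)
for letter, (n, share) in freqs.items():
    print(letter, f"{share:.5f}", n)
```

Feeding it the full dictionary, one word per entry, would give a letter, share, and count for each first letter.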

Here are the stats from TWL06 (I could also do SOWPODS). My results are very different from those posted on Wikipedia for the first letter. I suppose that the Scrabble dictionary TWL06 doesn't count as the "real" dictionary for purposes of counting. The Wikipedia reference for first word-letters is "Calculated from 'Project Gutenberg Selections' available from the NLTK Corpora." I suppose one should really just look at Arnold's rather limited vocabulary and compute the probabilities for that (roughly) 8k-10k word set. I contend that Arnold has never before used the phrase "Kicks the can down the alley." Who says that? Why an alley and not a street? And why not "punts the ball down the field?" Maybe "Fucp You" didn't sound as good.

I can hear Arnold ... "I need a 'k' word Maria ..."

Oh well, I should just go and watch Jeopardy ...

Letter Frequency Count
A 0.06028 10771
B 0.05581 9973
C 0.09283 16588
D 0.05821 10402
E 0.04020 7184
F 0.03995 7138
G 0.03287 5874
H 0.03636 6498
I 0.03709 6628
J 0.00828 1480
K 0.01039 1857
L 0.02976 5317
M 0.05568 9949
N 0.02493 4454
O 0.03390 6057
P 0.08431 15066
Q 0.00476 850
R 0.05886 10518
S 0.11043 19732
T 0.05037 9000
U 0.02927 5231
V 0.01600 2859
W 0.02195 3922
X 0.00085 152
Y 0.00330 590
Z 0.00336 601
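
As a rough check, the word counts in the table above can be multiplied out directly. This is a minimal sketch, not the original calculation: the counts are copied from the table, the total is just their sum, and the variable names are illustrative.

```python
from math import prod

# Number of TWL06 words starting with each letter (from the table above).
counts = {
    "A": 10771, "B": 9973, "C": 16588, "D": 10402, "E": 7184, "F": 7138,
    "G": 5874, "H": 6498, "I": 6628, "J": 1480, "K": 1857, "L": 5317,
    "M": 9949, "N": 4454, "O": 6057, "P": 15066, "Q": 850, "R": 10518,
    "S": 19732, "T": 9000, "U": 5231, "V": 2859, "W": 3922, "X": 152,
    "Y": 590, "Z": 601,
}
total_words = sum(counts.values())

# Treat each line's first letter as an independent draw from the
# dictionary's first-letter distribution (an assumption, as in the thread).
message = "FUCKYOU"
p = prod(counts[ch] / total_words for ch in message)
odds = 1 / p

print(f"{total_words:,} words; about 1 in {odds:,.0f}")
```

Using the raw counts rather than the rounded five-decimal frequencies should bring the result close to the 1 in 270 billion figure quoted earlier in this post.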