Mathematics desk

Welcome to the Wikipedia Mathematics Reference Desk Archives

April 4

John Marlowe and the other John Marlowe

Being stuck indoors (that's my excuse; I probably would have done this anyway), I chose to watch a couple of old b/w movies on TV today, ones I'd never seen before.

Trent's Last Case (1952) had a major character, played by John McCallum, named John Marlowe.
Then came State Secret (1950), whose major character, played by Douglas Fairbanks Jr., was ... John Marlowe.

What a weird coincidence, I said to myself. So, naturally, I got to wondering how likely this would have been, assuming the films weren't chosen for broadcast deliberately because of the names of the characters, which I think would be extremely unlikely.

I don't know what sorts of assumptions one would need to make to have a stab at this, but let me phrase it thus:

How likely would it be that two films chosen more-or-less randomly would have major characters with identical names? I guess one could narrow it down to English-language films, British films, black and white films, films made in the early 1950s, etc. -- Jack of Oz ^{[pleasantries]} 07:50, 4 April 2020 (UTC)
Was The Horse Soldiers also shown by any chance? In this 1959 war film (not b/w) John Wayne's character is one Colonel John Marlowe. --Lambiam 11:05, 4 April 2020 (UTC)
Not today. -- Jack of Oz ^{[pleasantries]} 11:26, 4 April 2020 (UTC)

Our article Coincidence notes that "[f]rom a statistical perspective, coincidences are inevitable". I dare to state that they are even more inevitable when you are getting bored. In case you are fed up watching old movies, here are three not too long reads about the odds of coincidences:

"Coincidences: What are the chances of them happening?" (BBC Future)
"Coincidences and the Meaning of Life" (The Atlantic)
"The strangest coincidences of your life probably aren’t that strange at all" (The Washington Post)

--Lambiam 13:56, 4 April 2020 (UTC)

There's no way of knowing for sure, but the writers of both movies might have been influenced by the Raymond Chandler character Philip Marlowe, especially since the radio program The Adventures of Philip Marlowe was airing at about the same time. There's also the Joseph Conrad character Charles Marlow; perhaps not as well known at the time but a professional writer worthy of the title would be familiar with him. Put that together with the fact that John is a very common first name in English and it's not too surprising that there would be two movies from the early 50's with characters named John Marlowe. The real coincidence is that you happened to pick those two same movies as a double feature, but as pointed out above such coincidences aren't always as unlikely as they may seem. You might be interested in the Stanisław Lem novel The Chain of Chance, which explores the nature of coincidence in the guise of a futuristic detective story. --RDBury (talk) 19:12, 4 April 2020 (UTC)

A database with film character names would not be of help in getting a precise value unless we also know the likelihood of a pair of films being chosen in succession. It is not very likely that a channel will programme The Texas Chain Saw Massacre to follow a broadcasting of The Sound of Music. But Earth vs. the Flying Saucers, although rarely shown, is more likely to be shown right after The Day the Earth Stood Still than after most other flicks. Below I follow an entirely different "armchair statistics" approach. I would not dream of submitting this to a peer-reviewed journal – in real life I have a reputation to uphold.

OK, here we go. Assume that the screenwriter (or book author if the film is adapted from a book) creates a character's name by picking the given name of someone reminiscent of the character and the surname of someone else also reminiscent of the character. So for a serial killer they might combine Leonard Fraser with Alexander Pearce to name a character "Leonard Pearce". There is a non-zero chance that this procedure results in a name that must be rejected for obvious reasons, such as "Tony Abbott", but I think this can be disregarded, as the chance is still fairly small. I believe that any particular name is about as likely to be given to a serial killer as to a bookkeeper, so we can disregard the character of the character. I'll confine myself, though, to English-language male names. Not all names have an equal prevalence. Let us assume that both given names and surnames independently follow (the simplest case of) Zipf's law. While this assumption is not founded on evidence, it is not unreasonable as an approximation.

Before moving on to applying this model to the question, let us first examine a more general question. Given is a discrete probability distribution over a set of $N$ items, numbered $1$ through $N$, where the probability (relative frequency) of the $i$-th item is denoted by $p_{i}$. Consider a pair of random draws (with replacement) according to the given distribution from these $N$ items. If the first one drawn is item $i$, the probability that the second draw yields the same item equals $p_{i}$. To find the overall probability of a matching pair, we need to take the weighted sum, where the weights are the probabilities of the first item. This results in

$$P_{\mathrm{match}}=\sum_{i=1}^{N}p_{i}^{2}.$$
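As a quick check of the sum-of-squares formula, here is a minimal Python sketch. The function name `match_probability` is my own and not part of the discussion, and it assumes the input is a finite list of probabilities that sum to 1.

```python
import math

def match_probability(probs):
    """Probability that two independent draws from `probs` yield the same item."""
    # Guard against input that is not a probability distribution.
    if not math.isclose(sum(probs), 1.0, rel_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    # Weighted sum: P(first = i) * P(second = i), summed over all items i.
    return sum(p * p for p in probs)

# A fair six-sided die: two rolls match with probability 1/6.
print(match_probability([1 / 6] * 6))

# A skewed distribution over three items.
print(match_probability([0.5, 0.3, 0.2]))
```

For a uniform distribution the result reduces to 1/N, and any skew in the probabilities pushes it higher, which is why common names matter so much in what follows.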

Zipf's law corresponds to the distribution given by

$$p_{i}=(i\cdot H_{N})^{-1},$$

in which the notation $H_{n}$ denotes the $n$-th harmonic number, so that the probabilities sum up to $1$ as they should. Let $M$ be the number of given names and $N$ the number of surnames, so that there are $M\cdot N$ given-name–surname combinations in total. Each name can be indexed by a pair $(i,j)$ and then has probability $p_{i,j}=(i\cdot H_{M}\cdot j\cdot H_{N})^{-1}$. Now we find

$$P_{\mathrm{match}}=\sum_{i,j}p_{i,j}^{2}=\sum_{i,j}(i\cdot H_{M}\cdot j\cdot H_{N})^{-2}=\left(\sum_{i=1}^{M}i^{-2}\right)\cdot\left(\sum_{j=1}^{N}j^{-2}\right)\cdot(H_{M}\cdot H_{N})^{-2}.$$

The two sums are partial sums of a convergent series with limit $\frac{\pi^{2}}{6}$ (for which see the Basel problem). Since the series converges quickly, we can approximate both sums for large values of $M$ and $N$ by the limit $\frac{\pi^{2}}{6}$. The harmonic numbers can be approximated by the leading term of their well-known asymptotic expansions: $H_{M}\approx\ln M$, $H_{N}\approx\ln N$. Combining all this gives us the approximation

$$P_{\mathrm{match}}\approx\frac{\pi^{4}}{36}\cdot(\ln M\cdot\ln N)^{-2}.$$
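The factorisation of the double sum can be verified numerically for small name lists. The sketch below is only an illustration: the helper names (`harmonic`, `zipf_probs`, `match_brute_force`, `match_factored`) and the sizes 20 and 30 are my own choices, not values from the discussion.

```python
import math

def harmonic(n):
    """n-th harmonic number H_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))

def zipf_probs(n):
    """Simplest Zipf distribution on n items: p_i = 1 / (i * H_n)."""
    h = harmonic(n)
    return [1.0 / (i * h) for i in range(1, n + 1)]

def match_brute_force(m, n):
    """Sum p_{i,j}^2 directly over all m*n (given name, surname) pairs."""
    given, sur = zipf_probs(m), zipf_probs(n)
    return sum((pg * ps) ** 2 for pg in given for ps in sur)

def match_factored(m, n):
    """Same quantity via the factored form: two 1/i^2 sums over (H_m * H_n)^2."""
    s_m = sum(i ** -2 for i in range(1, m + 1))
    s_n = sum(j ** -2 for j in range(1, n + 1))
    return s_m * s_n / (harmonic(m) * harmonic(n)) ** 2

m, n = 20, 30  # small illustrative sizes, chosen arbitrarily
print(match_brute_force(m, n), match_factored(m, n))
```

The two printed values should agree up to floating-point rounding, since the factored form is just a rearrangement of the same finite sum.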

It remains to supply numbers for $M$ and $N$. For this we use the numbers of entries (as of 19:58, 4 April 2020 (UTC)) in the Wikipedia categories English-language masculine given names and English-language surnames. This gives us $M=214$ and $N=1769$. Plugging this in and taking numeric values results in $P_{\mathrm{match}}\approx 0.0016$.
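For anyone who wants to redo the arithmetic, here is a short Python sketch that evaluates both the logarithmic approximation and the finite sums it replaces, for $M=214$ and $N=1769$. The helper names are mine. The leading-term approximation $H_{n}\approx\ln n$ drops the constant term of the asymptotic expansion (the Euler–Mascheroni constant, about 0.577), so the value from the exact finite sums should come out somewhat smaller than the approximation.

```python
import math

M, N = 214, 1769  # category sizes quoted in the discussion

def harmonic(n):
    """n-th harmonic number, summed directly."""
    return sum(1.0 / k for k in range(1, n + 1))

def inverse_square_sum(n):
    """Partial sum 1 + 1/4 + ... + 1/n^2 of the Basel series."""
    return sum(1.0 / (k * k) for k in range(1, n + 1))

# Approximation: both partial sums -> pi^2/6, harmonic numbers -> natural logs.
approx = (math.pi ** 4 / 36) / (math.log(M) * math.log(N)) ** 2

# Same model with the finite sums and harmonic numbers evaluated exactly.
exact = (inverse_square_sum(M) * inverse_square_sum(N)
         / (harmonic(M) * harmonic(N)) ** 2)

print(f"log approximation: {approx:.5f}")
print(f"finite sums:       {exact:.5f}")
```

Either way the model itself (independent Zipf-distributed given names and surnames) remains the dominant source of uncertainty, not the rounding.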

This approximate estimate is for a name match between one character from the first and one from the second film, say the two main characters. If more characters from each cast are considered, say $A$ from movie number one and $B$ from movie number two, where both numbers are fairly limited, the chance of a match increases by almost a factor of $A\cdot B$. If both equal $10$, we get

$$100\times 0.0016=0.16=16\%.$$
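The "almost a factor of $A\cdot B$" can be made concrete under one extra assumption of my own: that the $A\cdot B$ character pairs match independently of one another. This is not strictly true, since pairs share characters, so treat the sketch below as a rough illustration. The function name is hypothetical.

```python
def any_match_probability(p_single, a, b):
    """Chance of at least one name match among a*b character pairs.

    Assumes each pair matches independently with probability p_single,
    which is only an approximation because pairs share characters.
    """
    return 1.0 - (1.0 - p_single) ** (a * b)

p = 0.0016  # single-pair estimate from the discussion

# Complement-rule estimate versus the simple linear estimate a * b * p.
print(any_match_probability(p, 10, 10))
print(10 * 10 * p)
```

The complement-rule figure is a little below the linear one, which is the sense in which the increase is "almost" a factor of $A\cdot B$.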

I agree that this seems implausibly high.

Concluding thought. If character names were really distributed as in real life, occasionally two characters should happen to coincidentally have the same name without this being relevant to the plot. Why do we never see this? So many questions remain. --Lambiam 19:58, 4 April 2020 (UTC)

Double wow! Thanks for all that. I'm very surprised that N is as low as 1769. -- Jack of Oz ^{[pleasantries]} 00:23, 5 April 2020 (UTC)

I have computed