December 12, 2007

The Champions League Problem

In the UEFA Champions League, 32 teams are divided into eight groups of four. The top two teams from each group qualify for the knockout stage. The qualified teams and their group stage standings are as follows:

Group A Winner: Porto (Portugal), Runner-up: Liverpool (England)
Group B Winner: Chelsea (England), Runner-up: Schalke (Germany)
Group C Winner: Real Madrid (Spain), Runner-up: Olympiacos (Greece)
Group D Winner: Milan (Italy), Runner-up: Celtic (Scotland)
Group E Winner: Barcelona (Spain), Runner-up: Lyon (France)
Group F Winner: Manchester United (England), Runner-up: Roma (Italy)
Group G Winner: Inter (Italy), Runner-up: Fenerbahce (Turkey)
Group H Winner: Sevilla (Spain), Runner-up: Arsenal (England)

The knockout stage will see each group winner facing a randomly drawn runner-up; however, teams from the same group or association (e.g. England) cannot be drawn together.

I am particularly interested in my favorite team, Inter. What is the probability of Inter playing against, say, Liverpool in the next stage?

According to the rules, Inter can face one of these six opponents: Liverpool, Schalke, Olympiacos, Celtic, Lyon, and Arsenal. (The other runners-up, i.e. Roma and Fenerbahce, are ruled out.) It is tempting to say that Inter has a 1/6 chance of facing Liverpool, assuming each opponent has an equal chance.

From Liverpool's perspective, it can face Real Madrid, Milan, Barcelona, Inter, or Sevilla. Using the same logic as above, we would conclude that Liverpool has a 1/5 chance of facing Inter.

Since Inter vs. Liverpool and Liverpool vs. Inter are the same event, they must be associated with the same probability. But we arrive at a probability of 1/6 from Inter's perspective and 1/5 from Liverpool's. It must be that the assumption of equal chance is at fault.

Strange as it may seem, Inter's (and/or Liverpool's) potential opponents do not all have the same chance of drawing Inter (Liverpool). To see why, imagine a hypothetical situation where UEFA imposes a stricter rule that reduces Liverpool's potential opponents to Inter only, with everything else unchanged. Although Inter still has five other teams in the pot, it must now be drawn against Liverpool for the knockout stage to work. The probabilities of Inter facing each potential opponent are then 1 for Liverpool and 0 for the five remaining teams.

The spirit of the draw is that it should be fair to every team under UEFA's constraints. Ideally, a team would be drawn against each of its potential opponents with equal probability; but we have just seen that this is not achievable. Given that some teams are more likely to play against each other under the current rules, how should the probabilities be adjusted? The probabilities certainly depend on the exact drawing mechanism, so is there an "optimal scheme" that minimizes the probability differences?
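To get a feel for the numbers, here is a minimal Python sketch that enumerates every constraint-respecting assignment of runners-up to group winners and counts how often Inter meets Liverpool. It assumes that every valid overall assignment is equally likely, which need not match whatever sequential drawing procedure UEFA actually uses.

```python
from itertools import permutations

# Group winners and runners-up as (team, association), listed in group order A-H.
winners = [("Porto", "POR"), ("Chelsea", "ENG"), ("Real Madrid", "ESP"),
           ("Milan", "ITA"), ("Barcelona", "ESP"), ("Manchester United", "ENG"),
           ("Inter", "ITA"), ("Sevilla", "ESP")]
runners_up = [("Liverpool", "ENG"), ("Schalke", "GER"), ("Olympiacos", "GRE"),
              ("Celtic", "SCO"), ("Lyon", "FRA"), ("Roma", "ITA"),
              ("Fenerbahce", "TUR"), ("Arsenal", "ENG")]

def valid(assignment):
    """assignment[i] = j means winner i plays runner-up j. A pairing is invalid
    if the two teams share a group (same index) or the same association."""
    for i, j in enumerate(assignment):
        if i == j or winners[i][1] == runners_up[j][1]:
            return False
    return True

valid_draws = [d for d in permutations(range(8)) if valid(d)]
inter, liverpool = 6, 0  # Inter is the Group G winner, Liverpool the Group A runner-up
prob = sum(d[inter] == liverpool for d in valid_draws) / len(valid_draws)
print(len(valid_draws), "valid draws; P(Inter vs Liverpool) =", round(prob, 4))
```

Under this uniform-over-valid-draws assumption, the probability comes out somewhere between the naive 1/6 and 1/5, which is exactly the tension described above.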

I am not sure how UEFA handles this issue. The draw will be held on December 21, 2007.

October 2, 2006

Fairness

I've personally said something similar to the following (without thinking much about its scientific meaning, though):

"He's good at academics, sports and music, but he's short and ugly. God is fair - usually people cannot have all the good things."

Now I'm going to consider this "fairness" from a probabilistic perspective (and let's put the theological issues aside - I will assume God exists). In probability, "fair" has a special meaning. A fair coin is one that is equally likely to give heads or tails on every toss, independently of all previous tosses. (That is, even when the previous 10 tosses were all heads, there is still a 50% chance of getting heads on the 11th toss.)

If one tosses 3 fair coins sequentially, one has a 1/8 chance of getting each of the following eight sequences (H denotes heads and T denotes tails):
1. HHH (3H)
2. HHT (2H1T)
3. HTH (2H1T)
4. HTT (1H2T)
5. THH (2H1T)
6. THT (1H2T)
7. TTH (1H2T)
8. TTT (3T)

(The statistics in the parentheses give the total number of heads and tails. For example, "2H1T" means 2 heads and 1 tail in total. It happens in sequences 2, 3 and 5.)

Suppose your friend has made the coin tosses and tells you that two of the three coins are heads (i.e. the 3 coins are HHX, HXH or XHH; you don't know which two coins he is referring to). Do you think the unknown coin X is more likely to be a head or a tail?

The correct answer is tail. We can see that your friend is referring to sequence 1, 2, 3 or 5, and only sequence 1 has X equal to H. Since all sequences are equally likely, based on what your friend tells you, the unknown coin has a 75% chance of being a tail and only a 25% chance of being a head.

Let's translate all these into the context of the example: suppose the first coin governs your "academic" aptitude, the second coin refers to "sports and music," and the third coin determines your "appearance." Assume head means "good" and tail means "bad."

My argument is valid if I, like your friend, start with any two unspecified attributes (coins). I observe that a guy has two good attributes (two heads), so I conclude that his remaining attribute is more likely to be bad (a tail). In this sense, the "fairness" reasoning is correct.

However, if I observe the first two attributes (i.e. they are specified) and see that the guy is good at academics, sports and music, then what I said is not logically sound. This is because being "short and ugly" now has nothing to do with his academic and sports and music abilities. In the coin toss case, if your friend tells you that the first two coins are heads, i.e. the coins are HHX (it's either sequence 1 or sequence 2), whether the third coin X is head or tail is 50-50, and is independent of what you are told, since the coin is fair.
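Both conditionings can be checked by enumerating the eight equally likely sequences. Here is a short Python sketch of the two calculations above.

```python
from itertools import product

sequences = list(product("HT", repeat=3))  # the 8 equally likely sequences

# Case 1: told that (some) two of the three coins are heads, i.e. at least two heads.
given = [s for s in sequences if s.count("H") >= 2]
p_head = sum(s.count("H") == 3 for s in given) / len(given)
print("P(unknown coin is H | at least two H) =", p_head)   # 0.25

# Case 2: told that the first two coins (specified) are heads.
given = [s for s in sequences if s[0] == "H" and s[1] == "H"]
p_head = sum(s[2] == "H" for s in given) / len(given)
print("P(third coin is H | first two are H) =", p_head)    # 0.5
```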

This example highlights the different meanings of "fairness" in different situations. Even from a statistician's point of view, fairness does not always mean 50-50. Before making claims such as "God is fair," we should clearly state what our assumptions are.

September 10, 2006

Rotten Lemons Problem

The term "lemons problem" was coined by Nobel laureate George Akerlof in 1970. Roughly, it refers to markets in which there is asymmetric information, meaning that sellers know more than buyers about product quality. Akerlof studies the used car market and finds that it is often impossible for buyers to distinguish bad cars from the good ones. As a result, both good and bad cars are given the same price, thus reflecting a huge discount for the former to compensate buyers in the event of seemingly good cars turning out to be bad.

A commonly suggested solution to the problem is for sellers to find a mechanism, e.g. advertising, that lets buyers distinguish accurately between the two types of cars, eliminating the asymmetric information. If this cannot be done, for example when bad car dealers cheat and buyers cannot verify, one might then encounter an even older (and sadder) problem in economics: "bad money drives out the good" (also known as Gresham's Law), which originated as early as the 16th century.

When good money (in terms of appearance and real value of the materials used, etc) and bad money are forced to have the same monetary value because they are legal tenders, people tend to save the good and spend the bad. (Imagine in a situation where you have to spend either an ugly torn one-dollar bill or a beautiful antique one-dollar coin, both of which can buy you a dollar's worth of goods.) After a while, all the money circulating in the market will be bad money, since good money is held in people's hands. In the context of the used car market (or any other market in which there is asymmetric information between buyers and sellers), all good used cars will disappear in the long run. Sellers will utilize their cars until they are bad, since good cars do not generate a higher revenue.

Sometimes the problem is worsened when the bad actually fares better than the good. In the pathetic situations where "nice guys finish last" or "good girls love bad guys," it makes no sense to be a good guy, since the good are driven out by the bad even more quickly. I shall call this the "rotten lemons problem," which is, both literally and conceptually, much worse than the original lemons problem.

July 9, 2006

Fortune Telling

No offense intended. This is only my view on the subject; it is by no means the correct explanation. Even physicists fail to reach a unanimous conclusion.

Can fortune telling be 100% accurate? I personally don't think so.

Take astrology as an example. There is a similar discussion in the classic novel Sophie's World. Astrology studies the positions of stars and planets, some of which are hundreds or even thousands of light-years away from us (a light-year is the distance that light travels in one year). Therefore, what we are observing on Earth is actually light that left those stars hundreds or thousands of years ago. These stellar objects are very likely to be in a different position now, or might even no longer exist. If you still think astrology is correct, then you are probably supporting predictive determinism, i.e. everything, including why you and I exist and what you will be doing at this time tomorrow, is pre-determined by other events that are measurable (even though they happened ages ago).

The Heisenberg Uncertainty Principle refutes predictive determinism. The principle states that one cannot measure both the position and the momentum of a particle precisely at the same time; the very act of measuring disturbs it. There is no way to measure things 100% accurately, and thus no way to construct a function that forecasts future events 100% accurately. Every attempt to predict the future, be it astrology or tarot cards or Chinese palm reading, is doomed to be error-prone.

(Side note: This, however, does not refute causal determinism. It is still possible that everything is pre-determined; it is just that we cannot measure the events that determine it. Albert Einstein was a believer in causal determinism. He once commented, "God does not play dice with the universe," referring to his view that things do not happen in an entirely random fashion.)

In practice, there are fortune tellers who appear to have made every prediction correctly. This, however, can be a purely statistical phenomenon.

Suppose there is a coin-flipping competition with 1000 participants. In every round a participant is eliminated if he flips a tail. On average, half of the participants are eliminated in each round. Suppose that there is a final "winner" who flips 10 consecutive heads, and all the other players have flipped a tail at some point. While some might attribute his winning to superior skill, he achieves this only by chance (the chance of flipping 10 heads is 1/1024, so among 1000 players there is about 1 person who is able to do so). If he is asked to make an 11th flip, he still has a 1/2 chance of getting a tail, like an average person does.
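A quick simulation of this hypothetical competition (a sketch using the 1000 players and 10 rounds from the example) shows that a "10-heads winner" typically emerges by chance alone.

```python
import random

random.seed(2006)
players, rounds = 1000, 10

# A player "wins" by flipping heads in all 10 rounds; flipping a tail at any
# point is equivalent to being eliminated in that round.
survivors = sum(all(random.random() < 0.5 for _ in range(rounds))
                for _ in range(players))
print("Players with 10 straight heads:", survivors)  # usually 0, 1 or 2 (expected 1000/1024)

# The winner's 11th flip would still be 50-50, just like anyone else's.
```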

The moral of this story is: even if no one has superior forecasting ability, some fortune tellers can still seemingly outperform the others and gain a 100% track record. With so many fortune tellers in the world, there are bound to be some winners. However, if they are asked to forecast one more event, they are unlikely to outperform the average fortune teller, since they achieved their record by pure luck.

May 23, 2006

Expectation

"Take nothing on its looks; take everything on evidence. There's no better rule."

- Charles Dickens, Great Expectations.


Expectation is what you reasonably think will happen in the future. Mathematically, it is an average over all possible future events, weighted by their underlying probabilities. Let's take a look at two examples.

Example 1 (Reading this passage)

Suppose the universe was created randomly, not with the particular intention of creating the planet Earth and human beings. What is the probability of you and me reading this blog?

If this had been asked before the universe existed, the probability would have been practically zero. There are millions of other planets, but as of today, we have still not found another planet suitable for accommodating sophisticated creatures like us. The odds of having a planet like Earth were already extremely small. Then there were mass extinctions that wiped out most of the species on Earth, but our ancestors luckily survived and miraculously evolved into Homo sapiens, our current form. Then came the brilliant inventions of electricity, capacitors, computers, the Internet, and so on.

Therefore, the combination of all these seemingly impossible events enables us to read this. No one would have expected them to happen. Yet they have occurred. What a miracle!

Of course, expectations will change. Suppose (after the creation of the Earth, human beings, yourself, the Internet, etc.) that you visit this site every day and have read every previous passage. That you will be reading this passage too is certainly a rational expectation. This leads us to the second example.

Example 2 (Survival of miners)

Earlier this year there was an explosion that trapped 13 miners. It was a serious accident, and it was expected that most of the 13 miners had been killed. Then the news reported that 12 of them had survived (without identifying who they were), raising everyone's hopes. However, it was later found that there had been a miscommunication between the mine officials and the press: it should have been 12 deaths instead of 12 survivors. The miners' families were furious and asked for compensation.

I am discussing this event from an academic perspective, putting the ethical issues aside. The false news did not change the facts; it only changed people's expectations. Assuming each of the 13 miners had an equal chance, the families were hoping, based on the false news, that each miner had a 12/13 chance of survival. When the death toll was corrected later, that probability dropped to 1/13.

However, it should be noted that the probability of survival was already close to 1/13 prior to the false news, since the accident was a serious one. If the families had not raised their expectations based on the news, they would not have suffered as much. One could blame the change in expectations more than the explosion itself.


I personally think expectation is the second worst thing (memory is the first). Not only does a change in expectations make you suffer (as in Example 2), but the difference between reality and expectations also surprises you (for good or for bad). In fact, many economists believe that market expectations are already incorporated into macroeconomic variables and stock prices; only surprises can cause big and sudden movements.

If our expectations are set too high, many surprises will come as disappointments. Raised expectations bring happiness, but not as much as positive surprises do. Therefore, one should not raise one's expectations unless there is solid evidence. Live by Dickens's rule. Do not take everything for granted or inflate the subjective probability of favorable future events. Doesn't everything strike you as a miracle?

April 27, 2006

Soccer Betting

This is a real example that happened during Euro 2004. Now that World Cup 2006 is coming up, it comes to my mind again.

A friend of mine was betting on which team would win Euro 2004. He thought I was interested in betting too (but he was wrong - I enjoy watching soccer, but I hate soccer betting). One day he told me why he ended up putting his money on Germany.

"I think France will win. But France's odds are only 4:1 (meaning that the dealer would give you $4 for every $1 you bet, if France won), while Germany gives 15:1. So I'm betting on Germany."

At first glance, this statement doesn't make sense at all. If you really think France will win, there is no way you should put your money on any team other than France. It doesn't matter even if Germany pays 1000:1 - if France wins, that payoff is never realized.

So I am assuming that my friend has a subjective probability distribution, i.e. he doesn't know for sure which team will win, but his soccer knowledge enables him to assign winning probabilities to the teams. Suppose he is risk-neutral (meaning that he is indifferent between 1. $5 for sure and 2. a bet with a 50% chance of getting $10 and a 50% chance of getting nothing). In making his argument, he is likely to be thinking that France's chance of winning is lower than 1/4 and Germany's chance of winning is greater than 1/15. For example, if he thinks Germany has a 10% chance (which is greater than 1/15) of winning Euro 2004, then he will benefit in the long run: suppose there are 100 of these events and each time he puts up $1. If he is right, he will win approximately 10 times, and the dealer will give him 10*$15 = $150, which is greater than his $100 investment.

There is still a problem here. If the soccer betting market is efficient, the odds should reflect market expectations (this is the definition of "efficiency" in the finance literature); but part of my friend's expectation (that France's chance of winning is lower than 1/4) is not reflected in the dealer's odds. (Note, however, that his expectation about Germany is counted, since the dealer will lower Germany's payout when he bets on it.) This is a potential explanation of why there is a huge illegal soccer betting market. Suppose my friend is a soccer expert who is sure that France's chance of winning is only 20% (lower than the 1/4 implied by the market odds). He cannot make a profit by betting on France, but he can make a profit by betting against France and making himself a dealer (which is illegal). If he accepts other people's bets at 4:1 and France wins only 1 out of every 5 times, then among 100 of these events he collects $100 and pays out only 20*$4 = $80.
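Both calculations can be checked with a few lines of Python. This is a sketch that follows the post's convention of quoting odds as the gross payout per $1 staked; the helper names and the 10% and 20% beliefs are just the hypothetical numbers used above.

```python
def bettor_expected_profit(stake, odds, win_prob):
    """Expected profit of a bet of $stake, with odds quoted as gross payout per $1."""
    return win_prob * odds * stake - stake

def dealer_expected_profit(stakes_accepted, odds, win_prob):
    """Expected profit from accepting bets on an outcome the dealer believes wins with win_prob."""
    return stakes_accepted - win_prob * odds * stakes_accepted

print(bettor_expected_profit(100, 15, 0.10))  # Germany at 15:1, believed 10%: +50 on $100 staked
print(bettor_expected_profit(100, 4, 0.20))   # France at 4:1, believed 20%:   -20 on $100 staked
print(dealer_expected_profit(100, 4, 0.20))   # accepting $100 of France bets: +20 expected
```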

The most effective way to prevent illegal soccer betting is to make these activities unprofitable. In fact, there are many betting websites that allow you to be on either side of a bet, i.e. you can be on the side that pays people when something happens, making you effectively a dealer. (However, this is not yet legal for some sports and in some countries.) If everyone can be a dealer, anyone can bet whenever he disagrees with the market odds (my friend could bet against France in this case, and the market would become more efficient as his expectation is counted). There would then be no economic justification for engaging in illegal betting markets.

April 26, 2006

Evolution

Are we still evolving? Depends.

For those of you who do not plan to have children, please consider the following argument.

The heart of evolution rests on the notion of "survival of the fittest." Now that you have survived, and as long as you can provide your children with enough care, it is your obligation to reproduce; otherwise there is no meaning to evolution. The worst case is that the healthiest/smartest/wealthiest people of today are among the most reluctant to have offspring. Imagine what would happen after several generations. Yes, the best genes survived, but they disappear once their carriers die. Only inferior genes get a chance to be reproduced and passed along. Evolution works backwards. To say that we are going back from humans to apes would be an exaggeration, but it might be the case that one day we find our world filled with people who are less fit.

An extreme evolutionist might go on to argue that we should never cure inherited diseases, or at least should ask people affected by these diseases not to have children. This would help the evolutionary process by eliminating the "bad" genes. However, this is more of a moral issue. The Oath of Hippocrates, i.e. the "doctor's oath," clearly states that "into whatever houses I enter, I will go into them for the benefit of the sick." The doctor is obliged to cure all diseases whenever he is able to. Everyone has the right to live and the right to give birth to children. If they are not fit for the world, natural forces will come into play. It is not you or me or the doctor who makes that decision.

Again, if you have the ability to raise a kid, it is not really your decision whether to have one or not; it has been your natural obligation since the first signs of life appeared on Earth 3.6 billion years ago. Evolution is a continuous process: it should never be altered by human beings, the species that benefits from it the most.

April 13, 2006

Time Travel

This has been bothering me for a while, mainly due to my obsession with the Japanese cartoon Doraemon, in which the main character Nobita has a time machine in the drawer of his desk.

Time travel would be possible if one could travel faster than the speed of light. Suppose this is achieved one day in the future. Why don't we see future time travelers in our current world?

There is a discussion of this by Stephen Hawking in his famous book, A Brief History of Time. I haven't studied enough physics to comment on Hawking's suggestion, but let me state my worries.

If future time travelers can indeed go back in time, here are a couple of potential explanations for why we don't see any of them now.

1. Our current world is not interesting enough to warrant a visit.
2. They are here, but
a) They cannot reveal their identity; or
b) They are invisible to us.

If 1 is true, then we will probably see some of these travelers in the future, which I think is more intimidating than seeing aliens. 2a is possible since they might not be able to act on their own free will, as the past is fixed and they can only be quiet spectators. But the question now is: who are they? Scientists? Homeless people on the streets? Or you and me?

2b offers an explanation of why we cannot identify them, but it is even more unpleasant. If it is true, then you and I and everyone else may be watched by these future travelers, and we can't even tell whether they are here. However, this poses more questions as well: How do they survive in our world? What makes them invisible to us?

Given the amount of debate generated when cloning was invented, time travel will certainly be controversial. I hope this technology never comes to exist (or at least that time travel is confined to going to the future). Even God cannot go back in time; neither should people.

April 11, 2006

Flaws in English

Two examples show why the English language might not be a very good one, at least in terms of logic.

Example 1 (Transitivity)

Logic says:
Premise 1: A >= B
Premise 2: B >= C
Conclusion: A >= C

where >= can be interpreted as a math operator (A >= B means A is greater than or equal to B), or as the preference relation in economics (A is at least as good as B), or as anything else that has this property, transitivity.

Let's see what happens if we apply this to English:
Premise 1: Having a house is better than having a car.
Premise 2: Having a car is better than having a dog.
Conclusion: Having a house is better than having a dog.

Good! But...

Premise 1: Winning $1 is better than nothing.
Premise 2: Nothing is better than winning $100000.
Conclusion: Winning $1 is better than winning $100000.

Premises 1 and 2 are correct in English, but the conclusion is obviously not true.


Example 2 (Implication)

Logic says:
A => B is true when
1. A is true and B is true, or
2. A is false and B is true, or
3. A is false and B is false.

where => is the logical operator "imply" (A implies B). In English it is translated to "If A, then B," e.g. "If he is your dad, then he is a male"; "If 18 is divisible by 6, then 18 is divisible by 3"; "If it rains, then I will bring an umbrella."
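To make the truth table above concrete, here is a tiny Python sketch of the operator (the function name implies is my own, not part of any standard library).

```python
def implies(a: bool, b: bool) -> bool:
    """Material implication: A => B is false only when A is true and B is false."""
    return (not a) or b

# Print the full truth table.
for a in (True, False):
    for b in (True, False):
        print(f"A={a!s:<5} B={b!s:<5} A => B: {implies(a, b)}")
```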

So far so good. Let's see how it can be problematic:

All of the following are true logically (reason stated in parentheses):
1. If I have $1 billion now, then I will give you $500 million. (Because I don't have $1 billion now)
2. If I have a dog now, then my dog has 5 legs. (Because I don't have a dog now)
3. If I am a male, then Bill Gates is a billionaire. (Because both parts are true, although they are not related)

(Bertrand Russell once publicly claimed that if 1=2, then he could prove he was the Pope. This is in the same spirit as 1.)

None of them makes sense in English. Apparently, translating => as "if ..., then ..." is not a good idea. (Actually, "imply" isn't a good name for it either.) There is no good counterpart in English for this fundamental logical operator.

April 10, 2006

Perhaps...

A Chinese version of this article was posted on my personal blog some two years ago. I like it so much that I have decided to translate it into English and use it as the first post on this blog. You can tell that this blog is going to be academic (or nerdy, since I'm in graduate school), not like the random ramblings on my other blog. If you are not interested in Finance/Econ/Math/Physics/Logic/Philosophy/History, stop reading and do NOT come back again. However, don't be scared (or too happy): I won't post research ideas or journal papers here. It'll be more like a scientific discussion of anything that interests me.


Probability is my favorite topic. Let me start off by giving you a typical probability problem.

Problem There are 10 people in a queue. Two of them are named A and B. Suppose all possible queues are equally likely, e.g. the queues XXXAXXXBXX and XBXXAXXXXX carry the same probability (where X denotes the other 8 people). Also suppose that each person's position is independent of everyone else's (except that two people cannot occupy the same position in a particular queue), and everyone is equally likely to end up in any given position (this is the independent and identically distributed, or iid, assumption). What is the probability that A lines up in front of B in this queue?

Solution Combinatorics approach: Count all the favorable queues, e.g. A ranked 1st with B ranked 2nd, 3rd, ..., or 10th; A ranked 2nd with B ranked 3rd, 4th, ..., or 10th; etc., with the other 8 people filling the remaining spots in any order, and then divide this by the number of possible queues, 10! (This is ten factorial, i.e. 10 x 9 x 8 x ... x 1, not TEN). This works nicely, doesn't it? But what if we have 100 people instead? 100! is astronomically large.

Alternative Solution A fifth-grader might have guessed the answer correctly. The probability is simply 0.5. Note that the event "A in front of B" and the event "B in front of A" are equally likely (by the symmetry of the iid assumption), and the two events make up the entire probability space (either A is in front of B or B is in front of A). The probability axioms then tell us that the probability has to be 0.5.

(Side note: If there is no reason to believe that one outcome is more likely than another, then they must be treated as equally likely outcomes. This is Laplace's Principle of Insufficient Reason, which also tells us that the probability of getting heads on a fair coin toss is 0.5.)

So, what does 0.5 mean? The most appropriate answer is: if you have a large number of these queues, say 10000, the law of large numbers says approximately half of them (5000) will have A in front of B.
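Both the exact combinatorial answer and the law-of-large-numbers reading can be checked with a short Python sketch (the 10000-queue figure is the one used above).

```python
import random
from itertools import permutations

# Exact count on a small queue (5 people, so only 5! = 120 arrangements to check).
small = ["A", "B", "X1", "X2", "X3"]
favorable = sum(q.index("A") < q.index("B") for q in permutations(small))
print(favorable / 120)          # exactly 0.5

# Monte Carlo with 10000 independent 10-person queues.
random.seed(0)
queue = ["A", "B"] + [f"X{i}" for i in range(8)]
hits = 0
for _ in range(10000):
    random.shuffle(queue)
    hits += queue.index("A") < queue.index("B")
print(hits / 10000)             # close to 0.5, by the law of large numbers
```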

What if I only have one queue? How do I know whether I will see A first or B first? The most appropriate answer is: who knows? The sample size (one) is too small to apply the law of large numbers.

Also note that the queues themselves are independent, which is the usual and logical assumption. That is, even if we have 10000 of these queues, the arrangements of the first 9999 queues have nothing to do with the 10000th queue. You never know how a particular queue behaves, even though you have a large sample.

In real life, how often do we have a large sample (10000 queues) and only want to be approximately correct at the macro level (knowing that about 5000 of them have A in front of B)? Under most circumstances, we are interested in knowing whether something will occur or not (A in front of B in this particular queue). Sorry, probability is not able to handle this at all. Everywhere we find examples of highly probable events not occurring and seemingly improbable events happening. No one knows who decides what will and will not happen.

What makes things worse is the Heisenberg Uncertainty Principle. It states that you cannot simultaneously measure both the momentum and the position of an electron precisely. Roughly speaking, you cannot be 100% certain about something just by observing/measuring other things.

So what should we do when everything in the world is uncertain and probability doesn't help much? Nothing. That's why many people resort to God/supernatural forces/fate. Maybe they are right...

To make life easier, instead of calculating probabilities all the time, why not just "let it be"? Instead of calling something "very likely," why not just say "perhaps"? Wouldn't the world be so much simpler and nicer?