Statistics and Quantitative Methods
By: ramonblanco • June 23, 2012 • Case Study
STATISTICS AND QUANTITATIVE METHODS
PRACTICAL CASE (Individual)
MBA 2010-2011
Story Name: TV, Physicians, and Life Expectancy.
Methods: Descriptive Statistics, Box and Whisker Plot, Correlation and Regression, Confidence Intervals and Hypothesis Testing.
Reference: http://mathforum.org/workshops/sum96/data.collections/datalibrary/data.set6.html
Description: Number of people per TV, number of people per physician, and male and female life expectancy in 40 countries around the world.
Number of cases: 40
Variable Names:
1.- Average Life Expectancy
2.- Male Life Expectancy
3.- Female Life Expectancy
4.- People per TV
5.- People per Physician
Television, Physicians, and Life Expectancy
Country           Life expectancy  People/TV  People/physician  Female life expectancy  Male life expectancy
Argentina         70,5             4          370               74                      67
Bangladesh        53,5             315        6166              53                      54
Brazil            65               4          684               68                      62
Canada            76,5             1,7        449               80                      73
China             70               8          643               72                      68
Colombia          71               5,6        1551              74                      68
Egypt             60,5             15         616               61                      60
Ethiopia          51,5             503        36660             53                      50
France            78               2,6        403               82                      74
Germany           76               2,6        346               79                      73
India             57,5             44         2471              58                      57
Indonesia         61               24         7427              63                      59
Iran              64,5             23         2992              65                      64
Italy             78,5             3,8        233               82                      75
Japan             79               1,8        609               82                      76
Kenya             61               96         7615              63                      59
Korea, North      70               90         370               73                      67
Korea, South      70               4,9        1066              73                      67
Mexico            72               6,6        600               76                      68
Morocco           64,5             21         4873              66                      63
Myanmar (Burma)   54,5             592        3485              56                      53
Pakistan          56,5             73         2364              57                      56
Peru              64,5             14         1016              67                      62
Philippines       64,5             8,8        1062              67                      62
Poland            73               3,9        480               77                      69
Romania           72               6          559               75                      69
Russia            69               3,2        259               74                      64
South Africa      64               11         1340              67                      61
Spain             78,5             2,6        275               82                      75
Sudan             53               23         12550             54                      52
Taiwan            75               3,2        965               78                      72
Tanzania          52,5             *          25229             55                      50
Thailand          68,5             11         4883              71                      66
Turkey            70               5          1189              72                      68
Ukraine           70,5             3          226               75                      66
United Kingdom    76               3          611               79                      73
United States     75,5             1,3        404               79                      72
Venezuela         74,5             5,6        576               78                      71
Vietnam           65               29         3096              67                      63
Zaire             54               *          23193             56                      52
Data Description
This data set contains life expectancy values (male, female and average) for 40 different countries. It also contains the number of people per TV and per physician.
Even before doing any statistical calculations, we can expect some kind of relation between the number of physicians per person and life expectancy. We can also expect a relation between the number of TVs per person and the number of physicians per person, because both variables reflect, in some way, the level of well-being of a country: a country with more TVs per person will tend to have more physicians per person too.
However, we are going to apply some statistical methods in order to find out what the relations between these variables are and how strong they are.
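As a first rough check of the expected relation between physicians and life expectancy, the Pearson correlation coefficient can be computed directly from the table. The following is only a minimal sketch: the list names and the helper pearson_r are our own choices for illustration, the values are copied from the table in row order with decimal commas written as decimal points, and nothing here replaces the fuller correlation and regression analysis. A negative r would mean that countries with more people per physician (that is, fewer physicians per person) tend to have a lower life expectancy.

```python
from math import sqrt

# Average life expectancy and people per physician, in the table's row order
# (Argentina ... Zaire). Decimal commas are written as decimal points.
life_expectancy = [
    70.5, 53.5, 65, 76.5, 70, 71, 60.5, 51.5, 78, 76,
    57.5, 61, 64.5, 78.5, 79, 61, 70, 70, 72, 64.5,
    54.5, 56.5, 64.5, 64.5, 73, 72, 69, 64, 78.5, 53,
    75, 52.5, 68.5, 70, 70.5, 76, 75.5, 74.5, 65, 54,
]
people_per_physician = [
    370, 6166, 684, 449, 643, 1551, 616, 36660, 403, 346,
    2471, 7427, 2992, 233, 609, 7615, 370, 1066, 600, 4873,
    3485, 2364, 1016, 1062, 480, 559, 259, 1340, 275, 12550,
    965, 25229, 4883, 1189, 226, 611, 404, 576, 3096, 23193,
]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(people_per_physician, life_expectancy)
print(f"r (people per physician vs. life expectancy) = {r:.3f}")
```

The people-per-TV column is left out of this sketch because it has missing values (marked * in the table) that would need to be handled first.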
Descriptive Statistics
We are now going to look at graphs of the data set, mainly the histogram and the box-and-whisker plot.
We will describe the average life expectancy from a statistical point of view.
First of all, we start with the scatter plot.
As we can see from the scatter plot, the data are not concentrated but spread over a wide range. There do not seem to be any outliers, and the distribution seems to have a rather flat shape. We cannot yet say whether the distribution is symmetric or not, but we can anticipate that the mean and the median should be very close to each other.
So, let's now turn to the box-and-whisker plot to find out the rest of the information.
Now we can describe more characteristics of the data from the chart. The first thing we see from the graph is that the distribution is not symmetric and that the mean is lower than the median. From what we have learnt, we know that when median > mean the distribution is skewed to the left. In this case we see that the distribution is slightly skewed to the left. For this type of distribution we can also expect the mode to be higher than the median (we will check this when we get to the histogram). We can now confirm that there are no outliers (outside points), as we predicted before.
According to the box-and-whisker plot we have the following measures of position and central tendency:
Minimum = 51,5 Maximum = 79,0 Range = 79,0 - 51,5 = 27,5
First quartile = q1 = 25th percentile = 61,0
Second quartile = q2 = median = 50th percentile = 69,5 (which separates the data into two halves)
Mean = 67,0375 (as we said before, the median and the mean are quite close)
Third quartile = q3 = 75th percentile = 73,75
Interquartile Range (IQR) = q3 - q1 = 73,75 - 61,00 = 12,75
The measures of variability are the following:
Variance = 68,0434 Standard deviation = 8,24884
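The figures above can be reproduced with a few lines of code. This is a minimal sketch under some assumptions: the quartiles are taken as the medians of the lower and upper halves of the sorted data (a convention that is consistent with the reported q1 and q3; other quartile conventions give slightly different values), the variance is the sample variance (divisor n - 1), and outliers are flagged with the usual 1,5 x IQR fences of a box-and-whisker plot. The helper name quartiles_median_of_halves is our own.

```python
from statistics import mean, median, stdev, variance

# Average life expectancy for the 40 countries, in the table's row order.
# Decimal commas are written as decimal points.
life_expectancy = [
    70.5, 53.5, 65, 76.5, 70, 71, 60.5, 51.5, 78, 76,
    57.5, 61, 64.5, 78.5, 79, 61, 70, 70, 72, 64.5,
    54.5, 56.5, 64.5, 64.5, 73, 72, 69, 64, 78.5, 53,
    75, 52.5, 68.5, 70, 70.5, 76, 75.5, 74.5, 65, 54,
]


def quartiles_median_of_halves(values):
    """Return (q1, q2, q3), with q1 and q3 as medians of the two halves."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # smallest half (middle value excluded if n is odd)
    upper = ordered[-half:]   # largest half
    return median(lower), median(ordered), median(upper)


q1, q2, q3 = quartiles_median_of_halves(life_expectancy)
iqr = q3 - q1
data_range = max(life_expectancy) - min(life_expectancy)

# Box-and-whisker fences: points beyond them are drawn as outside points.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in life_expectancy if v < lower_fence or v > upper_fence]

sample_mean = mean(life_expectancy)
sample_variance = variance(life_expectancy)   # divisor n - 1
sample_std = stdev(life_expectancy)

print(f"min={min(life_expectancy)}, max={max(life_expectancy)}, range={data_range}")
print(f"q1={q1}, median={q2}, q3={q3}, IQR={iqr}")
print(f"fences=({lower_fence}, {upper_fence}), outliers={outliers}")
print(f"mean={sample_mean:.4f}, variance={sample_variance:.4f}, std={sample_std:.5f}")
```

With these fences (41,875 and 92,875) every observation falls inside, which agrees with the absence of outside points in the box-and-whisker plot.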
Finally, plotting the histogram completes the descriptive analysis:
From the histogram we can confirm that the data are not symmetric. Regarding the shape of the distribution, we now see how flat it is (Stnd. kurtosis = -1,18509; a negative value means platykurtic) and we see the slight skew to the left (Stnd. skewness = -1,062; a negative value means skewed to the left). These are measures of shape.
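The two shape measures can also be sketched in code. Here we assume that "Stnd. skewness" and "Stnd. kurtosis" mean the bias-adjusted sample skewness and excess kurtosis divided by their approximate standard errors, sqrt(6/n) and sqrt(24/n). This definition is our assumption about what the statistics package reports, and the function name standardized_shape is our own. Under this assumption the results should come out close to the values quoted above.

```python
from math import sqrt

# Average life expectancy for the 40 countries, in the table's row order.
# Decimal commas are written as decimal points.
life_expectancy = [
    70.5, 53.5, 65, 76.5, 70, 71, 60.5, 51.5, 78, 76,
    57.5, 61, 64.5, 78.5, 79, 61, 70, 70, 72, 64.5,
    54.5, 56.5, 64.5, 64.5, 73, 72, 69, 64, 78.5, 53,
    75, 52.5, 68.5, 70, 70.5, 76, 75.5, 74.5, 65, 54,
]


def standardized_shape(values):
    """Return (standardized skewness, standardized kurtosis).

    Assumed definitions: adjusted sample skewness / sqrt(6/n) and
    adjusted sample excess kurtosis / sqrt(24/n).
    """
    n = len(values)
    m = sum(values) / n
    # Central moments (divisor n).
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    g1 = m3 / m2 ** 1.5          # moment skewness
    g2 = m4 / m2 ** 2 - 3        # moment excess kurtosis
    # Small-sample adjustments.
    skew = g1 * sqrt(n * (n - 1)) / (n - 2)
    kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)
    return skew / sqrt(6 / n), kurt / sqrt(24 / n)


std_skewness, std_kurtosis = standardized_shape(life_expectancy)
print(f"Stnd. skewness = {std_skewness:.3f}")
print(f"Stnd. kurtosis = {std_kurtosis:.3f}")
```

A common rule of thumb treats standardized values between -2 and +2 as compatible with a normal distribution, so both the skew and the flatness seen here are mild.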
Descriptive Statistics for Male and Female Life Expectancy
One question that anyone could ask is whether men have a higher or a lower life expectancy than women. In order to answer this question let's use our knowledge of statistics.
First, we are going to consider the two samples, that is:
Sample 1: Female life expectancy: 40 values ranging from 53,0 to 82,0
Sample 2: Male life expectancy: 40 values ranging from 50,0 to 76,0
Even before doing any calculation, we see that women tend to have a higher life expectancy than men, since both the minimum and the maximum are higher for women. We can also say that the range is a bit higher for women than for men.
          Women   Men
Minimum   53,0    50,0
Maximum   82,0    76,0
Range     29,0    26,0
It is even more interesting, however, to compare the two samples using their box-and-whisker plots and summary statistics. We can get a lot of information from these:
                     Women     Men
Mean                 69,575    64,5      (difference of 5,075)
Variance             83,9429   55,0769
Standard deviation   9,16204   7,42138
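The comparison of the two samples can be reproduced as follows. This is a minimal sketch: the list names and the summarize helper are our own, the values are copied from the table in row order, and the variance is the sample variance (divisor n - 1).

```python
from statistics import mean, stdev, variance

# Female and male life expectancy for the 40 countries, in the table's row order.
female = [
    74, 53, 68, 80, 72, 74, 61, 53, 82, 79,
    58, 63, 65, 82, 82, 63, 73, 73, 76, 66,
    56, 57, 67, 67, 77, 75, 74, 67, 82, 54,
    78, 55, 71, 72, 75, 79, 79, 78, 67, 56,
]
male = [
    67, 54, 62, 73, 68, 68, 60, 50, 74, 73,
    57, 59, 64, 75, 76, 59, 67, 67, 68, 63,
    53, 56, 62, 62, 69, 69, 64, 61, 75, 52,
    72, 50, 66, 68, 66, 73, 72, 71, 63, 52,
]


def summarize(values):
    """Basic descriptive statistics for one sample."""
    return {
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "mean": mean(values),
        "variance": variance(values),   # divisor n - 1
        "std": stdev(values),
    }


women = summarize(female)
men = summarize(male)
mean_difference = women["mean"] - men["mean"]

for name in ("min", "max", "range", "mean", "variance", "std"):
    print(f"{name:<10}{women[name]:>10.4f}{men[name]:>10.4f}")
print(f"difference of means = {mean_difference:.3f}")
```

Both the range and the variance come out larger for women, so the female sample is the more spread out of the two.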
We can confirm that, on average, women have a higher life expectancy than men.
From the box plot, we also see that the two distributions overlap, which means that some members of each sample share common values of life expectancy.
...