Difference between revisions of "Statistical formulas documentation"
Revision as of 17:51, 8 August 2013
mTAB Median Calculation from Income Brackets
Approximate Annual HH Income | |
Median | 71,180 |
Unweighted Sample Total Count | 10,811 |
Approximate Annual HH Income | Weighted Response | (1) Accumulated Response |
Less than $15,000 | 11,714 | 11,714 |
$15,000 - $24,999 | 46,054 | 57,768 |
$25,000 - $34,999 | 83,965 | 141,733 |
$35,000 - $44,999 | 102,093 | 243,826 |
$45,000 - $59,999 | 155,721 | 399,546 |
$60,000 - $74,999 | 161,435 | 560,981 | <-- Median will fall here (3)
$75,000 - $99,999 | 193,540 | 754,521 |
$100,000 - $124,999 | 134,706 | 889,227 |
$125,000 - $149,999 | 59,748 | 948,975 |
$150,000 - $199,999 | 41,971 | 990,946 |
$200,000 - $249,999 | 16,391 | 1,007,337 |
$250,000 or More | 32,409 | 1,039,746 |
Weighted Subset Total Count | 1,039,746 |
Weighted Sample Total Count | 1,255,411 |
(1) | Calculate the accumulated weighted response |
(2) | Divide the weighted subset total (1,039,746) by 2 | 519,873 |
(3) | Find the first value in the Accumulated Response column that is greater than the Step 2 value; the median falls within the $60,000 - $74,999 bracket |
(4) | Step 2 amount (519,873) minus the accumulated response of the preceding bracket (399,546) | 120,327 |
(5) | Accumulated response of the median bracket (560,981) minus that of the preceding bracket (399,546) | 161,435 |
(6) | Step 4 divided by Step 5 | 0.74536 |
(7) | Multiply Step 6 by the width of the range, 14,999 ($60,000 - $74,999) | 11,180 |
(8) | Add Step 7 to the bottom of the range, $60,000 | 71,180 |
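The eight steps above can be sketched in Python as a linear interpolation inside the bracket that contains the half-way point. This is only an illustrative sketch, not mTAB's actual implementation: the function name `grouped_median` and the tuple layout are assumptions, and the counts are the rounded values shown in the table, so intermediate figures may differ from the documented ones by a unit.

```python
def grouped_median(brackets):
    """Interpolated median for (lower, upper, weighted_count) brackets.

    Brackets must be listed in ascending order. Hypothetical helper
    that mirrors steps (1) to (8); not taken from mTAB itself.
    """
    total = sum(count for _, _, count in brackets)
    half = total / 2                      # step (2)
    accumulated = 0.0                     # step (1), built up as we go
    for lower, upper, count in brackets:
        if accumulated + count > half:    # step (3): median bracket found
            fraction = (half - accumulated) / count    # steps (4) to (6)
            return lower + fraction * (upper - lower)  # steps (7) and (8)
        accumulated += count
    raise ValueError("no bracket contains the median")


# Rounded weighted responses from the table; the bounds of the first and
# last brackets (1 and 300,000) are the stat values listed for them.
income_brackets = [
    (1, 14_999, 11_714),
    (15_000, 24_999, 46_054),
    (25_000, 34_999, 83_965),
    (35_000, 44_999, 102_093),
    (45_000, 59_999, 155_721),
    (60_000, 74_999, 161_435),
    (75_000, 99_999, 193_540),
    (100_000, 124_999, 134_706),
    (125_000, 149_999, 59_748),
    (150_000, 199_999, 41_971),
    (200_000, 249_999, 16_391),
    (250_000, 300_000, 32_409),
]

median = grouped_median(income_brackets)
print(round(median))  # 71180
```

The range width is taken as upper minus lower (14,999), matching step (7), rather than the 15,000 spacing between bracket starts.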
mTAB Mean/Weighted Average Calculation from Income Brackets
Approximate Annual HH Income | |
Mean/Weighted Average | 83,610 |
Unweighted Sample Total Count | 10,811 |
Approximate Annual HH Income | (A) Weighted Response | STAT1 | STAT2 | (B) Midpoint | (C) = (A) x (B) |
Less than $15,000 | 11,714 | 1 | 14,999 | 7,500 | 87,857,249 |
$15,000 - $24,999 | 46,054 | 15,000 | 24,999 | 20,000 | 921,059,004 |
$25,000 - $34,999 | 83,965 | 25,000 | 34,999 | 30,000 | 2,518,899,346 |
$35,000 - $44,999 | 102,093 | 35,000 | 44,999 | 40,000 | 4,083,654,266 |
$45,000 - $59,999 | 155,721 | 45,000 | 59,999 | 52,500 | 8,175,254,132 |
$60,000 - $74,999 | 161,435 | 60,000 | 74,999 | 67,500 | 10,896,752,251 |
$75,000 - $99,999 | 193,540 | 75,000 | 99,999 | 87,500 | 16,934,669,636 |
$100,000 - $124,999 | 134,706 | 100,000 | 124,999 | 112,500 | 15,154,345,342 |
$125,000 - $149,999 | 59,748 | 125,000 | 149,999 | 137,500 | 8,215,258,359 |
$150,000 - $199,999 | 41,971 | 150,000 | 199,999 | 175,000 | 7,344,910,850 |
$200,000 - $249,999 | 16,391 | 200,000 | 249,999 | 225,000 | 3,688,025,252 |
$250,000 or More | 32,409 | 250,000 | 300,000 | 275,000 | 8,912,452,979 |
Weighted Subset Total Count | 1,039,746 | | | | 86,933,138,666 |
Weighted Sample Total Count | 1,255,411 |
- (1) Find Midpoint of data ranges - Column (B)
- (2) Multiply Weighted Counts (A) by Midpoints (B) to generate (C)
- (3) Divide the sum of column (C) by the total weighted response at the bottom of column (A):
- 86,933,138,666 divided by 1,039,746 = 83,610
- The calculated average matches the average produced by mTAB
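The same calculation can be sketched in Python. This is an illustrative sketch under stated assumptions rather than mTAB's own code: the name `grouped_mean` and the `(midpoint, weighted_count)` layout are hypothetical, and the counts are the rounded table values, so the column sums differ slightly from the figures above while the rounded mean is the same.

```python
def grouped_mean(rows):
    """Weighted mean of bracket midpoints.

    rows: iterable of (midpoint, weighted_count), i.e. columns (B) and (A).
    Hypothetical helper that mirrors steps (1) to (3).
    """
    rows = list(rows)
    weighted_sum = sum(midpoint * count for midpoint, count in rows)  # sum of (C)
    total = sum(count for _, count in rows)                           # total of (A)
    if total == 0:
        raise ValueError("total weighted response is zero")
    return weighted_sum / total


# (midpoint, weighted response) pairs from the table
income_rows = [
    (7_500, 11_714),
    (20_000, 46_054),
    (30_000, 83_965),
    (40_000, 102_093),
    (52_500, 155_721),
    (67_500, 161_435),
    (87_500, 193_540),
    (112_500, 134_706),
    (137_500, 59_748),
    (175_000, 41_971),
    (225_000, 16_391),
    (275_000, 32_409),
]

mean = grouped_mean(income_rows)
print(round(mean))  # 83610
```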
Standard Deviation Calculation
For categorized questions, each response is assigned one or two stat weights. If a single weight is assigned, that value is used to calculate the standard deviation. If two weights are provided, their midpoint is used.
D = Question Mean - Stat Value (as described above)
SS = Sum of Squares: the sum of D*D*Weighted Response Count over all table responses
Sample = Sum of the Weighted Response Counts over all table responses
Standard Deviation = SQRT(SS/(Sample-1))
The calculation is the same for continuous variables, except that the actual data values are used instead of stat weights.
D = Question Mean - Response Value
SS = Sum of Squares: the sum of D*D*Respondent Weight over all responses
Sample = Sum of the Respondent Weights over all responses
Standard Deviation = SQRT(SS/(Sample-1))
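Both variants reduce to the same weighted computation, differing only in which value is paired with each weight. A minimal Python sketch follows. It assumes that the "Question Mean" is the weighted mean of the same values and that the divisor is (Sample - 1); the name `weighted_std` is hypothetical.

```python
import math


def weighted_std(pairs):
    """Weighted standard deviation of (value, weight) pairs.

    For categorized questions, value is the stat value (or the midpoint of
    two stat weights) and weight is the weighted response count. For
    continuous variables, value is the response value and weight is the
    respondent weight. Assumes the question mean is the weighted mean.
    """
    pairs = list(pairs)
    sample = sum(weight for _, weight in pairs)             # Sample
    if sample <= 1:
        raise ValueError("Sample must exceed 1")
    mean = sum(value * weight for value, weight in pairs) / sample
    ss = sum((mean - value) ** 2 * weight for value, weight in pairs)  # SS
    return math.sqrt(ss / (sample - 1))


# Two responses with values 1 and 3, each carrying a weight of 1:
# mean = 2, SS = 2, Sample = 2, so the result is sqrt(2 / 1).
print(weighted_std([(1.0, 1.0), (3.0, 1.0)]))
```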
Comparison of two population means using T-Statistic
When the two populations have equal variances
\(t = \dfrac{m_1-m_2}{\sqrt{\dfrac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}(\dfrac{1}{n_1}+\dfrac{1}{n_2})}}\)
Where
m1 = Mean of the 1st sample
s1 = Standard Deviation of the 1st sample
n1 = Unweighted sample size of the 1st sample
m2 = Mean of the 2nd sample
s2 = Standard Deviation of the 2nd sample
n2 = Unweighted sample size of the 2nd sample
Decision rules:
If |t|<1.65 then the two populations are NOT significantly different at 90%
If |t|≥1.65 then the two populations ARE significantly different at 90%
If |t|<1.95 then the two populations are NOT significantly different at 95%
If |t|≥1.95 then the two populations ARE significantly different at 95%
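The equal-variance formula and its decision rules can be sketched in Python as follows. The function names and the input figures in the usage lines are hypothetical and chosen only for illustration; the thresholds 1.65 and 1.95 are the ones stated in the decision rules above.

```python
import math

# Thresholds taken from the decision rules above
THRESHOLDS = {90: 1.65, 95: 1.95}


def pooled_t(m1, s1, n1, m2, s2, n2):
    """t statistic for two means, assuming equal population variances."""
    pooled_variance = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (m1 - m2) / standard_error


def is_significant(t, level):
    """Apply the decision rule: significant when |t| >= threshold."""
    return abs(t) >= THRESHOLDS[level]


# Hypothetical inputs: means 11 and 10, both standard deviations 2,
# both unweighted sample sizes 50.
t = pooled_t(11, 2, 50, 10, 2, 50)
print(round(t, 2))            # 2.5
print(is_significant(t, 95))  # True
```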
When the two populations have UNEQUAL variances
\(t = \dfrac{m_1-m_2}{\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}}\)
Where
m1 = Mean of the 1st sample
s1 = Standard Deviation of the 1st sample
n1 = Unweighted sample size of the 1st sample
m2 = Mean of the 2nd sample
s2 = Standard Deviation of the 2nd sample
n2 = Unweighted sample size of the 2nd sample
Decision rules:
If |t|<1.65 then the two populations are NOT significantly different at 90%
If |t|≥1.65 then the two populations ARE significantly different at 90%
If |t|<1.95 then the two populations are NOT significantly different at 95%
If |t|≥1.95 then the two populations ARE significantly different at 95%
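A minimal Python sketch of the unequal-variance statistic, where each sample contributes its own variance term to the standard error. The function name and the input figures are hypothetical and used only for illustration.

```python
import math


def unequal_variance_t(m1, s1, n1, m2, s2, n2):
    """t statistic for two means without assuming equal variances."""
    standard_error = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (m1 - m2) / standard_error


# Hypothetical inputs: 9/36 + 16/64 = 0.5, so t = 2 / sqrt(0.5)
t_unequal = unequal_variance_t(12, 3, 36, 10, 4, 64)
print(round(t_unequal, 3))  # 2.828

# Same decision rule as above, at the 95% threshold of 1.95
print(abs(t_unequal) >= 1.95)  # True
```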