Difference between revisions of "Statistical formulas documentation"
Latest revision as of 13:27, 26 August 2013
mTAB Median Calculation from Income Brackets
Approximate Annual HH Income | |
Median | 71,180 |
Unweighted Sample Total Count | 10,811 |
Approximate Annual HH Income | Weighted Response | (1) Accumulated Response |
Less than $15,000 | 11,714 | 11,714 | |
$15,000 - $24,999 | 46,054 | 57,768 | |
$25,000 - $34,999 | 83,965 | 141,733 | |
$35,000 - $44,999 | 102,093 | 243,826 | |
$45,000 - $59,999 | 155,721 | 399,546 | |
$60,000 - $74,999 | 161,435 | 560,981 <--Median will fall here (3) | |
$75,000 - $99,999 | 193,540 | 754,521 | |
$100,000 - $124,999 | 134,706 | 889,227 | |
$125,000 - $149,999 | 59,748 | 948,975 | |
$150,000 - $199,999 | 41,971 | 990,946 | |
$200,000 - $249,999 | 16,391 | 1,007,337 | |
$250,000 or More | 32,409 | 1,039,746 | |
Weighted Subset Total Count | 1,039,746 | ||
Weighted Sample Total Count | 1,255,411 |
(1) | Calculated Accumulated Weighted Response | |
(2) | Divide total (1,039,746) by 2=519,873 | 519,873 |
(3) | Find first value in Accumulated Response column that is greater than step 2 value | |
The median will fall within the $60,000-$74,999 bracket | ||
(4) | Step 2 amount (519,873) MINUS preceding break accumulated response 399,546 = | 120,327 |
(5) | Acc. Response where Median will fall 560,981 MINUS preceding break 399,546 = | 161,435 |
(6) | Step 4 Divided by Step 5 | 0.74536 |
(7) | Multiply Step 6 by the range 14,999 ($60,000-$74,999) | 11,180 |
(8) | Add Step 7 to bottom of range $60,000 | 71,180 |
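Steps (1) through (8) can be sketched in Python as follows. This is only an illustration of the interpolation described above, not mTAB's actual code: the function name `grouped_median` and the `BRACKETS` list are hypothetical, and the bracket bounds are taken from the STAT1/STAT2 columns of the mean table on this page. Because the sketch uses the rounded weighted counts shown in the table, intermediate values can differ from the worked steps by a unit or so.

```python
# Hypothetical sketch: median by linear interpolation within a bracket.
# Each entry is (lower bound, upper bound, weighted response count).
BRACKETS = [
    (1, 14_999, 11_714),
    (15_000, 24_999, 46_054),
    (25_000, 34_999, 83_965),
    (35_000, 44_999, 102_093),
    (45_000, 59_999, 155_721),
    (60_000, 74_999, 161_435),
    (75_000, 99_999, 193_540),
    (100_000, 124_999, 134_706),
    (125_000, 149_999, 59_748),
    (150_000, 199_999, 41_971),
    (200_000, 249_999, 16_391),
    (250_000, 300_000, 32_409),
]


def grouped_median(brackets):
    """Return the interpolated median of bracketed, weighted counts."""
    total = sum(count for _, _, count in brackets)
    half = total / 2                      # step 2
    accumulated = 0                       # step 1, built up row by row
    for lower, upper, count in brackets:
        if accumulated + count > half:    # step 3: first row past the halfway point
            fraction = (half - accumulated) / count   # steps 4 to 6
            return lower + fraction * (upper - lower)  # steps 7 and 8
        accumulated += count
    raise ValueError("no bracket contains the median")


median = grouped_median(BRACKETS)
print(round(median))
```

Rounded to the nearest dollar, the result agrees with the 71,180 shown in step (8).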
mTAB Mean/Weighted Average Calculation from Income Brackets
Approximate Annual HH Income | |
Mean/Weighted Average | 83,610 |
Unweighted Sample Total Count | 10,811 |
Approximate Annual HH Income | (A) Weighted Response | STAT1 | STAT2 | (B) Midpoint | (C) = (A) x (B) |
Less than $15,000 | 11,714 | 1 | 14,999 | 7,500 | 87,857,249 |
$15,000 - $24,999 | 46,054 | 15,000 | 24,999 | 20,000 | 921,059,004 |
$25,000 - $34,999 | 83,965 | 25,000 | 34,999 | 30,000 | 2,518,899,346 |
$35,000 - $44,999 | 102,093 | 35,000 | 44,999 | 40,000 | 4,083,654,266 |
$45,000 - $59,999 | 155,721 | 45,000 | 59,999 | 52,500 | 8,175,254,132 |
$60,000 - $74,999 | 161,435 | 60,000 | 74,999 | 67,500 | 10,896,752,251 |
$75,000 - $99,999 | 193,540 | 75,000 | 99,999 | 87,500 | 16,934,669,636 |
$100,000 - $124,999 | 134,706 | 100,000 | 124,999 | 112,500 | 15,154,345,342 |
$125,000 - $149,999 | 59,748 | 125,000 | 149,999 | 137,500 | 8,125,258,359 |
$150,000 - $199,999 | 41,971 | 150,000 | 199,999 | 175,000 | 7,344,910,850 |
$200,000 - $249,999 | 16,391 | 200,000 | 249,999 | 225,000 | 3,688,025,252 |
$250,000 or More | 32,409 | 250,000 | 300,000 | 275,000 | 8,912,452,979 |
Weighted Subset Total Count | 1,039,746 | 86,933,138,666 | |||
Weighted Sample Total Count | 1,255,411 |
- (1) Find Midpoint of data ranges - Column (B)
- (2) Multiply Weighted Counts (A) by Midpoints (B) to generate (C)
- (3) Divide the sum of column (C) by the total weighted response at the bottom of column (A)...
- 86,933,138,666 divided by 1,039,746 = 83,610
- You will notice the calculated average matches the mTAB produced average
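The three steps above can be sketched in Python. This is a minimal illustration rather than mTAB's implementation: the names `ROWS` and `grouped_mean` are hypothetical, and the midpoints are copied from column (B) as displayed, so they are rounded values. Because the weighted counts shown in column (A) are also rounded, the sum of products differs slightly from the column (C) total, but the mean rounds to the same figure.

```python
# Hypothetical sketch: mean of bracketed data from weighted counts and midpoints.
# Each entry is (weighted response count (A), midpoint (B)).
ROWS = [
    (11_714, 7_500),
    (46_054, 20_000),
    (83_965, 30_000),
    (102_093, 40_000),
    (155_721, 52_500),
    (161_435, 67_500),
    (193_540, 87_500),
    (134_706, 112_500),
    (59_748, 137_500),
    (41_971, 175_000),
    (16_391, 225_000),
    (32_409, 275_000),
]


def grouped_mean(rows):
    """Return sum(count * midpoint) / sum(count)."""
    total_weight = sum(count for count, _ in rows)
    weighted_sum = sum(count * midpoint for count, midpoint in rows)  # column (C)
    return weighted_sum / total_weight


mean = grouped_mean(ROWS)
print(round(mean))
```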
Standard Deviation Calculation
For categorized questions, each response is assigned 1 or 2 stat weights. If a single weight is assigned, then this is the value used to calculate the standard deviation. If 2 weights are provided, the midpoint is used.
D = Question Mean - Stat Value as described above
SS = Sum of Squares, D*D*Weighted Response Count, for all table responses
Sample = Sum of all Weighted Response Counts for all table responses
Standard Deviation = SQRT(SS/(Sample-1))
The calculation is the same for continuous variables except the actual data values are used instead of stat weight.
D = Question Mean - Response Value
SS = Sum of Squares, D*D*Respondent Weight Count for each response
Sample = Sum of all Respondent Weights for each response
Standard Deviation = SQRT(SS/(Sample-1))
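A short Python sketch of this calculation follows. It assumes the Question Mean is the weighted mean of the same values, which the text does not state explicitly, and the function name `weighted_std` is hypothetical. For categorized questions, `values` would hold the stat value (or midpoint) of each response; for continuous variables, the actual data values.

```python
import math


def weighted_std(values, weights):
    """Standard deviation as SQRT(SS / (Sample - 1)) with weighted counts."""
    sample = sum(weights)  # Sample: sum of all weights
    # Assumption: the question mean is the weighted mean of the values.
    mean = sum(w * v for v, w in zip(values, weights)) / sample
    # SS: sum over responses of D * D * weight, where D = mean - value.
    ss = sum(w * (mean - v) ** 2 for v, w in zip(values, weights))
    return math.sqrt(ss / (sample - 1))


# Toy data, not taken from the tables on this page.
sd = weighted_std([1, 2, 3], [1, 1, 1])
print(sd)
```

With unit weights the formula reduces to the ordinary sample standard deviation.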
Comparison of two population means using T-Statistic
When the two populations have equal variances
\(t = \dfrac{m_1-m_2}{\sqrt{\dfrac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}(\dfrac{1}{n_1}+\dfrac{1}{n_2})}}\)
Where
m1 = Mean of the 1st sample
s1 = Standard Deviation of the 1st sample
n1 = Unweighted sample size of the 1st sample
m2 = Mean of the 2nd sample
s2 = Standard Deviation of the 2nd sample
n2 = Unweighted sample size of the 2nd sample
Decision rules:
If |t|<1.65 then the two populations are NOT significantly different at 90%
If |t|≥1.65 then the two populations ARE significantly different at 90%
If |t|<1.95 then the two populations are NOT significantly different at 95%
If |t|≥1.95 then the two populations ARE significantly different at 95%
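The equal-variance formula and the decision rules can be sketched in Python. The function names `pooled_t` and `is_significant` are hypothetical, and the thresholds 1.65 and 1.95 are simply the values stated in the rules above.

```python
import math


def pooled_t(m1, s1, n1, m2, s2, n2):
    """t-statistic for two means, assuming equal population variances."""
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))


def is_significant(t, threshold):
    """Apply the decision rule: significant when |t| >= threshold."""
    return abs(t) >= threshold


# Toy inputs, not taken from the tables on this page.
t = pooled_t(m1=10, s1=2, n1=50, m2=8, s2=2, n2=50)
print(t, is_significant(t, 1.65), is_significant(t, 1.95))
```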
When the two populations have UNEQUAL variances
\(t = \dfrac{m_1-m_2}{\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}}\)
Where
m1 = Mean of the 1st sample
s1 = Standard Deviation of the 1st sample
n1 = Unweighted sample size of the 1st sample
m2 = Mean of the 2nd sample
s2 = Standard Deviation of the 2nd sample
n2 = Unweighted sample size of the 2nd sample
Decision rules:
If |t|<1.65 then the two populations are NOT significantly different at 90%
If |t|≥1.65 then the two populations ARE significantly different at 90%
If |t|<1.95 then the two populations are NOT significantly different at 95%
If |t|≥1.95 then the two populations ARE significantly different at 95%
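The unequal-variance statistic can be sketched the same way. The function name `unequal_variance_t` is hypothetical; the result is compared against the same thresholds listed in the decision rules above.

```python
import math


def unequal_variance_t(m1, s1, n1, m2, s2, n2):
    """t-statistic for two means without assuming equal variances."""
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (m1 - m2) / standard_error


# Toy inputs, not taken from the tables on this page.
t = unequal_variance_t(m1=11, s1=3, n1=36, m2=10, s2=4, n2=64)
print(t, abs(t) >= 1.65, abs(t) >= 1.95)
```

When both samples have the same size and the same standard deviation, this statistic equals the equal-variance one.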