Difference between revisions of "Statistical formulas documentation"

Revision as of 18:51, 8 August 2013

mTAB Median Calculation from Income Brackets

Approximate Annual HH Income
Median	71,180
Unweighted Sample Total Count	10,811

Approximate Annual HH Income	Weighted Response	Accumulated Response (1)
Less than $15,000	11,714	11,714
$15,000 - $24,999	46,054	57,768
$25,000 - $34,999	83,965	141,733
$35,000 - $44,999	102,093	243,826
$45,000 - $59,999	155,721	399,546
$60,000 - $74,999	161,435	560,981 <--Median will fall here (3)
$75,000 - $99,999	193,540	754,521
$100,000 - $124,999	134,706	889,227
$125,000 - $149,999	59,748	948,975
$150,000 - $199,999	41,971	990,946
$200,000 - $249,999	16,391	1,007,337
$250,000 or More	32,409	1,039,746
Weighted Subset Total Count	1,039,746
Weighted Sample Total Count	1,255,411

(1)	Calculate the accumulated weighted response
(2)	Divide the weighted subset total (1,039,746) by 2	519,873
(3)	Find the first value in the Accumulated Response column that is greater than the Step 2 value
	The median falls within the $60,000 - $74,999 bracket
(4)	Step 2 amount (519,873) MINUS the accumulated response of the preceding bracket (399,546) =	120,327
(5)	Accumulated response of the median bracket (560,981) MINUS that of the preceding bracket (399,546) =	161,435
(6)	Step 4 divided by Step 5	0.74536
(7)	Multiply Step 6 by the width of the range, 14,999 ($60,000 - $74,999)	11,180
(8)	Add Step 7 to the bottom of the range, $60,000	71,180
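
The eight steps above can be sketched in Python as follows. This is a minimal illustration rather than mTAB's actual implementation: the function name `bracket_median` and the list `INCOME_BRACKETS` are hypothetical, the counts are the rounded weighted responses from the table, and the bounds of the two open-ended brackets (1 and 300,000) are assumed from the STAT1/STAT2 values used in the mean calculation.

```python
def bracket_median(brackets):
    """Interpolate a median from (lower, upper, weighted_count) brackets.

    Brackets must be sorted from lowest to highest income.
    """
    total = sum(count for _, _, count in brackets)
    half = total / 2                      # step 2
    accumulated = 0                       # running total from step 1
    for lower, upper, count in brackets:
        if accumulated + count > half:    # step 3: bracket holding the median
            fraction = (half - accumulated) / count    # steps 4 to 6
            return lower + fraction * (upper - lower)  # steps 7 and 8
        accumulated += count
    raise ValueError("brackets are empty or all counts are zero")


# Rounded weighted responses from the table above (illustrative only).
INCOME_BRACKETS = [
    (1, 14_999, 11_714),
    (15_000, 24_999, 46_054),
    (25_000, 34_999, 83_965),
    (35_000, 44_999, 102_093),
    (45_000, 59_999, 155_721),
    (60_000, 74_999, 161_435),
    (75_000, 99_999, 193_540),
    (100_000, 124_999, 134_706),
    (125_000, 149_999, 59_748),
    (150_000, 199_999, 41_971),
    (200_000, 249_999, 16_391),
    (250_000, 300_000, 32_409),
]

median = bracket_median(INCOME_BRACKETS)
print(round(median))  # rounds to the 71,180 shown in the table
```

Because the table shows rounded weighted counts, the unrounded result differs from the worked example by a fraction of a dollar before rounding.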

mTAB Mean/Weighted Average Calculation from Income Brackets

Approximate Annual HH Income
Mean/Weighted Average	83,610
Unweighted Sample Total Count	10,811

Approximate Annual HH Income	(A) Weighted Response	STAT1	STAT2	(B) Midpoint	(C) = (A) x (B)
Less than $15,000	11,714	1	14,999	7,500	87,857,249
$15,000 - $24,999	46,054	15,000	24,999	20,000	921,059,004
$25,000 - $34,999	83,965	25,000	34,999	30,000	2,518,899,346
$35,000 - $44,999	102,093	35,000	44,999	40,000	4,083,654,266
$45,000 - $59,999	155,721	45,000	59,999	52,500	8,175,254,132
$60,000 - $74,999	161,435	60,000	74,999	67,500	10,896,752,251
$75,000 - $99,999	193,540	75,000	99,999	87,500	16,934,669,636
$100,000 - $124,999	134,706	100,000	124,999	112,500	15,154,345,342
$125,000 - $149,999	59,748	125,000	149,999	137,500	8,215,258,359
$150,000 - $199,999	41,971	150,000	199,999	175,000	7,344,910,850
$200,000 - $249,999	16,391	200,000	249,999	225,000	3,688,025,252
$250,000 or More	32,409	250,000	300,000	275,000	8,912,452,979
Weighted Subset Total Count	1,039,746				86,933,138,666
Weighted Sample Total Count	1,255,411

(1) Find Midpoint of data ranges - Column (B)

(2) Multiply Weighted Counts (A) by Midpoints (B) to generate (C)

(3) Divide the sum of column (C) by the total weighted response at the bottom of column (A):

86,933,138,666 divided by 1,039,746 = 83,610

The calculated average matches the average produced by mTAB.
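
The three steps can be sketched in Python as follows. This is an illustration under stated assumptions, not mTAB's code: the names `bracket_mean` and `STAT_BRACKETS` are hypothetical, the counts are the rounded weighted responses from the table, and the STAT2 value of the $200,000 - $249,999 bracket is taken as 249,999, which is what the 225,000 midpoint implies.

```python
def bracket_mean(brackets):
    """Weighted mean from (stat1, stat2, weighted_count) brackets."""
    total = sum(count for _, _, count in brackets)
    # Steps 1 and 2: midpoint (B) times weighted count (A) gives column (C).
    weighted_sum = sum(count * (stat1 + stat2) / 2
                       for stat1, stat2, count in brackets)
    # Step 3: sum of column (C) divided by the total of column (A).
    return weighted_sum / total


# (STAT1, STAT2, rounded weighted response) from the table above.
STAT_BRACKETS = [
    (1, 14_999, 11_714),
    (15_000, 24_999, 46_054),
    (25_000, 34_999, 83_965),
    (35_000, 44_999, 102_093),
    (45_000, 59_999, 155_721),
    (60_000, 74_999, 161_435),
    (75_000, 99_999, 193_540),
    (100_000, 124_999, 134_706),
    (125_000, 149_999, 59_748),
    (150_000, 199_999, 41_971),
    (200_000, 249_999, 16_391),
    (250_000, 300_000, 32_409),
]

mean = bracket_mean(STAT_BRACKETS)
print(round(mean))  # rounds to the 83,610 shown in the table
```

The sketch uses unrounded midpoints (for example 19,999.5 rather than 20,000), which appears to be what the column (C) values in the table are based on.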

Standard Deviation Calculation

For categorized questions, each response is assigned one or two stat weights. If a single weight is assigned, that value is used to calculate the standard deviation. If two weights are provided, their midpoint is used.

D = Question Mean - Stat Value as described above
SS = Sum of Squares, D*D*Weighted Response Count, for all table responses
Sample = Sum of all Weighted Response Counts for all table responses

Standard Deviation = SQRT(SS/(Sample-1))

The calculation is the same for continuous variables, except that the actual data values are used instead of stat weights.

D = Question Mean - Response Value
SS = Sum of Squares, D*D*Respondent Weight Count for each response
Sample = Sum of all Respondent Weights for each response

Standard Deviation = SQRT(SS/(Sample-1))
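
Both variants reduce to the same weighted calculation, sketched below in Python. The function name `weighted_std` is hypothetical, and the sketch assumes the formula is read as SQRT(SS/(Sample-1)), with the mean taken as the weighted mean of the same values. For categorized questions, pass the stat values (or midpoints) with the weighted response counts; for continuous variables, pass the response values with the respondent weights.

```python
import math


def weighted_std(values, weights):
    """Weighted standard deviation, read as SQRT(SS / (Sample - 1))."""
    sample = sum(weights)                                  # Sample
    mean = sum(v * w for v, w in zip(values, weights)) / sample
    # SS: squared deviation from the mean (D*D) times the weight.
    ss = sum(w * (mean - v) ** 2 for v, w in zip(values, weights))
    return math.sqrt(ss / (sample - 1))


# Hypothetical example: three values, each with a weight of 1.
print(weighted_std([1, 2, 3], [1, 1, 1]))
```

With unit weights this is the ordinary sample standard deviation, which gives a quick way to check the sketch against a spreadsheet.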

Comparison of two population means using T-Statistic

When the two populations have equal variances

$t = \dfrac{m_1-m_2}{\sqrt{\dfrac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)}}$

Where
m1 = Mean of the 1st sample
s1 = Standard Deviation of the 1st sample
n1 = Unweighted sample count of the 1st sample

m2 = Mean of the 2nd sample
s2 = Standard Deviation of the 2nd sample
n2 = Unweighted sample count of the 2nd sample

Decision rules:
If |t| < 1.65, the two populations are NOT significantly different at the 90% confidence level
If |t| ≥ 1.65, the two populations ARE significantly different at the 90% confidence level
If |t| < 1.95, the two populations are NOT significantly different at the 95% confidence level
If |t| ≥ 1.95, the two populations ARE significantly different at the 95% confidence level
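
The equal-variance formula and its decision rules can be sketched in Python as follows. The function names `pooled_t` and `is_significant` and the sample figures are hypothetical; the thresholds 1.65 and 1.95 are the ones listed in the decision rules above.

```python
import math


def pooled_t(m1, s1, n1, m2, s2, n2):
    """T-statistic for two means, assuming equal population variances."""
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))


def is_significant(t, threshold):
    """Apply a decision rule: significant when |t| reaches the threshold."""
    return abs(t) >= threshold


# Hypothetical samples: means 10 and 8, both with SD 2 and 50 respondents.
t_equal = pooled_t(10.0, 2.0, 50, 8.0, 2.0, 50)
print(t_equal, is_significant(t_equal, 1.65), is_significant(t_equal, 1.95))
```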

When the two populations have UNEQUAL variances

$t = \dfrac{m_1-m_2}{\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}}$

Where
m1 = Mean of the 1st sample
s1 = Standard Deviation of the 1st sample
n1 = Unweighted sample count of the 1st sample

m2 = Mean of the 2nd sample
s2 = Standard Deviation of the 2nd sample
n2 = Unweighted sample count of the 2nd sample

Decision rules:
If |t| < 1.65, the two populations are NOT significantly different at the 90% confidence level
If |t| ≥ 1.65, the two populations ARE significantly different at the 90% confidence level
If |t| < 1.95, the two populations are NOT significantly different at the 95% confidence level
If |t| ≥ 1.95, the two populations ARE significantly different at the 95% confidence level
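
The unequal-variance formula differs only in the denominator, as this Python sketch shows. The function name `unequal_variance_t` and the sample figures are hypothetical; the result is compared against the same 1.65 and 1.95 thresholds listed above.

```python
import math


def unequal_variance_t(m1, s1, n1, m2, s2, n2):
    """T-statistic for two means when the variances are not assumed equal."""
    return (m1 - m2) / math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


# Hypothetical samples: mean 11, SD 3, n 36 versus mean 10, SD 4, n 64.
t_unequal = unequal_variance_t(11.0, 3.0, 36, 10.0, 4.0, 64)
significant_90 = abs(t_unequal) >= 1.65
significant_95 = abs(t_unequal) >= 1.95
print(t_unequal, significant_90, significant_95)
```

In this example each sample contributes 0.25 to the variance of the difference, so t is 1 divided by the square root of 0.5, about 1.41, which falls below both thresholds.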
