This blog offers a quick overview of statistics topics, with illustrations to help readers understand and appreciate the value, importance, and applications of statistics in everyday life. It is common knowledge that to understand something, you must first gather data. The skill of learning from data is known as statistics. It is concerned with gathering information, describing it, and analyzing it, which frequently leads to conclusions.
Monday, July 29, 2019
Friday, July 26, 2019
Regression Analysis: Overview
Overview
What is regression analysis and what does it mean to perform a regression?
How does regression analysis work?


Why should your organization use regression analysis?
Regression analysis is a helpful statistical method that can be leveraged across an organization to determine the degree to which particular independent variables influence dependent variables.
The possible scenarios for conducting regression analysis to yield valuable, actionable business insights are endless.
The next time someone in your business is proposing a hypothesis that states that one factor, whether you can control that factor or not, is impacting a portion of the business, suggest performing a regression analysis to determine just how confident you should be in that hypothesis! This will allow you to make more informed business decisions, allocate resources more efficiently, and ultimately boost your bottom line.
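To make the idea concrete, here is a minimal sketch of a regression with a single independent variable, fitted by ordinary least squares. The function name `fit_line` and the ad-spend and sales figures are invented for illustration; they are not from any real data set.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares with one independent variable: y ~ intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R^2: share of the variation in y that the fitted line explains
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot

# Hypothetical numbers, invented for illustration only
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, r_squared = fit_line(ad_spend, sales)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r_squared:.3f}")
```

The slope estimates how much the dependent variable changes per unit of the independent variable, and R^2 gives a rough sense of how much confidence the hypothesis deserves.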
An Interactive Guide To The Fourier Transform


- What does the Fourier Transform do? Given a smoothie, it finds the recipe.
- How? Run the smoothie through filters to extract each ingredient.
- Why? Recipes are easier to analyze, compare, and modify than the smoothie itself.
- How do we get the smoothie back? Blend the ingredients.
- The Fourier Transform takes a time-based pattern, measures every possible cycle, and returns the overall "cycle recipe" (the amplitude, offset, & rotation speed for every cycle that was found).
From Smoothie To Recipe
- Pour through the "banana" filter. 1 oz of bananas are extracted.
- Pour through the "orange" filter. 2 oz of oranges.
- Pour through the "milk" filter. 3 oz of milk.
- Pour through the "water" filter. 3 oz of water.
- Filters must be independent. The banana filter needs to capture bananas, and nothing else. Adding more oranges should never affect the banana reading.
- Filters must be complete. We won't get the real recipe if we leave out a filter ("There were mangoes too!"). Our collection of filters must catch every possible ingredient.
- Ingredients must be combine-able. Smoothies can be separated and re-combined without issue (A cookie? Not so much. Who wants crumbs?). The ingredients, when separated and combined in any order, must make the same result.
See The World As Cycles
- Start with a time-based signal
- Apply filters to measure each possible "circular ingredient"
- Collect the full recipe, listing the amount of each "circular ingredient"
- If earthquake vibrations can be separated into "ingredients" (vibrations of different speeds & amplitudes), buildings can be designed to avoid interacting with the strongest ones.
- If sound waves can be separated into ingredients (bass and treble frequencies), we can boost the parts we care about, and hide the ones we don't. The crackle of random noise can be removed. Maybe similar "sound recipes" can be compared (music recognition services compare recipes, not the raw audio clips).
- If computer data can be represented with oscillating patterns, perhaps the least-important ones can be ignored. This "lossy compression" can drastically shrink file sizes (which is why JPEG and MP3 files are much smaller than raw .bmp or .wav files).
- If a radio wave is our signal, we can use filters to listen to a particular channel. In the smoothie world, imagine each person paid attention to a different ingredient: Adam looks for apples, Bob looks for bananas, and Charlie gets cauliflower (sorry bud).
Think With Circles, Not Just Sinusoids
- A "sinusoid" is a specific back-and-forth pattern (a sine or cosine wave), and 99% of the time, it refers to motion in one dimension.
- A "circle" is a round, 2d pattern you probably know. If you enjoy using 10-dollar words to describe 10-cent ideas, you might call a circular path a "complex sinusoid".
Following Circular Paths
- How big is the circle? (Amplitude, i.e. size of radius)
- How fast do we draw it? (Frequency. 1 circle/second is a frequency of 1 Hertz (Hz) or 2*pi radians/sec)
- Where do we start? (Phase angle, where 0 degrees is the x-axis)

[0 1] means:
- 0 amplitude for the 0Hz cycle (0Hz = a constant cycle, stuck on the x-axis at zero degrees)
- 1 amplitude for the 1Hz cycle (completes 1 cycle per time interval)
- The blue graph measures the real part of the cycle. Another lovely math confusion: the real axis of the circle, which is usually horizontal, has its magnitude shown on the vertical axis. You can mentally rotate the circle 90 degrees if you like.
- The time points are spaced at the fastest frequency. A 1Hz signal needs 2 time points for a start and stop (a single data point doesn't have a frequency). The time values [1 -1] show the amplitude at these equally-spaced intervals.
[0 1] is a pure 1Hz cycle.
[0 1 1] means "Nothing at 0Hz, 1Hz of amplitude 1, 2Hz of amplitude 1". These cycles generate the time values [2 -1 -1], which start at the max (2) and dip low (-1).
The notation magnitude:angle sets the phase. So [0 1:45] is a 1Hz cycle that starts at 45 degrees. Compared with [0 1], on the time side we get [.7 -.7] instead of [1 -1], because our cycle isn't exactly lined up with our measuring intervals, which are still at the halfway point (this could be desired!).
Making A Spike In Time
Can we make a spike in time, like (4 0 0 0), using cycles? I'll use parentheses () for a sequence of time points, and brackets [] for a sequence of cycles.
- At time 0, the first instant, every cycle ingredient is at its max. Ignoring the other time points, (4 ? ? ?) can be made from 4 cycles (0Hz 1Hz 2Hz 3Hz), each with a magnitude of 1 and phase of 0 (i.e., 1 + 1 + 1 + 1 = 4).
- At every future point (t = 1, 2, 3), the sum of all cycles must cancel.
Time  0 1 2 3
-------------
0Hz:  0 0 0 0
1Hz:  0 1 2 3
2Hz:  0 2 0 2
3Hz:  0 3 2 1
- Time 0: All cycles at their max (total of 4)
- Time 1: 1Hz and 3Hz cancel (positions 1 & 3 are opposites), 0Hz and 2Hz cancel as well. The net is 0.
- Time 2: 0Hz and 2Hz line up at position 0, while 1Hz and 3Hz line up at position 2 (the opposite side). The total is still 0.
- Time 3: 0Hz and 2Hz cancel. 1Hz and 3Hz cancel.
- Time 4 (repeat of t=0): All cycles line up.
Take the cycles [1 1], [1 1 1], [1 1 1 1] and notice the signals we generate: (2 0), (3 0 0), (4 0 0 0).
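The cycle-to-time examples above can be checked with a few lines of Python. This is only a sketch: the function name `cycles_to_time` and the (magnitude, phase) pair format are my own choices, standing in for the magnitude:angle notation. It adds up the cycles and reads the real part at N equally spaced time points.

```python
import cmath
import math

def cycles_to_time(recipe):
    """Add up the cycles in a recipe and read the real part at each time point.

    recipe: list of (magnitude, phase_in_degrees) pairs for 0Hz, 1Hz, 2Hz, ...
    The number of time points equals the number of cycles.
    """
    n_points = len(recipe)
    values = []
    for n in range(n_points):
        total = 0j
        for k, (magnitude, phase_deg) in enumerate(recipe):
            # Cycle k has made k * n / n_points turns by time step n
            angle = 2 * math.pi * k * n / n_points + math.radians(phase_deg)
            total += magnitude * cmath.exp(1j * angle)
        values.append(round(total.real, 2))
    return values

pure_1hz = cycles_to_time([(0, 0), (1, 0)])            # the recipe [0 1]
two_cycles = cycles_to_time([(0, 0), (1, 0), (1, 0)])  # the recipe [0 1 1]
shifted = cycles_to_time([(0, 0), (1, 45)])            # the recipe [0 1:45]
spike = cycles_to_time([(1, 0)] * 4)                   # the recipe [1 1 1 1]
```

Rounding to two decimals hides floating-point noise; the shifted cycle comes out as roughly 0.71 and -0.71, which the text shortens to .7 and -.7.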
Moving The Time Spike
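Can we change our spike to (0 4 0 0)? The cycle ingredients should be similar to those for (4 0 0 0), but the cycles must align at t=1 (one second in the future). Here's where phase comes in.
With every cycle starting at phase 0, we get the (4 0 0 0) time pattern. Boring. To delay each cycle by 1 second, shift its starting phase:
- A 0Hz cycle doesn't move, so it's already aligned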
- A 1Hz cycle goes 1 revolution in the entire 4 seconds, so a 1-second delay is a quarter-turn. Phase shift it 90 degrees backwards (-90) and it gets to phase=0, the max value, at t=1.
- A 2Hz cycle is twice as fast, so give it twice the angle to cover (-180 or 180 phase shift -- it's across the circle, either way).
- A 3Hz cycle is 3x as fast, so give it 3x the distance to move (-270 or +90 phase shift)
If time points (4 0 0 0) are made from cycles [1 1 1 1], then time points (0 4 0 0) are made from [1 1:-90 1:180 1:90]. (Note: I'm using "1Hz", but I mean "1 cycle over the entire time period").
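Here is a small sketch of that phase-shift rule: each cycle k gets k times the angle of the slowest cycle, wrapped into the range (-180, 180]. The helper names `delayed_spike_recipe` and `time_points` are hypothetical, chosen just for this example.

```python
import cmath
import math

def delayed_spike_recipe(n_points, delay):
    """Phase in degrees for cycles 0..n_points-1 so that every cycle peaks at t = delay.

    Every cycle keeps a magnitude of 1; only the starting angle changes.
    """
    phases = []
    for k in range(n_points):
        phase = -360 * k * delay / n_points        # faster cycles need a bigger head start
        phases.append(180 - (180 - phase) % 360)   # wrap into (-180, 180]
    return phases

def time_points(phases):
    """Sum the unit-magnitude cycles and read the real part at each time step."""
    n_points = len(phases)
    return [
        round(sum(cmath.exp(1j * (2 * math.pi * k * n / n_points + math.radians(p)))
                  for k, p in enumerate(phases)).real, 6)
        for n in range(n_points)
    ]

one_second = delayed_spike_recipe(4, 1)
print(one_second, time_points(one_second))
```

Calling `delayed_spike_recipe` with a different delay applies the same rule with a larger angle per cycle.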
What about (0 0 4 0), i.e. a 2-second delay? 0Hz has no phase. 1Hz has 180 degrees, 2Hz has 360 (aka 0), and 3Hz has 540 (aka 180), so it's [1 1:180 1 1:180].
Discovering The Full Transform
- Separate the full signal (a b c d) into "time spikes": (a 0 0 0) (0 b 0 0) (0 0 c 0) (0 0 0 d)
- For any frequency (like 2Hz), the tentative recipe is "a/4 + b/4 + c/4 + d/4" (the amplitude of each spike is split among all frequencies)
- Wait! We need to offset each spike with a phase delay (the angle for a "1 second delay" depends on the frequency).
- Actual recipe for a frequency = a/4 (no offset) + b/4 (1 second offset) + c/4 (2 second offset) + d/4 (3 second offset).

- N = number of time samples we have
- n = current sample we're considering (0 .. N-1)
- x_n = value of the signal at time n
- k = current frequency we're considering (0 Hertz up to N-1 Hertz)
- X_k = amount of frequency k in the signal (amplitude and phase, a complex number)
- The 1/N factor is usually moved to the reverse transform (going from frequencies back to time). This is allowed, though I prefer 1/N in the forward transform since it gives the actual sizes for the time spikes. You can get wild and even use 1/sqrt(N) on both transforms (going forward and back creates the 1/N factor).
- n/N is the percent of the time we've gone through. 2 * pi * k is our speed in radians / sec. e^-ix is our backwards-moving circular path. The combination is how far we've moved, for this speed and time.
- The raw equations for the Fourier Transform just say "add the complex numbers". Many programming languages cannot handle complex numbers directly, so you convert everything to rectangular coordinates and add those.
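Putting the definitions above together, with the 1/N factor kept in the forward transform, the recipe for frequency k can be written as X_k = (1/N) * sum over n of x_n * e^(-i * 2 * pi * k * n / N). Below is a direct, unoptimized sketch of that sum. The function names `dft` and `describe` are my own, and Python handles complex numbers natively, so no conversion to rectangular coordinates is needed here.

```python
import cmath
import math

def dft(samples):
    """Forward transform with the 1/N factor: X_k = (1/N) * sum_n x_n * e^(-i*2*pi*k*n/N)."""
    n_samples = len(samples)
    recipe = []
    for k in range(n_samples):
        total = 0j
        for n, x_n in enumerate(samples):
            # n / N is the fraction of the time elapsed; 2*pi*k is the speed of cycle k
            total += x_n * cmath.exp(-2j * math.pi * k * n / n_samples)
        recipe.append(total / n_samples)
    return recipe

def describe(recipe):
    """Turn each complex X_k into a (magnitude, phase in degrees) pair."""
    return [(round(abs(x_k), 2), round(math.degrees(cmath.phase(x_k)))) for x_k in recipe]

spike_at_0 = describe(dft([4, 0, 0, 0]))
spike_at_1 = describe(dft([0, 4, 0, 0]))
```

The 2Hz phase for the delayed spike may print as -180 rather than 180, depending on floating-point rounding; both describe the same point across the circle.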
Onward
It bothered me that I couldn't work out the transform of (1 0 0 0) in my head. For me, it was like saying I knew addition but, gee whiz, I'm not sure what "1 + 1 + 1 + 1" would be. Why not? Shouldn't we have an intuition for the simplest of operations?
Thanks to:
- Scott Young, for the initial impetus for this post
- Shaheen Gandhi, Roger Cheng, and Brit Cruise for kicking around ideas & refining the analogy
- Steve Lehar for great examples of the Fourier Transform on images
- Charan Langton for her detailed walkthrough
- Julius Smith for a fantastic walkthrough of the Discrete Fourier Transform (what we covered today)
- Bret Victor for his techniques on visualizing learning
Appendix: Projecting Onto Cycles

Appendix: Article With R Code Samples
Appendix: Using The Code
Thursday, July 25, 2019
Tuesday, July 23, 2019
What is Principal Component Analysis?
Introduction
Step by step explanation
Step 1: Standardization
Step 2: Covariance Matrix computation
What do the covariances that we have as entries of the matrix tell us about the correlations between the variables?
It’s actually the sign of the covariance that matters:
- If positive, the two variables increase or decrease together (correlated).
- If negative, one increases when the other decreases (inversely correlated).
Now that we know that the covariance matrix is nothing more than a table that summarizes the correlations between all possible pairs of variables, let’s move to the next step.
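As a quick illustration of how the signs show up, here is a sketch that builds a covariance matrix by hand. The three variables are made-up numbers, and the function names are hypothetical; the sample covariance (dividing by n - 1) is assumed.

```python
from statistics import mean

def covariance(a, b):
    """Sample covariance of two variables (divides by n - 1)."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (len(a) - 1)

def covariance_matrix(columns):
    """Symmetric table holding the covariance of every pair of variables."""
    return [[covariance(ci, cj) for cj in columns] for ci in columns]

# Made-up variables, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]   # tends to rise with x
z = [9, 7, 6, 4, 1]   # tends to fall as x rises
cov = covariance_matrix([x, y, z])
for row in cov:
    print(row)
```

The entry for (x, y) comes out positive and the entry for (x, z) negative, while the diagonal holds each variable's own variance.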
Step 3: Compute the eigenvectors and eigenvalues of the covariance matrix to identify the principal components
Eigenvectors and eigenvalues are the linear algebra concepts that we need to compute from the covariance matrix in order to determine the principal components of the data. Before getting to the explanation of these concepts, let’s first understand what we mean by principal components.
Principal components are new variables that are constructed as linear combinations or mixtures of the initial variables. These combinations are done in such a way that the new variables (i.e., principal components) are uncorrelated and most of the information within the initial variables is squeezed or compressed into the first components. So, the idea is 10-dimensional data gives you 10 principal components, but PCA tries to put maximum possible information in the first component, then maximum remaining information in the second and so on.
An important thing to realize here is that the principal components are less interpretable and don’t have any real meaning since they are constructed as linear combinations of the initial variables.
Geometrically speaking, principal components represent the directions of the data that explain a maximal amount of variance, that is to say, the lines that capture most of the information in the data. The relationship between variance and information here is that the larger the variance carried by a line, the larger the dispersion of the data points along it, and the larger the dispersion along a line, the more information it has. To put all this simply, just think of principal components as new axes that provide the best angle to see and evaluate the data, so that the differences between the observations are better visible.
Step 4: Feature vector
Last step : Recast the data along the principal components axes
In the previous steps, apart from standardization, you do not make any changes to the data; you just select the principal components and form the feature vector, but the input data set always remains in terms of the original axes (i.e., in terms of the initial variables).
In this step, which is the last one, the aim is to use the feature vector formed from the eigenvectors of the covariance matrix to reorient the data from the original axes to the ones represented by the principal components (hence the name Principal Components Analysis). This can be done by multiplying the transpose of the feature vector by the transpose of the standardized data set.
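The steps can be traced end to end on a tiny example. The sketch below is deliberately limited to two variables so that the eigenvalues and eigenvectors of the 2x2 covariance matrix can be written in closed form; real data with more variables would need a general eigen-decomposition routine. The function names and the sample numbers are assumptions made for illustration.

```python
import math
from statistics import mean, stdev

def standardize(column):
    """Step 1: rescale a variable to mean 0 and standard deviation 1."""
    m, s = mean(column), stdev(column)
    return [(v - m) / s for v in column]

def pca_two_variables(x, y):
    zx, zy = standardize(x), standardize(y)
    n = len(zx)
    # Step 2: covariance matrix [[a, b], [b, d]] of the standardized data
    a = sum(u * u for u in zx) / (n - 1)
    d = sum(v * v for v in zy) / (n - 1)
    b = sum(u * v for u, v in zip(zx, zy)) / (n - 1)
    # Step 3: closed-form eigenvalues of a symmetric 2x2 matrix, largest first
    centre, half_gap = (a + d) / 2, math.hypot((a - d) / 2, b)
    eigenvalues = [centre + half_gap, centre - half_gap]
    if b == 0:
        first = (1.0, 0.0) if a >= d else (0.0, 1.0)
    else:
        length = math.hypot(b, eigenvalues[0] - a)
        first = (b / length, (eigenvalues[0] - a) / length)
    second = (-first[1], first[0])  # perpendicular to the first eigenvector
    # Step 4: the feature vector holds the eigenvectors we keep (both, here)
    feature_vector = [first, second]
    # Last step: feature vector (transposed) times data (transposed), one row per component
    scores = [[e[0] * u + e[1] * v for u, v in zip(zx, zy)] for e in feature_vector]
    return eigenvalues, feature_vector, scores

# Made-up observations, for illustration only
eigenvalues, feature_vector, scores = pca_two_variables([1, 2, 3, 4, 5], [2, 4, 5, 4, 6])
print(eigenvalues)
```

Each eigenvalue is the variance carried by its component, so dividing an eigenvalue by the sum of all of them gives the share of information that component holds.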


Monday, July 15, 2019
Chi-Square Analysis Test of Independence
Chi-Square Analysis
CRITICAL VALUEb by edniel maratas on Scribd
Analysis of Variance (ANOVA)
DISCUSSION ON ANOVA
CRITICAL VALUE
Friday, July 12, 2019
Tuesday, July 9, 2019
Test Concerning Means: Small Sample
Test on Small Sample (t-test)
4. Determine the critical value and critical region
Since the level of significance is 0.025, df = n – 1 = 6 – 1 = 5, and the alternative hypothesis is left-tailed, the critical value is t = -2.571.
Reject Ho if the computed t is less than -2.571.
5. Compute the value of the test statistic:
Given:
Sample mean = 11.6 minutes
Population mean = 12 minutes
Standard deviation = 2.1 minutes
Sample size = 6 samples
6. Decision: Since the computed t = -0.47 is in the acceptance region, we fail to reject Ho.
7. Conclusion:
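The arithmetic behind steps 5 and 6 can be reproduced in a few lines, assuming the usual one-sample statistic t = (sample mean - population mean) / (standard deviation / sqrt(n)). The function name is my own; the inputs and the critical value -2.571 are the ones given above.

```python
import math

def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """t statistic for a one-sample test of the mean."""
    return (sample_mean - hypothesized_mean) / (sample_sd / math.sqrt(n))

t_computed = one_sample_t(sample_mean=11.6, hypothesized_mean=12, sample_sd=2.1, n=6)
critical_value = -2.571                    # taken from the text (alpha = 0.025, df = 5)
reject_h0 = t_computed < critical_value    # left-tailed test
print(round(t_computed, 2), reject_h0)
```

Because the computed value lies to the right of the critical value, it falls outside the rejection region.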
Test Concerning Means: Large Sample
Test Concerning Means
One sample Case
Test Concerning Means: Large Sample (z-test)
Solution: Following the steps in hypothesis testing we have
1. State the null and alternative hypothesis. Mathematically,
Ho: μ = 36 months (The average lifetime of lightbulbs is 36 months. Or, the average lifetime of lightbulbs is not different from 36 months)
Ha: μ ≠ 36 months (The average lifetime of lightbulbs is not equal to 36 months. Or, the average lifetime of lightbulbs is different from 36 months)
2. Level of significance α = 0.01.
3. Select an appropriate test statistic.
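The test statistic is the z-test, since the sample size is greater than 30. The formula is z = (x̄ − μ) / (σ / √n), where x̄ is the sample mean, μ is the hypothesized mean, σ is the standard deviation, and n is the sample size.
6. Decision: Since the computed z = -3.54 is in the rejection region, reject Ho and accept Ha: μ ≠ 36 months.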
7. Conclusion:
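The decision in step 6 can be double-checked without a z-table. This sketch uses only the values stated above (α = 0.01, a two-tailed alternative, and the computed z = -3.54) together with `NormalDist` from Python's standard library; the variable names are my own.

```python
from statistics import NormalDist

alpha = 0.01
z_computed = -3.54   # value reported in the decision step

standard_normal = NormalDist()                         # mean 0, standard deviation 1
z_critical = standard_normal.inv_cdf(1 - alpha / 2)    # two-tailed: alpha/2 in each tail
reject_h0 = abs(z_computed) > z_critical
p_value = 2 * standard_normal.cdf(-abs(z_computed))    # probability of a result at least this extreme
print(round(z_critical, 3), reject_h0)
```

The p-value gives the same verdict as the critical-value comparison: Ho is rejected whenever the p-value is below α.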