Math Help Forum

#1  dreamer1, December 11th, 2008, 08:50 AM
CDF of generalized gaussian distribution

What is the expression for the cumulative distribution function (CDF) of the generalized Gaussian distribution (GGD)? The PDF of the distribution is given by:
f(x)=a e^{-|bx|^c},
where
a=\frac{bc}{2\Gamma(\frac{1}{c})}
and
b=\frac{1}{\sigma_x} \sqrt{\frac{\Gamma(\frac{3}{c})}{\Gamma(\frac{1}{c}) }}

Thanks

Last edited by dreamer1; December 11th, 2008 at 12:11 PM.
#2  meymathis, December 11th, 2008, 10:59 AM

\int^{x}_{-\infty}a\exp(-|b t|^c)\, dt

After some googling, I have not found anything better than that.

You can write the expression in terms of the generalized error function, but in the end you still have the same integral at the heart of it. The fact is that, for general c, there is no expression that avoids both an integral and an infinite series. For certain values of c (like 1), you can write down a closed-form expression (though I think it has to be split into two expressions, one for x<0 and one for x\geq 0).

Even for c=2, which is the Gaussian case, the CDF is usually written in terms of the error function, which is itself defined by an integral.
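
As a sketch of how far the reduction goes: the substitution u = (bt)^c turns the integral into a lower incomplete gamma function. That is still a special function defined by an integral, but most numerical packages provide it. The symbol P (the regularized lower incomplete gamma function) is notation introduced here, not something used elsewhere in the thread; the derivation relies on the PDF being symmetric, so that F(0) = 1/2.

```latex
\begin{align*}
F(x) &= \frac{1}{2} + \operatorname{sgn}(x)\int_0^{|x|} a\,e^{-(bt)^c}\,dt \\
     &= \frac{1}{2} + \operatorname{sgn}(x)\,\frac{a}{bc}
        \int_0^{|bx|^c} u^{\frac{1}{c}-1}e^{-u}\,du
        \qquad \bigl(u=(bt)^c\bigr) \\
     &= \frac{1}{2} + \frac{\operatorname{sgn}(x)}{2}\,
        P\!\left(\tfrac{1}{c},\,|bx|^c\right),
        \qquad P(s,z)=\frac{1}{\Gamma(s)}\int_0^{z}u^{s-1}e^{-u}\,du .
\end{align*}
% Special cases:
%   c = 1:  P(1,z) = 1 - e^{-z}, so
%           F(x) = \tfrac12 e^{bx} for x<0,  F(x) = 1 - \tfrac12 e^{-bx} for x\geq 0
%   c = 2:  P(1/2,z) = \operatorname{erf}(\sqrt{z}), so
%           F(x) = \tfrac12\bigl(1 + \operatorname{erf}(bx)\bigr), with b = 1/(\sigma_x\sqrt{2})
```

The last step uses a/(bc) = 1/(2\Gamma(1/c)), which follows directly from the definition of a.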
#3  dreamer1, December 11th, 2008, 12:04 PM

Thanks for the reply.
I also got the same integral:
F(x)=\int^{x}_{-\infty}f(t)\,dt=\int^x_{-\infty}a\exp(-|b t|^c)\, dt

but I didn't know (and still don't know) what to do with it.

Actually, I'm trying to measure the goodness of fit of empirical data to GGDs with different shape parameters c.
The Kolmogorov-Smirnov test needs the empirical CDF F_x(t) and the distribution CDF F(t).
In Matlab (and in general) it is easy to find the empirical CDF of the given data and evaluate it at each sample, but how do I get the value of the GGD CDF?
#4  meymathis, December 11th, 2008, 04:09 PM

What you need to do is evaluate the integral numerically. There are a few ways of doing this in Matlab. There is the "quad" family of functions, but I have had problems with them when the bounds are infinite, and they won't necessarily produce a CDF that is monotonically increasing (since they approximate the function and then integrate).

So probably the best way (?) is to sample the function numerically over reasonable bounds with a small spacing, and then do a trapezoidal numerical integration.

So first figure out some reasonable bounds. From your formulation, you seem to know a priori what the standard deviation \sigma_x is.

So let

B = 10\, \sigma_x

By Chebyshev's inequality, you are guaranteed to miss at most 1% of the total area under the curve by using this as a bound. For most distributions, it is substantially better than that. You may want to crank that sucker down to 4 or 5 times \sigma_x, say, rather than 10.
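
For reference, the 1% figure is Chebyshev's inequality with k = 10. The same bound is much looser for the smaller multipliers, so choosing 4 or 5 relies on the tails of the particular distribution decaying quickly rather than on the inequality itself (here \mu = 0, since the PDF is symmetric about zero):

```latex
P\bigl(|X-\mu|\ge k\,\sigma_x\bigr)\le\frac{1}{k^{2}}:
\qquad k=10 \Rightarrow 1\%,\quad k=5 \Rightarrow 4\%,\quad k=4 \Rightarrow 6.25\%
```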

Code:
% a, b, c as defined in the first post; B is the bound chosen above
x = linspace(-B,B,10000);
pdf = a*exp(-abs(b*x).^c);   % abs() is needed because of the |bx| in the PDF

% perform trapezoidal cumulative integration
cdf = cumtrapz(x,pdf);
You will be able to tell how well you did by looking at 1-cdf(end). If that is very small, then chances are you have a good sampling of the pdf. If you don't want your cdf to be sampled quite that finely, you still need to calculate it over a big range with a small spacing (as I have done), and then you can downsample.

For example:
Code:
x_small = -B:0.05:B;
cdf_small = interp1(x,cdf,x_small,'linear','extrap');
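
Putting the pieces together, here is a minimal end-to-end sketch of the numerical CDF with two sanity checks. The values of sigma_x and c are example assumptions, not values from the thread. The second check assumes the identity F(x) = 1/2 + sgn(x)/2 * P(1/c, |bx|^c), which comes from substituting u = |bt|^c in the integral, where P is the regularized lower incomplete gamma function computed by gammainc.

```matlab
% Example parameters (assumed for illustration; not from the thread)
sigma_x = 1.0;      % standard deviation of the GGD
c       = 1.5;      % shape parameter

% Scale and normalisation constants from the definitions in the first post
b = sqrt(gamma(3/c) / gamma(1/c)) / sigma_x;
a = b * c / (2 * gamma(1/c));

% Integration bounds and sampling grid
B = 10 * sigma_x;
x = linspace(-B, B, 10000);

% PDF; abs() keeps the power real for negative x and non-integer c
pdf = a * exp(-abs(b * x).^c);

% Cumulative trapezoidal integration gives the CDF on the grid
cdf = cumtrapz(x, pdf);

% Check 1: the total area should be close to 1
area_err = abs(1 - cdf(end));

% Check 2: compare with the incomplete-gamma form of the CDF.
% gammainc(z, s) is the regularized lower incomplete gamma function P(s, z).
cdf_ref = 0.5 + 0.5 * sign(x) .* gammainc(abs(b * x).^c, 1/c);
max_err = max(abs(cdf - cdf_ref));

fprintf('1 - cdf(end) = %.3g, max |cdf - cdf_ref| = %.3g\n', area_err, max_err);
```

If both numbers are small, the grid is fine enough and the bounds are wide enough. If gammainc is available, cdf_ref can also be used directly in place of the trapezoidal result.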
#5  dreamer1, December 12th, 2008, 07:10 AM

Great, the first code snippet was exactly what I needed!

As for the 1-cdf(end) part, I'm not sure you are correct. The KS test searches for
\max_t |F_x(t) - F(t)|,

which is probably somewhere near the middle of the 0-0.5 or 0.5-1 ranges of cdf values.
In general, if my cdf is anything even close to Gaussian, it should have no problem coming very close to 1 at cdf(end), and I expect 1-cdf(end) to always be very close to 0 (for reasonable parameters of the GGD). Please correct me if I'm wrong.
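
For concreteness, a rough sketch of computing that maximum against the numerically integrated GGD CDF. The data Y here is a synthetic stand-in and the parameter values are assumptions; the statistic is computed by hand rather than through any toolbox function.

```matlab
% Synthetic stand-in for the measured data (assumption: replace with real samples)
Y = randn(500, 1);

% GGD parameters to test against (assumed values; c = 2 is the Gaussian case)
sigma_x = 1.0;
c       = 2;
b = sqrt(gamma(3/c) / gamma(1/c)) / sigma_x;
a = b * c / (2 * gamma(1/c));

% Numerical GGD CDF on a fine grid (sampled PDF, then cumtrapz)
B   = 10 * sigma_x;
x   = linspace(-B, B, 10000);
cdf = cumtrapz(x, a * exp(-abs(b * x).^c));

% Evaluate the model CDF at the sorted samples and clamp it to [0, 1]
Ys = sort(Y(:));
n  = numel(Ys);
F  = interp1(x, cdf, Ys, 'linear', 'extrap');
F  = min(max(F, 0), 1);

% The empirical CDF jumps from (k-1)/n to k/n at the k-th sorted sample,
% so both sides of each jump have to be checked
k = (1:n)';
D = max(max(k/n - F, F - (k-1)/n));

fprintf('KS statistic D = %.4f\n', D);
```

D is the KS statistic only; turning it into a p-value needs the Kolmogorov distribution (or a toolbox routine), which this sketch leaves out.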

One last question on the topic (or a bit off topic): the \chi^2 test needs the distribution's pdf. I suppose it should be fine to use
Code:
[pdf,x]=ksdensity(Y);
to estimate the pdf of the values in Y?
#6  meymathis, December 16th, 2008, 04:05 PM

Quote:
Originally Posted by dreamer1
As for the 1-cdf(end) part, I'm not sure you are correct.
Sorry, this was meant only as a check on how good the numerical integration approximation was. We are discretely sampling the PDF and then doing trapezoidal sums (essentially Riemann sums) as the approximation to the integral to get the CDF. If we undersampled the PDF, or chose the bounds too narrow, then cdf(end) may not be very close to 1. I wasn't referring to kstest.

The last part of my post was referring to the fact that maybe you didn't want to have such a finely resolved CDF. If that was the case, then I was showing how you might downsample it.

I assume your Y is the data? For the \chi^2 tests that I am familiar with, you don't need to do any kernel smoothing of the data. You would just bin the data and "bin" the PDF (take the difference of the CDF at the endpoints of each bin), and do the \chi^2 test. It doesn't look like ksdensity would be necessary.
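
A rough sketch of that binning approach, with the caveat that the data, the GGD parameters, and the bin edges below are all illustrative assumptions rather than anything specified in the thread:

```matlab
% Synthetic stand-in for the data (assumption: replace with real samples)
Y = randn(1000, 1);

% Assumed GGD parameters and the numerically integrated CDF
sigma_x = 1.0;
c       = 2;
b = sqrt(gamma(3/c) / gamma(1/c)) / sigma_x;
a = b * c / (2 * gamma(1/c));
B = 10 * sigma_x;
x = linspace(-B, B, 10000);
cdf = cumtrapz(x, a * exp(-abs(b * x).^c));

% Bin edges (hypothetical choice): 8 inner bins plus two wide tail bins out to +/-B
edges = [-B, linspace(-2, 2, 9) * sigma_x, B];
nbins = numel(edges) - 1;

% Observed counts per bin
observed = zeros(1, nbins);
for k = 1:nbins
    observed(k) = sum(Y >= edges(k) & Y < edges(k+1));
end

% Expected counts: sample size times the CDF difference across each bin
Fe = interp1(x, cdf, edges, 'linear');
expected = numel(Y) * diff(Fe);

% Pearson chi-square statistic
chi2 = sum((observed - expected).^2 ./ expected);
fprintf('chi^2 = %.3f with %d bins\n', chi2, nbins);
```

The statistic would then be compared against a \chi^2 distribution with nbins - 1 degrees of freedom, minus one for each GGD parameter estimated from the same data. The wide tail bins are there to keep the expected counts from getting too small.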

If Y is the CDF that we just calculated, I'm not sure why any kernel smoothing would be necessary either.

But I should add the disclaimer that I have not done much on this part of statistics. I have used both kstest and \chi^2, but I have never done any kernel smoothing. I would think kernel smoothing would be useful for visualization, but not really for trying to perform hypothesis tests comparing empirical data to a given distribution.
#7  dreamer1, December 18th, 2008, 03:46 AM

Quote:
Sorry, this was meant to be a check just on how good the numerical integration approximation was...
I misunderstood you. Now it makes sense.

For the \chi^2 test, you are, of course, right again. Binned data is what the test uses, so the cdf differences will do.

Thanks for the help, I think I've finally got things straightened out.
#8  meymathis, December 18th, 2008, 05:07 PM

Excellent