User talk:Melcombe

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license.

This is an old revision of this page, as edited by Paul August (talk | contribs) at 15:18, 20 January 2010 (Sigma algebra too technical?: new section). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Negative Binomial MGF

Dear Melcombe, I was wondering if you could explain why my modification to the Negative binomial mgf was wrong. I seem to have conflicting views from my university notes, textbooks and the internet. I have an exam tomorrow so would appreciate any explanation. Econstatgeek (talk) 18:05, 1 May 2008 (UTC)

Unfortunately I do not have time to do a full derivation. I have checked the formula in a couple of books. Also there must be a direct correspondence between the mgf and the characteristic function. One thing that confuses things is a tendency for different books to use different parameterisations of the negative binomial, so that the same correct formula can look different. Melcombe (talk) 08:42, 2 May 2008 (UTC)

I'll put it down to a difference in notation convention then. Thank you for your reply. Econstatgeek (talk) 16:11, 2 May 2008 (UTC)
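
As an illustration of how a change of parameterisation alters the look of the formula: the thread does not say which conventions were at issue, so the two below are assumptions chosen as common textbook cases. Here p is the success probability on each trial and r is the required number of successes; the only difference is whether the variable counts failures or total trials.

```latex
% X = number of failures before the r-th success
M_X(t) = \left( \frac{p}{1 - (1-p)e^{t}} \right)^{r}, \qquad t < -\ln(1-p)

% Y = X + r = total number of trials needed to reach r successes
M_Y(t) = e^{rt}\, M_X(t) = \left( \frac{p\,e^{t}}{1 - (1-p)e^{t}} \right)^{r}, \qquad t < -\ln(1-p)
```

Both are correct for their own definition of the variable; they differ only by the factor e^{rt} that comes from shifting the variable by r.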

Probability and statistics sub-project?

Dear Melcombe,

I recently proposed starting a "probability and statistics" sub-project (aka task force or work group) of WikiProject Maths and was wondering if you'd be interested in participating. If so, please add your name and any comments at WP:WikiProject Council/Proposals#Probability and statistics. Regards, Qwfp (talk) 22:05, 12 March 2008 (UTC) (PS I believe it's traditional to start a user talk page with some variation on "Welcome to Misplaced Pages" but as you joined before I did I don't feel I'm in a position to welcome you. The lack of previous comments here would seem to reflect positively on your editing.)

Order statistics

I noticed that you put an "expert" tag on F-test and I completely agree with that. But I wonder if you can be specific about your concerns about order statistic. Michael Hardy (talk) 19:35, 10 April 2008 (UTC)

OK, I have placed something on article talk page. Melcombe (talk) 08:53, 11 April 2008 (UTC)

Statistical Arbitrage

In my humble opinion, Statistical arbitrage is more within the scope of something like WP:WikiProject Finance than WP:WikiProject Statistics. It is a topic in Finance (or Mathematical Finance) having some statistical aspects, rather than a mainstream Statistics topic. Encyclops (talk) 17:58, 22 April 2008 (UTC)

See Misplaced Pages talk:WikiProject Statistics#Things on boundary of scope. My thought was that Statistical arbitrage did involve some statistical thinking in setting up the method. At present WikiProject Statistics is trying to bring all articles on statistics topics into the list of statistical topics to provide a basis for further progress. You might want to look at the project page and consider joining. Melcombe (talk) 18:09, 22 April 2008 (UTC)

Category

Good idea for creating Category:Meta analysis. However, I do think that it should be spelled "meta-analysis". I have therefore requested a renaming to Category:Meta-analysis on WP:CFD. Please comment on this page. JFW | T@lk 13:12, 30 April 2008 (UTC)

Actuarial science

Hello. Can you please explain the reason for the change in category? Thanks! -- Avi (talk) 14:38, 6 May 2008 (UTC)

Fair point. Thanks for taking the time to explain. -- Avi (talk) 17:20, 6 May 2008 (UTC)

Estimating Confidence Intervals

Honorable Melcombe: Is it valid to estimate a confidence interval by multiplying the Student's critical T Value (e.g., +/- 1.96 for a 95% confidence range in a normal population) times the standard deviation of the data range? For example, if the Standard Deviation of the series happens to be 10 and the estimated value is 100, then the 95% confidence range would be from 80.4 (100 - 1.96 x 10) to 119.6 (100 + 1.96 x 10).

If this isn't the appropriate forum to ask this question, please forgive my ignorance and intrusion. Mike. --Grizthedog (talk) 21:50, 13 June 2008 (UTC)

You should really find a better forum for discussing all this. You might try the newsgroup called sci.stat.math ... if you don't have a news-reader, it is accessible at http://groups.google.com/group/sci.stat.math/topics .
You need to be more clear about what it is you want to construct. The most usual case is a confidence interval for the mean, in which case your formula would be wrong as you would need to divide the standard deviation of the sample by the square root of the sample size. If you want to find a confidence interval for a further observation from the same population, then your formula is approximately correct ... for modest sample sizes you would need to multiply the sample standard deviation by the square root of (1+1/N), where N is the sample size (you might find something about this in the article on "tolerance intervals" ). All this would be making the usual assumptions, but your use of the word "series" might indicate that you have a "time-series" in which there might be serial correlation, in which case the assumptions would not hold. Melcombe (talk) 09:07, 16 June 2008 (UTC)
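
The two cases described in the reply above can be sketched as follows. This is a minimal illustration, not a definitive recipe: the function names and the sample data are hypothetical, independent observations are assumed, and the critical value 1.96 is the normal-distribution value used in the question (for small samples a Student's t critical value would be larger).

```python
import math
import statistics


def mean_confidence_interval(data, critical=1.96):
    """Interval for the population mean: uses s / sqrt(N)."""
    n = len(data)
    centre = statistics.fmean(data)
    s = statistics.stdev(data)  # sample standard deviation (N - 1 divisor)
    half_width = critical * s / math.sqrt(n)
    return centre - half_width, centre + half_width


def further_observation_interval(data, critical=1.96):
    """Interval for one further observation: uses s * sqrt(1 + 1/N)."""
    n = len(data)
    centre = statistics.fmean(data)
    s = statistics.stdev(data)
    half_width = critical * s * math.sqrt(1 + 1 / n)
    return centre - half_width, centre + half_width


# Hypothetical sample, chosen only for illustration.
sample = [90.0, 100.0, 110.0, 95.0, 105.0]
mean_lo, mean_hi = mean_confidence_interval(sample)
obs_lo, obs_hi = further_observation_interval(sample)
print("interval for the mean:", (mean_lo, mean_hi))
print("interval for a further observation:", (obs_lo, obs_hi))
```

The interval for a further observation is always the wider of the two, because it carries the variability of a single new value in addition to the uncertainty in the estimated mean.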

Re: Lists of basic topics

I've replied to your post at WP:VPR.

The Transhumanist    19:58, 4 July 2008 (UTC)

RFC at St. Petersburg paradox

As you have contributed to an earlier related discussion at Talk:St. Petersburg paradox#That d_mn'd period, you may be interested in Talk:St. Petersburg paradox#Request for comments: punctuation after displayed formula.  --Lambiam 18:17, 8 August 2008 (UTC)

K-factor error is now at AfD

Hello Melcombe. Your prod of K-factor error was contested. See Misplaced Pages:Articles for deletion/K-factor error. EdJohnston (talk) 00:44, 25 August 2008 (UTC)

Confidence Level / Significance Level

The latest change is better (though the link doesn't work). However, the article still needs a simple definition of confidence level (i.e. something a layman can understand), and more importantly, an indication that, when used with statistical testing, it should be interpreted as indicating the significance level, preferably with a link to that article. The reason for this is simply the obvious fact that, for the average layman, the reason they would be looking up confidence levels or confidence intervals is that they have seen something in print reporting the results of statistics-based research and want to know what the terms mean.

We can do the standard wikipedia "who can hold their breath longest" thing over this, but I'd rather be sensible about it. Perhaps if you changed the article to include these changes in the way you want, we can avoid the baby stuff.

Jim 125.255.16.233 (talk) 12:44, 12 October 2008 (UTC)

Replied on article Talk page. Melcombe (talk) 15:28, 14 October 2008 (UTC)

Central limit theorem

Please look at Talk:Central_limit_theorem#Asymptotic_normality_for_statistical_estimators. Boris Tsirelson (talk) 19:02, 5 November 2008 (UTC)


Chi square distribution

Hi Melcombe. Thanks for your constructive edit to Chi squared distribution. It beats me why such a well-known fact has attracted such controversy. People (mostly but not exclusively anons) have repeatedly removed the assertion about normality. And it's not vandalism, as they justify their actions on the talk page. Do you have any insight into why they do this? Me, I'm baffled. Best wishes, Robinh (talk) 20:46, 11 December 2008 (UTC)

Student's T test

Thank you, Melcombe, for your edit on the standard deviation estimate not being unbiased. I was concerned about this too and was contemplating a similar edit. However I still have some concerns and would value your opinion. Please see the student's t test talk page for details.

Best wishes, SciberDoc (talk) 22:20, 1 January 2009 (UTC)

Minimum of exponential variables

Hi. I reverted a change to the exponential distribution article about the minimum of exponential variables, and then noticed that it was your change, and that you are a statistician, and I am not. This prompts me to notify you of my change, in case I am mistaken. I think that you ran afoul of the notation in the article, where the rate $\lambda$ is used in the pdf like this: $\lambda e^{-\lambda x}$, instead of the alternative definition of $e^{-x/\lambda}/\lambda$. -- Coffee2theorems (talk) 16:07, 8 January 2009 (UTC)
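
For reference, under the rate parameterisation the minimum of independent exponential variables with rates λ_1, ..., λ_n is itself exponential with rate λ_1 + ... + λ_n, so its mean is 1 / (λ_1 + ... + λ_n). The simulation below is only a rough check of that standard result; the rates, the seed, and the function name are arbitrary assumptions, and it does not reproduce the article text under discussion.

```python
import random


def simulate_min_mean(rates, n_draws=100_000, seed=0):
    """Monte Carlo estimate of E[min(X_1, ..., X_n)] for independent
    exponentials, where each X_i has pdf rate * exp(-rate * x)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        # random.expovariate takes the RATE (1 / mean), not the scale.
        total += min(rng.expovariate(rate) for rate in rates)
    return total / n_draws


rates = (1.0, 2.0, 3.0)          # hypothetical rates
estimate = simulate_min_mean(rates)
theoretical = 1.0 / sum(rates)   # mean of an exponential with rate 6
print(estimate, theoretical)
```

Under the alternative parameterisation, where λ denotes the mean, the same draw would be `rng.expovariate(1 / mean)`, which is the kind of mix-up the comment above describes.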

Cook's distance

Hi Melcombe, you were the last person improving Cook's distance page. There is still one symbol that I am not sure what exactly it means: $Y_{j(i)}$. Could you please add its description to the page? Thanks, Alex -- talk 17:40, 17 March 2009 (UTC)

I have added something. Melcombe (talk) 17:52, 17 March 2009 (UTC)
Thanks for both the fix and your remarks to me. Alex -- talk 18:50, 23 March 2009 (UTC)

Citing sources

I see you've been listing lots of books as references, but in many cases without any specific citations (footnotes); please check out Misplaced Pages:Citing sources and ask me if you'd like help converting them to something more useful. Dicklyon (talk) 16:30, 27 March 2009 (UTC)

Talk:Arrival theorem

I've responded to your question on Talk:Arrival theorem and would be interested to hear your views on what you feel the most suitable title is. Whilst random observer property or job observer property both seem suitable to me, arrival theorem and PASTA property (for the Poisson process) seem to be more commonly used, judging by Google Scholar. Gareth Jones (talk) 11:44, 3 June 2009 (UTC)

Characteristic function vs. Moment-generating function

Hi Melcombe, I wonder if you wouldn't mind if we carry on this discussion on your talk page? Anyways, this is what Lukacs has to say about the subject of mgf (I have his book right now, 2nd ed.):

On p.10 he defines an arbitrary integral transform of an r.v., and gives as an example kernels $K_1(t,x) = e^{tx}$ and $K_2(t,x) = t^x$ which give rise to the moment-generating function and probability-generating function respectively. These two functions are essentially the same, as $\text{mgf}(t) = \text{pgf}(e^t)$. He then remarks

Probability generating functions were introduced by Laplace; we will use these functions only rarely (in section 6.3) and mention them here mainly because they were the first integral transforms systematically used in probability theory.

On p.196 he returns to the subject of mgfs to mention that if f(z) is an analytic characteristic function then M(y) = f(-iy), where y is real. Note that Lukacs never defines mgf for complex or imaginary arguments.

On p.198 he gives an example of a function which has moments of all orders yet doesn't possess an mgf (or an analytic cf).

On p.251 he references a paper by Lévy (1937) where he used mgfs to establish certain convolution properties of Poisson-type distributions.

This is an exhaustive list. My interpretation of this list is that Lukacs doesn't really care much about mgfs, though it would have been a shame not to mention them, if only briefly.
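
The two identities quoted from pp. 10 and 196, mgf(t) = pgf(e^t) and M(y) = f(-iy), can be checked numerically for a simple case. The sketch below uses a fair six-sided die purely as an assumed example (it is not taken from Lukacs); for a finite distribution every transform exists everywhere, so the characteristic function can be evaluated directly at a complex argument.

```python
import cmath
import math

# Assumed example distribution: a fair six-sided die.
pmf = {face: 1 / 6 for face in range(1, 7)}


def pgf(s):
    """Probability-generating function E[s^X] (kernel t^x)."""
    return sum(p * s ** x for x, p in pmf.items())


def mgf(t):
    """Moment-generating function E[e^{tX}] (kernel e^{tx})."""
    return sum(p * math.exp(t * x) for x, p in pmf.items())


def cf(z):
    """Characteristic function E[e^{izX}], evaluated at a complex z."""
    return sum(p * cmath.exp(1j * z * x) for x, p in pmf.items())


t = 0.3
print(mgf(t), pgf(math.exp(t)))  # mgf(t) = pgf(e^t)
print(mgf(t), cf(-1j * t))       # M(y) = f(-iy) for real y
```

For distributions with heavier tails the mgf may exist only on an interval around zero, or not at all, which is the situation the p.198 example refers to.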


And another question, Melcombe, when you say that "And considering that the major work on characteristic functions defines them for complex t it needs to be included in the main definition." -- could you please give a reference where you have seen such definition? Both Cuppens and Lukacs define it as

If p is probability measure on $\mathbb{R}^n$, then $\hat{p}$ is analytic if there exists a function f defined on $\mathbb{C}^n$ with complex values which is regular in some neighborhood of the origin and some constant δ such that

$$\hat{p}(t) = f(t) \quad (t \in \mathbb{R}^n,\ \lVert t \rVert < \delta)$$

and later on they prove that if cf is analytic then it admits a representation

$$\hat{p}(z) = \int_{\mathbb{R}^n} e^{i\,(z,u)}\, p(du)$$ in a convex tube $\mathbb{R}^n + i\Gamma$

My assessment of this theory is the following: the characteristic function is defined only for real-valued arguments, because only then does it have all the nice properties such as existence and boundedness, and it is also much easier to generalize such a cf to multi-dimensional r.v.'s. It is also sometimes possible to build an analytic continuation of the cf to a horizontal strip $\{-\alpha < \operatorname{Im}(z) < \beta\}$, which allows us to derive additional results. However, such a continuation is only possible if the r.v. has moments of all orders and the rate of growth of those moments is not too high.

Stpasha (talk) 22:19, 18 June 2009 (UTC)

OK, I have now had a little time to look through what sources I have access to, and they do take the approach of defining the cf for real arguments only, with a later extension to complex values by continuation using analytic functions, which they do say is equivalent to the basic Fourier-type integral for complex arguments (once it is known that the integral exists via the analytical continuation argument). This may be for historical reasons. There may be some way of developing the theory starting from the Fourier-type integral for complex arguments but I haven't found this set down. Nevertheless, because some of the most important results of cf theory are derived via this theory of extension into the complex domain, it seems important in an encyclopedia to note this at an early stage.
As for moment generating functions, it may be that they are unnecessary for theoreticians, but the fact is that they are taught at a fairly elementary level of probability and statistics whereas characteristic functions are not (not least because they involve imaginary numbers). And they are actually used with complex arguments in certain fields of application, where the mgf for some system is derived via theoretical arguments and converted to a density function using numerical inversion based on the inversion formula involving an integral parallel to the imaginary axis. Of course they could do all this using characteristic functions and an integral along or parallel to the real axis, but they don't. I believe the shift in the axis of integration can help the numerical properties of the integration scheme.
Melcombe (talk) 10:09, 25 June 2009 (UTC)
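
The numerical inversion mentioned above can be sketched as follows. This is a hedged illustration only: it assumes the inversion formula f(x) = (1/2π) ∫ M(c + iu) e^{-(c + iu)x} du along the line Re(s) = c, uses the standard normal mgf M(s) = exp(s²/2) as a test case, and applies a plain trapezoidal rule. The function names, the shift c, and the truncation limits are hypothetical choices, not those of any particular application.

```python
import cmath
import math


def mgf_standard_normal(s):
    """mgf of N(0, 1), which is valid for complex s."""
    return cmath.exp(s * s / 2)


def density_from_mgf(mgf, x, c=0.5, u_max=12.0, steps=4800):
    """Approximate the density at x by integrating along the line
    s = c + i*u (parallel to the imaginary axis) with the trapezoidal rule."""
    h = 2 * u_max / steps
    total = 0.0 + 0.0j
    for k in range(steps + 1):
        u = -u_max + k * h
        s = complex(c, u)
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid end points
        total += weight * mgf(s) * cmath.exp(-s * x)
    return (total * h / (2 * math.pi)).real


x = 1.0
approx = density_from_mgf(mgf_standard_normal, x)
exact = math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
print(approx, exact)
```

Setting c = 0 puts the path on the imaginary axis itself, which corresponds to inverting the characteristic function; any other c within the strip where the mgf exists gives the same density in exact arithmetic, and the choice then only affects the numerical behaviour of the integrand.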

Optimal design's Introduction: Providing sufficient context for Misplaced Pages readers?

I trust that the introduction is improved. Do you have further feedback? Thanks. Kiefer.Wolfowitz (talk) 13:45, 29 June 2009 (UTC)

Nuisance variables and nuisance parameters

The split into two different pages I still think is a mistake, so I have raised this at WT:WPSTAT to try to get wider input. Jheald (talk) 17:27, 30 June 2009 (UTC)

Empirical statistical laws

I have nominated the article for deletion on the grounds stated. Please provide a prominent source per Misplaced Pages:Verification. I understand the motivation for having an article like this but question it on the grounds of this policy. I'll leave the content in regression to the mean for the moment but the linked article needs sources for the content to remain. I am assuming good faith, I hope my concerns are clear. Please discuss if not. Cheers. Holon (talk) 12:58, 30 July 2009 (UTC)

"Phenomena"?

The Cambridge Dictionary includes this: "The term is now generally used to label the phenomena that a variable that is extreme on its first measurement will tend to be closer to the centre of the distribution for a later measurement."

Does it really say "phenomena" instead of "phenomenon"? Michael Hardy (talk) 21:41, 1 August 2009 (UTC)

Statistics outlines

I noticed you are interested in statistics, and access to statistics articles.

Please take a look at these:

What is your initial impression?

What's missing?

Are they structured well? That is, do they present their respective subjects well?

How can we make them more useful?

For more information, see WP:OOK, WP:OUTLINE, WP:WPOOK.

For examples of well developed outlines on other subjects see Outline of Japan, Outline of Buddhism, Outline of robotics, and Outline of cell biology, and Outline of forestry.

I look forward to your reply.

The Transhumanist    00:32, 16 September 2009 (UTC)

Null hypothesis article

Your revert of my edits seems a bit hasty. Not only are you incorrect in your summary ("later edits incorrectly changed meaning & number was correct as prob was for either 3 heads or three tails"), which clearly none of my revisions altered (see my diffs), but you have also reverted my correction for the probability of 3 heads or tails, which is 0.5*0.5*0.5 = 1/8, or an eighth, not a quarter. --Zven (talk) 08:46, 24 September 2009 (UTC)

Right I now see what you are getting at, however you also reverted my other edits which probably didn't need to go. --Zven (talk) 08:57, 24 September 2009 (UTC)
Actually, the example is misleading, no wonder there is so much confusion, it states a two headed coin, so three tails is not an outcome at all, only three heads. --Zven (talk) 09:01, 24 September 2009 (UTC)
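
The two numbers at issue in this exchange can be reproduced by enumeration. The sketch below assumes a fair coin and three independent tosses (an assumption for illustration; as noted above, the article's example itself involves a two-headed coin): a specific run such as three heads has probability 1/8, while "three heads or three tails" has probability 1/4.

```python
from fractions import Fraction
from itertools import product

# All 8 equally likely outcomes of three tosses of a fair coin.
outcomes = list(product("HT", repeat=3))
p_each = Fraction(1, len(outcomes))

# Probability of exactly the outcome H, H, H.
p_three_heads = sum(p_each for o in outcomes if o == ("H", "H", "H"))

# Probability that all three tosses agree (HHH or TTT).
p_all_same = sum(p_each for o in outcomes if len(set(o)) == 1)

print(p_three_heads, p_all_same)
```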

Dice/Die - Random Variable

Doh! I knew that! :) Mea culpa. Cheers. Jwesley78 (talk) 14:25, 22 October 2009 (UTC)

Chi-square distribution vs. Chi distribution.

You're right. Thanks! Quantling (talk) 17:32, 17 November 2009 (UTC)

Total variation

Should we have a separate article for the notion of total variation in probability theory and measure theory? I don't like the current arrangement where the material on probability theory has been forked off from the material on measure theory, even though they are essentially the same thing. What seems better to me is to have an article total variation in measure theory and probability theory that deals with both of these notions. Sławomir Biały (talk) 12:15, 9 December 2009 (UTC)

MSWD

A general technique that is used in statistical methods applied to geochronology.

The mathematics provided deal with general cases, split into the two typical ways MSWD is utilized.

First, to provide a way to analyse the outcome of repeated attempts to measure the same "age", each attempt associated with a known variance (obtained by repeated sampling during the individual attempt). This assumes the sample has a single age.

Second, to provide a way to analyse an isochron, usually defined on a plot of two ratios (e.g. 36Ar/39Ar versus 39Ar/40Ar). —Preceding unsigned comment added by 202.55.153.53 (talk) 09:10, 7 January 2010 (UTC)
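
For the first use described above, a minimal sketch follows. The comment gives no formula, so this assumes the usual definition of MSWD as a reduced chi-square about the inverse-variance weighted mean with n - 1 degrees of freedom; the function name and the numbers are hypothetical.

```python
def mswd(ages, sigmas):
    """Mean square weighted deviation of repeated age measurements,
    each with a known standard deviation, about their weighted mean."""
    weights = [1.0 / s ** 2 for s in sigmas]  # inverse-variance weights
    weighted_mean = sum(w * a for w, a in zip(weights, ages)) / sum(weights)
    chi_square = sum(w * (a - weighted_mean) ** 2 for w, a in zip(weights, ages))
    return chi_square / (len(ages) - 1)       # n - 1 degrees of freedom


# Hypothetical measurements: three attempts, each with sigma = 1.
value = mswd([10.0, 12.0, 11.0], [1.0, 1.0, 1.0])
print(value)
```

A value near 1 indicates scatter consistent with the stated variances, which is what the single-age assumption predicts; values well above 1 indicate more scatter than the measurement errors alone would explain.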

Contingency table

That was a nice edit of the contingency table page. Thanks. Iss246 (talk) 22:36, 11 January 2010 (UTC)

Negative binomial distribution

Please, before reverting the edit and asserting that I “completely ignored the discussion on the talk page”, make sure you have read that discussion. There was a 1-month old proposal (section Major Changes) to simplify the exposition down to only 1 main definition, and that proposal was met with a (cautious) support.

The main reason why we had a 2-column infobox is because readers were frequently confused when they saw apparent discrepancies between the article and their textbooks. Such confusion can be avoided either by bloating the infobox (and actually there are more than 2 possible definitions), or by including a very noticeable alertbox, warning about potential discrepancies when comparing this info to existing textbooks, which is the way we are dealing with the problem right now.  … stpasha »  18:55, 19 January 2010 (UTC)

See User talk:Stpasha for reply. Melcombe (talk) 10:28, 20 January 2010 (UTC)

Sigma algebra too technical?

Hi Melcombe, please see talk:Sigma-algebra#Too technical?, thanks. Paul August 15:18, 20 January 2010 (UTC)