shinypenny <shinypenny0001@yahoo.com> wrote: I never understood stats like this; I mean, c'mon, who are all these men cheating with?
<sarcasm>
Those awful homewrecking still-single women who want to steal our husbands!
</sarcasm>
If these stats are true,
It would be interesting to actually track down the book and find out either
who they're citing or, in the unlikely event that they did the research
themselves, what their data collection methodology was.
FWIW: while googling to find other statistics, I found an appalling
number of sites claiming all kinds of different numbers ... and using them
to try to get people to subscribe to personals ads for those looking
for an affair. Silly!
shinypenny
11-13-2003, 09:10 AM
trajan@sfchat.org (Marcus Ulpius Traianus) wrote in message news:<d7va81-nup.ln1@mail.sfchat.org>... shinypenny <shinypenny0001@yahoo.com> wrote: I never understood stats like this; I mean, c'mon, who are all these men cheating with? <sarcasm> Those awful homewrecking still-single women who want to steal our husbands! </sarcasm> If these stats are true, It would be interesting to actually track down the book and find out either who they're citing or, in the unlikely event that they did the research themselves, what their data collection methodology was. FWIW: while googling to find other statistics, I found an appalling number of sites claiming all kinds of different numbers ... and using them to try to get people to subscribe to personals ads for those looking for an affair. Silly!
I'm not as knowledgeable about statistics as you and some of the
others here are (even though I did get an A+ in stats in college! Go
figure! And had my very first job as a "market researcher" -- heh heh
-- fancy term for "annoying mall surveyor"!), but I have been finding
it interesting to sift through the US census data lately. It's
interesting to me how the press will pick this stuff up and put their
own spin on it. Then when you go look at the actual stats, you get a
different story or see something even more intriguing to think about.
jen
Doug Anderson
11-13-2003, 04:23 PM
shinypenny0001@yahoo.com (shinypenny) writes:
trajan@sfchat.org (Marcus Ulpius Traianus) wrote in message news:<d7va81-nup.ln1@mail.sfchat.org>... shinypenny <shinypenny0001@yahoo.com> wrote: I never understood stats like this; I mean, c'mon, who are all these men cheating with? <sarcasm> Those awful homewrecking still-single women who want to steal our husbands! </sarcasm> If these stats are true, It would be interesting to actually track down the book and find out either who they're citing or, in the unlikely event that they did the research themselves, what their data collection methodology was. FWIW: while googling to find other statistics, I found an appalling number of sites claiming all kinds of different numbers ... and using them to try to get people to subscribe to personals ads for those looking for an affair. Silly! I'm not as knowledgeable about statistics as you and some of the others here are (even though I did get an A+ in stats in college!
In terms of most casual discussions, one doesn't need to be very
knowledgeable about statistics to be critical.
The fundamental fact is that a statistic is only likely to be
representative if it is calculated based on a _large_, _random_ sample.
Lots of the fallacious stuff around is based on non-random samples.
shinypenny
11-14-2003, 06:04 AM
Doug Anderson <ethelthelog@yahoo.com> wrote in message news:<WfVsb.196805$e01.718051@attbi_s02>... shinypenny0001@yahoo.com (shinypenny) writes:
I'm not as knowledgeable about statistics as you and some of the others here are (even though I did get an A+ in stats in college! In terms of most casual discussions, one doesn't need to be very knowledgeable about statistics to be critical. The fundamental fact is that a statistic is only likely to be representative if it is calculated based on a _large_, _random_ sample.
Huh. I seem to recall that the random sample does not necessarily need
to be *large* as long as it is *representative.* Isn't that why you
can poll only a small percentage of a population, and extrapolate your
findings to the rest of the population? Provided that your small
sample is representative of the total.
I'm sure that's what I was doing when I attacked people in the mall to
ask them questions about things like salad dressing. We were not
required to stop and interview every single person in the mall. We
were, however, required to make sure we had a specific number of
people in each age group of the target market.
Lots of the fallacious stuff around is based on non-random samples.
A non-random salad dressing survey would be one in which I called all
my friends and polled them.
jen
Doug Anderson
11-14-2003, 08:11 AM
shinypenny0001@yahoo.com (shinypenny) writes:
Doug Anderson <ethelthelog@yahoo.com> wrote in message news:<WfVsb.196805$e01.718051@attbi_s02>... shinypenny0001@yahoo.com (shinypenny) writes: I'm not as knowledgeable about statistics as you and some of the others here are (even though I did get an A+ in stats in college! In terms of most casual discussions, one doesn't need to be very knowledgeable about statistics to be critical. The fundamental fact is that a statistic is only likely to be representative if it is calculated based on a _large_, _random_ sample. Huh. I seem to recall that the random sample does not necessarily need to be *large* as long as it is *representative.*
The sample doesn't have to be a large part of your population. But if
it is of size 10, let's say, you aren't going to get much confidence
in your result. So it has to be large in some general sense.
The general problem is that unless you already know the answer you are
seeking, there isn't a good way to _get_ a representative sample
without choosing one that is large and random.
Isn't that why you can poll only a small percentage of a population, and extrapolate your findings to the rest of the population? Provided that your small sample is representative of the total. I'm sure that's what I was doing when I attacked people in the mall to ask them questions about things like salad dressing. We were not required to stop and interview every single person in the mall. We were, however, required to make sure we had a specific number of people in each age group of the target market.
Of course your sample in the mall would not have been representative.
You would have been very unlikely to get people like me, for example,
who work hard to avoid ever going in malls. And who don't talk to
pollsters.
Of course your bosses were probably not interested in getting a good
sample of the population of the world, or even of the country you live
in. Probably they just wanted a good sample of people in the mall.
Lots of the fallacious stuff around is based on non-random samples. A non-random salad dressing survey would be one in which I called all my friends and polled them.
Your mall samples were also quite non-random, but they may have been
random enough for the population your bosses cared about.
shinypenny
11-14-2003, 06:55 PM
Doug Anderson <ethelthelog@yahoo.com> wrote in message news:<297tb.146229$mZ5.999848@attbi_s54>... shinypenny0001@yahoo.com (shinypenny) writes: Doug Anderson <ethelthelog@yahoo.com> wrote in message news:<WfVsb.196805$e01.718051@attbi_s02>... shinypenny0001@yahoo.com (shinypenny) writes: Huh. I seem to recall that the random sample does not necessarily need to be *large* as long as it is *representative.* The sample doesn't have to be a large part of your population. But if it is of size 10, let's say, you aren't going to get much confidence in your result. So it has to be large in some general sense.
I think it is dependent on the size of your target population. I seem
to recall that you need a certain percentage to make it a valid study.
Can't remember what that percentage was - 2 or 4% or something like
that? So 10 may just be fine, if you're doing a study of, say, a
population of 50 people.
The general problem is unless you already know the answer you are seeking, there isn't a good way to _get_ a representative sample without choosing one that is large, and random.
Random, I agree. Large, I debate. It all depends on the population you
are studying. Surely for a DNA sample to test paternity, you'd need
quite a large sample.
Isn't that why you can poll only a small percentage of a population, and extrapolate your findings to the rest of the population? Provided that your small sample is representative of the total. I'm sure that's what I was doing when I attacked people in the mall to ask them questions about things like salad dressing. We were not required to stop and interview every single person in the mall. We were, however, required to make sure we had a specific number of people in each age group of the target market. Of course your sample in the mall would not have been representative. You would have been very unlikely to get people like me, for example, who work hard to avoid ever going in malls. And who don't talk to pollsters.
My sample, as a single pollster, no. But the company I worked for used
various sampling forums, including telephone and mail -- not just
malls. The data was collected from all over the country, not just the
mall I happened to work at.
jen
Doug Anderson
11-14-2003, 07:01 PM
shinypenny0001@yahoo.com (shinypenny) writes:
Doug Anderson <ethelthelog@yahoo.com> wrote in message news:<297tb.146229$mZ5.999848@attbi_s54>... shinypenny0001@yahoo.com (shinypenny) writes: Doug Anderson <ethelthelog@yahoo.com> wrote in message news:<WfVsb.196805$e01.718051@attbi_s02>... shinypenny0001@yahoo.com (shinypenny) writes: Huh. I seem to recall that the random sample does not necessarily need to be *large* as long as it is *representative.* The sample doesn't have to be a large part of your population. But if it is of size 10, let's say, you aren't going to get much confidence in your result. So it has to be large in some general sense. I think it is dependent on the size of your target population. I seem to recall that you need a certain percentage to make it a valid study.
No, that is incorrect.
Can't remember what that percentage was - 2 or 4% or something like that? So 10 may just be fine, if you're doing a study of, say, a population of 50 people.
Generally speaking a random sample of 10 from 50 is not better than a
random sample of 10 from 5 billion, interestingly. Of course a sample
of 50 from 50 is much better than a sample of 50 from 5 billion!
The general problem is unless you already know the answer you are seeking, there isn't a good way to _get_ a representative sample without choosing one that is large, and random. Random, I agree. Large, I debate.
It really isn't a matter of debate. Get any statistics textbook. Or
take a stat class at a community college near you. It's an
interesting subject.
It all depends on the population you are studying. Surely for a DNA sample to test paternity, you'd need quite a large sample.
DNA tests are not statistical. To test for paternity, you need the
child and the possible father.
Isn't that why you can poll only a small percentage of a population, and extrapolate your findings to the rest of the population? Provided that your small sample is representative of the total. I'm sure that's what I was doing when I attacked people in the mall to ask them questions about things like salad dressing. We were not required to stop and interview every single person in the mall. We were, however, required to make sure we had a specific number of people in each age group of the target market. Of course your sample in the mall would not have been representative. You would have been very unlikely to get people like me, for example, who work hard to avoid ever going in malls. And who don't talk to pollsters. My sample, as a single pollster, no. But the company I worked for used various sampling forums, including telephone and mail -- not just malls. The data was collected from all over the country, not just the mall I happened to work at.
So their samples _may_ have been representative of the subgroup of
people who go to malls and are willing to talk to pollsters together
with those who fill out polls received by mail and those who talk to
pollsters on the phone. (But even this, only if these individuals
were chosen by a decent random process.)
Still missed me in their population.
Doug
Ellie
11-14-2003, 07:17 PM
Doug Anderson wrote:
This is turning into a cool subject! I know nothing about statistics and find some of the following very
interesting. Could you please elaborate a bit more Doug?
Generally speaking a random sample of 10 from 50 is not better than a random sample of 10 from 5 billion, interestingly.
How so? What do you mean by it's not "better"? If you mean a random sample for something to be extrapolated
to the whole population I can see. But if we are studying something that applies to a particular
population, of which those 50 are members, then a sample of 10 of those 50 would be more representative than
10 from 5 billion, no?
It really isn't a matter of debate. Get any statistics text book. Or take a stat class at a community college near you. It's an interesting subject.
I am being tempted to do it!
DNA tests are not statistical. To test for paternity, you need the child and the possible father.
Yes, and I would assume a "sample" would include a child and a possible father.
So their samples _may_ have been representative of the subgroup of people who go to malls and are willing to talk to pollsters together with those who fill out polls received by mail and those who talk to pollsters on the phone. (But even this, only if these individuals were chosen by a decent random process.) Still missed me in their population.
But then ANY sampling would miss you (and me!). Does that mean no data that is based on polling people
randomly is a good representative, because some people will always be excluded?
Doug Anderson
11-14-2003, 08:03 PM
Ellie <ellie_first@hotmail.com> writes:
Doug Anderson wrote: This is turning into a cool subject! I know nothing about statistics and find some of the following very interesting. Could you please elaborate a bit more Doug? Generally speaking a random sample of 10 from 50 is not better than a random sample of 10 from 5 billion, interestingly. How so? What do you mean by it's not "better"? If you mean a random sample for something to be extrapolated to the whole population I can see. But if we are studying something that applies to a particular population, of which those 50 are members, then a sample of 10 of those 50 would be more representative than 10 from 5 billion, no?
So I mean statistically. Suppose you have a group of 50 people and
30% are mistaken about the identity of their fathers.
Now suppose you have a group of 5 billion people and 30% of them are
mistaken about the identity of their fathers.
Now suppose you pick a random sample of 10 from the big group. The
chance that exactly 3 are mistaken about their fathers is about 27%
(if I did the arithmetic correctly, which I don't guarantee).
If you pick a random sample of 10 from the small group, the chance
that exactly 3 are mistaken about their fathers is also close to 27%
(it isn't actually exactly the same in this example, but it is quite
close).
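For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It assumes the big group can be treated as effectively infinite (so the binomial formula applies) and that the small group of 50 contains exactly 15 mistaken people (30%), sampled without replacement (the hypergeometric formula). The function names are made up for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Chance of exactly k hits in n independent draws, each with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def hypergeometric_pmf(k, n, population, hits):
    """Chance of exactly k hits when drawing n without replacement
    from a finite population that contains `hits` hits."""
    return comb(hits, k) * comb(population - hits, n - k) / comb(population, n)

# Big group: 5 billion people, treated here as effectively infinite.
big_group = binomial_pmf(3, 10, 0.30)

# Small group: 50 people, of whom 15 (30%) are mistaken.
small_group = hypergeometric_pmf(3, 10, 50, 15)

print(f"big group:   {big_group:.3f}")    # about 0.267
print(f"small group: {small_group:.3f}")  # about 0.298
```

Under these assumptions the two chances come out at roughly 27% and 30%: not identical, but in the same neighborhood, even though one population is a hundred million times larger than the other.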
It really isn't a matter of debate. Get any statistics text book. Or take a stat class at a community college near you. It's an interesting subject. I am being tempted to do it! DNA tests are not statistical. To test for paternity, you need the child and the possible father. Yes, and I would assume a "sample" would include a child and a possible father. So their samples _may_ have been representative of the subgroup of people who go to malls and are willing to talk to pollsters together with those who fill out polls received by mail and those who talk to pollsters on the phone. (But even this, only if these individuals were chosen by a decent random process.) Still missed me in their population. But then ANY sampling would miss you (and me!).
Maybe not if they offered to pay me, but quite possibly even then.
Does that mean no data that is based on polling people randomly is a good representative, because some people will always be excluded?
Yep, that's exactly what it means. You catch on fast!
The really difficult problem in applying statistics is that if you
wish to infer something about a large group by studying a sample, you
either need a random sample (this is typically quite hard to
generate), or an argument that even though your sample _isn't_ random,
it is still representative of the general population. For example, if
you were trying to study the prevalence of color blindness, you might
argue that since it doesn't affect people very much, it should be
sufficient to just sample, say, visitors to a mall. They won't be
typical of your population, but they could well be typical in terms of
how many of them are colorblind. (But they could well be atypical in
that respect too. You have to know something about how colorblindness
is distributed and how it affects people to be able to believe this
would be a representative sample.)
Doing this second thing (showing that a sample you know _isn't_ random
is still representative) is a path through a minefield. Especially if
you are trying to explore a question to which you don't know the
answer. (Of course many people already _have_ the answer and are just
looking for evidence. See the neighboring thread on pseudo-science!)
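A small simulation can make the minefield concrete. This is only a sketch with invented numbers: it assumes a population in which 40% of people are mall-goers and the trait being measured happens to be more common among mall-goers (50%) than among everyone else (20%). None of these figures come from real data.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population of (mall_goer, has_trait) pairs.
population = []
for _ in range(100_000):
    mall_goer = rng.random() < 0.40
    has_trait = rng.random() < (0.50 if mall_goer else 0.20)
    population.append((mall_goer, has_trait))

def trait_rate(people):
    """Fraction of the given people who have the trait."""
    return sum(has_trait for _, has_trait in people) / len(people)

true_rate = trait_rate(population)

# A random sample of 2000 drawn from the whole population.
random_estimate = trait_rate(rng.sample(population, 2000))

# A sample of 2000 drawn only from mall-goers (random within the mall).
mall_goers = [person for person in population if person[0]]
mall_estimate = trait_rate(rng.sample(mall_goers, 2000))

print(f"true rate:       {true_rate:.3f}")
print(f"random sample:   {random_estimate:.3f}")
print(f"mall-only sample: {mall_estimate:.3f}")
```

With these made-up numbers the random sample lands near the true rate (about 32%), while the mall-only sample lands near 50% no matter how many people are polled. If the trait were unrelated to mall-going, as in the colorblindness argument, the two estimates would agree; the catch is that you have to know that in advance.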
Marcus Ulpius Traianus
11-14-2003, 09:12 PM
shinypenny <shinypenny0001@yahoo.com> wrote: I think it is dependent on the size of your target population. I seem to recall that you need a certain percentage to make it a valid study.
For sufficiently large populations, the sampling error is entirely a matter
of the sample size and not a matter of the population size. For smaller
populations, the law of large numbers doesn't apply to begin with, and you
want as large a percentage as possible.
http://www.statcan.ca/english/edu/power/ch6/sampling/sampling.htm
Can't remember what that percentage was - 2 or 4% or something like that? So 10 may just be fine, if you're doing a study of, say, a population of 50 people.
A population of 50 is never going to be studiable using the same statistical
techniques used to study a population of thousands; "10 out of 50" sampling
error is huge as compared to "100/500/1000 out of several thousand."
Random, I agree. Large, I debate. It all depends on the population you are studying. Surely for a DNA sample to test paternity, you'd need quite a large sample.
Not really; in smaller populations, you just can't depend on a lot of the
assumptions required.
Ralph DuBose
11-14-2003, 11:07 PM
Doug Anderson <ethelthelog@yahoo.com> wrote in message news:<bGgtb.150581$mZ5.1027125@attbi_s54>... shinypenny0001@yahoo.com (shinypenny) writes: Doug Anderson <ethelthelog@yahoo.com> wrote in message news:<297tb.146229$mZ5.999848@attbi_s54>... shinypenny0001@yahoo.com (shinypenny) writes: Doug Anderson <ethelthelog@yahoo.com> wrote in message news:<WfVsb.196805$e01.718051@attbi_s02>... shinypenny0001@yahoo.com (shinypenny) writes: Huh. I seem to recall that the random sample does not necessarily need to be *large* as long as it is *representative.* The sample doesn't have to be a large part of your population. But if it is of size 10, let's say, you aren't going to get much confidence in your result. So it has to be large in some general sense. I think it is dependent on the size of your target population. I seem to recall that you need a certain percentage to make it a valid study. No, that is incorrect. Can't remember what that percentage was - 2 or 4% or something like that? So 10 may just be fine, if you're doing a study of, say, a population of 50 people. Generally speaking a random sample of 10 from 50 is not better than a random sample of 10 from 5 billion, interestingly. Of course a sample of 50 from 50 is much better than a sample of 50 from 5 billion! The general problem is unless you already know the answer you are seeking, there isn't a good way to _get_ a representative sample without choosing one that is large, and random. Random, I agree. Large, I debate. It really isn't a matter of debate. Get any statistics text book. Or take a stat class at a community college near you. It's an interesting subject. It all depends on the population you are studying. Surely for a DNA sample to test paternity, you'd need quite a large sample. DNA tests are not statistical. To test for paternity, you need the child and the possible father.
The beauty of DNA testing is that it provides an end-run around the
bull**** that froths up whenever "surveys" are done.
In a sane society, child support should never be assigned until
after DNA confirmation of paternity. When this eventuates, then we can
talk about the real rate of surprises because there will be no
"sampling", only comprehensive knowledge.
You guys who doubt critically high rates of "surprises" ought to be
a little more humble. Because it is very hard to find examples of
comfortably low rates of surprises in modern, Western, societies. You
guys are the ones making big assumptions unsupported by data.
Doug Anderson
11-14-2003, 11:14 PM
rdubose@pdq.net (Ralph DuBose) writes:
You guys who doubt critically high rates of "surprises" ought to be a little more humble. Because it is very hard to find examples of comfortably low rates of surprises in modern, Western, societies. You guys are the ones making big assumptions unsupported by data.
I doubt unsupported statistics of 30% "surprises."
I would also doubt unsupported statistics of 99% "no surprises."
You've now reduced your argument to "you have no evidence either, so
you should believe me." Actually, the right approach is to believe
_nothing_ statistical until you have convincing evidence.
Randy Poe
11-15-2003, 03:39 AM
On Sat, 15 Nov 2003 04:03:09 GMT, Doug Anderson
<ethelthelog@yahoo.com> wrote: So I mean statistically. Suppose you have a group of 50 people and 30% are mistaken about the identity of their fathers. Now suppose you have a group of 5 billion people and 30% of them are mistaken about the identity of their fathers. Now suppose you pick a random sample of 10 from the big group. The chance that exactly 3 are mistaken about their fathers is about 27% (if I did the arithmetic correctly, which I don't guarantee). If you pick a random sample of 10 from the small group, the chance that exactly 3 are mistaken about their fathers is also close to 27% (it isn't actually exactly the same in this example, but it is quite close).
Nicely done. I was struggling with how to explain "standard error of
the mean" in this context.
I'll just add this: often in newspaper polls, you'll see a little
footnote that "these numbers have an error of X percentage points
either way". That's an estimate based on the size of the study group,
and has no relation to the size of the population (which is assumed to
be much larger).
If the sample is a significant fraction of the population, then a lot
of the basic assumptions of random sampling break down.
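To put rough numbers on that footnote, here is a minimal sketch in Python. It assumes the usual textbook recipe: a 95% confidence level (z = 1.96), a worst-case proportion of 0.5, and a truly random sample. The optional `population` argument applies the standard finite population correction for when the sample is a sizable fraction of the population. The function name is made up for illustration.

```python
from math import sqrt

def margin_of_error(n, p=0.5, z=1.96, population=None):
    """Approximate margin of error for a proportion estimated from a
    simple random sample of size n. If `population` is given, apply
    the finite population correction."""
    standard_error = sqrt(p * (1 - p) / n)
    if population is not None:
        standard_error *= sqrt((population - n) / (population - 1))
    return z * standard_error

# A typical newspaper poll of 1000 people, population size ignored.
moe_1000 = margin_of_error(1000)

# The same poll with a population of 5 billion: the correction is negligible.
moe_1000_big = margin_of_error(1000, population=5_000_000_000)

# 10 people out of 50: the correction helps a little, but the error is still huge.
moe_10_of_50 = margin_of_error(10, population=50)

print(f"{moe_1000:.3f} {moe_1000_big:.3f} {moe_10_of_50:.3f}")
```

Under these assumptions a poll of 1000 gives about 3 percentage points either way whether the population is a million or 5 billion, while 10 out of 50 gives roughly 28 points either way. The normal approximation behind this formula is itself shaky for a sample as small as 10, so treat that last figure as an illustration only.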
- Randy
Doug Anderson
11-15-2003, 08:13 AM
Randy Poe <rpoePA@removethis.yahoo.com> writes:
On Sat, 15 Nov 2003 04:03:09 GMT, Doug Anderson <ethelthelog@yahoo.com> wrote: So I mean statistically. Suppose you have a group of 50 people and 30% are mistaken about the identity of their fathers. Now suppose you have a group of 5 billion people and 30% of them are mistaken about the identity of their fathers. Now suppose you pick a random sample of 10 from the big group. The chance that exactly 3 are mistaken about their fathers is about 27% (if I did the arithmetic correctly, which I don't guarantee). If you pick a random sample of 10 from the small group, the chance that exactly 3 are mistaken about their fathers is also close to 27% (it isn't actually exactly the same in this example, but it is quite close). Nicely done. I was struggling with how to explain "standard error of the mean" in this context. I'll just add this: often in newspaper polls, you'll see a little footnote that "these numbers have an error of X percentage points either way". That's an estimate based on the size of the study group, and has no relation to the size of the population (which is assumed to be much larger).
It also is an estimate that assumes that their sample was truly random.
Unfortunately, the sample rarely is truly random since you have major
non-response problems with surveys.
If the sample is a significant fraction of the population, then a lot of the basic assumptions of random sampling break down.
Yeah, that's true. And with my silly numbers above, that is starting
to happen.
Of course if you are able to sample a significant fraction of your
population, you should just go ahead and sample the entire population
and then get information that you _know_ instead of just an estimate!
Ellie
11-16-2003, 11:23 AM
Doug Anderson wrote:
So I mean statistically. Suppose you have a group of 50 people and 30% are mistaken about the identity of their fathers. Now suppose you have a group of 5 billion people and 30% of them are mistaken about the identity of their fathers. Now suppose you pick a random sample of 10 from the big group. The chance that exactly 3 are mistaken about their fathers is about 27% (if I did the arithmetic correctly, which I don't guarantee). If you pick a random sample of 10 from the small group, the chance that exactly 3 are mistaken about their fathers is also close to 27% (it isn't actually exactly the same in this example, but it is quite close).
[snip]
Thanks for the explanation. I understand it now!