Wednesday, October 24, 2012

Sampling Method -Random Vs Nonrandom Sampling-

Sampling Methods

1) Probability Sampling Method

2) Non-Probability Sampling Method




Probability Sampling Method


1) A simple random sample

A simple random sample is obtained by choosing elementary units in such a way that each unit in the population has an equal chance of being selected. A simple random sample is free from sampling bias. However, using a random number table to choose the elementary units can be cumbersome. If the sample is to be collected by a person untrained in statistics, then instructions may be misinterpreted and selections may be made improperly. Instead of using a list of random numbers, data collection can be simplified by selecting, say, every 10th or 100th unit after the first unit has been chosen randomly. Such a procedure is called systematic random sampling. The larger a random sample is, the more likely it is to represent the population.




How to use a random number table.
  1. Let's assume that we have a population of 185 students and each student has been assigned a number from 1 to 185. Suppose we wish to sample 5 students (although we would normally sample more, we will use 5 for this example).
  2. Since we have a population of 185 and 185 is a three digit number, we need to use the first three digits of the numbers listed on the chart.
  3. We close our eyes and randomly point to a spot on the chart. For this example, we will assume that we selected 20631 in the first column.
  4. We interpret that number as 206 (first three digits). Since we don't have a member of our population with that number, we go to the next number, 899 (89990 on the chart). Once again we don't have someone with that number, so we continue at the top of the next column. As we work down the column, we find that the first number to match our population is 100 (actually 10005 on the chart). Student number 100 would be in our sample. Continuing down the chart, we see that the other four subjects in our sample would be students 049, 082, 153, and 005.
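
The steps above can be sketched in Python. This is only an illustration, not the chart itself: the entries 20631, 89990 and 10005 come from the example, while the other table entries are made up here so that the walk ends with the same five students. The function name `sample_from_table` is also just a name chosen for this sketch.

```python
def sample_from_table(table, population_size, sample_size):
    """Walk a random number table, reading only the leading digits of each entry."""
    width = len(str(population_size))  # 185 has three digits, so read three digits
    chosen = []
    for entry in table:
        number = int(entry[:width])
        # Skip numbers that match nobody in the population, and skip repeats.
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen


# Hypothetical chart entries. Only 20631, 89990 and 10005 appear in the text above.
table = ["20631", "89990", "10005", "04921", "77310", "08254", "15390", "00517"]

students = sample_from_table(table, population_size=185, sample_size=5)
print(students)  # [100, 49, 82, 153, 5]
```
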
Microsoft Excel has a function to produce random numbers.
The function is simply:
=RAND()
Type that into a cell and it will produce a random number in that cell. Copy the formula throughout a selection of cells and it will produce random numbers between 0 and 1.
If you would like to modify the formula, you can obtain whatever range you wish. For example, if you wanted random whole numbers from 1 to 250, you could enter the following formula:
=INT(250*RAND())+1
The INT eliminates the digits after the decimal, the 250* creates the range to be covered, and the +1 sets the lowest number in the range. 
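
For readers who do not use Excel, the same idea can be sketched in Python with the standard `random` module. The helper name `random_int_excel_style` is made up for this example. Note that separate draws can repeat, so when you need distinct units `random.sample` is usually the safer choice.

```python
import random


def random_int_excel_style(upper):
    """Mirror =INT(upper*RAND())+1: a whole number from 1 to upper."""
    # random.random() is in [0, 1), so int(upper * r) is in 0..upper-1.
    return int(upper * random.random()) + 1


draws = [random_int_excel_style(250) for _ in range(1000)]
print(min(draws), max(draws))  # both values stay inside 1..250

# Five distinct numbers from 1 to 250 (no repeats).
unique_ids = random.sample(range(1, 251), 5)
print(unique_ids)
```
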


2) Stratified Sample


A stratified sample is obtained by independently selecting a separate simple random sample from each population stratum. A population can be divided into groups based on some characteristic or variable, such as income or education. For example, anybody with fewer than 10 years of education will be in group A, between 10 and 20 years in group B, and between 20 and 30 years in group C. These groups are referred to as strata. You can then randomly select a given number of units from each stratum, which may be based on proportion: if group A has 100 persons, group B has 50, and group C has 30, you may decide to take 10% of each. So you end up with 10 from group A, 5 from group B, and 3 from group C.
The advantage of this sampling method is that it increases the likelihood of representativeness, especially if one's sample is not very large.
The disadvantage is that it requires more effort on the part of the researcher. 
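
Here is a small Python sketch of the proportional example above (100, 50 and 30 persons, taking 10% of each). The group members are placeholder labels and `stratified_sample` is a name invented for this illustration.

```python
import random


def stratified_sample(strata, fraction, seed=None):
    """Draw a separate simple random sample from each stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        k = round(len(units) * fraction)  # proportional allocation
        sample[name] = rng.sample(units, k)
    return sample


# Placeholder population: 100 persons in A, 50 in B, 30 in C.
strata = {
    "A": [f"A{i}" for i in range(100)],
    "B": [f"B{i}" for i in range(50)],
    "C": [f"C{i}" for i in range(30)],
}

picked = stratified_sample(strata, 0.10, seed=1)
sizes = {name: len(units) for name, units in picked.items()}
print(sizes)  # {'A': 10, 'B': 5, 'C': 3}
```
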


3) Cluster Sample


A cluster sample is obtained by selecting clusters from the population on the basis of simple random sampling. The sample comprises a census of each random cluster selected. For example, a cluster may be something like a village, a school, or a state. So you decide all the elementary schools in New Delhi are clusters. You want 20 schools selected. You can use simple or systematic random sampling to select the schools, and then every school selected becomes a cluster. This method is similar to simple random sampling except that groups rather than individuals are randomly selected.
The advantages are that it can be used when it is difficult or impossible to select a random sample of individuals, and that it is frequently less time-consuming.
The disadvantage is that there is a far greater chance of selecting a sample that is not representative of the population. 
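
A rough Python sketch of the school example: pick 20 schools at random, then take every pupil in each chosen school. The number of schools (100) and pupils per school (30) are assumptions made only for this illustration, and `cluster_sample` is a made-up name.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters; every unit in a chosen cluster is kept."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}


# Assumed data: 100 schools with 30 pupils each.
schools = {
    f"school_{i:03d}": [f"pupil_{i:03d}_{j}" for j in range(30)]
    for i in range(100)
}

selected = cluster_sample(schools, 20, seed=7)
total_pupils = sum(len(pupils) for pupils in selected.values())
print(len(selected), total_pupils)  # 20 600
```
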

4) Systematic Sampling

Every nth individual in the population list is selected for inclusion in the sample. For example, in a population list of 5,000 names, to select a sample of 500, a researcher would select every tenth name on the list until reaching a total of 500 names.
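
The 5,000-name example can be sketched like this in Python. The random starting point follows the description of systematic random sampling given earlier in this post; the list of names is a placeholder and `systematic_sample` is a name chosen for this sketch.

```python
import random


def systematic_sample(units, sample_size, seed=None):
    """Take every k-th unit after a randomly chosen starting position."""
    rng = random.Random(seed)
    step = len(units) // sample_size   # 5000 // 500 = 10
    start = rng.randrange(step)        # random start among the first `step` units
    return units[start::step][:sample_size]


names = [f"name_{i}" for i in range(5000)]  # placeholder population list
sample = systematic_sample(names, 500, seed=3)
print(len(sample))  # 500
```
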


Non-Probability Sampling Method

1) Convenience Sampling


In convenience sampling, the researcher questions anyone who is available. This method is quick and cheap. However, we do not know how representative the sample is or how reliable the results are.

2) Purposive Sampling


Purposive sampling is different from convenience sampling in that researchers do not simply study whoever is available but rather use their judgement to select a sample that they believe, based on prior information, will provide the data they need. 
The major disadvantage is that the researcher's judgement may be in error: the researcher may be wrong about how representative the sample is, or about how much the selected individuals actually know about the information needed.

Tuesday, October 16, 2012

Populations and Samples


The study of statistics revolves around the study of data sets. This lesson describes two important types of data sets - populations and samples. Along the way, we introduce simple random sampling, the main method used in this tutorial to select samples.

Populations versus Samples


The main difference between populations and samples has to do with how observations are assigned to the data set.

  • A population includes each element from the set of observations that can be made.
  • A sample consists only of observations drawn from the population.

Depending on the sampling method, a sample can have fewer observations than the population, the same number of observations, or more observations. More than one sample can be derived from the same population.

Other differences have to do with nomenclature, notation, and computations. For example,

  • A measurable characteristic of a population, such as a mean or standard deviation, is called a parameter; but a measurable characteristic of a sample is called a statistic.
  • We will see in future lessons that the mean of a population is denoted by the symbol μ; but the mean of a sample is denoted by the symbol x̄ (x-bar).
  • We will also learn in future lessons that the formula for the standard deviation of a population is different from the formula for the standard deviation of a sample.
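
To preview that difference: the usual population formula divides the sum of squared deviations by N, while the usual sample formula divides by n - 1. Python's standard `statistics` module has both, as this small sketch with made-up scores shows.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data for illustration

mean = statistics.mean(scores)

# Treat the data as the whole population: divide by N.
pop_sd = statistics.pstdev(scores)

# Treat the data as a sample from a larger population: divide by n - 1.
samp_sd = statistics.stdev(scores)

print(mean, pop_sd, round(samp_sd, 3))  # the sample value is the larger one
```
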

Simple Random Sampling


A sampling method is a procedure for selecting sample elements from a population. Simple random sampling refers to a sampling method that has the following properties.

  • The population consists of N objects.
  • The sample consists of n objects.
  • All possible samples of n objects are equally likely to occur.

An important benefit of simple random sampling is that it allows researchers to use statistical methods to analyze sample results. For example, given a simple random sample, researchers can use statistical methods to define a confidence interval around a sample mean. Statistical analysis is not appropriate when non-random sampling methods are used.

There are many ways to obtain a simple random sample. One way would be the lottery method. Each of the N population members is assigned a unique number. The numbers are placed in a bowl and thoroughly mixed. Then, a blind-folded researcher selects n numbers. Population members having the selected numbers are included in the sample.
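
A tiny Python sketch of the lottery idea, using a made-up population of five members. It lists every possible sample of size 2 (there are 10, and under simple random sampling each is equally likely) and then draws one of them with `random.sample`, which plays the role of the bowl.

```python
import math
import random
from itertools import combinations

population = ["A", "B", "C", "D", "E"]  # N = 5, made-up members
n = 2

# Every possible sample of n members; each has probability 1 / len(all_samples).
all_samples = list(combinations(population, n))
print(len(all_samples), math.comb(len(population), n))  # 10 10

# The "bowl": draw n distinct members at random.
drawn = random.sample(population, n)
print(drawn)
```
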

Random Number Generator

In practice, the lottery method described above can be cumbersome, particularly with large sample sizes. As an alternative, use Stat Trek's Random Number Generator. With the Random Number Generator, you can select n random numbers quickly and easily. This tool is provided at no cost. It can be found under the Stat Tools tab, which appears in the header of every Stat Trek web page.


Sampling With Replacement and Without Replacement


Suppose we use the lottery method described above to select a simple random sample. After we pick a number from the bowl, we can put the number aside or we can put it back into the bowl. If we put the number back in the bowl, it may be selected more than once; if we put it aside, it can be selected only one time.

When a population element can be selected more than one time, we are sampling with replacement. When a population element can be selected only one time, we are sampling without replacement.
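
In Python, the two cases map onto two functions in the standard `random` module: `sample` never repeats an element, while `choices` puts each pick back in the bowl. The bowl of ten numbers below is just an illustration. It also shows how a sample taken with replacement can have more observations than the population.

```python
import random

rng = random.Random(42)
bowl = list(range(1, 11))  # ten numbered slips

# Without replacement: each slip is set aside, so no number repeats.
without_replacement = rng.sample(bowl, 5)

# With replacement: each slip goes back, so numbers may repeat.
# Drawing 15 times from 10 slips guarantees at least one repeat.
with_replacement = rng.choices(bowl, k=15)

print(without_replacement)
print(with_replacement)
```
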

Population and Sample by Using Statistics

Population and Sample 

  • The major use of inferential statistics is to use information from a sample to infer something about a population. A population is a collection of data whose properties are analyzed. 
  • The population is the complete collection to be studied; it contains all subjects of interest. 
  • A sample is a part of the population of interest, a sub-collection selected from a population. 
  • A parameter is a numerical measurement that describes a characteristic of a population, while a statistic is a numerical measurement that describes a characteristic of a sample. 
  • In general, we will use a statistic to infer something about a parameter. 
  • Ex. Joe D. Politician is running for President. He calls you on the phone and asks you to find out what percentage of the registered voters in the country will vote for him. 
  • There are a few things you could try. 
  • Option I : Call all registered voters on the phone and ask them who they will vote for. Although this would provide a very accurate result, it would be a very tedious and time consuming project. All registered voters represent the population of interest here, and a better approach would be to use a sample. 
  • Option II : Call 4 registered voters, 1 in each time zone, and ask them who they will vote for. Although this is a very easy task, the results would not be very reliable. To use a sample to make inferences about a population, the sample should be representative of the population. How likely is it that these 4 registered voters would represent the population of all registered voters? Not very! The sample needs to look just like the population, but smaller. 
  • Option III : Somewhere between Option I and Option II. 
  • We want to use a method that will be easier than Option I, but more reliable than Option II. 
  • So, you randomly select 2000 registered voters and poll them. 1,120 (56%) tell you that they will vote for Joe. 
  • The population of interest here is all registered voters, and the parameter is the percentage of them that will vote for Joe. 
  • The sample is the 2000 registered voters that were polled, and the statistic is the percentage of them that will vote for Joe. 
  • You can tell Joe that approximately 56% of all registered voters will vote for him. 
  • Ex. In a Statistics class of 40 students, 24 had a credit card with them. 
  • The statement "60% of the students in this Statistics class had a credit card with them" is a descriptive statement. 
  • The population is the 40 students in this Statistics class. 
  • The 60% represents a parameter. 
  • The statement "60% of the students in all classes have a credit card with them" is an inferential statement. 
  • The 40 students in this Statistics class represent a sample of students in all classes. The 60% represents a statistic.
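
The two statistics in these examples are simple proportions, and the word "approximately" in Joe's case can be made a little more concrete. The sketch below computes both proportions and then a rough 95% interval for Joe's poll using the common normal approximation. That interval is an addition for illustration, not part of the original example, and it assumes the 2000 voters really were a simple random sample. The function names are made up.

```python
import math


def sample_proportion(successes, n):
    """The statistic: share of the sample with the characteristic."""
    return successes / n


def approx_95_interval(p_hat, n):
    """Rough 95% interval via the normal approximation (an assumption here)."""
    margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


p_joe = sample_proportion(1120, 2000)   # 0.56
low, high = approx_95_interval(p_joe, 2000)
print(f"Joe: {p_joe:.0%}, roughly {low:.1%} to {high:.1%}")

p_class = sample_proportion(24, 40)     # 0.6
print(f"Credit cards: {p_class:.0%}")
```
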

Sunday, October 7, 2012

Samples and Populations

What is a Sample? 


Most people, we think, base their conclusions about a group of people (students, actors, football players and so on) on the experiences they have with a fairly small number or sample of individual members. One of the most important steps in the research process is the selection of the sample of individuals who will participate (be observed or questioned). Sampling refers to the process of selecting these individuals. 

Samples and Populations 

A sample in a research study is the group on which information is obtained. The larger group to which one hopes to apply the results is called the population. 

Example:
All 700 students at State University who are majoring in mathematics constitute a population; 50 of those students constitute a sample. Students who own automobiles make up another population, as do students who live in the campus dormitories. Notice that a group may be both a sample in one context and a population in another context. All State University students who own automobiles constitute a sample of all automobile owners at state universities across the United States. 

When it is possible, researchers would prefer to study the entire population of interest. Usually, however, this is difficult to do, because most populations of interest are large, diverse, and scattered over a large geographic area. 

Defining the Population 

The first task in selecting a sample is to define the population of interest. In what group, exactly, is the researcher interested? To whom do the results of the study apply? Below are some examples of populations. 

  • All high school principals in the United States
  • All students attending Central High School in Omaha, during the academic year 2005-2006
  • All students in Ms. Brown's third grade class at Wharton Elementary School
The above examples show that a population can be any size and that it will have at least one characteristic (and sometimes several) that sets it off from any other population.