
# Introduction to statistics


Statistical techniques are employed in almost every phase of human life. Statistics no longer consists merely of the collection of information and its presentation in charts and tables; it is now considered to encompass the science of basing inferences on observed data and the entire problem of making decisions in the face of uncertainty. Surveys are designed to collect early returns on election day to forecast the outcome of an election, and consumers are sampled to provide information for predicting product preference.

The research physician conducts experiments to determine the effect of various drugs and controlled environmental conditions on humans to infer the appropriate method of treatment of a particular disease. Newly manufactured products are sampled and checked before going to market.

The economist observes various indices of economic condition over time and uses the information to forecast the condition of the economy next fall. A teacher compares the abilities of students to determine the quality of teaching methods, and so forth. Statistical techniques play an important role in achieving the objective of each of these practical problems. This book is devoted to the methods and techniques that help researchers find the answers to these and other related practical problems.

### Meaning and Definition of Statistics

The word statistics has different meanings for different people. When most people hear the word, they think of tables of figures giving births, deaths, marriages, road accidents, and so on. This is indeed a vital and correct use of the term. In fact, the word statistics was first applied to these affairs of state, to the data that a government finds necessary for effective planning, ruling, and tax collecting.

Today, of course, the term statistics is applied to nearly any kind of information given in terms of numbers. For example, newspapers publish articles about beauty contests giving the statistics of the contestants, and radio and television announcers tell us they will "announce the statistics of the game in a few minutes," and so forth.

The term statistics, however, has other meanings, and people who have not studied the subject are relatively unfamiliar with them. Statistics is a body of knowledge in the area of applied mathematics, with its own symbolism, terminology, content, theorems, and techniques. When people study the subject, they usually attempt to master some of these techniques.

The term statistics has yet another meaning for those who have been initiated into the mysteries of the subject. In this sense, statistics are the quantities that have been calculated from sample data.

### Definition of Statistics

There are hundreds of definitions of statistics that could be considered. Only a few definitions that help the reader understand the meaning and scope of inferential statistics are presented here.

Webster's New Collegiate Dictionary defines statistics as a branch of mathematics dealing with the collection, presentation, analysis, and interpretation of masses of numerical data.

In the words of Kendall and Stuart, "statistics is the branch of the scientific method which deals with the data obtained by counting or measuring the properties of populations."

Fraser, commenting on experimentation and statistical application, states that statistics is concerned with the methods for drawing conclusions from the results of experiments or processes.

Freund, among others, views statistics as the entire science of decision-making in the face of uncertainty. Mood defines statistics as the technology of the scientific method, concerned with the design of experiments and investigations and with statistical inference.

Simpson and Kafka view statistics as the tool of all scientific research. A superficial examination of these definitions suggests a bewildering lack of agreement, but all possess common elements. Each implies a collection of data, with inference as its objective.

Each requires the selection of a subset (sample) of a large collection of data (population), either existent or conceptual, to infer the characteristics of the complete set. Thus statistics is a theory of information with inference making as its objective.

### Theory of Statistics

The theory of inferential statistics is a theory of information. It is concerned with quantifying information, with designing experiments or data-collection procedures that minimize the cost of a specified quantity of information, and with using this information to make inferences about the population. Inference about an unknown population is a two-step procedure. First, we seek the best inferential procedure for the given situation; second, we desire a measure of its goodness. For example, every estimate of a population characteristic based on the information contained in a sample should have associated with it a probabilistic bound on the error of estimation.
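The idea of pairing an estimate with a probabilistic bound on its error can be illustrated with a minimal Python sketch. It is not a method prescribed by the text: the simulated population, the function name `mean_with_error_bound`, and the use of the normal-approximation multiplier 1.96 (roughly 95% confidence for reasonably large samples) are all assumptions made for illustration.

```python
import math
import random
import statistics


def mean_with_error_bound(sample, z=1.96):
    """Return the sample mean and an approximate bound on the error of estimation.

    The bound is z times the estimated standard error of the mean.
    z = 1.96 is an assumed normal-approximation multiplier (about 95%).
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    std_error = statistics.stdev(sample) / math.sqrt(n)
    return mean, z * std_error


# Hypothetical population, simulated only for this illustration.
random.seed(0)
population = [random.gauss(50, 10) for _ in range(10_000)]

# Step 1: apply an inferential procedure (estimate the mean from a random sample).
sample = random.sample(population, 100)
estimate, bound = mean_with_error_bound(sample)

# Step 2: report a measure of its goodness (the bound on the error).
print(f"estimate = {estimate:.2f}, bound on error = {bound:.2f}")
```

Under these assumptions, the population mean is expected to lie within `estimate ± bound` for about 95 out of 100 random samples drawn this way.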

### Descriptive and Inferential Statistics

Descriptive statistics comprises those methods concerned with collecting and describing a set of data so as to yield meaningful information. These methods are only one of the tools researchers use to analyze data. Researchers often also wish to make inferences about a population based on data obtained from a sample, and inferential statistics allow them to do this.

Statistical inference is the use of samples to reach conclusions about the population from which those samples have been drawn. Inferential statistics refers to the procedures that allow researchers to make inferences about a population based on the findings from a random sample.

The main objective of inferential statistics is to make an inference about a population based on information contained in a sample and to provide an associated measure of goodness for the inference. A necessary prerequisite for making inferences about a population is the ability to describe a set of numbers. Descriptive statistics provide the numerical information used to make inferences about a population. Taking a representative sample from the population, organizing and analyzing the information obtained from the sample, and drawing conclusions for the whole population are the phases of the inferential technique. The whole process of inferential statistics can be shown schematically as:

Population → representative sample → organization and analysis of the sample data → conclusions about the whole population

### Techniques of Inferential Statistics

As with descriptive statistics, the techniques of inferential statistics differ depending on which type of data a researcher analyzes. Various inferential methods are available. While the details of both mathematical rationale and calculation differ greatly among these procedures, the important things to consider are as follows:

1. The end product of all inference procedures is the same: a statement of probability relating the sample data to hypothesized population characteristics.

2. Inference techniques are intended to answer only one question: Given the sample data, what are the probable population characteristics? These techniques do not help decide whether the data show meaningful or useful results, only the extent to which the results may be generalizable.

3. All inference techniques assume random sampling. Without random sampling, the resulting probabilities are in error to an unknown degree.

The most commonly used inferential procedures and the data types appropriate to their use are presented in the table below:

| Data type | Parametric tests | Non-parametric tests |
| --- | --- | --- |
| Quantitative | t-test for independent means; t-test for correlated means; Analysis of variance; Analysis of covariance | Mann-Whitney U test; Kruskal-Wallis one-way analysis of variance; Sign test; Median test; Rank sum test; Runs test |
| Categorical | ------------ | Chi-square test |
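To show what "a statement of probability relating the sample data to hypothesized population characteristics" can look like in practice, here is a minimal sketch of one of the listed non-parametric procedures, the sign test, written with the Python standard library only. The function name `sign_test`, the two-sided form of the test, and the paired differences are illustrative assumptions, not material from the text.

```python
from math import comb


def sign_test(differences):
    """Two-sided sign test on paired differences.

    Hypothesized population characteristic: the median difference is zero,
    so each nonzero difference is positive with probability 0.5.
    Zero differences are dropped, as is conventional for this test.
    """
    pos = sum(1 for d in differences if d > 0)
    neg = sum(1 for d in differences if d < 0)
    n = pos + neg
    if n == 0:
        raise ValueError("all differences are zero; the test is undefined")
    k = min(pos, neg)
    # Exact binomial probability of observing k or fewer of the rarer sign.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    p_value = min(1.0, 2 * tail)
    return pos, neg, p_value


# Hypothetical after-minus-before score differences for ten subjects.
diffs = [2, 3, -1, 4, 1, 0, 5, 2, 3, 1]
pos, neg, p = sign_test(diffs)
print(f"positive = {pos}, negative = {neg}, p-value = {p:.4f}")
```

The returned p-value says how probable a split at least this lopsided would be if the hypothesized median difference were zero. It does not say whether the difference is large enough to matter, which is the limitation noted in point 2 above.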

### Parameter and Statistics

#### Parameter:

Any numerical value describing a characteristic of a population is called a parameter. The statistical constants of the population, such as the mean, variance, skewness, kurtosis, moments, and correlation coefficient, are known as parameters.

#### Statistic:

Any numerical value describing a characteristic of a sample is called a statistic. The statistical constants of the sample, such as the mean, variance, skewness, kurtosis, moments, and correlation coefficient, are known as statistics.
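The distinction can be made concrete with a small Python sketch. The population of heights is simulated and purely hypothetical, and the helper names `population_parameters` and `sample_statistics` are introduced only for this illustration. In real studies the whole population is rarely observed, which is why the parameters usually have to be estimated from statistics.

```python
import random
import statistics


def population_parameters(population):
    """Mean and variance computed from every member of the population."""
    # pvariance divides by N, the population size.
    return statistics.fmean(population), statistics.pvariance(population)


def sample_statistics(sample):
    """Mean and variance computed from a sample only."""
    # variance divides by n - 1, the usual sample estimator.
    return statistics.fmean(sample), statistics.variance(sample)


# Hypothetical population of 5,000 heights in centimetres.
random.seed(42)
heights = [random.randint(140, 200) for _ in range(5000)]

mu, sigma_sq = population_parameters(heights)   # parameters
sample = random.sample(heights, 50)             # simple random sample
x_bar, s_sq = sample_statistics(sample)         # statistics

print(f"parameters: mean = {mu:.2f}, variance = {sigma_sq:.2f}")
print(f"statistics: mean = {x_bar:.2f}, variance = {s_sq:.2f}")
```

Drawing a different random sample changes the statistics but leaves the parameters untouched: a parameter is a fixed property of the population, while a statistic varies from sample to sample.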