statistic
March , 2025
Basics of Statistics
By Deepanjali Sahoo and Shewata Vikrant Patil
Definition: Statistics is the science of collecting, presenting,
analyzing, and reasonably interpreting data.
Statistics presents a rigorous
scientific method for gaining insight into data. For example, suppose we
measure the weight of 100 patients in a study. With so many measurements,
simply looking at the data fails to provide an informative account. However,
statistics can give an instant overall picture of the data through graphical
presentation or numerical summarization, irrespective of the number of data
points. Besides data summarization, another important task of statistics is to
make inferences and predict relationships between variables.
Statistical Description of Data
Statistics describes a numeric set of data by its
- Center
- Variability
- Shape
Statistics describes a categorical set of data by
- Frequency
- Percentage or proportion of each category
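As a rough illustration of describing a categorical data set by frequency and proportion, the sketch below counts categories with Python's standard library. The blood group values are hypothetical, invented only for this example, and are not data from the post.

```python
from collections import Counter

# Hypothetical categorical data, invented purely for illustration.
blood_groups = ["A", "B", "O", "O", "A", "O", "AB", "O", "B", "A"]

# Frequency: how many times each category occurs.
counts = Counter(blood_groups)
total = len(blood_groups)

# Proportion: frequency divided by the total number of observations.
proportions = {category: frequency / total for category, frequency in counts.items()}

for category, frequency in counts.most_common():
    share = proportions[category]
    print(f"{category}: frequency={frequency}, proportion={share:.2f}, percentage={share:.0%}")
```

The proportions always add up to 1, which is a quick sanity check on the counts.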
Measures of central tendency
Whenever you measure things of the same kind, a fairly large number of the measurements will tend to cluster around a middle value. Such a value is called a measure of "Central Tendency". Other terms used synonymously are "Measures of Location" and "Statistical Averages". The most widely used measures of central tendency are the Arithmetic Mean, the Median, and the Mode.
Mean: The sum of all the observations
divided by the number of observations.
Example - The mean of 20, 30, 40 is
(20+30+40)/3 = 30.
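The mean calculation above can be sketched in Python. This is a minimal illustration using the three values from the example. The variable names are chosen here for readability, and `statistics.mean` from the standard library is shown only as a cross-check.

```python
from statistics import mean

observations = [20, 30, 40]

# Mean by definition: sum of the observations divided by their count.
manual_mean = sum(observations) / len(observations)

# The standard library's statistics.mean should agree with the manual result.
library_mean = mean(observations)

print(manual_mean)
print(library_mean)
```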
Median: The middle value in an ordered
sequence of observations. That is, to find the median we need to order the data
set and then find the middle value. If the number of observations is even,
the median is the average of the two middle values.
Example - To find the
median of {9, 3, 6, 7, 5}, we first sort the data, giving {3, 5, 6, 7, 9}, then
choose the middle value, 6. If the number of observations is even, e.g., {9, 3,
6, 7, 5, 2}, then the median is the average of the two middle values from the
sorted sequence {2, 3, 5, 6, 7, 9}, in this case (5 + 6) / 2 = 5.5.
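The sort-then-pick-the-middle procedure can be written out as a short function. The sketch below uses a helper name, `median_by_sorting`, that is hypothetical and chosen for this example. It is applied to the two data sets from the text.

```python
def median_by_sorting(values):
    """Return the median of a non-empty sequence of numbers."""
    if not values:
        raise ValueError("median is undefined for an empty data set")
    ordered = sorted(values)          # step 1: order the observations
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd count: the single middle value
    # even count: average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_sample = [9, 3, 6, 7, 5]
even_sample = [9, 3, 6, 7, 5, 2]

print(median_by_sorting(odd_sample))   # 6
print(median_by_sorting(even_sample))  # 5.5
```

Python's standard library offers `statistics.median`, which gives the same answers for these two samples.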
Mode: The value that is observed most
frequently. The mode is undefined for sequences in which no observation is
repeated.
Example - The lifetimes, in hours, of
10 flashlight batteries are as follows. Find the mode.
340 350 340 340 320 340 330 330
340 350
The value 340 occurs five times, more often than any other value. Hence,
mode = 340.
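Counting how often each value occurs is easy to automate. The sketch below applies `collections.Counter` to the battery data from the example. The helper `mode_or_none` is a hypothetical name introduced here. It returns `None` when no observation is repeated, matching the convention in the text that the mode is then undefined. If several values tie for the highest count, this simple version just returns the one seen first.

```python
from collections import Counter

def mode_or_none(values):
    """Return the most frequent value, or None if no value is repeated."""
    counts = Counter(values)
    if not counts:
        return None
    value, frequency = counts.most_common(1)[0]
    # A highest frequency of 1 means nothing repeats, so the mode is undefined.
    return value if frequency > 1 else None

battery_life_hours = [340, 350, 340, 340, 320, 340, 330, 330, 340, 350]

print(Counter(battery_life_hours)[340])   # 5
print(mode_or_none(battery_life_hours))   # 340
print(mode_or_none([3, 5, 6, 7, 9]))      # None
```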
Mean or Median
The median is less sensitive to
outliers (extreme scores) than the mean and is thus a better measure than the mean
for highly skewed distributions, e.g. family income. For example, the mean of 20,
30, 40, and 990 is (20+30+40+990)/4 = 270. The median of these four observations
is (30+40)/2 = 35. Here 3 observations out of 4 lie between 20 and 40, so the
mean of 270 fails to give a realistic picture of the major part of the
data. It is pulled upward by the extreme value 990.
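To see the effect of the outlier numerically, the following sketch compares the mean and median of the four observations with and without the value 990. It uses Python's standard `statistics` module. The variable names are chosen for this illustration.

```python
from statistics import mean, median

with_outlier = [20, 30, 40, 990]
without_outlier = [20, 30, 40]

# Mean and median of the four observations from the example.
mean_with = mean(with_outlier)
median_with = median(with_outlier)

# The same measures once the extreme value 990 is left out.
mean_without = mean(without_outlier)
median_without = median(without_outlier)

# How far each measure moves because of the single outlier.
mean_shift = mean_with - mean_without
median_shift = median_with - median_without

print("mean:", mean_with, "vs", mean_without, "shift", mean_shift)
print("median:", median_with, "vs", median_without, "shift", median_shift)
```

In this data set the single outlier moves the mean by 240 but the median by only 5, which is why the median is the more representative summary here.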