Fundamentals of Data
Data: A collection of facts or numerical information used for a specific purpose.
Statistics: The branch of mathematics dealing with the collection, organization, analysis, and interpretation of data.
Primary Data: Collected directly by the researcher (e.g., your own survey).
Secondary Data: Gathered from an existing source (e.g., a census report).
Raw Data: Data that has been collected but not yet organized or edited.
Organizing Data
Range: The difference between the highest and lowest values (\(\text{Range} = \text{Max} - \text{Min}\)).
Frequency: How often a specific value appears in a dataset.
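As a quick illustration of the two definitions above, here is a minimal sketch in Python. The list `marks` is made-up sample data, not taken from these notes.

```python
from collections import Counter

# Hypothetical raw data: marks scored by ten students.
marks = [12, 15, 12, 18, 20, 15, 12, 9, 18, 15]

# Range = Max - Min
data_range = max(marks) - min(marks)

# Frequency: how many times each value appears in the data.
frequency = Counter(marks)

print("Range:", data_range)
for value in sorted(frequency):
    print(value, "occurs", frequency[value], "time(s)")
```

The frequencies always add up to the total number of observations, which is a handy check when tallying by hand.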
Class Intervals
Continuous Series: No gaps between classes (e.g., \(0-10, 10-20, 20-30\)).
Discontinuous Series: Gaps exist between classes (e.g., \(30-39, 40-49\)).
Class Size: The difference between successive upper (or lower) limits.
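The sketch below shows one way to compute the class size and tally values into continuous classes. The class lists and `values` are hypothetical, and the rule that an observation equal to a shared limit goes into the higher class (lower limit included, upper limit excluded) is an assumed convention, not something stated in these notes.

```python
# Hypothetical continuous classes as (lower, upper) pairs.
continuous = [(0, 10), (10, 20), (20, 30)]

# Hypothetical discontinuous classes.
discontinuous = [(30, 39), (40, 49)]

# Class size: difference between successive lower limits.
size_continuous = continuous[1][0] - continuous[0][0]
size_discontinuous = discontinuous[1][0] - discontinuous[0][0]

# Hypothetical observations to tally into the continuous classes.
values = [3, 10, 14, 20, 25, 29, 7]

# Assumed convention: lower limit included, upper limit excluded,
# so the value 10 is counted in 10-20, not in 0-10.
tally = {c: 0 for c in continuous}
for v in values:
    for lower, upper in continuous:
        if lower <= v < upper:
            tally[(lower, upper)] += 1
            break

print(size_continuous, size_discontinuous)
print(tally)
```

Note that the discontinuous classes also have size 10 when measured between successive lower limits, even though 39 - 30 is only 9.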
Graphical Representations
Bar Graphs: Use vertical or horizontal bars of uniform width with equal spacing between them.
Histograms: Used for continuous data; the rectangles are joined together without gaps.
Frequency Polygons: A line graph formed by joining the mid-points of the tops of histogram rectangles.
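To make the frequency polygon concrete, the following sketch computes the points that would be joined. The classes and frequencies are hypothetical, and the sketch only produces coordinates; it does not draw the graph.

```python
# Hypothetical continuous classes and their frequencies.
classes = [(0, 10), (10, 20), (20, 30)]
frequencies = [2, 5, 3]

# The mid-point of the top of each histogram rectangle has
# x = mid-point of the class and y = frequency of the class.
points = [((lower + upper) / 2, f)
          for (lower, upper), f in zip(classes, frequencies)]

print(points)
```

Joining these points in order with straight line segments gives the frequency polygon.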
Measures of Central Tendency
These measures identify a single value around which the observations in a data set tend to cluster.
Mean (\(\bar{x}\))
The sum of all observations divided by the total number of observations.
Formula:
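- For raw data (individual observations): \(\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}\), where \(n\) is the number of observations.
- For an ungrouped frequency distribution: \(\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}\), where \(f_i\) is the frequency of the value \(x_i\), \(k\) is the number of distinct values, and \(\sum_{i=1}^{k} f_i = n\).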
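The two forms of the mean above, the plain average of individual observations and the frequency-weighted version, can be checked against each other with a short sketch. The data are invented for illustration, and the frequency table is simply the same observations written as values with counts.

```python
# Hypothetical individual observations.
observations = [4, 6, 6, 8, 10, 10, 12]

# Mean of individual observations: sum of observations / number of observations.
mean_raw = sum(observations) / len(observations)

# The same data as a frequency table: value x_i with frequency f_i.
values = [4, 6, 8, 10, 12]
freqs = [1, 2, 1, 2, 1]

# Mean from frequencies: sum of f_i * x_i divided by sum of f_i.
sum_fx = sum(f * x for f, x in zip(freqs, values))
mean_freq = sum_fx / sum(freqs)

print(mean_raw, mean_freq)
```

Both calculations give the same result because multiplying a value by its frequency is just a shortcut for adding it that many times.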
Median
The middle value of a dataset arranged in ascending or descending order; it divides the dataset into two equal parts.
- If \(n\) (number of observations) is odd: The median is the \(\left(\frac{n+1}{2}\right)^{\text{th}}\) observation.
- If \(n\) is even: The median is the mean of the \(\left(\frac{n}{2}\right)^{\text{th}}\) and \(\left(\frac{n}{2} + 1\right)^{\text{th}}\) observations.
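The odd and even cases above can be sketched as a small function. The name `median` and the sample lists are illustrative choices; the data must be sorted first, and the positions in the rule are counted from 1 while Python lists are indexed from 0.

```python
def median(data):
    """Return the median using the odd/even position rule."""
    ordered = sorted(data)  # the rule applies to ordered data
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)-th observation (1-based position).
        return ordered[(n + 1) // 2 - 1]
    # Even n: mean of the (n / 2)-th and (n / 2 + 1)-th observations.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


print(median([7, 3, 9, 1, 5]))  # odd number of observations
print(median([7, 3, 9, 1]))     # even number of observations
```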
Mode
The value that occurs most frequently in the given set of data; that is, the observation having the maximum frequency.
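A minimal sketch of finding the observation with the maximum frequency follows. The function name `modes` and the sample lists are hypothetical. Returning every value tied for the highest frequency is a design choice made here, since a dataset can have more than one most frequent value.

```python
from collections import Counter


def modes(data):
    """Return all values that share the maximum frequency, in sorted order."""
    counts = Counter(data)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([2, 3, 3, 5, 7, 3, 5]))  # one value has the highest frequency
print(modes([1, 1, 2, 2, 3]))        # two values tie for the highest frequency
```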