Assignment posted, due next week. It is based on today's lecture and looks ahead a little bit.

On Example 5: P(+ | Normal) = 5%, P(Symptom) = 0.1% over the whole population, P(+ | Symptom) = 99%. Then P(+) = 5% × 99.9% + 99% × 0.1% = 5.094% over the whole population, and P(Normal | +) = 4.995% / 5.094% ≈ 98.0565%.
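The Example 5 quantities can be checked with a short script. This is only a sketch: it assumes that "Normal" is the complement of "Symptom" (so P(Normal) = 1 - P(Symptom)), and the variable names are made up for illustration. It applies the law of total probability and then Bayes' rule to the three given inputs.

```python
# Inputs from Example 5, written as proportions.
p_pos_given_normal = 0.05    # P(+ | Normal)
p_symptom = 0.001            # P(Symptom) over the whole population
p_pos_given_symptom = 0.99   # P(+ | Symptom)

# Assumption: everyone is either Normal or Symptom.
p_normal = 1 - p_symptom

# Law of total probability: P(+) over the whole population.
p_pos = p_pos_given_normal * p_normal + p_pos_given_symptom * p_symptom

# Bayes' rule: P(Normal | +) and its complement P(Symptom | +).
p_normal_given_pos = p_pos_given_normal * p_normal / p_pos
p_symptom_given_pos = 1 - p_normal_given_pos

print(f"P(+)           = {p_pos:.5%}")
print(f"P(Normal | +)  = {p_normal_given_pos:.5%}")
print(f"P(Symptom | +) = {p_symptom_given_pos:.5%}")
```

Because the symptom is so rare, most positive results come from the Normal group even though the test is accurate on the Symptom group.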

Simpson’s Paradox

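The underlying composition of the population (how it splits into subgroups) can skew the overall probabilities: a trend that holds within every subgroup may weaken or even reverse when the subgroups are combined.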
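A small numerical sketch of Simpson's Paradox may help. The counts below are hypothetical, invented only for illustration (they are not from the lecture), and the labels "easy", "hard", "X", and "Y" are placeholders. Option X has the higher success rate inside each subgroup, yet Y has the higher rate once the subgroups are pooled, because X was mostly tried on the hard cases.

```python
from fractions import Fraction

# Hypothetical (successes, trials) counts, made up for illustration only.
counts = {
    "easy": {"X": (9, 10), "Y": (80, 100)},
    "hard": {"X": (30, 100), "Y": (2, 10)},
}

# Success rate of each option within each subgroup.
within = {
    group: {opt: Fraction(s, n) for opt, (s, n) in opts.items()}
    for group, opts in counts.items()
}

# Success rate of each option after pooling the subgroups.
overall = {}
for opt in ("X", "Y"):
    successes = sum(counts[group][opt][0] for group in counts)
    trials = sum(counts[group][opt][1] for group in counts)
    overall[opt] = Fraction(successes, trials)

for group, rates in within.items():
    print(group, {opt: round(float(r), 3) for opt, r in rates.items()})
print("overall", {opt: round(float(r), 3) for opt, r in overall.items()})
```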

Population and Samples (Chapter I)

Population: the collection of all objects, items, humans, or animals about which information is sought.

Sample: a subset of the population that is actually observed. (Sampling refers to the process of selecting a number of population units and recording their characteristics.)

Parameter: a numerical characteristic of the population, for a specified variable, that we are interested in learning about.

Statistic: a numerical function of the sample data that is used to estimate the unknown parameter of the population.

Population -- Sampling --> Sample

Parameters <-- Inference -- Statistics

Example

Interest: Is the lead content of a lake within the safety limit?

Population: All locations in the lake

Sample: 30 locations selected for examination

Parameter: Average lead concentration in the lake

Statistic: Average lead concentration at our 30 sampled locations

Why Sampling

Can we collect all population units? - Yes, with a census, but it is costly, time-consuming, and rarely used.

We can use the sample statistics to provide estimates of the population parameters.
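The idea can be sketched with a simulation in the spirit of the lake example. Everything here is synthetic: the 10,000 "locations", the normal distribution, and its mean and spread are assumptions chosen only for illustration, not measurements. The script computes the parameter (the population mean, which is normally unknown) and compares it with the statistic (the mean of a simple random sample of 30 locations).

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Synthetic population: lead concentration at 10,000 hypothetical locations.
# The distribution and its values (mean 10, sd 2) are invented for illustration.
population = [random.gauss(10.0, 2.0) for _ in range(10_000)]

# Parameter: the population mean (unknown in practice without a census).
parameter = statistics.mean(population)

# Sample: a simple random sample of 30 locations, drawn without replacement.
sample = random.sample(population, 30)

# Statistic: the sample mean, used as a point estimate of the parameter.
statistic = statistics.mean(sample)

print(f"parameter (population mean):   {parameter:.3f}")
print(f"statistic (sample mean, n=30): {statistic:.3f}")
```

Rerunning with a different seed gives a different sample and a slightly different statistic, while the parameter of a given population stays fixed.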

Scope of statistics: - Point estimation, confidence intervals, testing, prediction.