It was exactly a year ago that I finished the R course with Markdown,
lol.
Introduction
- Check the syllabus
- A Discussion Board will be maintained
- Textbook is recommended (we will not cover the material in Chapter 11)
- Course material is available under Modules
- R and RStudio
- Calculator (one with statistical functions is recommended but not
required)
- Quizzes: 2 drops; maximum of 3 subsession absences
About R (Base R)
- R is a very powerful and popular statistical software
- R is open source and FREE
- R runs on many operating systems: Windows, macOS, Unix, Linux …
- R is an efficient data handling and storage facility
- R is one of the most widely used statistical software packages in
research
- R has a steep learning curve and good documentation support
Learning R
- Installation and interface of R (and RStudio)
- R console, script, workspace, working directory, libraries (base R
most of the time)
- R code: comments, help
?lm
## starting httpd help server ... done
# R derives from the S language, not Python; it shares Python's # comment syntax, which is really nice.
- Data structures in R: scalar, vector, matrix, array, list, data
frame
- Data types in R: class (numeric, factor, character, user-defined),
names, attributes
- Summary statistics: mean, sd, cor, var, plot, hist, summary
- In and out of R: workspace, CSV/Excel data
- Use RMarkdown to write dynamic/interactive statistical analysis
reports
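The topics in the list above can be tried out in a short base R session. This is only a minimal sketch, not course material: the object names (`heights`, `students`, `tmp_csv`, and so on) and all the numbers in them are made up for illustration.

```r
# Data structures (all values below are made-up illustration data)
x <- 3.5                                    # a "scalar" is a length-1 vector
heights <- c(160, 172, 181, 169)            # numeric vector
m <- matrix(1:6, nrow = 2)                  # 2 x 3 matrix
a <- array(1:24, dim = c(2, 3, 4))          # 3-dimensional array
l <- list(id = 1L, tag = "a", v = heights)  # list: mixed types allowed
students <- data.frame(
  name   = c("A", "B", "C", "D"),
  group  = factor(c("x", "y", "x", "y")),
  height = heights
)

# Data types: class, names, attributes
class(heights)         # "numeric"
class(students$group)  # "factor"
names(students)        # column names of the data frame
attributes(m)          # a matrix is a vector with a dim attribute

# Summary statistics
mean(heights)
sd(heights)
var(heights)
summary(students)

# In and out of R: write a CSV to a temporary file and read it back
tmp_csv <- tempfile(fileext = ".csv")
write.csv(students, tmp_csv, row.names = FALSE)
students2 <- read.csv(tmp_csv)
```

Typing `?mean` or `?read.csv` at the console opens the help page for each function used here.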
Statistics is learning from data
Statistics is the art and science of
learning from data.
- Art comes from the various creative and informative ways to
visualize, summarize, and analyze data.
- Science comes from using applied and theoretical mathematics and
probability to make objective decisions.
An analysis that does not contain both aspects is often
incomplete and difficult to understand and use.
- Note: “statistics” is also the plural of “statistic”, which is a
numerical fact or summary. Examples: average/mean, variance, range,
…
- Formally, a statistic is a function of the data.
- Key: understand and quantify
uncertainty/variability.
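To make "a statistic is a function of the data" concrete, here is a small sketch in base R. The sample `scores` is made up, and `sample_range` is a hypothetical helper written for this illustration, not a built-in function.

```r
# Made-up sample, for illustration only
scores <- c(12, 15, 9, 20, 14)

# A statistic is any function of the data; this helper is hypothetical
sample_range <- function(x) max(x) - min(x)

stat_mean  <- mean(scores)          # location
stat_var   <- var(scores)           # variability (sample variance, n - 1 denominator)
stat_range <- sample_range(scores)  # distance between the extremes

print(c(mean = stat_mean, var = stat_var, range = stat_range))
```

Note that base R's own `range()` returns the minimum and maximum as a pair rather than their difference, which is why the sketch defines a separate helper.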
Scope of Statistics
Formulation of the problem
Collecting the data
Analyzing, interpreting, and communicating the results:
- Modeling
- Estimation
- Hypothesis testing
- Interpretation
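The four analysis steps above can be sketched on a single example. The choice of the built-in `mtcars` data set and of a simple linear regression of `mpg` on `wt` is an assumption made for illustration; the notes do not prescribe a particular data set or model.

```r
# Modeling: fit a simple linear regression of fuel economy on weight
# (mtcars ships with base R; using it here is an illustrative assumption)
fit <- lm(mpg ~ wt, data = mtcars)

# Estimation: point estimates and 95% confidence intervals
est <- coef(fit)
ci  <- confint(fit, level = 0.95)

# Hypothesis testing: t-test of H0: slope = 0, read from the coefficient table
slope_p <- summary(fit)$coefficients["wt", "Pr(>|t|)"]

# Interpretation: the slope is the estimated change in mpg
# per unit of wt (wt is recorded in units of 1000 lb)
print(est)
print(ci)
print(slope_p)
```

A small p-value for the slope would be read as evidence that weight is associated with fuel economy in this sample; the confidence interval quantifies the uncertainty in the estimated slope.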