Simple Linear Regression in R

Quick Look: Linear Regression

Linear regression analysis is a common statistical method for examining the relationship between two or more variables. In particular, we might use regression to see whether changes in one variable “explain” changes in another variable.

In this post, we will look at simple linear regression, which means we use one explanatory variable (also called the independent variable) to explain one dependent variable.

Linear regression will show us not just whether a relationship exists, but also its strength and its statistical significance.

Let’s get started.

The Data

First, we read in the data:

# read in the CSV file
Data <- read.csv("Data.csv")

# rename the columns
colnames(Data) <- c("Weight", "HeartRate")

This is a small dataset with two variables, Weight (kg) and Heart Rate, and 6 observations.

Data
##   Weight HeartRate
## 1     62        90
## 2     45        86
## 3     40        67
## 4     55        89
## 5     64        81
## 6     53        75
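If Data.csv is not at hand, the same six observations can be typed in directly. This is a minimal sketch that copies the values from the printout above; nothing else about the original file is assumed.

```r
# Rebuild the dataset by hand from the values printed above
Data <- data.frame(
  Weight    = c(62, 45, 40, 55, 64, 53),
  HeartRate = c(90, 86, 67, 89, 81, 75)
)

# Check the structure: 6 observations of 2 numeric variables
str(Data)
```

With Data defined this way, the rest of the code in this post should run unchanged.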

We can use the sapply function to apply mean to each column and view the mean of both variables:

sapply(Data, mean)
##    Weight HeartRate
##  53.16667  81.33333

Plotting X and Y

The scatterplot below puts weight on the x-axis and heart rate on the y-axis. It suggests a positive relationship between weight and heart rate, though the relationship does not appear to be very strong.

attach(Data)
plot(Weight, HeartRate,
     pch = 16, cex = 1.3, col = "blue",
     main = "WEIGHT PLOTTED AGAINST HEART RATE",
     xlab = "Weight (kg)", ylab = "Heart Rate")
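To put a number on "not very strong", one option is the Pearson correlation coefficient from the cor function. The sketch below re-enters the six observations so that it runs by itself.

```r
# The six observations from the printout earlier in the post
Data <- data.frame(
  Weight    = c(62, 45, 40, 55, 64, 53),
  HeartRate = c(90, 86, 67, 89, 81, 75)
)

# Pearson correlation between weight and heart rate (ranges from -1 to 1)
r <- cor(Data$Weight, Data$HeartRate)
round(r, 3)
```

The result is roughly 0.57: positive, but well short of 1, which matches the loose upward drift in the scatterplot.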

Simple Linear Regression

Now that we’ve looked at our data in a couple of ways, we will run a regression. We want to look at the effect weight has on heart rate, so we will regress heart rate on weight.

The lm function (short for linear model) takes two arguments here: the formula and the data.

Our formula is HeartRate~Weight, with the Y variable on the left and the X variable on the right. Our data is passed with data=Data.

We can view the regression results with the summary function.

Linear <- lm(HeartRate~Weight, data=Data)

#View the results
summary(Linear)
##
## Call:
## lm(formula = HeartRate ~ Weight, data = Data)
##
## Residuals:
##      1      2      3      4      5      6
##  3.863  9.108 -7.172  6.670 -6.225 -6.243
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  52.4178    21.1795   2.475   0.0686 .
## Weight        0.5439     0.3933   1.383   0.2389
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.239 on 4 degrees of freedom
## Multiple R-squared:  0.3234, Adjusted R-squared:  0.1543
## F-statistic: 1.912 on 1 and 4 DF,  p-value: 0.2389
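The individual numbers in this printout can also be pulled out programmatically, which is handy when reporting them. A short sketch, refitting the model on the six observations shown earlier (the variable names s, p_weight and r_squared are just illustrative):

```r
# The six observations from the printout earlier in the post
Data <- data.frame(
  Weight    = c(62, 45, 40, 55, 64, 53),
  HeartRate = c(90, 86, 67, 89, 81, 75)
)
Linear <- lm(HeartRate ~ Weight, data = Data)
s <- summary(Linear)

# Coefficient table: estimates, standard errors, t values, p-values
coef(s)

# p-value for the Weight slope
p_weight <- coef(s)["Weight", "Pr(>|t|)"]

# Share of the variation in HeartRate explained by Weight
r_squared <- s$r.squared
```

Here the p-value for the slope (about 0.24) is above the usual 0.05 cutoff, so with only 6 observations the relationship is not statistically significant, and Weight accounts for roughly 32% of the variation in HeartRate.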

Interpreting Coefficients

Let’s look closer at the coefficients, B0 and B1, where B0 is the (Intercept) coefficient, and B1 is the Weight coefficient:

coef(Linear)
## (Intercept)      Weight
##  52.4177744   0.5438663
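For simple linear regression, these two estimates have closed forms: B1 is the covariance of X and Y divided by the variance of X, and B0 is mean(Y) - B1 * mean(X). The sketch below checks that against lm, using the six observations re-entered by hand (b0 and b1 are illustrative names).

```r
# The six observations from the printout earlier in the post
Data <- data.frame(
  Weight    = c(62, 45, 40, 55, 64, 53),
  HeartRate = c(90, 86, 67, 89, 81, 75)
)

# Slope: covariance of the two variables divided by the variance of Weight
b1 <- cov(Data$Weight, Data$HeartRate) / var(Data$Weight)

# Intercept: the fitted line passes through the point of means
b0 <- mean(Data$HeartRate) - b1 * mean(Data$Weight)

c(b0 = b0, b1 = b1)

# Compare with the estimates from lm
coef(lm(HeartRate ~ Weight, data = Data))
```

Both approaches should agree up to floating-point rounding.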

Interpretation of the intercept:

In this scenario, the intercept cannot be realistically interpreted, as one must have a weight to be alive. However, we would interpret it as follows:

When the weight in kilograms is 0, the predicted heart rate is 52.418, holding everything else constant.

Interpretation of B1:

As Weight increases by one kg, the predicted Heart Rate increases by about 0.544, holding everything else constant.
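Taken together, the two coefficients give a prediction for any weight: B0 + B1 * Weight. As an illustration, the sketch below uses predict for a hypothetical person weighing 50 kg (a value chosen for this example, not one from the dataset).

```r
# The six observations from the printout earlier in the post
Data <- data.frame(
  Weight    = c(62, 45, 40, 55, 64, 53),
  HeartRate = c(90, 86, 67, 89, 81, 75)
)
Linear <- lm(HeartRate ~ Weight, data = Data)

# Hypothetical new observation; the column name must match the predictor
new_person <- data.frame(Weight = 50)

# Predicted heart rate from the fitted line
pred <- predict(Linear, newdata = new_person)

# Same calculation by hand: B0 + B1 * 50
by_hand <- unname(coef(Linear)[1] + coef(Linear)[2] * 50)

c(predict = unname(pred), by_hand = by_hand)
```

Both values come out to about 79.6. Since 50 kg lies inside the observed weight range (40 to 64 kg), this is interpolation rather than extrapolation.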

Confidence Interval

Next, we compute confidence intervals for the coefficients at a 95% confidence level:

confint(Linear, level = .95)
##                  2.5 %     97.5 %
## (Intercept) -6.3859967 111.221546
## Weight      -0.5481237   1.635856

Interpretation of the Confidence Interval:

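The confidence interval on the intercept tells us we can be 95% confident that the intercept falls between -6.386 and 111.222.

The confidence interval on Weight tells us we can be 95% confident that the Weight parameter falls between -0.548 and 1.636. Because this interval includes 0, we cannot rule out that weight has no effect on heart rate, which agrees with the p-value of 0.2389 in the regression summary.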
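Each interval is the estimate plus or minus a critical t value times the standard error, using the residual degrees of freedom (4 here). A sketch that reproduces the confint result for Weight by hand, with the six observations re-entered so it runs by itself:

```r
# The six observations from the printout earlier in the post
Data <- data.frame(
  Weight    = c(62, 45, 40, 55, 64, 53),
  HeartRate = c(90, 86, 67, 89, 81, 75)
)
Linear <- lm(HeartRate ~ Weight, data = Data)

# Estimate and standard error of the Weight coefficient
est <- coef(summary(Linear))["Weight", "Estimate"]
se  <- coef(summary(Linear))["Weight", "Std. Error"]

# Critical value for a two-sided 95% interval with n - 2 = 4 degrees of freedom
t_crit <- qt(0.975, df = df.residual(Linear))

ci_weight <- c(lower = est - t_crit * se, upper = est + t_crit * se)
ci_weight
```

With only 4 degrees of freedom the critical value is about 2.78 rather than the familiar 1.96, which is part of why the interval is so wide.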

Plotting the Regression Line

We can use the abline function to plot the regression line on the scatterplot. The regression line is in red:

plot(x = Weight, y = HeartRate,
     pch = 16, cex = 1.3, col = "blue",
     main = "WEIGHT PLOTTED AGAINST HEART RATE W/ REGRESSION LINE",
     xlab = "Weight (kg)", ylab = "Heart Rate")
abline(lm(HeartRate ~ Weight, data = Data), col = "red")

Wrap Up

This is a very simple introduction to simple linear regression. Good luck on your regression journey!
