Interpreting computer output for regression

Desiree is interested to see if students who consume more caffeine tend to study more as well. She randomly selects 20 students at her school and records their caffeine intake (mg) and the number of hours spent studying. A scatterplot of the data showed a linear relationship.

This is computer output from a least-squares regression analysis on the data:

Predictor	Coef	SE Coef	T	P
Constant	$2.544$	$0.134$	$18.955$	$0.000$
Caffeine (mg)	$0.164$	$0.057$	$2.862$	$0.005$

S = 1.532    R-Sq = 60.032%    R-Sq(adj) = 58.621%
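
As a rough aid for reading the columns: in output of this kind, the T value in each row is the coefficient divided by its standard error. The sketch below recomputes T from the printed Coef and SE Coef values. The dictionary layout and the function name `t_ratio` are made up for illustration, and because the printed values are rounded to three decimals, the recomputed ratios only approximately match the reported ones.

```python
# Values copied from the computer output above (each is rounded to 3 decimals).
rows = {
    "Constant": {"coef": 2.544, "se_coef": 0.134, "t_reported": 18.955},
    "Caffeine (mg)": {"coef": 0.164, "se_coef": 0.057, "t_reported": 2.862},
}


def t_ratio(coef, se_coef):
    """T statistic for testing whether a coefficient is zero: estimate / standard error."""
    return coef / se_coef


for name, row in rows.items():
    t = t_ratio(row["coef"], row["se_coef"])
    # Small differences from the reported T come from rounding in the printed table.
    print(f"{name}: recomputed T = {t:.3f}, reported T = {row['t_reported']}")
```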

Question 1

What is the equation of the least-squares regression line?

(Choice A) $\hat{y} = 0.164 + 2.544x$, where $x$ represents caffeine intake and $y$ represents hours spent studying
(Choice B) $\hat{y} = 0.164 + 2.544x$, where $x$ represents hours spent studying and $y$ represents caffeine intake
(Choice C) $\hat{y} = 2.544 + 0.164x$, where $x$ represents caffeine intake and $y$ represents hours spent studying
(Choice D) $\hat{y} = 2.544 + 0.164x$, where $x$ represents hours spent studying and $y$ represents caffeine intake

Question 2

Which statement about the slope is true?

(Choice A) For each additional $1$ hour of study time, the caffeine intake is predicted to increase by $0.164$ mg.
(Choice B) For each additional $1$ hour of study time, the caffeine intake is predicted to decrease by $0.164$ mg.
(Choice C) For each additional $1$ mg of caffeine, the study time is predicted to increase by $0.164$ hours.
(Choice D) For each additional $1$ mg of caffeine, the study time is predicted to decrease by $0.164$ hours.

Question 3

Which statement about the $y$-intercept is true?

(Choice A) When the caffeine intake is $0$ mg, the study time is predicted to be $2.544$ hours.
(Choice B) When the study time is $0$ hours, the caffeine intake is predicted to be $2.544$ mg.
(Choice C) When the caffeine intake is $0$ mg, the study time is predicted to be $0.164$ hours.
(Choice D) When the study time is $0$ hours, the caffeine intake is predicted to be $0.164$ mg.

Question 4

How large is a typical prediction error when using this model to predict study time from caffeine intake?

Question 5

About what percentage of the variation in study time can be explained by the regression on caffeine intake?

Question 6

Based on these data, can we conclude that consuming more caffeine will cause someone to study more?

Want to join the conversation?

saikyun
Posted 7 years ago.
In the earlier video, "R-squared or coefficient of determination", you mentioned SEline, as in, the sum of the errors between the line and the points. Would S (the standard deviation of the residuals) be SEline/n?
(11 votes)
- Vasu Jha
  Posted 5 months ago.
  SEline is the aggregate (summed) squared error of the regression line in predicting y, whereas the RMSD of the residuals represents the typical prediction error in y. One is a sum, the other is an average-type quantity: S is roughly the square root of SEline divided by the number of points (software typically divides by n - 2 rather than n).
  (1 vote)
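
To make the difference between SEline and S concrete, here is a minimal sketch in plain Python. The five data points are made up for illustration (they are not Desiree's sample), the function name `least_squares_line` is hypothetical, and the `n - 2` divisor is the usual software convention for a line with one predictor.

```python
import math

# Hypothetical data, made up for illustration (not Desiree's actual sample).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    sxx = sum((xi - x_bar) ** 2 for xi in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = least_squares_line(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# SEline: the SUM of the squared residuals (grows as more points are added).
se_line = sum(r ** 2 for r in residuals)

# S: the typical size of one residual, using the n - 2 divisor.
s = math.sqrt(se_line / (len(x) - 2))

print(f"SEline = {se_line:.3f}, S = {s:.3f}")
```

So S is not SEline/n: the sum is first divided by (roughly) the number of points and then square-rooted, which puts S back in the same units as y.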
abojovic
Posted 4 years ago.
I was under the impression that a P-value below 0.05 implies there is a relationship between the independent variable and the dependent variable. If the relationship is also positive, at what point can we confidently determine that the model is a good fit and that the increase is caused by the independent variable? Is there a percent threshold for R-sq or adjusted R-sq?
(6 votes)
- daniella
  Posted 4 months ago.
  The significance level of 0.05 is commonly used to determine whether a relationship between the independent and dependent variables is statistically significant. However, statistical significance alone does not indicate the strength or practical importance of the relationship. The coefficient of determination (R^2) or adjusted R^2 provides a measure of the proportion of variance explained by the regression model. There isn't a specific threshold for R^2 or adjusted R^2 to determine a "good" fit, as it can vary depending on the context and field of study. Generally, higher values of R^2 indicate a better fit, but it's essential to consider other factors and conduct further analysis to assess the model's adequacy.
  (2 votes)
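
One way to see why a small P-value says little about the size of R^2: for a regression line with one predictor, the T statistic for the slope can be written in terms of the correlation r and the sample size n as T = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below uses a made-up correlation of 0.2 (so R^2 is only 4%) and two hypothetical sample sizes; the function name is illustrative only.

```python
import math


def t_statistic_for_slope(r, n):
    """T for the slope in simple linear regression, from correlation r and sample size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


r = 0.2  # hypothetical correlation, so R^2 = 0.04 (4% of variation explained)

t_small_sample = t_statistic_for_slope(r, 20)
t_large_sample = t_statistic_for_slope(r, 500)

# Two-sided 0.05 critical values are about 2.10 (df = 18) and about 1.96 (df = 498).
print(f"n = 20:  T = {t_small_sample:.3f}")   # below 2.10, not significant
print(f"n = 500: T = {t_large_sample:.3f}")   # above 1.96, significant
```

With the same weak relationship, the larger sample crosses the 0.05 threshold and the smaller one does not, which is why significance and goodness of fit have to be judged separately.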
Akira
Posted 3 years ago.
Why doesn't "bx" come first in ŷ=a+bx, whereas "mx" comes first in y=mx+b?
(2 votes)
- Sam Alarcon
  Posted 3 years ago.
  I don't think the order matters as long as you have the correct values for the constant and slope.
  (9 votes)
Won-jae Lee
Posted 5 years ago.
Can anybody please explain why the constant coefficient 2.544 is the y-intercept, and the caffeine coefficient 0.164 is the slope, in Question 1? I can't seem to get my head around this. Please help!
(0 votes)
- taila
  Posted 5 years ago.
  In this kind of output, the y-intercept is displayed in the top row (labeled "Constant"), and the slope is displayed in the bottom row (labeled with the predictor, here "Caffeine (mg)"). (Unfortunately, I don't know the reasoning behind this layout - sorry! Generally, I've found that the slope, y-intercept, S, and R-sq are the most useful pieces of information in these outputs.)
  (5 votes)
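
One way to see why the "Constant" row is the y-intercept and the "Caffeine (mg)" row is the slope is to write the fitted line as a small function: the constant is the term that is not multiplied by x, so it is what remains when x is 0, and the caffeine coefficient is what each extra milligram adds. This is only a sketch using the two coefficients from the output; the names `INTERCEPT`, `SLOPE`, and `predict_hours` are made up for illustration.

```python
INTERCEPT = 2.544  # "Constant" row: predicted study hours when caffeine is 0 mg
SLOPE = 0.164      # "Caffeine (mg)" row: change in predicted hours per extra mg


def predict_hours(caffeine_mg):
    """Predicted study time (hours) from caffeine intake (mg): y-hat = a + b*x."""
    return INTERCEPT + SLOPE * caffeine_mg


# With x = 0 the slope term vanishes, leaving only the constant.
at_zero = predict_hours(0)

# Raising x by 1 mg changes the prediction by exactly the slope.
one_mg_step = predict_hours(11) - predict_hours(10)

print(f"predicted hours at 0 mg: {at_zero}")
print(f"change per extra mg: {one_mg_step:.3f}")
```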
Sophia Perez-Esparza
Posted 9 months ago.
Is the R-Sq always going to be the typical prediction error?
(2 votes)
- Vasu Jha
  Posted 5 months ago.
  The standard deviation of the residuals (S) is the typical/average prediction error. R-Sq is the percent reduction in squared prediction error when using the regression line compared to using the average-y line (the total variation in y).
  (1 vote)
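
The "percent reduction" description of R-Sq can be sketched as 1 - SEline / SEtotal, where SEtotal is the squared error you would get by always predicting the average y. The observed values and predictions below are made up for illustration (the predictions are assumed to come from some fitted line), and the function name `r_squared` is hypothetical.

```python
# Hypothetical observed values and the predictions of a fitted line for them.
y_observed = [2.0, 4.0, 5.0, 4.0, 5.0]
y_predicted = [2.8, 3.4, 4.0, 4.6, 5.2]


def r_squared(observed, predicted):
    """Share of the squared error around the mean that the line removes."""
    y_bar = sum(observed) / len(observed)
    # Squared error when always predicting the average y (total variation in y).
    se_total = sum((yi - y_bar) ** 2 for yi in observed)
    # Squared error left over when predicting with the regression line.
    se_line = sum((yi - pi) ** 2 for yi, pi in zip(observed, predicted))
    return 1 - se_line / se_total


r_sq = r_squared(y_observed, y_predicted)
print(f"R-Sq = {100 * r_sq:.1f}%")
```

R-Sq is therefore a unitless percentage, while S is measured in the units of y (hours, in the caffeine example), so the two answer different questions.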
leviaphan0913
Posted 4 months ago.
Why does more caffeine intake not lead to studying more when there is a strong positive linear relationship?
(1 vote)
- daniella
  Posted 4 months ago.
  While there may be a strong positive linear relationship between caffeine intake and study time, correlation does not imply causation: even if two variables are strongly correlated, it doesn't mean that changes in one variable cause changes in the other. There could be other variables or factors influencing the relationship, and establishing causation requires additional evidence from experimental studies or rigorous causal inference methods. Therefore, it's not accurate to conclude that consuming more caffeine leads to studying more based solely on the observed correlation.
  (1 vote)
Adaugo Chikezie
Posted 6 years ago.
I don't understand which are the x and y values in the charts.
(0 votes)
- Kelvin
  Posted 5 years ago.
  y is the number of hours studied and x is the number of milligrams of caffeine taken. The "Constant" row gives the y-intercept, not y itself.
  (3 votes)
paperangel220
Posted 2 years ago.
What does regression mean?
(0 votes)
- Violet SW
  Posted 2 years ago.
  Correlation quantifies the strength of the linear relationship between a pair of variables, whereas regression expresses the relationship in the form of an equation.
  (1 vote)
Maecie Rogahn
Posted 2 years ago.
more caffeine = more awake = more willingness to study
(0 votes)
575368
Posted 2 years ago.
Why is the format different? We don't use y=mx+b but y=b+mx. What's the difference?
(0 votes)
- var
  Posted 2 years ago.
  There is no difference; sometimes it's just written differently.
  By the commutative property of addition, you can reorder the terms and the result won't change.
  (2 votes)