## Project Case Showcase: 441 Project

**(1) Introduction:**


Concrete is the most important material in civil engineering, and it is composed of many ingredients, including cement.

So the question I want to ask is how cement content is related to concrete compressive strength. If we know this relationship, we can change the compressive strength of concrete by changing its cement content.


**(2) Data description:**


I found this dataset in the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/datasets.html). According to the website, the dataset was donated on 2007-08-03. Since the dataset comes from the physical sciences rather than from economics, where timeliness matters a great deal, the donation date is not so important.

The independent variable (x) is cement content, measured in kg per m^{3} of mixture. The dependent variable (y) is concrete compressive strength, measured in MPa.

The descriptive statistics for x and y and their scatter plot follow.

**(3) Regression results:**


The regression results follow. From the first table, we find that the equation is:

__Concrete compressive strength = 13.443 + 0.08*cement.__


The intercept is 13.443: when cement content is 0, the predicted concrete compressive strength is 13.443 MPa.

The slope is 0.08: when cement content increases by 1 kg per m^{3}, the predicted concrete compressive strength increases by 0.08 MPa.
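As a small illustration of how the intercept and slope are read, the fitted equation can be written as a function. This is only a sketch: the function name and the cement values in the loop are made up for the example, and the coefficients are the ones reported in the equation above.

```python
# Coefficients taken from the fitted equation reported above.
INTERCEPT = 13.443  # MPa, predicted strength at zero cement
SLOPE = 0.08        # MPa per (kg of cement per m^3 of mixture)


def predict_strength(cement_kg_per_m3: float) -> float:
    """Predicted concrete compressive strength (MPa) from the fitted line."""
    return INTERCEPT + SLOPE * cement_kg_per_m3


# Hypothetical cement contents, chosen only for illustration.
for cement in (0, 100, 300):
    print(cement, round(predict_strength(cement), 3))
```

Raising the input by one unit always raises the prediction by the slope, which is exactly the interpretation given for 0.08.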

The R^{2} is 0.248: 24.8% of the variation in concrete compressive strength can be explained by its cement content.
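To make the meaning of R^{2} concrete, the sketch below computes it from scratch as 1 - SSR/SST for a simple regression. The data points are invented for the example and are not taken from the UCI dataset, so the printed value will not match 0.248; the helper names are also just illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def r_squared(xs, ys):
    """Share of the variation in y explained by the fitted line: 1 - SSR/SST."""
    b0, b1 = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ssr = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))  # unexplained
    sst = sum((y - mean_y) ** 2 for y in ys)                     # total
    return 1.0 - ssr / sst


# Made-up (cement, strength) pairs, NOT taken from the UCI dataset.
cement = [150.0, 200.0, 250.0, 300.0, 350.0, 400.0]
strength = [30.0, 22.0, 41.0, 28.0, 52.0, 35.0]
print(round(r_squared(cement, strength), 3))
```

A value near 1 means the points lie close to the line; a value like 0.248 means most of the variation in strength is left unexplained by cement alone.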

The result of the t test is also presented in the first table. Since the reported sig value is 0 (the p-value rounds to zero at the displayed precision), x is statistically significant in this regression model.


**(4) Assumptions:**


SLR.1: The model is linear in the parameters, because the model has the form y = β_{0} + β_{1}*x + ε.

SLR.2: The dataset is treated as collected through simple random sampling, since it is neither time series data nor data with selection bias in the sampling.

SLR.3: There is sample variation in the independent variable, since the cement values in the sample are not all equal.

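SLR.4: The error term has zero mean, since the mean of the unstandardized residuals is 0 according to the table below. (With an intercept in the model, OLS residuals average to zero by construction, so this is consistent with the assumption rather than a test of it.)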

SLR.5: The error term is uncorrelated with the independent variable and with any function of the independent variable. Unfortunately, this assumption may not hold in this model: ε may include the content of other components of the concrete, which may be related to the cement content.
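The concern in SLR.5 can be written out with the standard omitted-variable result. The symbols z (the content of some other component), β_{2}, and u are introduced here only for illustration, and u is assumed to be uncorrelated with x:

```latex
y = \beta_0 + \beta_1 x + \beta_2 z + u,
\qquad \varepsilon = \beta_2 z + u

\operatorname{plim} \hat{\beta}_1
  = \beta_1 + \beta_2 \, \frac{\operatorname{Cov}(x, z)}{\operatorname{Var}(x)}
```

If z affects strength (β_{2} ≠ 0) and is correlated with cement content, the estimated slope does not converge to β_{1}.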

SLR.6: The error term has constant variance. Unfortunately, this assumption may not hold in this model: the points in the scatter plot spread out more widely as x becomes larger, which indicates heteroskedasticity.
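The fanning-out pattern described in SLR.6 can also be checked numerically. The sketch below is an informal comparison, not a formal test: it fits a line, splits the observations into a low-x half and a high-x half, and compares the mean squared residual in each. The data are simulated with noise that grows with x by design; they are not the UCI dataset, and the function names are illustrative.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def residual_spread_by_half(xs, ys):
    """Mean squared residual for the low-x half and the high-x half."""
    b0, b1 = fit_line(xs, ys)
    pairs = sorted(zip(xs, ys))  # order observations by x
    half = len(pairs) // 2

    def mean_sq_resid(chunk):
        return sum((y - (b0 + b1 * x)) ** 2 for x, y in chunk) / len(chunk)

    return mean_sq_resid(pairs[:half]), mean_sq_resid(pairs[half:])


# Simulated data, NOT the UCI dataset: the noise standard deviation is 0.02 * x.
rng = random.Random(0)
xs = [100.0 + 4.0 * i for i in range(100)]
ys = [13.443 + 0.08 * x + rng.gauss(0.0, 0.02 * x) for x in xs]

low, high = residual_spread_by_half(xs, ys)
print("low-x half:", round(low, 2), "high-x half:", round(high, 2))
```

A much larger value for the high-x half than for the low-x half points in the same direction as the visual impression from the scatter plot.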

**(5) Conclusion:**

From this regression model and the scatter plot, we find that concrete compressive strength has a positive relationship with cement content. This result can be used when the compressive strength of concrete must be adjusted to meet the strength requirements of different buildings: we can change the compressive strength by changing the cement content. But to do this more precisely, adding more independent variables (such as components other than cement) to the model may help.