Hello Friends,

In the last video on Multivariate Analysis, we saw an introduction to multivariate analysis, some of the important concepts used in it, and an overview of the various tools and techniques that are part of it. In this video, we are going to learn the first tool in multivariate analysis, in Minitab software, with the help of a practical example for easy understanding and better clarity.

So, let’s begin…

Principal Components Analysis:
Principal components analysis is used to identify a smaller number of uncorrelated variables, called "principal components", from a large set of data. With this analysis, you create new variables (principal components) that are linear combinations of the observed variables. The goal of principal components analysis is to explain the maximum amount of variance with the fewest principal components.

For example, a bank requires eight pieces of information from loan applicants: income, education level, age, length of time at current residence, length of time with current employer, savings, debt, and the number of credit cards. A bank administrator wants to analyze this data to determine the best way to group and report it. The administrator collects this information for 30 loan applicants. Here, the administrator performs a principal components analysis to reduce the number of variables and make the data easier to analyze. The administrator wants enough components to explain at least 90% of the variation in the data.

Data considerations for Principal Components Analysis:
To ensure that your results are valid, consider the following guidelines when you collect data, perform the analysis, and interpret your results. For principal components analysis, the data requirements are simple: you should have at least two variables, and the measurements for each variable should be recorded in separate numeric columns.

Example of Principal Components Analysis:
Let’s continue with the same
example. As before, the bank administrator has data on eight variables for 30 loan applicants and wants enough principal components to explain at least 90% of the variation in the data.

Conduct Principal Component Analysis (PCA) in Minitab:
To conduct a principal component analysis in Minitab, please follow these steps:

1. Enter or copy the data into a Minitab worksheet, with the data for each variable in its own column, as shown in the picture.

2. Select Stat > Multivariate > Principal Components.

3. In Variables, enter C1-C8.

4. In Number of components to compute, enter the number of principal components that you want Minitab to calculate. If you have a large number of variables, you may want to specify a smaller number of components to reduce the amount of output. If you do not know how many components to enter, you can leave this field blank, as we do in this example.

5. In Type of Matrix, select the type of matrix to use to calculate the principal components. For this example, keep the default selection, Correlation.
• Correlation: Use this when your variables have different scales and you want to weight all the variables equally. Our example falls in this category.
• Covariance: Use this when your variables use the same scale, or when your variables have different scales but you want to give more emphasis to variables with higher variances.

6. From Graphs, select the graphs you want to see for the analysis.
• Scree plot: Use a scree plot to identify the number of components that explain most of the variation
in the data.
• Score plot for the first 2 components: Use the score plot to look for clusters, trends, and outliers in the first two principal components.
• Loading plot for the first 2 components: Use the loading plot to visually interpret the first two principal components.
• Biplot for the first 2 components: Use the biplot to look for clusters, trends, and outliers while interpreting the first two principal components. The biplot overlays the score plot and the loading plot on the same graph.
• Outlier plot: Use the outlier plot to identify outliers in the data.

7. Click OK in each dialog box to get the results.

We will get the results of the analysis in the Session window and in the Graph window.

Interpretation of Results:
In these results, use the cumulative proportion to determine the amount of variance that the principal components explain. Retain the principal components that explain an acceptable level of variance. The acceptable level depends on your application. For descriptive purposes, you may only need 80% of the variance explained. However, if you want to perform other analyses on the data, you may want to have at least 90% of the variance explained by the principal components. This is the case in our example. The first four principal components explain 90.7% of the variation in the data. Therefore, the administrator decides to use these components to analyze loan applicants.

You can also use the size of the eigenvalue to determine the number of principal components. Retain the principal components with the largest eigenvalues, i.e.
greater than 1.

The scree plot orders the eigenvalues from largest to smallest. The ideal pattern is a steep curve, followed by a bend, and then a straight line. Use the components in the steep curve before the first point that starts the line trend.

The loading plot visually shows the results for the first two components. Age, Residence, Employ, and Savings have large positive loadings on component 1, so this component measures long-term financial stability. Debt and Credit Cards have large negative loadings on component 2, so this component primarily measures an applicant's credit history.

Use the outlier plot to identify outliers. Any point that is above the reference line is an outlier. Outliers can significantly affect the results of your analysis. In these results, there are no outliers: all the points are below the reference line.

The first principal component accounts for 44.3% of the total variance. The variables that correlate the most with the first principal component (PC1) are Age (0.484), Residence (0.466), Employ (0.459), and Savings (0.404). The first principal component is positively correlated with all four of these variables. Therefore, increasing values of Age, Residence, Employ, and Savings increase the value of the first principal component.
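
If you would like to see the arithmetic behind these results outside Minitab, here is a minimal Python sketch of a correlation-matrix PCA. It is only an illustration under stated assumptions: it uses NumPy rather than Minitab, the 30 x 8 data set is randomly generated (it is not the bank's loan data, so the numbers will not match the video), and the function names `pca_correlation`, `components_for_target`, and `components_by_eigenvalue` are hypothetical helpers made up for this example. Also note that the sign of each eigenvector is arbitrary, so coefficients can come out with the opposite sign to those shown by another program.

```python
import numpy as np


def pca_correlation(data):
    """PCA on the correlation matrix (rows = observations, columns = variables)."""
    data = np.asarray(data, dtype=float)
    # Correlation matrix: every variable gets equal weight regardless of scale.
    corr = np.corrcoef(data, rowvar=False)
    # eigh is meant for symmetric matrices; it returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1]          # largest eigenvalue first
    eigenvalues = eigenvalues[order]
    coefficients = eigenvectors[:, order]          # one column per component
    proportion = eigenvalues / eigenvalues.sum()   # share of variance per component
    cumulative = np.cumsum(proportion)             # cumulative proportion
    # Scores are the standardized data projected onto the components.
    standardized = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    scores = standardized @ coefficients
    return eigenvalues, coefficients, proportion, cumulative, scores


def components_for_target(cumulative, target=0.90):
    """Smallest number of components whose cumulative proportion reaches target."""
    k = int(np.searchsorted(cumulative, target)) + 1
    return min(k, len(cumulative))


def components_by_eigenvalue(eigenvalues, cutoff=1.0):
    """Number of components with an eigenvalue greater than the cutoff."""
    return int(np.sum(eigenvalues > cutoff))


# Synthetic stand-in data (an assumption): 30 "applicants", 8 variables
# driven by 3 hidden factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(30, 3))
mixing = rng.normal(size=(3, 8))
data = latent @ mixing + 0.5 * rng.normal(size=(30, 8))

eigenvalues, coefficients, proportion, cumulative, scores = pca_correlation(data)
n_keep = components_for_target(cumulative, target=0.90)
n_large = components_by_eigenvalue(eigenvalues)

print("Eigenvalues:          ", np.round(eigenvalues, 3))
print("Cumulative proportion:", np.round(cumulative, 3))
print("Components needed for 90%:", n_keep)
print("Components with eigenvalue > 1:", n_large)
```

The two helper functions mirror the two rules from the video: keep enough components to reach the target cumulative proportion (90% in our example), or keep the components whose eigenvalue is greater than 1. The two rules do not always agree, so the choice still depends on your application.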