The first objective is to combine those files and stack them as three large files, one for each time period. Run basic EDA and descriptive statistics on some columns and clean any obvious outliers from each time period. Make sure that no more than 1% of the data are removed from within each time period in this process. Clearly write the details of outlier detection and descriptive analysis.
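One way to respect the 1% limit is to rank rows by how extreme they are and drop only as many as the budget allows. The sketch below is a minimal illustration, not a prescribed method: it assumes pandas and NumPy, uses a robust z-score (median and MAD) as the outlier rule, and runs on synthetic data. The function name `trim_outliers`, the cutoff `z_cut=5.0`, and the synthetic frames are all hypothetical; only the column name RND_ADJ_FE comes from the assignment.

```python
import numpy as np
import pandas as pd


def trim_outliers(df, cols, z_cut=5.0, max_frac=0.01):
    """Drop the most extreme rows, never more than max_frac of the data.

    Assumes df has a unique index (e.g. built with ignore_index=True).
    """
    x = df[cols].astype(float)
    med = x.median()
    # Median absolute deviation per column; zero MAD becomes NaN to avoid dividing by zero
    mad = (x - med).abs().median().replace(0, np.nan)
    # Robust z-score per cell, then the worst score in each row
    score = (0.6745 * (x - med) / mad).abs().max(axis=1).fillna(0.0)
    # Largest number of rows that may be removed under the cap
    budget = int(np.floor(max_frac * len(df)))
    flagged = score[score > z_cut].sort_values(ascending=False)
    drop_idx = flagged.index[:budget]
    return df.drop(index=drop_idx), len(drop_idx)


# Synthetic stand-in for two files belonging to one time period
rng = np.random.default_rng(0)
part_a = pd.DataFrame({"RND_ADJ_FE": rng.normal(30, 5, 500)})
part_b = pd.DataFrame({"RND_ADJ_FE": rng.normal(30, 5, 500)})
part_b.loc[:19, "RND_ADJ_FE"] = 500.0  # 20 implausible values

# Stack the files into one frame for the period
period = pd.concat([part_a, part_b], ignore_index=True)

trimmed, removed = trim_outliers(period, ["RND_ADJ_FE"])
print(f"removed {removed} of {len(period)} rows ({removed / len(period):.2%})")
print(trimmed["RND_ADJ_FE"].describe())
```

Because the cap is applied after ranking, the rule removes the worst rows first when more candidates are flagged than the budget permits. Reporting `removed / len(period)` for each time period gives a direct record that the 1% limit was met.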
This section is further broken down into two parts:
Part A: (30 points)
There are several numeric columns listed in the datasets. Many of those columns are correlated and contain repetitive information. Use the dimension reduction tools learnt during the course to condense the columns into a smaller number of dimensions, for each time period separately.
Use the reduced dimensions to perform “grouping” of similar vehicles. Keep the number of groups between 5 and 8 for each time period. Clearly define the groups based on their characteristics by running descriptive analytics on each group. Then compare the groups across the three time periods and point out any vehicles that jumped from one group to another over time. Also explain what each jump means in your own words.
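The two steps of Part A can be sketched as follows for a single time period. This is only one possible reading of the task: it assumes scikit-learn, takes PCA as the dimension reduction tool and k-means as the grouping method (the assignment names neither), and picks the number of groups in the 5 to 8 range by silhouette score. The function name `reduce_and_group`, the 90% variance threshold, and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def reduce_and_group(X, var_kept=0.90, k_range=range(5, 9), seed=0):
    """Standardize, reduce with PCA, then pick k in k_range by silhouette score."""
    Z = StandardScaler().fit_transform(X)
    # Keep as many components as needed to explain var_kept of the variance
    pca = PCA(n_components=var_kept, svd_solver="full")
    scores = pca.fit_transform(Z)
    best = None
    for k in k_range:
        km = KMeans(n_clusters=k, n_init=10, random_state=seed)
        labels = km.fit_predict(scores)
        sil = silhouette_score(scores, labels)
        if best is None or sil > best[0]:
            best = (sil, k, labels)
    return scores, best[1], best[2]


# Synthetic stand-in: 8 correlated numeric columns driven by 2 latent factors
rng = np.random.default_rng(0)
centers = rng.uniform(-10, 10, size=(6, 2))
latent = np.vstack([c + rng.normal(0, 1, size=(50, 2)) for c in centers])
mixing = rng.normal(size=(2, 8))
X = latent @ mixing + rng.normal(0, 0.05, size=(300, 8))

scores, k, labels = reduce_and_group(X)
print(f"{X.shape[1]} columns reduced to {scores.shape[1]} components, k = {k}")
```

Cluster labels from separate fits are arbitrary, so label 3 in one time period need not correspond to label 3 in another. Before looking for vehicles that changed group, the groups would have to be matched across periods by their descriptive profiles, for example by comparing per-group means of the original columns.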
Part B: (30 points)
This part is about predictive modeling where you are asked to try several modeling techniques separately for each time period. You will then compare the results from the best models for each time period.
The response variable for this problem is fuel economy in miles per gallon (column name: RND_ADJ_FE). You will create the best predictive model of miles per gallon for each time period. You will then compare those models in terms of their predictors and accuracy (R², MSE, etc.) and describe the results in your own words.
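A comparison of this kind could be organized as below. The sketch is illustrative only: it assumes scikit-learn, tries just two candidate techniques (linear regression and a random forest), and scores them with 5-fold cross-validated R² and MSE. The predictor columns `displacement` and `weight`, the period labels, and the synthetic data are hypothetical; only the response column RND_ADJ_FE comes from the assignment.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_validate


def compare_models(df, target="RND_ADJ_FE", seed=0):
    """Return cross-validated R^2 and MSE per model, best R^2 first."""
    X = df.drop(columns=[target])
    y = df[target]
    models = {
        "linear": LinearRegression(),
        "random_forest": RandomForestRegressor(n_estimators=50, random_state=seed),
    }
    cv = KFold(n_splits=5, shuffle=True, random_state=seed)
    rows = []
    for name, model in models.items():
        res = cross_validate(model, X, y, cv=cv,
                             scoring=("r2", "neg_mean_squared_error"))
        rows.append({
            "model": name,
            "r2": res["test_r2"].mean(),
            # scikit-learn reports negated MSE, so flip the sign back
            "mse": -res["test_neg_mean_squared_error"].mean(),
        })
    out = pd.DataFrame(rows).sort_values("r2", ascending=False)
    return out.reset_index(drop=True)


def make_period(seed, n=200):
    """Synthetic stand-in for one cleaned time period (hypothetical predictors)."""
    rng = np.random.default_rng(seed)
    disp = rng.uniform(1, 6, n)
    weight = rng.uniform(2000, 5000, n)
    mpg = 45 - 3 * disp - 0.004 * weight + rng.normal(0, 1, n)
    return pd.DataFrame({"displacement": disp, "weight": weight, "RND_ADJ_FE": mpg})


periods = {f"period_{i}": make_period(i) for i in (1, 2, 3)}
results = {name: compare_models(df) for name, df in periods.items()}
for name, table in results.items():
    print(name)
    print(table.to_string(index=False))
```

The first row of each table is the best model for that period under this scoring, which gives a common basis for comparing accuracy across the three periods. Comparing predictors would need an additional step, such as inspecting coefficients or feature importances of each winning model.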
This is not a separate section of the analysis, but you are required to create several visual depictions of the analysis in both descriptive statistics and modeling parts of the report. Your grade will depend upon the uniqueness and description of the visuals.