Exploratory Data Analysis and Decision Tree Analysis

Referencing style: Harvard

Task 1 Exploratory Data Analysis and Decision Tree Analysis (Worth 30 Marks)

a) Assignment 2 requires that you research and critically evaluate the literature on the problem of effectively assessing loan applications for creditworthiness. Creditworthiness assessment reduces the risks associated with lending by determining which loan applications are a good credit risk and which are a poor credit risk, and on that basis which should be approved or rejected. Good risk management of loan applications can significantly improve the bottom line of financial institutions such as banks, building societies and credit unions. This research will inform your assessment of the key variables in the credit data set provided for Assignment 2 (about 250 words).

b) Using RapidMiner Studio, conduct an exploratory data analysis of creditdata.csv to identify your five top variables, and build a decision tree model for predicting the credit score of customers. Present and discuss the results of your exploratory data analysis and decision tree analysis (about 250 words).

Then, using the RapidMiner Studio data mining tool, build a simple predictive model of credit risk with a Decision Tree on a reduced creditdata.csv data set. Discuss each of your five top variables in about 50 words in terms of the results of your exploratory data analysis. Discuss the results of your decision tree analysis, drawing on the key outputs from RapidMiner Studio and on the relevant supporting literature on credit assessment and on the interpretation of decision trees. Your discussion should also include appropriate statistical analysis results, such as graphs and results tables from your exploratory data analysis in RapidMiner, with some supporting references on predictive model building and interpretation using Decision Trees in data mining.
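When interpreting the decision tree in Task 1 b), it can help to see how a tree chooses which variable to split on first. The sketch below is an illustration only, written in Python rather than RapidMiner, and it uses information gain, which is one common split criterion (RapidMiner may apply a different criterion depending on how the operator is configured). The records, the attribute names `checking_status` and `employment`, and the labels are hypothetical and are not taken from creditdata.csv.

```python
from collections import Counter
from math import log2

# Hypothetical toy records, NOT taken from creditdata.csv.
# Each record is (checking_status, employment, credit_risk).
RECORDS = [
    ("low", "short", "bad"),
    ("low", "long", "bad"),
    ("low", "short", "bad"),
    ("low", "long", "bad"),
    ("high", "long", "good"),
    ("high", "short", "good"),
    ("high", "long", "good"),
    ("high", "short", "bad"),
]
ATTRIBUTES = {"checking_status": 0, "employment": 1}  # assumed column positions
LABEL_INDEX = 2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(records, attr_index):
    """Reduction in label entropy obtained by splitting on one attribute."""
    labels = [r[LABEL_INDEX] for r in records]
    groups = {}
    for r in records:
        groups.setdefault(r[attr_index], []).append(r[LABEL_INDEX])
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / len(records) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


gains = {name: information_gain(RECORDS, idx) for name, idx in ATTRIBUTES.items()}
best_attribute = max(gains, key=gains.get)

for name, gain in gains.items():
    print(f"{name}: information gain = {gain:.4f}")
print("First split would be on:", best_attribute)
```

In this toy data the attribute with the larger gain separates good and bad risks more cleanly, which is the same reasoning to apply when discussing why a particular variable appears at the root of the tree produced by RapidMiner.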
Task 2 Data Warehousing and Big Data (Worth 35 Marks)

A data warehouse is the foundation of any Business Intelligence or Business Analytics initiative. Consider the following scenario: a large local government consists of seven departments, with many different data sets residing in each department. They want high level advice on the logical design of a data warehouse that would incorporate big data analytics.

a) Discuss the possible approaches that could be used for designing a data warehouse architecture using Kimball's or Inmon’s methodology, and provide a high level logical design of a data warehouse in a diagram (about 750 words).

b) Discuss how your high level warehouse architecture design in part a) could incorporate the capture, processing, storage and presentation of big data. Your answer should focus on explaining a revised high level diagrammatic representation of the logical design of your data warehouse that shows how big data analytics would be incorporated or integrated in the design (about 750 words).

Task 3 Sales Reports using Tableau Desktop (Worth 25 Marks)

Task 3 consists of the following sub tasks. Using the Excel file SalesSuperstore.xlsx, provided on the course study desk Assignment 2 Folder link, and Tableau Desktop, produce the following four reports with appropriate accompanying graphs, each based on a Tableau workbook sheet view. Briefly comment on each report in about 125 words in terms of what trends and patterns are apparent in it.

The SalesSuperstore.xlsx file contains the following dimensions and information:
1. Customer Name, Customer Segment
2. Location: Region, State, City, Zip code
3. Product Category, Sub Category, Product Name, Product Container, Unit Price
4. Order Information
5. Shipping Information
6. Sales Information
7. Profit

a) Create a report and accompanying graph using Tableau that shows a trend analysis for sales by Product Category over the years 2009 to 2012, and comment on key trends and patterns apparent in this report (about 125 words).

b) Create a report and accompanying graph using Tableau that shows, for each Product Category, Average Profit and Total Sales for each month over the years 2009 to 2012, and comment on key trends and patterns apparent in this report (about 125 words).

c) Create a geographical map presentation using Tableau that shows graphically the relative size of Product Sales by City within each State for the year 2010, and comment on key trends and patterns in this report (about 125 words).

d) Create a report and accompanying graph using Tableau that shows, for Product Sub Categories that are technology based, Unit Prices, Sales and Profit for each month over the years 2009 to 2012, and comment on key trends and patterns in this report (about 125 words).
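As a way of checking the figures behind a report such as Task 3 b), the sketch below shows the underlying aggregation: total sales and average profit per Product Category per month. It is an illustration in Python, not a Tableau procedure, and the rows, the tuple layout and the function name `monthly_summary` are hypothetical; none of the values come from SalesSuperstore.xlsx.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows, NOT from SalesSuperstore.xlsx.
# Each row is (order_date, product_category, sales, profit).
ROWS = [
    (date(2009, 1, 5), "Technology", 200.0, 40.0),
    (date(2009, 1, 20), "Technology", 100.0, 10.0),
    (date(2009, 1, 11), "Furniture", 300.0, -30.0),
    (date(2009, 2, 3), "Technology", 50.0, 5.0),
]


def monthly_summary(rows):
    """Total sales and average profit per (category, year, month)."""
    totals = defaultdict(lambda: {"sales": 0.0, "profit": 0.0, "count": 0})
    for order_date, category, sales, profit in rows:
        key = (category, order_date.year, order_date.month)
        totals[key]["sales"] += sales
        totals[key]["profit"] += profit
        totals[key]["count"] += 1
    # Average profit is the mean over order rows within each group.
    return {
        key: {"total_sales": v["sales"], "avg_profit": v["profit"] / v["count"]}
        for key, v in sorted(totals.items())
    }


summary = monthly_summary(ROWS)
for (category, year, month), stats in summary.items():
    print(category, f"{year}-{month:02d}", stats)
```

Note that average profit here is computed per order row within each month, so a month with a few large losses can show a negative average even when total sales are high; that distinction is worth keeping in mind when commenting on the report.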
