Student Guidelines
Assessment 1
Research Study & Presentation
Due: 22 December 2019 – 11:59 pm
Total Weightage: 20%
Individual assignment
Python is one of the most widely used programming languages in many fields, particularly data science, and it
is a common tool for big data work.
The assignment has two phases: 1) writing a report and 2) presenting your findings using Python code.
1. Report (Weightage: 10%)
Choose data:
Choose a dataset from the Kaggle website or a government open data source. You
can also use Twitter data, which you can download using the Python Tweepy package.
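As a starting point, the short sketch below shows one way to load a dataset into a Colab notebook and take a first look at it, assuming the data is available as a CSV file. The column names and values are made up for illustration; in practice you would pass the path of your own uploaded file to pd.read_csv.

```python
import io

import pandas as pd

# Hypothetical stand-in for a downloaded file. In Colab you would normally
# pass the path of your uploaded CSV to pd.read_csv instead of this text.
SAMPLE_CSV = """city,year,population
Alpha,2017,1200
Alpha,2018,1250
Beta,2017,800
Beta,2018,
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# First look: size of the table, column types, and missing values per column.
print(df.shape)
print(df.dtypes)
print(df.isna().sum())
```

Checking the shape, types, and missing values first helps you judge whether the dataset is large and clean enough for the analysis you have in mind.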
Find out what you can do with the data or what kind of decisions it can support. First (Step 1), perform an
exploratory data analysis on the data you have gathered. Exploratory data analysis is an approach to
analysing data sets that summarises their main characteristics, often with visual methods. Then (Step 2), build a
machine learning model on top of your data and make the necessary recommendations.
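The two steps above can be sketched roughly as follows. This is only a minimal illustration on synthetic data: the columns hours and score, the choice of linear regression, and the 75/25 train/test split are all assumptions made for the example, not requirements of the assignment. Your own data and question will determine which model is appropriate.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic, made-up data for illustration only (not a real dataset).
rng = np.random.default_rng(0)
hours = rng.uniform(0, 10, size=200)
score = 40 + 5 * hours + rng.normal(0, 2, size=200)
data = pd.DataFrame({"hours": hours, "score": score})

# Step 1: exploratory data analysis (summary statistics and correlation).
summary = data.describe()
correlation = data["hours"].corr(data["score"])
print(summary)
print("Correlation between hours and score:", correlation)

# Step 2: fit a simple model and evaluate it on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    data[["hours"]], data["score"], test_size=0.25, random_state=0
)
model = LinearRegression().fit(X_train, y_train)
r2 = r2_score(y_test, model.predict(X_test))
print("Estimated slope:", model.coef_[0])
print("R^2 on the test set:", r2)
```

Holding back a test set gives a more honest estimate of how well the model generalises, which is useful evidence when you justify your recommendations.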
Python implementation:
To be consistent across all students, the implementation must be done in Google Colab.
Colab is a free notebook environment that requires no setup and runs entirely in the cloud. You need to log in to
Colab and write your Python code for analysing the data. In your report, include a screenshot of your Google Colab
account showing your name; to display it, click the orange button in the top-right corner.
Your report should be 1500-2000 words and address the following: information on the data and why it is
important, a literature review on the data and the methodology you will use, what you are going to solve
and how, plots, and recommendations. The report should include at least 4-6 plots (screenshots) from your findings,
each with an explanation.
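One possible way to produce plots for the report is sketched below with matplotlib. The data is synthetic, and the file name eda_plots.png and the choice of a histogram and a scatter plot are assumptions for illustration only. In Colab the figure also appears under the cell, so you can either take a screenshot or use the saved image.

```python
import os

import matplotlib.pyplot as plt
import numpy as np

# Synthetic, made-up data for illustration only.
rng = np.random.default_rng(1)
hours = rng.uniform(0, 10, size=200)
score = 40 + 5 * hours + rng.normal(0, 2, size=200)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# Plot 1: distribution of a single variable.
axes[0].hist(score, bins=20)
axes[0].set_title("Distribution of score")
axes[0].set_xlabel("score")
axes[0].set_ylabel("count")

# Plot 2: relationship between two variables.
axes[1].scatter(hours, score, s=10)
axes[1].set_title("Score against hours")
axes[1].set_xlabel("hours")
axes[1].set_ylabel("score")

fig.tight_layout()

# Hypothetical output file name; choose any name you like.
output_path = "eda_plots.png"
fig.savefig(output_path, dpi=150)
print("Saved plot to", os.path.abspath(output_path))
```

Labelled axes and titles make each plot easier to explain in the report.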
2. Presentation (Weightage: 10%)
The presentation should last a maximum of 10 minutes. It must cover the research report, the research findings and
visualisations, and a step-by-step discussion of how you carried out the project.

