STAT4002: Basic Data Analysis

Question 1

This question uses the PULSE data that you used in Practical 2.

a) Describe the type of each of the following variables.
i. Pulse 1
ii. Ran
iii. Activity

b) What is the mean weight of the females?

c) What is the interquartile range of the heights of the males?

d) Examine boxplots of Pulse 1 for men and women and make two comments (about two different aspects).

e) Examine the boxplots of Pulse 1 for the different activity levels (ignore the 0 value). What conclusion can you draw?

f) Compare the average weight of smokers and non-smokers. What can be concluded?

g) Look at dotplots of Pulse 2 for the two values of the Ran variable. Write down two comments, one for each value of the Ran variable.

h) Create a single small image (a single Minitab output) that contains two separate histograms of Weight, one for males and one for females, so that someone could use it to compare the weights of males and females.

Question 2

a) Do you agree with this sentence? “If something has a probability of 0.7 it can be expected to happen about 7 times as often as its opposite”. Explain your answer.

b) A standard pack of 52 cards is shuffled. You deal one card at a time until a seven turns up. You have gone through 8 cards and still not seen a seven. What is the chance of getting a seven on the 9th card?

c) Draw small-scale tree diagrams to work out each of the following:

i. The probability of obtaining two tails with two coins.
ii. The probability of obtaining four heads with four coins.
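
As a cross-check on hand-drawn tree diagrams, the sketch below lists every root-to-leaf path of the tree for n fair coins and counts the favourable ones. It assumes fair, independent coins, which the question implies but does not state; the function name `probability_all` is illustrative only. The last line applies the same counting idea to part b, assuming that the 8 cards already dealt contained no seven, so all 4 sevens are among the 44 remaining cards.

```python
from fractions import Fraction
from itertools import product


def probability_all(face, n_coins):
    """Probability that every one of n_coins fair coins shows `face`.

    Each tuple produced by product() is one complete path through the
    tree diagram, and all paths are equally likely for fair coins.
    """
    outcomes = list(product("HT", repeat=n_coins))
    favourable = [path for path in outcomes if all(c == face for c in path)]
    return Fraction(len(favourable), len(outcomes))


two_tails = probability_all("T", 2)    # part c(i)
four_heads = probability_all("H", 4)   # part c(ii)

# Part b: 52 - 8 = 44 cards remain and all 4 sevens are still in the pack.
seven_on_ninth = Fraction(4, 52 - 8)

print(two_tails, four_heads, seven_on_ninth)
```

Exact fractions are used instead of floats so that the results can be compared directly with the products read off the branches of a tree diagram.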

Question 3

The Environment Protection Agency has developed a testing programme to monitor petrol vehicle emission levels of several pollutants. Data collected under a variety of conditions suggests that the Normal curve provides an adequate approximation for the distribution of the key variable of interest, the amount of oxides of nitrogen (g/mile) emitted by a vehicle. The mean emission has been found to be 1.7 g/mile, with a standard deviation of 0.48 g/mile.

a) Calculate the probability that a randomly selected petrol vehicle emits more than 0.88 g/mile.

b) Calculate the proportion of petrol vehicles that are expected to emit between 1.0 and 2.0 g/mile.

c) Calculate the upper permissible limit on vehicle pollution, defined by the agency to be the amount that only 1% of petrol vehicles exceed.

d) An ecological monitoring group proposes that a progressive pollution tax is introduced as follows. For each vehicle, the pollution level should be measured and the appropriate taxes would be payable:
below 1.0 g/mile no tax
between 1.0 and 1.6 g/mile £50
between 1.6 and 2.2 g/mile £100
between 2.2 and 2.8 g/mile £250
above 2.8 g/mile £2000

Assuming there are 15 million petrol vehicles in the UK, and their emissions follow the above model, how much tax would be raised in total if this proposal was implemented?
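
The calculations in parts a) to d) can be checked numerically. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later), assuming the Normal model with mean 1.7 and standard deviation 0.48 stated above. The variable names and the representation of the tax bands as `(lower, upper, tax)` tuples are choices made for this illustration, with `None` marking an open-ended band.

```python
from statistics import NormalDist

# Model stated in the question: emissions ~ Normal(mean 1.7, sd 0.48) g/mile.
emissions = NormalDist(mu=1.7, sigma=0.48)

# a) P(X > 0.88)
p_above = 1 - emissions.cdf(0.88)

# b) P(1.0 < X < 2.0)
p_between = emissions.cdf(2.0) - emissions.cdf(1.0)

# c) Level exceeded by only 1% of vehicles, i.e. the 99th percentile.
upper_limit = emissions.inv_cdf(0.99)

# d) Tax bands as (lower, upper, tax in pounds); None means no bound.
bands = [
    (None, 1.0, 0),
    (1.0, 1.6, 50),
    (1.6, 2.2, 100),
    (2.2, 2.8, 250),
    (2.8, None, 2000),
]


def band_probability(lower, upper):
    """Probability that a vehicle's emission falls inside one band."""
    low_cdf = 0.0 if lower is None else emissions.cdf(lower)
    high_cdf = 1.0 if upper is None else emissions.cdf(upper)
    return high_cdf - low_cdf


band_probs = [band_probability(lo, hi) for lo, hi, _ in bands]
tax_per_vehicle = sum(p * tax for p, (_, _, tax) in zip(band_probs, bands))
total_tax = 15_000_000 * tax_per_vehicle

print(f"a) {p_above:.4f}  b) {p_between:.4f}  c) {upper_limit:.3f} g/mile")
print(f"d) expected total tax: £{total_tax:,.0f}")
```

Part d) is treated as an expected value: each band's tax is weighted by the proportion of vehicles expected to fall in that band, and the result is scaled up to 15 million vehicles. Values from printed Normal tables may differ slightly because of rounding of z-scores.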

Question 4

It was claimed that 60% of people wanted a regulation to ban mobile phone usage on public transport. 84 people, as far as possible selected randomly from the population, were asked if they supported this proposed rule. 39 were in agreement.

a) Construct a 99% confidence interval for the proportion of people wanting to change the law.

b) Does your confidence interval support the claim? Explain your answer.
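
One common way to build the interval in part a) is the normal-approximation (Wald) interval, p̂ ± z·sqrt(p̂(1 − p̂)/n). The sketch below applies it to the 39 supporters out of 84 respondents. The choice of the Wald method is an assumption of this example, since the question does not name a method, and statistical software such as Minitab may report an exact interval that differs slightly.

```python
from math import sqrt
from statistics import NormalDist

n = 84           # people asked
agree = 39       # people in agreement
confidence = 0.99

p_hat = agree / n

# Two-sided critical value: leaves (1 - confidence) / 2 in each tail.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Standard error of a sample proportion under the normal approximation.
standard_error = sqrt(p_hat * (1 - p_hat) / n)

lower = p_hat - z * standard_error
upper = p_hat + z * standard_error

claimed = 0.60
print(f"99% CI: ({lower:.4f}, {upper:.4f})")
print("claimed value inside interval:", lower <= claimed <= upper)
```

For part b), the relevant check is whether the claimed proportion of 0.60 lies inside the interval, and how close it sits to either end.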