
SL Paper 2

A wind turbine is designed so that the rotation of the blades generates electricity. The turbine is built on horizontal ground and is made up of a vertical tower and three blades.

The point A is on the base of the tower directly below point B at the top of the tower. The height of the tower, AB, is 90 m. The blades of the turbine are centred at B and are each of length 40 m. This is shown in the following diagram.

The end of one of the blades of the turbine is represented by point C on the diagram. Let h be the height of C above the ground, measured in metres, where h varies as the blade rotates.

Find the

The blades of the turbine complete 12 rotations per minute under normal conditions, moving at a constant rate.

The height, h, of point C can be modelled by the following function. Time, t, is measured from the instant when the blade [BC] first passes [AB] and is measured in seconds.

h(t) = 90 − 40 cos(72t°), t ≥ 0

Looking through his window, Tim has a partial view of the rotating wind turbine. The position of his window means that he cannot see any part of the wind turbine that is more than 100 m above the ground. This is illustrated in the following diagram.

maximum value of h.

[1]
a.i.

minimum value of h.

[1]
a.ii.

Find the time, in seconds, it takes for the blade [BC] to make one complete rotation under these conditions.

[1]
b.i.

Calculate the angle, in degrees, that the blade [BC] turns through in one second.

[2]
b.ii.

Write down the amplitude of the function.

[1]
c.i.

Find the period of the function.

[1]
c.ii.

Sketch the function h(t) for 0 ≤ t ≤ 5, clearly labelling the coordinates of the maximum and minimum points.

[3]
d.

Find the height of C above the ground when t=2.

[2]
e.i.

Find the time, in seconds, that point C is above a height of 100 m, during each complete rotation.

[3]
e.ii.

At any given instant, find the probability that point C is visible from Tim’s window.

[3]
f.i.

The wind speed increases. The blades rotate at twice the speed, but still at a constant rate.

At any given instant, find the probability that Tim can see point C from his window. Justify your answer.

[2]
f.ii.
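
The model in this question can be explored numerically. The following is a minimal Python sketch, not a mark scheme: the function names `height` and `seconds_above` are illustrative, and it assumes the model reads h(t) = 90 − 40 cos(72t°) with t in seconds.

```python
import math

TOWER_HEIGHT = 90.0     # height AB in metres (from the question)
BLADE_LENGTH = 40.0     # length BC in metres (from the question)
DEG_PER_SECOND = 72.0   # 12 rotations per minute = 360 * 12 / 60 degrees per second


def height(t: float) -> float:
    """Height of C above the ground, in metres, t seconds after [BC] passes [AB]."""
    return TOWER_HEIGHT - BLADE_LENGTH * math.cos(math.radians(DEG_PER_SECOND * t))


def seconds_above(limit: float) -> float:
    """Time per rotation, in seconds, for which C is higher than `limit` metres."""
    # h > limit  <=>  cos(angle) < (TOWER_HEIGHT - limit) / BLADE_LENGTH
    ratio = (TOWER_HEIGHT - limit) / BLADE_LENGTH
    ratio = max(-1.0, min(1.0, ratio))               # clamp for limits outside [50, 130]
    first_crossing = math.degrees(math.acos(ratio))  # angle at which C rises through the limit
    angle_above = 360.0 - 2.0 * first_crossing       # symmetric about the top of the circle
    return angle_above / DEG_PER_SECOND


period = 360.0 / DEG_PER_SECOND
hidden = seconds_above(100.0)
print(f"period = {period} s, h(2) = {height(2):.1f} m")
print(f"time above 100 m = {hidden:.2f} s, P(visible) = {1 - hidden / period:.3f}")
```

The fraction `hidden / period` equals the angle swept above 100 m divided by 360°, so it does not depend on `DEG_PER_SECOND`. This is one way to justify the answer when the rotation speed doubles.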



The following table shows values of ln x and ln y.

The relationship between ln x and ln y can be modelled by the regression equation ln y = a ln x + b.

Find the value of a and of b.

[3]
a.

Use the regression equation to estimate the value of y when x = 3.57.

[3]
b.

The relationship between x and y can be modelled using the formula y = kxⁿ, where k ≠ 0, n ≠ 0, n ≠ 1.

By expressing ln y in terms of ln x, find the value of n and of k.

[7]
c.
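
The link between the power model and the regression equation comes from taking logarithms of both sides. A short derivation, assuming x > 0 and k > 0 so that every logarithm is defined:

```latex
\begin{align*}
y &= k x^{n} \\
\ln y &= \ln k + n \ln x
\end{align*}
Comparing with $\ln y = a \ln x + b$ gives $n = a$ and $\ln k = b$, so $k = e^{b}$.
```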



In the month before their IB Diploma examinations, eight male students recorded the number of hours they spent on social media.

For each student, the number of hours spent on social media ( x ) and the number of IB Diploma points obtained ( y ) are shown in the following table.

N16/5/MATSD/SP2/ENG/TZ0/01

Use your graphic display calculator to find

Ten female students also recorded the number of hours they spent on social media in the month before their IB Diploma examinations. Each of these female students spent between 3 and 30 hours on social media.

The equation of the regression line y on x for these ten female students is

y = (2/3)x + 125/3.

An eleventh girl spent 34 hours on social media in the month before her IB Diploma examinations.

On graph paper, draw a scatter diagram for these data. Use a scale of 2 cm to represent 5 hours on the x -axis and 2 cm to represent 10 points on the y -axis.

[4]
a.

(i) x̄, the mean number of hours spent on social media;

(ii) ȳ, the mean number of IB Diploma points.

[2]
b.

Plot the point (x̄, ȳ) on your scatter diagram and label this point M.

[2]
c.

Write down the equation of the regression line y on x for these eight male students.

[2]
e.

Draw the regression line, from part (e), on your scatter diagram.

[2]
f.

Use the given equation of the regression line to estimate the number of IB Diploma points that this girl obtained.

[2]
g.

Write down a reason why this estimate is not reliable.

[1]
h.



A medical centre is testing patients for a certain disease. This disease occurs in 5% of the population.

They test every patient who comes to the centre on a particular day.

It is intended that if a patient has the disease, they test “positive”, and if a patient does not have the disease, they test “negative”.

However, the tests are not perfect, and only 99% of people who have the disease test positive. Also, 2% of people who do not have the disease test positive.

The tree diagram shows some of this information.

Write down the value of

Use the tree diagram to find the probability that a patient selected at random

The staff at the medical centre looked at the care received by all visiting patients on a randomly chosen day. All the patients received at least one of these services: they had medical tests (M), were seen by a nurse (N), or were seen by a doctor (D). It was found that:

State the sampling method being used.

[1]
a.

a.

[1]
b.i.

b.

[1]
b.ii.

c.

[1]
b.iii.

d.

[1]
b.iv.

will not have the disease and will test positive.

[2]
c.i.

will test negative.

[3]
c.ii.

has the disease given that they tested negative.

[3]
c.iii.

The medical centre finds the actual number of positive results in their sample is different than predicted by the tree diagram. Explain why this might be the case.

[1]
d.

Draw a Venn diagram to illustrate this information, placing all relevant information on the diagram.

[3]
e.

Find the total number of patients who visited the centre during this day.

[2]
f.
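
The tree-diagram parts of this question combine branch probabilities by multiplying along a path and adding across paths. A minimal Python sketch of that calculation follows; the function name `tree_probabilities` and its parameter names are illustrative, while the three input values are taken from the question.

```python
def tree_probabilities(prevalence: float, sensitivity: float, false_positive_rate: float) -> dict:
    """Combine the branches of a two-stage tree: disease status, then test result."""
    # Multiply along each path of the tree.
    p_disease_pos = prevalence * sensitivity
    p_disease_neg = prevalence * (1 - sensitivity)
    p_healthy_pos = (1 - prevalence) * false_positive_rate
    p_healthy_neg = (1 - prevalence) * (1 - false_positive_rate)
    # Add the paths that end in the same test result.
    p_positive = p_disease_pos + p_healthy_pos
    p_negative = p_disease_neg + p_healthy_neg
    return {
        "healthy_and_positive": p_healthy_pos,
        "positive": p_positive,
        "negative": p_negative,
        # Conditional probability: P(disease | negative) = P(disease and negative) / P(negative)
        "disease_given_negative": p_disease_neg / p_negative,
    }


result = tree_probabilities(prevalence=0.05, sensitivity=0.99, false_positive_rate=0.02)
for name, value in result.items():
    print(f"{name}: {value:.6f}")
```

The same function can be reused for any two-stage tree of this shape by changing the three inputs.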



A group of 800 students answered 40 questions on a category of their choice out of History, Science and Literature.

For each student the category and the number of correct answers, N , was recorded. The results obtained are represented in the following table.

N17/5/MATSD/SP2/ENG/TZ0/01

A χ² test at the 5% significance level is carried out on the results. The critical value for this test is 12.592.

State whether N is a discrete or a continuous variable.

[1]
a.

Write down, for N , the modal class;

[1]
b.i.

Write down, for N , the mid-interval value of the modal class.

[1]
b.ii.

Use your graphic display calculator to estimate the mean of N ;

[2]
c.i.

Use your graphic display calculator to estimate the standard deviation of N .

[1]
c.ii.

Find the expected frequency of students choosing the Science category and obtaining 31 to 40 correct answers.

[2]
d.

Write down the null hypothesis for this test;

[1]
e.i.

Write down the number of degrees of freedom.

[1]
e.ii.

Write down the p-value for the test;

[1]
f.i.

Write down the χ² statistic.

[2]
f.ii.

State the result of the test. Give a reason for your answer.

[2]
g.



Fiona walks from her house to a bus stop where she gets a bus to school. Her time, W minutes, to walk to the bus stop is normally distributed with W ~ N(12, 3²).

Fiona always leaves her house at 07:15. The first bus that she can get departs at 07:30.

The length of time, B minutes, of the bus journey to Fiona’s school is normally distributed with B ~ N(50, σ²). The probability that the bus journey takes less than 60 minutes is 0.941.

If Fiona misses the first bus, there is a second bus which departs at 07:45. She must arrive at school by 08:30 to be on time. Fiona will not arrive on time if she misses both buses. The variables W and B are independent.

Find the probability that it will take Fiona between 15 minutes and 30 minutes to walk to the bus stop.

[2]
a.

Find σ.

[3]
b.

Find the probability that the bus journey takes less than 45 minutes.

[2]
c.

Find the probability that Fiona will arrive on time.

[5]
d.

This year, Fiona will go to school on 183 days.

Calculate the number of days Fiona is expected to arrive on time.

[2]
e.
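
The parts of this question chain together: σ from part (b) feeds part (c), and both distributions feed part (d). A minimal Python sketch using the standard library's `statistics.NormalDist` is shown below. It assumes W ~ N(12, 3²) and B ~ N(50, σ²), and it reads "on time" as catching the 07:30 bus with a journey under 60 minutes, or catching the 07:45 bus with a journey under 45 minutes. The variable names are illustrative.

```python
from statistics import NormalDist

walk = NormalDist(mu=12, sigma=3)   # W ~ N(12, 3^2), from the question

# Part (a): P(15 < W < 30)
p_walk_15_to_30 = walk.cdf(30) - walk.cdf(15)

# Part (b): P(B < 60) = 0.941 with mean 50, so (60 - 50) / sigma equals the
# standard normal quantile for 0.941.
z = NormalDist().inv_cdf(0.941)
sigma_bus = (60 - 50) / z
bus = NormalDist(mu=50, sigma=sigma_bus)

# Part (c): P(B < 45)
p_bus_under_45 = bus.cdf(45)

# Part (d): W and B are independent, so the probabilities multiply.
# First bus:  W <= 15 and B < 60 (07:30 + 60 min = 08:30).
# Second bus: 15 < W <= 30 and B < 45 (07:45 + 45 min = 08:30).
p_on_time = walk.cdf(15) * bus.cdf(60) + p_walk_15_to_30 * p_bus_under_45

# Part (e): expected number of on-time days out of 183.
expected_days = 183 * p_on_time

print(f"sigma = {sigma_bus:.2f}, P(on time) = {p_on_time:.3f}, days = {expected_days:.0f}")
```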



A company performs an experiment on the efficiency of a liquid that is used to detect a nut allergy.

A group of 60 people took part in the experiment. In this group 26 are allergic to nuts. One person from the group is chosen at random.

A second person is chosen from the group.

When the liquid is added to a person’s blood sample, it is expected to turn blue if the person is allergic to nuts and to turn red if the person is not allergic to nuts.

The company claims that the probability that the test result is correct is 98% for people who are allergic to nuts and 95% for people who are not allergic to nuts.

It is known that 6 in every 1000 adults are allergic to nuts.

This information can be represented in a tree diagram.

N17/5/MATSD/SP2/ENG/TZ0/04.c.d.e.f.g

An adult, who was not part of the original group of 60, is chosen at random and tested using this liquid.

The liquid is used in an office to identify employees who might be allergic to nuts. The liquid turned blue for 38 employees.

Find the probability that this person is not allergic to nuts.

[2]
a.

Find the probability that both people chosen are not allergic to nuts.

[2]
b.

Copy and complete the tree diagram.

[3]
c.

Find the probability that this adult is allergic to nuts and the liquid turns blue.

[2]
d.

Find the probability that the liquid turns blue.

[3]
e.

Find the probability that the tested adult is allergic to nuts given that the liquid turned blue.

[3]
f.

Estimate the number of employees, from this 38, who are allergic to nuts.

[2]
g.



A survey was conducted on a group of people. The first question asked how many pets they each own. The results are summarized in the following table.

The second question asked each member of the group to state their age and preferred pet. The data obtained is organized in the following table.

A χ² test is carried out at the 10% significance level.

Write down the total number of people, from this group, who are pet owners.

[1]
a.

Write down the modal number of pets.

[1]
b.

For these data, write down the median number of pets.

[1]
c.i.

For these data, write down the lower quartile.

[1]
c.ii.

For these data, write down the upper quartile.

[1]
c.iii.

Write down the ratio of teenagers to non-teenagers in its simplest form.

[1]
d.

State the null hypothesis.

[1]
e.i.

State the alternative hypothesis.

[1]
e.ii.

Write down the number of degrees of freedom for this test.

[1]
f.

Calculate the expected number of teenagers that prefer cats.

[2]
g.

State the conclusion for this test. Give a reason for your answer.

[2]
i.



Sila High School has 110 students. They each take exactly one language class from a choice of English, Spanish or Chinese. The following table shows the number of female and male students in the three different language classes.

A χ² test was carried out at the 5% significance level to analyse the relationship between gender and student choice of language class.

Use your graphic display calculator to write down

The critical value at the 5 % significance level for this test is 5.99.

One student is chosen at random from this school.

Another student is chosen at random from this school.

Write down the null hypothesis, H0, for this test.

[1]
a.

State the number of degrees of freedom.

[1]
b.

the expected frequency of female students who chose to take the Chinese class.

[1]
c.i.

State whether or not H0 should be rejected. Justify your statement.

[2]
d.

Find the probability that the student does not take the Spanish class.

[2]
e.i.

Find the probability that neither of the two students take the Spanish class.

[3]
e.ii.

Find the probability that at least one of the two students is female.

[3]
e.iii.



There are three fair six-sided dice. Each die has two green faces, two yellow faces and two red faces.

All three dice are rolled.

Ted plays a game using these dice. The rules are:

The random variable D  ($) represents how much is added to his winnings after a turn.

The following table shows the distribution for D , where $ w represents his winnings in the game so far.

Find the probability of rolling exactly one red face.

[2]
a.i.

Find the probability of rolling two or more red faces.

[3]
a.ii.

Show that, after a turn, the probability that Ted adds exactly $10 to his winnings is 1/3.

[5]
b.

Write down the value of x .

[1]
c.i.

Hence, find the value of y .

[2]
c.ii.

Ted will always have another turn if he expects an increase to his winnings.

Find the least value of w for which Ted should end the game instead of having another turn.

[3]
d.
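
Part (a) is a binomial count: each die independently shows red with probability 2/6. A minimal Python sketch with exact fractions follows; the helper name `p_red_count` is illustrative.

```python
from fractions import Fraction
from math import comb

P_RED = Fraction(2, 6)   # two of the six faces on each die are red
N_DICE = 3


def p_red_count(k: int) -> Fraction:
    """Probability that exactly k of the three dice show a red face."""
    return comb(N_DICE, k) * P_RED**k * (1 - P_RED) ** (N_DICE - k)


p_exactly_one = p_red_count(1)
p_two_or_more = p_red_count(2) + p_red_count(3)

print(p_exactly_one, p_two_or_more)
```

Using `Fraction` keeps the answers exact, which matches the form usually expected for dice probabilities.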



Don took part in a project investigating wind speed, x km h⁻¹, and the time, y minutes, to fully charge a solar powered robot.

The investigation was carried out six times. The results are recorded in the table.

M is the point with coordinates (x̄, ȳ).

On graph paper, draw a scatter diagram to show the results of Don’s investigation. Use a scale of 1 cm to represent 2 units on the x-axis, and 1 cm to represent 5 units on the y-axis.

[4]
a.

Calculate x̄, the mean wind speed.

[1]
b.i.

Calculate ȳ, the mean time to fully charge the robot.

[1]
b.ii.

Plot and label the point M on your scatter diagram.

[2]
c.

Calculate r, Pearson’s product–moment correlation coefficient.

[2]
d.i.

Describe the correlation between the wind speed and the time to fully charge the robot.

[2]
d.ii.

Write down the equation of the regression line y on x, in the form y=mx+c.

[2]
e.i.

Draw this regression line on your scatter diagram.

[2]
e.ii.

Hence or otherwise estimate the charging time when the wind speed is 27 km h⁻¹.

[2]
e.iii.

Don concluded from his investigation: “There is no causation between wind speed and the time to fully charge the robot”.

In the context of the question, briefly explain the meaning of “no causation”.

[1]
f.



It is known that the weights of male Persian cats are normally distributed with mean 6.1 kg and variance 0.5² kg².

A group of 80 male Persian cats are drawn from this population.

Sketch a diagram showing the above information.

[2]
a.

Find the proportion of male Persian cats weighing between 5.5 kg and 6.5 kg.

[2]
b.

Determine the expected number of cats in this group that have a weight of less than 5.3 kg.

[3]
c.

It is found that 12 of the cats weigh more than x kg. Estimate the value of x.

[3]
d.

Ten of the cats are chosen at random. Find the probability that exactly one of them weighs over 6.25 kg.

[4]
e.
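
The calculations in this question can be sketched with `statistics.NormalDist` and a binomial term. The sketch below assumes a standard deviation of 0.5 kg (reading the variance as 0.5² kg²) and, for part (e), treats the ten cats as independent trials. The variable names are illustrative.

```python
from math import comb
from statistics import NormalDist

weights = NormalDist(mu=6.1, sigma=0.5)   # assumes variance 0.5^2 kg^2, i.e. sd 0.5 kg
GROUP_SIZE = 80

# Part (b): proportion between 5.5 kg and 6.5 kg
p_between = weights.cdf(6.5) - weights.cdf(5.5)

# Part (c): expected number out of 80 weighing less than 5.3 kg
expected_light = GROUP_SIZE * weights.cdf(5.3)

# Part (d): 12 of 80 weigh more than x, so P(X > x) = 12/80
x_cutoff = weights.inv_cdf(1 - 12 / GROUP_SIZE)

# Part (e): binomial with n = 10 and p = P(X > 6.25)
p_heavy = 1 - weights.cdf(6.25)
p_exactly_one = comb(10, 1) * p_heavy * (1 - p_heavy) ** 9

print(f"{p_between:.3f} {expected_light:.2f} {x_cutoff:.2f} {p_exactly_one:.4f}")
```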



In a company it is found that 25 % of the employees encountered traffic on their way to work. From those who encountered traffic the probability of being late for work is 80 %.

From those who did not encounter traffic, the probability of being late for work is 15 %.

The tree diagram illustrates the information.

The company investigates the different means of transport used by their employees in the past year to travel to work. It was found that the three most common means of transport used to travel to work were public transportation (P ), car (C ) and bicycle (B ).

The company finds that 20 employees travelled by car, 28 travelled by bicycle and 19 travelled by public transportation in the last year.

Some of the information is shown in the Venn diagram.

There are 54 employees in the company.

Write down the value of a.

[1]
a.i.

Write down the value of b.

[1]
a.ii.

Use the tree diagram to find the probability that an employee encountered traffic and was late for work.

[2]
b.i.

Use the tree diagram to find the probability that an employee was late for work.

[3]
b.ii.

Use the tree diagram to find the probability that an employee encountered traffic given that they were late for work.

[3]
b.iii.

Find the value of x.

[1]
c.i.

Find the value of y.

[1]
c.ii.

Find the number of employees who, in the last year, did not travel to work by car, bicycle or public transportation.

[2]
d.

Find  n ( ( C B ) P ) .

[2]
e.



In a school, all Mathematical Studies SL students were given a test. The test contained four questions, each one on a different topic from the syllabus. The quality of each response was classified as satisfactory or not satisfactory. Each student answered only three of the four questions, each on a separate answer sheet.

The table below shows the number of satisfactory and not satisfactory responses for each question.

M17/5/MATSD/SP2/ENG/TZ2/01

A χ² test is carried out at the 5% significance level for the data in the table.

The critical value for this test is 7.815.

If the teacher chooses a response at random, find the probability that it is a response to the Calculus question;

[2]
a.i.

If the teacher chooses a response at random, find the probability that it is a satisfactory response to the Calculus question;

[2]
a.ii.

If the teacher chooses a response at random, find the probability that it is a satisfactory response, given that it is a response to the Calculus question.

[2]
a.iii.

The teacher groups the responses by topic, and chooses two responses to the Logic question. Find the probability that both are not satisfactory.

[3]
b.

State the null hypothesis for this test.

[1]
c.

Show that the expected frequency of satisfactory Calculus responses is 12.

[1]
d.

Write down the number of degrees of freedom for this test.

[1]
e.

Use your graphic display calculator to find the χ² statistic for this data.

[2]
f.

State the conclusion of this χ² test. Give a reason for your answer.

[2]
g.



Mackenzie conducted an experiment on the reaction times of teenagers. The results of the experiment are displayed in the following cumulative frequency graph.

Use the graph to estimate the

Mackenzie created the cumulative frequency graph using the following grouped frequency table.

Upon completion of the experiment, Mackenzie realized that some values were grouped incorrectly in the frequency table. Some reaction times recorded in the interval 0 < t ≤ 0.2 should have been recorded in the interval 0.2 < t ≤ 0.4.

median reaction time.

[1]
a.i.

interquartile range of the reaction times.

[3]
a.ii.

Find the estimated number of teenagers who have a reaction time greater than 0.4 seconds.

[2]
b.

Determine the 90th percentile of the reaction times from the cumulative frequency graph.

[2]
c.

Write down the value of a.

[1]
d.i.

Write down the value of b.

[1]
d.ii.

Write down the modal class from the table.

[1]
e.

Use your graphic display calculator to find an estimate of the mean reaction time.

[2]
f.

Suggest how, if at all, the estimated mean and estimated median reaction times will change if the errors are corrected. Justify your response.

[4]
g.



A pharmaceutical company has developed a new drug to decrease cholesterol. The final stage of testing the new drug is to compare it to their current drug. They have 150 volunteers, all recently diagnosed with high cholesterol, from which they want to select a sample of size 18. They require as close as possible 20% of the sample to be below the age of 30, 30% to be between the ages of 30 and 50 and 50% to be over the age of 50.

Half of the 18 volunteers are given the current drug and half are given the new drug. After six months each volunteer has their cholesterol level measured and the decrease during the six months is shown in the table.

Calculate the mean decrease in cholesterol for

The company uses a t-test, at the 1% significance level, to determine if the new drug is more effective at decreasing cholesterol.

State the name for this type of sampling technique.

[1]
a.

Calculate the number of volunteers in the sample under the age of 30.

[3]
b.

The new drug.

[1]
c.i.

The current drug.

[1]
c.ii.

State an assumption that the company is making, in order to use a t-test.

[1]
d.

State the hypotheses for this t-test.

[1]
e.

Find the p-value for this t-test.

[3]
f.

State the conclusion of this test, in context, giving a reason.

[2]
g.



The following table shows a probability distribution for the random variable X, where E(X) = 1.2.

M17/5/MATME/SP2/ENG/TZ2/10

A bag contains white and blue marbles, with at least three of each colour. Three marbles are drawn from the bag, without replacement. The number of blue marbles drawn is given by the random variable X .

A game is played in which three marbles are drawn from the bag of ten marbles, without replacement. A player wins a prize if three white marbles are drawn.

Find q .

[2]
a.i.

Find p .

[2]
a.ii.

Write down the probability of drawing three blue marbles.

[1]
b.i.

Explain why the probability of drawing three white marbles is 1/6.

[1]
b.ii.

The bag contains a total of ten marbles of which w are white. Find w .

[3]
b.iii.

Grant plays the game until he wins two prizes. Find the probability that he wins his second prize on his eighth attempt.

[4]
d.



Two events A and B are such that P(A) = 0.62 and P(A ∩ B) = 0.18.

Find P(A ∩ B′).

[2]
a.

Given that P((A ∪ B)′) = 0.19, find P(A | B).

[4]
b.
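
Both parts follow from the addition rule and the definition of conditional probability. A worked sketch, reading the given values as P(A ∩ B) = 0.18 and P((A ∪ B)′) = 0.19 (the only reading consistent with P(A) = 0.62, since a union cannot be smaller than P(A)):

```latex
\begin{align*}
P(A \cap B') &= P(A) - P(A \cap B) = 0.62 - 0.18 = 0.44 \\
P(A \cup B) &= 1 - P((A \cup B)') = 1 - 0.19 = 0.81 \\
P(B) &= P(A \cup B) - P(A) + P(A \cap B) = 0.81 - 0.62 + 0.18 = 0.37 \\
P(A \mid B) &= \frac{P(A \cap B)}{P(B)} = \frac{0.18}{0.37} \approx 0.486
\end{align*}
```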



The weight, W, of basketball players in a tournament is found to be normally distributed with a mean of 65 kg and a standard deviation of 5 kg.

The probability that a basketball player has a weight that is within 1.5 standard deviations of the mean is q.

A basketball coach observed 60 of her players to determine whether their performance and their weight were independent of each other. Her observations were recorded as shown in the table.

She decided to conduct a χ² test for independence at the 5% significance level.

Find the probability that a basketball player has a weight that is less than 61 kg.

[2]
a.i.

In a training session there are 40 basketball players.

Find the expected number of players with a weight less than 61 kg in this training session.

[2]
a.ii.

Sketch a normal curve to represent this probability.

[2]
b.i.

Find the value of q.

[1]
b.ii.

Given that P(W > k) = 0.225 , find the value of k.

[2]
c.

For this test state the null hypothesis.

[1]
d.i.

For this test find the p-value.

[2]
d.ii.

State a conclusion for this test. Justify your answer.

[2]
e.



On one day 180 flights arrived at a particular airport. The distance travelled and the arrival status for each incoming flight was recorded. The flight was then classified as on time, slightly delayed, or heavily delayed.

The results are shown in the following table.

A χ² test is carried out at the 10% significance level to determine whether the arrival status of incoming flights is independent of the distance travelled.

The critical value for this test is 7.779.

A flight is chosen at random from the 180 recorded flights.

State the alternative hypothesis.

[1]
a.

Calculate the expected frequency of flights travelling at most 500 km and arriving slightly delayed.

[2]
b.

Write down the number of degrees of freedom.

[1]
c.

Write down the χ² statistic.

[2]
d.i.

Write down the associated p-value.

[1]
d.ii.

State, with a reason, whether you would reject the null hypothesis.

[2]
e.

Write down the probability that this flight arrived on time.

[2]
f.

Given that this flight was not heavily delayed, find the probability that it travelled between 500 km and 5000 km.

[2]
g.

Two flights are chosen at random from those which were slightly delayed.

Find the probability that each of these flights travelled at least 5000 km.

[3]
h.



A transportation company owns 30 buses. The distance that each bus has travelled since being purchased by the company is recorded. The cumulative frequency curve for these data is shown.

It is known that 8 buses travelled more than m kilometres.

Find the number of buses that travelled a distance between 15000 and 20000 kilometres.

[2]
a.

Use the cumulative frequency curve to find the median distance.

[2]
b.i.

Use the cumulative frequency curve to find the lower quartile.

[1]
b.ii.

Use the cumulative frequency curve to find the upper quartile.

[1]
b.iii.

Hence write down the interquartile range.

[1]
c.

Write down the percentage of buses that travelled a distance greater than the upper quartile.

[1]
d.

Find the number of buses that travelled a distance less than or equal to 12 000 km.

[1]
e.

Find the value of m.

[2]
f.

The smallest distance travelled by one of the buses was 2500 km.
The longest distance travelled by one of the buses was 23 000 km.

On graph paper, draw a box-and-whisker diagram for these data. Use a scale of 2 cm to represent 5000 km.

[4]
g.



A discrete random variable X has the following probability distribution.

N17/5/MATME/SP2/ENG/TZ0/04

Find the value of k .

[4]
a.

Write down P(X = 2).

[1]
b.

Find P(X = 2 | X > 0).

[3]
c.



The scores of the eight highest scoring countries in the 2019 Eurovision song contest are shown in the following table.

For this data, find

Chester is investigating the relationship between the highest-scoring countries’ Eurovision score and their population size to determine whether population size can reasonably be used to predict a country’s score.

The populations of the countries, to the nearest million, are shown in the table.

Chester finds that, for this data, the Pearson’s product moment correlation coefficient is r=0.249.

Chester then decides to find the Spearman’s rank correlation coefficient for this data, and creates a table of ranks.

Write down the value of:

the upper quartile.

[2]
a.i.

the interquartile range.

[2]
a.ii.

Determine if the Netherlands’ score is an outlier for this data. Justify your answer.

[3]
b.

State whether it would be appropriate for Chester to use the equation of a regression line for y on x to predict a country’s Eurovision score. Justify your answer.

[2]
c.

a.

[1]
d.i.

b.

[1]
d.ii.

c.

[1]
d.iii.

Find the value of the Spearman’s rank correlation coefficient r_s.

[2]
e.i.

Interpret the value obtained for r_s.

[1]
e.ii.

When calculating the ranks, Chester incorrectly read the Netherlands’ score as 478. Explain why the value of the Spearman’s rank correlation r_s does not change despite this error.

[1]
f.



All lengths in this question are in metres.

 

Consider the function f(x) = √((4 − x²)/8), for −2 ≤ x ≤ 2. In the following diagram, the shaded region is enclosed by the graph of f and the x-axis.

A container can be modelled by rotating this region by 360˚ about the x -axis.

Water can flow in and out of the container.

The volume of water in the container is given by the function g(t), for 0 ≤ t ≤ 4, where t is measured in hours and g(t) is measured in m³. The rate of change of the volume of water in the container is given by g′(t) = 0.9 − 2.5 cos(0.4t²).

The volume of water in the container is increasing only when  p  < t  < q .

Find the volume of the container.

[3]
a.

Find the value of  p and of  q .

[3]
b.i.

During the interval p < t < q, the volume of water in the container increases by k m³. Find the value of k.

[3]
b.ii.

When t = 0, the volume of water in the container is 2.3 m³. It is known that the container is never completely full of water during the 4 hour period.

 

Find the minimum volume of empty space in the container during the 4 hour period.

[5]
c.



Let f(x) = −0.5x⁴ + 3x² + 2x. The following diagram shows part of the graph of f.

M17/5/MATME/SP2/ENG/TZ2/08

 

There are x -intercepts at x = 0 and at x = p . There is a maximum at A where x = a , and a point of inflexion at B where x = b .

Find the value of p .

[2]
a.

Write down the coordinates of A.

[2]
b.i.

Write down the rate of change of f  at A.

[1]
b.ii.

Find the coordinates of B.

[4]
c.i.

Find the rate of change of f at B.

[3]
c.ii.

Let R be the region enclosed by the graph of f , the x -axis, the line x = b and the line x = a . The region R is rotated 360° about the x -axis. Find the volume of the solid formed.

[3]
d.



The marks obtained by nine Mathematical Studies SL students in their projects (x) and their final IB examination scores (y) were recorded. These data were used to determine whether the project mark is a good predictor of the examination score. The results are shown in the table.

The equation of the regression line y on x is y = mx + c.

A tenth student, Jerome, obtained a project mark of 17.

Use your graphic display calculator to write down ȳ, the mean examination score.

[1]
a.ii.

Use your graphic display calculator to write down r , Pearson’s product–moment correlation coefficient.

[2]
a.iii.

Find the exact value of m and of c for these data.

[2]
b.i.

Use the regression line y on x to estimate Jerome’s examination score.

[2]
c.i.

Justify whether it is valid to use the regression line y on x to estimate Jerome’s examination score.

[2]
c.ii.



A group of 1280 students were asked which electronic device they preferred. The results per age group are given in the following table.

A student from the group is chosen at random. Calculate the probability that the student

A χ² test for independence was performed on the collected data at the 1% significance level. The critical value for the test is 13.277.

prefers a tablet.

[2]
a.i.

is 11–13 years old and prefers a mobile phone.

[2]
a.ii.

prefers a laptop given that they are 17–18 years old.

[2]
a.iii.

prefers a tablet or is 14–16 years old.

[3]
a.iv.

State the null and alternative hypotheses.

[1]
b.

Write down the number of degrees of freedom.

[1]
c.

Write down the χ² test statistic.

[2]
d.i.

Write down the p-value.

[1]
d.ii.

State the conclusion for the test in context. Give a reason for your answer.

[2]
d.iii.



160 students attend a dual language school in which the students are taught only in Spanish or taught only in English.

A survey was conducted in order to analyse the number of students studying Biology or Mathematics. The results are shown in the Venn diagram.

 

Set S represents those students who are taught in Spanish.

Set B represents those students who study Biology.

Set M represents those students who study Mathematics.

 

A student from the school is chosen at random.

Find the number of students in the school that are taught in Spanish.

[2]
a.i.

Find the number of students in the school that study Mathematics in English.

[2]
a.ii.

Find the number of students in the school that study both Biology and Mathematics.

[2]
a.iii.

Write down  n ( S ( M B ) ) .

[1]
b.i.

Write down n ( B M S ) .

[1]
b.ii.

Find the probability that this student studies Mathematics.

[2]
c.i.

Find the probability that this student studies neither Biology nor Mathematics.

[2]
c.ii.

Find the probability that this student is taught in Spanish, given that the student studies Biology.

[2]
c.iii.



On a school excursion, 100 students visited an amusement park. The amusement park’s main attractions are rollercoasters (R), water slides (W), and virtual reality rides (V).

The students were asked which main attractions they visited. The results are shown in the Venn diagram.

A total of 74 students visited the rollercoasters or the water slides.

Find the value of a.

[2]
a.i.

Find the value of b.

[2]
a.ii.

Find the number of students who visited at least two types of main attraction.

[2]
b.

Write down the value of n( RW) .

[1]
c.

Find the probability that a randomly selected student visited the rollercoasters.

[2]
d.i.

Find the probability that a randomly selected student visited the virtual reality rides.

[1]
d.ii.

Hence determine whether the events in parts (d)(i) and (d)(ii) are independent. Justify your reasoning. 

[2]
e.



The following table shows the hand lengths and the heights of five athletes on a sports team.

The relationship between x and y can be modelled by the regression line with equation y = ax + b.

Another athlete on this sports team has a hand length of 21.5 cm. Use the regression equation to estimate the height of this athlete.




The table below shows the distribution of test grades for 50 IB students at Greendale School.

M17/5/MATSD/SP2/ENG/TZ1/05

A student is chosen at random from these 50 students.

A second student is chosen at random from these 50 students.

The number of minutes that the 50 students spent preparing for the test was normally distributed with a mean of 105 minutes and a standard deviation of 20 minutes.

Calculate the mean test grade of the students;

[2]
a.i.

Calculate the standard deviation.

[1]
a.ii.

Find the median test grade of the students.

[1]
b.

Find the interquartile range.

[2]
c.

Find the probability that this student scored a grade 5 or higher.

[2]
d.

Given that the first student chosen at random scored a grade 5 or higher, find the probability that both students scored a grade 6.

[3]
e.

Calculate the probability that a student chosen at random spent at least 90 minutes preparing for the test.

[2]
f.i.

Calculate the expected number of students that spent at least 90 minutes preparing for the test.

[2]
f.ii.
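
Parts (f)(i) and (f)(ii) can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative and are not part of the question.

```python
from statistics import NormalDist

# Preparation-time model stated in the question: mean 105 min, s.d. 20 min.
prep_time = NormalDist(mu=105, sigma=20)

# (f)(i) P(T >= 90) = 1 - P(T < 90)
p_at_least_90 = 1 - prep_time.cdf(90)

# (f)(ii) Expected number among the 50 students = 50 * P(T >= 90)
expected_students = 50 * p_at_least_90

print(round(p_at_least_90, 3), round(expected_students, 1))
```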



A dice manufacturer claims that for a novelty die he produces, the probabilities of scoring the numbers 1 to 5 are all equal, and the probability of a 6 is two times the probability of scoring any of the other numbers.

To test the manufacturer’s claim one of the novelty dice is rolled 350 times and the numbers scored on the die are shown in the table below.

A χ² goodness of fit test is to be used with a 5% significance level.

Find the probability of scoring a six when rolling the novelty die.

[3]
a.

Find the probability of scoring more than 2 sixes when this die is rolled 5 times.

[4]
b.

Find the expected frequency for each of the numbers if the manufacturer’s claim is true.

[2]
c.i.

Write down the null and alternative hypotheses.

[2]
c.ii.

State the degrees of freedom for the test.

[1]
c.iii.

Determine the conclusion of the test, clearly justifying your answer.

[4]
c.iv.
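
The parts of this question that do not depend on the observed table can be sketched with exact fractions. This is an illustrative check only: the names are hypothetical, and the χ² statistic itself is not computed because the observed frequencies are not reproduced here.

```python
from fractions import Fraction
from math import comb

# Claim: P(1) = ... = P(5) = p and P(6) = 2p, so 5p + 2p = 1.
p_other = Fraction(1, 7)
p_six = 2 * p_other

# (b) More than 2 sixes in 5 rolls: X ~ B(5, 2/7), so P(X >= 3).
p_more_than_two = sum(
    comb(5, k) * p_six**k * (1 - p_six) ** (5 - k) for k in range(3, 6)
)

# (c)(i) Expected frequencies for 350 rolls if the claim is true.
expected = {face: 350 * (p_six if face == 6 else p_other) for face in range(1, 7)}

# (c)(iii) Six categories and no parameters estimated from the data.
degrees_of_freedom = len(expected) - 1

print(p_six, float(p_more_than_two), degrees_of_freedom)
```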



The stopping distances for bicycles travelling at 20 km h⁻¹ are assumed to follow a normal distribution with mean 6.76 m and standard deviation 0.12 m.

Under this assumption, find, correct to four decimal places, the probability that a bicycle chosen at random travelling at 20 km h⁻¹ manages to stop

1000 randomly selected bicycles are tested and their stopping distances when travelling at 20 km h⁻¹ are measured.

Find, correct to four significant figures, the expected number of bicycles tested that stop between

The measured stopping distances of the 1000 bicycles are given in the table.

It is decided to perform a χ² goodness of fit test at the 5% level of significance to decide whether the stopping distances of bicycles travelling at 20 km h⁻¹ can be modelled by a normal distribution with mean 6.76 m and standard deviation 0.12 m.

in less than 6.5 m.

[2]
a.i.

in more than 7 m.

[1]
a.ii.

6.5 m and 6.75 m.

[2]
b.i.

6.75 m and 7 m.

[1]
b.ii.

State the null and alternative hypotheses.

[2]
c.

Find the p-value for the test.

[3]
d.

State the conclusion of the test. Give a reason for your answer.

[2]
e.
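
Parts (a) and (b) follow directly from the stated normal model. A minimal sketch, assuming the N(6.76, 0.12²) model from the question and using illustrative variable names; the p-value in part (d) needs the observed table, which is not reproduced here.

```python
from statistics import NormalDist

# Assumed model: stopping distance ~ N(6.76, 0.12^2), in metres.
stopping = NormalDist(mu=6.76, sigma=0.12)

# (a) Probabilities for a single bicycle.
p_less_6_5 = stopping.cdf(6.5)
p_more_7 = 1 - stopping.cdf(7)

# (b) Expected counts among 1000 bicycles.
exp_6_5_to_6_75 = 1000 * (stopping.cdf(6.75) - stopping.cdf(6.5))
exp_6_75_to_7 = 1000 * (stopping.cdf(7) - stopping.cdf(6.75))

print(round(p_less_6_5, 4), round(p_more_7, 4))
print(round(exp_6_5_to_6_75, 1), round(exp_6_75_to_7, 1))
```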



The maximum temperature T, in degrees Celsius, in a park on six randomly selected days is shown in the following table. The table also shows the number of visitors, N, to the park on each of those six days.

M17/5/MATME/SP2/ENG/TZ2/02

The relationship between the variables can be modelled by the regression equation N = aT + b.

Find the value of a and of b.

[3]
a.i.

Write down the value of r.

[1]
a.ii.

Use the regression equation to estimate the number of visitors on a day when the maximum temperature is 15 °C.

[3]
b.



A water container is made in the shape of a cylinder with internal height h cm and internal base radius r cm.

N16/5/MATSD/SP2/ENG/TZ0/06

The water container has no top. The inner surfaces of the container are to be coated with a water-resistant material.

The volume of the water container is 0.5 m³.

The water container is designed so that the area to be coated is minimized.

One can of water-resistant material coats a surface area of 2000 cm².

Write down a formula for A, the surface area to be coated.

[2]
a.

Express this volume in cm³.

[1]
b.

Write down, in terms of r and h, an equation for the volume of this water container.

[1]
c.

Show that A = πr² + 1 000 000/r.

[2]
d.

Find dA/dr.

[3]
e.

Using your answer to part (e), find the value of r which minimizes A.

[3]
f.

Find the value of this minimum area.

[2]
g.

Find the least number of cans of water-resistant material that will coat the area in part (g).

[3]
h.
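
The optimisation in parts (d) to (h) can be sketched as follows. This is an illustrative check under the stated setup (open-topped cylinder, volume 0.5 m³, so the coated area is πr² + 1 000 000/r); the function and variable names are hypothetical.

```python
import math

VOLUME_CM3 = 0.5 * 100**3  # (b) 0.5 m^3 expressed in cm^3


def area(r):
    """Coated area in cm^2: base pi*r^2 plus wall 2*pi*r*h with h = V/(pi*r^2)."""
    return math.pi * r**2 + 2 * VOLUME_CM3 / r


# (e), (f) dA/dr = 2*pi*r - 1 000 000 / r^2 = 0  gives  r^3 = 500 000 / pi
r_min = (VOLUME_CM3 / math.pi) ** (1 / 3)

# (g) Minimum area, and (h) least whole number of 2000 cm^2 cans covering it.
a_min = area(r_min)
cans = math.ceil(a_min / 2000)

print(round(r_min, 2), round(a_min), cans)
```

Rounding up with `math.ceil` reflects that a partial can still has to be bought.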



A group of 66 people went on holiday to Hawaii. During their stay, three trips were arranged: a boat trip ( B ), a coach trip ( C ) and a helicopter trip ( H ).

From this group of people:

went on all three trips;
16  went on the coach trip only;
13  went on the boat trip only;
went on the helicopter trip only;
went on the coach trip and the helicopter trip but not the boat trip;
2 went on the boat trip and the helicopter trip but not the coach trip;
4 went on the boat trip and the coach trip but not the helicopter trip;
did not go on any of the trips.

One person in the group is selected at random.

Draw a Venn diagram to represent the given information, using sets labelled B , C and H .

[5]
a.

Show that x = 3 .

[2]
b.

Write down the value of n ( B C ) .

[1]
c.

Find the probability that this person

(i)     went on at most one trip;

(ii)     went on the coach trip, given that this person also went on both the helicopter trip and the boat trip.

[4]
d.



As part of his mathematics exploration about classic books, Jason investigated the time taken by students in his school to read the book The Old Man and the Sea. He collected his data by stopping and asking students in the school corridor, until he reached his target of 10 students from each of the literature classes in his school.

Jason constructed the following box and whisker diagram to show the number of hours students in the sample took to read this book.

 

Mackenzie, a member of the sample, took 25 hours to read the novel. Jason believes Mackenzie’s time is not an outlier.

For each student interviewed, Jason recorded the time taken to read The Old Man and the Sea x, measured in hours, and paired this with their percentage score on the final exam y. These data are represented on the scatter diagram.

Jason correctly calculates the equation of the regression line y on x for these students to be

y=-1.54x+98.8.

He uses the equation to estimate the percentage score on the final exam for a student who read the book in 1.5 hours.

Jason found a website that rated the ‘top 50’ classic books. He randomly chose eight of these classic books and recorded the number of pages. For example, Book H is rated 44th and has 281 pages. These data are shown in the table.

Jason intends to analyse the data using Spearman’s rank correlation coefficient, r_s.

State which of the two sampling methods, systematic or quota, Jason has used.

[1]
a.

Write down the median time to read the book.

[1]
b.

Calculate the interquartile range.

[2]
c.

Determine whether Jason is correct. Support your reasoning.

[4]
d.

Describe the correlation.

[1]
e.

Find the percentage score calculated by Jason.

[2]
f.

State whether it is valid to use the regression line y on x for Jason’s estimate. Give a reason for your answer.

[2]
g.

Copy and complete the information in the following table.

[2]
h.

Calculate the value of r_s.

[2]
i.i.

Interpret your result.

[1]
i.ii.



Emlyn plays many games of basketball for his school team. The number of minutes he plays in each game follows a normal distribution with mean m minutes.

In any game there is a 30% chance he will play less than 13.6 minutes.

In any game there is a 70% chance he will play less than 17.8 minutes.

The standard deviation of the number of minutes Emlyn plays in any game is 4.

There is a 60% chance Emlyn plays less than x minutes in a game.

Emlyn will play in two basketball games today.

Emlyn and his teammate Johan each practise shooting the basketball multiple times from a point X. A record of their performance over the weekend is shown in the table below.

On Monday, Emlyn and Johan will practise and each will shoot 200 times from point X.

Sketch a diagram to represent this information.

[2]
a.

Show that m=15.7.

[2]
b.

Find the probability that Emlyn plays between 13 minutes and 18 minutes in a game.

[2]
c.i.

Find the probability that Emlyn plays more than 20 minutes in a game.

[2]
c.ii.

Find the value of x.

[2]
d.

Find the probability he plays between 13 minutes and 18 minutes in one game and more than 20 minutes in the other game.

[3]
e.

Find the expected number of successful shots Emlyn will make on Monday, based on the results from Saturday and Sunday.

[2]
f.

Emlyn claims the results from Saturday and Sunday show that his expected number of successful shots will be more than Johan’s.

Determine if Emlyn’s claim is correct. Justify your reasoning.

[2]
g.



A random variable X is normally distributed with mean, μ. In the following diagram, the shaded region between 9 and μ represents 30% of the distribution.

M17/5/MATME/SP2/ENG/TZ1/09

The standard deviation of X is 2.1.

The random variable Y is normally distributed with mean λ and standard deviation 3.5. The events X > 9 and Y > 9 are independent, and P((X > 9) ∩ (Y > 9)) = 0.4.

Find P(X < 9).

[2]
a.

Find the value of μ .

[3]
b.

Find λ .

[5]
c.

Given that Y > 9, find P(Y < 13).

[5]
d.
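
The chain of reasoning in parts (a) to (d) can be sketched as below. It assumes the stated joint probability is P(X > 9 and Y > 9) = 0.4, which is the only reading consistent with independence here; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()

# (a) Half the distribution lies below mu and 30% lies between 9 and mu.
p_x_below_9 = 0.5 - 0.3

# (b) (9 - mu) / 2.1 equals the 20th percentile of N(0, 1).
mu = 9 - 2.1 * std_normal.inv_cdf(p_x_below_9)

# (c) Independence: P(X > 9) * P(Y > 9) = 0.4, then standardise for lambda.
p_y_above_9 = 0.4 / (1 - p_x_below_9)
lam = 9 - 3.5 * std_normal.inv_cdf(1 - p_y_above_9)

# (d) P(Y < 13 | Y > 9) = P(9 < Y < 13) / P(Y > 9)
y = NormalDist(mu=lam, sigma=3.5)
p_cond = (y.cdf(13) - y.cdf(9)) / p_y_above_9

print(round(mu, 1), round(lam, 1), round(p_cond, 3))
```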



The heights of adult males in a country are normally distributed with a mean of 180 cm and a standard deviation of σ cm. 17% of these men are shorter than 168 cm. 80% of them have heights between (192 − h) cm and 192 cm.

Find the value of h .
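
One way to check this question numerically is sketched below. It assumes the lower bound of the interval is 192 − h cm, and the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()

# 17% are shorter than 168 cm: (168 - 180) / sigma is the 17th percentile of N(0, 1).
sigma = (168 - 180) / std_normal.inv_cdf(0.17)
heights = NormalDist(mu=180, sigma=sigma)

# P(192 - h < H < 192) = 0.8  means  P(H < 192 - h) = P(H < 192) - 0.8
p_lower = heights.cdf(192) - 0.8
h = 192 - heights.inv_cdf(p_lower)

print(round(sigma, 1), round(h, 1))
```

Since 168 and 192 are symmetric about the mean, P(H < 192) is 0.83, so the lower tail left over is 0.03.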




Arianne plays a game of darts.

The distance that her darts land from the centre, O, of the board can be modelled by a normal distribution with mean 10 cm and standard deviation 3 cm.

Find the probability that

Each of Arianne’s throws is independent of her previous throws.

In a competition a player has three darts to throw on each turn. A point is scored if a player throws all three darts to land within a central area around O. When Arianne throws a dart the probability that it lands within this area is 0.8143.

In the competition Arianne has ten turns, each with three darts.

a dart lands less than 13 cm from O.

[2]
a.i.

a dart lands more than 15 cm from O.

[1]
a.ii.

Find the probability that Arianne throws two consecutive darts that land more than 15 cm from O.

[2]
b.

Find the probability that Arianne does not score a point on a turn of three darts.

[2]
c.

Find the probability that Arianne scores at least 5 points in the competition.

[3]
d.i.

Find the probability that Arianne scores at least 5 points and less than 8 points.

[2]
d.ii.

Given that Arianne scores at least 5 points, find the probability that Arianne scores less than 8 points.

[2]
d.iii.
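
The whole question can be checked with a short script. This is a sketch under the question's own assumptions (independent throws, so the number of points in ten turns is binomial); `binom_pmf` and the other names are hypothetical helpers.

```python
from math import comb
from statistics import NormalDist

distance = NormalDist(mu=10, sigma=3)

# (a), (b) Single-dart probabilities and two independent throws beyond 15 cm.
p_within_13 = distance.cdf(13)
p_beyond_15 = 1 - distance.cdf(15)
p_two_beyond_15 = p_beyond_15**2

# (c) A point needs all three darts in the central area.
p_point = 0.8143**3
p_no_point = 1 - p_point


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# (d) Points over ten turns: X ~ B(10, p_point).
p_at_least_5 = sum(binom_pmf(k, 10, p_point) for k in range(5, 11))
p_5_to_7 = sum(binom_pmf(k, 10, p_point) for k in range(5, 8))
p_less_8_given_at_least_5 = p_5_to_7 / p_at_least_5

print(round(p_no_point, 3), round(p_at_least_5, 3), round(p_5_to_7, 3))
```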



A group of 7 adult men wanted to see if there was a relationship between their Body Mass Index (BMI) and their waist size. Their waist sizes, in centimetres, were recorded and their BMI calculated. The following table shows the results.

The relationship between x and y can be modelled by the regression equation y = ax + b.

Write down the value of a and of b.

[3]
a.i.

Find the correlation coefficient.

[1]
a.ii.

Use the regression equation to estimate the BMI of an adult man whose waist size is 95 cm.

[2]
b.



Let f(x) = 16/x. The line L is tangent to the graph of f at x = 8.

L can be expressed in the form r = (8, 2) + tu.

The direction vector of y = x is (1, 1).

Find the gradient of L .

[2]
a.

Find u.

[2]
b.

Find the acute angle between y = x and L .

[5]
c.

Find (f ∘ f)(x).

[3]
d.i.

Hence, write down f⁻¹(x).

[1]
d.ii.

Hence or otherwise, find the obtuse angle formed by the tangent line to f at x = 8 and the tangent line to f at x = 2.

[3]
d.iii.



The weights, in grams, of oranges grown in an orchard, are normally distributed with a mean of 297 g. It is known that 79 % of the oranges weigh more than 289 g and 9.5 % of the oranges weigh more than 310 g.

The weights of the oranges have a standard deviation of σ.

The grocer at a local grocery store will buy the oranges whose weights exceed the 35th percentile.

The orchard packs oranges in boxes of 36.

Find the probability that an orange weighs between 289 g and 310 g.

[2]
a.

Find the standardized value for 289 g.

[2]
b.i.

Hence, find the value of σ.

[3]
b.ii.

To the nearest gram, find the minimum weight of an orange that the grocer will buy.

[3]
c.

Find the probability that the grocer buys more than half the oranges in a box selected at random.

[5]
d.

The grocer selects two boxes at random.

Find the probability that the grocer buys more than half the oranges in each box.

[2]
e.
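
A numerical check of parts (a) to (e) is sketched below. It assumes each orange in a box independently exceeds the 35th percentile with probability 0.65, so the number bought per box is binomial; the names are illustrative.

```python
from math import comb
from statistics import NormalDist

std_normal = NormalDist()

# (a) P(289 < W < 310) = P(W > 289) - P(W > 310)
p_289_to_310 = 0.79 - 0.095

# (b) Standardised value for 289 g, then sigma from z = (289 - 297) / sigma.
z_289 = std_normal.inv_cdf(1 - 0.79)
sigma = (289 - 297) / z_289

# (c) 35th percentile, to the nearest gram.
weights = NormalDist(mu=297, sigma=sigma)
min_weight = round(weights.inv_cdf(0.35))

# (d) X ~ B(36, 0.65); more than half the box means X >= 19.
p_more_than_half = sum(
    comb(36, k) * 0.65**k * 0.35 ** (36 - k) for k in range(19, 37)
)

# (e) Two boxes chosen independently.
p_both_boxes = p_more_than_half**2

print(round(z_289, 3), round(sigma, 2), min_weight)
```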



The following table shows a probability distribution for the random variable X, where E(X) = 1.2.

M17/5/MATME/SP2/ENG/TZ2/10

A bag contains white and blue marbles, with at least three of each colour. Three marbles are drawn from the bag, without replacement. The number of blue marbles drawn is given by the random variable X .

A game is played in which three marbles are drawn from the bag of ten marbles, without replacement. A player wins a prize if three white marbles are drawn.

Jill plays the game nine times. Find the probability that she wins exactly two prizes.




Ten students were surveyed about the number of hours, x , they spent browsing the Internet during week 1 of the school year. The results of the survey are given below.

∑ xᵢ = 252 (summed over i = 1 to 10), σ = 5 and median = 27.

During week 4, the survey was extended to all 200 students in the school. The results are shown in the cumulative frequency graph:

N16/5/MATME/SP2/ENG/TZ0/08.d

Find the mean number of hours spent browsing the Internet.

[2]
a.

During week 2, the students worked on a major project and they each spent an additional five hours browsing the Internet. For week 2, write down

(i)     the mean;

(ii)     the standard deviation.

[2]
b.

During week 3 each student spent 5% less time browsing the Internet than during week 1. For week 3, find

(i)     the median;

(ii)     the variance.

[6]
c.

(i)     Find the number of students who spent between 25 and 30 hours browsing the Internet.

(ii)     Given that 10% of the students spent more than k hours browsing the Internet, find the maximum value of k .

[6]
d.



The following table shows the mean weight, y kg , of children who are x years old.

The relationship between the variables is modelled by the regression line with equation y = ax + b.

Find the value of a and of b.

[3]
a.i.

Write down the correlation coefficient.

[1]
a.ii.

Use your equation to estimate the mean weight of a child that is 1.95 years old.

[2]
b.



A healthy human body temperature is 37.0 °C. Eight people were medically examined and the difference in their body temperature (°C), from 37.0 °C, was recorded. Their heartbeat (beats per minute) was also recorded.

Write down, for this set of data, the mean temperature difference from 37 °C, x̄.

[1]
b.i.

Write down, for this set of data, the mean number of heartbeats per minute, ȳ.

[1]
b.ii.

Plot and label the point M(x̄, ȳ) on the scatter diagram.

[2]
c.

Use your graphic display calculator to find the Pearson’s product–moment correlation coefficient, r .

[2]
d.i.

Hence describe the correlation between temperature difference from 37 °C and heartbeat.

[2]
d.ii.

Draw the regression line y on x on the scatter diagram.

[2]
f.



A teacher is concerned about the amount of lesson time lost by 8 students through arriving late at school. Over a period of 2 weeks he records the total number of minutes they are late. He also asks them how far they live from school. The results are shown in the table below.

Which of the correlation coefficients would you recommend is used to assess whether or not there is an association between total number of minutes late and distance from school? Fully justify your answer.




Adam is a beekeeper who collected data about monthly honey production in his bee hives. The data for six of his hives is shown in the following table.

N17/5/MATME/SP2/ENG/TZ0/08

The relationship between the variables is modelled by the regression line with equation P = aN + b.

Adam has 200 hives in total. He collects data on the monthly honey production of all the hives. This data is shown in the following cumulative frequency graph.

N17/5/MATME/SP2/ENG/TZ0/08.c.d.e

Adam’s hives are labelled as low, regular or high production, as defined in the following table.

N17/5/MATME/SP2/ENG/TZ0/08.c.d.e_02

Adam knows that 128 of his hives have a regular production.

Write down the value of a and of b .

[3]
a.

Use this regression line to estimate the monthly honey production from a hive that has 270 bees.

[2]
b.

Write down the number of low production hives.

[1]
c.

Find the value of k ;

[3]
d.i.

Find the number of hives that have a high production.

[2]
d.ii.

Adam decides to increase the number of bees in each low production hive. Research suggests that there is a probability of 0.75 that a low production hive becomes a regular production hive. Calculate the probability that 30 low production hives become regular production hives.

[3]
e.



The manager of a folder factory recorded the number of folders produced by the factory (in thousands) and the production costs (in thousand Euros), for six consecutive months.

M17/5/MATSD/SP2/ENG/TZ2/03

Every month the factory sells all the folders produced. Each folder is sold for 2.99 Euros.

Draw a scatter diagram for this data. Use a scale of 2 cm for 5000 folders on the horizontal axis and 2 cm for 10 000 Euros on the vertical axis.

[4]
a.

Write down, for this set of data, the mean number of folders produced, x̄;

[1]
b.i.

Write down, for this set of data, the mean production cost, C̄.

[1]
b.ii.

Label the point M(x̄, C̄) on the scatter diagram.

[1]
c.

State a reason why the regression line C on x is appropriate to model the relationship between these variables.

[1]
e.

Use your graphic display calculator to find the equation of the regression line C on x .

[2]
f.

Draw the regression line C on x on the scatter diagram.

[2]
g.

Use the equation of the regression line to estimate the least number of folders that the factory needs to sell in a month to exceed its production cost for that month.

[4]
h.



In a large university the probability that a student is left handed is 0.08. A sample of 150 students is randomly selected from the university. Let k be the expected number of left-handed students in this sample.

Find k .

[2]
a.

Hence, find the probability that exactly k students are left handed;

[2]
b.i.

Hence, find the probability that fewer than k students are left handed.

[2]
b.ii.
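
Both parts can be checked with the binomial model X ~ B(150, 0.08). A minimal sketch with an illustrative helper `binom_pmf`:

```python
from math import comb

n, p = 150, 0.08

# (a) Expected number of left-handed students, E(X) = n * p.
k = round(n * p)


def binom_pmf(x):
    """P(X = x) for X ~ B(150, 0.08)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# (b)(i) Exactly k, and (b)(ii) fewer than k, that is X <= k - 1.
p_exactly_k = binom_pmf(k)
p_fewer_than_k = sum(binom_pmf(x) for x in range(k))

print(k, round(p_exactly_k, 3), round(p_fewer_than_k, 3))
```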



Contestants in a TV gameshow try to get through three walls by passing through doors without falling into a trap. Contestants choose doors at random.
If they avoid a trap they progress to the next wall.
If a contestant falls into a trap they exit the game before the next contestant plays.
Contestants are not allowed to watch each other attempt the game.

The first wall has four doors with a trap behind one door.

Ayako is a contestant.

Natsuko is the second contestant.

The second wall has five doors with a trap behind two of the doors.

The third wall has six doors with a trap behind three of the doors.

The following diagram shows the branches of a probability tree diagram for a contestant in the game.

Write down the probability that Ayako avoids the trap in this wall.

[1]
a.

Find the probability that only one of Ayako and Natsuko falls into a trap while attempting to pass through a door in the first wall.

[3]
b.

Copy the probability tree diagram and write down the relevant probabilities along the branches.

[3]
c.

A contestant is chosen at random. Find the probability that this contestant fell into a trap while attempting to pass through a door in the second wall.

[2]
d.i.

A contestant is chosen at random. Find the probability that this contestant fell into a trap.

[3]
d.ii.

120 contestants attempted this game.

Find the expected number of contestants who fell into a trap while attempting to pass through a door in the third wall.

[3]
e.
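
The tree-diagram probabilities can be verified with exact fractions. This sketch assumes doors are chosen uniformly at random and that contestants play independently, as the question describes; the list and variable names are illustrative.

```python
from fractions import Fraction

# Probability of avoiding the trap at walls 1, 2 and 3:
# 3 safe doors of 4, 3 of 5, and 3 of 6.
avoid = [Fraction(3, 4), Fraction(3, 5), Fraction(3, 6)]
trap = [1 - a for a in avoid]

# (b) Exactly one of two contestants falls at the first wall.
p_only_one = 2 * avoid[0] * trap[0]

# (d)(i) Falls at the second wall: must pass wall 1 first.
p_trap_wall2 = avoid[0] * trap[1]

# (d)(ii) Falls somewhere = 1 - P(passes all three walls).
p_trap_any = 1 - avoid[0] * avoid[1] * avoid[2]

# (e) Expected number out of 120 who fall at the third wall.
expected_wall3 = 120 * avoid[0] * avoid[1] * trap[2]

print(p_only_one, p_trap_wall2, p_trap_any, expected_wall3)
```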



A manufacturer produces 1500 boxes of breakfast cereal every day.

The weights of these boxes are normally distributed with a mean of 502 grams and a standard deviation of 2 grams.

All boxes of cereal with a weight between 497.5 grams and 505 grams are sold. The manufacturer’s income from the sale of each box of cereal is $2.00.

The manufacturer recycles any box of cereal with a weight not between 497.5 grams and 505 grams. The manufacturer’s recycling cost is $0.16 per box.

A different manufacturer produces boxes of cereal with weights that are normally distributed with a mean of 350 grams and a standard deviation of 1.8 grams.

This manufacturer sells all boxes of cereal that are above a minimum weight, w.

They sell 97% of the cereal boxes produced.

Draw a diagram that shows this information.

[2]
a.

(i)     Find the probability that a box of cereal, chosen at random, is sold.

(ii)     Calculate the manufacturer’s expected daily income from these sales.

[4]
b.

Calculate the manufacturer’s expected daily recycling cost.

[2]
c.

Calculate the value of w .

[3]
d.
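
Parts (b) to (d) can be checked as follows. A minimal sketch using the two normal models stated in the question; the variable names are illustrative.

```python
from statistics import NormalDist

# First manufacturer: weights ~ N(502, 2^2), 1500 boxes per day.
first = NormalDist(mu=502, sigma=2)

# (b)(i) A box is sold if 497.5 < W < 505.
p_sold = first.cdf(505) - first.cdf(497.5)

# (b)(ii) Expected daily income at $2.00 per sold box.
income = 1500 * p_sold * 2.00

# (c) Expected daily recycling cost at $0.16 per rejected box.
recycling = 1500 * (1 - p_sold) * 0.16

# (d) Second manufacturer: P(W > w) = 0.97, so w is the 3rd percentile.
second = NormalDist(mu=350, sigma=1.8)
w = second.inv_cdf(1 - 0.97)

print(round(p_sold, 3), round(income), round(recycling, 2), round(w, 1))
```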



Jim writes a computer program to generate 500 values of a variable Z. He obtains the following table from his results.

In this situation, state briefly what is meant by

Use a chi-squared goodness of fit test to investigate whether or not, at the 5 % level of significance, the N(0, 1) distribution can be used to model these results.

[12]
a.

a Type I error.

[2]
b.i.

a Type II error.

[2]
b.ii.



Casanova restaurant offers a set menu where a customer chooses one of the following meals: pasta, fish or shrimp.

The manager surveyed 150 customers and recorded the customer’s age and chosen meal. The data is shown in the following table.

A χ² test was performed at the 10% significance level. The critical value for this test is 4.605.

Write down

A customer is selected at random.

State H0, the null hypothesis for this test.

[1]
a.

Write down the number of degrees of freedom.

[1]
b.

Show that the expected number of children who chose shrimp is 31, correct to two significant figures.

[2]
c.

the χ² statistic.

[2]
d.i.

the p-value.

[1]
d.ii.

State the conclusion for this test. Give a reason for your answer.

[2]
e.

Calculate the probability that the customer is an adult.

[2]
f.i.

Calculate the probability that the customer is an adult or that the customer chose shrimp.

[2]
f.ii.

Given that the customer is a child, calculate the probability that they chose pasta or fish.

[2]
f.iii.



A nationwide study on reaction time is conducted on participants in two age groups. The participants in Group X are less than 40 years old. Their reaction times are normally distributed with mean 0.489 seconds and standard deviation 0.07 seconds.

The participants in Group Y are 40 years or older. Their reaction times are normally distributed with mean 0.592 seconds and standard deviation σ seconds.

In the study, 38 % of the participants are in Group X.

A randomly selected participant has a reaction time greater than 0.65 seconds. Find the probability that the participant is in Group X.

[6]
c.

Ten of the participants with reaction times greater than 0.65 are selected at random. Find the probability that at least two of them are in Group X.

[3]
d.



At Penna Airport the probability, P(A), that all passengers arrive on time for a flight is 0.70. The probability, P(D), that a flight departs on time is 0.85. The probability that all passengers arrive on time for a flight and it departs on time is 0.65.

The number of hours that pilots fly per week is normally distributed with a mean of 25 hours and a standard deviation σ. 90 % of pilots fly less than 28 hours in a week.

Show that event A and event D are not independent.

[2]
a.

Find P ( A D ) .

[2]
b.i.

 Given that all passengers for a flight arrive on time, find the probability that the flight does not depart on time.

[3]
b.ii.

Find the value of σ .

[3]
c.

All flights have two pilots. Find the percentage of flights where both pilots flew more than 30 hours last week.

[4]
d.
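
Parts (a), (b)(ii), (c) and (d) can be checked numerically. This sketch assumes the two pilots' weekly hours are independent in part (d); the names are illustrative.

```python
from statistics import NormalDist

p_a, p_d, p_a_and_d = 0.70, 0.85, 0.65

# (a) Independence would require P(A) * P(D) = P(A and D): 0.595 vs 0.65.
independent = abs(p_a * p_d - p_a_and_d) < 1e-9

# (b)(ii) P(not D | A) = (P(A) - P(A and D)) / P(A)
p_not_d_given_a = (p_a - p_a_and_d) / p_a

# (c) 90% fly less than 28 hours: (28 - 25) / sigma is the 90th percentile of N(0, 1).
sigma = (28 - 25) / NormalDist().inv_cdf(0.90)

# (d) Both pilots over 30 hours, as a percentage of flights.
hours = NormalDist(mu=25, sigma=sigma)
p_over_30 = 1 - hours.cdf(30)
percent_both = 100 * p_over_30**2

print(independent, round(p_not_d_given_a, 4), round(sigma, 2), round(percent_both, 4))
```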



The following table shows the average body weight, x, and the average weight of the brain, y, of seven species of mammal. Both are measured in kilograms (kg).

M17/5/MATSD/SP2/ENG/TZ1/01

The average body weight of grey wolves is 36 kg.

In fact, the average weight of the brain of grey wolves is 0.120 kg.

Find the range of the average body weights for these seven species of mammal.

[2]
a.

For the data from these seven species, calculate r, the Pearson’s product–moment correlation coefficient;

[2]
b.i.

For the data from these seven species describe the correlation between the average body weight and the average weight of the brain.

[2]
b.ii.

Write down the equation of the regression line y on x, in the form y = mx + c.

[2]
c.

Use your regression line to estimate the average weight of the brain of grey wolves.

[2]
d.

Find the percentage error in your estimate in part (d).

[2]
e.



A biased four-sided die is rolled. The following table gives the probability of each score.

Find the value of k.

[2]
a.

Calculate the expected value of the score.

[2]
b.

The die is rolled 80 times. On how many rolls would you expect to obtain a three?

[2]
c.



The Malvern Aquatic Center hosted a 3 metre spring board diving event. The judges, Stan and Minsun, awarded 8 competitors a score out of 10. The raw data is collated in the following table.

The Commissioner for the event would like to find the Spearman’s rank correlation coefficient.

Write down the value of the Pearson’s product–moment correlation coefficient, r .

[2]
a.i.

Using the value of r , interpret the relationship between Stan’s score and Minsun’s score.

[2]
a.ii.

Write down the equation of the regression line y on x .

[2]
b.

Use your regression equation from part (b) to estimate Minsun’s score when Stan awards a perfect 10.

[2]
c.i.

State whether this estimate is reliable. Justify your answer.

[2]
c.ii.

Copy and complete the information in the following table.

[2]
d.

Find the value of the Spearman’s rank correlation coefficient, r_s.

[2]
e.i.

Comment on the result obtained for r_s.

[2]
e.ii.

The Commissioner believes Minsun’s score for competitor G is too high and so decreases the score from 9.5 to 9.1.

Explain why the value of the Spearman’s rank correlation coefficient r_s does not change.

[1]
f.



Lucy sells hot chocolate drinks at her snack bar and has noticed that she sells more hot chocolates on cooler days. On six different days, she records the maximum daily temperature, T, measured in degrees centigrade, and the number of hot chocolates sold, H. The results are shown in the following table.

The relationship between H and T can be modelled by the regression line with equation H=aT+b.

Find the value of a and of b.

[3]
a.i.

Write down the correlation coefficient.

[1]
a.ii.

Using the regression equation, estimate the number of hot chocolates that Lucy will sell on a day when the maximum temperature is 12°C.

[2]
b.



The aircraft for a particular flight has 72 seats. The airline’s records show that historically for this flight only 90% of the people who purchase a ticket arrive to board the flight. They assume this trend will continue and decide to sell extra tickets and hope that no more than 72 passengers will arrive.

The number of passengers that arrive to board this flight is assumed to follow a binomial distribution with a probability of 0.9.

Each passenger pays $150 for a ticket. If too many passengers arrive, then the airline will give $300 in compensation to each passenger that cannot board.

The airline sells 74 tickets for this flight. Find the probability that more than 72 passengers arrive to board the flight.

[3]
a.

Write down the expected number of passengers who will arrive to board the flight if 72 tickets are sold.

[2]
b.i.

Find the maximum number of tickets that could be sold if the expected number of passengers who arrive to board the flight must be less than or equal to 72.

[2]
b.ii.

Find, to the nearest integer, the expected increase or decrease in the money made by the airline if they decide to sell 74 tickets rather than 72.

[8]
c.
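
The overbooking calculation can be sketched with exact fractions. This is an illustrative model only: it assumes ticket revenue is kept whether or not a passenger arrives and that compensation is the only extra cost, and the names are hypothetical.

```python
from fractions import Fraction
from math import comb, floor

P_ARRIVE = Fraction(9, 10)
SEATS = 72


def pmf(k, n):
    """P(exactly k of n ticket holders arrive), arrivals ~ B(n, 0.9)."""
    return comb(n, k) * P_ARRIVE**k * (1 - P_ARRIVE) ** (n - k)


# (a) 74 tickets sold: more than 72 arrive means 73 or 74 arrive.
p_overbooked = pmf(73, 74) + pmf(74, 74)

# (b)(i) Expected arrivals with 72 tickets; (b)(ii) largest n with 0.9 n <= 72.
expected_arrivals_72 = 72 * P_ARRIVE
max_tickets = floor(SEATS / P_ARRIVE)

# (c) Extra ticket income minus expected compensation of $300 per bumped passenger.
extra_revenue = (74 - 72) * 150
expected_compensation = 300 * sum(
    (k - SEATS) * pmf(k, 74) for k in range(SEATS + 1, 75)
)
expected_change = extra_revenue - expected_compensation

print(float(p_overbooked), float(expected_arrivals_72), max_tickets)
print(round(expected_change))
```

Using `Fraction` avoids floating-point edge cases in the `floor` step for part (b)(ii).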



The mass M of apples in grams is normally distributed with mean μ. The following table shows probabilities for values of M.

The apples are packed in bags of ten.

Any apples with a mass less than 95 g are classified as small.

Write down the value of k.

[2]
a.i.

Show that μ = 106.

[2]
a.ii.

Find P(M < 95) .

[5]
b.

Find the probability that a bag of apples selected at random contains at most one small apple.

[3]
c.

Find the expected number of bags in this crate that contain at most one small apple.

[3]
d.i.

Find the probability that at least 48 bags in this crate contain at most one small apple.

[2]
d.ii.



Consider the function f(x) = x²e^(3x), x ∈ ℝ.

Find f′(x).

[4]
a.

The graph of f has a horizontal tangent line at x = 0 and at x = a. Find a.

[2]
b.



The weights, W, of newborn babies in Australia are normally distributed with a mean 3.41 kg and standard deviation 0.57 kg. A newborn baby has a low birth weight if it weighs less than w kg.

Given that 5.3% of newborn babies have a low birth weight, find w.

[3]
a.

A newborn baby has a low birth weight.

Find the probability that the baby weighs at least 2.15 kg.

[3]
b.
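
Both parts can be checked with the stated normal model. A minimal sketch with illustrative names; part (b) is the conditional probability P(W ≥ 2.15 | W < w).

```python
from statistics import NormalDist

weights = NormalDist(mu=3.41, sigma=0.57)

# (a) 5.3% of babies weigh less than w, so w is the 5.3rd percentile.
w = weights.inv_cdf(0.053)

# (b) P(W >= 2.15 | W < w) = P(2.15 <= W < w) / P(W < w)
p_at_least_2_15_given_low = (0.053 - weights.cdf(2.15)) / 0.053

print(round(w, 2), round(p_at_least_2_15_given_low, 3))
```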



A discrete random variable X has the following probability distribution.

Find an expression for q in terms of p.

[2]
a.

Find the value of p which gives the largest value of E(X).

[3]
b.i.

Hence, find the largest value of E(X).

[1]
b.ii.



The following diagram shows the graph of f(x) = a sin bx + c, for 0 ≤ x ≤ 12.

N16/5/MATME/SP2/ENG/TZ0/10

The graph of f has a minimum point at (3, 5) and a maximum point at (9, 17).

The graph of g is obtained from the graph of f by a translation of (k, 0). The maximum point on the graph of g has coordinates (11.5, 17).

The graph of g changes from concave-up to concave-down when x = w.

(i)     Find the value of c .

(ii)     Show that b = π/6.

(iii)     Find the value of a .

[6]
a.

(i)     Write down the value of k .

(ii)     Find g ( x ) .

[3]
b.

(i)     Find w .

(ii)     Hence or otherwise, find the maximum positive rate of change of g .

[6]
c.



A jar contains 5 red discs, 10 blue discs and m green discs. A disc is selected at random and replaced. This process is performed four times.

Write down the probability that the first disc selected is red.

[1]
a.

Let X be the number of red discs selected. Find the smallest value of m for which Var(X) < 0.6.

[5]
b.
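
Part (b) can be checked by a direct search. This sketch assumes X ~ B(4, p) with p = 5/(15 + m), since the disc is replaced after each selection; the function name is illustrative.

```python
from fractions import Fraction


def variance(m):
    """Var(X) = 4 p (1 - p) for X ~ B(4, p) with p = 5 / (15 + m)."""
    p = Fraction(5, 15 + m)
    return 4 * p * (1 - p)


# p is at most 1/3, so the variance decreases as m grows and the
# first m that satisfies the inequality is the smallest one.
m = 0
while variance(m) >= Fraction(6, 10):
    m += 1
smallest_m = m

print(smallest_m, float(variance(smallest_m)))
```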