SD Sample Exam Solutions

In April 2016, VCAA released some sample exam questions for the 2016-2019 Software Development study design. Note that – unlike Informatics – the format of the exam is unchanged from previous years.

Here are some sample answers. They are neither official, nor guaranteed.

To save time and effort, I have not reproduced the questions. Use the link above to download them.

Legend- wait for it…

  • Explanatory notes will be in blue.
  • My editorial comments that are not part of the answer will be shown in red.

Section A (multiple choice)

A1 – You can’t test something that has not been developed.

A2 – I’m not sure why the “multi-column, multi-row” was repeated for every option. Why include invariant information? And since when has “multi-” been an acceptable, formal term in examinations? I’d expect “multiple”.

A3 – An interesting question. XML provides structure to a data file, but does not describe its formatting. Don’t confuse XML with HTML which does describe formatting.

A4 – Both linear and binary searching are listed in the study design (U3O1 KK07) so they are examinable.  The “selection” and “quick” options refer to sorting, not searching. Be careful with the difference between searching and sorting – many students confuse the two ideas.
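To make the two searches concrete, here is a minimal Python sketch. The function names and the sample list are my own illustrative assumptions, not anything from the exam. Note that binary search only works on data that has already been sorted, which is one reason searching and sorting get mixed up.

```python
def linear_search(items, target):
    """Check each item in turn; works on unsorted data."""
    for index, item in enumerate(items):
        if item == target:
            return index
    return -1  # not found


def binary_search(sorted_items, target):
    """Repeatedly halve the search range; the data MUST already be sorted."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            low = mid + 1    # target can only be in the upper half
        else:
            high = mid - 1   # target can only be in the lower half
    return -1  # not found


# Hypothetical sample data (already sorted, so both searches can use it)
data = [7, 10, 11, 14, 15, 23, 27, 69]
print(linear_search(data, 23))
print(binary_search(data, 23))
```

Both functions return the position of the target, or -1 if it is absent.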

A lot of reading for a multiple choice question!

A5 – Know your basic project management terms. It’s unclear how deep you need to go, for example whether you need to understand terms like “lead time” or “lag time”, but milestones and dependencies are listed in the study design (U3O2 KK14) so they are compulsory. It is wise, however, to know what critical paths and resources are.

A6 – The legislation theory is dry and boring, but you need to know it.

A7 – Goals are long-term, rather vague, idealistic targets. Objectives are specific, measurable achievements that can be seen to be reached or not. For example, “Good customer service” is a goal. It’s what is aimed at in general, but it’s hard to prove whether, or how well, it has been accomplished.

Each goal should be specifically defined with objectives that can be quantified (measured). For example “All customer enquiries should be responded to within 24 hours”. If all of the objectives related to a goal are achieved, it indicates that the goal has also been achieved.

Objectives can often be identified by the use of numerical targets, such as “80% reduction”. This can be measured and judged, whereas a goal such as “reliability” is too vague to measure, but is a valuable long-term target.

The fact that the objective relates to a system (the software solution) makes it an information systems objective. It is relevant to only that system, not the organisation that produces it.

An organisational goal or objective relates to the entire organisation that produces or uses systems. For example, “Good customer service” relates to every person, department and system within the organisation, and is sought every day with every action of the whole organisation.

System goals and objectives relate to individual systems within the organisation, such as its payroll system, communications, stock management, security, customer service.

Logically, each system goal and objective should aim to achieve a larger, broader organisational goal. So, for example, if an organisational goal were “Good customer service”, the communications system would have a goal of “Contacting customers promptly” and a related objective of “Responding to all customer enquiries within 24 hours.” Likewise, the billing system would have a goal such as “Maintaining accurate billing information” and an objective such as “No more than 1% errors in customer billing per month.” 

A8 – Mobile devices need to use wireless. Otherwise they could not be mobile.

A9 – Efficiency criteria relate to time (speed), cost (of running, maintaining, upgrading, consumables, etc), and labour (person-hours needed). Every other criterion you can think of is related to effectiveness – including ease of use in this new study design. That’s all you have to remember.

A10 – An interesting question. What does “data integrity” include? Factors that make data reliable and trustworthy. Not effectiveness, so rule out options A and D. Option B’s security is tempting – we all want our information to be securely stored – but it’s not a quality of the information itself. The items in option C are all qualities of good information.

Some of these questions are really tricky!

A11 – I don’t like this at all. Phishing is misrepresenting oneself as a reliable person. This scenario does not sound like that at all. And students could argue that a worm or trojan from the website could have created the pop-up screen. No. Don’t like this question.

A12 – Even if you didn’t know how a selection sort works (which you should, BTW), it’s pretty clear the other options would not create that output.

Selection sort is inefficient when sorting large lists, but is relatively simple to implement, like bubble sort.

Selection sort method: the data to be sorted are divided into two parts: the data items that have already been sorted and moved to the start of the list; and the data items that have not yet been sorted.

The selection sort then finds the smallest (or largest, if sorting in descending order) data item in the unsorted items.  It swaps that item with the leftmost item in the unsorted list. It then moves the dividing line between sorted/unsorted items 1 place to the right.

An example, using the data from the practice exam.

Shaded cells are the unsorted items.
In the original data, no items are sorted so the dividing line is at position zero – all items are in the unsorted sub-list.

Remember: the question asks about the state of the data after pass 2.

Unsorted       14  7 69 27 15 23 11 10
After pass 1    7 14 69 27 15 23 11 10
After pass 2    7 10 69 27 15 23 11 14
After pass 3    7 10 11 27 15 23 69 14
After pass 4    7 10 11 14 15 23 69 27
After pass 5    7 10 11 14 15 23 69 27
After pass 6    7 10 11 14 15 23 69 27
After pass 7    7 10 11 14 15 23 27 69

Note that after pass 7 we can finish, since the last number must be the maximum and will always stay in place.

Selection sort’s number of passes required  = number of values to sort – 1
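The method described above can be sketched in Python as follows. This is only an illustrative version (the function name is my own, and real exam answers would be in pseudocode), but it records the list after every pass so you can check the state after pass 2, which is what the question asks about. It makes n – 1 passes, as the formula says.

```python
def selection_sort_passes(values):
    """Return a list holding the state of the data after each pass."""
    data = list(values)                      # work on a copy
    states = []
    for boundary in range(len(data) - 1):    # n - 1 passes
        # Find the position of the smallest item in the unsorted part
        smallest = boundary
        for i in range(boundary + 1, len(data)):
            if data[i] < data[smallest]:
                smallest = i
        # Swap it with the leftmost unsorted item, then the dividing
        # line moves one place right on the next loop
        data[boundary], data[smallest] = data[smallest], data[boundary]
        states.append(list(data))
    return states


passes = selection_sort_passes([14, 7, 69, 27, 15, 23, 11, 10])
for number, state in enumerate(passes, start=1):
    print("After pass", number, state)
```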

Try it yourself with this test data:

Original data 6 2 9 1 7
After pass 1
After pass 2
After pass 3
After pass 4

Get the selection sort slideshow

A13 – A control structure controls the flow of execution. The WHILE loop forces lines to be executed in a particular iterative order.

A14 – Security (stopping unauthorised use by employees), interoperability (between 2 systems), marketability (to tempt the public).

Yes, the last two terms are pretty ugly, but they are listed in the SD study design (U3O2 KK11), so examiners can use them.

ESL kids will want to create their own personal glossaries of odd study design terms, with definitions and examples that mean something to them.

A15 – Validation checks input for reasonableness in terms of range, data type, and existence. Not to be confused with testing that checks output for accuracy.
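As a rough illustration of the three kinds of check, here is a small Python sketch. The function name and the 0 to 100 range are assumptions chosen for the example, not taken from the exam.

```python
def validate_score(raw_text):
    """Return a list of validation problems (an empty list means the input is valid)."""
    # Existence check: was anything entered at all?
    if raw_text is None or raw_text.strip() == "":
        return ["existence: no value entered"]
    # Type check: is it a whole number?
    try:
        score = int(raw_text)
    except ValueError:
        return ["type: not a whole number"]
    # Range check: is the value reasonable? (0 to 100 is an assumed range)
    if not 0 <= score <= 100:
        return ["range: must be between 0 and 100"]
    return []


print(validate_score(""))
print(validate_score("abc"))
print(validate_score("150"))
print(validate_score("75"))
```

Notice that validation only decides whether the input is reasonable. It cannot tell whether a reasonable-looking value such as 75 is actually the correct one.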

A16 – The lookup value (‘Amateur Skater 10’) is found in the associative array’s key column. The associated data (‘Tony’) is returned. For those who know VLOOKUP in a spreadsheet, that’s really an example of an associative array.
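In Python, an associative array is a dictionary. In this tiny sketch, only the ‘Amateur Skater 10’ and ‘Tony’ pair comes from the question. The variable name and the second entry are invented for illustration.

```python
# Hypothetical rental data: look up a game title, get back the renter
renters = {
    "Amateur Skater 10": "Tony",
    "Example Game A": "Example Renter",   # invented entry, not from the exam
}

print(renters["Amateur Skater 10"])              # look up by the game title
print(renters.get("Missing Game", "not found"))  # safe lookup with a default
```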

Golly. Do video game rental stores still exist in 2016?

A17 – This is just wrong. Selection sort is a pretty dumb and (usually) slower algorithm compared with quick sort! The answer should be C.

Visit Wikipedia and read how it says that,  “Selection sort is an in-place comparison sort. It has O(n^2) complexity, making it inefficient on large lists, and generally performs worse than the similar insertion sort. Selection sort is noted for its simplicity, and also has performance advantages over more complicated algorithms in certain situations.”
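To get a feel for what O(n^2) means in practice, this rough Python sketch compares the number of comparisons selection sort always makes, n(n – 1)/2, with n × log2(n), which is only an order-of-magnitude guide to a typical quick sort and not an exact count. The function names are my own.

```python
import math


def selection_sort_comparisons(n):
    """Selection sort makes n(n-1)/2 comparisons, whatever order the data is in."""
    return n * (n - 1) // 2


def n_log_n(n):
    """Rough order-of-magnitude figure for a typical quick sort (not an exact count)."""
    return round(n * math.log2(n))


for n in (8, 1_000, 100_000):
    print(n, selection_sort_comparisons(n), n_log_n(n))
```

For the 8 items in A12 the two figures are close, but the gap grows very quickly as the list gets longer.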

A18 – How simple is that? A wardrobe could have answered that.

A19 – I don’t like the wording of this. It’s very confusing, especially for ESL students. It would be clearer to have said:
A. Accessing the intranet via a virtual private network.
B. Accessing the internet via a virtual private network.

And ‘via’ is also probably a problem for some kids.

I realise the sample exam was probably not checked as rigorously as a real exam, but VCAA still needs to be careful.

A20 –  Another tricky question.

Section B (short answer)


Mechanics benefit by being able to access a car’s service history records regardless of where it was bought or previously serviced. This gives them information about what they may need to pay attention to during a service.

The car owners benefit because they can take their car to any dealership and their mechanics will immediately have a full history of the car before it is serviced. 

This sounds dodgy to me. Since when do mechanics benefit from a car’s mechanical history? They are not doctors looking at a patient’s medical history.


If the trustworthiness of the data in the central database is uncertain, no user can be sure that they can use it confidently. All decisions that are made based on that data will be tentative and uncertain. Users’ confidence in the MMS will suffer, leading to even fewer updates, and increased improper usage.


   If Student in Student Details File Then
      If Competition_Score >= 50 Then
         If Competition_Score >= 95 Then
            Award equals “High Distinction”
         Else
            Award equals “Pass”
         End If
      End If
      Print “Certificate” Award
   End If

The examiners would have to accept any sensible and logical pseudocode that students invented. The only non-negotiable convention is (apparently) the equals symbol for assignment. The rest of SD pseudocode bears a spooky resemblance to good old QuickBASIC 4.5.
Students could also use alternative tests, such as >94 instead of >=95 as long as they produce the required logical results.

I’m not sure if students would be penalised for inefficient (but effective) code, such as using
Elseif Competition_Score >=50 
instead of the simpler, optimised, and equally-effective “Else“.
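For anyone who wants to test the logic on a computer, here is one possible Python rendering of the same decisions. The function and parameter names are my own, and returning None when no award is earned is an assumption, because what happens below 50 is not covered here.

```python
def award_for(student, student_details, competition_score):
    """Return the award earned, or None if there is none (assumed behaviour)."""
    award = None
    if student in student_details:
        if competition_score >= 50:
            if competition_score >= 95:
                award = "High Distinction"
            else:   # a plain else is enough: the score is already known to be >= 50
                award = "Pass"
    return award


# Hypothetical test data around the boundary values
students = {"Alex", "Sam"}
for score in (49, 50, 94, 95):
    print(score, award_for("Alex", students, score))
```

Testing the values on either side of each boundary (49 and 50, 94 and 95) is the quickest way to confirm that an alternative test such as >94 gives the same results for whole-number scores.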


3.5GB software package – DVD-ROM.

3.5 < 4.7. Hardly a brain-bender.

30,000,000 records of 20KB each – 1TB external hard drive

My fear of maths leaps at me like a digital dementor.
Arghh! Memories of year 10 maths, stand back!

30,000,000 records * 20,000 bytes (taking a KB as roughly 1,000 bytes; it depends on whether the question is using kilobytes or those damned stupid kibibytes) = 600,000,000,000 bytes = 600,000 MB = 600 GB, which fits into 1 TB.
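If you don’t trust your own arithmetic either, a few lines of Python will check it both ways. The constant names are mine, and I am assuming the drive’s 1 TB means 1,000,000,000,000 bytes, which is how drive makers usually count.

```python
RECORDS = 30_000_000
RECORD_SIZE_KB = 20
DRIVE_BYTES = 10**12   # 1 TB, assuming the decimal definition


def total_bytes(bytes_per_kb):
    """Total storage needed for all records, given the size of one 'KB'."""
    return RECORDS * RECORD_SIZE_KB * bytes_per_kb


decimal_total = total_bytes(1000)   # kilobytes of 1000 bytes
binary_total = total_bytes(1024)    # kibibytes of 1024 bytes

for label, total in (("1 KB = 1000 bytes", decimal_total),
                     ("1 KB = 1024 bytes", binary_total)):
    verdict = "fits" if total <= DRIVE_BYTES else "does not fit"
    print(label, total / 10**9, "GB", verdict)
```

Either way the data fits on the 1 TB drive, so the kilobyte/kibibyte argument does not change the answer.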

Phew. Got through that in the end. Have I ever told you guys that I bought one of the first calculators in the country in about 1973? It was the size of a brick, needed mains power, and only had four functions. Didn’t help my maths marks one single bit.


First, I must point out that tapes do not cost $2,300. An 800GB tape is closer to $85. The question might have meant to say that the tape drive costs $2,300 but that is not what it said!

There are several possible factors that are relevant. Choose any two of these, or make up your own.

  • Tape backups are portable and are always in the complete control of the data owner. Cloud backups are entrusted to total strangers who may or may not be competent in protecting one’s data. On the other hand, responsible and capable cloud operators should be equipped and be skilled enough to carry out excellent backups.
  • Any data stored online must be considered vulnerable to unauthorised access, use, damage or theft, regardless of the fame, size and power of the cloud host. Most online services have been hacked at least once.
  • It is the data owner’s responsibility to carry out regular backups, test the backup procedure occasionally and take the tape backups safely offsite at the end of each day. Cloud backups  are always stored offsite, and are automated. Less time and effort is required of the owner of the data.
  • Tapes and drives will stretch or wear out over time. Degraded tapes will eventually become unreliable, then fail, and data may be lost.
  • Data on magnetic tapes may be accidentally wiped if they are exposed to strong electrical or magnetic fields, such as those from electric motors.
  • While there are costs associated with buying new tapes occasionally, the cloud backups are expensive. 5GB (about 5000MB) means 50 * 100MB, at $20 per month each = $1,000 per month. That adds up to $12,000 a year for backups – not a small amount! The tape backup would be far cheaper.
  • Documents often cannot be copied (or backed up)  when they are in use, so tape backups would need to be set up in advance and scheduled to take place at night when the server is idle. Off-peak backups also prevent the backup from slowing down the network or preventing the opening of documents during work hours.
  • Cloud backups would be very slow compared to tape. At 160MBps, 5GB would be backed up in 31 seconds! The upload speed of Jake’s internet connection is not specified, but assuming a modest 1Mbps upload speed, it would take 11.11 hours to upload 5GB.
    (1Mbps = 125KB/s. 5GB = 5,000,000KB.  5,000,000 / 125 = 40,000 seconds = 11.11 hours).
    At that speed, a nightly backup may not be finished before work resumed the next day!
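The cost and speed figures in the last few points can be checked with a short Python sketch. The constant names are my own. The $20 per 100MB per month price is inferred from the working shown above, and the 1Mbps upload speed is the same assumption made in the text.

```python
DATA_MB = 5_000                 # 5GB, taken as about 5000MB
CLOUD_PRICE_PER_100MB = 20      # dollars per month (inferred from the working above)
TAPE_SPEED_MB_PER_S = 160       # tape transfer rate in megabytes per second
UPLOAD_MBPS = 1                 # assumed upload speed in megabits per second

# Cloud cost
monthly_cost = DATA_MB / 100 * CLOUD_PRICE_PER_100MB
yearly_cost = monthly_cost * 12

# Tape backup time
tape_seconds = DATA_MB / TAPE_SPEED_MB_PER_S

# Cloud upload time: 8 bits per byte, so 1 Mbps = 125 KB per second
upload_kb_per_s = UPLOAD_MBPS * 1000 / 8
upload_seconds = DATA_MB * 1000 / upload_kb_per_s
upload_hours = upload_seconds / 3600

print("Cloud cost per month:", monthly_cost)
print("Cloud cost per year:", yearly_cost)
print("Tape backup seconds:", tape_seconds)
print("Cloud upload hours:", round(upload_hours, 2))
```

The bits-versus-bytes conversion is the step most likely to trip people up: network speeds are quoted in bits per second, storage sizes in bytes.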


The role of the algorithm is to calculate and display the average (mean) temperature of seven days.

What? Is that it? A student would not need months of SD programming experience to work that one out. The ‘Display’ line even announces what it has calculated! Even a Geography student could have earned 2 marks!


  • Privacy Act 1988 – requires data holders to take reasonable steps to safeguard personal data against damage or loss.
  • Spam Act 2003 – prevents organisations from sending unsolicited emails, and requires commercial emails to (1) identify the sender, and (2) provide a functional unsubscribe facility.


  • Annotations of the Gantt chart.
  • Logs.
  • Adjustments to tasks and timeframes.

These are straight out of the study design.


  • Failure to keep the plan updated when circumstances change, for example when tasks run overtime.
  • Failure to include all necessary tasks in the plan.
  • Inadequate resourcing of the project.

Section C (case study)

First, I must point out that this case study is identical to the 2011 SD exam, except for a tiny edit and one typo fix.  The questions, fortunately, were new.

Also, it’s a shame the diagram of the CDU was messed up. The arrows are pointing to the wrong places. This is how it should have looked (from the 2011 SD exam)




Gantt chart

C1b.  See red arrows above. I can only assume that both arrows are necessary for the one mark, strictly speaking.

We didn’t get much information at all to help decide on how to do the Gantt chart. Obviously to fit 25+20 days of tasks into 40 days requires some concurrency. The first task can’t be concurrent, since programming the computer can’t logically begin until the computer has been bought and set up.

After C1a the exam tells us that testing is dependent on the programming tasks! One should have assumed that anyway, but it’s a bit late to be told key information like that after the Gantt chart is drawn!

The thing that bothers me is that as far as I can tell, Kirsten is the sole programmer on the project. How does she work full-time on two programming tasks simultaneously? It makes the whole point of the Gantt chart pretty much useless!


Timeliness – the system must report overstayers in time for parking officers to get to the car and give it a ticket.

Cost – the commercial software was said to be too expensive, which is why the PIMS is being created.

Compatibility – between the phone software and the parking officers’ phones; between the council network and the fines payment system.

Tip – I was tempted to add ‘security’ as a constraint, because obviously the council would not want hackers to be monitoring or interfering with the internet transmission of parking data. However, while it’s logical it’s not identified as being a concern in the case study (It is later mentioned in C4 but you should not use later information to answer earlier questions). Do not invent criteria that are not explicitly stated or strongly implied by the case study – even if they do seem logical. All section C answers must relate to the case study given, and not to general cases.



B – Parking officer’s phone


D – Car details

E – Parking ticket information

F – Statistics about system operation


A firewall prevents unauthorised access to a LAN. It can also filter out unwanted data packets. 

Don’t say ‘virus prevention’. Firewalls cannot detect malware.


Advantage – access logs keep a record of who accessed data, which helps an investigation find the people responsible for data damage, theft, or improper use.

Disadvantage – logs increase system workload and storage activity. They also consume storage space.


Car details, parking ticket information.


This is a weird one. The stem of the question solely refers to “storing important information”, not communicating it. Yet this question seems to be fishing for an answer like HTTPS, SSL, TLS etc. But these are not relevant to storing information; as far as I know there are no storage protocols – except maybe for Storage Area Networks (SANs).

Well, let’s assume the stem of the question is wrong, and say…



Input of the PIMS: Parking ticket details

Origin: Traffic officer’s mobile phone

Output of the PIMS: Operational statistics

Destination: City engineer

Didn’t we already do this in C3?

Why are we doing it again here?


Option A would seem better.

It’s complete in that it has all the necessary fields on the same screen – criterion 1.

Related fields are visually grouped neatly in frames (criterion 2).


Data Type    Variable name         Description of data
Boolean      boolOverstayed        Has the vehicle overstayed? True/false
String       strOverstayMessage    Details of the bay containing the overstaying car
Date/time    stmpArrivalTime       Time/date of car’s arrival at a CDU.


  • Time required to create a traffic ticket.
  • Total cost of ownership per ticket issued.


  • Accuracy of overstay detections.
  • Security of stored and transmitted data.
  • Ease of use of the software.

Remember that ‘ease of use’ is now an effectiveness criterion – where it belongs – and no longer classed as efficiency.

C10. VCAA has admitted that this question should not have appeared because random and serial files are not included in the study design.

Option: Random file.

Explanation: Faster data access will be important since time is an important factor in catching overstaying parkers.

We are shown parking spot #1413 in the case study, so we can assume there are at least this number of spots to monitor, so the size of the data file may be significant. This would be an important factor in choosing random files. If there were few spots, a serial file might have done just as well.


Internal documentation that explains the purpose and operation of code makes it much easier to debug and maintain the code over time.

It does not increase the size of the compiled program or reduce its efficiency, because compilers ignore comments.


The question said to “State a naming convention” and only one line was provided for the answer. That implies that students were expected to name a convention (e.g. CamelCase), even though no such conventions are named in the study design.

What if kids said “CowMethod”? Would markers withhold a mark because they had never heard of the name? By the way, CowMethod does exist because I just made it up. It is a convention whereby variables and objects are named after cows, e.g. Dairybelle, Doris, Bertha. I didn’t say it was a good convention.

Real exam questions – especially near the end of an exam – should not rely on memorising names: they should exercise students’ comprehension and judgement skills.

This question is dumb (because it only needs students to name and not explain) and invalid (because it asks for names that are not mandated in the study design).

CamelCase is a convention in which each word of a multi-word object name begins with a capital letter to make the component words easier to identify and read – since another convention is to not use spaces in object names. For example, TimeOfFirstArrival.

Hungarian Notation is the practice of prefixing objects with an indicator of the object’s type or class so each object identifies its nature in its very name and is treated appropriately, e.g. txtName, lblHeading, boolMarried, sngHeight

(lbl is label, sng is ‘single precision’)


Umm, registration of car, bay number, parking officer’s ID, car make and model?

I don’t get this question. The fields are already listed in the XML; are kids meant to parrot back reg_pl, bay_num, etc ? 

When students are asked to “list three fields” – what does this mean exactly? Create names  for them? e.g. txtRego, numBay, txtOfficerID etc?

But we’ve already assessed field and variable naming a couple of times.

Or are kids meant to fill in some sample data for 3 fields? Sounds like a waste of time.

What KK is this question assessing? The wording of the question is terribly unclear.


This is another question that baffles me. How can one test the ‘purpose’ of software? The purpose of the PIMS software was to control overstaying parkers. This cannot really be tested – it will be evaluated after the system has been in use for a while.

And what does usability testing of ‘location’ even mean?
And how does one test the usability of users? What does that mean?

Dog only knows what this question is asking for.

Completely random responses are:


Create a parking ticket. Did it create a parking ticket? Yay. It has shown it can satisfy its purpose.  (Huh?)


Send the location of an overstaying vehicle to a parking officer’s phone. Did the officer find the right location? Yippee. Location is usable.  (What??)

Equipment functionality

Ask several parking officers to use their phones to enter all the (sample) data needed for some valid tickets. Are the data accurately entered in a reasonable time?


Ask users if they feel usable.

I don’t know. I give up. This question is bad.


Look over months of logs of overstayers’ data, and compare them with the logs of tickets successfully issued. If most of the overstayers were issued tickets because the overstay information has been sent to the officers quickly enough, the system should be considered successful. 

If there were many overstayers who were not issued tickets, the system has not performed well.


Data mining is the identification of an individual in multiple separate data sets and the aggregation of that data to form a picture of the individual’s behaviour.

Yes, data mining is named in the study design (SD U4O2 KK6).


  • Would it be violating the privacy of local residents?
  • Would it be legal to use?
  • How the hell could the data mining software be of any earthly use to the council in the first place?


-ary. Legendary.