Informatics

Key Knowledge

Legend
  • ITI = Informatics.  I tend to call this subject “ITI” out of tradition for the old “ITA”, and because it’s one syllable and 8 letters shorter. Also, I really don’t like the name “Informatics”: it sounds too much like an electronic dishwasher. You may call it what you want.
  • UxOy = Unit x Outcome y
  • KK = key knowledge
  • AOS = area of study
  • SAT = school assessed task (U3O2 + U4O1 combined)
  • SAC = school assessed coursework (U3O1, U4O2)
  • RDBMS = Relational database management system


The Informatics exam

  • Section A – 20 multiple-choice questions = 20 marks
  • Section B – short-answer questions = 30 marks
  • Section C – short-answer and extended-answer questions based on a case study = 50 marks
  • Download VCAA’s exam format and sample questions (PDF).
  • My sample answers to VCAA’s exam format and sample questions.
Other Informatics (ITI) goodies
Remember
  • In a KK, all of the items listed after the word “including” are examinable.
  • Items following “such as” are only examples, and are not directly examinable.
  • All relevant items in the glossary are examinable, depending on the unit being studied. For example, the Spam Act is mentioned in the glossary, but is only named as key knowledge (KK) in SD – not in ITI.

Informatics Software Tools

Unit 3

Software tools that students are required to both study and use in unit 3*

  • U3 Area of Study (AOS) 1 – A relational database management system (RDBMS)
  • Drawing or graphics software

Software tools that students are required to use, but not study in unit 3*

  • U3 AOS 2 – Appropriate tool for documenting project plans
  • Software tools to capture, store, prepare and manipulate data

Unit 4

Software tool that students are required to both study and use in unit 4*

  • U4 AOS 1 – Software tools to manipulate data for creating a multimodal online solution

Software tool that students are required to use, but not study, in unit 4*

  • U4 AOS 1 – Appropriate tool for documenting project plans

* The ‘Advice for Teachers’ slideshow explains that software which is to be “studied and used” has explicit reference made to the relevant software functions in the key knowledge, and hence the skills in using this software are assessable. Software that is said to be just “used” needs to be used by students, but is not part of the key knowledge, and their skills in using the software are not assessed. What is assessed is the knowledge or skills that are demonstrated through the use of the software.

Informatics U3O1

Design a solution, develop it using a relational database management system, and diagrammatically represent how users interact with an online solution when supplying data for a transaction.

ITI U3O1 KK01 – techniques used by organisations to acquire data through their interactive online solutions and reasons for their choice
ppt-icon Websites and data

This KK repeats KK 14 below – reasons why organisations acquire data using online facilities, including

  • 24-hour customer access,
  • improved efficiencies through direct data entry by customers,
  • improvements in effectiveness, and
  • access to global markets, marketing opportunities and ongoing services

ITI U3O1 KK02 – techniques for efficient and effective data collection

ppt-icon Data collection techniques

Also see

ppt-icon Data entry controls for GUI interfaces

ITI U3O1 KK03 – characteristics of data types  

 

ppt-icon Data-Types (link fixed 7 Sep 2016)

Data types are not defined for Informatics. The glossary says

“Data types are the particular forms that an item of data can take including numeric, character and Boolean, and are characterised by the kind of operations that can be performed on it. Depending on the software being used, these fundamental types can be divided into more specific types, for example integer and floating point are numeric types. More sophisticated types can be derived from them, for example a string of characters or a date type and their names may vary, such as text data type versus string data type. “

Common data types include:

  • character (a single letter or symbol)
  • text (or string) (a series of characters)
  • number including subtypes such as integer, floating point, byte, short/long integer, etc.
  • Boolean – logically true or false values. Note: just because an answer can have only two possible answers (like “dead or alive”) does not make it Boolean. Boolean values can only be true or false, such as: “Is dead?” or “Is alive?”
  • date/time/timestamp – stores calendar dates and/or times of day as numbers. A timestamp data type stores both date and time in a single value.

Data types determine the storage requirements and properties of fields. For example, defining field DOB as type ‘Date’ not only lets the RDBMS allocate the exact amount of storage space, but it also notifies the database that it can perform date calculations with that field. If the DOB field had been defined as text, an operation like (DateToday - DOB)/365.25 would not be possible for calculating an age.
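The DOB point above can be sketched in Python, where the same distinction applies: a real date type supports arithmetic directly, while text must be parsed first. (The dates used here are invented for illustration.)

```python
from datetime import date

def age_in_years(dob: date, today: date) -> float:
    # A true Date type lets the software do arithmetic directly,
    # mirroring the (DateToday - DOB)/365.25 calculation above.
    return (today - dob).days / 365.25

# If DOB had been stored as text, it would first need parsing:
dob = date.fromisoformat("2000-03-15")
print(round(age_in_years(dob, date(2018, 3, 15))))  # → 18
```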

ITI U3O1 KK04 – physical and software security controls used by organisations to protect their data

The main controls:

Physical – locked doors/windows, swipe card door keys, biometric readers (e.g. fingerprint scanner), UPS to protect servers from power outages, air conditioning for servers.

Software – usernames/passwords, encryption, two-factor authentication, firewalls, malware scanners, SMART (detects hard disk abnormalities), backup software.

ppt-icon Data Security (link updated 11 Sep 2016)

ITI U3O1 KK05 – purposes and structure of an RDBMS, including comparison with flat file databases

A flat file database only contains one table, and there are no relationships.

A spreadsheet is a flat file database.

ppt-icon Beginning with databases
ppt-icon Purposes of databases
ppt-icon Database Structure, data types, naming

ITI U3O1 KK06 – naming conventions to support efficient use and maintenance of an RDBMS

The study design does not name specific conventions, but here are the accepted norms:

  • Hungarian Notation – prefixing a name with an identification of the object’s type, e.g. strFamilyName, intAge, lblHeading. It avoids misusing objects, such as trying to store decimal places in an integer field, or trying to set text box properties on a label.
  • CamelCase – the use of capital letters at the start of words in a multi-word name. It makes it easier to see the individual words when spaces cannot be used in names, e.g. pageattributessection vs PageAttributesSection.

Please remember that if you are asked to describe a naming convention, don’t just name it. Explain how it works, and – if relevant – why it is useful.

Field and table names (and forms, reports, queries etc) should be

  • self-explanatory
  • not so long that they invite typing errors or get cut off when displayed
  • not so short that their meaning or purpose is lost
  • free of special punctuation (such as foreign characters, which may choke some RDBMSs)
  • free of reserved words that have special meaning to the software

Preferably do not use spaces or underscores in names – use dashes or CamelCase instead. Underscores are bad because they become invisible when the name is used in a hyperlink.

Most databases and programming languages forbid spaces in names since spaces indicate the end of the name.
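Python illustrates the same rule: a space ends a name, so a two-word identifier simply will not parse. A quick sketch:

```python
# A space ends a name, so this assignment cannot be parsed:
try:
    compile('family name = "Smith"', "<example>", "exec")
except SyntaxError:
    print("rejected: space in name")

# CamelCase keeps the individual words visible and parses fine:
compile('familyName = "Smith"', "<example>", "exec")
```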

ppt-icon Object Naming Conventions

ppt-icon File Naming Conventions

ITI U3O1 KK07 – a methodology for creating an RDBMS structure:

  • identifying entities,
  • defining tables and fields to represent entities;
  • defining relationships by identifying primary key and foreign key fields;
  • defining data types and field sizes;
  • normalisation to third level
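The key and relationship steps above can be sketched in SQL, run here through Python’s built-in sqlite3 module. The table and field names are invented for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Student (
    StudentID  INTEGER PRIMARY KEY,                    -- primary key
    FamilyName TEXT NOT NULL
);
CREATE TABLE Loan (
    LoanID     INTEGER PRIMARY KEY,
    StudentID  INTEGER REFERENCES Student(StudentID),  -- foreign key
    DateDue    TEXT                                    -- data type chosen per field
);
""")
con.execute("INSERT INTO Student VALUES (1, 'Nguyen')")
con.execute("INSERT INTO Loan VALUES (10, 1, '2016-08-01')")
row = con.execute("""SELECT FamilyName, DateDue
                     FROM Loan JOIN Student USING (StudentID)""").fetchone()
print(row)  # → ('Nguyen', '2016-08-01')
```

The relationship (one student, many loans) exists because Loan.StudentID holds values of Student’s primary key.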

ppt-icon Database Normalisation
ppt-icon Database Normalisation – example

ppt-icon Database Normalisation forms

Also see

ppt-icon Database Referential Integrity (not assessable, but useful to know)

ITI U3O1 KK08 – design tools for describing data types and the value of entity relationship (ER) diagrams for representing the structure of an RDBMS

ppt-icon Design Tools for Databases

ppt-icon Entity Relationships Diagram (ERD) (updated version for 2016 with Chen, Crow’s feet and cardinality)

At long last, VCAA published its ERD exam conventions. In short, they accept both Chen (the style used in the last study design) and the ‘crowsfeet’ style. Any of the 3 styles may appear in the exam. You should also use one of these styles in your outcomes.

Note how cardinality (e.g. 1:many) is now officially included.

Examples:

Chen showing only top-level elements

[image: 2016-erd1]

Detailed Chen

[image: 2016-erd2]

Crowsfeet notation

[image: 2016-erd3]

ITI U3O1 KK09 – design principles that influence the functionality and appearance of solutions

The study design says in the glossary that:

Design principles are accepted characteristics that contribute to the functionality and appearance of solutions.

Design principles related to functionality are

  • useability, including robustness, flexibility and ease of use, and
  • accessibility, including navigation and error tolerance.

Design principles related to appearance are

  • alignment
  • repetition
  • contrast
  • space
  • balance

ppt-icon Design principles  (new for 2016)

ITI U3O1 KK 10 – design tools for representing solutions

 

ppt-icon Data Dictionary design tool
ppt-icon Entity Relationships Diagram (ERD) (updated version for 2016 with Chen, Crow’s feet and cardinality)
ppt-icon Design Tools for Websites – for example, sitemaps, storyboards, mockups (no tools are mandated in the study design)

ITI U3O1 KK 11 – functions and techniques within an RDBMS to efficiently and effectively validate and manipulate data

Data validation checks the reasonableness of input data.

Validation can be both manual (e.g. proofreading) and electronic (e.g. running a spellchecker).

Mandated types of validation are not listed for Informatics, but they are for SD and you should know them:

  1. existence check – has the data been entered at all, or is it missing? Some fields may be optional (e.g. phone number), but missing data in other fields would make further processing pointless (for example an ID value, or an address for a pizza delivery). Electronic validation can be used in databases to ensure key fields are not left empty.
  2. type check – is data the expected data type, for example number, correctly formatted date, text?
  3. range check –
    • is the data value within acceptable upper and lower limits (e.g. age cannot be negative, age must be between 12 and 18), or
    • is the entered value one of the pre-approved legitimate values in a limited list, for example one of the states of Australia, or male/female/transgender. Warning – do not force a limited-list range check if the list is not actually limited. A list of acceptable titles such as Mr, Mrs, Dr, Rabbi, Princess etc will be monstrously long and bound to omit a title you have never heard of. If people must enter incorrect data because you made a field compulsory and provided an incomplete list of options, you are damaging the integrity of your own database!

Validation cannot and does not check the accuracy of the data. If a person said on a form that they were 18 whereas they were in fact 19, no database could discover the fault. The database could, however, detect that no age was provided or that the answer was “eighteen”.
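A minimal sketch of the three checks in Python. The 12–18 limits come from the range-check example above; the function name and messages are invented:

```python
def validate_age(raw: str) -> list:
    """Return a list of validation errors (empty list = passes)."""
    errors = []
    raw = raw.strip()
    if raw == "":                          # existence check
        errors.append("age is missing")
    elif not raw.isdigit():                # type check
        errors.append("age must be a whole number")
    elif not 12 <= int(raw) <= 18:         # range check
        errors.append("age must be between 12 and 18")
    return errors

print(validate_age("eighteen"))  # → ['age must be a whole number']
print(validate_age("15"))        # → []
```

Note that validate_age("17") passes even if the person is really 19 – exactly the accuracy limitation described above.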

Testing checks the accuracy of outputs and solution behaviours.

ppt-icon Data Validation

ITI U3O1 KK 12 – functions and techniques to retrieve required information through

  • searching,
  • sorting,
  • filtering and
  • querying data sets

ppt-icon Search, sorting, filtering
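The retrieval operations above can be sketched in a general-purpose language too – here over a small invented set of records:

```python
records = [
    {"name": "Lee", "age": 17},
    {"name": "Kim", "age": 15},
    {"name": "Ash", "age": 17},
]

# Filtering: keep only the records that meet a criterion
seventeens = [r for r in records if r["age"] == 17]

# Sorting: arrange records into order on a chosen field
by_name = sorted(records, key=lambda r: r["name"])

# Searching: locate the first record matching a value
match = next((r for r in records if r["name"] == "Kim"), None)

print(len(seventeens), by_name[0]["name"], match["age"])  # → 2 Ash 15
```

In an RDBMS, querying combines all three in one statement (SELECT … WHERE … ORDER BY …).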

ITI U3O1 KK 13 – methods and techniques for testing that solutions perform as intended

Testing includes issues like:

  • Creating test data that includes values that thoroughly test all possible types of valid, invalid and troublesome inputs (e.g. those on the borderline)
  • Testing tables to record testing activity – a testing table’s columns may include:
    • What was tested
    • How it was tested
    • The expected result
    • The actual result
    • If the actual result was wrong, how it was fixed.
  • Alpha (informal) testing – by the developer
  • Beta testing – by people other than the developer.
    • Useless Tip: Beta is pronounced “beeta” in the UK and Australia, and “bayta” in the US. Apparently the ancient Greeks pronounced the second letter in their alphabet as “bear-ta” and modern Greeks say it something like “vee-ta”. Pick one and have fun.
  • Acceptance testing (formal testing) – by the client who commissioned the product
  • User Acceptance Testing (UAT) – by a typical end user
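The testing-table idea translates directly into test data in code. This sketch runs a hypothetical age validator against valid, borderline and invalid inputs and reports expected vs actual results:

```python
def is_valid_age(raw):
    # Hypothetical function under test (12-18 limits assumed)
    return raw.isdigit() and 12 <= int(raw) <= 18

test_cases = [
    # (what was tested,   input,  expected result)
    ("valid value",       "15",   True),
    ("lower boundary",    "12",   True),
    ("upper boundary",    "18",   True),
    ("just below range",  "11",   False),
    ("non-numeric input", "abc",  False),
]

for label, raw, expected in test_cases:
    actual = is_valid_age(raw)
    status = "PASS" if actual == expected else "FAIL - investigate and fix"
    print(f"{label}: expected {expected}, got {actual} -> {status}")
```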

ppt-icon User Acceptance Testing
ppt-icon PSM3-Development – discusses informal (beta) and formal (acceptance) testing

ppt-icon Testing

ppt-icon Test data

ITI U3O1 KK 14 – reasons why organisations acquire data using online facilities, including

  • 24-hour customer access,
  • improved efficiencies through direct data entry by customers,
  • improvements in effectiveness, and
  • access to global markets, marketing opportunities and ongoing services

ppt-icon Websites and data

ITI U3O1 KK 15 – reasons why users supply data for online transactions, including

  • convenience,
  • variety of choice,
  • reducing costs

ppt-icon Websites and data (again)

ITI U3O1 KK 16 – techniques used by organisations to protect the rights of individuals and organisations who supply data, including

  • security protocols and
  • stating privacy, shipping and returns policies

ppt-icon Websites and data (again)

Security protocols include:

  • SSL or TLS to encrypt web traffic.
  • Logins with usernames and passwords.
  • Requiring strong passwords.
  • Using Captcha to deter robotic logins.
  • Automatic timeout of idle connections.
  • Using secret questions to back up passwords.
  • Using multi-factor authentication, e.g. sending a code to the person’s phone to authorise transactions

Stating policies regarding privacy, shipping and returns keeps consumers well-informed about their rights and responsibilities before committing to a transaction. Also consider:

  • A contact link for people to use if they believe their rights have not been protected.
  • Sending a verification email before signing someone up to a mailing list or a site.
  • Only allowing password changes by sending an email to a user’s registered email address.

ITI U3O1 KK 17 – user flow diagrams [UFD] that depict different ways in which users interact with online solutions

ppt-icon User Flow Diagrams (UFD) – new for 2016

The study design glossary defines a UFD as…

User flow diagrams are diagrammatic representations of the path a user travels through when using an online interactive solution to complete a task or transaction, such as making a reservation or purchasing a product. It is a diagram showing a user’s journey to complete a task. User flow diagrams incorporate user interfaces and show the multiple entry points to interactive online solutions, for example, paid advertisements, social media and search engines may direct a user to a location in the solution other than the home page.

The Advice for Teachers says…

The VCAA will not be mandating a specific style of user flow diagrams; however, it is important that the diagrammatic representations show a user’s interaction with an online solution when conducting a transaction, as well as the user interface for the page that initiates the transaction.

The Advice points to UFD examples at…

Here’s an example of one that I prepared earlier. The different shapes indicate different operations in a traditional flowchart, but since no rules apply you can use whatever format you choose in your UFD, as long as its meaning is clear and consistent.

[image: UFD]

This one appeared in the 2016 VCAA Informatics Exam…

[image: 16i-c08_tn]

 

Informatics U3O2

Remember – U3O2 is part 1 of the SAT. It finishes in U4O1.

Use a range of appropriate techniques and processes to acquire, prepare, manipulate and interpret complex data to confirm or refute a hypothesis, and formulate a project plan to manage progress.

Details about the Informatics SAT
ITI U3O2 KK01 – primary and secondary data sources (digital and non-digital) and methods of data acquisition, including

  • observation,
  • interview and
  • querying of resources

Primary data is collected by the researcher for a specific purpose. It is not collected or processed by others. Primary data – if gathered properly – will be reliable and relevant to the research question. Its collection will, however, be expensive, slow, and of limited quantity. It will require considerable time, labour and skill for its processing. The main sources of primary data for the SAT would be questionnaires, surveys, and interviews. Another common method is direct observation of people acting naturally in their environment.

Secondary data has been collected by other people or organisations. It will also probably have been processed (validated, sorted, categorised, encoded, summarised). Secondary data often includes opinions, conclusions or interpretations of the meaning of the data by other people. Secondary data may or may not be reliable or relevant. At worst, it may have been selectively chosen or misleadingly processed to support a particular point of view. Collecting secondary data is cheap, quick and easy. It is also available in huge quantities, and is the only way to collect data from the past. Secondary data is most commonly from: the internet; encyclopaedias; newspapers, magazines, TV, radio shows; reference books.

Note: sometimes the distinction between primary and secondary data is blurry. For example, a newspaper’s editorial on politics is clearly secondary data, but when it is used as data for an original purpose (e.g. tracking changes in a newspaper’s political attitudes over time) it could qualify as primary data.

‘Querying of resources’ refers to the practice of extracting data from datasets. Many government bodies (e.g. CSIRO, data.gov.au, Victorian government data) provide public data that can be searched by citizens. Also, many organisations with large data repositories (e.g. Facebook, the weather bureau) provide an API (Application Programming Interface) that acts as a gateway to let other people’s software extract data from the repository. In that way, for example, anyone can write an app that extracts today’s weather forecast from the met bureau’s dataset, or log in to an online service using their Facebook credentials.
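Whatever the repository, an API typically hands back structured text (often JSON) that your software then unpacks. A sketch – the field names and values are invented, and the response is parsed locally rather than fetched from a live service:

```python
import json

# A response like this might come back from a weather API call
sample_response = '{"forecast": {"city": "Melbourne", "max_c": 24}}'

data = json.loads(sample_response)   # text → data structure
print(data["forecast"]["max_c"])     # → 24
```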

Queries take two major forms: Query by Example (QBE) and Structured Query Language (SQL). You do not need to know those terms, but they are handy to know.

QBE is the style used by GUI databases (Access, Filemaker etc) and spreadsheets: there are columns for each field, and rows you can use to specify actions for those fields – for example selection and sorting criteria, and whether the field should be shown. Enter the desired criteria into the appropriate fields and rows, and the software will – in the background – create a query that finds and manipulates the data required. For example:

[image: QBE]
Query By Example – thanks to http://www.databasejournal.com

SQL – Structured Query Language is a text-based instruction to a database that specifies all of the same requirements for data selection and manipulation that QBE generates. Often, the pretty QBE front-end simply generates SQL that the database then carries out. It’s a powerful scripting language for database queries. SQL looks like this (thanks to http://www.w3schools.com):

SELECT * FROM Customers
WHERE Country='Germany'
AND (City='Berlin' OR City='München');
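That query can be run as-is against any SQL database – here through Python’s built-in sqlite3 module, with a few invented customer rows:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Customers (Name TEXT, Country TEXT, City TEXT)")
con.executemany("INSERT INTO Customers VALUES (?, ?, ?)", [
    ("Alfreds", "Germany", "Berlin"),
    ("Drachen", "Germany", "München"),
    ("Around",  "UK",      "London"),
])
rows = con.execute("""SELECT * FROM Customers
                      WHERE Country='Germany'
                      AND (City='Berlin' OR City='München')""").fetchall()
print(len(rows))  # → 2
```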

Keep in mind that data extracted from an online query must still be considered secondary data. Even if it looks like raw data, it will probably have already been processed in some way, such as being summarised, sorted, categorised, and presented as percentages.

Repeat after me: Just because you found data online and extracted some with a query does not mean you created the data. It is not primary data. It is secondary. 

ppt-icon Still to come – but the info above should keep you going for a while.

ITI U3O2 KK02 – suitability of quantitative and qualitative data for manipulation including

  • comparisons (quantitative) and
  • policy formation (qualitative)

Quantitative data is objective, measurable, and based on facts, e.g. temperature readings, numbers of daily visitors to a website.

Qualitative data is subjective, based on opinion, e.g. how people feel in a hot building, whether visitors enjoy visiting a website.

Qualitative data is often collected at the start of research to discover what aspects of the topic are important and relevant. Once these are known, more detailed and specific quantitative data can be collected in much larger quantities.

For example, a company might want to know why sales of its boots are falling. Instead of guessing the reasons and creating a questionnaire or survey asking about those reasons, they interview people about their boots, or observe people trying their boots on in stores. Say the company believed sales were falling because of cost, but discovered that people were actually put off by the boots’ comfort and dated style. The company could then use this qualitative data to formulate questions for a survey to fine-tune their understanding: specifically, what was uncomfortable? What styling seemed dated? Without the preliminary interviews, the survey would have been asking about irrelevant factors and collecting irrelevant data.

Warning – to achieve full marks for U3O2 you must use both qualitative and quantitative data. This is buried in the middle of the assessment rubric for criterion 2 and is not stated again in the highest performance descriptor.

ppt-icon Still to come

ITI U3O2 KK03 – data types and data structures relevant to selected software tools

Data types – just as boxes come in a variety of sizes, materials and shapes to suit their intended purposes (compare a shoe box with a matchbox, for example), so data types come in a variety of forms to store data with minimum waste (of RAM or disk space) and minimum processing effort.

Common data types supported by most relational databases are:

  • number (usually subdivided into floating point with decimal fractions, and integers)
  • text, string (letters, punctuation, control codes such as carriage returns)
  • date (stores complete calendar dates in one piece of data; timestamps store both dates and times of day in one datum. The storage is very efficient and allows powerful date/time calculations since the software can easily interpret the data)
  • Boolean (a special type that only stores logical true/false data. Note that just because a value in a field can have one of only two possible values does not make it Boolean. For example, a “Sex” field whose values can only be “M” or “F” is not Boolean – it is character (text) data. To be truly Boolean, the field name would have to be something like “Is Male?” and the value would be either TRUE or FALSE, usually stored as 0 for false and non-zero for true.)

Different databases offer specialist, non-universal data types:

  • Filemaker Pro – allows a ‘Container’ type in which any binary media such as JPG, PDF, MP3, WMV can be stored. Filemaker supports only a single ‘Number’ data type, which covers all kinds of numeric data.
  • MS Access – offers ‘memo’ data type which allows large, variable-length text strings. Access offers more specific number types, such as byte, integer, and floating point which may store data more efficiently, but require more planning by the database creator.

When creating a database, considerable thought needs to be put into choosing appropriate data types for the data to be stored. Selecting an inappropriate data type (e.g. storing dates of birth as text) will result in slow and difficult processing later. Lack of foresight can lead to a database ‘breaking’ later in its life, such as by choosing ‘byte’ as the data type for TotalMemberCount. Since a ‘byte’ field can only store values between 0 and 255, it will work fine until the member count reaches 256 – and then the database will freak out.
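Python’s struct module can demonstrate the one-byte limit (format code "B" is an unsigned byte, the same 0–255 range discussed above):

```python
import struct

struct.pack("B", 255)        # fits: a byte stores 0-255
try:
    struct.pack("B", 256)    # the 256th member will not fit
except struct.error:
    print("overflow: value will not fit in one byte")
```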

The Informatics study design is rather vague about which data types are examinable. The only help is the glossary that says:

Data types are the particular forms that an item of data can take including numeric, character and Boolean, and are characterised by the kind of operations that can be performed on it.
Depending on the software being used, these fundamental types can be divided into more specific types, for example integer and floating point are numeric types.
More sophisticated types can be derived from them, for example a string of characters or a date type and their names may vary, such as text data type versus string data type.

If I were you, I’d clean my room every day to keep mum happy, and learn the data types in bold above.

As for data structures… we can only guess what VCAA means because no data structure is named in the KK.

I would only be able to suggest the obvious:

  • fields, records, tables in an RDBMS
  • lookup tables in a spreadsheet

But don’t worry: the exam can’t question you on things like double-ended priority queues or self-balancing trees because they are not listed in the KK. The exam could only ask you to select a data structure and maybe explain what it is, in which case you’d want to choose one of the RDBMS or spreadsheet structures listed above.

ppt-icon Data types

ITI U3O2 KK04 – one of the following methods for referencing primary and secondary sources:

  • Harvard,
  • American Psychological Association (latest edition),
  • Chicago,
  • Institute of Electrical and Electronics Engineers (IEEE)

All four referencing methods do the same thing – they help an author acknowledge the intellectual property of other people used in the author’s work. They only differ in their style.

Harvard and APA both use a parenthetical [i.e. in round parentheses] ‘author, year’ style in the body text such as

“…one researcher (Smith, 1980) claimed that…” and there would be a corresponding entry in the reference list at the end of the document such as 

Smith, AB. The Life Cycle of Frogs. Frog Life Monthly, vol.65, num.11, 1980.

Chicago and IEEE use numbers in the body text to indicate links to the reference list, for example:

Chicago – “…one researcher 32 claimed that…”

or

IEEE – “…one researcher [32] claimed that…”

For all four methods there would be a corresponding entry in the reference list at the end of the document.

For Chicago and IEEE it would look like this:

32. Smith, AB. (1980) The Life Cycle of Frogs. Frog Life Monthly, vol.65, num.11.

For Harvard and APA, it would look something like this:

Smith, AB. (1980) The Life Cycle of Frogs. Frog Life Monthly, vol.65, num.11.

The differences between Chicago and IEEE are trivial – one uses superscript for the numbers, the other uses square brackets. Similarly, the differences between Harvard and APA are trivial – one may have a comma between the author’s surname and the year of publication whereas the other does not.

It does not matter which of the four referencing styles you use, but you must use one of the named styles, and you must use it consistently and correctly. 

Do not mix different methods together in a reference. Choose one method and get it right.

As I interpret the key knowledge, you need to be able to write references in only one of the four named methods.
You should not be asked to write a reference in one specific method of the examiner’s choosing.

Learn ONE style fully and use it consistently.

ppt-icon Still to come – but the stuff above is pretty good in the meanwhile.

ITI U3O2 KK05 – criteria to check the integrity of data including

  • timeliness,
  • authenticity,
  • relevance,
  • accuracy

Note – unlike the previous study designs, in this study design “timeliness” now means either or both of:

  • data is available when needed and/or
  • data is produced quickly so it’s available when required

Authenticity relates to how genuine the data is. Has it actually come from the named source? Has it been forged? Has it been distorted – for example dishonestly edited, photoshopped, taken out of context, or changed in any way to deceive the audience? For example, a statement that ‘there are four cows in the top paddock’ is not authentic if it came from a person who never actually counted the cows, and just made up a number to save herself some effort or to avoid being punished.

Relevance – does the data relate to the issue being investigated? Data may be irrelevant because it’s

  • off-topic (figures about house sales when the investigation is into apartment sales, data on efficiency when a question is about effectiveness),
  • outdated (using 1993 data to investigate current rates of students leaving school early),
  • foreign (German students’ experiences with an issue involving Australian students)
  • atypical (relating to a small, unrelated or extreme sample that bears no similarity to the population in question)

Accuracy – data is an abstract representation of real-world realities. To say to people that “There are four cows in my top paddock” is a more convenient representation of a fact than physically taking all the people to your top paddock and looking at the cows. If there actually are four cows in the top paddock, then the data is accurate. If there were actually three or five cows, the data would be less accurate. If there were four hundred cows, the data would not be at all representative of the true bovine quantity, and the data would be called inaccurate.

Data might be or become inaccurate due to

  • going out of date – the state of the real world has changed (a few new cows were put into the paddock) but the data has not been updated to reflect that change. Or the number of cows had been copied from one database to a mirror site, but the mirror has not been synchronised recently with the master copy, so the mirror is no longer representative of the current size of the herd.
  • being damaged – someone accidentally or deliberately changes the text “4 cows” to “40 cows”. Or disk rot caused the recorded data to be misread by the digital system.
  • poor data collection – the cows were counted by someone glancing out of the window of a fast-moving car, or the cows kept moving about and some were counted twice.
  • coming from an unreliable source – there were not four cows: there were four goats. The data came from an idiot from the city who could not tell the difference.
  • bias – the cow-counter had some reason to misrepresent the true number of cows in the top paddock, e.g. to reduce his tax bill, or to show off his cow status to the milk maid next door.
  • faulty data processing – the wrong formula was used in the spreadsheet that added up the number of cows.
  • poor validation – the number of cows was entered into the software as “four” instead of “4”. The software was not expecting text, and – because there were no recognisable digits – converted the word “four” into a value of zero.
  • translation or conversion errors – the reader of the data did not speak English and relied on an incorrect electronic translation of the text.  Or the text said “There are 4 (four) cows” and a careless data entry person entered “There are 4 (4) cows” which was later turned into “There are 44 cows”. Or poor optical character recognition of “4 cows” became “9 cows” without being detected.
    • True story: I was listening to an audiobook on the history of space travel and was astounded to hear that “Apollo Two landed on the moon.” It took me a while to realise that the reader had seen “Apollo 11 landed on the moon” and mistook “11” for the Roman numerals “II”.  D’oh!

And let’s not argue about the difference between accuracy and correctness of data.
Teachers are still arguing about that one a year later.

ppt-icon Still to come

Unexaminable bonus : Evaluating secondary sources can be difficult. There are some logical fallacies you might want to be aware of that are often based on arguments or evidence that are irrelevant to the issue in question (e.g. a correlation may be irrelevant to causation).

ITI U3O2 KK06 – techniques for coding qualitative data to support manipulation

Qualitative data is often textual, not numeric. It often arrives as comments, statements and opinions that can carry important information in many different forms. Different people use different words, phrases and vocabulary to express the same basic idea. Coding is needed to reduce this infinitely-variable text into values that can be averaged, totalled and understood.

For example, in an interview, you might ask, “Has your business been affected by the opening of local supermarkets?” You might receive the following answers:

  • “Yeah. Heaps”
  • “Quite a lot”
  • “I’d say the answer would have to be ‘considerably’.”
  • “Yep. Yeah. A bunch.”

How could you summarise these variable-length responses meaningfully? Human interpretation is required to boil down the essence of a response into a limited range of possible answers, such as:

  • 5. Completely
  • 4. Quite a lot
  • 3. A reasonable amount
  • 2. Somewhat
  • 1. A very small amount
  • 0. Not at all

This encoding (or coding, as the study design prefers) of the free-form textual answers allows you to process the data statistically.
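Once the codes are agreed, parts of the judgement can be assisted (though not replaced) by software. A crude keyword-matching sketch in Python follows; the keyword lists are invented for illustration, and anything ambiguous still falls through to a human coder:

```python
# Crude keyword-based coder. The keyword lists are invented;
# ambiguous responses return None for a human coder to judge.
CODES = {
    5: ["completely", "totally"],
    4: ["heaps", "quite a lot", "considerably", "a bunch"],
    2: ["somewhat", "a little"],
    0: ["not at all"],
}

def code_response(text):
    """Return the first matching code, or None for human judgement."""
    text = text.lower()
    for code, keywords in CODES.items():
        if any(keyword in text for keyword in keywords):
            return code
    return None

answers = ["Yeah. Heaps", "Quite a lot",
           "I'd say the answer would have to be 'considerably'",
           "Yep. Yeah. A bunch."]
print([code_response(a) for a in answers])  # [4, 4, 4, 4]
```

Even this toy version shows the drawback: every keyword list is itself a human judgement call, made in advance.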

The drawbacks of encoding are:

  • it requires human interpretation and judgement of the meaning of responses. Different encoders may generate different codes from the same inputs unless they are well trained and know what the codes mean.
  • encoding is labour-intensive and cannot be easily automated. It requires considerable expensive effort by trained people.

One way of judging free-form responses more consistently is to use a rubric (not examinable). A rubric lists descriptions of inputs and assigns them numeric values. VCAA outcomes and exams are assessed like this to ensure that different markers agree on what marks responses deserve.

A rubric may look like this:

  • 5. Consistently uses very strong language to agree with the statement.
  • 4. Often uses very strong language to agree with the statement.
  • 3. Sometimes uses strong language to agree with the statement.
  • 2. Rarely uses strong language to agree with the statement, or sometimes disagrees with the statement.
  • 1.  Never uses strong language to agree with the statement, or often disagrees with the statement.
  • 0. Never agrees with the statement.  Often strongly disagrees with it.

The encoding effort and the risk of error during interpretation of qualitative data are why researchers often prefer to collect quantitative data using questionnaires with fixed choices of answers that do not require human interpretation.

The value of interviews and free-form answers, however, is the richness and depth of the answers, which may yield valuable information that the researchers may never have dreamed of including as options. This is why early research often uses limited in-depth qualitative data collection (e.g. interviews, observation) to better understand what questions to ask and what answers to allow during later larger-scale quantitative research (with questionnaires and surveys).

ppt-icon Still to come

ITI U3O2 KK07 – key legal requirements for storage and communication of data and information, including

  • privacy,
  • intellectual property and
  • human rights requirements

ppt-icon Privacy – Privacy Act

ppt-icon Intellectual Property – Copyright

ppt-icon Human Rights Requirements  –  Charter of Human Rights

Note – The Spam Act is not listed as an Informatics requirement (as it is for SD). Strange, but true.
Ignore that bit in the Informatics textbook.

The only legislation named for Informatics is in U4O2:

  • Privacy Act 1988
  • Privacy and Data Protection Act 2014 [replacing the IPA from the last study design]
  • Health Records Act 2001.

The legislation relevant to this U3O2 KK is categorised but not named.

ITI U3O2 KK08 – features of a reasonable hypothesis including a specific statement identifying

  • a prediction and
  • the variables

Variables

  • Independent variable – the factor that the researcher controls (e.g. the amount of sleep students get). i.e. the “cause”
  • Dependent variable – the factor that is affected by changes to the independent variable (e.g. students’ test results). i.e. the “effect”.

Prediction – a forecast of how the dependent variable will be affected by changes to the independent variable, e.g. “The less sleep students get, the worse their test results will be.”

A hypothesis must be able to make testable predictions. If it can’t, it’s pure speculation, or faith.
e.g. The theory of relativity predicts that a clock in an orbiting satellite will run at a different rate from an identical clock on Earth. This can be tested experimentally (even if it took about forty years for the necessary technology to arrive).

A hypothesis that “After good people die, they go to heaven” can never be tested, so it fails as a reasonable hypothesis.

Another hypothesis like, “Our entire universe is just a single atom in a larger universe!” cannot yield any testable prediction, so it also fails as a reasonable hypothesis.

Vague hypotheses (the plural of ‘hypothesis’) are unreasonable. For example, “Dogs are better pets than cats because they’re more loving” cannot be measured scientifically. “Better” is an undefined, vague term. How can “loving” be quantified in animals? Does a dog’s licking mean love, or is he tasting you to see if you’re worth eating? Does a cat’s purring mean love, or self-centred satisfaction at finding a warm lap on which to sleep?

And does it mean that every breed of dog and every breed of cat follow this rule, or only some of them?

A reasonable hypothesis is very specific. It has this format:

Independent Variable (IV) causes Dependent Variable (DV) to increase/decrease because reason Z.

Examples: 
“Teaching sex education in schools increases teen pregnancy rates because students are made curious about sex.”
or
“Teaching sex education in schools decreases teen pregnancy rates because students are made aware of how sex causes pregnancy.”

A reasonable hypothesis cannot just vaguely say that one variable “affects” another. It must specify the nature of the effect.

It must specify one independent variable and one dependent variable and make sure that any other uncontrolled variables are removed from consideration during investigation. For example, suppose you look at two English classes and find that the smaller class has better grades than the larger class. You hypothesise that smaller class size (IV) causes learning (DV) to increase because students get more attention from the teacher.

But are there factors unaccounted for in this hypothesis? Could it in fact be that the difference is caused by:

  • the teacher of the smaller class having much more teaching experience?
  • the students in the larger class being recent immigrants from non-English speaking countries?
  • any of another dozen possible variables you have not even considered?

The hypothesis must be able to make testable predictions, for example “If the larger class were divided into two classes taught by the same teacher, each half would get better grades.”

Also, you must not introduce new variables during investigation. For example, after dividing the large class into two, you should not give one half a different teaching style. If there were any changes observed, you could never tell whether they were due to the size reduction or the change in teaching style.

A good investigation relies on the fact that any observed changes between groups can only be attributed to the independent variable and to no other cause.

Tip: this is wise during any problem-solving mission, such as finding out why your computer has started running slowly. If you change five things and the computer runs well again, how can you tell which change fixed the problem? Try to always test only one variable at a time. Control the others.

ppt-icon Still to come

ITI U3O2 KK09 – solution specifications: requirements, including

  • data to support the prediction of the hypothesis,
  • constraints and
  • scope

In other words:

data: what data will be needed to support the hypothesis (e.g. the numbers of new supermarket openings; numbers of closures of small businesses in the same period; opinions of the small business owners regarding the effect of the supermarket opening on the closure of their businesses).

The distinction between constraints and scope in the key knowledge is a bit unclear. VCAA has not offered any example to clarify the difference.

In IT, constraints are usually factors that limit the free design of solutions, such as a limit on the total cost or time for development, the need for the solution to work on certain hardware, the requirement that the solution be very secure, or easy to use by complete idiots.  Your research may be constrained by the public availability of relevant data, the number of reliable primary sources, the software you have available for data collection or processing. 

Scope defines what is included in the research and what is not. For your SAT the scope may define how far your hypothesis extends. For example, the hypothesis that supermarkets kill small local shops will only relate to

  • Victoria,
  • metropolitan suburbs,
  • in the past 10 years,
  • only independent owner-operated businesses (not chain stores or franchisees)
  • only greengrocer and butcher shops.

ppt-icon Still to come

ITI U3O2 KK 10 – project management concepts and processes, including the concepts of

  • milestones and
  • dependencies

Milestones are major points of progress in a project, for example the end of the design stage.
There are no tasks or time required for the event, so milestones are shown as zero-duration, and are represented as diamond shapes. 
Milestones are used to judge whether a project is on schedule or not.

A dependency means that a following dependent task cannot be begun until a previous task has been completed.
The first task is called the predecessor, and the dependent task is the successor.
For example:
Task 1: Put on socks. (The predecessor task)
Task 2: Put on shoes. (The successor task; it is dependent on Task 1. It cannot be started before the socks are on. Unless you’re really weird.)
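The predecessor/successor idea can be sketched as a toy scheduler in Python (the task names and durations are invented). A successor’s earliest start is the latest finish of all its predecessors, and a zero-duration task behaves like a milestone:

```python
tasks = {
    # name: (duration_in_days, [predecessor names])
    "design":  (3, []),
    "build":   (5, ["design"]),
    "test":    (2, ["build"]),
    "release": (0, ["test"]),   # zero duration = a milestone
}

finish = {}

def finish_day(name):
    """Earliest finish: latest predecessor finish plus own duration."""
    if name not in finish:
        duration, predecessors = tasks[name]
        start = max((finish_day(p) for p in predecessors), default=0)
        finish[name] = start + duration
    return finish[name]

print(finish_day("release"))  # 10  (3 + 5 + 2 + 0)
```

This is exactly the arithmetic Gantt software does behind the scenes when it draws dependent bars one after another.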

The processes of

  • task identification,
  • sequencing,
  • time allocation,
  • resources and
  • documentation using Gantt charts

You must use a Gantt chart for the SAT.
(You can also use PERT if you like, but it’s not examinable and only the Gantt chart will be assessed for the SAT.)

  • Task ID – The first step is to list every task that needs to be done in the project. Don’t leave any out.
  • Sequencing – Getting tasks in order, based on their dependencies.
  • Time allocation – Deciding how much time each task will take. In a Gantt chart, this affects how long each task’s bar is.
  • Resources – assigning workers and equipment to tasks so people and hardware/software are available when needed, and not already busy working on a different task. Good Gantt software will warn when resources are being double-booked.
  • Documentation using Gantt charts –  put these pieces of information step by step into the Gantt chart.

ppt-icon Gantt Charts

ITI U3O2 KK 11 – file naming conventions to support efficient use of software tools

Good file naming means that names should:

  • not be so short that they are meaningless or arbitrary (e.g. “Doc1.docx”)
  • not be so long that they cause file system errors
  • be meaningful and self-descriptive
  • put the most relevant information first
  • not contain spaces or underscores
  • use capitalisation cautiously. 
  • not use prohibited characters
  • not use characters that have special meaning for web servers and browsers
  • be consistent so names are sorted in a logical order and files can be easily found
  • preferably contain versioning information (e.g. document-v1.txt, document-v2.txt) to retain old versions and allow roll-backs in case of disaster or stupidity.
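A naming policy can even be checked by software. Here is a sketch in Python using a regular expression for an invented convention (all lower case, hyphens instead of spaces, a version number before the extension):

```python
import re

# Invented convention: topic-words-vN.ext, lower case, hyphen-separated.
PATTERN = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*-v\d+\.[a-z0-9]+$")

def follows_convention(filename):
    """True if the filename matches the team's naming convention."""
    return bool(PATTERN.match(filename))

print(follows_convention("sat-report-v2.docx"))  # True
print(follows_convention("Doc1.docx"))           # False: capitals, no version
```

A script like this could be run over a shared folder to flag files that break the team’s policy.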

Organisations and teams typically create file naming policies that all workers need to follow when sharing documents.

More details are in the slideshow…

ppt-icon Good File Naming Conventions

ITI U3O2 KK 12 – software functions to organise, manipulate and store data

This is so vague it could take a dozen books to cover completely. It refers to the commands that can be given in different software applications to handle data. 

A typical exam question would be something like:

Jill has this pile of data. She wants a  list of her shop’s top ten best-selling products (in order of popularity). Describe a strategy she could use to do this.

Organise: define and create fields, records and tables in a database. Create worksheets, columns, rows, lookup tables in a spreadsheet. Sort data. Put data into categories, tables. Convert data to a single unit (e.g. grams/kilograms all converted to grams, minutes:seconds all converted to seconds).

Manipulate: formulae in spreadsheets or database queries to work out basic arithmetic, totals, averages, maxima or minima, ranges, correlations, standard deviations. Produce charts. Pivot tables in spreadsheets.

Store: as files (e.g. Access/Filemaker/Excel proprietary formats, universal CSV, XML, RTF formats). Store as records in files or MySQL databases.
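Jill’s strategy can be sketched in Python (the sales records below are invented for illustration): aggregate the raw records by product, then sort the totals in descending order.

```python
# Organise: aggregate raw (product, units_sold) records by product.
# Manipulate: sort by total units to get the best-sellers.
from collections import Counter

sales = [("apples", 3), ("bread", 5), ("milk", 2),
         ("apples", 4), ("milk", 6), ("eggs", 1)]

totals = Counter()
for product, units in sales:
    totals[product] += units

top_ten = totals.most_common(10)   # at most ten, ordered by popularity
print(top_ten)  # [('milk', 8), ('apples', 7), ('bread', 5), ('eggs', 1)]
```

In a spreadsheet the same strategy would be a pivot table (sum of units by product) followed by a descending sort; in an RDBMS, a GROUP BY query sorted by the summed total.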

Note: The exam cannot force you to answer questions relating to spreadsheets (even if 99% of you guys used them for U3O2). The exam is entitled to ask specific questions about RDBMS.

ppt-icon Still to come

ITI U3O2 KK 13 – techniques for identifying patterns and relationships between data

This could be a biggie. We are talking about data trends and connections between different data sets. This implies concepts like:

  • Patterns: predictable changes such as increases/decreases, unchanging plateaux, maximum and minimum values, recurring data that can be anticipated because of some other factor. For example, summer temperatures have a pattern of being generally higher than winter temperatures. Shopping expenditures show a pattern of increasing just before Christmas and Mother’s Day. Internet usage has a pattern of increasing during working hours.  Access to news sites tends to increase after disasters.
  • Connections: correlations – changes in one dataset that correspond with changes in a different dataset. Correlations can be positive (an increase in Factor 1 leads to a corresponding increase in Factor 2) or negative (an increase in Factor 1 leads to a corresponding decrease in Factor 2). Correlations are measured from -1.0 (perfect negative correlation) to zero (no correlation whatsoever) to +1.0 (perfect positive correlation).

A technique for identifying any trend within or between datasets is to reduce the huge bulk of raw data to a form where the trends are more easily visible. This is accomplished by summarising the data using statistics (e.g. averages and totals) and visualising the data using data visualisation techniques (e.g. graphs and infographics).
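As a sketch of measuring a connection between two datasets, here is a Pearson correlation coefficient in plain Python (the temperature and ice-cream figures are invented; a spreadsheet’s CORREL function does the same job):

```python
from statistics import mean, pstdev

def pearson(xs, ys):
    """Pearson correlation: -1.0 (perfect negative) to +1.0 (perfect positive)."""
    mx, my = mean(xs), mean(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return covariance / (pstdev(xs) * pstdev(ys))

temperature = [30, 25, 20, 15, 10]        # invented daily maxima
ice_cream_sales = [100, 80, 60, 40, 20]   # invented sales figures

print(round(pearson(temperature, ice_cream_sales), 2))  # 1.0
```

These invented figures move in perfect lockstep, hence the coefficient of 1.0; real data almost never does.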

My advice – read the textbook. I really don’t have the time to repeat all of that priceless wisdom here. But here’s a summary.
Note: None of the following terms is directly examinable, but may be given in an answer to a related question.

Averages (not examinable)

a summary of a larger set of data, showing its typical value. The problem is that one can summarise data in different ways and get very different answers.

For example, nine men are asked how many lemurs they have. The answers are: 7, 9, 11, 6, 13, 6, 6, 3, 11.
What value faithfully reflects the average number of lemurs men have?

  • One type of average (the mean) adds up all values, divides by the number of values = 8.
  • Another way of calculating an average is to find the most common value (the mode) = 6.
  • A third average is to sort the values and see which is in the middle when the data are sorted (the median) = 7.

Which of these different answers best summarises the numbers of lemurs owned by men?
All, and none of them. The very choice of statistic can distort a report.
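Python’s standard statistics module calculates all three averages for the lemur data:

```python
from statistics import mean, median, mode

lemurs = [7, 9, 11, 6, 13, 6, 6, 3, 11]

print(mean(lemurs))    # 8
print(mode(lemurs))    # 6
print(median(lemurs))  # 7
```

Spreadsheets offer the same three as AVERAGE, MODE and MEDIAN functions.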

For example, five people in a country are asked how much they earn each week. The answers are 30, 30, 60, 170, 1949.

  • Mean: 447.8
  • Mode: 30
  • Median: 60

If you were the country’s president trying to prove the well-being of your citizens, which average would you choose?
If you were trying to prove the president was starving his citizens, which average would you choose?
Strangely enough, both of you would be technically correct.

Moral – statistics can lie when you want them to. The use of statistics is just as important as the data the statistics are using.

Standard Deviation (not examinable)

Because the mean can be deceptive when the range of the data varies greatly, it is handy to know how much variance is in the dataset. The standard deviation (easily calculated by a spreadsheet, for example) is low when the data are consistently around the same value and the mean accurately describes their average value. A high standard deviation shows that the data values vary greatly and have no consistent value. A high standard deviation indicates that the mean of the data is unreliable as a summary of the data. A high standard deviation is like a red light flashing the warning, “DON’T TRUST THIS MEAN!”

Using the sample data above, the lemur standard deviation was about 3.02, indicating the data were pretty close to each other and the mean would be quite reliable as a summary of their values.

The income standard deviation was about 752.4, warning that the mean would be wildly unreliable as a summary of the data.
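Both figures come from the population standard deviation (STDEVP in a spreadsheet; pstdev in Python’s statistics module):

```python
from statistics import pstdev

lemurs = [7, 9, 11, 6, 13, 6, 6, 3, 11]
incomes = [30, 30, 60, 170, 1949]

print(round(pstdev(lemurs), 2))   # 3.02
print(round(pstdev(incomes), 1))  # 752.4
```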

ppt-icon Still to come. The stuff above should be enough for now.

ITI U3O2 KK 14 – roles, functions and characteristics of digital system components used to

  • input,
  • store,
  • communicate and
  • output data and information

ppt-icon Hardware for input, storage, communication, and output (new for 2016)

ITI U3O2 KK 15 – physical and software security controls suitable for protecting stored and communicated data.

 

Remember to cover all four parts of the KK. Physical/software and stored/communicated.

ppt-icon Physical and software security

Informatics U4O1

Remember – this is part 2 of the SAT that began in U3O2.

Design, develop and evaluate a multimodal online solution that confirms or refutes a hypothesis, and assess the effectiveness of the project plan in managing progress.


ITI U4O1 KK01 – characteristics of information for educating world-wide audiences, including

  • gender inclusiveness
  • culture inclusiveness,
  • commonality of language,
  • age appropriateness
Gender inclusiveness

“In 2013, the Sex Discrimination Act 1984 was amended to introduce new protections from discrimination on the grounds of sexual orientation, gender identity and intersex status in many areas of public life.”

Sex –  “refers to the chromosomal, gonadal and anatomical characteristics associated with biological sex. “(i.e. the hardware one is born with).

Gender – “is part of a person’s personal and social identity.  It refers to the way a person feels, presents and is recognised within the community.  A person’s gender may be reflected in outward social markers, including their name, outward appearance, mannerisms and dress.”

Intersex – “refers to people who are born with genetic, hormonal or physical sex characteristics that are not typically ‘male’ or ‘female’. Intersex people have a diversity of bodies and gender identities, and may identify as male or female or neither.” 

The moral of this key knowledge is – don’t assume. Don’t exclude people because of their sex or gender. Be fair.

Culture inclusiveness

A culture is a defining characteristic of a group of people based on their shared beliefs, history, attitudes, religious or political beliefs, preferences, habits, loves and hates,  priorities, goals, etc.

An individual may belong to many cultures. When writing for a global audience, try to be consciously aware that many or most readers will belong to cultures that may be slightly or completely different from yours.

Don’t refer to politics, sex, religion – or football. 

Commonality of language

We often use expressions and vocabulary that are bound to our cultures, but these may not be understood by some, many or all other people. Try to use generic, standard, simple English using a smaller vocabulary.

Age appropriateness

 Obviously children have different needs when it comes to information. Their vocabularies may not be as well developed. They might not understand certain concepts, such as death or menopause. Some topics may scare them, like traumatic accidents or domestic violence. They might prefer text to be illustrated. Text may need to be larger to suit their younger eyes. Swearing is not appropriate.

Then again, older people have different needs. They might not be as technologically up-to-date so new terms may need to be defined or explained. They may know a lot more about things than you and be impatient with your vain self-importance and foolish time-wasting. 

For more details, get the slideshow…

ppt-icon Inclusiveness – gender, culture, language, age

ITI U4O1 KK02 – techniques for generating design ideas

No one technique is directly examinable, but typical and common techniques include:

  • brainstorming (but its effectiveness is in doubt – see this research)
  • mind maps 
  • various diagrams like PMI (Plus/Minus/Interesting), POOCH, Venn, Fishbone
  • SCAMPER (see the slideshow)
  • forums 
  • external consultants 

ppt-icon Techniques for generating design ideas

ITI U4O1 KK03 – criteria for evaluating alternative design ideas and the effectiveness of solutions

 

You need to develop two or three design ideas for your MMOS before choosing one which will be designed in detail and then developed.

What factors will you use to choose the winning idea?
Those factors are the ‘criteria’ in the key knowledge.

Typical criteria for evaluating design ideas may be:

  • How quickly or easily can this idea be developed into a working design and solution?
  • How much will it cost?
  • Do I have the skills necessary or will I have to re-train to be able to implement it?
  • How well will it satisfy the necessary functional and non-functional requirements?
  • Will it be secure/accurate/easy to use/attractive/robust etc?

Typical criteria for evaluating the effectiveness of solutions may be (Note the word “effectiveness”. It does not include efficiency!):

  • Accuracy
  • Usability / ease of use (which was an efficiency criterion in the previous study design)
  • Security
  • Readability
  • Portability (if relevant)
  • Compatibility (with other technologies, or ability to read documents created by previous versions of itself)
  • Robustness (ability to keep working under difficult conditions)
  • Timeliness (in the sense of “being available when needed” rather than the speed of development or processing)
  • Scalability (ability to be increased in size or performance without a major redesign, rebuild, or replacement)
  • Attractiveness
  • Being fun to use

The ‘effectiveness’ criteria list can go on endlessly.
It just means “any criterion that is not efficiency” (time/speed, money/cost, or labour/effort)

ppt-icon Evaluation criteria 

ITI U4O1 KK04 – characteristics of effective multimodal online solutions

Again, the key word in the KK is effective, meaning the quality of the MMOS or how well it does its job.

According to the specific criteria in the study design, the MMOS must

  1. Educate its audience (rather than entertain, inform or persuade). It must teach them something.
  2. Be suitable for a global audience and suitable for all countries, cultures, sexes/genders, ages, education levels, disabilities/special needs, etc.
  3. Be able to go online. Remember that your MMOS does not actually have to be put online for assessment, but it should theoretically be able to go online. So no – you can’t create a MMOS in MS Word.

The other main effectiveness criteria for a MMOS would have to be (in no particular order – as an exercise you might want to sort these in order of importance):

  • communication of its message
  • readability
  • clarity
  • accessibility (for people with disabilities or special needs, such as weaker language skills)
  • ease of navigation
  • completeness
  • accuracy of information
  • usability / ease of use
  • error tolerance  – Let users recover from errors without punishing them. Use “cancel” or “back” buttons to let them repent of their sins and return to the strait and narrow path of righteous use of your interface. Ask for confirmation before committing users to serious actions such as deleting data, printing hundreds of pages, spending money.
  • affordance (i.e. “intuitiveness” rather than “cost” which would be an efficiency criterion).
    • Affordance is not an easy concept to absorb at first. It refers to the quality of an interface suggesting how it should be used. It includes qualities such as:
      • following conventions regarding images and interactions between the user and the solution. As extreme examples, don’t reverse the behaviour of mouse buttons or put menus into a “new and improved” order you invented yourself. 
      • using clear, simple, self-explaining words and captions, such as “Click here for details”, or “Help” (rather than “User Assistance Centre”)
      • consistency, so that once the user has learnt the conventions of one page, they can also be understood throughout the whole solution. Even better, if the conventions don’t need to be learned at all because they comply with global standards (e.g. File menu is on the left, underlining indicates hyperlinks), then users can immediately start absorbing the message rather than struggling with the interface.
      • intuitiveness. It’s obvious that a button or scrollbar with a 3D effect (for example, shadows and highlights) needs to be clicked or dragged; dropping an object onto a trash icon will delete it; an icon of a chain will create a link. Use standard metaphors that people can understand without having to learn or think. (Tip: many people hate having to learn or think. Sad, but true.)

ppt-icon Still to come

ITI U4O1 KK05 – formats and conventions appropriate to multimodal online solutions

 

Format: the manner in which information is presented, e.g. the same statistical data could be presented in the format of a table, a chart, or descriptive text.

Convention: the standard, accepted styles associated with a format. For example, if you choose the format of a table, you are expected to follow the conventions of a table, such as gridlines, bold headings, left-justified text, and numbers being right-justified or centred on the decimal place. Webpage conventions include the underlining of links and the use of thumbnail images linked to full-size pictures. Conventions give users comfort and security by presenting information and controls in a traditional, predictable and comfortable manner.

ppt-icon Formats and conventions

ITI U4O1 KK06 – design principles that influence the functionality and appearance of multimodal online solutions

ppt-icon Design Principles

The study design says in the glossary that: “Design principles are accepted characteristics that contribute to the functionality and appearance of solutions. In this study the principles related to functionality are useability, including robustness, flexibility and ease of use, and accessibility, including navigation and error tolerance. Design principles related to appearance are alignment, repetition, contrast, space and balance. “

ITI U4O1 KK07 – design tools for representing a solution’s appearance and functionality, including relationships, where appropriate

This could take weeks. We’re talking about two major components: design of (1) appearance and (2) functionality.

Designing the appearance of a solution
  • page or screen mock-ups (the most obvious choice) – detailed sketches of the appearance of a page or screen, including sizes, colours, positions, alignment, spacing of objects.
  • layout diagrams (the term used by VCAA, which the Rest Of The World would call “Layouts”). A simple diagram of a page with boxes to show the component parts.
Designing the functionality of a solution

Be sure to know the ones that are in bold.

  • site maps or storyboards (for web pages, animations)
  • hierarchy charts – describing the parts and sub-parts of a complex organisation. A site map is really just a type of hierarchy chart.
  • user flow diagram (UFD) (mandated, but no particular style is mandated by VCAA. I’d use a flowchart if I were you)
  • input/processing/output chart (IPO)  – for designing formulas for spreadsheets, databases, software in SD.
  • Data structure tables
  • Data dictionary (for databases)
Designing relationships

This is a mandated type of functionality design. You need to know entity relationship diagrams (ERDs). VCAA’s interpretation of ERDs has changed since the previous study design. They now accept Chen style (the one with diamonds) that also has cardinality (1:many markers). They also accept Crow’s Feet Notation – the style with boxes (tables) containing fields with lines between the tables indicating relationships and cardinality. Both styles are acceptable, so you may be examined on either style: learn them both. 

ppt-icon Entity Relationship Diagram (ERD) (updated version for 2016 with Chen, Crow’s feet and cardinality)

ITI U4O1 KK08 – functions, techniques and procedures for efficiently and effectively manipulating data using software tools

This is another one of those KK that could have hundreds of textbooks written about it.

I guess some relevant topics might be:

  • functions: total, average, standard deviation, correlation, maximum, minimum, lookup, calculating percentages, simple arithmetic, converting text to uppercase, searching for substrings in text.
  • techniques: sorting, searching, selecting, querying, filtering, converting data formats (e.g. MP4 to MKV, text to numbers), spreadsheet pivot tables, parsing (breaking data into subsets, for example converting “3 Nov 1998” to a day, month and year), editing text fields (e.g. stripping leading/trailing spaces), find/replace text
  • procedures* – data aggregation (combining data from different sources or fields into one dataset or new field); saving data; eliminating  duplicate or invalid data; converting data to a visual form (e.g. a graph).

*Remember, a procedure is a series of steps that accomplish a goal, e.g. backing up data. It is not a single action.
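Two of the techniques above, sketched with Python’s standard library: parsing “3 Nov 1998” into its day, month and year components, and stripping leading/trailing spaces from a text field:

```python
from datetime import datetime

# Parsing: break "3 Nov 1998" into day, month and year components.
d = datetime.strptime("3 Nov 1998", "%d %b %Y")
print(d.day, d.month, d.year)   # 3 11 1998

# Editing text fields: strip leading/trailing spaces.
field = "   4 cows  "
print(field.strip())            # "4 cows"
```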

ppt-icon Still to come

ITI U4O1 KK09 – manual and electronic validation techniques

Remember – validation checks the reasonableness of inputs (in terms of existence, data type and range).

It does NOT verify the accuracy of the input data.
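A minimal electronic validation sketch in Python, reusing the cow-counting example from earlier (the field name and the 0–1000 range are invented). Note that “four” is rejected as the wrong type, but an inaccurate “40” still passes: validation checks reasonableness, not accuracy.

```python
def validate_cow_count(raw):
    """Existence, type and range checks on a cow-count input field."""
    if raw is None or raw.strip() == "":
        return False, "Existence: no value entered"
    if not raw.strip().isdigit():
        return False, "Type: must be a whole number"
    value = int(raw.strip())
    if not 0 <= value <= 1000:
        return False, "Range: must be between 0 and 1000"
    return True, "OK"

print(validate_cow_count("four"))  # (False, 'Type: must be a whole number')
print(validate_cow_count("4"))     # (True, 'OK')
print(validate_cow_count("40"))    # (True, 'OK') - valid, but maybe not accurate
```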

ppt-icon Validation 

ITI U4O1 KK 10 – functions, techniques and procedures for managing files

This could include topics such as:

  • file naming schemes and conventions – Hungarian notation, camelcase, special rules for online filenames
  • transmitting files – LANs, internet, FTP, email, cloud
  • sharing files – NAS, cloud, collaborative simultaneous editing in real time
  • file managers
  • version control – to prevent confusion between different versions of documents, allow rolling back to previous versions
  • encryption – during storage and transmission
  • access hierarchy – not everyone has equal access to files
  • backups – regular, stored offsite, tested
  • directory/folder/disk structures

See the slideshows for details…

ppt-icon File Management

ppt-icon File Naming

ITI U4O1 KK 11 – techniques for testing that solutions do what is intended

ppt-icon Testing

ppt-icon User Acceptance Testing

ITI U4O1 KK 12 – techniques for documenting the progress of projects, including

  • annotations,
  • logs and
  • adjustments to tasks and timeframes

Gantt charts are often complete and accurate when they are first created, but the real world rarely lets a project stay on track for long. Delays, problems, supply issues, breakdowns, illnesses and bad weather can quickly render even the best Gantt chart a vain dream.

Project plans need to change, develop and reflect the true current state of the project. There is no shame in updating a Gantt chart, but it is foolish to pretend that everything is on track when it certainly is not.

Gantt charts can be annotated to explain changes and forecasts to other team members, since a chart is rarely used by only one person in a project. Consider it more like a staff bulletin informing departments and stakeholders of what is happening. For example, a note might be added to explain why the last task ran overtime, and the effects it might have. Another note might warn people that if the weather on Tuesday is bad, task X will have to be postponed until Wednesday, so they should advise their staff to be prepared to start task Y instead.

Logs are a history of past events. In a project, they may be useful to explain to management why project plans were changed.

Adjustments to tasks and timeframes may include reducing the scope of a task so it can be completed just enough to let a dependent task begin on time instead of being delayed. For example, a corridor may be painted but its wall hangings not yet hung. The decorations could be made to wait if they were going to delay a more important task that was theoretically dependent on the entire corridor being finished (e.g. laying the carpet).

Timeframes can also be modified by rearranging resources, such as moving people from one task to another to get a late-running task finished on time. New or reallocated resources may also accelerate a delayed task, such as using more automation (hiring a cement mixer rather than using shovels, as planned) or taking equipment allocated to a later task (e.g. paint) and using it to finish a current task.

Project managers need to be able to adapt to changed circumstances and make the most of what is available, and use their project plan to coordinate many current and upcoming tasks.

ppt-icon Still to come

ITI U4O1 KK 13 – strategies for

  • evaluating the effectiveness of solutions and
  • assessing project plans.

Evaluation is not the same as testing. Testing proves that a solution works properly: it generates accurate information; if you click a link, the right destination appears.

Evaluation checks whether the finished and tested solution is achieving the goals for which it was originally created. My usual example is a company that wants to increase profit (an organisational goal) by creating a new or improved website (a system). If the website has many visitors and no errors, its testing was successful, but unless it increases profit, it fails its evaluation, since profit was its sole reason for being created.

Evaluation techniques also differ from those used during testing. Evaluation does not aim to repeat testing (e.g. taking out a stopwatch to time how long it takes to produce 10,000 invoices, or pulling out a power lead to see if the system can recover from an unexpected shutdown). Instead, evaluation usually inspects performance over time, such as the total amount of output produced since the system was installed three months ago, or the number of customer complaints recorded about inaccurate billing. Evaluation does not record the system’s reaction to immediate events – that was the job of testing.

Note – if a system is producing inaccurate output or is failing to work properly, that is a sign that testing failed, and that the system should never have been implemented and put into daily service.

ppt-icon Evaluation (PSM stage 4)

ppt-icon Evaluation criteria (tangentially-related to this KK)

Informatics U4O2

Compare and contrast the effectiveness of information management strategies used by two organisations to manage the storage and disposal of data and information, and recommend improvements to their current practices.


ITI U402 KK01 – reasons why data and information are important to organisations, including meeting the goals and objectives of both organisations and information systems

 

Systems – a combination of hardware, software, procedures, people and data that carry out a specific task within a department in an organisation, for example communications, decision making, financial management, customer service.

Organisations – the complete collection of departments and people within a commercial or not-for-profit entity.

Goals – long-term, fuzzily-defined things to be achieved by an entire organisation or system, e.g. “good customer service” for an organisation, “accuracy” for a payroll system. An organisation’s goals (e.g. in its mission statement) often define the nature of the organisation and what it strives to achieve over time in everything it does in all its departments.

Objectives – specific, measurable, achievable targets that aim to achieve a larger goal, e.g. “respond to all customer enquiries within 24 hours” is an objective that is a step towards achieving the goal of good customer service.
“No more than 0.1% errors in invoices” is a measurable objective aiming for the accuracy goal in the payroll system. Objectives can be proven to have been achieved or not using data rather than opinions.

Tip: objectives usually have numerical targets or limits in them, such as percentages, dollar values, time limits.
Goals are big, vague and undefined. Goals are organisational  aspirations that are hoped for, but may never be fully realised.
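Because objectives have numerical targets, whether one has been met can be settled with arithmetic rather than opinion. A tiny Python sketch, using invented figures, for the “no more than 0.1% errors in invoices” objective:

```python
# Hypothetical figures for an invoicing system whose objective is
# "no more than 0.1% errors in invoices".
invoices_issued = 48_200
invoices_with_errors = 31

error_rate = invoices_with_errors / invoices_issued * 100
objective_met = error_rate <= 0.1

print(f"Error rate: {error_rate:.3f}%")   # Error rate: 0.064%
print("Objective met:", objective_met)    # Objective met: True
```

Try doing the same with the goal “good customer service” – you can’t, which is exactly the difference between goals and objectives.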

ppt-icon Goals and objectives  of organisations and systems

ITI U402 KK02 – reasons why information management strategies are important to organisations, including

  • maximising opportunities,
  • minimising risks and
  • fulfilling legal requirements

This is another of the vague and vanilla-flavoured KK that make everyone suddenly very sleepy.
I think the “including” dotpoints explain themselves well enough. If not, let me know.

Remember in Informatics “legal requirements” does not include the Spam Act.

ppt-icon Still to come

ITI U402 KK03 – key legislation that affects how organisations control the storage and disposal of their data and information:

  • the Privacy Act 1988,
  • the Privacy and Data Protection Act 2014, and
  • the Health Records Act 2001

ppt-icon Privacy Act 1988 (plus amendments)

The Privacy and Data Protection Act 2014 controls how Victorian state government bodies (e.g. parliament, schools, police) and their agents (bodies hired to work with a government body) use citizens’ data. It does not apply to private companies or individuals.

The Health Records Act affects all Victorian organisations – private or government, large or small – that hold any health information on individuals. It basically repeats the requirements of the federal Privacy Act relating to health information.

This act affects doctors, nurses, sports trainers, masseurs, hospitals, old people’s homes, psychiatrists, psychologists, health insurance companies, etc.

Data may include health histories, medications and prescriptions, diagnoses, mental health records, lab results, x-rays  etc.

ITI U402 KK04 – ethical dilemmas arising from information management practices

Also see the next KK.

Examples:

  • A media distributor (e.g. Netflix, BBC) restricts access to their media to specific countries. Is it immoral for people in other countries – who would pay for the media, if they could – to use a VPN to gain access to the material?
  • A network manager discovers illegal activity by the owner of her company. Her employment contract has strict “non-disclosure” conditions that forbid her to tell anyone of her discovery. What should she do?
  • A person discovers a website promoting illegal or immoral activities. The person knows of several easy ways to take the site down, such as raining down a DDOS attack upon it, or imploding the site with a well-placed zero-day exploit. Should the person attack the site?
  • Anu has been looking for a good domain name. He thought up a good name, but it turned out to be registered by MrX in Belgium. Anu emailed MrX, asking whether the domain name might be for sale. MrX said it might be, but the price quoted was far too high for Anu. Nevertheless, MrX and Anu struck up a friendship and corresponded and collaborated for a couple of years on various projects they had in common. In one email, MrX mentions that he needs to renew his domain name really soon because it has just expired. Anu could register the domain name. What is Anu’s dilemma?
  • Sue works in IT. Her employer is notoriously greedy, and often expects staff to work beyond their normal work hours and on weekends, saying “Well, if you’re not willing to do some extra unpaid work, there are a lot of other people who would be happy to take your job.”  Sue suffers in silence because she needs the money. One day, she discovers a bug in the company’s accounting software that would let her give herself secret bonuses and pay rises that would never be discovered. She could reimburse herself for the hundreds of hours of unpaid work she has been forced to give her employer. What should she do?
  • Naz works for a company whose terms of employment forbid the use of company computers for entertainment, time-wasting behaviour, or seeking alternative employment. Naz’s pay is so low that he cannot afford an internet connection at home. He learns of a good job at another company, but he would have to use his work computer to apply for the job. What is Naz’s dilemma, and what should he do?
  • Arnold’s company forbids the use of the LAN for non-work purposes. One day, Arnold has a brilliant idea and finishes a project several hours early. He has nothing to do for the rest of the day. He thinks he might bend the rules, since he’s been so clever and productive, and play an online game. Is he being unethical?

Do you have other suggestions of ethical dilemmas in IT?

ITI U402 KK05 – strategies for resolving legal and ethical tensions between stakeholders arising from information management practices

 

Just remember – if you are asked about an ethical dilemma, don’t try to convert it to a legal or rule-based issue to make life easier for yourself.

The whole point of ethical dilemmas is that they cannot be easily solved with a magic wand, such as a rule or law.

Ethical dilemmas are difficult because every possible reaction to them is bad for someone. They are lose/lose situations. You’re damned if you do, and damned if you don’t.

None of the following strategies is examinable!

  • It is easier to prevent ethical dilemmas than to resolve them: advertise and enforce codes of conduct that define proper, expected behaviours.
  • Decision support frameworks  are useful to advise decision-makers how to respond to specific dilemmas (similar to judges using precedents in common law to guide their verdicts and sentencing). It reduces stress in decision makers, and leads to a fair and consistent treatment of similar offenders.
  • Education – to tell people what behaviour is expected and what is punishable. If no-one knows what they should do, it’s hard to blame them when they do it wrong.
  • Sanctions – to punish villains and ne’er-do-wells who deliberately and knowingly scoff at your authority. They also set an example for other potential miscreants when they see you are serious about enforcing your standards.

ppt-icon Managing Ethical Dilemmas

ITI U402 KK06 – reasons for preparing disaster recovery plans, and their scope, including

  • evacuation,
  • backing up,
  • restoration and test plans

A DRP identifies steps to be performed in case:

  • the company loses a key employee
  • the company cannot access its computers
  • information on its computers or network is lost
  • the office building is destroyed
  • information has been corrupted

…to name but a few contingencies.

What sorts of disaster might strike your valuable data?

According to a White Paper from IBM, the leading causes of data loss are:

  • Hardware or System Malfunction 44%
  • Human Error 32%
  • Software Malfunction 14%
  • Viruses 7%
  • Natural Disasters 3%

And as time goes by, the dangers increase because:

  • businesses are becoming more and more reliant on IT to stay in business
  • paper records are often not kept – all data is stored electronically
  • businesses rely on electronic communications
  • IT systems are becoming increasingly complex and hard for the average person to maintain
  • viruses and hacking ‘exploits’ are becoming more common and more destructive
  • more and more employees are being given access to corporate data, increasing the chance of damage or loss
  • few corporations know the true value of their data until they lose it
  • more and more corporations are linking their computer systems to communication systems, such as LANs, WANs and the Internet, thereby increasing the vulnerability of their data to external attack.
  • the more a computer is used, the more it is relied upon. At the same time, increased use increases the likelihood of system failure.

So, just how disastrous can data loss be?

IBM reported that, “Fifty percent of companies that lose critical business systems for 10 or more days never recover.”

For most companies today, data is their business. If that data is lost or corrupted, or merely interrupted for a long enough period, the blow to the company can be fatal. Studies show truly disastrous results for businesses that lose access to data.

Studies of businesses that lost access to their data for extended periods found that 25% suffered immediate bankruptcy; 40% went bankrupt within two years; and almost all were bankrupt after five years.

And how much does it cost to recover data? MASSIVE AMOUNTS – at least $1000 per megabyte. Data must be manually found or re-created, re-entered, validated, tested, updated. And remember that there are not many paper records nowadays – most data may never be recoverable. It takes a lot of labour and time – and normal profitable business is probably impossible to conduct until the data is restored.

Building a DRP

  • Predefine the conditions that may cause your recovery plan to go into effect: some threats are common to any system; others may be peculiar to a single organisation or location (e.g. in Australia, a bushfire plan would be critical. In Kansas, a tornado plan is important.)
  • Identify decision makers and their roles before, during and after an outage emergency
  • Inventory the resources required to bring your IT systems back online
  • Identify assumptions on backup technique, frequency and location for data vintage and retrieval
  • Prioritise and sequence the restoration actions defined in your recovery plan into a detailed timeline and checklist
  • Predefine an operations centre to coordinate status, issues and assignments
  • Develop communication strategies for keeping your employees and customers informed
  • Organise your recovery plan into a flexible, easily maintained tool
  • Validate your recovery plan by conducting simulations based on real-life outage emergency declarations

Let’s imagine there is a fire at your office… You should ring the office manager, but you don’t have her home phone number. You need to ring the insurance company immediately to get the destroyed equipment replaced, but you can’t remember what company insures you or where the policy is (oh… dear. You remember: the policy was in the filing cabinet in your burnt-out office.) You need to rent emergency equipment to get back into business… but you can’t remember the phone number of that company either. You need to get your backup tapes to restore the file server’s data… oh no… the backup tapes were in the filing cabinet with the insurance policy. At least you can get a copy of your recovery plan and… oh dear. The only copy of the plan was stored on the file server. You really are up the proverbial brown creek…

What do you do to avoid this cruise down a smelly waterway?

You print out your draft disaster recovery plan and read it. You discover it’s out of date and does not cover many of the problems you faced in your nightmare. You get a team together from management, IT staff and office staff and update and complete the plan.

What should a good DRP achieve?

  • Provide for the safety and well-being of people on the premises at the time of a disaster;
  • Continue critical business operations;
  • Minimize the duration of a serious disruption to operations and resources (both information processing and other resources);
  • Minimize immediate damage and losses;
  • Establish management succession and emergency powers;
  • Facilitate effective co-ordination of recovery tasks;
  • Reduce the complexity of the recovery effort;
  • Identify critical lines of business and supporting functions.

Testing the DRP

Unless you test your DRP, you will never sleep soundly. What if the plan fails when it’s most needed? Make sure it doesn’t. Test it.

  • get a computer from somewhere and try to install all the software and hardware needed to do its job
  • perform sample data restores using test data
  • run fire drills
  • ring the key emergency phone numbers to make sure they’re still accurate
  • check the list of responsibilities staff have in case of emergency: are they still employed or even alive?
  • do staff know what their responsibilities are? Quiz them.
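For “perform sample data restores using test data”, one simple way to confirm that a restored file matches the original is to compare checksums. A Python sketch – the file paths are hypothetical:

```python
import hashlib
from pathlib import Path

def file_checksum(path):
    """Return the SHA-256 digest of a file, read in chunks so that
    even huge backup files don't need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

# Hypothetical paths: the live data and a test restore from backup media.
original = Path("data/customers.db")
restored = Path("restore-test/customers.db")

if original.exists() and restored.exists():
    ok = file_checksum(original) == file_checksum(restored)
    print("Restore verified" if ok else "RESTORE FAILED - backup is not usable")
```

A matching checksum proves the bytes survived the backup-and-restore round trip; a mismatch means the backup is worthless, and far better to discover that during a drill than during a disaster.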

ppt-icon Still to come – or not.

ITI U402 KK07 – possible consequences for organisations that fail to follow or violate security measures

  • loss of trade secrets
  • potential violation of the Privacy Act, Health Records Act etc  if personal information is damaged or released
  • loss of reputation as a trustworthy organisation
  • loss of income after catastrophic data loss destroys your ability to get paid by customers or conduct business
  • prosecution by the tax office if tax records are lost
  • inability to pay wages
  • corporate death

ppt-icon Still to come. Maybe. Probably not, though.

ITI U402 KK08 – criteria for evaluating the effectiveness of information management strategies

Note two key words – “evaluating” and “effectiveness”.
These words are not interchangeable with “testing” and “efficiency”.

Evaluating is done after testing and development and implementation of a solution. The product has already been proved to be working properly during the testing phase. Evaluation is NOT A SECOND TESTING STAGE!

Evaluation determines if/how well a solution is achieving the goals for which it was originally created. It does NOT determine whether it produces correct output or behaves properly. A website created to increase profit may work 100% accurately but still not create any extra profit: its testing says it’s perfect (no errors!) but its evaluation shows it’s a failure (no profit!).

Evaluation occurs after a solution’s rollout to its users. It may begin some months after users start using it for real. The delay is to give users time to become familiar with the solution so they can judge it knowledgeably.

Evaluation criteria are the topics you use to judge a solution’s level of success or failure. Each solution will have different criteria, based on what is vital for it to achieve.

  • A game – must be fun and playable.
  • An encryption utility – must be hard to decode by unauthorised people.
  • A website – must convey its message clearly and accurately.

Effectiveness criteria – You only need to remember that EFFICIENCY criteria comprise measurable TIME (SPEED), COST, LABOUR (EFFORT). Every other criterion is effectiveness. (There. Easy! Aren’t you glad you came to vceit.com?) Effectiveness criteria are often opinion-based and endless, and relate to how well a solution does its job in terms of: accuracy, readability, ease of use, security, robustness, attractiveness, fun, portability, etc.

Evaluation criteria are determined during the design stage of the PSM. These are used during evaluation to determine if the project has been successful or not. Lessons may be learned to improve the next project.

ppt-icon Still to come

ITI U402 KK09 – role of people, processes and digital systems in the management of data and information

This KK dotpoint could launch a dozen textbooks.
But fortunately for me it’s really boring and I couldn’t be much bothered with it.

Roles of people:

  • data entry staff
  • systems managers
  • programmers, software architects, engineers
  • project managers
  • technical support staff
  • network managers
  • technical consultants
  • web developers
  • software testers

Try this site for a few more details about ICT jobs and roles.

Roles of processes – none of these is examinable, but I’m getting too sleepy to care.

  • data acquisition – primary/secondary, qualitative/quantitative using appropriate methods
  • input – putting the acquired data into the system
  • validation – checking data reasonableness (existence, type, range) using manual or electronic means
  • processing – converting raw data into meaningful information
  • storage – including backups, archives 
  • retrieval
  • output – to printers, monitors, speakers
  • communication – usually to network nodes or users using TCP/IP, email, messaging etc
  • disposal (the recycle bin is not secure!)

Digital systems – people, hardware, software and networks, communication protocols, intranets, internet, mobile communications, VPN, wired/wireless, operating systems, users, data, architectures (thin/thick client).

Roles of hardware:

  • DATA INPUT – Keyboards, touchscreens, barcode readers, flatbed scanners,  voice recognition, custom dials and buttons on machines.
  • DATA PROCESSING – CPU, GPU
  • DATA STORAGE – RAM (during processing), HDD, SSD, NAS, cloud (secondary storage)
  • DATA COMMUNICATION – switches (within a LAN), routers (between networks), cables (CAT cable, fibre optic), wireless (Wifi, 3G/4G, microwave, Bluetooth, infrared), WAPs (wireless access points).
  • DATA OUTPUT – printers (laser, inkjet, thermal), monitors (LCD, plasma), speakers (audio), LEDs,  haptic feedback (e.g. phone vibrations).

There. Wasn’t that quick and easy?

ppt-icon Hardware components of digital systems

ppt-icon Network hardware (servers, NICs, modems, RAID, etc)

 

ITI U402 KK 10 – types and causes of accidental, deliberate and events-based threats to the integrity and security of data and information

ppt-icon Threats to data

ITI U402 KK 11 – physical and software security controls for preventing unauthorised access to data and information and for minimising the loss of data accessed by authorised and unauthorised users

We’ve done this earlier in U3O1 KK04 – physical and software security controls used by organisations to protect their data.

ppt-icon Data Security

ITI U402 KK 12 – the advantages and disadvantages of using networks and cloud computing for storing and disposing of data and information.

Here are some suggestions. There are many other valid possibilities to consider.

Networks

Advantages:

  • Centralised storage can be accessed anywhere on the LAN – far better sharing for teams.
  • Fast loading and saving – especially for huge media files and archives – compared with internet speeds.
  • Easier to back up one central file store than dozens of separate workstations.
  • Data can be deleted in one location instead of finding every copy spread over many workstations’ hard disks.
  • Data is fully in the control of its owner.

Disadvantages:

  • If the single copy of data in the central storage location is damaged or destroyed, the only copy might be lost.
  • If the network goes down, no file access is possible.
  • Expensive to set up, complex to maintain and upgrade.
  • Backups are the data owner’s responsibility.

Cloud

Advantages:

  • All the advantages of networks, above.
  • Less need for local storage hardware that requires purchasing, space, maintenance, upgrades and backing up.
  • Scalable – extra storage or processing resources can easily and quickly be added when necessary.
  • The service provider should be backing up data on your behalf.

Disadvantages:

  • Data is inaccessible without a working internet connection.
  • Data is not in the direct and total control of its owner.
  • The service provider may unexpectedly fail or cancel an account.
  • Ongoing costs.
  • Lack of certainty about how sensitive data is being protected, used or misused by the service provider.
  • Uploads/downloads via the internet can be very slow compared to local storage.
  • Remote access by unauthorised users is easier.

ppt-icon Cloud Computing
