Last Updated: 2017-04-25 Tue 10:00

CS 100 HW 6: Machine Learning and Security Upgrades

CHANGELOG: Empty

Instructions

  • This is an individual assignment: you must do your own work and may not share or compare answers with other students. If you are struggling, ask questions on Piazza, seek help from the TA or professor, and review the class notes and tutorial links provided.
  • Do your work in an acceptable electronic format. Acceptable formats are limited to the following:
    • A Microsoft Word Document (.doc or .docx extension)
    • A PDF (portable document format, .pdf)

    You may work in other programs (Apple Pages, Google Docs, etc.) but make sure you can export/"save as" your work to an acceptable format before submitting it.

  • Submit your work to our Blackboard site.
    • Log into Blackboard
    • Click on CS 100->HW Assignments->HW 6
    • Press the button marked "Attach File: Browse My Computer"
    • Select your file
    • Click submit on Blackboard
  • Make sure the HW Writeup has your information in it:
    CS 100 HW 6 Writeup
    Turanga Leela tleela4 G07019321
    

Problem 1: Basic Machine Learning Question (20%)

Explore the use of neural networks using the TensorFlow library via its web interface here:

http://playground.tensorflow.org/

We discussed the perceptron in class as the fundamental part of neural networks. You can use the site to create a perceptron which looks like the following:

perceptron.png
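
For reference, a single neuron with two input features can be sketched in a few lines of Python. This is only an illustrative sketch: it uses the classic perceptron update rule, which is not necessarily how the Playground site trains its networks, and the toy data, the function names `predict` and `train`, and the settings `epochs` and `lr` are all made up for this example. The data is not one of the four Playground datasets.

```python
# Minimal perceptron sketch: one neuron, two input features (x1, x2).
# Everything here (names, data, settings) is hypothetical and for illustration.

def predict(w, b, x):
    """Return +1 or -1 depending on the sign of w1*x1 + w2*x2 + b."""
    activation = w[0] * x[0] + w[1] * x[1] + b
    return 1 if activation >= 0 else -1

def train(points, labels, epochs=20, lr=0.1):
    """Classic perceptron rule: adjust w and b whenever a point is misclassified."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            if predict(w, b, x) != y:
                w[0] += lr * y * x[0]
                w[1] += lr * y * x[1]
                b += lr * y
    return w, b

# Made-up toy data that a straight line can separate.
points = [(2.0, 1.0), (1.0, 3.0), (3.0, 2.0),
          (-1.0, -2.0), (-2.0, -1.0), (-3.0, -3.0)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train(points, labels)
predictions = [predict(w, b, x) for x in points]
print(predictions == labels)
```

The neuron can only draw a single straight boundary through whatever two features it is given, which is why the choice of input features matters in the questions below.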

A perceptron on its own is not a particularly powerful learner and may not be able to identify a pattern in complex data. There are 4 datasets on the site shown below: Circle, Xor, Gaussian, Spiral.

datasets.png

Explore which of the 4 datasets can be learned effectively using a perceptron (single neuron) with a limited number of input features. Use no more than 2 input features for the perceptron. However, you can select a different pair of features for each dataset to try to learn a good model.

input-features.png

What to put in your HW Writeup

  1. Can the Circle dataset be learned with a perceptron and 2 input features? If so what 2 input features are needed?
  2. Can the Xor dataset be learned with a perceptron and 2 input features? If so what 2 input features are needed?
  3. Can the Gaussian dataset be learned with a perceptron and 2 input features? If so what 2 input features are needed?
  4. Can the Spiral dataset be learned with a perceptron and 2 input features? If so what 2 input features are needed?

Problem 2: Xor Dataset with Two input features (20%)

Consider the Xor dataset with just the X1 and X2 input features. A perceptron (single neuron) does not do too well on this data. Add some more neurons in the single hidden layer and determine if the Xor dataset can be fit better with additional neurons. Do not add any more hidden layers.

xor-testing.png

What to put in your HW Writeup

  1. Take a screenshot of the single-layer network that gave the best performance. Include its results on the data graph on the right, which shows the orange/blue areas. Make sure that you are using only the X1 and X2 features.
  2. How many internal neurons do you need to add to get reasonable performance on the Xor dataset with only a single hidden layer and inputs X1 and X2?
  3. Compare this to your answer for Xor in Problem 1: what enabled a single perceptron to do well on the Xor dataset that is not available here?

Problem 3: Learnable parameters in Neural Nets (10%)

A neural network designer decides ahead of time how many neurons to include in a network and how they will be arranged. This is similar to the basic design pattern of biological brains where certain large-scale structures are connected to one another.

After deciding on the basic architecture of the network, it must be trained on data in order to make its predictions. This is what happens when one presses the "Play" button on the TensorFlow web page.

We discussed the idea of learnable parameters in class and related them to neural networks. What does the neural network "learn" during training? What are its learnable parameters? What form do they take and how are they displayed in the TensorFlow interactive web page?

What to put in your HW Writeup

  1. What are the learnable parameters in neural networks?
  2. How are they shown on the TensorFlow site?

Problem 4: Machine Learning and Ethics (20%)

Algorithms are often touted as more objective than people because they are supposedly not biased in their decision-making process. This problem explores that claim.

Consider the job hiring problem, something most college students will be involved in at some point in the near future. Companies that post available jobs often have hundreds of applicants for the job. Sorting through these applicants and their resumes to determine the most desirable candidates to interview is a time-consuming task for humans, so much so that many companies now use automated tools to screen resumes.

One might use a neural network which is fed input features that are the contents of applicant resumes. Based on records the company has kept from the past about what appears on the resumes of good candidates that are hired compared to what is on the resumes of weak candidates that are not hired, the neural network can be trained to identify good-looking resumes automatically. Human resources could then use the results provided by the neural network to limit their consideration to high-scoring applicants, ignoring those candidates that are predicted to be poor choices.

A great danger of this process is training the neural network to reflect the bias already present in the way humans make these decisions. This can happen at training time. Answer the following questions to explore this issue.

  1. Suppose that the company has received many applicants from both Virginia Tech and GMU but has historically hired mostly Virginia Tech graduates and very few GMU grads. How do you think this will affect the model for "good" job candidates that is learned by the neural network?
  2. Discuss some other biases that may be learned by the neural network by training it on data that is "unfair" in some way.
  3. How might one train a more fair neural network that will not be biased? What kinds of training features/data should be included and excluded to make the model learned by the neural network more fair?
  4. Is it realistic to expect a machine learner to be any less biased than humans and the data provided by humans to train it?
  5. You may wish to examine the following article which outlines some of the issues at stake in the use of algorithms to screen resumes:

    Hiring Algorithms Are Not Neutral by Gideon Mann and Cathy O'Neil, Harvard Business Review 12/9/2016

What to put in your HW Writeup

Answers to the above 5 questions

Problem 5: Automating Jobs (10%)

Visit the following site which discusses automation in the work place:

https://features.marketplace.org/robotproof/

Take the brief quiz which asks you to predict which of several jobs may be affected by automation in the near future.

After answering the first question about whether

  • Tree-Pruning Specialist
  • CEO
  • Real Estate Broker

are more automatable, an interactive graph is shown which looks like the following.

automatable-jobs.png

Use this graph to answer the following questions. Hovering over the dots on the graph will display more information about various professions, such as how automatable they appear to be and their hourly pay rate.

  1. Identify two low-wage jobs (less than $12/hour) that are at least 50% automatable
  2. Identify two low-wage jobs (less than $12/hour) that are less than 30% automatable
  3. Identify two high-wage (more than $20/hour) jobs that are at least 50% automatable
  4. Identify two high-wage (more than $20/hour) jobs that are less than 30% automatable
  5. Describe some patterns in what makes a job more or less automatable

Read the remainder of the article and determine:

  1. What profession has 98% fewer workers in it today than 200 years ago?
  2. What accounts for the great change in this profession?

What to put in your HW Writeup

Answers to the above 7 questions.

Problem 6: Security/Privacy Upgrades (20%)

Many tradeoffs between security, privacy, and convenience are at play as software systems store vital but sensitive information. Practical skills for citizens of the 21st century include the following:

  • Recognize when a software system is more or less secure
  • Be able to change the settings between secure and insecure when there is a choice
  • Reason about the advantages and disadvantages of each mode and select an appropriate option

In this problem, you will seek to upgrade your personal computing security in 2 ways. You are free to choose an area to explore to upgrade your security.

By the end of it you will be able to exercise more control or use a new tool to increase your security or privacy and explain why doing so is or is not a good idea.

Below is a list of some trade-offs relating to computer systems which you probably use. The Secure/Private column lists a way that software can work or a skill that you might possess which is more secure or private than the alternative listed in the Insecure/Public column.

Tradeoff | Secure / Private | Insecure / Public
Login to personal computing device (laptop, phone, etc.) | Password required | Automatic, no password
Encrypted Files | You can choose to create a file that is password encrypted using a tool such as GnuPG | All your files are unencrypted
Hard Drive | Whole hard drive is encrypted | Whole hard drive is unencrypted
Data Backup | You back up your hard drive regularly on another drive or web service | You do not regularly back up your data
Web Site Logins | Manual, log in each time | Automatic, browser remembers passwords
Sending E-mail | You can send E-mail encrypted or have a public key associated to verify you as the sender | No encryption or public key identification used
Home Wireless Access | Your wireless router requires a password for access | Your wireless router is open and requires no login
Firewall | A firewall prevents unauthorized connections to your computer | No firewall set up
Anti-virus | Anti-virus software running | None set up
Web Passwords | You use a password manager such as LastPass to generate random passwords for web sites | You use the same password on many web sites
Web form information (address, credit card) | Re-type information every time | Info is saved by browser, automatically filled in
Mobile Phone GPS Location | Your phone does not reveal your physical location to your service provider | Your physical location is revealed for use by applications
Online Financial information | You re-type credit card and bank account information for each transaction | An online company such as Amazon or Mint saves credit card and bank accounts
Online Privacy | You can use a tool to determine which online advertisers or third-party sites are notified of your visit to web sites | You do not have knowledge of any tools which reveal when advertisers are watching
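
To make the "Web Passwords" row concrete, the following is a minimal sketch of what "generating a random password" can look like, using Python's standard `secrets` module. It is not how LastPass or any particular password manager works internally; the function name `generate_password`, the default length, and the character set are assumptions chosen for illustration.

```python
import secrets
import string

# Hypothetical character set: letters, digits, and punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def generate_password(length=16):
    """Build a password by drawing each character from a cryptographically
    secure random source (secrets), rather than the predictable random module."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

password = generate_password(20)
print(len(password))
```

A password produced this way is hard to guess but also hard to remember, which is one reason such passwords are usually stored in a password manager rather than memorized.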

Select two ways to upgrade your security which you currently do not know how to do. For instance, your laptop might already be password protected but you don't know how to change this so that you can log in without a password. Research how to do so and discuss below. If you already know how to change the password settings on your personal computer, you might research how to encrypt single files using a software tool you download. You are also free to choose something not on the list above to explore so long as you clear it with your instructor.

Make sure to track any sources such as web tutorials or instruction manuals you find useful in exploring your security upgrades.

After finishing the two different security upgrades, answer the following questions.

Security/Privacy Upgrade 1

  1. What security/privacy did you upgrade? Describe it in a few sentences and identify what it is supposed to protect.
  2. Give a detailed set of instructions on how to change settings to the secure/private mode. These instructions should pass the "parent" test: if your mother or father had the same computer as you were working on, she or he should be able to follow the instructions to change the settings.
  3. Include references to sources you used such as online tutorials or built-in help manuals. Put these instructions into your own words; do not copy and paste them. If you figured out how to change the settings without using outside sources, say so and comment on how difficult the process was.
  4. What are at least two advantages of using the insecure/public mode?
  5. What are at least two advantages of using the secure/private mode?
  6. Now that you have learned how to change between the secure and insecure modes, describe what mode your software or system was in initially and what mode you decided to leave it in, and explain why you made that choice.

Security/Privacy Upgrade 2

  1. What security/privacy did you upgrade? Describe it in a few sentences and identify what it is supposed to protect.
  2. Give a detailed set of instructions on how to change settings to the secure/private mode. These instructions should pass the "parent" test: if your mother or father had the same computer as you were working on, she or he should be able to follow the instructions to change the settings.
  3. Include references to sources you used such as online tutorials or built-in help manuals. Put these instructions into your own words; do not copy and paste them. If you figured out how to change the settings without using outside sources, say so and comment on how difficult the process was.
  4. What are at least two advantages of using the insecure/public mode?
  5. What are at least two advantages of using the secure/private mode?
  6. Now that you have learned how to change between the secure and insecure modes, describe what mode your software or system was in initially and what mode you decided to leave it in, and explain why you made that choice.

What to put in your HW Writeup

  • Answers to the 6 questions for Security Upgrade 1
  • Answers to the 6 questions for Security Upgrade 2

Author: Chris Kauffman (kauffman@cs.gmu.edu)
Date: 2017-04-25 Tue 10:00