Introduction to Information Security – Experience

I took the Introduction to Information Security course (CS6035) as part of the OMSCS Summer 2017 semester.

The course was a good overview of various aspects of Information Security. It broadly covered topics like system security, network security, web security, cryptography, different types of malware etc. The course was lighter in terms of work load compared to the other subjects I’ve taken so far. I really liked the projects which were thoughtfully designed to give the students hands-on experience in each of these topics.

The four projects that we had to do were:

  1. Exploiting a buffer overflow in a given piece of vulnerable code. This required brushing up on C basics, understanding how process memory allocation works internally and some playing around with gdb.
  2. Analyzing provided malware samples using Cuckoo, an automated malware analyzer and sandbox, to identify behaviors such as registry updates, keyboard and mouse sniffing, remote access, privilege escalation etc.
  3. Understanding and implementing the RSA algorithm in Python, identifying the weakness of short keys (64 bit) and decrypting an RSA-encrypted message by exploiting this weakness.
  4. Exploiting vulnerabilities in a target (sample) website using Cross-Site Request Forgery (XSRF), Cross-Site Scripting (XSS) and SQL injection.
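
The short-key weakness from the RSA project can be sketched in a few lines. This is only an illustration under stated assumptions, not the course's solution: the toy primes, the exponent and the function names are all hypothetical, and the key is far smaller than 64 bits so that naive trial division finishes instantly. Once the modulus is factored, the private exponent follows directly.

```python
def factor(n):
    """Return a factor pair (p, q) of n by trial division.

    Only practical when n is small, which is exactly the weakness.
    """
    if n % 2 == 0:
        return 2, n // 2
    f = 3
    while f * f <= n:
        if n % f == 0:
            return f, n // f
        f += 2
    raise ValueError("n is prime, not an RSA modulus")


def recover_private_exponent(n, e):
    """Rebuild the private exponent d from the public key (n, e)."""
    p, q = factor(n)
    phi = (p - 1) * (q - 1)
    return pow(e, -1, phi)  # modular inverse; needs Python 3.8+


# Hypothetical toy key: two small primes chosen for the example.
n = 10007 * 10009
e = 65537

message = 42424242                 # must be smaller than n
ciphertext = pow(message, e, n)    # what an eavesdropper would see

d = recover_private_exponent(n, e)
recovered = pow(ciphertext, d, n)  # attacker decrypts without the private key
```

With real key sizes the `factor` step is what becomes infeasible; everything after it stays cheap.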

Apart from the projects, there were 10 quizzes to be completed, one per week throughout the course. The various exploits discussed in the course are fairly easy to introduce into a codebase if you are not aware of them. Unfortunately, they are still pretty common, many years after they were first discovered.

Hence, no matter the type of software development one is into (mobile, web, DB, relatively low level languages like C, embedded device programming, bare metal etc.), these exploits and their counter-measures are a must-know.

 


Machine Learning – Experience

I recently completed CS 7641 – Machine Learning as part of my OMSCS coursework. The course was really enjoyable and informative.

The course was taught by Professors Charles Isbell and Michael Littman. Both are really awesome. Unlike most other courses on the topic, they have managed to make the course content easy to understand and interesting without losing any of its essence. All videos are structured as conversations between the Profs where one acts as the teacher and the other as the student, which is very effective.

All the course videos are available publicly on YouTube – link. Also, I would recommend watching this funny a cappella on ML by the Profs, based on Thriller – link. 🙂

The course was a literature survey and general introduction to the various areas in ML. It was primarily divided into 3 modules:

  • Supervised learning – where we are given a dataset with labels (emails classified as spam or not). You try to predict the labels for future data based on what you’ve already seen or ‘learned’.
    • Techniques include Decision Trees, K-Nearest Neighbours, Support Vector Machines (SVM), Neural Networks etc
  • Unsupervised learning – all about finding patterns in unlabeled data. Eg: Group similar products together (clustering) based on customer interactions. This can be really helpful in recommendations etc.
    • Randomized Optimization, clustering, feature selection and transformation etc.
  • Reinforcement learning – the most exciting one (IMHO). This overlaps with many concepts we usually consider part of Artificial Intelligence. RL is about incentivizing machines to learn various tasks (such as playing chess) by providing different rewards.
    • Markov Decision Processes, Game Theory etc.
    • I found the concepts in GT such as the Prisoner’s Dilemma, Nash Equilibrium etc. and how they tie into RL interesting.

All of these are vast subjects in themselves. The assignments were designed in such a way that we got to work with all of these techniques at least to some extent. The choice of languages and libraries was left to us, though guidance and recommendations were provided. Through that, I got the opportunity to work with Weka, scikit-learn and BURLAP.

Overall, I really enjoyed the course. I am hoping to take courses like Reinforcement Learning (link) in upcoming semesters to learn more about these topics.

The Pragmatic Programmer

After having it on my to-do and wish list for about a year, I finally ordered and read ‘The Pragmatic Programmer‘. It was a really interesting read. I was able to relate to many of the chapters in it. The book talks about how programmers can rise from journeymen to masters.

The book contains many (70 to be precise) one line nuggets of programming wisdom. The authors themselves have made these available online here. Coding Horror (Jeff Atwood) also has a handy quick reference to many of the ideas mentioned in the book – link.

Even though the tips by themselves are great, I would recommend reading the whole book rather than reading them in isolation. What makes the book great is the way the authors present the ideas in easy-to-understand ways, often using small stories and analogies. Some of the interesting ones are below:

The Broken Window Theory (wiki):

Consider a building with a few broken windows. If the windows are not repaired, the tendency is for vandals to break a few more windows. Eventually, they may even break into the building, and if it’s unoccupied, perhaps become squatters or light fires inside.

This is how human psychology works. The same is applicable in terms of software quality. If we introduce entropy into the system (in the form of poor code, lack of unit or integration testing, poor review practices etc.), it will spread rapidly and destroy the system. The opposite can also happen where once we establish an immaculate system and great practices, individuals would try not to be the first to lower the standards.

The Stone Soup

The story can be read here. The authors have lessons from both sides of the story:

Tip: Be a Catalyst for Change

Like how the soldiers (or travellers as per the wiki) influenced and brought about change gradually, if we show people a glimpse of the future, they will be more willing to participate.

Tip: Remember the big picture

The villagers fall for the stone trick because they fail to notice gradual changes. This can happen to our software systems and projects as well. The next point is related.

The Boiled Frog

If a frog is put suddenly into boiling water, it will jump out, but if it is put in cold water which is then brought to a boil slowly, it will not perceive the danger and will be cooked to death.

The story is often used as a metaphor for the inability or unwillingness of people to react to or be aware of threats that rise gradually. Gradual increases in CPU/memory utilisation or service latencies which eventually bring down systems come to mind here. Gradual feature creep and project delays which eventually add up to failed projects are also examples.

Some of the programming pearls of wisdom that I found most compelling were:

The Requirement Pit 

Requirements are often unclear and mixed with current policies and implementation. We must capture the underlying semantic invariants as requirements and document the specific or current work practices as policy.

Tip: Abstractions live longer than details

The Law of Demeter for Functions (wiki)

An object’s method should call only methods belonging to:

  • Itself
  • Any parameters passed in
  • Objects it creates
  • Component objects

Following this law helps us write ‘shy’ code which minimises coupling between modules.
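
A small sketch may make the rule concrete. The classes below (`Wallet`, `Customer`, `Cashier`) are hypothetical and are not taken from the book; they only illustrate a method calling its own components and the parameters passed in, rather than reaching through them.

```python
class Wallet:
    def __init__(self, balance):
        self.balance = balance

    def debit(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


class Customer:
    def __init__(self, wallet):
        self._wallet = wallet  # component object owned by the customer

    def pay(self, amount):
        # Allowed: a method calling a method of its own component.
        self._wallet.debit(amount)


class Cashier:
    def charge(self, customer, amount):
        # Allowed: calling a method on a parameter passed in.
        # Not 'shy': customer._wallet.debit(amount), which reaches
        # through the customer into an object the cashier never received.
        customer.pay(amount)
        return amount


wallet = Wallet(100)
Cashier().charge(Customer(wallet), 30)
```

Because `Cashier` knows nothing about `Wallet`, the customer's payment internals can change without touching the cashier code.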

Listing other tips below:

  • DRY principle – Don’t Repeat Yourself. Avoid duplication of code or documentation.
  • Orthogonality – Decouple systems into independent components.
  • Always use version control (even for documents, memos, scripts – for everything)
  • Use Domain Specific Languages (DSLs) and Code Generators to simplify development
  • Ruthless testing – Test early, test often, test automatically
  • Use prototypes and tracer bullets wherever and whenever possible

 

AI for Robotics – Experience

I took the AI for Robotics class as part of the OMSCS program in Summer ’16. It was a really interesting and challenging experience. It was taught by Prof. Sebastian Thrun, who led the self-driving car project at Google. It was his team from Stanford that won the DARPA Grand Challenge in 2005, driving a car (Stanley) over a 212 km off-road course. Incidentally, Prof. Thrun is a co-founder of Udacity and was its CEO until recently.

The class consisted of two portions: 

  • a series of lectures combined with small programming tasks
  • two open-ended projects related to self-driving cars

The whole course centers around the use of probabilistic models to predict the various parameters involved such as the location of the robot car, the location of various landmarks, obstacles, moving targets such as other cars, pedestrians etc. The Prof also has an aptly titled text book ‘Probabilistic Robotics’ to go along with the course (though I couldn’t make much use of it).

The lectures covered the following topics:

Localization

Noise is an essential part of robotics.

There will be noise in the robot motion. Eg: If we instruct the robot to move 5 meters, the robot might end up moving only 4.8 meters due to tire slip or an uneven surface.

There will be noise in sensor measurement. Eg: If the sensor readings tell us we are 3 meters from the car ahead, the actual distance might be 2.7 meters.

How can a robot car navigate the road safely given all these noises? That is exactly what localization addresses. The term refers to various techniques which help us ‘see-through’ the noise and identify the underlying motion model of the robot. The following localization techniques were taught in class:

  • Kalman filters: These work best for linear motions. The predictions are Gaussian distributions and hence unimodal, i.e. the prediction only tells us the single most probable location of the robot (no info on the 2nd or 3rd most probable locations etc). There are extensions of the standard KF, such as the Extended KF and the Unscented KF, which handle non-linear motion, though their estimates remain unimodal.
  • Particle filters: These seem best suited for localization since they work for non-linear motions and support multi-modal distributions.
[Figure: Localization in action. Hex bug path in black and localized particle in blue.]
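
The measure-then-move cycle of a Kalman filter can be sketched in one dimension. This is a minimal illustration, not the course code: the function names and the sample numbers (measurements, motions and variances) are assumptions chosen for the example. Each measurement shrinks the variance of the Gaussian belief, and each noisy motion grows it again.

```python
def update(mean, var, measurement, meas_var):
    """Fuse the current Gaussian belief with a noisy measurement."""
    new_mean = (meas_var * mean + var * measurement) / (var + meas_var)
    new_var = 1.0 / (1.0 / var + 1.0 / meas_var)
    return new_mean, new_var


def predict(mean, var, motion, motion_var):
    """Shift the belief by the commanded motion and add motion noise."""
    return mean + motion, var + motion_var


def kalman_1d(measurements, motions, mean, var, meas_var, motion_var):
    """Alternate measurement updates and motion predictions."""
    for z, u in zip(measurements, motions):
        mean, var = update(mean, var, z, meas_var)
        mean, var = predict(mean, var, u, motion_var)
    return mean, var


# Hypothetical run: a very uncertain initial belief (huge variance),
# five position readings and the motion commanded after each one.
mean, var = kalman_1d(
    measurements=[5.0, 6.0, 7.0, 9.0, 10.0],
    motions=[1.0, 1.0, 2.0, 1.0, 1.0],
    mean=0.0,
    var=10000.0,
    meas_var=4.0,
    motion_var=2.0,
)
print(mean, var)  # the estimate ends close to 11 despite the poor initial guess
```

The belief stays a single Gaussian throughout, which is the unimodal limitation that particle filters avoid.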

Search

Self-driving cars need to find the optimal path to their destination as well. The technique used for finding the optimal path without exploring the entire state space is the A* algorithm. Those who studied AI as undergraduates might be familiar with the approach. It involves the use of a heuristic function which scores each possible move based on how far the resulting state is from the goal state.
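
As a rough sketch of the idea, here is A* on a small grid. The grid, the unit step cost and the Manhattan-distance heuristic are assumptions made for this example rather than the course's exact setup. Cells are expanded in order of cost so far plus the heuristic estimate, which is what lets the search skip parts of the state space.

```python
import heapq


def a_star(grid, start, goal):
    """Return the cost of the cheapest 4-connected path, or None if blocked.

    grid is a list of rows; 0 is free space and 1 is an obstacle.
    """
    def heuristic(cell):
        # Manhattan distance never overestimates on a unit-cost grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    open_heap = [(heuristic(start), 0, start)]  # (f = g + h, g, cell)
    best_cost = {start: 0}

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            return g
        if g > best_cost.get(cell, float("inf")):
            continue  # stale heap entry; a cheaper route was found already
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_g = g + 1
                if new_g < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_g
                    heapq.heappush(
                        open_heap,
                        (new_g + heuristic((nr, nc)), new_g, (nr, nc)),
                    )
    return None


# Hypothetical map: two walls force a detour from top-left to top-right.
grid = [
    [0, 1, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 1, 0],
]
cost = a_star(grid, (0, 0), (0, 4))
```

Returning only the path cost keeps the sketch short; storing each cell's predecessor would allow the path itself to be rebuilt.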

Control Theory

Humans drive cars smoothly. If we ask a robot to move on a particular course, by default it will either overshoot or undershoot its goal and then correct itself. This is because of the inherent delay in the move-sense feedback cycle. This keeps repeating, leading to a zig-zag motion and an overall unpleasant (and potentially dangerous) driving experience. There is a whole domain of control systems devoted to smoothing out the robot’s motion as it approaches its desired course.

The technique we learned is the PID controller. This controller adjusts the steering angle of the robot at every point of its motion based on proportional, integral and derivative terms computed from its CTE or cross track error (the lateral distance between the robot and the reference trajectory).

[Figure: A represents robot motion without any controller and B represents motion with a PID controller.]
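
The three PID terms can be sketched as a small class. This is a simplified illustration under assumptions, not the course implementation: the class name, the gains and the time step are hypothetical, and a real controller would be tuned against a vehicle model.

```python
class PIDController:
    """Compute a steering correction from the cross track error (CTE)."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0    # accumulated CTE, corrects systematic drift
        self.prev_cte = None   # last CTE, used for the derivative term

    def steer(self, cte, dt=1.0):
        # Derivative term: how fast the error is changing (0 on first call).
        if self.prev_cte is None:
            derivative = 0.0
        else:
            derivative = (cte - self.prev_cte) / dt
        self.integral += cte * dt
        self.prev_cte = cte
        # Steer against the error, its rate of change and its running sum.
        return (-self.kp * cte
                - self.kd * derivative
                - self.ki * self.integral)


# Hypothetical gains and errors, for illustration only.
pid = PIDController(kp=0.2, ki=0.004, kd=3.0)
first = pid.steer(1.0)    # robot is 1.0 off the reference line
second = pid.steer(0.5)   # error is shrinking, so the D term counter-steers
```

The derivative term is what damps the zig-zag: as the error shrinks quickly, it pushes the steering back before the robot overshoots.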

 

Runaway robot

The first project was a set of 4 interesting challenges (plus a bonus challenge for the extra smart ones) where we needed to locate a robot (aptly named 404) which had run away from an assembly line and capture it using a hunter bot. This was an individual project. It required some ingenuity to come up with a working solution, since the lessons from class were not directly applicable here.

[Figure: Hunter bot (blue) chasing the runaway bot (black). The red dots are future predictions with which the hunter tries to capture the bot.]

Hex bug motion prediction

The second project was a team project. Here we were given the coordinates of the random movements of a hex bug for 2 minutes at 30 fps (frames per second). We needed to predict the last 2 seconds, i.e. 60 frames, of the bug’s motion. This was an open-ended problem and we could use any technique from inside or outside the class. We were a team of 4 and explored various techniques, including clustering trajectories and building a Markov model, and finally ended up using a particle filter (PF) to solve it.

[Figure: Predictions of hex bug path using various approaches against actual bug path (in black).]

Overall, enjoyed the class a lot!

Technical meetups in Ohio

The tech community in Ohio is very active and diverse. Thanks to Meetup.com, I’ve been able to discover quite a lot of interesting meetings in the neighborhood.

STARTUP Columbus “Startup Saturdays” – Monthly Meetup – July 27th

Conducted on the last Saturday of every month. I’ve written an entire post about this here. Nice experience.

Columbus CodeJam – July 31st

A casual meetup of people interested in coding. Got to meet a .NET developer and a Ruby developer, among other programmers. Chit chat over pizzas & coke. Also made a good friend – Yemane Abebe, an electrical engineering undergrad from OSU. He was interested in learning web development. We met up again later to do some website development.

Angular JS meetup – Aug 7th

Hosted by Command Alkon. Got to meet people who actively use Angular JS for production-level development. Though the discussion was very technical, they were newbie friendly and gave many pointers for starting out with Angular. Also pizzas, beer n coke.
Just as there is WAMP, there is a whole JS-based stack: the MEAN stack. Some of the resources I came to know from the meetup:

  • Angular Seed – a skeletal application for starting off with Angular.
  • Yeoman – a collection of tools that help you scaffold apps, manage packages, and build & test them. Helps you quickly create apps using AngularJS, HTML5 Boilerplate, jQuery, Modernizr, Twitter Bootstrap etc.
  • Lineman – similar to Yeoman, but comes with various settings preconfigured.
  • Batarang –  debugging tool for AngularJS
  • Egghead.io – detailed video tutorial series on AngularJS. Also http://www.davemo.com/
  • Sample contact app – maintained by one of the developers from the meetup

Python DoJo – Aug 9th

This was an interesting meetup as well. It is conducted every Friday at 6PM (planning to be a regular). I met Kenneth Wee, co-founder at ZoopShop, and many interesting Python coders. They were keen to help me jump-start my Python adoption. They provided me with references, tutorials, books and in fact a laptop to try things out during the meetup. Had a really awesome time. There was an after-meetup party as well, but I couldn’t stay for it as it was getting late and the place was a bit far off. Key resources I came to know:

  • IPython Notebook – A standalone python server that provides a complete coding environment with features to even share our work, plot advanced graphics etc.
  • ReadTheDocs – easy documentation for everything.
  • OverAPI – collection of cheat-sheets for lots of languages.
  • PyVideo – video archive of python related talks
  • Project Euler – an interesting set of mathematical & programming questions. Makes a great complement to IPython Notebook for learning Python. Presently in the process of trying it out.
  • VirtualENV – a tool to isolate Python environments from each other and thus avoid version conflicts between packages.
  • PEP8 –  styling guide

The Lean Startup

I recently had the opportunity to read the book ‘The Lean Startup‘ by Eric Ries.

It was a really interesting read. The author is a very seasoned entrepreneur and leverages his experiences to define a set of guidelines, collectively known as ‘The Lean Methodology’, which can help startups of all shapes and sizes achieve their goal of success.

Eric’s blog ‘Startup Lessons Learned‘ is very famous among entrepreneurial circles.

Eric defines a startup as a human institution designed to deliver a new product or service under conditions of extreme uncertainty. He starts by explaining how startups can make sure they are progressing: validated learning. Many times, startups spend time developing features that do not add value for the consumer. At times, they spend a long time adding lots of features before launching. This can lead to a lot of waste in terms of time & human potential. The worst part is that startups sometimes fail to identify whether the features they have added are impacting their growth in any way.

The key here is to measure progress in a more real sense: in terms of customer-centric lessons learned rather than vanity metrics that might be false indicators. Startups make a lot of assumptions about the market, their value & growth hypotheses etc. According to Eric, every startup decision needs to be considered an experiment. These leap-of-faith assumptions need to be rigorously tested. The best way is to build an MVP (Minimum Viable Product) that helps us get consumer feedback. The idea here is to go through the Build-Measure-Learn feedback loop as fast as possible. Any feature that does not help us learn about consumers in measurable terms is waste. The 2 main things a startup needs to validate are its value hypothesis and its growth hypothesis.

This involves the concept of Continuous Deployment, where you build & deploy fast, get consumer feedback and improve. Eric also suggests a method called Innovation Accounting to keep track of your progress. This involves cohort analysis and tests that objectively measure whether a feature has changed customer behavior for the better, such as split-user tests and user-activity tests. All the tests need to satisfy the 3 A’s: Actionable, Accessible & Auditable. For example, instead of looking at the gross growth rate, Eric suggests studying the compounded growth rate, which is the natural growth rate minus the churn rate (attrition rate). The churn rate is the fraction of customers who fail to remain engaged with the company’s product.
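
The growth-rate arithmetic above can be made concrete with a few lines. The function names and all the numbers are hypothetical, chosen only to show how a healthy-looking natural growth rate can be almost cancelled out by churn.

```python
def compounded_growth_rate(natural_growth_rate, churn_rate):
    """Growth left over after subtracting customers who disengage."""
    return natural_growth_rate - churn_rate


def project_customers(initial, natural_growth_rate, churn_rate, periods):
    """Compound the net growth rate over a number of periods."""
    rate = compounded_growth_rate(natural_growth_rate, churn_rate)
    return initial * (1 + rate) ** periods


# Hypothetical startup: 40% new customers per period, but 39% churn.
net_rate = compounded_growth_rate(0.40, 0.39)          # roughly 0.01
after_a_year = project_customers(1000, 0.40, 0.39, 12)
```

Looking only at the 40% figure would be a vanity metric here; the net rate shows the customer base barely moving.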

Eric draws heavily on his own startup experience as CTO of IMVU as well as the lean & just-in-time (JIT) manufacturing methods (like Kanban) followed by Toyota & other successful industrial companies. He mentions the use of the ‘Five Whys‘ to drill down to the basic (mostly human) cause behind every seemingly technical problem. Another major insight concerns the question of whether to pivot or persevere. Most startups face this question at some point in their life. Successful startups usually have success stories which highlight the persistence of their founders as the reason for their success, which can be misleading at times. Eric suggests that founders should be open to change and make decisions based on measurable data that shows whether or not they are gaining traction. He defines the startup runway (time till take off) as the number of pivots that can still be made before the cash reserves run out. He gives the example of the startup Wealthfront as a classic lean startup which made a number of timely pivots before striking gold. Eric has also categorized and methodically analysed most types of pivots that we see in the industry.

In the last chapters, Eric talks about how to ensure sustainable innovation in large corporations as well. Some good reads suggested at the end of the book (on my to-read list) are The Four Steps to The Epiphany by Steven Blank and The Innovator’s Dilemma by Clayton M. Christensen.

Startup Columbus – Startup Saturday

This Saturday (27th July 2013), I participated in Startup Columbus ‘Startup Saturday‘ monthly meetup. I came to know about the event from meetup.com.

The event was hosted at the Dublin Entrepreneurial Center. There were 10 participants in the meeting. It was scheduled from 9.30 am to 12 pm (though it got extended to 3 pm). The meeting was presided over by Alex Jonas, organizer at Startup Ohio and Ohio Games Incubator. We started off by giving a quick 2 min. intro about ourselves. All the other participants were employed and came from various age groups. All were at various stages of their entrepreneurial journey: there was a guy, Tim, who is successfully running 2 startups already and had come to discuss his third startup (25+ years of experience). Then there was Victoria, who had lots of experience (17+ years) in all kinds of administrative and marketing tasks and had helped with operations for many early-stage startups. There were also people who were simply interested in knowing more about entrepreneurship and had come to meet new people.

After the intro, we were each given the opportunity to ask about any specific problem, assistance or guidance that we needed. The whole team would then discuss and come up with various solutions. Problems like pricing strategy, marketing strategy, and growing & retaining the user base were discussed. A participant, Chintan (possibly Indian), had come up with a project called Qlyer. He wanted advice on gaining more traction. Another participant, Naina, had an idea but didn’t know where to start. I personally didn’t have any problem to discuss, so I talked about the ideas and projects that I am part of. Everyone was really supportive and came up with lots of suggestions. Contrary to my expectation, the ideas of DialBlood and TinyMail were well received. Another participant, Ron, who has a Ph.D in Geography and is presently running his own mapping solution startup, was interested in our project ‘SMS based vehicle locating system’. The team suggested that we should look into possibilities of future collaboration.

After the discussions, Alex took us on a tour of the DEC. DEC presently houses 90+ startups. There was also a data center in the same building. Alex explained the story behind some of the recent incubatees. After the tour, the meetup was officially over (~12.30 pm). A few of us stayed back and discussed various startup-related topics. I got the opportunity to talk to Alex in person for some 2 hours. It was a nice interaction: he told me about various initiatives that he had started, about the state of entrepreneurship in Columbus and Ohio in general, about upcoming events etc. He also agreed to introduce me to a few people who are part of TechColumbus, another incubator. I told him about Startup Village and inquired about possibilities of partnering SV with DEC, TechColumbus or other incubators here so that startups in both places can benefit. Alex was interested in the idea and told me that he’d consider the options.

Lastly, Dave, another participant who was a linguistics expert as well as a Karate teacher (and of course a startup enthusiast), was kind enough to drop me back home. That was especially helpful since public transport is infrequent in Ohio, especially on weekends: I had only a single bus for returning, running once an hour, and the nearest bus stop was a half-hour walk away.

To conclude, the meetup was a very enriching experience.