Machine Learning – Experience

I recently completed CS 7641 – Machine Learning as part of my OMSCS coursework. The course was really enjoyable and informative.

The course was taught by Professors Charles Isbell and Michael Littman. Both are really awesome. Unlike most other courses on the topic, they have managed to make the course content easy to understand and interesting, without losing any of its essence. All videos are structured as conversations between the Profs where one acts as the teacher and the other as the student – very effective.

All the course videos are available publicly on YouTube – link. Also, I would recommend watching this funny a cappella on ML, based on Thriller, by the Profs – link. 🙂

The course was a literature survey and general introduction to the various areas in ML. It was primarily divided into 3 modules:

  • Supervised learning – where we are given a dataset with labels (emails classified as spam or not). You try to predict the labels for future data based on what you’ve already seen or ‘learned’.
    • Techniques include Decision Trees, K-Nearest Neighbours, Support Vector Machines (SVM), Neural Networks etc
  • Unsupervised learning – all about finding patterns in unlabeled data. Eg: Group similar products together (clustering) based on customer interactions. This can be really helpful in recommendations etc.
    • Randomized Optimization, clustering, feature selection and transformation etc.
  • Reinforcement learning – the most exciting one (IMHO). This overlaps with many concepts we usually consider as part of Artificial Intelligence. RL is about incentivizing machines to learn various tasks (such as playing chess) by providing different rewards.
    • Markov Decision Processes, Game Theory etc.
    • I found the concepts in GT such as the Prisoner’s Dilemma, Nash Equilibrium etc. and how they tie into RL interesting.
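
To make the supervised learning idea concrete, here is a minimal sketch of K-Nearest Neighbours in plain Python. The toy "emails" and their two features (number of links, number of all-caps words) are made up for illustration and are not from the course; a real project would more likely use a library such as scikit-learn.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest training points."""
    # Sort the labelled examples by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Count the labels of the k closest examples and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (number of links, number of ALL-CAPS words) -> label
emails = [
    ((0, 1), "ham"), ((1, 0), "ham"), ((1, 2), "ham"),
    ((7, 9), "spam"), ((8, 6), "spam"), ((9, 8), "spam"),
]

print(knn_predict(emails, (8, 8)))  # nearest neighbours are all labelled spam
print(knn_predict(emails, (0, 0)))  # nearest neighbours are all labelled ham
```

The "learning" here is just remembering the labelled examples; the prediction for new data comes from the labels of whatever it most resembles.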

All of these are vast subjects in themselves. The assignments were designed in such a way that we got to work with all of these techniques at least to some extent. The languages and libraries that we used were left to our choice, though guidance and recommendations were provided. Through that, I got the opportunity to work with Weka, scikit-learn and BURLAP.

Overall, I really enjoyed the course. I’m hoping to take courses like Reinforcement Learning (link) in upcoming semesters to learn more about these topics.

The Art of Thinking Clearly

I recently completed reading The Art of Thinking Clearly (link) by Rolf Dobelli. I found the book an interesting, concise and useful read on the many biases of the human mind.

If you go through the reviews of the book on Goodreads (link) or anywhere else online, you are likely to end up with mixed reviews. The negative ones mostly accuse the author of plagiarism. In fact, N.N. Taleb, the bestselling author of many books including The Black Swan (link), has gone ahead and written a detailed account of the instances where his ideas were plagiarized by Dobelli – link.

Interestingly, these happen to be the exact reasons why I ended up reading Dobelli’s book! Let me explain myself a bit here, though that would mean digressing slightly from the subject of this post.

Reading Summary Books

On multiple occasions, I had considered buying books such as Taleb’s The Black Swan, Fooled By Randomness (link) or Nobel laureate Daniel Kahneman’s Thinking, Fast and Slow (link) etc. to understand more about the human mind and its blind spots. Each time, the sheer size of these books has made me put off the task to a distant future.

It might very well be the case that these books (among others) were the first ones to discuss many of the ideas mentioned in Dobelli’s book and that they discuss these ideas with much more rigor. But for a casual reader like me who is looking for a high-level overview of the core ideas of these books without putting in the effort of actually reading them, books like Dobelli’s are the best option.

On a related note, I recently came across this extremely nice YouTube channel – link – that takes this concept of reading summaries a step further by presenting 5-10 minute illustrated videos that summarise the essence of various famous self-help/philosophical books.

The Art of Thinking Clearly

Getting back to the book, there were two other negative points mentioned in the reviews that I was careful to watch out for while reading:

  1. In an effort to come up with ‘100’ limitations of the human mind, Dobelli has also added many somewhat obvious/insignificant ones to the list. This can make the real insights hard to separate out for the casual reader, esp. since they are given in no particular order.
  2. Some of the anecdotes used are contextually inappropriate.

Keeping all these in mind, I’ve been able to get some good insights out of the book. A few of the ones that come to my mind include (not including bias definitions for brevity):

  • Confirmation bias (link) being the mother of all biases. This explains why you will never be able to convince someone in arguments where the topic has inherent uncertainties which are open to interpretations. Political discussions on the internet seem to be a good example.
  • Swimmer’s Body Illusion (link) – Also answers the question ‘Does Harvard make you smarter?’
  • Action Bias (link) – where we feel doing something is more productive than doing nothing, even though what we do might be counter-productive.
  • Effort Justification (link) – where we tend to rate something acquired with more effort as more valuable, rather than objectively assessing the utility of the item.
  • Illusion of Attention (link) – This one was an eye-opener. Particularly the observation that drivers talking on the phone are as susceptible to accidents as drunk drivers, even when they are on hands-free.
  • Survivorship Bias (link) – probably explains why people fail to understand the risks involved in starting startups and overestimate the chances of success.

The entire list of biases can be found here – link.

 

The Pragmatic Programmer

After having it on my to-do and wish list for about a year, I finally ordered and read ‘The Pragmatic Programmer’. It was a really interesting read. I was able to relate to many of the chapters in it. The book talks about how programmers can rise from journeymen to masters.

The book contains many (70 to be precise) one-line nuggets of programming wisdom. The authors themselves have made these available online here. Coding Horror (Jeff Atwood) also has a handy quick reference to many of the ideas mentioned in the book – link.

Even though the tips by themselves are great, I would recommend reading the whole book rather than reading them in isolation. What makes the book great is the way the authors present the ideas in easy-to-understand ways, often using small stories and analogies wherever applicable. Some of the interesting ones are below:

The Broken Window Theory (wiki):

Consider a building with a few broken windows. If the windows are not repaired, the tendency is for vandals to break a few more windows. Eventually, they may even break into the building, and if it’s unoccupied, perhaps become squatters or light fires inside.

This is how human psychology works. The same is applicable in terms of software quality. If we introduce entropy into the system (in the form of poor code, lack of unit or integration testing, poor review practices etc.), it will spread rapidly and destroy the system. The opposite can also happen where once we establish an immaculate system and great practices, individuals would try not to be the first to lower the standards.

The Stone Soup

The story can be read here. The authors have lessons from both sides of the story:

Tip: Be a Catalyst for Change

Like how the soldiers (or travellers as per the wiki) influenced and brought about change gradually, if we show people a glimpse of the future, they will be more willing to participate.

Tip: Remember the big picture

The villagers fall for the stone trick because they fail to notice gradual changes. This can happen to our software systems and projects as well. The next point is related.

The Boiled Frog

If a frog is put suddenly into boiling water, it will jump out, but if it is put in cold water which is then brought to a boil slowly, it will not perceive the danger and will be cooked to death.

The story is often used as a metaphor for the inability or unwillingness of people to react to or be aware of threats that rise gradually. Gradual increases in CPU/memory utilisation or service latencies which eventually bring down systems come to mind here. Gradual feature-creep and/or project delays which eventually add up to failed projects are also examples.

Some of the programming pearls of wisdom that I found most compelling were:

The Requirements Pit

Requirements are often unclear and mixed with current policies and implementation. We must capture the underlying semantic invariants as requirements and document the specific or current work practices as policy.

Tip: Abstractions live longer than details

The Law of Demeter for Functions (wiki)

An object’s method should call only methods belonging to:

  • Itself
  • Any parameters passed in
  • Objects it creates
  • Component objects

Following this law helps us write ‘shy’ code which minimises coupling between modules.
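
A small sketch of what ‘shy’ code can look like in Python. The `Wallet`, `Customer` and `Cashier` classes are hypothetical names of my own, not an example from the book; the point is only the contrast between the two `Cashier` methods.

```python
class Wallet:
    def __init__(self, balance):
        self.balance = balance

    def debit(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


class Customer:
    def __init__(self, wallet):
        self._wallet = wallet  # a component object of Customer

    def pay(self, amount):
        # Allowed: a method calling a method on its own component object.
        self._wallet.debit(amount)


class Cashier:
    def charge_reaching_through(self, customer, amount):
        # Violates the law: reaches through the parameter into its internals,
        # so Cashier is now coupled to how Customer stores money.
        customer._wallet.balance -= amount

    def charge(self, customer, amount):
        # Follows the law: only calls a method on a parameter passed in.
        customer.pay(amount)


customer = Customer(Wallet(50))
Cashier().charge(customer, 20)
print(customer._wallet.balance)  # peeking here only to show the result
```

With the shy version, `Wallet` can change (say, to check for overdrafts differently) without `Cashier` needing to know.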

Listing other tips below:

  • DRY principle – Don’t Repeat Yourself. Avoid duplication of code or documentation.
  • Orthogonality – Decouple systems into independent components.
  • Always use version control (even for documents, memos, scripts – for everything)
  • Use Domain Specific Languages (DSLs) and Code Generators to simplify development
  • Ruthless testing – Test early, test often, test automatically
  • Use prototypes and tracer bullets wherever and whenever possible

 

AI for Robotics – Experience

I took the AI for Robotics class as part of the Summer ’16 OMSCS program. It was a really interesting and challenging experience. It was taught by Prof. Sebastian Thrun, who led the self-driving car project at Google. It was his team from Stanford which won the DARPA Grand Challenge in 2005, where they drove a car (Stanley) over a 212 km off-road course and came first. Incidentally, Prof. Thrun is a co-founder of Udacity and was its CEO until recently.

The class consisted of two portions: 

  • a series of lectures combined with small programming tasks
  • two open-ended projects related to self-driving cars

The whole course centers around the use of probabilistic models to estimate the various parameters involved, such as the location of the robot car and the locations of landmarks, obstacles and moving targets such as other cars, pedestrians etc. The Prof also has an aptly titled textbook, ‘Probabilistic Robotics’, to go along with the course (though I couldn’t make much use of it).

The lectures covered the following topics:

Localization

Noise is an essential part of robotics.

There will be noise in the robot motion. Eg: If we instruct the robot to move 5 meters, the robot might end up moving only 4.8 meters due to tire slipping or an uneven surface.

There will be noise in sensor measurement. Eg: If the sensor readings tell us we are 3 meters from the car ahead, the actual distance might be 2.7 meters.

How can a robot car navigate the road safely given all these noises? That is exactly what localization addresses. The term refers to various techniques which help us ‘see-through’ the noise and identify the underlying motion model of the robot. The following localization techniques were taught in class:

  • Kalman filters: These work best for linear motions. The predictions are Gaussian distributions here and hence will be uni-modal, i.e. the prediction will only tell us the highest-probability location of the robot (no info on the 2nd or 3rd highest-probability locations etc). However, there are extensions of the standard KF, such as the Extended KF and the Unscented KF, which address the linearity limitation (their estimates are still uni-modal Gaussians).
  • Particle filters: These seem best suited for localization since they work for non-linear motions and support multi-modal distributions.
Localization in action: Hex bug path in black and localized particle in blue
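
For a feel of how a Kalman filter ‘sees through’ the noise, here is a minimal one-dimensional sketch. It assumes a robot moving along a line, with Gaussian motion and measurement noise of known variance; the function names are my own and this is not the assignment code.

```python
def measurement_update(mean, var, z, z_var):
    """Fuse the prior belief N(mean, var) with a measurement N(z, z_var)."""
    new_mean = (z_var * mean + var * z) / (var + z_var)
    new_var = 1.0 / (1.0 / var + 1.0 / z_var)
    return new_mean, new_var


def motion_update(mean, var, u, u_var):
    """Shift the belief by the commanded motion u; motion noise adds uncertainty."""
    return mean + u, var + u_var


def kalman_1d(measurements, motions, mean, var, z_var, u_var):
    """Alternate measurement and motion updates, returning the final belief."""
    for z, u in zip(measurements, motions):
        mean, var = measurement_update(mean, var, z, z_var)
        mean, var = motion_update(mean, var, u, u_var)
    return mean, var


# Start very uncertain (huge variance); the belief tightens as measurements come in.
mean, var = kalman_1d(
    measurements=[5.0, 6.0, 7.0], motions=[1.0, 1.0, 1.0],
    mean=0.0, var=1000.0, z_var=4.0, u_var=2.0,
)
print(mean, var)
```

Note that the belief is always a single Gaussian (one mean, one variance), which is exactly the uni-modal limitation mentioned above.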

Search

Self-driving cars need to find the optimal path to their destination as well. The technique used for finding the optimal path without exploring the entire state space is the A* algorithm. Those who studied AI as undergraduates might be familiar with the approach. It involves the use of a heuristic function which scores each possible movement based on an estimate of how far the new state is from the goal state.
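
A minimal sketch of A* on a small grid, assuming 4-connected moves of cost 1 and the Manhattan distance as the heuristic. The grid and function name are made up for illustration.

```python
import heapq


def astar(grid, start, goal):
    """Return the number of steps on the shortest path through a grid of
    0 (free) and 1 (wall) cells, or None if the goal is unreachable."""
    def heuristic(cell):
        # Manhattan distance never overestimates on a 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    best_cost = {start: 0}
    frontier = [(heuristic(start), 0, start)]  # (cost + heuristic, cost, cell)
    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell == goal:
            return cost
        if cost > best_cost.get(cell, float("inf")):
            continue  # stale entry; a cheaper route to this cell was found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_cost
                    priority = new_cost + heuristic((nr, nc))
                    heapq.heappush(frontier, (priority, new_cost, (nr, nc)))
    return None


grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
print(astar(grid, (0, 0), (3, 0)))  # the only gap in the wall is at column 2
```

The heuristic is what lets the search favour cells that look closer to the goal instead of expanding outwards uniformly.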

Control Theory

Humans drive cars smoothly. If we ask a robot to move on a particular course, by default it will either over-shoot or under-shoot its goal and then correct itself. This is because of the inherent delay in the move-sense feedback cycle. This keeps repeating, leading to a zig-zag motion and an overall unpleasant (and potentially dangerous) driving experience. There is a whole domain of control systems on how to smooth out the robot motion as it approaches its desired course.

The technique we learned is the PID controller. This controller adjusts the steering angle of the robot at all points of its motion based on proportional, integral and derivative terms computed from its CTE or cross track error (the lateral distance between the robot and the reference trajectory).

Here A represents robot motion without any controller and B represents motion with a PID controller.
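
A minimal sketch of a PID controller acting on the cross track error. The "plant" in `simulate` is a deliberately crude assumption of mine (the steering output directly sets the lateral velocity), not the car model used in class, and the gains are arbitrary.

```python
class PID:
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_cte = None

    def steer(self, cte, dt):
        """Return a steering command that pushes the cross track error towards 0."""
        self.integral += cte * dt
        # No previous error on the first call, so the derivative term starts at 0.
        derivative = 0.0 if self.prev_cte is None else (cte - self.prev_cte) / dt
        self.prev_cte = cte
        return -(self.kp * cte + self.ki * self.integral + self.kd * derivative)


def simulate(pid, cte=1.0, dt=0.1, steps=200):
    """Toy plant (assumption): steering directly sets the lateral velocity."""
    trace = [cte]
    for _ in range(steps):
        cte += pid.steer(cte, dt) * dt
        trace.append(cte)
    return trace


trace = simulate(PID(kp=1.0, ki=0.1, kd=0.1))
print(trace[0], trace[-1])  # the error starts at 1.0 and shrinks towards 0
```

Roughly, the proportional term steers towards the reference line, the derivative term damps the over-shoot, and the integral term corrects any persistent bias.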

 

Runaway robot

The first project was a set of 4 interesting challenges (plus a bonus challenge for the extra smart ones) where we needed to locate a robot (aptly named 404) which ran away from an assembly line and capture it using a hunter bot. This was an individual project. It required some level of ingenuity to come up with a working solution since the lessons from class were not directly applicable here.

Hunter bot (blue) chasing the runaway bot (black). The red dots are future predictions with which the hunter tries to capture the bot.

Hex bug motion prediction

The second project was a team project. Here we were given coordinates of random movements of a hex bug for 2 minutes at 30 fps (frames per second). We needed to predict the last 2 seconds, i.e. 60 frames, of the bug’s motion. This was an open-ended problem and we could use any technique from inside the class or outside. We were a team of 4 and explored various techniques, including clustering trajectories and creating a Markov model, and finally ended up using a particle filter (PF) to solve it.

Predictions of hex bug path using various approaches against actual bug path (in black)

Overall, I enjoyed the class a lot!

Knowledge based AI – Experience

I enrolled in the Spring ’16 batch of the OMSCS program offered by Georgia Tech and Udacity. As my first course in the program, I chose Knowledge-Based AI, taught by Prof. Ashok Goel and David Joyner. The video sessions of the course can be freely accessed through the Udacity website here.

The class was very interesting and insightful. I thoroughly enjoyed the 3 main projects we had to do throughout the class. The overall class was focused on systematically studying human-level intelligence/cognition and seeing how we can build that using technology. As a measure for assessing human cognition, the class used Raven’s Progressive Matrices (RPM).

Our class came into the internet limelight after the course when Prof. Ashok revealed that one of our TAs, Jill Watson, was actually a bot. This was covered widely by the press, including the Washington Post and WSJ.

With regard to the course content, we were introduced to the broad areas that come under human level AI research such as:

  • Semantic Networks, Frames and Scripts – useful knowledge representations
  • Generate & Test and Means-End Analysis – two popular problem solving techniques
  • Production systems – rule based systems which are useful in AI
  • Learning By Recording Cases and Case-Based Reasoning – techniques to learn based on past examples and to adapt them as per requirement
  • Incremental concept learning – contrary to approaches like ML where we feed millions of examples to train models, human cognition deals with incremental data inputs. Incremental concept learning describes how we can use generalizations and specializations to make inferences from these inputs.
  • Classification – mapping percepts to concepts so that we can take actions
  • Formal Logic – techniques from predicate calculus such as resolution theorem proving are useful in some cases of reasoning. However human cognition is inductive and abductive in nature, whereas logic is deductive.
  • Understanding – Humans always deal with ambiguity. Eg: The same word can have different contextual meanings. Understanding is how we leverage available constraints to resolve ambiguities.
  • Common sense reasoning – about modeling our world in terms of a set of primitive actions and their combination.
  • Explanation based learning and Analogical reasoning – deals with how we can extract ‘abstractions’ from prior knowledge and transfer these to new situations
  • Diagnosis and learning by correcting mistakes – Determining what is wrong with a malfunctioning device/system. Learning by correcting these mistakes.
  • Meta-reasoning – reasoning about our own reasoning process and applying all the above techniques to the same.

As part of the projects, we built small ‘AI agents’ that process RPM problem images and use various techniques to solve them. These were quite challenging and required a fair amount of programming. For me, it was a good learning opportunity for building something non-trivial in Python. We used the Python image processing library PIL and various image processing techniques like connected component labelling.
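
As an illustration of connected component labelling, here is a minimal sketch in plain Python. It assumes the image has already been reduced to a grid of 0s and 1s and treats pixels as connected when they touch horizontally or vertically; the actual project code worked on PIL images and is not shown here.

```python
from collections import deque


def label_components(image):
    """Give every 4-connected blob of 1-pixels its own label (1, 2, 3, ...).

    Returns the label grid and the number of components found."""
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] and not labels[r][c]:
                # Found an unlabelled foreground pixel: flood-fill its blob.
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    cr, cc = queue.popleft()
                    for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                        if (0 <= nr < rows and 0 <= nc < cols
                                and image[nr][nc] and not labels[nr][nc]):
                            labels[nr][nc] = count
                            queue.append((nr, nc))
    return labels, count


image = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]
labels, count = label_components(image)
print(count)
for row in labels:
    print(row)
```

Once each shape in a figure has its own label, an agent can compare shapes across the cells of a matrix (count them, measure them, match them) instead of comparing raw pixels.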

Considering the vastness of the topic – human level intelligence – and that I intend to specialize in ‘Interactive Intelligence’, I found the course really interesting and informative. Enjoyed it!

I’m adding a nice poster created by my classmate Eric on the high level topics we studied in this class.


 

 

 

 

Google summer of code 2014 – Experience

This post has been long pending. In fact, I took part in GSoC 2014 and the GSoC 2015 application process has already started. But I’ll share my experience anyway.

I interned with Raxa. They build web and mobile applications to help small clinics and hospitals go online. Their applications are built on top of the OpenMRS platform. Raxa was fully open source earlier but has moved to a hybrid model presently. All the GSoC projects are open source and available through GitHub.

I started late in the application process. I didn’t have specific plans for applying to GSoC this time. The list of organizations had been announced and the application deadline was pretty close. One fine day, I thought I’d just glance over the organizations that had been accepted and see if there were any that matched my areas of expertise. Raxa was planning to enable patients without smartphones to access their medical services using phone calls via IVR (Interactive Voice Response) and via SMS. This required knowledge of an open source telephony server called Asterisk. I had spent a decent chunk of my third year working on a social initiative/startup called Dial Blood which was built on Asterisk. Much of my final year was spent working on Findauto, which was an SMS based auto rickshaw booking startup. Hence it made a lot of sense for me to apply for this project.

My application emphasized why I was the right person to work on this project and contained a clear, systematic week-by-week breakdown of how I would go about completing the project. The key features I proposed to enable were:

  • Appointment scheduling via SMS or call
  • Calling/submitting queries to doctors

One piece of feedback I got on my application was that there was scope for more to be done in three months. They wanted to make sure I was productively engaged throughout the three months, with additional tasks to take up if I managed to finish these early. IVR can at times seem complicated to the rural audience Raxa was primarily trying to appeal to. Hence instead of asking people to press digits (press 1 for English, 2 for Hindi..), I thought of using voice recognition and natural language processing to hear their response and act accordingly. The inclusion of this task made my application reasonably strong.

I applied only to Raxa and thankfully made it when the results came out. They had selected four interns – one girl from Sri Lanka, one IIT Delhi grad who had previously interned with them outside GSoC and another student who was pursuing his master’s degree in computer science, apart from me. They all were working on really interesting projects, ranging from mobile application development to machine learning. More details can be found here.

We also used to have weekly meetings with the whole Raxa team. Our progress was evaluated each week. I was assigned a mentor. I could reach out to him anytime for any assistance. My project was a continuation of the previous year’s GSoC project. I had some difficulty ramping up on the code because of limited documentation. Apart from that, the Asterisk part went smoothly. For the voice recognition part, I was planning to use Google’s voice recognition APIs. But Google had deprecated the free version by then. Other popular alternatives like Sphinx had limitations working with Asterisk because of some frequency mismatch issues. This was a blocker for me. My mentor helped me identify an NLP startup (Wit.ai) working in this domain. They had really powerful APIs available for free. I used those in my GSoC project and managed to implement the feature successfully.

Another main task that I did was research on the best way to take the service into production. Various options including Amazon EC2, our own servers and third party services were explored. I also contacted various telephony service providers to inquire about PRI lines and their pricing. We were not able to take the service into production because of some constraints Raxa had. But I was code complete and production ready by the end of summer.

We had a final demo day where all 4 of us presented our projects. We all managed to complete GSoC successfully. It was a really good experience for me to work on such a big project end to end under strict time constraints. Not to mention the stipend and awesome goodies.. 🙂

My GSoC project code link – https://github.com/Raxa/voice/tree/master/Asterisk

Wiki link – https://raxaemr.atlassian.net/wiki/pages/viewpage.action?pageId=50724873


Where do I start (programming)?

One of the most common questions people new to programming face is “Where does one start?”
The number of programming languages and platforms out there has made it slightly overwhelming for newcomers to get started. I’ll try to answer this question with my (relatively small) experience of over 6 years dabbling in various technologies. My answer would be slightly biased to a particular platform/language (as you will notice soon). Latest trends have seen Atwood’s Law become a reality.

It says that any application that can be written in JavaScript will eventually be written in JavaScript.

Everything has got its JavaScript equivalent, including JS based servers such as Node.js and MVC frameworks such as Sails.js. The JavaScript based MEAN (MongoDB, Express.js, AngularJS, Node.js) stack is gaining popularity over WAMP/LAMP (Windows/Linux, Apache, MySQL, PHP/Python) stacks.

Over the last few years, the notion of app stores has caught on rapidly. Apart from the largest app stores from Apple (for iPhone) and Google (Android), almost all players such as Microsoft, Blackberry and Amazon have come up with their own app stores. We’ve also seen the inception of various new mobile platforms such as Ubuntu for Mobile, Firefox OS and Tizen (by the Linux community). They’ll have their own stores of course.
Now what does all this mean to a developer trying to commercialize a software product? You can’t simply develop for one major platform (read Windows desktops) and expect the money to flow in. That era is over. It would also be a serious waste of development time to port your application to the native code for each of these platforms.
The web is THE only solution here. All platforms presently support web apps natively, i.e. we can run our web-based application on these OSes like native applications without the aid of any browser. The latest web standards like HTML5 specifically address such usage by giving web applications direct access to various mobile hardware such as the camera and GPS. In fact Firefox OS is specifically designed to support web apps.

The trends mentioned above must be apparent to anyone closely following the tech industry. But I’ve seen many of my friends interested in learning programming sit down with C/C++, only to lose interest soon afterwards and complain about how boring it is. A basic knowledge of these languages (from the 1970s and 1980s) might be helpful in understanding the basic concepts of programming. But these are NOT a must have for a modern-day programmer. Nobody even uses these in the industry anymore (mostly). Similarly, don’t think web development isn’t part of mainstream programming. Surprisingly, a vast majority of people including computer science students have that incorrect notion. Most companies now predominantly work with web technologies. It’s our syllabus that has got it wrong.
If you are seriously interested in programming, here is what I suggest:

  • Learn the basics of programming (C/C++, ideally don’t spend more than a few weeks)
  • Learn HTML and CSS => build static websites
  • Learn Javascript => build interactive websites
  • Learn a server-side language => build web applications (PHP, Python, JSP, ASP, Ruby, JavaScript etc.)
  • Learn a server-side framework => build better and bigger web apps
  • Optionally learn mobile app development for a mobile platform (Java for Android, Objective C for iOS etc)
  • Optionally learn one industry standard core programming language like Java or C# in-depth. This is esp. relevant for computer science students looking for jobs/internships.
  • Optionally learn to use a few CMSs (Drupal, WordPress, Joomla etc.)

That’s it. You are good to go.
On the server side, there are a vast number of languages to choose from. If you choose to go with JavaScript for the server side as well, you get to avoid learning another language, though you’ll be skipping over a major era in the evolution of web technologies.
There are various frameworks for most popular languages out there, like Zend, CodeIgniter and Laravel for PHP, Django for Python, Ruby on Rails (ROR) for Ruby, Sails.js for JavaScript etc. Such frameworks take care of basic things needed by all web apps, such as security, so that developers can focus on their application logic.
You should choose a framework that suits the requirements of the product that you’re building.

My answer is mostly addressing students interested in learning programming for working on some idea that they have or for the fun of it. If you are a computer science student who’s looking to take your programming knowledge to the next level, I suggest you take a different path.
Hone your knowledge of various data structures (stacks, queues, linked lists) and algorithms (sorting, searching, recursion etc). I would recommend ‘Introduction to Algorithms’ by Cormen as a must-read and primary reference. In the case of Kerala University, algorithm analysis and design is in S7. By then you would have missed out on a lot of good placement/internship/coding-competition opportunities.
So start as early as you can (at least by early second year). Try to participate in monthly cookoffs at Codechef, do their practice questions, attempt problems on various websites like TopCoder, and target and prepare for international coding competitions like ACM ICPC etc. Trying to attempt these one or two weeks before college placements isn’t going to get you the best results. Also sign up for some computer science oriented community like the IEEE Computer Society or ACM or Computer Society of India (CSI) to stay abreast of the latest developments in various domains and to stay connected to the industry.

I hope this will help students get a better idea on how to get started with programming.
The views expressed here are my personal opinion. Please feel free to share your suggestions in the comments. 🙂