Category Archives: Programming

Not Everything Appears as It Seems – Lessons on Computing from an ex-Strategy Consultant

How to confuse a spouse new to programming!  A two-step guide!

My wife looks over my shoulder a bit confused.  I’m working on a final project for my Node.js class (7th iteration) and she’s been watching since the first iteration.  This is what she sees:

She asks me calmly!

“Why does it seem like every time I look at your Taskitty application [Task tracking App] it seems to look the same or occasionally looks worse?  I feel like you are procrastinating.  Maybe stop playing video games or watching TV.”

I’m a bit stunned.  I’ve been putting a ton of work in the class.  Easily a dozen or so hours per week on top of my typical work schedule.  Then I realize something spectacular.  1. The entire class is iterative and 2. the professor doesn’t care how it looks in the browser (front-end web development).  He wants us to focus on learning different frameworks and is fine with us replicating the same overall and feel.

The best analogy.  Imagine talking to a neighbor and finding out he’s put a ton of work into his house recently.  You look at the exterior and think “It looks like it’s actually deteriorating a bit.  A little bit of the siding is coming off and some of the wood seems a bit rotten”.  He than mentions that all his time has been focused on interior design and a kitchen remodel.  It’s that aha moment.

My wife was watching the exterior of the website, but was clueless that the kitchen and bathroom had been remodeled.  In fact, in my case the entire internals had been replaced without her even noticing.

The house before remodeling!

In the above cases, both version 4 and the final version of Taskitty are quiet large.  The first version uses a framework called Express that uses routers and templates to create the presentation.  The best way to describe it is the diagram below:

Your browser requests a web page from my server.  The server uses a router to find a template for the web page.  It fills in the blank parts of the template using information from a database and also cookies/session information.

Two important things to note:

  1. Everything happens on  my server, each web page has it’s own template with it’s own fill in the blank variables.
  2. When my server is done it sends the entire newly formed html document to your browser via the internet.
  3. Every web page requires you to contact my server first and all the processing happens on my server.

What changed inside the house!

The final version looks similar on the surface, but how it arrives to the same conclusion is completely different.  Here is the diagram of how it works:

In the above case, the browser requests a page from my server.  My server sends a single web page bundled with a ton of components and JavaScriptThe single page application (SPA) sets up a router on your browser using JavaScript, which than selectively picks and chooses components to create a web page.  It can also independently call the database for more information.  Once the router has finished picking the components and getting data from the database it modifies the single HTML web page that was sent and renders it for you in your web browser.  You can than use the router to reconfigure the web page as many times as you desire.

There are some huge implications here:

  1. When you first contact my server, you have to download not just a web page, but all the components and logic of the website.  That can be a very large initial download.  So the first connection is a bit slow.
  2. Once the web page and website logic exists on your browser, the only time you have to reach out to my server is to contact the database.  You can almost run the application without an internet connection.  That can really speed up the experience since internet connections are slow compared to your computers resources (usually).
  3. By shipping components, I can re-use them across web pages.  This safes a lot of time.

The house walls are still an ugly Green!

My wife of course doesn’t have access to the code or knows how the internals work by looking at what the website generated.  She just sees the same old web page unchanged iteration after iteration.  The only thing she might notice is that the later web page feels faster and seems to function smoothly.  She can easily take this for granted, because both sites are relatively fast.

I think this is one way a developer’s experience can be divorced from an end user.  Sometimes we make huge changes, but things that are visible don’t change drastically.  So it appears no work has happened.

The “SQL” strikes back!

About 3 months after the class, my wife decides to be ambitious and start a class on Excel, SQL and Tableau.  One day, I watch her getting frustrated.  She tells me:

“I keep trying out these SQL commands in this database.  Every single time it tells me the function doesn’t exist.  I’ve tried different things and googled the message for the last 4 hours.  Nothing seems to solve it.”

Trying to be a helping hand and having done SQL for almost a decade.  I decide to look into the issue for her.  The first thing I ask is what SQL database she is running.  She tells me the class is in PostgreSQL.

She types SQL into a web page.  Runs it.  It works.  She opens up a SQL client on her computer.  Types the same thing in.  It tells her the function doesn’t exist.

My first gut reaction is she’s running an older version of the database.  So I tell her to type: select version().  That should tell me what version of postgreSQL she is running.

Version() doesn’t exist!

Yes, the database tells me even the Version() function doesn’t exist.  Now, I’m on high alert.  It’s either a very old database from several decades back, which is nonsensical.  The firm she’s taking the course from is less than a decade old.  Alternatively, she’s using another database.

I google the error message by itself.  The first thing that pops up on Google.  SQLite.  So I ask my wife to try:

select sqlite_version();

She tells me it works!  What does it mean?

My wife’s website was in PostgreSQL and the one on her laptop was SQLite.  Postgres and SQLite have different functions for the same tasks.  So the SQL from postgres will often error out in SQLite.  The same is true in reverse.

Mystery solved!

Except, my wife is now more confused than ever.

  1.  She now realizes there are different SQL databases.
  2. She doesn’t understand why they would teach Postgres and give a SQLite test database for practice.

I don’t blame her.  It’s very confusing.

Possible reasoning?

Having spent a lot of time in Postgres and having taken a few dozen courses/tutorials in Python.  I realized I might know the answer.  SQLite is a database with a small footprint often used in IOT devices.  You can easily download it as a single file.  Postgres on the other hand requires an installer, setting up a database/users and creating tables.  You can definitely bundle it up and ship it as a single file if you are clever about it (docker etc), but it’s much more complicated.

The problem in this courses case is communication.  They should have made it clear that the databases weren’t the same.  That the reason they use SQLite for practice is it’s much easier to set up.  The only thing the students would have to do is google SQLite functions when using it.

Two things, both different, both look similar!

Here ends the tale of two confusingly superficially similar things, which are actually completely different.  Lesson things that look the same might be completely different.  Communicating those differences can have a huge impact on user experience and prevent you from looking like a couch potato.

Web Certificate – 7 Days Left

 

7 Days until the Start of the End

Another personal update!

I have 7-days left of my month long “vacation” from school.  This is the start of the end of my Harvard Extension School – Front-end Web Development certificate.  I will take a class that perfectly sums my last year: 50% Data Science and 50% web development.  The class is called: building interactive Web Applications for Data Science.  I’m concurrently working towards a Masters in Analytics at Georgia Tech, which will become the main focus afterwards.

What I’ve built so far!

CSCI E-12 – Fundamentals of Website Development:

  1. The final project combined flexbox, media queries, jquery/javascript and light boxes.  I had to do some wire-framing and learn a bit about UX.

CSCI E-33A – Web Programming with Python and JavaScript: 

  1. Pokemon Listing website  – pokemon list and coding documentation containing basic HTML, CSS, SCSS and JavaScript
  2. Bookstore with Search:  Book listing for book store with search done in a mix of SQL, Flask and HTML/CSS.
  3. Slack Clone (Chat): Chat messenger that looks exactly like Slack using a mix of Flask and socket.io.
  4. Pizza Shop – E-commerce: This is a Pizza shop application with a shopping cart built with a mix of Django and Django forms.
  5. Property Management App: This is a property management app.  Search houses in Zillow, add them to your properties and watch monthly mortgage payments/interest auto-calculated including payment schedule.  Built using Django and calling Zillow API.

CSCI E-31 – Web Application Development using Node.js

  1. Task Tracking Application: this was incrementally built starting with a static Express application in Node.js, adding MongoDB as the data source, restful apis for task data and finally converting the front-end to use Angular.js.

What’s Next?

building interactive Web Applications for Data Science combines my interest in Data Science and Data Analytics with recent skills I’ve acquired in web development.  The class focuses on Flask deployed to AWS via Docker.  It requires the student to build Machine Learning models and use D3.js that interact with the Web Application.  This should be interesting.

This ties into 2 other classes I’ve taken at Georgia Tech: CSE – 6040 – Intro to Computing for Data Analysis and ISYE 6501 – Intro to Analytics Modeling.  First class explores Python, Pandas, Numpy and Machine Learning implementation.  Later focuses on analytics techniques and introduces around a dozen different type of models.  6501 coding in R.

Fall Term!

Fall term is approaching.  I will be focusing exclusively on my Masters of Analytics outside of work.  Two courses I will be taking: Data Visual Analytics and Simulations.  The former involves a survey of different data-related technologies: Python, D3.js, Spark/Hadoop, SQL-databases.  The later is theoretical with a focus more on the math.  It involves building a few simulations in Arena and Python.  The Data Visual Analytics course is renown for being time-intensive and difficult.  More so, those with little programming experience.

Other Technologies!

Typically, I don’t talk much about things outside of class!  That said, I did get to pick up and implement docker within a codebuild environment.  Build a continuous integration and development system.  That was pretty cool.  I’ve also been reading up on new data technologies: Spark, Presto etc.  During my 1-month break have implemented a Spark cluster on an EC2 instance that contains Jupyter and used my “micro” cluster to try out Spark’s official Machine Learning documentation.  I’ve been wanting to try it out.

Overall, I’m personally trying to push myself towards trying new things out.  Just implementing more things from scratch.  It’s part of stretching some of my new found skills in Linux Administration gained over the last few years.  Also a general interest in learning more about Devops technologies on the side.

Always stay true to yourself and explore this world!  Life is to short to wait for it to pass you by!

Best,

Chris

2020 – Education Overload

Georgia Tech

Georgia Tech

Education overload: 2019-2020.  That should be a life headline.  Before 2 years ago, I picked up new technologies by: reading a few books, scouring the internet for tutorials and than building a bunch of prototypes.  It worked.  But! I’ve always wanted to try graduate school and I wanted to do it on a topic I struggle with: Math.

why math?

math blackboard

math on blackboard

This is part past history and part recent struggle.

In college, I rapidly switched between: chemistry, mathematics, international relations and than economics.  I even threw in some art courses: drawing and painting.  The class that “got away” was always math.  It’s my ultimate “what-if”, had I stuck with it where would I be today.  I’m not sure I would be more successful, but it still interests me.

The second part is recent struggle.  I’ve read a few books on different math topics over the last 10 years: probability theory, statistics, machine learning and game theory.  I’ve even taken an intermediate statistics course, I skipped the pre-requisites due to cockiness (which quickly resolved itself over the next few classes).  The problem learning was never consistent and sustained.  There was always the next best opportunity, Computer Science, staring at me when I struggled with a math equation or theory.

Graduate school kills two birds with 1 stone: that nagging what-if question and the natural instinct to pick up an easier topic in computer science when the math gets hard.

stupidity’s name is ambition…

coffee

coffee

My coping mechanism to experiencing change is to rapidly increase the pace of change.  I partially blame my coffee addiction: rapid heart-beat, high energy, high tempo.  Love that coffee high.  Love how change generates an adrenaline rush.  Same problem, different skin.

I bought a house December 2019.  I got married the following June.  Once those two events kicked off, I got an adrenaline rush from the change.  I folded, applied for both the Computer Science and Analytics Masters at Georgia Tech.  Meanwhile, I couldn’t stay still and picked up two courses in web development over Fall/Summer 2019 at Harvard Extension School and got Linux certified.

Ambition leads to the: “Education Plan”.  A “5-year” style plan reformatted into a 3-year format.  Now, I can keep myself busy for the next 3 years.  Fun.

the “Plan”

grand plan

the grand plan

What did the education plan do?  It formalized the Masters in Analytics as a 3-year goal by breaking it down into 2 classes per semester (that in task master driven manner I’ve kept to).  Converted the 2 Harvard Web Courses into a Front-end web development certificate due Summer 2020.  Finally, listed a bunch of “optional” certificates to pursue.  You know, just in case I had enough free-time.

how’s that been going for you?

I’m finishing up my 3rd course at both Georgia Tech and at Harvard Extension School.  So far, I’ve done since May 2019 (currently March 2020):

Georgia Tech Master of Analytics:

  1. Business Fundamentals for Analytics
  2. Computing for Data Analytics
  3. Introduction to Statistical Modeling

Harvard Extension School:

  1. Fundamentals of Website Development
  2. Web Programming in Python and JavaScript
  3. Web Application programming in Node.js

All I can say, It has been busy.  I’ve been studying 10-20 hours every week since May 2019.  My only break has been December for 2-3 weeks.  It’s been a relentless march of progress.  I’ve been tired, stressed and also had fun.  Overall, I feel like all my free-time has been sucked out of my life.  I’ve really started cherishing relaxed weekends and time with friends/family more.

how to cope?

stress

how to deal with stress?

Golden rules for studying 2 courses at a time, working full-time and also having time for family/friends (as well as 2 work out sessions a week): do as much as possible as early as possible, schedule things in advance, schedule vacation during school holidays and take frequent breaks.

1.  Do things as early as possible

Your worst enemy is procrastination.  Procrastination leads to late nights studying, finishing up projects or work assignments.  Late nights are not sustainable.  You will sooner or later pay for them by either: resting/recovering for long times (you don’t have time), limiting time with friends/Spouse (stress/psychological relief) or experiencing increased amounts of stress (bad for dealing with people).  The trick: be super proactive about everything.

2.  Scheduling things in advance

Try to put things on the book or on your schedule.  This will help motivate proactive behavior.  If you know you will hang out with friends on the weekend, you’ll try to crush that assignment tonight.  You won’t wait until Monday and panic.  It helps keep you more accountable.

3. Schedule vacation during school breaks

Vacations are a nightmare if you are doing part-time school.  A bunch of problems can arise.  1.  you can easily lose motivation when a beautiful mountain range is right outside your door waiting to be hiked.  2. You’ll get FOMO, when is the next time you’ll be on this tropical island?  Why bother studying? 3. Your internet will fail you or be blocked by the government (what happens in China stays in China).  4. You’ll want to spend time with family since you only see them every few years (my parent/s live in Germany/China).  All 4 cases are worse when you are actively studying in school.  Let a vacation be a true vacation, do it when you don’t have work or school.

4.  Take frequent breaks

When studying, take a bunch of breaks.  Go study for a few hours, than go watch a movie, take a long walk or exercise.  Long stretches of studying will “compress” your brain and make it harder to study in the future.  Don’t try to cram too much at once, because it will get exhausting.  This kind of goes back to part 1 of this list: do things as early as possible.  The earlier you do something, the more opportunities you have to take a break or spend time with friends/family.

last, but definitely not least

Having support and understanding friends, family and spouse is crucial.  Don’t neglect those important people in your life (again step 1). Recognize how they make you stronger and help you out on your day-to-day journey.

I’m especially thankful to my loving wife for putting up with long study hours, making the occasional dinner and believing in me.  She also frequently checks on me and gives me an excuse to take a break!  I think this experience would be much tougher without her.

Fall 2018 – Readings and Online Courses

Dear Reader,

I decided to write quickly about 6 programs/books I took: AI Programming with Python (Udacity), Google Cloud Platform Architecting (Coursera), Creating Kivy Apps by Dusty Philips, Linux Kernel Development by Robert Love, Terraform: Up and Running by Yevgeniy Brikman and the Design of Everyday Things by Donald A. Norman.

Udacity: AI Programming with Python

This is the entry-level course in AI for Udacity.  Udacity is an online course-ware provider that focuses on providing content on cutting-edge topics: AI, Deep Learning, Self-driving cars and Node/React.  It also has classes on data analysis and marketing.  AI program starts by teaching you python, brushing up on some linear algebra and finishing off with 4 sections on AI: Two theoretical units on gradient descent and neural networks, one on PyTorch and a final project that classifies flowers.  PyTorch is Facebook’s AI framework built in Python.

AI Class

The course is well prepared overall and should be manageable for someone with intermediate programming skills.  Best in Python.  It’s a bit on the expensive side for online material: $500-$1000, but Udacity tends to make up for the price tag with more interesting content and topics.  Overall the quality is good.  I’ve heard FastAI is a free alternative.

My Repo: https://github.com/Silber8806/udacity-AI-programming-with-python

Coursera: Architecting with Google Cloud Platform

Offered by Google Cloud

Google Cloud Platform, GCP, is googles answer to AWS, Amazon Web Services.  Overall, the platform has many of the same services as Amazon including Cloud Functions, which looks to be AWS Lambda for Google Cloud.  The only glaring issues I’ve noticed so far: 1. No equivalent to SNS for e-mail and some UI issues in the console concerning Google Cloud Storage (S3).  The first part is painful in that Google by default blocks the smtp ports forcing you to use a 3rd party provider (with limited support in stackdriver).

Google outsourced most of it’s training for GCP to coursera.  The courses are broken into 6 parts.  The first course covers all of the services on a 40,000 foot level, 4 courses proceeding this are in-depth technical reviews of virtual machines, networking, storage, database technologies, container services and autoscaling solutions.  It also includes a host of managed services including the very neat sounding: “Spanner”, a supposedly horizontally scaling RDBMS service.

The last course, Reliable Cloud Infrastructure: Design and Process,  was the most interesting.  It provides a practical overview of how to construct infrastructure as well as going over Google’s philosophy on designing scalable software.

Something that really stood out about this program is the labs.  They provide you a temporary Google Cloud Account for your tutorials.  You get to practice on the actual Google console and shell prompts.  This program is reasonably priced at $49.99 per month.  You can take the material without labs for free, but I’d recommend the labs!

Creating Apps in Kivy:

Kivy is Python’s Mobile Application development framework.  It’s used to make cross-platform applications for: Android, IPhone and Ipads.  It’s something I’ve wanted to try out for a while just out of curiosity.  I picked up Creating Apps in Kivy by Dusty Phillips, a book on the subject.

Creating Apps in Kivy

The book covers two applications: A weather application using an open weather api and a video game involving spaceships.  The weather app takes the block of the book with significant amount of time spent on UI and UI interactions.  Kivy generally is split into two files: a main file in python that is event-driven and a .kv file that describes the component layout.  A single chapter is dedicated to graphics.  The best part of the graphics section was developing animated snowflakes that fell from a single component.  The book also covers databases and advanced UI components like a carousel.

Overall, I’m happy with the quality of this book with the exception of 2 things.  It was hard to read the indentation between pages and there were a few chapters where I had to add imports or features he didn’t mention to get the examples to run.  For this reason, I’d recommend being somewhat well versed in Python before attempting the sample projects in the book.

Github Repository: https://github.com/Silber8806/kivy_tutorial

Linux Kernel Development (3rd Edition) by Robert Love:

I started reading this book after my colleague at Salesforce.com, a DevOps Engineer, began talking to me more about the Linux Kernel and strange behaviors in Bash.  I read a previous book to get a better understanding of the OS and am really happy that I skimmed through this book.

Kernel Book

Linux Kernel Development covers the Linux source code by going through different subsystems and explaining how it works.  It’s a deep-dive into things like: process management, memory, I/O, Virtual File System, Cacheing, Timesharing… It’s inspirational to see how much thought has gone into the Operating System and how many improvements are mentioned.  You really feel like you are standing on the shoulders of giants after reading this book.

Linux Kernel Development feels like a technical reference and I’d recommend reading a book on Operating Systems before getting too deep into it (or taking courses in C).  I skimmed through this book, because I use Linux daily and feel a bit clueless about what is happening under the covers and wanted to learn more about it.

Terraform: Up and Running by Yevgeniy Brikman:

Terraform: Up and Running covers… Terraform, the cloud provisioning software by HCL corp.  .  Terraform is a framework for deploying IT services to AWS, GCP, Azure and Digital Oceans in an automated fashion.

Terraform is cool in that you write your infrastructure as a series of templates.   These templates are in the form of a declarative language called HCL, which describes your cloud infrastructure typically in a 1-to-1 fashion with the services you have deployed.  What’s special about Terraform is that after you apply a HCL template to your cloud environment, it writes the current state into a file (immutable data structure).  Every additional change begins with your last state and modifies it.  A lot of devops systems are not aware of their previous configurations.  Terraform reminds me a little of Redux, but applied to Infrastructure instead of web development (I might be wrong).

Things I noticed from the book.  They suggested deploying these state files to S3, expect strong concurrency controls around the Terraform library and advice that you don’t mix other devops tools with Terraform.  That way if a user manual creates an account that Terraform isn’t aware of, it won’t create an error (since that manually created user isn’t part of Terraform’s state).  It seems Terraform works best as an isolated service and the “main” provisioning tool for a specific subspace of your infrastructure.

I skimmed through this book as I was interested in Terraform and have heard of a lot of adoptions in industry of this technology.  I didn’t get to test the code in this book and can’t really speak about it.

The Design of Everyday Things by Donald Norman:

The Design of Everyday Things by Donald A. Norman is a book about usability and user design.  It’s the book that I’ve been listening to on Audible, an audio book app by Amazon.com.

The book goes over the design of everyday things: ovens, doors, cars, computers, phone systems, radios and details both design flaws and positive things about different objects.  He focuses a lot on understanding differences between knowledge in the world vs in the mind, how we map controls (buttons etc) to actions and how overloading them can lead to confusion and how to constrict user settings to prevent too many options flooding the users consciousness.

He’s a big proponent of mapping controls to physical aspects of the world.  For example, a car seat control should move the seat forward if a switch is moved forward and backwards if a switch is flipped backwards.  A switch shouldn’t move forward and backwards if a button is pressed left or right as that will confuse a user.  He also favors designs were cues exist in the world that indicate what to do and where each control (button) does one and only one thing.  A button on a watch should only turn off the alarm not adjust the alarm or answer phone calls as well.  If these aspects of designs can’t be applied, he suggests standardization across an industry.  A water facet for example should have cold water knob always to the right of the hot water knob.

This is a great book.  I would suggest reading it on Kindle or as a paperback book. My big issue with Audible and Audio books in general is that I typically listen to them while walking or multi-tasking.  That means I lose focus on certain sections (last 2 chapters).  Overall, I really enjoyed the content I heard from this book.

Best,

Chris

Data Camp: Week Adventure into Python Data Science

Image result for data camp

Introduction:

Noah, a former TechTarget colleague, mentioned  datacamp.com  to me, while we were discussing an issue with remotely hosted files (probably rdp associated) and pandas DataFrames.  On February 10th, I decided to give it a try and today, 6 days later, I finished up the Python Programmer Track (10 classes)!

What is Data Camp?  Data Camp is an online education company that offers data science specific courses.  They focus mostly on two lines of technology: R and Python.  They offer some auxiliary courses in other topics: SQL, Linux and git (as well as a few derivatives).  All the courses are hosted on their online platform, which overall is beautiful to interact with.  A bit more about the topics:

Data Camp Topics:

R – Statistical Programming Language

A statistical programming language inspired by LISP and developed by professors at the University of Auckland.  R has become famous amongst statistics and machine learning researchers where they prototype cutting-edge research and provide it as packages to CRAN, a online index for hosting R code.

Python – General Purpose Language

A general purpose programming language developed by Guido van Rossum.  It’s a dynamically typed, interpreted language that has been used for rapid prototyping and scripting.  Python as a general purpose programming language that can be used for: web development, networking, devops, data analysis and machine learning.  Python has gained popularity over the last few years with increased interest in it’s scientific computing platform.  Pandas, a popular Data Science framework, was inspired by R’s data frame.

SQL – Structured Query Language

Structured Query Language, SQL, is a domain-specific language typically associated with relational database management systems (RDBMS aka SQL databases), but has been coopted by other solutions (BI platforms, Hadoop and Spark).  SQL is easy to learn and revolves around a few common entities: Servers, Databases and Tables (Views etc).  Tables can be viewed as tabular data where rows are entities (employee) and columns are attributes of the entity (salary).  Most SQL is used to retrieve data hosted in tables mediated by RDBMS.

Context

R and Python are used to manipulate data in a procedural way, typically through the use of things like Data Frames (R), Pandas (Python) or Numpy (Python).  These libraries create tabular, matrix, vector or scalar data often rely on vectorized operations to do computation.  Both R and Python have extensive data visualization frameworks to create graphs.  They also have great libraries for statistics, machine learning and artificial intelligence.  SQL is primary associated with data retrieval and is used to get “data” back from a hard drive (through the database).  Most database are more limited in functionality when it concerns more finite data manipulation, graphing and machine learning (though extensions do exist).  They make up for it by being able to store large quantities of data without relying extensively on memory (volatile and limited).  Generally, most software engineers and analysts use both a programming language like (R/Python) and SQL (to access data).

Data Camp:

Summary:

Data Camp breaks down Python/R courses on career tracks: Python Programmer, Data Analyst and Data Scientist.  Each track is composed of a number of courses: 10, 13 and 20 respectively.  Topics covered: basic programming, data manipulation techniques, graphing, statistics, machine learning, network analysis and ai (1 course).

Each course is composed of 3-5 segments.  Each segment has a set of lectures followed by exercises.  You can expect around 3 lectures and 10 exercises per segment.  Most courses build on themselves intuitively beginning with the basics and gradually building up in complexity.  I was surprised to find lectures on generators, closures and how they relate to data frames within the first 4 classes.

Lectures:

datacamp lecture

Lecture portion of the website is beautiful.  Presentations have nice transitions and the website background is not distracting.  Lecturers were clear and easy to follow.  You get a sense that al to of effort was put into the curriculum.  There were multiple lecturers in the 10 courses I took (around 5-6).  Some of these lecturers work for esteemed companies like Anaconda, published books on the topic they spoke on or had a background in software engineering/consulting.  Overall, the lectures were high quality with very few mistakes.

Practice Sets:

datacamp practice problems.

The practice problems were conducted in what looks like a modified ipython notebook embedded in the website.  They have 4 panels, the 2 on the left: exercise and instructions and 2 on the right: Scrapt.py and ipython Shell.

The exercise and instructions provide guidance on how to complete the exercise.  The Exercise section explains the topic.  The instructions tell you what steps need to be completed before submission is excepted.

Script.py is where you write your code.  The run code button let’s you execute script.py and see the output in the IPYTHON SHELL.  It’s pretty interactive.  When you submit answer, it checks if the solution matches the instruction section.  For the most part, there are few cases where code I submitted was marked wrong when it was in fact right.  That’s great!

What if you get stuck on a programming problem?  There is a button called hint.  This provides some extra guidance typically in the form of small chunks of code.  If you press the hint button, a new button called show solution will appear.  Clicking the show solution button will overwrite the SCRIPT.PY window with the correct solution, which you can then run and submit.

In rare occasions, you might end up with a multiple choice question.  They offer an interactive shell in this case.  It always proceeds the more free-form question variant.

Gamification:

Data Camp has one really great feature that I liked.  Each exercise has an amount of xp that you can collect.  Each lecture is worth 50xp, multiple choice is worth 50xp and problem sets are worth 100xp.  If you click the show hint button, the problem set xp is reduced to 70xp.  Show solution gives you 0xp.  When logging in, you will see the total xp you got that day as well as how many days in a row you have utilized data camp (streak). This is a great way to motivate you to do the exercises.

Another layer of gamification comes from certificates you can collect (and post on linkedin) as well as career tracks you can complete.

Cost/Summary:

I think data camp is very friendly to beginners interested in learning about data analysis in Python and R, both marketable skills.  I think the course layouts, lectures and practice problems are well thought out.  I would suggest that beginners also read books and online documentation on subjects like Pandas and Numpy.  I found data camp focused more on practicing skills and less on implementation details (which is a good thing).

Data Camp is currently on sale for $180/year (usually $300/year).  You can also buy it as a monthly subscription for $30/month.  Data Camp is similar to Udemy.  There are 3 advantages Data Camp has over Udemy:

  1. Practice problems make up a larger percent of the curriculum.
  2. $30/month you have access to any one of 100+ courses.  The Udemy equivalent is $15/course.  A Udemy course is equivalent to 3 Data Camp courses (in material)
  3. Data Camp specializes in data science and has really thought about how to naturally progress through the data science material.
  4. If you like structured programs, this is better.

Advantages of Udemy:

  1. Overall larger selection of content, variety and topics.  If you want to practice Python, but want to tackle multiple topics like: web programming, networking, penetration testing or a specific subset of machine learning.  It might be a better option.
  2. There is no subscription fee.  It’s fixed cost.  If you are not sure you want to invest a lot of time into python programming this is a better choice.
  3. Lectures are not as uniform.  You might find some lectures that are more theoretical.  Others are more practical.  That means you can try out different lecture styles to see what works.

Best,

Chris