Learning Python: The NDSC Edition
Ages 18-21

Part 2 - Intro to Data Science

Prerequisites: Part 1 - Python Intensive. If you have prior coding experience, we may be able to waive this requirement. Please contact us to find out more.
Graduates of this course gain the skills and experience needed to take part in the National Data Science Challenge Competition organised by Shopee. Please note that you have to register for this competition separately on the competition website by 9 February 2019.

Data Science involves using computers to extract insights from data, enabling us to make better decisions in every imaginable situation. When done right, data science is the foundation for many of the cutting-edge technologies emerging around us today, like self-driving cars, chatbots, robo-trading systems and world-champion-beating chess AIs.

As a discipline, Data Science lies at the intersection of Computer Science and Statistics. For this coding course, we will assume that you have a mathematical/statistical background equivalent to having taken one of the following:

  • the H2 Mathematics subject at the GCE "A" Levels
  • the Mathematics HL subject for the International Baccalaureate (IB)
  • a course in basic statistics at one of the local Polytechnics. Examples of courses which satisfy this requirement include:

If you do not have this background, you may want to consider our 1-day Computational Statistics for Machine Learning course where we will cover this material (please contact us to find out more).

Our offering features small class sizes with an instructor-to-student ratio of 1:5. Instructors hold Master's and PhD degrees in engineering and statistics from world-class universities including Imperial College, Stanford and UC Berkeley, and have worked in quantitative, data-driven roles at top research institutions and Fortune 500 companies. Instructors are full-time at SG Code Campus and have extensive experience teaching coding in a small classroom setting - each individually taught coding to more than 100 unique Secondary School and Pre-U students in 2018, delivering a range of theoretical and applied courses in C++, Java, JavaScript & Python.

The course targets a Pre-U audience - if you are currently a university student or a working professional, you might find some of the training programmes offered by the other partners listed on the Shopee NDSC website more appropriate.


  • Introduction to data science, data analysis and machine learning
  • Introduction to the different categories of data
  • Learn how to process data using NumPy - a scientific computation library in Python
  • Learn the summary statistics used to aggregate data and build intuition about it
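
As a rough illustration of the summary statistics mentioned above, the sketch below computes a few of them for a small made-up list of exam scores. It uses Python's standard-library statistics module rather than NumPy so that it runs without extra installs; the data and the names scores and summary are assumptions for illustration only, not course material.

```python
import statistics

# Hypothetical sample: exam scores for nine students (made-up data).
scores = [52, 61, 61, 68, 70, 74, 79, 85, 98]

# Each entry condenses the whole list into a single number.
summary = {
    "count": len(scores),
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    "stdev": statistics.stdev(scores),  # sample standard deviation
    "range": max(scores) - min(scores),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Comparing the mean with the median is a quick way to spot skew: here the single high score of 98 pulls the mean slightly above the median.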

  • Introduction to the Pandas library in Python - a set of tools for fast data transformation
  • Learn techniques to clean and process raw data into a form that is suitable for data analysis
  • Introduction to Matplotlib and Seaborn, a set of plotting tools to convert data into a more visual, intuitive form
  • Learn about the different graphs and plots to convey information
  • Learn to discern which visualization tools are appropriate in which contexts

  • Introduction to statistical modeling with Linear and Logistic Regression
  • Explore use-cases with linear/logistic regression
  • Evaluation of prediction results with confidence intervals
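
To give a flavour of linear regression, here is a minimal sketch of fitting a straight line by ordinary least squares using only plain Python. The function name fit_line and the hours/marks data are hypothetical and chosen purely for illustration; the sketch covers only the fitting step, not logistic regression or confidence intervals.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations in x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: hours studied vs. marks obtained (made-up numbers).
hours = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 64, 68]

intercept, slope = fit_line(hours, marks)
predicted = intercept + slope * 6  # predicted marks for 6 hours of study
print(f"marks ~ {intercept:.1f} + {slope:.1f} * hours")
print(f"prediction for 6 hours: {predicted:.1f}")
```

The slope is read as the expected change in marks per extra hour of study, under the assumption that a straight line is a reasonable model for the data.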

  • the bias-variance tradeoff
  • parameter tuning with a simple holdout set
  • tuning with the n-fold cross-validation technique
  • Explore different metrics for tuning parameters in varying contexts: regression vs. classification problems
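
The idea behind n-fold cross-validation can be sketched in a few lines: split the sample indices into n folds, and let each fold take a turn as the validation set while the rest is used for training. The helper name k_fold_indices is hypothetical, and the sketch deliberately skips shuffling and model fitting to keep the splitting logic visible.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_indices, validation_indices) once per fold."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


# Ten samples split into three folds (sizes 4, 3 and 3).
folds = list(k_fold_indices(10, 3))
for train, validation in folds:
    print("train:", train, "validation:", validation)
```

A simple holdout set corresponds to using just one of these splits. Averaging a metric over all folds gives a steadier estimate at the cost of fitting the model n times; in practice the indices are usually shuffled before splitting.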

  • alternative models for classification and regression: k-nearest neighbours, regularised Linear Models, CART, Random Forests, Boosted Decision Trees, Naive Bayes, Support Vector Machines, Splines
  • model selection using a train-test split

  • Explore real life machine learning and data science examples using Kaggle

Lead Instructors
Cheng Wei
M.Sc. Statistics, Imperial College London
B.Sc. (First Class Hon) Mathematics, Imperial College London

Cheng Wei is an avid Machine Learning practitioner - as a former Data Scientist at PwC, he applied his magic to building smart applications in areas as diverse as healthcare, language recognition and tax efficiency. Armed with serious Python, R, and SQL skills forged in academia and industry, as well as a penchant for geeking out at every opportune moment, he strives to teach and inspire the next generation of budding Machine Learning experts at SG Code Campus!

M.Sc. Management Science & Engineering, Stanford
B.A. Mathematics & Economics, UC-Berkeley

Ian's love for programming began when he dived into the study of Optimization Algorithms and Machine Learning as a graduate student at Stanford. As a quant risk manager at Barclays, Ian got to further sharpen his craft programming risk engines and coding up financial models in Python, R, VB.NET and the NAG numerical library (FORTRAN). Since starting SG Code Campus, Ian has gone on to pick up JavaScript, Ruby and Clojure while building the first few iterations of this website.

Supporting Instructors
Ph.D. Bioengineering, UC Berkeley - UC San Francisco
B.S. Bioengineering, UC Berkeley

Germaine is SG Code Campus’ resident Doctor and chief brainiac - after graduating with both her undergraduate degree and doctorate in Bioengineering from Berkeley, Germaine put her considerable talents and intellect to work in scientific research at A*STAR, where her work in bioinformatics, biostatistics, data analysis and robotics saw her put the coding and engineering skills she gained in school to heavy practical use. She believes that the key to engineering solutions to the world’s most complex problems is a multi-faceted approach where hard engineering skills are paired with an awareness of and empathy for the people whose lives we seek to impact. It is this ethos that Germaine seeks to instil in each and every student that passes through our doors.

B.Sc. (Hon) Mathematics & Statistics, University of Melbourne

Siu's passion for coding has its humble beginnings in high school, when she taught herself the basics of Java to make simple text-based games just to amuse her friends. Hungry for more, she pursued her interest in Computer Science throughout her degree, and is still fascinated by how even simple algorithms can solve seemingly complex problems.
Siu has a penchant for breaking problems down into simpler, solvable parts, which explains her love of math and cryptic crosswords, as well as her decision to specialise in Operations Research - a discipline that deals with the use of advanced analytical methods to help organisations make better decisions. An adventurer at heart, Siu enjoys learning by exploring (she is currently teaching herself how to play the piano!) and especially enjoys picking up new skills that allow her to create novel things. Although snakes are her favourite animal (she finds it funny how they're so smooth), she tends to favour Java over Python as a programming language, because she is first and foremost a coffee lover.
In joining the Code Campus team, Siu hopes to encourage more young people to develop greater confidence with technology, as well as the same sense of playful experimentation that drives her.

This class was last held in January - March 2019.
We will next be holding this class in Q2 2019.


Frequently Asked Questions