Computer Science Notes

30 Nov 2021

GitHub flow for conducting research projects

The software industry has well-defined standards and procedures that rely heavily on tools such as GitLab. In research, however, we often work in a more relaxed, less structured way. At LABSIN we have recently begun applying software industry approaches to our daily work. The match is not perfect, since research differs from software development in some respects, but the benefits are clear. [9 min read]

19 Oct 2021

NO, Data Science is not just cleaning and transforming data!

Decent programming skills, strong math and stats knowledge, and amazing visuals are not enough for a data science position in industry. These are just the tools you need for your daily tasks, and you must not lose sight of the ultimate goal: "to provide valuable information to decision-makers" (Duh!). This is how you can make a difference, and companies know it. [5 min read]

12 Sep 2021

SHAP values with examples applied to a multiclass classification problem.

We cannot keep treating our models as black boxes. Remember, nobody trusts computers to make a very important decision (yet!). That's why the interpretation of machine learning models has become a major research topic. SHAP is a very robust approach for providing interpretability to any machine learning model. For multiclass classification problems, however, the documentation and examples are not very clear. [8 min read]

03 Aug 2021

How confident is Random Forest about its predictions?

Given a prediction on a particular example, how sure is Random Forest about it? To answer this question, it is necessary to look beyond the usual performance metrics and dive into the swampy waters of confidence interval estimation for statistical learning algorithms 😖. [6 min read] (updated 11/21/22)

29 Jun 2021

Deploying a simple ML model with Plumber 101

Sometimes notebooks are not enough and you will need to deploy your machine learning model into company infrastructure. The task involves a lot of software engineering knowledge, BUT with the Plumber package for R you can do the basics without much pain 😉. [6 min read]