CS Notes is a simple blog to keep track of CS-related stuff I consider useful.

26 Jul 2020

Tools of the Week.

by Harpo Maxx

This week during my usual social network journey, I ran into these interesting tools:

  1. A Medium article about UMAP, an algorithm for dimensionality reduction similar to PCA, but supporting non-linear relations. Compared with t-SNE, UMAP is reported to better preserve the global structure of the data. Packages are available for Python and R. Link to the original research paper here.

  2. CatBoost, a gradient-boosted decision tree algorithm (similar to XGBoost) that claims good performance with default parameters. I think it could be useful for a first attempt when dealing with tabular data. My current go-to approach is the good ol’ Random Forest (Breiman, 2001). Link to the research papers here.

  3. Another Medium article about SHAP values, a method for dealing with the black-box nature of many machine learning algorithms by analyzing the contribution of each feature to a prediction, for example in classification problems. A link to an article with a general description of the method here.
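
To make the "contribution of each feature" idea a bit more concrete, here is a minimal sketch of exact Shapley values, the game-theoretic quantity that SHAP builds on. It uses only the Python standard library and is not the API of the shap package. The toy model, the all-zeros baseline, and the names `shapley_values` and `toy_model` are my own assumptions for illustration; real implementations rely on approximations, because this brute-force version enumerates every feature subset.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one input x, relative to a baseline input.

    Features that are "absent" from a coalition are replaced by their
    baseline value (an assumption of this sketch).
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_model(features):
    """Hypothetical black box: a linear part plus an a*c interaction."""
    a, b, c = features
    return 2.0 * a + 1.0 * b + 3.0 * a * c


x = [1.0, 2.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)

for name, value in zip("abc", phi):
    print(f"contribution of {name}: {value:.2f}")
```

The contributions add up to the difference between the prediction for `x` and the prediction for the baseline, and the `a*c` interaction term is split evenly between `a` and `c`.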