Tools of the Week.
While browsing my usual social networks this week, I ran into these interesting tools:
- A Medium article about UMAP, a dimensionality reduction algorithm similar to PCA but able to capture non-linear relationships. Unlike t-SNE, UMAP is said to preserve more of the global structure of the data. Packages are available for Python and R. Link to the original research paper here.
- CatBoost, a gradient-boosted decision tree algorithm (similar to XGBoost) that claims good performance with default parameters. I think it could be useful as a first attempt when dealing with tabular data. My current go-to approach is the good ol’ Random Forest (Breiman, 2001). Link to the research papers here.
- Another Medium article about SHAP values, a method for dealing with the black-box nature of many machine learning algorithms by analyzing how much each feature contributes to a model's predictions, for example in classification problems. A link to an article with a general description of the method here.
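
To make the idea behind SHAP values a bit more concrete, here is a minimal sketch of the exact Shapley value computation that the method builds on. It does not use the SHAP package: the function `shapley_values`, the model `toy_model`, and the all-zeros baseline are hypothetical names and choices made up for illustration, and treating a feature as "absent" by replacing it with a single baseline value is a simplifying assumption. The brute-force loop over every feature ordering is only feasible for a handful of features.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction.

    Averages each feature's marginal contribution over every possible
    ordering in which features are switched from baseline to real value.
    """
    n = len(x)
    contributions = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)       # start with every feature "absent"
        previous = predict(current)
        for i in order:
            current[i] = x[i]          # switch feature i to its real value
            value = predict(current)
            contributions[i] += value - previous
            previous = value
    return [c / factorial(n) for c in contributions]


# Hypothetical toy model: a linear part plus one interaction term (a * c).
def toy_model(features):
    a, b, c = features
    return 2.0 * a + 1.0 * b + 3.0 * a * c


x = [1.0, 2.0, 1.0]            # the instance we want to explain
baseline = [0.0, 0.0, 0.0]     # assumed reference point ("feature absent")

phi = shapley_values(toy_model, x, baseline)
print(phi)

# The contributions add up to the gap between the prediction and the baseline.
print(sum(phi), toy_model(x) - toy_model(baseline))
```

In this toy case the interaction term is split evenly between the two features involved in it, and the per-feature contributions sum to the difference between the prediction for `x` and the prediction for the baseline, which is the property that makes this kind of explanation easy to read.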