Today I learned

  • OAuth 2

    Until today, OAuth2 was always very opaque to me. Sure, I was using short-lived tokens, e.g. to make API requests, but I never tried to understand the mechanism itself. And actually, it’s not that complicated from a conceptual point of view. [Read More]
  • Data Transformation Design Pattern

    When transforming data, I have often used a pattern like the following: I create a class that is responsible for the different data transformation steps. Generally speaking, structuring data transformations using classes makes sense, as classes are a neat way to group related functions together. For another developer who has... [Read More]
  • Thrift vs Protocol Buffers vs Avro

    When systems exchange data, there are some obvious format choices like CSV, JSON or XML. Their advantages are human readability and wide adoption across different languages and frameworks. [Read More]
  • B-Trees

    B-trees are the most common way to implement indexes in almost all relational database systems. [Read More]
  • Spark Distributed Write Pattern

    When writing data with Spark, it’s easy to get confused because there are several ways to do so. Each one does something different under the hood, and whether that is good or bad depends on what you want. This illustration helps to understand better what is going on when... [Read More]
  • HiPlot

    Today I learned about a new Python module called HiPlot. It uses a simple yet powerful visualization technique to uncover patterns and relationships in a data set. Use cases include, for example, identifying the relevant features for an ML algorithm, obtaining decision rules for classification, or analyzing... [Read More]