Lessons learned running a data pipeline on AWS

  • Confoo
  • Montreal, Canada
  • March 2019

Before doing any data science, machine learning, or AI, you need to get your data right. As the volume of data grows, keeping a data pipeline reliable, available, and scalable becomes a challenge. In this talk we will share what we learned from running a data pipeline on AWS infrastructure using technologies such as Apache Spark, gRPC, and Protocol Buffers.

