Before doing any data science, machine learning or AI, you need to get your data right. As the volume of data grows, keeping a data pipeline reliable, available and scalable becomes a challenge. In this talk we will share what we learned from running a data pipeline on AWS infrastructure using technologies such as Apache Spark, gRPC and Protocol Buffers.