Session: Challenges of Spark Application coexisting with NoSQL databases

Capital One is the first US bank to exit its on-premises data centers and move completely to the cloud. As part of modernizing our applications in Capital One Card Rewards, we built a custom transaction processing application from the ground up on open source technologies such as Apache Spark, MongoDB and Apache Cassandra. This application currently processes millions of customer transactions daily, awarding customers millions of miles, cash and points every day. While building it, we ran into many challenging issues in getting a Spark application to process data from MongoDB and Cassandra backends to serve customers. This talk focuses on a few of those issues, their impact, and how to mitigate them. Specifically, the talk will cover the following issues:

  • Why the Cassandra key sequence is important and how it affects querying
  • How Cassandra batching helps and works well with Spark partitions
  • The importance of Cassandra data modeling and its implications after MVP/deployment
  • How to manage Mongo connections (at the JVM level)
  • Implications of the partitioner used by the MongoSpark connector

We faced all of the issues highlighted above in our own application. This talk will focus on what these issues are in a Spark/Mongo/Cassandra application.

Session Speakers:

Gokul Prabagaren

Gokul Prabagaren is an Engineering Manager in the Rewards Org at Capital One, specializing in distributed computing. He has developed distributed cloud-native applications based on Spark, Cassandra and Mo…

Nagesh Kumar Vinnakota

Kumar Vinnakota, a software engineering manager at Capital One, is a highly skilled professional with expertise in architecting and delivering secure, customer-centric solutions that are highly sca…