GraphConnect 2018 has ended
Thursday, September 20 • 11:45am - 12:25pm
Harnessing the Power of Neo4j for Overhauling Legacy Systems at Adobe

At Behance (a leading online platform for showcasing and discovering creative work, and a division of Adobe), one of our most popular user-facing features has always been the activity feed. The activity feed allows users to follow their favorite creatives and curated galleries based on their preferences. When a user follows a creative, they receive alerts and an updated feed whenever that creative logs an activity within Behance, e.g. when they ‘appreciate’ a project, publish a project, or comment on a project. Users can also follow pre-curated galleries, selected by the Behance curation team, which highlight a creative theme, e.g. graphic design, photography, illustration.

While we’ve had this feature for a long time, the first two implementations (MongoDB, then Cassandra) plagued our tech stack and made the feature extremely difficult to maintain and scale out because of the requirements, frequency of access, and scale of the data itself. Behance has over 10 million registered users, which leads to millions of activity feed loads daily.

This talk will discuss how we planned and executed our transformation from a Cassandra-based architecture to a leaner, more robust, and more maintainable Neo4j-powered system. I will highlight our process and the design of the new system, and describe the situations in which a graph architecture is well suited to replacing old, bloated systems with something more straightforward and maintainable. I will also touch on indicators that companies and individual developers can look for to tell when a graph database might be a good replacement for an aging or poorly designed legacy system.
The specific goals for our system were the following:
  1. Ensure the new infrastructure significantly reduced human maintenance costs and required minimal effort and resources to keep running. Our ops lead for the activity feature, Ko, was very, very sad when it came to the work he had to do with our Cassandra cluster. It required constant babysitting and his sprints were frequently injected with random Cassandra-related stability and debugging tasks. We thought Neo4j Causal Clustering would be great for reliability and maximum uptime.
  2. Reduce the complexity of the system as a whole. I thought we could get this project done with far fewer than 48 nodes (the size of our Cassandra cluster).
  3. Significantly improve the performance of writes while keeping reads fast. Our Cassandra architecture used “fanouts” which meant that many of our writes were slow since we had to fan out each action a user took to all of the people who followed them (in order to create each activity item for each activity feed).
  4. Increase the flexibility of the system, in turn making it easier to add new features. We stopped improving our activity feed feature because our Cassandra implementation and schema were so rigid that they made it almost impossible to build on.
  5. Reduce data storage size. I thought we could avoid most of the repeating data we needed to store in Cassandra since we could generate users’ feeds dynamically instead of storing each feed item for every single user.
Those are the goals we set out before we began the project, and I think we accomplished them all. I’ll explain some of the challenges we faced during the process of building the new system, including how we dark tested all of the functionality, came up with an effective disaster recovery/backup process, introduced Neo4j into Adobe as an enterprise solution, and more.
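As a rough illustration of goals 3 and 5, the Python sketch below contrasts a fan-out-on-write feed (every action is copied into each follower's stored feed) with a feed generated at read time from follow relationships. It is not Behance's code: the class names, the `Activity` fields, and the sample users are all hypothetical, and in-memory dictionaries stand in for both Cassandra and Neo4j.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    actor: str      # user who performed the action (hypothetical field names)
    verb: str       # e.g. "appreciate", "publish", "comment"
    target: str     # project the action applies to
    timestamp: int  # larger means more recent


class FanoutFeedStore:
    """Fan-out on write: one stored copy of each activity per follower."""

    def __init__(self, followers):
        self.followers = followers  # actor -> set of users following them
        self.feeds = {}             # user -> list of stored feed items

    def record(self, activity):
        # Write cost grows with the actor's follower count.
        for follower in self.followers.get(activity.actor, ()):
            self.feeds.setdefault(follower, []).append(activity)

    def feed(self, user, limit=10):
        items = self.feeds.get(user, [])
        return sorted(items, key=lambda a: a.timestamp, reverse=True)[:limit]

    def stored_items(self):
        return sum(len(items) for items in self.feeds.values())


class ReadTimeFeedStore:
    """Store each activity once; walk follow relationships when reading."""

    def __init__(self, following):
        self.following = following  # user -> set of actors they follow
        self.activities = {}        # actor -> list of that actor's activities

    def record(self, activity):
        # Write cost is constant, regardless of follower count.
        self.activities.setdefault(activity.actor, []).append(activity)

    def feed(self, user, limit=10):
        items = [
            activity
            for actor in self.following.get(user, ())
            for activity in self.activities.get(actor, [])
        ]
        return sorted(items, key=lambda a: a.timestamp, reverse=True)[:limit]

    def stored_items(self):
        return sum(len(items) for items in self.activities.values())


# Hypothetical sample data: three users follow "alice".
followers = {"alice": {"bob", "carol", "dave"}}
following = {"bob": {"alice"}, "carol": {"alice"}, "dave": {"alice"}}

fanout = FanoutFeedStore(followers)
read_time = ReadTimeFeedStore(following)

for activity in (
    Activity("alice", "appreciate", "project-1", timestamp=1),
    Activity("alice", "publish", "project-2", timestamp=2),
):
    fanout.record(activity)
    read_time.record(activity)

print(fanout.stored_items(), read_time.stored_items())
```

Both stores return the same feed for each follower, but the fan-out store keeps one copy of every activity per follower while the read-time store keeps one copy in total. The trade-off is that the read-time version does more work per feed load, which is the kind of relationship traversal a graph database is designed for.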
Here are some of the wins we’ve seen already from the project:
  • Our Neo4j activity implementation has led to a great decrease in complexity, storage, and infrastructure costs. Our full dataset size is now around 40 GB, down from 50 TB of data that we had stored in Cassandra. We’re able to power our entire activity feed infrastructure using a cluster of 3 Neo4j instances, down from 48 Cassandra instances of equal specs. That has also led to reduced infrastructure costs. Most importantly, it’s been a breeze for our operations staff to manage since the architecture is simple and lean. Our ops guy Ko has finally gotten his life back!
  • We’ve seen some significant performance improvements with the new implementation. We’ve been able to cut the time from sign-up to initial activity experience from 1.4 seconds to 400 milliseconds on average. In addition to the speed improvement, users’ initial curated feed is now much more complete.
  • We’ve seen the most significant improvements in writing activity data. For example, one of our write job processes (when a project was featured in a curated gallery) used to take 12 minutes on average to run and consumed significant application resources. Now, on average, that write operation takes 106 milliseconds.
I think results like these are easily obtainable when the graph database format is used to optimize the way related data is stored. The benefits of the graph format make it possible to overhaul legacy systems for improved maintainability, simplicity, and ability to constantly innovate on features and products.
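To put the figures quoted above on a common scale, the short Python sketch below converts each before/after pair into a reduction factor. The numbers are the ones stated in this abstract; the helper name `reduction_factor` is made up for illustration, and the storage comparison assumes decimal units (1 TB = 1000 GB), which the abstract does not specify.

```python
def reduction_factor(before, after):
    """Return how many times smaller `after` is than `before` (same units)."""
    return before / after


# Storage: 50 TB down to about 40 GB (assuming 1 TB = 1000 GB).
storage = reduction_factor(50 * 1000, 40)

# Cluster size: 48 Cassandra instances down to 3 Neo4j instances.
instances = reduction_factor(48, 3)

# Sign-up to initial activity experience: 1.4 s down to 400 ms.
first_feed = reduction_factor(1400, 400)

# Featured-gallery write job: 12 minutes down to 106 ms.
gallery_write = reduction_factor(12 * 60 * 1000, 106)

print(f"storage: {storage:.0f}x")
print(f"instances: {instances:.0f}x")
print(f"first feed: {first_feed:.1f}x")
print(f"gallery write: {gallery_write:.0f}x")
```

Under those assumptions the storage footprint shrank by roughly three orders of magnitude, and the featured-gallery write by nearly four.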

Speakers

David Fox

Software Engineer, Adobe
Application developer/data engineer with 10 years of experience developing high-performance backend systems and working with a large variety of databases alongside massive datasets. Software engineer at Adobe.


Thursday September 20, 2018 11:45am - 12:25pm EDT
1. Westside Ballroom - North
  Session