GraphConnect 2018
Thursday, September 20 • 11:00am - 11:40am
Ingesting Data into Neo4j for Master Data Management

A graph database is a natural fit for Master Data Management use cases such as building a 360-degree view of the customer. Real-world entities such as customers, products and support tickets, and the relationships between them, can be modeled directly in the graph, allowing analysis and visualization of the combined data set.

Before any of this can happen, though, the data has to be gathered from a diverse set of sources, and ingested into the graph database. Source data may be located in flat files, in any of a wide variety of formats, in relational databases, in cloud-based platforms, or even on a message queue.
Extracting data from its source, transforming it into the required structure and format, and loading it into the graph database (the ETL process) can be a major undertaking. Writing custom scripts, or using traditional ETL tools, results in a brittle solution that breaks when data structures change or when new requirements, such as ingesting streaming data, arise.

In this session, you will learn how to build robust data pipelines to load batch and streaming data into Neo4j. We’ll look at the quirks of different data sources, and examine real-world use cases such as extracting customer data from the cloud and combining it with product data from an on-premises relational database. We’ll cover technologies such as Salesforce, Kafka, Amazon Web Services and Change Data Capture (CDC).

Pat Patterson

Technical Director, StreamSets
Pat Patterson has been working with Internet technologies for over two decades, building software and working with developer communities at Sun Microsystems, Salesforce and StreamSets. At Sun, Pat was best known as the community lead for the OpenSSO open source project; as a developer evangelis...

Thursday September 20, 2018 11:00am - 11:40am EDT
3. Julliard
  Session, Master Data Management