Data Science and Business Analytics - Graph Isomorphisms in GraphX & GraphFrames; and NLP Walkabout

  • Oracle, 500 Eldorado Boulevard, Broomfield, CO 80021, United States

IMPORTANT: If your Meetup.com name is not your real first AND LAST NAME, please e-mail that info to [masked]

Security at Oracle requires us to provide a list of attendees in advance. Also bring a GOVERNMENT PHOTO ID (driver's license, passport, etc.). You may be turned away if they do not have your full name in advance or if you don't bring an ID.

RSVPs will close at 1pm on Monday, May 9, 2016, so that I can provide the attendee list to security at Oracle.

On the evening of the event, just come to building 1 and check in with security, then proceed down the hall to the "conference center". We'll be in the nice 75-person conference/training room again this time.


6:00pm Pizza and networking
6:30pm Announcements
6:40pm Finding Graph Isomorphisms in GraphX and GraphFrames, by Michael Malak
7:15pm NLP Walkabout, by Rob Oberbreckling
8:00pm Adjourn

This event is being held in cooperation with two other Meetup groups -- you only need to RSVP to one of them:

• Graph Nerds of Boulder

• Boulder/Denver Spark


Finding Graph Isomorphisms in GraphX and GraphFrames - Abstract

This is a dry run of my upcoming presentation at Spark Summit (https://spark-summit.org/2016/events/finding-graph-isomorphisms-in-graphx-and-graphframes/). It will not be live-streamed.

Identifying graph isomorphisms is one of the most powerful graph techniques and has a wide variety of applications. In this presentation, you’ll see how to find simple graph isomorphisms in GraphX, and how the new GraphFrames from AMPLab (intended for inclusion in Spark 2.x) allows the use of SQL and a subset of Cypher (the query language from Neo4j) to find more complex graph isomorphisms. Applications covered include fraud detection and finding missing data from Wikipedia (using the YAGO3 data set), which is a form of graph mining. Because GraphFrames is so new, the talk also gives a brief overview of it, its performance advantage over GraphX due to Catalyst and Tungsten, and how to use it to query graphs using SQL and the Cypher subset.
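
To give a feel for the kind of pattern matching described above, here is a minimal sketch in plain Python. It is not the talk's code and does not use the GraphX or GraphFrames APIs; the toy edge set and the function name `find_open_paths` are made up for illustration. It looks for every place where a fixed pattern occurs in a graph: a path a -> b -> c with no direct edge a -> c, which is one simple way to flag a possibly missing relationship. GraphFrames expresses comparable patterns as motif strings, along the lines of `(a)-[]->(b); (b)-[]->(c); !(a)-[]->(c)`.

```python
from itertools import product

# Hypothetical toy graph: directed edges as (source, destination) pairs.
edges = {
    ("alice", "bob"),
    ("bob", "carol"),
    ("alice", "carol"),
    ("carol", "dave"),
}


def find_open_paths(edges):
    """Return sorted (a, b, c) triples where a->b and b->c exist but a->c does not."""
    matches = []
    # Try every ordered pair of edges and keep those that chain through b.
    for (a, b), (b2, c) in product(edges, repeat=2):
        if b == b2 and a != c and (a, c) not in edges:
            matches.append((a, b, c))
    return sorted(matches)


matches = find_open_paths(edges)
print(matches)
```

This brute-force version compares every pair of edges, so it only suits tiny graphs. Engines such as GraphFrames run this kind of match as distributed joins instead.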

Michael Malak - Bio

Michael Malak is the lead author of Spark GraphX in Action and has been developing Spark solutions at two Fortune 200 companies since early 2013. He has been programming computers since before they could be bought pre-assembled in stores.


NLP Walkabout - Abstract

(Live stream available)

Natural Language Processing (NLP) is a broad topic encompassing many different problems requiring very different areas of expertise. And in some associated domains, even the initial problems are difficult to define. This talk will survey a few selected NLP topics at a high (and hopefully entertaining) level. This is not a rigorous how-to, but rather an informal walkabout to appreciate some of the different challenges within the field of NLP. 

Rob Oberbreckling - Bio

Rob Oberbreckling’s interests include applying software development and data science to problems in cognitive science, natural language, and social science. Rob has also led efforts in retail embedded IoT sensor development, digital video and audio stream analysis for the NFL, and spoken and written communications analysis for automated essay grading and for predicting human behavior. Rob is currently a Consulting Member of Technical Staff at Oracle.