Dimensional Reduction: Lifting the curse

Tutorial Leader

Tapan Shah

Research Scientist/Engineer at GE Global Research

Collecting massive high-dimensional data brings great opportunities, along with the challenge of sieving out what is irrelevant. Dimensional reduction techniques provide the mathematical tools to extract the key variables that accurately model the “underlying” phenomena.

This tutorial covers the principles of dimensional reduction techniques, from the simplest linear projections to complex manifold methods. The session will include hands-on, real-world examples.

Prerequisite: Basic matrix theory and probability
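
As a flavour of the simplest linear projections mentioned in the abstract, the sketch below finds the first principal component of a small synthetic data set and projects each point onto it. It is only an illustration and not material from the tutorial itself: the data, the function names (first_principal_component, project) and the use of power iteration are all assumptions chosen to keep the example free of external libraries.

```python
import math
import random


def first_principal_component(rows, iterations=200):
    """Return (mean, unit direction) of the top principal component."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly apply the covariance matrix and renormalise.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def project(rows, mean, direction):
    """Reduce each row to one coordinate along the given direction."""
    return [sum((r[j] - mean[j]) * direction[j] for j in range(len(mean)))
            for r in rows]


# Hypothetical data: 3-D points scattered along the line t * (1, 2, 2),
# with a little noise added to every coordinate.
random.seed(0)
data = []
for _ in range(200):
    t = random.gauss(0, 1)
    data.append([1 * t + random.gauss(0, 0.05),
                 2 * t + random.gauss(0, 0.05),
                 2 * t + random.gauss(0, 0.05)])

mean, direction = first_principal_component(data)
scores = project(data, mean, direction)  # 200 values instead of 200 x 3
```

Here three correlated variables collapse to a single score per point with little loss, because almost all of the variance lies along one direction. Manifold methods address the case where the underlying structure is curved and no single linear direction captures it.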

Real Time Analytics of Streaming Big Data

Tutorial Leader

Krishna Kumar

Big Data Architect, Pointillist

Data analysis and data-driven applications are becoming increasingly important, and the long query times associated with batch frameworks such as Hadoop are becoming less and less acceptable.

The tutorial presents solutions for real-time analytics of streaming Big Data at sub-second latency, covering techniques for high-scale data ingestion, processing and identity stitching. It is based on frameworks such as Apache Kafka, Apache Samza, Infinispan, the Undertow web server and Druid, and includes a working demo connecting these technologies.