I spent the last two days at the Fifth Elephant conference on Big Data analytics. Here is a quick summary of my notes from the conference.
The general notion is that Bangalore is 10 years behind Silicon Valley, but actually we aren’t doing badly at all! Thanks to Indian Internet companies like Inmobi and Flipkart for being brave enough to invest in technology and also reap the benefits of doing so. A couple of life sciences companies, Strand and Map My Genome, won my admiration and left my colleague and me hoping to do some work in the life sciences space. Apart from them, there were representatives from Cisco, Helpshift, Aerospike, Indix, GE, Target, AmEx and Myntra. Aerospike was quite impressive and looks to have a sound tech and marketing team. It was also nice to see Nexus Venture Partners put up a booth with a bunch of startups they had funded.
Big Data analytics is mainly being applied in the following areas:
- Fast lookups and low latency for e-commerce and bidding websites.
- Building recommendation engines using machine learning.
- Clickstream analytics and user behaviour analysis.
- Life sciences – my thumbs up for this!
Firms have started moving towards in-memory systems for their real-time use cases.
A lot of interest and work is happening in the machine learning and predictive analytics space – I was embarrassed by how little I know about this space 😦
Scalability and operational challenges were highlighted more than usability.
A person from tech leadership pointed out that the big data analytics space is messy, with too many solutions, and that they would prefer some sort of standard that would work well for a longer time. I could not agree more!
The Internet of Things is gaining traction – GE mentioned investing significantly in IoT.
The tech community seemed proud of mixing and matching numerous technologies to come up with their tech stacks.
The most popular kind of question in the Q&A was “this” vs “that”. That is the problem with too many choices!
The popular tools in the Big Data analytics stack:
HBase – I was surprised to see its wide adoption. Flipkart uses HBase for its catalog and also for storing the events that trigger its notifications.
Redis – A lot of companies are using Redis for fast storage.
Kafka – Emerged as the undisputed leader for pub-sub at scale.
Elasticsearch & Solr – Almost everyone knew ES, and used it as well.
HDFS – Of course!
MongoDB – The preferred NoSQL DB.
Mahout – Used by quite a few data scientists.
Some months back I came across the question of whether the hype around Big Data will go the way of CRM and die off in a while. Definitely NO!