#microtip – Center a div horizontally and vertically

A small tip that can be handy, say, when you are creating an HTML landing page. The goal is to position a div in the centre of the browser window. There are many misleading and non-working examples when you google this, so here is a simple solution that works.

HTML

<div class="outer-container">
  <div class="inner-container">
    <center>
      <h3>Hello Center!</h3>
    </center>
  </div>
</div>

CSS

.outer-container {
  position: fixed;
  top: 0;
  left: 0;
  width: 100%;
  height: 100%;
}

.inner-container {
  position: relative;
  top: 40%;
}

JsFiddle: http://jsfiddle.net/31L9jdsz/


CORS for a Play Framework 2.3.x Java app

CORS (Cross-Origin Resource Sharing) allows a browser to access APIs on a different domain. CORS is better than JSONP as it works for HTTP POST, PUT, DELETE and so on. Using CORS is also simpler, as no special setup is required in the UI (jQuery) layer.

Configuring CORS for a Play 2.3 Java app is different from older versions. The following needs to be done:

1. All API responses from the server should contain the header “Access-Control-Allow-Origin: *”. We need to write a wrapper around all action responses.

2. Requests like POST and PUT trigger a preflight request to the server before the main request. The response to these preflight requests should contain the headers below:

Access-Control-Allow-Origin: *
Allow: *
Access-Control-Allow-Methods: POST, GET, PUT, DELETE, OPTIONS
Access-Control-Allow-Headers: Origin, X-Requested-With, Content-Type, Accept, Referer, User-Agent

To achieve #1, do the following in Play: if you don’t already have a Global.java, create one in your default package.

import play.*;
import play.libs.F.Promise;
import play.mvc.Action;
import play.mvc.Http;
import play.mvc.Result;

public class Global extends GlobalSettings {

    // For CORS: wrap every action so its response carries the CORS header.
    private class ActionWrapper extends Action.Simple {
        public ActionWrapper(Action<?> action) {
            this.delegate = action;
        }

        @Override
        public Promise<Result> call(Http.Context ctx) throws java.lang.Throwable {
            Promise<Result> result = this.delegate.call(ctx);
            Http.Response response = ctx.response();
            response.setHeader("Access-Control-Allow-Origin", "*");
            return result;
        }
    }

    @Override
    public Action<?> onRequest(Http.Request request,
            java.lang.reflect.Method actionMethod) {
        return new ActionWrapper(super.onRequest(request, actionMethod));
    }

}

For #2, first make an entry in the routes file. A preflight request uses the HTTP OPTIONS method, so make an entry like the one below.

OPTIONS /*all controllers.Application.preflight(all)

Next, define the preflight method in your Application controller, and CORS is all set up!

package controllers;

import play.mvc.*;

public class Application extends Controller {

    public static Result preflight(String all) {
        response().setHeader("Access-Control-Allow-Origin", "*");
        response().setHeader("Allow", "*");
        response().setHeader("Access-Control-Allow-Methods", "POST, GET, PUT, DELETE, OPTIONS");
        response().setHeader("Access-Control-Allow-Headers", "Origin, X-Requested-With, Content-Type, Accept, Referer, User-Agent");
        return ok();
    }

}
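
Once both pieces are in place, it helps to verify that the preflight response actually carries the headers. Below is a minimal sketch (not part of the original setup) that fires an OPTIONS request at a hypothetical endpoint on a locally running Play app and prints the CORS headers it receives; adjust the URL to one of your own routes.

import java.net.HttpURLConnection;
import java.net.URL;

public class PreflightCheck {

    public static void main(String[] args) throws Exception {
        // Hypothetical endpoint on a locally running Play app; use one of your own routes.
        URL url = new URL("http://localhost:9000/api/items");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("OPTIONS");

        System.out.println("Status: " + conn.getResponseCode());
        System.out.println("Access-Control-Allow-Origin: "
                + conn.getHeaderField("Access-Control-Allow-Origin"));
        System.out.println("Access-Control-Allow-Methods: "
                + conn.getHeaderField("Access-Control-Allow-Methods"));
        System.out.println("Access-Control-Allow-Headers: "
                + conn.getHeaderField("Access-Control-Allow-Headers"));
    }

}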

Back to learning Grammar with ANTLR

This post is about language processing. Language processing could be anything from an arithmetic expression evaluator to a SQL parser, or even a compiler or interpreter. Often when we build user-facing products, we give users a new language to interact with the product. If you have used JIRA for project management, it gives you the Jira Query Language. Google also has a search language, documented here – https://support.google.com/websearch/answer/136861?hl=en. Splunk has its own language called SPL. How to build such a system is what we will see in this post.

A test use case

I always believe that to learn something we need a problem to solve that can serve as a use case. Let’s say I want to come up with a new language that’s simpler than SQL, and I want the user to be able to key in the text below:

Abishek AND (country=India OR city=NY) LOGIN 404 | show name city

This should fetch the name and city fields from a table where the text matches “Abishek”, and Abishek could either be in some city in India or have gone to New York. We also want to filter for results that contain the text LOGIN and 404, as we are trying to trace what happened when Abishek tried to log in but hit some error codes. Assuming the data is in a database, what we need is a language parser to understand the input and a translator that converts it to SQL so we can run the query on the DB.

What is ANTLR?

From antlr.org: ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files. ANTLR can help solve our use case pretty quickly. There are a few similar tools such as JavaCC, but I found ANTLR to be the best documented and leading project in this space.

The first step: Grammar

When we want a parser, an approach many take is to write the parser from scratch. I remember doing so in an interview where I was asked to write an arithmetic expression evaluator. Though this approach works, it’s not the best choice when you have complex operators, keywords and many choices. Choices are an interesting thing – if you know Scala you will realise that 5 + 3 is the same as 5.+(3). Usually there is more than one way to express things; in our example we could either say “LOGIN AND 404” or just “LOGIN 404”. Writing a grammar involves identifying these choices, sequences and tokens.

ANTLR uses a variant of the popular LL(*) parsing technique (http://en.wikipedia.org/wiki/LL_parser), which takes a top-down approach. So we define the grammar top down – first look at what the input is: say a file can contain a set of statements, and statements can be classified into different statement types based on identifying patterns and tokens. Statements can then be broken down into different types of expressions, and expressions can contain operators and operands.

Following this approach, a quick grammar I came up with for our use case looks like this:

grammar Simpleql;

statement : expr command* ; 
expr : expr ('AND' | 'OR' | 'NOT') expr # expopexp
 | expr expr # expexp
 | predicate # predicexpr
 | text # textexpr
 | '(' expr ')' # exprgroup
 ;
predicate : text ('=' | '!=' | '>=' | '<=' | '>' | '<') text ; 
command : '| show' text* # showcmd
 | '| show' text (',' text)* # showcsv
 ;
text : NUMBER # numbertxt 
 | QTEXT # quotedtxt
 | UQTEXT # unquotedtxt
 ;

AND : 'AND' ;
OR : 'OR' ;
NOT : 'NOT' ;
EQUALS : '=' ;
NOTEQUALS : '!=' ;
GREQUALS : '>=' ;
LSEQUALS : '<=' ;
GREATERTHAN : '>' ;
LESSTHAN : '<' ;

NUMBER : DIGIT+
 | DIGIT+ '.' DIGIT+
 | '.' DIGIT+
 ;
QTEXT : '"' (ESC|.)*? '"' ;
UQTEXT : ~[ ()=,<>!\r\n]+ ;

fragment
DIGIT : [0-9] ;
fragment
ESC : '\\"' | '\\\\' ; 

WS : [ \t\r\n]+ -> skip ;

Going by the top-down approach:

  • In my case, the input is a statement.
  • A statement comprises an expression part and a command part.
  • An expression has multiple patterns – it can be two expressions connected by an operator.
  • It can also be two expressions with no explicit operator between them.
  • An expression can be a predicate – a predicate has the pattern <text> <operator> <text>.
  • An expression can be plain text, e.g. when we just want a full-text search on “LOGIN”.
  • An expression can be another expression inside brackets, for grouping.
  • A command starts with a pipe, then a command name like “show”, followed by its arguments.

Creating the Lexer, Parser and Listener

With ANTLR, once you come up with the grammar you are close to done! ANTLR generates the lexer, parser and listener code for us. The lexer breaks our input into tokens, and we usually don’t deal with it directly. What we will use is the parser – the parser gives us a parse tree like the one shown below.

Parsed Tree

ANTLR also gives you a tree walker that can traverse the tree, and a base listener with methods that get called as the walker navigates it. All I had to do to implement the translator was extend the base listener, override the methods for the nodes I am interested in, and use a stack to push the translations at each node. And that’s all – my translator was ready pretty fast. I am not going to cover ANTLR setup and running here, because that’s quite clear in their documentation, but feel free to reach out to me in case of any clarifications!
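
To make that concrete, here is a minimal sketch of such a listener-based translator, assuming ANTLR has already generated SimpleqlLexer, SimpleqlParser and SimpleqlBaseListener from the grammar above. The fragment pushed for a predicate is only illustrative, not the full SQL translation.

import java.util.ArrayDeque;
import java.util.Deque;

import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.tree.ParseTree;
import org.antlr.v4.runtime.tree.ParseTreeWalker;

public class SqlTranslator extends SimpleqlBaseListener {

    // Partial translations are pushed here as the walker exits each node.
    private final Deque<String> stack = new ArrayDeque<>();

    @Override
    public void exitPredicexpr(SimpleqlParser.PredicexprContext ctx) {
        // A predicate such as country=India maps almost directly to a SQL condition.
        stack.push(ctx.getText());
    }

    // ... override the other exit* methods (exitExpopexp, exitShowcmd, ...) similarly ...

    public static void main(String[] args) {
        String input = "Abishek AND (country=India OR city=NY) LOGIN 404 | show name city";
        SimpleqlLexer lexer = new SimpleqlLexer(new ANTLRInputStream(input));
        SimpleqlParser parser = new SimpleqlParser(new CommonTokenStream(lexer));
        ParseTree tree = parser.statement();   // 'statement' is the top-level rule in the grammar
        ParseTreeWalker.DEFAULT.walk(new SqlTranslator(), tree);
    }
}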

Adding a dependency from a local Maven repository to a Play project

I wanted to add a REST API layer to my Java project and decided to use the Play framework for the REST APIs. This way, I can also use Play to serve static web content in the future.

The Play framework uses SBT and Ivy, whereas I had used Maven for my Java project. To add the jar as a dependency and get it working, I had to do the following.

PS: I am using Play 2.3.x

1. Add the dependency in build.sbt and also a resolver pointing to my local Maven repository.

libraryDependencies ++= Seq(
 "my.group" % "my-project" % "1.0-SNAPSHOT",
 javaJdbc,
 javaEbean,
 cache,
 javaWs
)

resolvers += Resolver.mavenLocal

2. Do an “activator update”

3. Do an “activator eclipse”

4. Refresh the project in Eclipse.

The main trick was step #2, the update.
You could also use a resolver like the one below:

resolvers += (
 "Local Maven Repository" at "file:///"+Path.userHome.absolutePath+"/.m2/repository"
)

Apache Spark’s missing link for Realtime Interactive Data Analysis

Spark and Interactive Data Analysis

Interactive data analysis is a scenario where a human asks a data question and expects an answer in human time. Another characteristic of interactive data analysis is that usually a series of questions is asked – an operations analyst investigating site traffic might first group by geographic location, then drill down into other dimensions like device type and user agent, and finally filter by a suspicious IP. A key requirement here is the ability to cache the data, since multiple queries are fired on the same data set – this is where Apache Spark fits naturally. Spark’s RDDs can be cached in memory with graceful fallback, which is many times faster than reading from disk and selecting the relevant data set every time.

Adding a “Realtime” scenario

The word “realtime” has become a little confusing lately. There are two kinds of realtime here: first, the data needs realtime ingestion and must be available for action or querying immediately; second, a user asks a query and expects an immediate answer in real time. The second case is the same as interactive analysis; the first case is what we’ll focus on now.

So, the use case I wish to solve with Spark is realtime plus interactive analysis. At first look, Spark seems great: Spark SQL simplifies access, Spark Streaming handles realtime data and core Spark handles data on a Hadoop-compatible source. The catch is how to view and query both streaming data and historical data as a single RDD. In many cases, like log files or clickstream events, we have a realtime data stream and historical data that functionally form a single table. However, the design of Spark and Spark Streaming is similar to the lambda architecture, where you have a separate speed layer and a separate batch layer, and querying a merged view is a challenge.

The workaround I see is to keep ingesting the data into Hadoop in realtime and keep recomputing the RDDs for each query, or at a particular frequency, but this takes away the advantage of caching RDDs for future queries. I do understand this is an intentional design limitation of RDDs. Well, a problem or a limitation is an opportunity to improvise, and I am looking to prototype a solution for this use case. I will be glad to hear any ideas in this space.
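
To illustrate that workaround (and its cost), here is a minimal sketch using the Spark Java API, assuming new log files keep arriving under a hypothetical HDFS path. The cached RDD has to be dropped and rebuilt periodically to pick up newly ingested files, which is exactly where the caching benefit is lost.

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class RefreshableLogs {

    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("refreshable-logs");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Hypothetical HDFS directory where realtime ingestion keeps adding files.
        JavaRDD<String> logs = sc.textFile("hdfs:///logs/clickstream/").cache();

        while (true) {
            // Serve interactive queries against the cached RDD...
            long errors = logs.filter(line -> line.contains("404")).count();
            System.out.println("404 count: " + errors);

            // ...then drop the cache and re-read the directory so newly ingested
            // files become visible. This recompute step is what throws away the
            // caching advantage between refreshes.
            Thread.sleep(60000);
            logs.unpersist();
            logs = sc.textFile("hdfs:///logs/clickstream/").cache();
        }
    }
}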

Existing solutions: In-memory DBs

Druid Architecture

The existing solution for this use case is to use in-memory DBs like MemSQL (not open source) or Druid. These DBs are columnar and designed from the ground up for analytics. The point to note is that these in-memory DBs expect structured data, so we cannot ingest a plain-text log file directly into them and extract fields at query time the way we can with Spark. If you are dealing with structured data, however, these in-memory DBs should be a great fit.

Thanks,
Abishek, LogBase

Powering my daily commute with analytics

For the last couple of days I have been wondering what I could do to save on my commute time. Every day I travel ~25 km, and in Bangalore traffic that takes away 90 minutes of my time. Of course, with experience we learn which route is better and when traffic is lighter, but it would be good to have data backing that up and help save a few more minutes.

Bangalore Traffic

I wanted to be able to answer questions like:

  • Can I start a little later on Mondays than on Fridays?
  • What is the optimum time to start my commute?
  • Which route is good for which day?
  • If it rained an hour back, how does it affect my commute?
  • How is the traffic different for different months?
  • At what rate is the traffic slowing me down every month?
  • When should I take out my car instead of my motorcycle?

… and more

I started thinking of doing a small hobby project – a little analytics platform to help me collect and analyze data. I initially thought of building a Raspberry Pi data logger, but decided to start with a cheaper version using my Android phone. There are existing Android apps that track your location and can plot a map or speed chart, but I wanted raw data so that I have the flexibility to come up with my own queries.

Data Collection

App User Interface

I wrote a native Android app that I start at the beginning of my commute every day and stop at the end. The app takes location data from the device GPS and stores it in its embedded SQLite database. Later, whenever my device is connected to the internet, the app lets me post the collected data to cloud storage.

Cloud Storage

I wanted to store the collected data in a cloud database so that data sync is easily available. I wrote a very simple REST API that accepts JSON and hosted the server on Heroku. This API receives logged data from the mobile device and persists it in MongoHQ, which generously gives out 512 MB of free storage. For now I felt this was sufficient; I wanted low cost rather than scalability.
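
As a rough sketch of that sync step (the endpoint URL and payload shape here are hypothetical, not the actual logstr-server API), the upload from the device boils down to a plain HTTP POST of JSON:

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class LogUploader {

    // Posts a JSON payload to the collection endpoint and returns the HTTP status code.
    public static int postJson(String json) throws Exception {
        // Hypothetical Heroku endpoint; the real API lives in the logstr-server project.
        URL url = new URL("https://example-logstr.herokuapp.com/logs");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(json.getBytes("UTF-8"));
        }
        return conn.getResponseCode();
    }

    public static void main(String[] args) throws Exception {
        // One logged GPS point: latitude, longitude and a timestamp in milliseconds.
        String sample = "[{\"lat\": 12.97, \"lon\": 77.59, \"ts\": 1410000000000}]";
        System.out.println("Server responded: " + postJson(sample));
    }
}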

Analytics Interface

My requirement here is to be able to fetch data from my Mongo cloud, slice and dice it on the fly, plot charts and do some statistical analysis in an interactive way. I used Python, and it’s a great fit here – it has a MongoDB client, matplotlib for plotting charts, almost no learning curve compared to R, the IPython and notebook interfaces, and plenty of modules for data analytics.

Test Run

I took a short ride out at night, so there wasn’t any traffic, but I did halt for a minute in between before coming back. There is still some scope for calibrating the logging frequency, as my logger showed 1.2 km in total against my bike’s 1.6 km. But it was quite a good test run to start with, as it did capture my halt!

Tripmeter after test run

Data logged in MongoHQ

Dist Vs Time Plot

The plan is to keep collecting data, try some predictive analytics with this and keep exploring insights.

If you would like to collect and play with your commute data, feel free to fork my projects on GitHub:

https://github.com/cyberabis/logstr

https://github.com/cyberabis/logstr-server

Setting up an Ember project using Yeoman in Ubuntu 14.04

1. We need a C++ compiler.

sudo apt-get install build-essential libssl-dev

2. Install Git; Bower needs it for fetching dependencies.

sudo apt-get install git

3. Install NVM. Find the latest version of NVM at https://github.com/creationix/nvm.

curl https://raw.githubusercontent.com/creationix/nvm/v0.13.1/install.sh | bash

4. Install Node through NVM. Do not install Node via apt-get, as that results in npm install commands always requiring root access and causing “EACCES” errors. The latest version of Node at this time was 0.10.30. After installation, close and re-open the terminal.

nvm install 0.10.30

5. Install Compass, which is also required by the grunt command. This requires Ruby to be present.

sudo gem install compass

6. Install libfontconfig – this is required by PhantomJS. 

sudo apt-get install libfontconfig

7. PhantomJS is required by Grunt, so install that via npm.

npm install -g phantomjs

8. Install Yeoman. Don’t do this with “sudo”!

npm install -g yo

9. Install ember generator. Don’t do this with “sudo”!

npm install -g generator-ember

10. Create a project directory and cd to that directory.

mkdir my-proj
cd my-proj

11. Create the project scaffolding.

 yo ember

12. Do a bower install in the test directory

cd test
bower install

13. Build the project.

cd ..
grunt

14. Start the application. Check out the app at localhost:9000 (the default port is 9000).

grunt serve

15. I found that the path to NVM is set up only temporarily and is lost when we exit the shell. So next time, you may find that the yo and grunt commands are not in the PATH. To fix this, make a permanent entry in the profile file.

Example: export PATH=/home/vagrant/.nvm/v0.10.30/bin:$PATH

Your Ember project is initialized and running! If you are running the app in a Vagrant instance, make sure you update the server config in Gruntfile.js:

connect: {
            options: {
                port: 9000,
                // change this to '0.0.0.0' to access the server from outside
                hostname: '0.0.0.0'
            },
...