Tech Goodies

Wednesday, January 11, 2017

Microservices for better scale

I was reading an article that raised some questions in my mind about business services, particularly "Microservices":

What do I need from the CAP theorem: should the system be an AP system (Eureka, etc.) or a CP system (Consul, ZooKeeper, etcd, etc.)? How do I decide?
How do I run, manage, and monitor these systems at scale, and how should I plan for it?

Some of these points were answered once I built a small microservice demo.

The following slides may help:

Microservices for scale from Deepak Singhvi

Download the source code from GitHub.

Eureka (Discovery Server/Service):
Eureka, developed by Netflix, is a REST-based service that Netflix primarily used in the AWS cloud to locate services for load balancing and failover of middle-tier servers.

Eureka also comes with a Java-based client component, the Eureka Client, which makes interactions with the service much easier. The client also has a built-in load balancer that does basic round-robin load balancing.
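To make "basic round-robin" concrete, here is a minimal sketch in plain Java. It is not the Eureka Client's actual code; the class name RoundRobinBalancer and the host addresses are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of round-robin selection; not the Eureka Client implementation.
class RoundRobinBalancer {
    private final List<String> instances;
    private final AtomicInteger next = new AtomicInteger(0);

    RoundRobinBalancer(List<String> instances) {
        if (instances.isEmpty()) {
            throw new IllegalArgumentException("at least one instance is required");
        }
        // Defensive copy so later changes to the caller's list do not affect us.
        this.instances = new ArrayList<>(instances);
    }

    // Returns the instances in order, wrapping around at the end of the list.
    // floorMod keeps the index non-negative even if the counter overflows.
    String choose() {
        int index = Math.floorMod(next.getAndIncrement(), instances.size());
        return instances.get(index);
    }

    public static void main(String[] args) {
        // Made-up addresses, for illustration only.
        RoundRobinBalancer balancer = new RoundRobinBalancer(
                Arrays.asList("10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"));
        for (int i = 0; i < 4; i++) {
            System.out.println(balancer.choose());
        }
    }
}
```

Each call to choose() hands back the next instance, so requests are spread evenly as long as every instance is healthy.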

Zuul (Gateway/Proxy and Load Balancer):
Zuul is a JVM-based router and server-side load balancer by Netflix, and Spring Cloud has a nice integration with an embedded Zuul proxy.
Zuul has many uses; I found the following the most helpful:

Authentication
Dynamic Routing
Service Migration
Load Shedding
Security

For the proxy service, create a service, enable the Zuul proxy with @EnableZuulProxy, and define the routes.

Routes are configured in the configuration file:

Any request that matches the catalogservice route is forwarded to the serviceId catalogservice, which in this case is the catalogservice registered with Eureka.
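The idea of mapping a route to a serviceId can be sketched in plain Java as below. This is not Zuul's code or its configuration format; the RouteTable class and the "/catalogservice/" path prefix are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of a gateway route table; not Zuul's implementation.
class RouteTable {
    // Maps a path prefix to the serviceId that should receive the request.
    private final Map<String, String> routes = new LinkedHashMap<>();

    void addRoute(String pathPrefix, String serviceId) {
        routes.put(pathPrefix, serviceId);
    }

    // Returns the serviceId of the first route whose prefix matches the path.
    Optional<String> resolve(String requestPath) {
        for (Map.Entry<String, String> route : routes.entrySet()) {
            if (requestPath.startsWith(route.getKey())) {
                return Optional.of(route.getValue());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        RouteTable table = new RouteTable();
        // Assumed prefix; the real pattern lives in the proxy's configuration file.
        table.addRoute("/catalogservice/", "catalogservice");
        System.out.println(table.resolve("/catalogservice/items"));
        System.out.println(table.resolve("/unknown/items"));
    }
}
```

The gateway then asks the service registry for an instance of the resolved serviceId, so the caller never needs to know a physical address.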

Spring Cloud has created an embedded Zuul proxy to ease the development of a very common use case where a UI application wants to proxy calls to one or more back end services. This feature is useful for a user interface to proxy to the backend services it requires, avoiding the need to manage CORS and authentication concerns independently for all the backends.

There are a few pre-built filters, and a custom filter (see PreFilter.java) can also be created easily.

Ribbon (Load Balancer):
Ribbon is a client-side load balancer which gives you a lot of control over the behaviour of HTTP and TCP clients. Feign already uses Ribbon.

Feign (Web Service Client):
Feign offers a way to dynamically generate clients from an interface. An additional benefit is that we can keep the signatures of the service and the client identical. Just declare an interface with request mappings:

The above example invokes the catalogservice's (CatalogController) getItems().
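To show what "generating a client from an interface" means, here is a minimal sketch using the JDK's dynamic proxy. It is not how Feign is implemented or used; the CatalogClient interface, the DynamicClientSketch class, and the URL it builds are assumptions, and the sketch only describes the call instead of sending an HTTP request.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Hypothetical client interface; a real Feign interface would carry request mappings.
interface CatalogClient {
    String getItems();
}

// Hypothetical sketch: builds an implementation of an interface at runtime.
class DynamicClientSketch {
    @SuppressWarnings("unchecked")
    static <T> T create(Class<T> api, String serviceId) {
        InvocationHandler handler = (proxy, method, args) -> {
            // equals/hashCode/toString are also routed here; this sketch does not support them.
            if (method.getDeclaringClass() == Object.class) {
                throw new UnsupportedOperationException(method.getName());
            }
            // A real client would send an HTTP request; we only describe it.
            return "GET http://" + serviceId + "/" + method.getName();
        };
        return (T) Proxy.newProxyInstance(
                api.getClassLoader(), new Class<?>[] {api}, handler);
    }

    public static void main(String[] args) {
        CatalogClient client = create(CatalogClient.class, "catalogservice");
        System.out.println(client.getItems());
    }
}
```

No class implementing CatalogClient is ever written by hand, which is why the service and the client can share one signature.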

The Ribbon client above discovers the physical addresses for the "catalogservice" service. If the application is a Eureka client, it resolves the service in the Eureka service registry. If Eureka is not used, then configuring a list of servers in your external configuration also lets Ribbon find the servers and load balance appropriately.
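The two lookup paths can be sketched roughly as follows. This is a simplified model, not Ribbon's API: the ServerListResolver class is hypothetical, a plain map stands in for the Eureka registry, and passing null for the registry stands in for "Eureka is not used".

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of choosing a server list source; not Ribbon's implementation.
class ServerListResolver {
    private final Map<String, List<String>> registry;   // stands in for Eureka; null if unused
    private final Map<String, List<String>> configured; // stands in for external configuration

    ServerListResolver(Map<String, List<String>> registry,
                       Map<String, List<String>> configured) {
        this.registry = registry;
        this.configured = configured;
    }

    // With a registry, servers come from the registry; otherwise from configuration.
    List<String> serversFor(String serviceId) {
        Map<String, List<String>> source = (registry != null) ? registry : configured;
        return source.getOrDefault(serviceId, Collections.emptyList());
    }

    public static void main(String[] args) {
        // Made-up addresses, for illustration only.
        Map<String, List<String>> config = new HashMap<>();
        config.put("catalogservice", Arrays.asList("localhost:9001", "localhost:9002"));
        ServerListResolver withoutEureka = new ServerListResolver(null, config);
        System.out.println(withoutEureka.serversFor("catalogservice"));
    }
}
```

Either way the load balancer ends up with a list of candidate servers, and the choice among them is a separate step.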

:-)

Wednesday, October 12, 2016

Visualisation using d3.js based Sunburst with Apache Zeppelin

Zeppelin provides a few default visual components (pie, bar, stacked, area, line chart, etc.).
Users who want more can either add a new default component or create a visualisation using the AngularJS interpreter.

I created a d3-based sunburst to prepare a report in Apache Zeppelin.

It is easy and quick.

The Apache Zeppelin display system adds additional div(s), which creates some blank area on the screen.

You can see this as the blank area between the sunburst and the breadcrumbs at the bottom.

********

The notebook for the above visualisation is available here and can be imported into Apache Zeppelin.
It contains the AngularJS source for the sunburst visual.

The data, which I downloaded from the NSE historical data section and transformed for demo purposes (nsecombinedreport.csv), can be downloaded from here.
The report covers the various Instrument Types and Securities and the amount traded in a day.

Sunday, May 15, 2016

Focusing on implementing govt policies using the big data tool zeppelin

It was good to learn that the government has published a lot of data, collected over time, at https://data.gov.in/

I picked the amenities data about villages from https://data.gov.in/catalog/village-amenities-census-2011 to do some analysis.

I believe the government is doing sufficient analysis to find where, and with what force, it should use its machinery to promote its schemes.

I have been doing some analysis using Apache Spark and the ecosystem around it, but I was interested in a quick visualization that would help me understand the data quickly. One option was R, since I wanted to build the reports quickly. I explored some of the capabilities of R and Shiny App in my earlier post on Cluster Analysis of banking data.

Recently I came to know about a fantastic tool: a web-based notebook with built-in support for Apache Spark and for multiple languages such as Scala, Python, Spark SQL and so on. Most importantly, it is open source.

"Zeppelin"

I picked one of the CSV files from the whole data set, the one for Gulbarga, a district in Karnataka state, and started doing some analysis.

Loading the data into the dataframe/table

It is also easy to accommodate Spark SQL in the notebook paragraphs/sections.
The following is a very simple query to show the population spread in the villages of Gulbarga district.

The government makes policies, spends money on them, and judges their effectiveness by the results. We can use the collected data to understand where the schemes need the deepest penetration, i.e. find the villages that need the government schemes most. One example is a policy to reduce the gap in the male-female ratio: from the available data we can see where the focus should be greatest.

I changed the minbenchmark to 80% and the result was updated on the fly.
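The kind of filter behind such a benchmark can be sketched in plain Java as follows. This is not the notebook's actual query: I am assuming minbenchmark means a minimum number of females per 100 males, the Village and RatioBenchmark classes are hypothetical, and the village figures are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical record of one village; field names are assumptions, not the CSV's columns.
class Village {
    final String name;
    final int males;
    final int females;

    Village(String name, int males, int females) {
        this.name = name;
        this.males = males;
        this.females = females;
    }

    // Females per 100 males.
    double femalesPer100Males() {
        return 100.0 * females / males;
    }
}

class RatioBenchmark {
    // Returns the names of villages whose ratio is below the benchmark.
    // Villages with no recorded males are skipped to avoid dividing by zero.
    static List<String> belowBenchmark(List<Village> villages, double minBenchmarkPercent) {
        List<String> result = new ArrayList<>();
        for (Village v : villages) {
            if (v.males > 0 && v.femalesPer100Males() < minBenchmarkPercent) {
                result.add(v.name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up figures, for illustration only.
        List<Village> villages = Arrays.asList(
                new Village("Village A", 1000, 950),
                new Village("Village B", 1000, 780));
        System.out.println(belowBenchmark(villages, 80.0));
    }
}
```

In the notebook the benchmark is an input that can be changed interactively, which is why the result refreshes without editing the query.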

I have started to analyse this data to check the education facilities in the villages. This is in progress, and I will publish the findings in later posts.

Installation details:
a) Zeppelin was deployed on Ubuntu VirtualBox with Windows as host.
b) Set JAVA_HOME (Java 1.7) before starting Zeppelin.
c) To start, execute 'zeppelin-daemon.sh start' in ZEPPELIN_HOME/bin