Difference between revisions of "Node Architecture For Enterprise"

From AccountIT
(Guiding Principles)
Line 16: Line 16:
 
- We should never have two different technologies for solving the same problem.
  
= Open Source + Java =
+
= Play Framework =
 
+
In general, for all technologies, we like to choose open source for third party platforms and components, and to use Java for our own development. These choices are based on the following rationale:
+
 
+
    The open source licensing model is highly attractive for lowering costs; in particular when we are looking at the white-label and private cloud offerings. Not only is the software cheaper (if not free), there is usually less overhead in managing licenses as policies change or load scales up.
+
    Open source lowers vendor lock-in and risk. Open source projects often see more competition because the software is reused or cloned in various products and packages, and because the vendors/communities are far smaller than traditional vendors like IBM, Oracle and Microsoft.
+
    Open source generally offers a wider variety of frameworks and middleware, giving us the freedom to pick and choose those that suit our requirements best. This is even more so given the fundamental changes seen in, for example, databases (with NoSQL) or peer-to-peer clustering, where the traditional big technology vendors are lagging behind.
+
    Java is a strong and mature platform that is widely used for backend enterprise systems such as ours. It has a huge community and is far less tied in with specific vendors than e.g. Windows/.NET.
+
 
+
Note that using Java for our own development does not mean that third-party components have to run in a Java runtime.
+
 
+
= HTML5 + JavaScript =
+
  
 
The most important exception to using Java is when developing the application GUI layer. All such GUIs are created as HTML5 + JavaScript for running in client browsers. To support these technologies we use jQuery and the third-party Kendo framework from Telerik. We do not want to use any other GUI technologies, including Flash, Silverlight and non-browser platforms.
  
 
Web GUIs are based on static pages retrieving data through REST services, providing the best method for creating state-of-the-art RIA ("rich internet applications").
 
= Karaf =
 
 
In general, all applications and components developed by us are hosted within the Apache Karaf OSGi container/application server, which provides a common environment for deploying, managing and monitoring the software. On top of the application server, the various ETI processing components use the Apache Camel integration framework for building their functionality. Camel is a general-purpose framework for implementing common enterprise integration patterns.
 
  
 
= Messaging =
Line 43: Line 28:
 
RabbitMQ will also be used to integrate into and between the supporting business systems, such as CRM, ERP, support and sales systems.
  
= Databases =
+
= Datastore =
  
 
We are moving to NoSQL database models, because these support our data requirements best. Several database models and technologies must be chosen to support our wide variety of data. The diagram below indicates which database models are best applied where. Where we use third-party applications, the database model is dictated by the application vendors.
Line 69: Line 54:
  
 
For non-Java applications like RabbitMQ and Riak, a Java client is needed that collects relevant information about the application's health state, so that it can be exposed through Jolokia and Hawtio.
= Areas Still to Be Looked at =
+
= Deployment =
  
 
     A very interesting technology to look at is the Apache Chukwa log collection and analysis framework. This clustered framework is ideal for logging from many nodes and even many IB appliances, which is required for both system and business level monitoring. Alternatives include Apache Flume, Scribe (developed at Facebook) and Fluentd.

Revision as of 08:19, 16 March 2014

This page describes the overall technology stack used within the IB appliance, and to some degree also within the supporting business systems.

Overview

The diagram below gives an overview of the technology used in the stack. Or rather stacks in plural, since the various node classes will use different technology stacks.

Node.png

Guiding Principles

The main guiding principle for choosing technologies is:

We want to have as few technologies as possible, yet for each problem we want to have the best possible technology.

- We want few technologies, because the fewer we have, the easier they are to master. At the same time we want the best technologies, because they solve our problems in the most efficient way.

- We should never have two different technologies for solving the same problem.

Play Framework

The most important exception to using Java is when developing the application GUI layer. All such GUIs are created as HTML5 + JavaScript for running in client browsers. To support these technologies we use jQuery and the third-party Kendo framework from Telerik. We do not want to use any other GUI technologies, including Flash, Silverlight and non-browser platforms.

Web GUIs are based on static pages retrieving data through REST services, providing the best method for creating state-of-the-art RIA ("rich internet applications").
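
As a rough illustration of the server side of this pattern, the sketch below serves a small JSON document over HTTP using only the JDK's built-in HTTP server, and fetches it the way a static page's JavaScript would. The /api/status path, the JSON payload and the class name are assumptions made for the example; they are not part of the actual stack, which may well use a different REST framework.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Sketch only: the /api/status path and the JSON payload are hypothetical.
class RestDataSketch {

    // Starts a small HTTP server on a free local port that answers /api/status with JSON.
    static HttpServer startServer() throws IOException {
        HttpServer server = HttpServer.create(
                new InetSocketAddress(InetAddress.getByName("127.0.0.1"), 0), 0);
        server.createContext("/api/status", exchange -> {
            byte[] body = "{\"node\":\"example\",\"state\":\"UP\"}"
                    .getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        return server;
    }

    // Fetches the data as a browser-side script would; done from Java here so it can be run standalone.
    static String fetch(int port, String path) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("http://127.0.0.1:" + port + path)).GET().build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = startServer();
        try {
            System.out.println(fetch(server.getAddress().getPort(), "/api/status"));
        } finally {
            server.stop(0);
        }
    }
}
```

The static page itself holds no data; it only issues requests like the one in fetch and renders the JSON it receives.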

Messaging

Messaging is provided at two levels: internally within computing nodes, and externally between computing nodes. The first is provided by Apache Camel (via ActiveMQ) as mentioned above. The second is provided by RabbitMQ in a distributed cluster setup.

RabbitMQ will also be used to integrate into and between the supporting business systems, such as CRM, ERP, support and sales systems.

Datastore

We are moving to NoSQL database models, because these support our data requirements best. Several database models and technologies must be chosen to support our wide variety of data. The diagram below indicates which database models are best applied where. Where we use third-party applications, the database model is dictated by the application vendors.

<< Figure of database types >>

So far Riak CS has been chosen for key-value-oriented data. Riak CS is chosen because it provides an attractive peer-to-peer distributed clustering setup and supports very big files. As per our guiding principles we should not have any other technologies for this type of data, unless there is a very specific benefit from doing so.

The technology for the other database models has not yet been chosen. However, the following candidates are being considered:

   Cassandra, Neo4j or traditional relational database for search-oriented data
   Neo4j or traditional relational database (star-schema) for analysis-oriented data; unless the analysis is done in third party systems that provide their own databases
   Riak CS or workflow frameworks/systems for process-oriented data; again third party systems may be the better solution in this case

All of our existing systems use traditional relational databases, mostly Oracle, and we may keep that technology, if only for legacy reasons. In that case the Oracle-compatible EnterpriseDB is considered an attractive candidate.

Monitoring

With a system landscape consisting of nodes in clusters, monitoring occurs at two levels:

   At node level, with each node exposing monitoring information through a monitoring agent. A surveillance/monitoring tool used by IT operations can collect the information provided by the monitoring agent.
   An aggregated view of the cluster of nodes, displaying the overall state of the cluster of some node. The monitoring aggregate is hosted by the "Management Node".

The external interface of a monitoring agent is provided by Jolokia (http://jolokia.org), which exposes JMX MBean information as JSON over HTTP. Jolokia is packaged within Hawtio, which additionally provides a web UI for the JMX MBean information that Jolokia exposes. Thus each node, as well as the aggregated information, is available in a web browser through Hawtio.

For non-Java applications like RabbitMQ and Riak, a Java client is needed that collects relevant information about the application's health state, so that it can be exposed through Jolokia and Hawtio.
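
One way such a Java client could look is sketched below, using only the JDK's JMX classes: a health probe for an external application is wrapped in an MBean and registered on the platform MBean server, where a JMX-to-HTTP bridge such as Jolokia can pick it up. The MBean name, the "Status" attribute, the probe and the base URL are all assumptions made for the example. The URL helper follows the general shape of a Jolokia read request (base/read/mbean/attribute) and should be checked against the Jolokia documentation before being relied on.

```java
import java.lang.management.ManagementFactory;
import java.util.function.BooleanSupplier;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.StandardMBean;

// Sketch only: the MBean name, the "Status" attribute and the probe are hypothetical.
class HealthMBeanSketch {

    // Management interface; JMX requires it to be public.
    public interface HealthMBean {
        String getStatus();
    }

    // Wraps a health probe for an external (non-Java) application.
    static final class ProbeHealth implements HealthMBean {
        private final BooleanSupplier probe;

        ProbeHealth(BooleanSupplier probe) {
            this.probe = probe;
        }

        @Override
        public String getStatus() {
            return probe.getAsBoolean() ? "UP" : "DOWN";
        }
    }

    // Registers the probe under the given name on the platform MBean server.
    static ObjectName register(String name, BooleanSupplier probe) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName objectName = new ObjectName(name);
        server.registerMBean(
                new StandardMBean(new ProbeHealth(probe), HealthMBean.class), objectName);
        return objectName;
    }

    // Reads the attribute locally, as a JMX-to-HTTP bridge would before serialising it.
    static String readStatus(ObjectName objectName) throws Exception {
        return (String) ManagementFactory.getPlatformMBeanServer()
                .getAttribute(objectName, "Status");
    }

    // Builds a Jolokia-style read URL: <base>/read/<mbean>/<attribute>.
    static String jolokiaReadUrl(String base, ObjectName objectName, String attribute) {
        return base + "/read/" + objectName.getCanonicalName() + "/" + attribute;
    }

    public static void main(String[] args) throws Exception {
        // A real probe would query the external application; this one always reports healthy.
        ObjectName name = register("example.monitoring:type=Health,app=rabbitmq", () -> true);
        System.out.println(readStatus(name));
        System.out.println(jolokiaReadUrl("http://localhost:8181/jolokia", name, "Status"));
    }
}
```

In a real agent the probe would call the external application's own management interface; that part is deliberately left out here.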

Deployment

   A very interesting technology to look at is the Apache Chukwa log collection and analysis framework. This clustered framework is ideal for logging from many nodes and even many IB appliances, which is required for both system and business level monitoring. Alternatives include Apache Flume, Scribe (developed at Facebook) and Fluentd.
   Another area that needs a solid technological foundation is network communication stack frameworks for implementing the ETI gateways.
   Automatic scaling and deployment need to be looked at.
   Tools and APIs for monitoring, metering and managing the appliance also need to be looked at. Such tools could include Hawtio, JMX etc. Integration into underlying operations frameworks such as Nagios or Tivoli is also necessary. Requirements are not fully known at this time.
   Authentication and integration with Microsoft Active Directory (AD), LDAP, OAuth and SAML. Requirements are not fully known at this time.