Node Architecture For Enterprise
This page describes the overall technology stack used within the IB appliance, and to some degree also within the supporting business systems.
Overview
The diagram below gives an overview of the technologies used in the different nodes.
Guiding Principles
The main guiding principle for choosing technologies is:
We want to have as few technologies as possible, yet for each problem we want to have the best possible technology.
- We want few technologies, because the fewer we have, the easier they are to master. At the same time we want the best technologies, because they solve our problems in the most efficient way.
- We should never have two different technologies for solving the same problem.
Play Framework
The AccountIT application will be based on the Play Framework. Play provides a UI framework with support for HTML5 and JavaScript, making it possible to provide a modern web-application feel. The framework provides a developer-friendly environment that makes it easy to develop and test; this is the primary reason for choosing it. Development can be done in both Java and Scala, which makes it possible to mix object-oriented and functional programming.
Messaging
Messaging is used to build an asynchronous system based on the CQRS pattern. Commands represent update requests, while queries represent search requests. RabbitMQ will also be used to integrate AccountIT with supporting business systems (such as CRM, ERP, support and sales systems), and to integrate the supporting business systems with each other.
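The command/query split described above can be sketched in plain Java. This is a minimal illustration, not the AccountIT design: the class names (`CqrsSketch`, `RenameAccount`, `FindAccountName`) are hypothetical, and an in-memory queue stands in for a broker queue so that the example runs without RabbitMQ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative only: every name here is hypothetical, and the in-memory
// queue is an assumption standing in for a broker queue.
class CqrsSketch {

    // A command asks for an update; the sender gets no result back.
    static final class RenameAccount {
        final String accountId;
        final String newName;

        RenameAccount(String accountId, String newName) {
            this.accountId = accountId;
            this.newName = newName;
        }
    }

    // A query asks for data and never changes state.
    static final class FindAccountName {
        final String accountId;

        FindAccountName(String accountId) {
            this.accountId = accountId;
        }
    }

    private final BlockingQueue<RenameAccount> commands = new LinkedBlockingQueue<>();
    private final Map<String, String> readModel = new ConcurrentHashMap<>();

    // Write side: enqueue the command and return immediately.
    void send(RenameAccount command) {
        commands.add(command);
    }

    // Consumer: applies all queued commands to the read model and
    // returns how many were applied.
    int processPending() {
        List<RenameAccount> batch = new ArrayList<>();
        commands.drainTo(batch);
        for (RenameAccount c : batch) {
            readModel.put(c.accountId, c.newName);
        }
        return batch.size();
    }

    // Read side: answers from the read model only.
    Optional<String> ask(FindAccountName query) {
        return Optional.ofNullable(readModel.get(query.accountId));
    }

    public static void main(String[] args) {
        CqrsSketch app = new CqrsSketch();
        app.send(new RenameAccount("a-1", "Acme"));
        // The command is queued but not yet applied, so the query finds nothing.
        System.out.println(app.ask(new FindAccountName("a-1")).isPresent());
        app.processPending();
        System.out.println(app.ask(new FindAccountName("a-1")).orElse("unknown"));
    }
}
```

Because the write side only enqueues, a query issued right after a command may still see the old state; the read model catches up once the consumer has processed the queue.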
Datastore
We are moving to NoSQL database models because these best support our data requirements. Several database models and technologies must be chosen to support our wide variety of data. The diagram below indicates which database models are best applied where. Where we use third-party applications, the database model is dictated by the application vendor.
<< Figure of database types >>
So far Riak CS has been chosen for key-value-oriented data. Riak CS is chosen because it provides an attractive peer-to-peer distributed clustering setup and supports very big files. As per our guiding principles we should not have any other technologies for this type of data, unless there is a very specific benefit from doing so.
The technology for the other database models has not yet been chosen. However, the following candidates are being considered:
- Cassandra, Neo4j or a traditional relational database for search-oriented data
- Neo4j or a traditional relational database (star schema) for analysis-oriented data, unless the analysis is done in third-party systems that provide their own databases
- Riak CS or workflow frameworks/systems for process-oriented data; again, third-party systems may be the better solution in this case
All of our existing systems use traditional relational databases, mostly Oracle, and we may still keep that technology, if only for legacy reasons. In that case the Oracle-compatible EnterpriseDB is considered an attractive candidate.
Monitoring
With a system landscape consisting of nodes in clusters, monitoring occurs at two levels:
- At node level, with each node exposing monitoring information through a monitoring agent. A surveillance / monitoring tool used by IT operations can collect the information provided by the monitoring agent.
- At cluster level, as an aggregated view of a cluster of nodes, displaying the overall state of that cluster. The monitoring aggregate is hosted by the "Management Node".
The external interface of a monitoring agent is provided by Jolokia (http://jolokia.org), which exposes JMX MBean information as JSON over HTTP. Jolokia is packaged within Hawtio, which additionally provides a web UI for the JMX MBean information that Jolokia exposes. Thus the information for each node, as well as the aggregated information, is available in a web browser through Hawtio.
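As a rough illustration of what a monitoring tool would do against such an agent, the sketch below issues a Jolokia "read" request over plain HTTP using the JDK's `java.net.http` client (Java 11 or later). The base URL `http://localhost:8778/jolokia` is an assumption for illustration; the actual host, port and path depend on how the agent is deployed. The class name `JolokiaReadSketch` is hypothetical.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Illustrative only: class name and base URL are assumptions.
class JolokiaReadSketch {

    // Builds a Jolokia GET "read" URL of the form <base>/read/<mbean>/<attribute>.
    // No escaping is done here; MBean names containing "/" would need the
    // escaping described in the Jolokia protocol documentation.
    static URI readUri(String baseUrl, String mbean, String attribute) {
        return URI.create(baseUrl + "/read/" + mbean + "/" + attribute);
    }

    // Sends the read request and returns the raw JSON response body.
    static String read(String baseUrl, String mbean, String attribute) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(readUri(baseUrl, mbean, attribute))
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Assumed agent location; pass a different base URL as the first argument.
        String base = args.length > 0 ? args[0] : "http://localhost:8778/jolokia";
        // java.lang:type=Memory is a standard platform MBean on every JVM.
        System.out.println(read(base, "java.lang:type=Memory", "HeapMemoryUsage"));
    }
}
```

Running `main` requires a reachable Jolokia agent; without one, `read` throws a connection exception.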
For non-Java applications like RabbitMQ and Riak, a Java client is needed that collects relevant information about the application's health state, so that it can be exposed through Jolokia and Hawtio.
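One possible shape for such a client is a small JMX MBean that wraps a health probe for the external application. The sketch below uses only the JDK's `javax.management` API. The class names, the `com.example.monitoring` JMX domain and the `BooleanSupplier` probe are hypothetical; a real probe would have to query RabbitMQ or Riak through whatever interface is chosen for that purpose.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BooleanSupplier;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.StandardMBean;

// Illustrative only: all names and the JMX domain are assumptions.
class ExternalHealthSketch {

    // Management interface: each getter becomes a readable JMX attribute.
    public interface ExternalHealthMBean {
        boolean isAlive();     // attribute "Alive"
        long getProbeCount();  // attribute "ProbeCount"
    }

    static final class ExternalHealth implements ExternalHealthMBean {
        private final BooleanSupplier probe;
        private final AtomicLong probeCount = new AtomicLong();

        ExternalHealth(BooleanSupplier probe) {
            this.probe = probe;
        }

        // Runs the probe on every read; a probe that throws counts as "not alive".
        @Override
        public boolean isAlive() {
            probeCount.incrementAndGet();
            try {
                return probe.getAsBoolean();
            } catch (RuntimeException e) {
                return false;
            }
        }

        @Override
        public long getProbeCount() {
            return probeCount.get();
        }
    }

    // Registers the health MBean with the platform MBean server, from where
    // a JMX-to-HTTP agent running in the same JVM can pick it up.
    static ObjectName register(String name, BooleanSupplier probe) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName objectName =
                new ObjectName("com.example.monitoring:type=ExternalHealth,name=" + name);
        server.registerMBean(
                new StandardMBean(new ExternalHealth(probe), ExternalHealthMBean.class),
                objectName);
        return objectName;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder probe that always reports healthy.
        ObjectName name = register("rabbitmq", () -> true);
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        System.out.println(name + " Alive=" + server.getAttribute(name, "Alive"));
    }
}
```

Running the probe on each attribute read keeps the sketch short; a production client would more likely poll on a schedule and cache the last result, so that a slow external application cannot block the monitoring request.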
Deployment
- A very interesting technology to look at is the Apache Chukwa log collection and analysis framework. This clustered framework is ideal for logging from many nodes and even many IB appliances, which is required for both system-level and business-level monitoring. Alternatives include Apache Flume, Scribe (developed at Facebook) and Fluentd.
- Another area that needs a solid technological foundation is network communication stack frameworks for implementing the ETI gateways.
- Automatic scaling and deployment need to be looked at.
- Tools and APIs for monitoring, metering and managing the appliance also need to be looked at. Such tools could include Hawtio, JMX etc. Integration into underlying operations frameworks such as Nagios or Tivoli is also necessary. Requirements are not fully known at this time.
- Authentication, including integration with Microsoft Active Directory (AD), LDAP, OAuth and SAML. Requirements are not fully known at this time.