OpenG2P Deployment Architecture

Complete information and guidance on deploying OpenG2P components

OpenG2P offers a production-grade, Kubernetes-based deployment architecture designed to deliver secure, scalable, and reliable deployments of OpenG2P modules. Built on a robust Kubernetes orchestration framework, it supports multiple isolated environments - such as Development, QA, Demo, Staging, Pilot and Production - within a single organisational setup, enabling seamless management across the entire software lifecycle.

This deployment architecture ensures secure access for internal development teams and has been rigorously tested, earning an A+ rating in third-party penetration testing, underscoring its strong security posture. By leveraging the same deployment model for development as well as production, it facilitates an easy and efficient transition from development to production environments, significantly reducing complexity and risk.

The deployment is offered as a package of instructions, scripts, Helm charts, utilities and guidelines, enabling system implementors to rapidly and securely deploy OpenG2P. This saves substantial time and resources by eliminating the need to build production-grade deployment setups from scratch.

The deployment is cloud agnostic - it does not use cloud-specific components - and is fully suitable for on-premises setups.

Deployment architectures

Depending on the availability of compute resources and the scale of your deployment, we recommend the following deployment architectures:

Architecture
Description
Purpose

Single-node

All components, including Kubernetes, Wireguard, Nginx and NFS, run on the same machine. Multiple environments run in separate Kubernetes namespaces. PostgreSQL runs as a Docker container within each namespace.

Development | Pilots

Well suited for getting started with OpenG2P and for creating development sandboxes like dev, qa, etc. This setup can also be used for small-scale pilots.

Three-node

The storage server is separated from the compute server (Kubernetes). The PostgreSQL server runs on a separate "storage node" that contains large volumes of SSD storage with high-throughput disk I/O. NFS also runs on this node. Thus, there is a separation of concerns between compute and data.

Pilots | Small scale production

For pilots and small-scale production setups, specifically where high uptime is not critical. If the system is predominantly used by administrators and some downtime of services and portals is acceptable, this architecture is sufficient.

Full-scale

Separate nodes for each of Wireguard, Nginx, Kubernetes, NFS and PostgreSQL.

Large scale production

Full-scale production deployment for the following scenarios:

  • Multiple applications need to be supported on the cluster and clear separation of concerns is important.

  • Fail safety is critical: certain services must continue to run without interruption. This is typically the case with registration or beneficiary portals that must be kept up, where downtime is not acceptable.

  • Scale is high in terms of compute requirements.

  • Fine-grained access control is needed for various resources of the system.

  • "Circuit breakers" for traffic control and attacks.


Single-node

  • Single virtual machine running all services

  • One Kubernetes cluster hosting both Rancher and OpenG2P services

  • Nginx, Wireguard, NFS server running outside the Kubernetes cluster but on the same node

  • Multiple environments like dev, qa, demo etc. as Kubernetes namespaces

  • Access to each environment (namespace) can be controlled via private access channels. (The node needs multiple network interfaces to support this.)

  • SSL termination (HTTPS) happens on Nginx. Traffic onward to the Ingress gateway is plain HTTP.

  • Firewall is outside the purview of this deployment.

  • The Git repo and Docker registry are assumed to be hosted externally (public or private). For on-prem hosting you will need additional resources, as in the Three-node setup.

  • As this deployment is based on Kubernetes, the system can easily be scaled up by adding more nodes (machines), as in the Full-scale setup.
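The namespace-per-environment layout described above can be sketched with plain Kubernetes Namespace manifests. This is a minimal illustration; the environment names and labels here (dev, qa) are placeholders, not names prescribed by OpenG2P.

```yaml
# Minimal sketch: one Kubernetes namespace per environment on the
# single node. Names and labels are illustrative placeholders.
apiVersion: v1
kind: Namespace
metadata:
  name: dev
  labels:
    environment: dev
---
apiVersion: v1
kind: Namespace
metadata:
  name: qa
  labels:
    environment: qa
```

Helm releases for each environment would then be installed with the corresponding `--namespace` flag, keeping the environments isolated on the same cluster.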

Three-node

  • Separation of concerns - storage and reverse proxy on separate nodes

  • PostgreSQL server runs on the Storage Node.

  • Only one environment, such as Pilot or Production, is expected to run on the cluster. Sharing the same PostgreSQL server across multiple environments is not recommended. If you would still like to do so, make sure the names of all databases differ across environments.

  • NFS server runs on the storage node

  • The storage node is expected to have large SSD disks and modest compute capability, while the compute node must have high compute power and RAM. See Resource Requirements.

  • The Storage Node can be managed independently in terms of access, scaling and backups.

  • A local Git repo and Docker registry may be hosted on the Storage Node.

  • Access to each environment (namespace) can be controlled via private access channels. (The node needs multiple network interfaces to support this.)

  • SSL termination (HTTPS) happens on Nginx. Traffic onward to the Ingress gateway is plain HTTP.

  • Firewall is outside the purview of this deployment.
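If multiple environments must nevertheless share one PostgreSQL server on the storage node, the database names can be disambiguated per environment in the Helm values. The fragment below is purely illustrative; the keys and hostnames are hypothetical, not the actual OpenG2P chart schema.

```yaml
# Illustrative values fragments for two environments sharing one
# PostgreSQL server; only the database name differs per environment.
# All keys and hostnames below are hypothetical placeholders.

# values-pilot.yaml
postgresql:
  host: storage-node.internal    # placeholder hostname of the storage node
  database: openg2p_pilot        # env-specific database name
---
# values-staging.yaml
postgresql:
  host: storage-node.internal
  database: openg2p_staging
```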

Full-scale

  • For multiple applications and large-scale rollouts where availability and real-time response are critical.

  • The Rancher cluster is separated from the OpenG2P cluster, as Rancher can manage multiple clusters.

  • An organisation-wide Keycloak runs on the Rancher cluster.

  • NFS server is hosted on a separate "Storage node".

  • PostgreSQL (although not shown in the diagram) is also hosted on separate servers for production deployments. It may be run on the above Storage node; thus PostgreSQL and NFS may share the same node if it can handle the load.

  • Multiple environments can run within the OpenG2P cluster (as in the single-node and three-node architectures).

  • The minimum recommended number of OpenG2P cluster nodes is 3 — this is for fail safety of the Kubernetes "master" (control plane) node.

  • More nodes may be added to the cluster as per scaling requirements

  • Wireguard and the load balancer (Nginx) run on separate nodes for better separation of concerns and management.

  • While OpenG2P departmental apps typically don't need such robust infrastructure, it's essential if you want fast-response, beneficiary-facing websites with zero downtime.
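Persistence for the cluster is typically wired up by pointing PersistentVolumes at the NFS export on the storage node. A minimal sketch follows; the server address, export path and capacity are placeholders to be replaced with your own.

```yaml
# Sketch of an NFS-backed PersistentVolume pointing at the storage
# node. Server IP, export path, and capacity are placeholders.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: openg2p-nfs-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany            # NFS allows shared access across nodes
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 10.0.0.5           # placeholder IP of the storage node
    path: /srv/nfs/openg2p     # placeholder export path
```

Pods on any compute node can then claim this volume, which is what makes adding nodes to the cluster straightforward compared to node-local disk.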

Role of various components

The deployment utilizes several open-source third-party components. The concept and role of these components are given below:

Component
Description

Wireguard

Wireguard is a fast, secure, open-source VPN with P2P traffic encryption that enables secure (non-public) access to resources. A combination of Wireguard, Nginx and the Istio gateway is used to enable fine-grained access control to the environments. See Private Access Channels.


If you have your own VPN setup, Wireguard is not required. However, it is expected that the implementers take care of setting up secure access; OpenG2P only provides guidance for Wireguard.

The terms Wireguard, Wireguard Bastion and Wireguard Server are used interchangeably in this document.

Nginx

Nginx acts as a reverse proxy for incoming external (public) traffic. It performs HTTPS termination and, together with Wireguard and the Istio Gateway, can be used to create private access channels. Nginx isolates the internal network so that traffic does not fall directly on the Istio Gateway of the Kubernetes cluster. The Nginx node needs a public IP for public-facing portals.

Istio

Istio is a service mesh that provides a way to connect, secure, control, and observe microservices. It is a powerful mesh-management tool and also provides an ingress gateway for the Kubernetes cluster. See the note below.

Ingress Gateway

The Ingress Gateway component of Istio routes external traffic into Kubernetes services. Istio can be configured to do much more. See the note below.

Rancher

Rancher provides advanced cluster management capabilities. It can also manage several clusters.

Keycloak

Keycloak provides organisation-wide authorisation and offers single sign-on for all resources.

NFS

Network File System (NFS) provides persistence to the resources of the Kubernetes cluster. Although on a single machine installation we can directly use the underlying SSD storage, we prefer to use NFS, keeping in mind scalability in case more nodes (machines) need to be added to the cluster.

Prometheus & Grafana

For system monitoring. Learn more >>

FluentD

For collecting and shunting logs of services to OpenSearch. Learn more >>

OpenSearch

For indexing and search of data. Primarily used for logs and the reporting framework.

PostgreSQL

Primary database of the OpenG2P platform. For production deployments, PostgreSQL is installed on the VM directly (natively), while for sandboxes, PostgreSQL is installed on the Kubernetes cluster inside a namespace using the PostgreSQL Docker image.


Why Istio? What are the benefits of using Istio in OpenG2P setup?

  • We can have advanced traffic management setups like load balancing, retries & failovers, and fault injection for testing resilience.

  • We can use advanced deployment strategies like canary deployments and A/B testing, where Istio routes a chosen percentage of traffic to specific service versions.

  • We can enable security features like mTLS encryption for service-to-service traffic. Istio can also provide an authentication & authorization layer for services.

  • We can also define policies related to access control & rate limiting. One can define which services are allowed to access other services or limit the rate of requests accepted by a service.

  • More importantly, Istio provides comprehensive observability features. We can visualize and monitor service-to-service traffic in real time with tools like Kiali, which helps identify performance bottlenecks and diagnose issues.
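The canary routing mentioned above is configured with an Istio VirtualService that splits traffic by weight between service versions. The sketch below is illustrative: the hostname, service name, and subsets (v1/v2, which would be defined in a DestinationRule) are placeholders, not actual OpenG2P service names.

```yaml
# Illustrative Istio VirtualService for a canary rollout: 90% of
# traffic goes to subset v1, 10% to v2. Host, service, and subset
# names are placeholders; subsets assume a matching DestinationRule.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: portal-canary
spec:
  hosts:
    - portal.example.org
  gateways:
    - istio-system/ingressgateway   # bind to the Istio ingress gateway
  http:
    - route:
        - destination:
            host: portal
            subset: v1
          weight: 90
        - destination:
            host: portal
            subset: v2
          weight: 10
```

Shifting the weights gradually (90/10, then 50/50, then 0/100) lets the new version take traffic incrementally while the old version stays available for rollback.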

Base infrastructure

In all the architectures above there is a base infrastructure (comprising Kubernetes, Nginx, Wireguard, NFS, etc.) over which specific environments are installed. Refer to the base infrastructure installation instructions here.

Environments

An environment is an isolated setup for a specific purpose like development, testing, staging or production. In OpenG2P's deployment model each environment is a namespace in Kubernetes. The namespace contains a set of common shared modules - openg2p-commons - and the modules themselves (Registry, PBMS, SPAR, G2P Bridge), along with any third-party dependency modules. Access to each environment can be controlled using private access channels and Kubernetes RBAC. Only one instance of the PostgreSQL server runs per environment, which means all modules use the same PostgreSQL server (Dockerized or external, depending on the choice of installation).
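The per-environment RBAC control can be expressed as a namespace-scoped RoleBinding that grants a team access to only its own environment. This is a minimal sketch; the namespace and group names are placeholders, while `edit` is a built-in Kubernetes ClusterRole.

```yaml
# Sketch of Kubernetes RBAC restricting a team to one environment
# (namespace). Namespace and group names are placeholders.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dev-team-access
  namespace: dev               # the environment this team may access
subjects:
  - kind: Group
    name: dev-team             # placeholder group from your identity provider
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: edit                   # built-in role: read/write objects, no RBAC changes
  apiGroup: rbac.authorization.k8s.io
```

Because the binding is namespaced, the same group has no access to objects in other environments' namespaces unless bound there explicitly.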

While installation can be easily achieved with the provided Helm charts, tearing down an environment involves a few manual steps. Refer to the tear-down section in the deployment documentation for each module.
