Social Registry
Social Registry (SR) is an independent module offered by OpenG2P for creating registries of individuals and groups of people along with their demographic data. Its advanced features make the SR interoperable and easy to fit into a country's digital public infrastructure (DPI).
Some of the key benefits of using an SR are:
Issue Verifiable Credentials (VCs) to registrants
Share data with other departments and organizations in a standardized manner, avoiding repeated collection of the same data
Give individuals control over their own data
Attestation
Visualization and analysis of social data
Registry update mechanisms
Individual login and update
Admin login and update
Bulk update using CSV
Import of data from other sources
Offline update using ODK Central
Functionality and features
Registry of human demographic data
Should be a trusted source of truth
Attestation
Verifiable Credentials: should be able to issue VC
Person should have control over his/her data – person should be able to update the data (self service)
Relationships between people
Groups and Households
Privacy & security of data (using MOSIP encryption modules)
Should be possible to share this data with others (DPI)
Compliant with standards like DCI/G2P Connect/GovStack
Should be possible to add fields in the registry
Timestamped data
Change log
Multiple versions of person record
Reporting (Statistics)
Design
Change log
The change log can be built using the Odoo OCA package Audit Log. It would record the set of changes made to any field(s) in the registry.
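As an illustration only, the sketch below shows what a field-level change log entry might hold. It does not use or mirror the OCA Audit Log schema; `ChangeLogEntry` and `diff_fields` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class ChangeLogEntry:
    # Hypothetical shape of one change log row: one entry per changed field.
    record_id: int
    field: str
    old_value: Any
    new_value: Any
    changed_at: datetime


def diff_fields(record_id, old, new, now=None):
    """Return one change log entry per field whose value differs."""
    now = now or datetime.now(timezone.utc)
    entries = []
    # Walk the union of field names so added and removed fields are logged too.
    for field in sorted(set(old) | set(new)):
        before, after = old.get(field), new.get(field)
        if before != after:
            entries.append(ChangeLogEntry(record_id, field, before, after, now))
    return entries
```

Calling `diff_fields(7, old_values, new_values)` with two dictionaries of field values yields the entries to append to the log for record 7.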
Multiple versions of data
Multiple versions of a person's record may arise in the following scenarios:
Change in a field value
A new, updated record arriving for the person, which would be termed a named version (typically from surveys)
Feedback call from the connected applications
Proposed solution: Use Elasticsearch as the backbone for storing and serving multiple versions of a record. The key technology components and strategies discussed are:
Debezium configuration: Debezium will be configured to capture real-time changes from the database's Write-Ahead Logs (WALs), ensuring that any modifications or additions to the data are promptly recorded.
Elasticsearch setup: Elasticsearch will serve as the primary destination for streaming the captured changes. Leveraging its indexing capabilities, Elasticsearch will efficiently organize and store the data, facilitating quick retrieval and analysis.
Indexing strategy: An indexing strategy will be devised to optimize the storage and retrieval of captured data. This strategy will accommodate multiple versions of records, ensuring that historical data remains accessible and searchable.
Authorization implementation: Authorization mechanisms will be implemented within Elasticsearch. This will control access to the API endpoints, ensuring that only authorized users can interact with the data.
API endpoints configuration: API endpoints will be configured in Elasticsearch to expose the captured data to authorized users. These endpoints will provide access to multiple versions of records, enabling users to retrieve and analyze data as needed. Clients should be able to fetch records by named version or by any other parameter, such as a timeframe.
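To make the retrieval requirement concrete, here is a minimal sketch that builds an Elasticsearch query body for fetching versions of one registrant by named version, by timeframe, or both. The index field names `registrant_id`, `version_name` and `captured_at` are assumptions for illustration; the actual index mapping is not defined in this document.

```python
def build_version_query(registrant_id, version_name=None, start=None, end=None):
    """Build a query body selecting versions of a single registrant.

    Field names are hypothetical; adjust them to the real index mapping.
    """
    filters = [{"term": {"registrant_id": registrant_id}}]
    if version_name is not None:
        # Named versions, for example a survey round.
        filters.append({"term": {"version_name": version_name}})
    time_range = {}
    if start is not None:
        time_range["gte"] = start
    if end is not None:
        time_range["lte"] = end
    if time_range:
        filters.append({"range": {"captured_at": time_range}})
    return {
        "query": {"bool": {"filter": filters}},
        # Newest version first.
        "sort": [{"captured_at": {"order": "desc"}}],
    }
```

The returned dictionary can be sent as the body of a search request. Filter clauses are used rather than scored matches because version lookup is an exact selection, not a relevance ranking.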
Relationships between people
Relationship functionality would primarily address family relationships.
Parent info for a registrant
Parent1Id
Parent2Id
IsAdopted
GuardianId
Spouse Relationship (new table)
Person1Id
Person2Id
IsMarried
MarriedOn
IsSeparated
SeparatedOn
The current thinking is to represent all possible relations with the above data structure.
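The following sketch models the two structures above as Python dataclasses and shows how one further relation (siblings) could be derived from the parent fields. The snake_case attribute names, the `registrant_id` key and the `siblings_of` helper are assumptions made for this example, not part of the design.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ParentInfo:
    registrant_id: int                 # assumed key linking back to the registrant
    parent1_id: Optional[int] = None
    parent2_id: Optional[int] = None
    is_adopted: bool = False
    guardian_id: Optional[int] = None


@dataclass
class SpouseRelationship:
    person1_id: int
    person2_id: int
    is_married: bool = True
    married_on: Optional[date] = None
    is_separated: bool = False
    separated_on: Optional[date] = None


def siblings_of(registrant_id, parent_infos):
    """Return IDs of registrants sharing at least one parent with registrant_id."""
    by_id = {p.registrant_id: p for p in parent_infos}
    me = by_id.get(registrant_id)
    if me is None:
        return set()
    my_parents = {me.parent1_id, me.parent2_id} - {None}
    return {
        p.registrant_id
        for p in parent_infos
        if p.registrant_id != registrant_id
        and my_parents & ({p.parent1_id, p.parent2_id} - {None})
    }
```

Other relations such as grandparents or in-laws could be derived in the same way by following parent and spouse links.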
Groups and Households
Attestation
Attestation table
id of the field
status (NEW, ATTESTED, REJECTED ..)
attested by
attestation datetime
comments
The status fields will come from business processes and real use cases.
User interface
UI required for the following:
Person to log in, view and update records
Admin to view and attest fields with comments
Download of CSV for chosen fields of registry
Upload of attested CSV
Bulk attestation
We should be able to download a CSV from the registry, apply bulk attestation, and upload the CSV back. The upload should trigger an update of the registry, the change log and the attestation table.
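The upload step could look roughly like the sketch below, which processes an attested CSV and updates an in-memory registry, change log and attestation table. The CSV column names (`record_id`, `field`, `value`, `status`, `attested_by`, `comments`), the function name and the rule that only ATTESTED rows change registry values are assumptions for illustration.

```python
import csv
import io
from datetime import datetime, timezone

# Only the statuses named in this document; the real list is expected to be longer.
VALID_STATUSES = {"NEW", "ATTESTED", "REJECTED"}


def apply_attested_csv(csv_text, registry, change_log, attestations, now=None):
    """Apply one uploaded CSV to the registry, change log and attestation table."""
    now = now or datetime.now(timezone.utc)
    for row in csv.DictReader(io.StringIO(csv_text)):
        status = row["status"].strip().upper()
        if status not in VALID_STATUSES:
            raise ValueError(f"unknown status: {row['status']!r}")
        record_id, field = int(row["record_id"]), row["field"]
        record = registry[record_id]
        old_value = record.get(field)
        # Assumption: only attested values are written back to the registry.
        if status == "ATTESTED" and row["value"] != old_value:
            record[field] = row["value"]
            change_log.append((record_id, field, old_value, row["value"], now))
        # One attestation row per (record, field), mirroring the attestation table.
        attestations[(record_id, field)] = {
            "status": status,
            "attested_by": row["attested_by"],
            "attestation_datetime": now,
            "comments": row["comments"],
        }
```

A real implementation would run this inside a database transaction so that a malformed row does not leave the three stores out of step.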
Search (WIP)
The SR may contain several million records (say, 20 million) capturing demographic fields of individuals. The number of demographic fields may be large, say 60-70 columns in the database. Search across various columns needs to be fast. The following methods are proposed for faster search of large data in the social registry.
Indexing: Index columns based on the most frequent queries.
OpenSearch: Use the reporting infrastructure to make all the data available in OpenSearch.
Citus: Consider splitting DB into multiple databases using Citus. This is an infrastructure layer.
Indexing should be the first approach, but if the number of columns is large, indexing will bloat the DB. In addition to indexing, we must make data available in OpenSearch. The data sent to OpenSearch would not contain PII. Any query on the SR would first be routed to OpenSearch to obtain a list of IDs. Given that list of IDs, a further query on the DB may then be performed to fetch the results. The OpenSearch framework will also help us generate reports and real-time stats. One issue with this approach is ensuring that no data is missing in OpenSearch. Some reconciliation will have to be done periodically to ensure that all IDs present in the DB are also available in OpenSearch.
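The two-step lookup and the periodic reconciliation can be sketched as follows. Here an in-memory list of documents stands in for the OpenSearch index and SQLite stands in for the registry database; the `registrant` table, its columns and all function names are hypothetical.

```python
import sqlite3


def search_ids(index_docs, **criteria):
    """Stand-in for an OpenSearch query: IDs of docs matching all criteria."""
    return [doc["id"] for doc in index_docs
            if all(doc.get(k) == v for k, v in criteria.items())]


def fetch_records(conn, ids):
    """Second step: fetch full rows (including PII) from the DB by ID."""
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    cur = conn.execute(
        f"SELECT id, name, district FROM registrant "
        f"WHERE id IN ({placeholders}) ORDER BY id",
        list(ids),
    )
    return cur.fetchall()


def missing_from_index(conn, index_docs):
    """Reconciliation: IDs present in the DB but absent from the search index."""
    db_ids = {row[0] for row in conn.execute("SELECT id FROM registrant")}
    return sorted(db_ids - {doc["id"] for doc in index_docs})
```

In this sketch the index documents carry only non-PII fields such as `district`, while names stay in the database. Any IDs returned by `missing_from_index` would be re-sent to the index by the reconciliation job.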
API
The SR exposes the following REST APIs:
TBD