Mobiloitte Group
AI Governance Under the DPDP Act: Training, Inference, and Erasure

The DPDP Act applies fully to AI workloads in India. The Act does not have a separate carve-out for AI training, AI inference, or AI-driven decisions. Personal data is personal data regardless of which system processes it.

But the operational implications for AI are specific and substantial. AI workloads create four categories of complication that traditional data processing does not: training data consent, inference and erasure propagation, automated decision explainability, and cross-border model hosting. This article walks through each in operational depth.

Complication 1: Training data consent

Using customer personal data to train AI models is processing under the DPDP Act. It requires consent, and consent for service delivery is not consent for AI training, because the Act treats them as different purposes.

The architectural problem is that most Indian enterprises now training AI models hold customer datasets collected over years, with consent text drafted before AI training was on anyone's mind. Backfilling consent is non-trivial: Data Principals must be re-contacted and given the chance to grant or refuse the specific purpose of AI training.

Three viable approaches:

● Obtain specific fresh consent for AI training, either as part of consent refresh cycles or at meaningful customer interactions
● Anonymise the data sufficiently that it falls outside the DPDP scope, recognising that 'anonymisation' has a high bar and that re-identification risk through combination with other data must be assessed
● Rely on a legitimate use provision where it genuinely applies, recognising that legitimate use provisions in DPDP are narrower than GDPR's legitimate interest basis and should not be assumed to cover AI training broadly

The wrong answer is to continue training AI on personal data with vague service-delivery consent and hope regulatory attention focuses elsewhere. As enforcement matures, this becomes increasingly risky.
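The purpose separation described above can be enforced as a gate in front of the training pipeline. The following is a minimal sketch, not a definitive implementation: the purpose labels, the `CustomerRecord` shape, and the `select_for_training` helper are all hypothetical names introduced for illustration, and a real system would read consent state from its own consent management records.

```python
from dataclasses import dataclass, field

# Hypothetical purpose labels; real systems would use the purposes named
# in the organisation's own consent notices.
SERVICE_DELIVERY = "service_delivery"
AI_TRAINING = "ai_training"


@dataclass
class CustomerRecord:
    principal_id: str
    payload: dict
    # Purposes the Data Principal has specifically consented to.
    consented_purposes: set = field(default_factory=set)


def select_for_training(records):
    """Split records into those with AI-training consent and those without."""
    eligible, needs_consent = [], []
    for record in records:
        if AI_TRAINING in record.consented_purposes:
            eligible.append(record)
        else:
            # Service-delivery consent alone is not treated as training consent.
            needs_consent.append(record)
    return eligible, needs_consent


records = [
    CustomerRecord("dp-001", {"city": "Pune"}, {SERVICE_DELIVERY}),
    CustomerRecord("dp-002", {"city": "Delhi"}, {SERVICE_DELIVERY, AI_TRAINING}),
]
eligible, needs_consent = select_for_training(records)
print([r.principal_id for r in eligible])       # ['dp-002']
print([r.principal_id for r in needs_consent])  # ['dp-001']
```

The `needs_consent` list is the natural input to a consent refresh campaign; records on it stay out of training until the Data Principal grants the specific purpose.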

Complication 2: AI inference and the right to erasure

When a Data Principal exercises the right to erasure, what happens to the AI systems that processed their data?

The Act does not require retraining of large foundation models for every individual erasure; that would be operationally impossible. But the Act does require that the specific personal data be removed from systems holding it. For AI workloads, this creates specific propagation requirements:

● Vector databases and embedding stores: personal data embedded into vectors must be removable, requiring the architecture to track which embeddings derive from which Data Principal records
● RAG knowledge bases: personal data in retrieval-augmented generation knowledge stores must be removable from the indexed content
● Training datasets retained for retraining: personal data in stored training corpora must be removable so that future model versions do not re-incorporate it
● Fine-tuned model weights: for fine-tuned models trained on personal data, the position is more complex; current regulatory interpretation generally accepts that base model weights need not be retrained for individual erasures, but the underlying training data should be removed so the next training round does not re-train on it

The practical implication: AI architectures need to be designed for erasure from day one, not retrofitted later. Lineage tracking from raw data through embeddings to model behaviour is essential infrastructure.
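One way to picture erasure propagation is a lineage index that records, at write time, which derived artefacts belong to which Data Principal. The sketch below is illustrative only: `LineageIndex`, `erase_principal`, and the store names are hypothetical, and plain dictionaries stand in for a real vector database, RAG index, and training corpus.

```python
from collections import defaultdict


class LineageIndex:
    """Maps each Data Principal to the derived artefacts holding their data."""

    def __init__(self):
        self._artefacts = defaultdict(set)  # principal_id -> {(store, key)}

    def record(self, principal_id, store, key):
        self._artefacts[principal_id].add((store, key))

    def pop(self, principal_id):
        # Returns and forgets the lineage entries for this principal.
        return self._artefacts.pop(principal_id, set())


def erase_principal(principal_id, lineage, stores):
    """Delete every tracked artefact for one Data Principal."""
    removed = []
    for store_name, key in sorted(lineage.pop(principal_id)):
        if key in stores[store_name]:
            del stores[store_name][key]
            removed.append((store_name, key))
    return removed


# Dictionaries standing in for real stores (assumption for illustration).
stores = {
    "embeddings": {"vec-1": [0.1, 0.2], "vec-2": [0.3, 0.4]},
    "rag_chunks": {"chunk-9": "support ticket text"},
    "training_corpus": {"row-42": "stored training example"},
}
lineage = LineageIndex()
lineage.record("dp-001", "embeddings", "vec-1")
lineage.record("dp-001", "rag_chunks", "chunk-9")
lineage.record("dp-001", "training_corpus", "row-42")
lineage.record("dp-002", "embeddings", "vec-2")

removed = erase_principal("dp-001", lineage, stores)
print(removed)
print(stores["embeddings"])  # only vec-2 remains
```

The design choice worth noting is that lineage is written when the embedding or chunk is created. Reconstructing it after the fact, from vectors alone, is generally not feasible.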

Complication 3: Automated decision-making and explainability

When AI makes a decision that significantly affects a Data Principal, the Data Principal has rights to information about the decision and its basis. Credit approval, claim settlement, hiring shortlisting, fraud flagging, and KYC outcomes all qualify.

AI explainability is therefore not optional in regulated decision contexts. The architectural implications:

● Decision logging: every AI-driven decision affecting a Data Principal must be logged with sufficient context to reconstruct the basis of the decision
● Explanation generation: when the Data Principal requests an explanation, the system must produce one that meaningfully describes why the decision was made, not just 'the model output 0.78'
● Human review pathway: the Data Principal should be able to request human review of the decision, with the human having access to the relevant information
● Adverse decision communication: when an AI-driven decision adversely affects the Data Principal, communication of the decision and the path to challenge it must be clear

For BFSI workloads, this connects directly to RBI's fair lending expectations, IRDAI's policyholder protection rules, and SEBI's investor protection framework. DPDP and the sector regulator together create a higher bar than either separately.
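The decision logging and explanation steps listed above can be sketched together. This is an assumption-laden illustration rather than a prescribed design: `DecisionRecord`, `log_decision`, and `explain` are hypothetical names, the log is an in-memory list, and the per-factor contributions are assumed to come from whatever attribution method the model team already uses.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class DecisionRecord:
    decision_id: str
    principal_id: str
    decision_type: str
    outcome: str
    score: float          # assumed here to be a risk score
    model_version: str
    top_factors: list     # (plain-language factor, signed contribution) pairs
    logged_at: str


def log_decision(log, **fields):
    """Append a timestamped, serialised record of one AI-driven decision."""
    record = DecisionRecord(
        logged_at=datetime.now(timezone.utc).isoformat(), **fields
    )
    log.append(json.dumps(asdict(record)))
    return record


def explain(record, max_factors=2):
    """Turn the logged factors into a plain-language explanation."""
    ranked = sorted(record.top_factors, key=lambda f: abs(f[1]), reverse=True)
    reasons = [
        f"{name} ({'increased' if weight > 0 else 'reduced'} the risk score)"
        for name, weight in ranked[:max_factors]
    ]
    return (
        f"Your {record.decision_type} was {record.outcome}. "
        f"Main factors: {'; '.join(reasons)}. "
        "You can request a human review of this decision."
    )


decision_log = []
record = log_decision(
    decision_log,
    decision_id="d-1",
    principal_id="dp-001",
    decision_type="credit application",
    outcome="declined",
    score=0.78,
    model_version="credit-risk-v3",
    top_factors=[
        ("two missed repayments in the last 12 months", 0.41),
        ("account age above five years", -0.12),
        ("high credit utilisation", 0.30),
    ],
)
print(explain(record))
```

Because the factors are stored with the decision, the explanation can be regenerated later for a human reviewer even if the model has since been retrained.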

Complication 4: Cross-border AI model hosting

Many AI workloads in India use foundation models hosted outside India by providers such as OpenAI, Anthropic, Google, AWS Bedrock, and others. DPDP's cross-border framework allows this by default, but with two specific risks:

The Central Government can restrict transfers to specific countries by notification. Organisations need to monitor the position and have contingency plans if a critical hosting destination becomes restricted. The contingency plan should include alternative hosting locations (Indian data centres, alternative cloud regions, on-premise inference where viable for the use case).

Sector regulators, RBI in particular, have their own data localisation rules that may go further than DPDP. Payment data localisation, for example, requires certain data to be stored only in India. AI workloads processing payment data must respect both DPDP cross-border rules and RBI localisation, typically meaning Indian-hosted inference or on-premise deployment.
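The two risks above can be expressed as a routing check applied before a request leaves for an inference endpoint. The sketch below is hypothetical throughout: the restricted-country set uses a placeholder code, the localised category list is an assumption, and the endpoint names are invented. The actual values would come from Central Government notifications and the relevant sector regulator's rules.

```python
from dataclasses import dataclass

# Placeholder values (assumptions): "XX" stands in for any notified
# restricted destination, and "payment" for data under localisation rules.
RESTRICTED_COUNTRIES = {"XX"}
LOCALISED_DATA_CATEGORIES = {"payment"}


@dataclass(frozen=True)
class InferenceEndpoint:
    name: str
    country: str  # ISO 3166-1 alpha-2 code of the hosting location


def allowed_endpoints(data_categories, endpoints):
    """Return the endpoints that may receive data of the given categories."""
    must_stay_in_india = bool(set(data_categories) & LOCALISED_DATA_CATEGORIES)
    allowed = []
    for endpoint in endpoints:
        if endpoint.country in RESTRICTED_COUNTRIES:
            continue  # destination restricted by notification
        if must_stay_in_india and endpoint.country != "IN":
            continue  # localised data cannot leave India
        allowed.append(endpoint)
    return allowed


endpoints = [
    InferenceEndpoint("in-hosted-model", "IN"),
    InferenceEndpoint("us-hosted-api", "US"),
    InferenceEndpoint("restricted-region-api", "XX"),
]
print([e.name for e in allowed_endpoints({"payment", "contact"}, endpoints)])
# ['in-hosted-model']
print([e.name for e in allowed_endpoints({"contact"}, endpoints)])
# ['in-hosted-model', 'us-hosted-api']
```

Keeping the restricted list as data rather than code means a new notification becomes a configuration change, and the fallback endpoint is already in the list.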

The unified AI governance posture

Mature AI governance under DPDP integrates all four complications into one architecture:

● Training data inventory: every dataset used for training, with consent state and purpose documentation
● Lineage tracking: from raw data to embeddings to model behaviour, sufficient to support erasure propagation
● Decision logging: for every Data Principal-affecting AI decision, with context sufficient to generate explanations
● Explanation infrastructure: automated generation of meaningful explanations for adverse decisions, with human review pathways
● Hosting topology: documented for each AI workload, with contingencies for cross-border restrictions
● AI-specific DPIA template: used for new AI deployments processing personal data, covering all four complications
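The checklist above can be kept as one record per AI workload so that gaps are visible in a single place. This is a minimal sketch with hypothetical field names (`AIWorkload`, `governance_gaps`); a real register would link to the underlying evidence rather than store booleans.

```python
from dataclasses import dataclass


@dataclass
class AIWorkload:
    name: str
    training_consent_documented: bool
    lineage_tracked: bool
    decisions_logged: bool
    explanations_available: bool
    hosting_documented: bool
    dpia_completed: bool


def governance_gaps(workload):
    """List the checklist items this workload has not yet satisfied."""
    checks = {
        "training data inventory": workload.training_consent_documented,
        "lineage tracking": workload.lineage_tracked,
        "decision logging": workload.decisions_logged,
        "explanation infrastructure": workload.explanations_available,
        "hosting topology": workload.hosting_documented,
        "AI-specific DPIA": workload.dpia_completed,
    }
    return [item for item, satisfied in checks.items() if not satisfied]


workload = AIWorkload(
    name="claims-triage-model",
    training_consent_documented=True,
    lineage_tracked=False,
    decisions_logged=True,
    explanations_available=False,
    hosting_documented=True,
    dpia_completed=True,
)
print(governance_gaps(workload))
# ['lineage tracking', 'explanation infrastructure']
```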

The shift to make

Stop treating DPDP as a privacy team concern and AI governance as a data science team concern.

Start treating AI governance under DPDP as one architecture, designed jointly by privacy, security, data science, and engineering, with shared accountability. The technical questions (lineage, embeddings, explainability) are not separable from the regulatory questions (consent, erasure, decision rights). The teams that build them together build durable advantages. The teams that build them separately discover the gaps the hard way.

Indian enterprises that build AI on DPDP-aligned architecture today are positioning themselves for a regulatory environment that will only tighten. The cost of building it right from the start is much lower than the cost of retrofitting after enforcement crystallises.

Avni Chadha

SEO Executive

Avni Chadha is an SEO Expert at Mobiloitte Technologies Pvt. Ltd., specializing in search engine optimization and strategic content writing. She focuses on building data-driven content strategies that improve search visibility, organic growth, and digital brand presence.
