303284: Add deduplication engine reference PK to Individual model and…#5799

Open
arsen-vs wants to merge 1 commit into develop from feature/303284-Use-ind-pk-instead-of-originating-id

Conversation

@arsen-vs

… related services

  • Introduced a new field deduplication_engine_reference_pk in the Individual model to facilitate communication with the biometric deduplication engine.
  • Updated the CreateLaxIndividuals endpoint to include this reference in the validated data.
  • Enhanced the BiometricDeduplicationService to utilize the new reference PK for individual identification during deduplication processes.
  • Added tests to ensure the correct usage of the deduplication reference PK in various service methods and endpoints.
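
The fallback behind these bullets can be sketched without Django. This is a minimal illustration, not the PR's code: `IndividualStub` and `reference_pk_for_individual` are hypothetical stand-ins, and the UUID primary key is an assumption. The only behavior taken from the diff under review is the expression `dedup_reference_pk or str(pk)`.

```python
from dataclasses import dataclass
from typing import Optional
from uuid import UUID


@dataclass
class IndividualStub:
    """Hypothetical stand-in for the Individual model (not the real class)."""

    pk: UUID  # assumption: the real primary key type may differ
    deduplication_engine_reference_pk: Optional[str] = None


def reference_pk_for_individual(individual: IndividualStub) -> str:
    """Return the identifier to send to the deduplication engine."""
    # Prefer the dedicated reference; otherwise fall back to the primary key.
    # An empty string is falsy, so it also triggers the fallback.
    return individual.deduplication_engine_reference_pk or str(individual.pk)
```

With this shape, records created before the field existed keep working, because they fall back to their own primary key.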

@codecov

codecov bot commented Mar 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.06%. Comparing base (3ac8685) to head (88866c4).

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #5799   +/-   ##
========================================
  Coverage    91.06%   91.06%           
========================================
  Files          500      500           
  Lines        34264    34281   +17     
  Branches      3540     3542    +2     
========================================
+ Hits         31201    31218   +17     
  Misses        2272     2272           
  Partials       791      791           
Flag Coverage Δ
e2e 52.75% <29.41%> (-0.02%) ⬇️
unit 90.68% <100.00%> (+<0.01%) ⬆️


migrations.AddField(
    model_name="individual",
    name="deduplication_engine_reference_pk",
    field=models.CharField(

unique?

        false_positive_pair = IgnoredFilenamesPair(first=individual1_photo, second=individual2_photo)
        self.api.report_false_positive_duplicate(false_positive_pair, program.unicef_id)

    def report_individuals_status(self, program: Program, individual_ids: list[str], action: str) -> None:

This function should now take individuals (a QuerySet) as an argument instead of individual_ids, to avoid unnecessary queries.


And then we can just do:

    id_to_reference_pk = {
        str(individual.pk): self._reference_pk_for_individual(individual)
        for individual in individuals.only("id", "deduplication_engine_reference_pk")
    }
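
The suggestion above can be sketched in plain Python. This assumes the caller already holds the individuals, so the mapping is built in a single pass with no further lookups. `IndividualStub` and `build_id_to_reference_pk` are hypothetical names for illustration, and the fallback to `str(pk)` mirrors the `dedup_reference_pk or str(pk)` expression in the diff rather than the actual body of `_reference_pk_for_individual`, which is not shown here.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class IndividualStub:
    """Hypothetical stand-in for an Individual row with only the two needed fields."""

    pk: str
    deduplication_engine_reference_pk: Optional[str] = None


def build_id_to_reference_pk(individuals: Iterable[IndividualStub]) -> dict[str, str]:
    """Map each individual's own id to the reference the engine knows it by."""
    return {
        # Assumption: fall back to the primary key when no reference is stored.
        str(ind.pk): ind.deduplication_engine_reference_pk or str(ind.pk)
        for ind in individuals
    }
```

In the real service the iterable would be the QuerySet restricted with `.only("id", "deduplication_engine_reference_pk")`, as the reviewer proposes.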

Comment on lines +363 to 386
  reference_to_individual_id: dict[str, str] = {}
  for pk, dedup_reference_pk in PendingIndividual.objects.filter(
      registration_data_import__in=rdis
  ).values_list(
      "pk", "deduplication_engine_reference_pk"
  ):
      reference_pk = dedup_reference_pk or str(pk)
      individual_ids.append(reference_pk)
      reference_to_individual_id[reference_pk] = str(pk)

  data = self.get_deduplication_set_results(program, individual_ids)
  similarity_pairs = [
      SimilarityPair(
          score=item["score"],
          status_code=item["status_code"],
-         first=item["first"]["reference_pk"] or None,
-         second=item["second"]["reference_pk"] or None,
+         first=self._resolve_individual_id_from_reference(
+             (item.get("first") or {}).get("reference_pk"),
+             reference_to_individual_id,
+         ),
+         second=self._resolve_individual_id_from_reference(
+             (item.get("second") or {}).get("reference_pk"),
+             reference_to_individual_id,
+         ),
      )

    pending_individuals = list(
        PendingIndividual.objects.filter(registration_data_import__in=rdis).only(
            "id", "deduplication_engine_reference_pk"
        )
    )
    reference_to_individual_id = {
        self._reference_pk_for_individual(individual): str(individual.pk)
        for individual in pending_individuals
    }

    data = self.get_deduplication_set_results(program, list(reference_to_individual_id))
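
The reverse lookup that turns engine results back into individual ids can be sketched as follows. The diff calls `_resolve_individual_id_from_reference` but does not show its body, so `resolve_individual_id` below is a guess at its behavior: it assumes that a missing or unknown reference resolves to `None`. `extract_pair_ids` is likewise a hypothetical helper, and the result item shape is taken from the `item.get("first")` access pattern in the diff.

```python
from typing import Optional


def resolve_individual_id(
    reference_pk: Optional[str],
    reference_to_individual_id: dict[str, str],
) -> Optional[str]:
    """Translate an engine reference back to an individual id."""
    # Assumption: empty, missing, or unknown references all resolve to None.
    if not reference_pk:
        return None
    return reference_to_individual_id.get(reference_pk)


def extract_pair_ids(
    item: dict,
    reference_to_individual_id: dict[str, str],
) -> tuple[Optional[str], Optional[str]]:
    """Resolve both sides of one result item, tolerating absent sides."""
    first_ref = (item.get("first") or {}).get("reference_pk")
    second_ref = (item.get("second") or {}).get("reference_pk")
    return (
        resolve_individual_id(first_ref, reference_to_individual_id),
        resolve_individual_id(second_ref, reference_to_individual_id),
    )
```

Whether an unknown reference should instead be passed through unchanged is a design question the excerpt does not settle.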
