perf(grades): optimize database queries for large-scale grade recalcu… by andrey-canon · Pull Request #38787 · openedx/openedx-platform

andrey-canon · 2026-06-19T16:53:14Z

Description

This PR addresses significant platform performance degradation caused by inefficient database queries during large-scale course grade recalculations.

The previous implementation relied on SQL OFFSET pagination, which forces the database to read and discard every skipped row before returning a batch: hundreds of thousands of rows for high-offset tasks. It also suffered from an N+1 query problem, fetching user data individually for every enrollment in a batch.
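To make the N+1 pattern concrete, here is a minimal sketch using an in-memory SQLite database as a stand-in for the real enrollment and user tables. The table names, the `CountingConnection` wrapper, and both helper functions are hypothetical and exist only for illustration; they are not the platform's code.

```python
import sqlite3


class CountingConnection:
    """Thin wrapper that counts how many SQL statements are issued."""

    def __init__(self, conn):
        self.conn = conn
        self.count = 0

    def execute(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params)


def build_db(rows=100):
    # Hypothetical two-table schema: one user per enrollment.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
    conn.execute("CREATE TABLE enrollments (id INTEGER PRIMARY KEY, user_id INTEGER)")
    conn.executemany("INSERT INTO users VALUES (?, ?)",
                     [(i, f"user{i}") for i in range(1, rows + 1)])
    conn.executemany("INSERT INTO enrollments VALUES (?, ?)",
                     [(i, i) for i in range(1, rows + 1)])
    conn.commit()
    return conn


def usernames_n_plus_one(db, batch):
    # One query for the batch, then one extra query per enrollment.
    rows = db.execute(
        "SELECT id, user_id FROM enrollments ORDER BY id LIMIT ?", (batch,)
    ).fetchall()
    return [
        db.execute("SELECT username FROM users WHERE id = ?", (user_id,)).fetchone()[0]
        for _, user_id in rows
    ]


def usernames_joined(db, batch):
    # A single JOIN returns the same data in one round trip.
    rows = db.execute(
        "SELECT u.username FROM enrollments e "
        "JOIN users u ON u.id = e.user_id ORDER BY e.id LIMIT ?",
        (batch,),
    ).fetchall()
    return [row[0] for row in rows]


conn = build_db()

slow_db = CountingConnection(conn)
slow_names = usernames_n_plus_one(slow_db, 100)
slow_queries = slow_db.count

fast_db = CountingConnection(conn)
fast_names = usernames_joined(fast_db, 100)
fast_queries = fast_db.count

print(f"N+1:  {slow_queries} queries")   # N+1:  101 queries
print(f"JOIN: {fast_queries} queries")   # JOIN: 1 queries
```

Both helpers return the same usernames; only the number of round trips differs, which is the effect the description attributes to per-enrollment user lookups.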

Changes

- Keyset Pagination: Refactored _course_task_args and compute_grades_for_course to use start_id (ID-based seeking) instead of offset. Each batch now costs an index seek plus the batch size, so lookup time no longer grows with the batch's position in the course.
- Database Optimization: Replaced order_by('created') with order_by('id') to leverage the Primary Key clustered index.
- Eager Loading: Added .select_related('user') to the enrollment QuerySet to fetch user data in a single JOIN query, eliminating the 100 extra queries per batch (one per enrollment at a batch size of 100).
- Memory Efficiency: Used .values_list('id', flat=True) in the task generator to minimize memory footprint when handling courses with 400k+ enrollments.
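The keyset approach can be sketched as follows, again against an in-memory SQLite table rather than the real models. The functions `batch_start_ids` and `fetch_batch`, the table layout, and the course labels are assumptions made for illustration; the actual `_course_task_args` and `compute_grades_for_course` may differ.

```python
import sqlite3


def batch_start_ids(conn, course_id, batch_size):
    """Yield the first enrollment id of each batch, streaming ids in primary-key order."""
    cursor = conn.execute(
        "SELECT id FROM enrollments WHERE course_id = ? ORDER BY id", (course_id,)
    )
    for index, (enrollment_id,) in enumerate(cursor):
        if index % batch_size == 0:
            yield enrollment_id


def fetch_batch(conn, course_id, start_id, batch_size):
    """Seek straight to start_id instead of skipping rows with OFFSET."""
    rows = conn.execute(
        "SELECT id FROM enrollments WHERE course_id = ? AND id >= ? "
        "ORDER BY id LIMIT ?",
        (course_id, start_id, batch_size),
    ).fetchall()
    return [row[0] for row in rows]


def fetch_batch_offset(conn, course_id, offset, batch_size):
    """Legacy-style lookup, kept only to compare results."""
    rows = conn.execute(
        "SELECT id FROM enrollments WHERE course_id = ? "
        "ORDER BY id LIMIT ? OFFSET ?",
        (course_id, batch_size, offset),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrollments (id INTEGER PRIMARY KEY, course_id TEXT)")
# Ids 1..25 alternate between two hypothetical courses, so course "A" has gaps.
conn.executemany(
    "INSERT INTO enrollments VALUES (?, ?)",
    [(i, "A" if i % 2 else "B") for i in range(1, 26)],
)
conn.commit()

starts = list(batch_start_ids(conn, "A", 5))
batches = [fetch_batch(conn, "A", start_id, 5) for start_id in starts]

print(starts)      # [1, 11, 21]
print(batches[1])  # [11, 13, 15, 17, 19]
```

Each task only needs its own `start_id`, so the work done by the database for a batch depends on the batch size rather than on how far into the course the batch sits. The rows returned match what the equivalent OFFSET query would produce when both order by `id`.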

How to Test

Run the following script in the Django shell (python manage.py lms shell) on a high-enrollment course:

from common.djangoapps.student.models import CourseEnrollment
from opaque_keys.edx.keys import CourseKey
import time
from django.db import connection, reset_queries

course_key = CourseKey.from_string("your/course/id")
batch = 100
offset_test = 440000

# Benchmark Legacy Logic
# Note: connection.queries is only populated when settings.DEBUG is True.
reset_queries()
st = time.time()
enrollments_legacy = CourseEnrollment.objects.filter(course_id=course_key).order_by('created')[offset_test:offset_test + batch]
ids_legacy = [e.user.id for e in enrollments_legacy]
print(f"Legacy Time: {time.time() - st:.4f}s | Queries: {len(connection.queries)}")

# Benchmark Optimized Logic
start_id = CourseEnrollment.objects.filter(course_id=course_key).order_by('id')[offset_test].id
reset_queries()
st = time.time()
enrollments_new = CourseEnrollment.objects.filter(course_id=course_key, id__gte=start_id).select_related('user').order_by('id')[:batch]
ids_new = [e.user.id for e in enrollments_new]
print(f"Optimized Time: {time.time() - st:.4f}s | Queries: {len(connection.queries)}")

Performance Benchmarks

| Metric | Original (Offset + N+1) | Optimized (Seek + Join) | Improvement |
| --- | --- | --- | --- |
| Execution Time | ~0.9614s | ~0.0388s | ~25x faster |
| DB Queries | 101 | 1 | 100 fewer queries |
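As a quick sanity check, the improvement column can be derived from the measured values above. The variable names are illustrative only.

```python
legacy_seconds = 0.9614
optimized_seconds = 0.0388
legacy_queries = 101
optimized_queries = 1

# Ratio of the two timings and the difference in query counts.
speedup = legacy_seconds / optimized_seconds
queries_saved = legacy_queries - optimized_queries

print(f"{speedup:.1f}x faster, {queries_saved} fewer queries")
# 24.8x faster, 100 fewer queries
```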

openedx-webhooks · 2026-06-19T16:53:20Z

Thanks for the pull request, @andrey-canon!

This repository is currently maintained by @openedx/wg-maintenance-openedx-platform.

Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review.

🔘 Get product approval

If you haven't already, check this list to see if your contribution needs to go through the product review process.

- If it does, you'll need to submit a product proposal for your contribution, and have it reviewed by the Product Working Group.
  - This process (including the steps you'll need to take) is documented here.
- If it doesn't, simply proceed with the next step.

🔘 Provide context

To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:

- Dependencies: This PR must be merged before / after / at the same time as ...
- Blockers: This PR is waiting for OEP-1234 to be accepted.
- Timeline information: This PR must be merged by XX date because ...
- Partner information: This is for a course on edx.org.
- Supporting documentation
- Relevant Open edX discussion forum threads

🔘 Get a green build

If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.

Where can I find more information?

If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:

When can I expect my changes to be merged?

Our goal is to get community contributions seen and reviewed as efficiently as possible.

However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:

- The size and impact of the changes that it introduces
- The need for product review
- Maintenance status of the parent repository

💡 As a result it may take up to several weeks or months to complete a review and merge your PR.

…lations Replaced inefficient SQL OFFSET pagination with ID-based keyset pagination to ensure consistent lookup performance, and updated ordering to leverage the primary key index. Resolved an N+1 query issue by eagerly loading user data via `.select_related('user')` and optimized memory footprint using `.values_list()`. These changes reduce execution time by ~25x and eliminate 100 redundant queries per batch during high-enrollment course processing.

openedx-webhooks added the open-source-contribution label (PR author is not from Axim or 2U) Jun 19, 2026

openedx-webhooks added this to Contributions Jun 19, 2026

github-project-automation Bot moved this to Needs Triage in Contributions Jun 19, 2026

andrey-canon force-pushed the and/optimize-grade-recalculation-queries branch from b72ca72 to 0c2403b June 19, 2026 19:05

andrey-canon wants to merge 1 commit into openedx:master from eduNEXT:and/optimize-grade-recalculation-queries
