perf(grades): optimize database queries for large-scale grade recalcu…#38787
andrey-canon wants to merge 1 commit
…lations
Replaced inefficient SQL OFFSET pagination with ID-based keyset pagination to ensure consistent lookup performance, and updated ordering to leverage the primary key index. Resolved an N+1 query issue by eagerly loading user data via `.select_related('user')` and optimized memory footprint using `.values_list()`.
These changes reduce execution time by ~25x and eliminate 100 redundant queries per batch during high-enrollment course processing.
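
To make the difference between the two pagination styles concrete, here is a minimal sketch using the standard-library `sqlite3` module rather than the Django ORM. It is not the code from this PR: the tables `demo_user` and `demo_enrollment`, the helper names, and the choice to treat `start_id` as an exclusive lower bound are all assumptions made for illustration.

```python
import sqlite3


def make_demo_db(n_enrollments=25):
    """Create an in-memory database with hypothetical user and enrollment tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE demo_user (id INTEGER PRIMARY KEY, username TEXT);
        CREATE TABLE demo_enrollment (
            id INTEGER PRIMARY KEY,
            user_id INTEGER REFERENCES demo_user(id),
            course_id TEXT
        );
        """
    )
    for i in range(1, n_enrollments + 1):
        conn.execute(
            "INSERT INTO demo_user (id, username) VALUES (?, ?)", (i, f"user{i}")
        )
        conn.execute(
            "INSERT INTO demo_enrollment (user_id, course_id) VALUES (?, ?)",
            (i, "course-v1:demo"),
        )
    return conn


def offset_batch(conn, course_id, offset, batch_size):
    """OFFSET pagination: the database walks past `offset` rows before returning any."""
    return conn.execute(
        """
        SELECT e.id, u.username
        FROM demo_enrollment AS e
        JOIN demo_user AS u ON u.id = e.user_id
        WHERE e.course_id = ?
        ORDER BY e.id
        LIMIT ? OFFSET ?
        """,
        (course_id, batch_size, offset),
    ).fetchall()


def keyset_batch(conn, course_id, start_id, batch_size):
    """Keyset pagination: seek directly to ids greater than `start_id` (assumed exclusive)."""
    return conn.execute(
        """
        SELECT e.id, u.username
        FROM demo_enrollment AS e
        JOIN demo_user AS u ON u.id = e.user_id
        WHERE e.course_id = ? AND e.id > ?
        ORDER BY e.id
        LIMIT ?
        """,
        (course_id, start_id, batch_size),
    ).fetchall()


def iter_keyset_batches(conn, course_id, batch_size):
    """Yield successive batches, carrying the last seen id forward as the next start_id."""
    start_id = 0
    while True:
        rows = keyset_batch(conn, course_id, start_id, batch_size)
        if not rows:
            return
        yield rows
        start_id = rows[-1][0]


if __name__ == "__main__":
    conn = make_demo_db()
    for batch in iter_keyset_batches(conn, "course-v1:demo", batch_size=10):
        print(len(batch), batch[0], batch[-1])
```

Both helpers join the user table in the same statement, which is roughly what `.select_related('user')` asks the ORM to do, so each batch costs one query instead of one per enrollment. The two functions return the same rows here only because the demo ids are contiguous; with gaps in the id sequence, a keyset query still returns the next `batch_size` rows after `start_id`, whereas offsets and ids no longer line up.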
Description
This PR addresses significant platform performance degradation caused by inefficient database queries during large-scale course grade recalculations.
The previous implementation relied on SQL `OFFSET` pagination, which forced the database to perform full index scans and discard hundreds of thousands of rows for high-offset tasks. Additionally, it suffered from an N+1 query problem by fetching user data individually for every enrollment in a batch.
Changes
- Updated `_course_task_args` and `compute_grades_for_course` to use `start_id` (ID-based seeking) instead of `offset`. This ensures near-constant (index-seek) database lookup performance regardless of the course size.
- Replaced `order_by('created')` with `order_by('id')` to leverage the Primary Key clustered index.
- Added `.select_related('user')` to the enrollment QuerySet to fetch user data in a single `JOIN` query, eliminating the N+1 queries.
- Used `.values_list('id', flat=True)` in the task generator to minimize memory footprint when handling courses with 400k+ enrollments.
How to Test
Run the following script in the Django shell (`python manage.py lms shell`) on a high-enrollment course:
Performance Benchmarks