Tuesday, June 27, 2023

Dave Computes Cohort Groups for Hybrid School during COVID-19

To permit social distancing during the COVID-19 pandemic, the school where I teach decided on a hybrid learning model. Each day, half the students in grades 6 - 12 would come to school in person and the other half would stay home and learn online. The school then changed its plan to split students into 4 groups, with only 1 group attending school at a time. Given the complexities of our schedule, I was skeptical that this could be done without very lopsided class sizes. And I was certain the school had no software or process to create these groups. I therefore wrote a program to put students into groups that would allow for balanced class sizes. I was shocked when it successfully generated fairly balanced classes. Here's the algorithm I used.

1.  I assign each student to a random group.

2.  I determine how balanced each section is. For example, suppose a particular section of some class has 18 students. If we're dividing students into 2 groups, then ideally there'd be 9 students in each group. But we'd be fairly happy to have 10 students in one group and 8 in the other. We'd be much less happy with an 11-7 split, and quite disappointed with a 12-6 split.

I want to pick a numerical score that captures how lopsided the split is. One way is to look at how much each group deviates from the ideal size. A 9-9 split would have a score of 0. A 10-8 split would have a score of 1, since 10 is one more than ideal and 8 is one less than ideal. An 11-7 split would have a score of 2, and 12-6 would have a score of 3. Technically, this number is called the standard deviation.

The problem with this approach is that, while 10-8 is only slightly worse than 9-9, intuitively 12-6 feels MUCH worse than 11-7. I therefore used the square of the deviation as my lopsidedness score. Thus, 11-7 has a score of 2 × 2 = 4, while 12-6 has a score of 3 × 3 = 9, as shown below.

Grp1  Grp2  StDev  Score
 9      9    0     0 × 0 = 0
10      8    1     1 × 1 = 1
11      7    2     2 × 2 = 4
12      6    3     3 × 3 = 9

(Technically, my lopsidedness score is called the variance, and it can be calculated for any number of groups--not just 2. For example, if students are split into 3 groups, then the ideal group size for a section of 18 students is 18 / 3 = 6 (technically called the arithmetic mean). If the students in this section are split 8-7-3, then the variance is [(8 - 6)² + (7 - 6)² + (3 - 6)²] / 3 ≈ 4.67.)

3.  I calculate the lopsidedness score for every section in the schedule. I add up all of those scores. That sum tells me how lopsided the entire schedule is.

4.  I pick a random student and I assign them to a different random group. Then I calculate the lopsidedness score for the entire schedule again. If the new score is higher than the old one, then the schedule is less balanced than before. I would therefore return the student to their original group. If, however, the new score is better (lower) than the old one, then the schedule is more balanced and I keep the student in the new group.

5. I continue assigning random students to random groups that reduce the lopsidedness score. Eventually, I reach a point where any change would make the score worse. I have therefore reached a local minimum, and I save these results.

6. I repeat this entire process again, assigning all students to random groups and then randomly assigning individual students to different groups until I can't improve the score anymore. Sometimes the final lopsidedness score is better and sometimes it's worse. Whenever I arrive at a new record (lower than all previous records), I save the results.

Though not optimal, the results were quite satisfactory. (Note that finding the optimal schedule for n students is an NP-Complete problem, and is therefore intractable for the number of students involved here.)

A couple of additional considerations. I kept all students with the same last name in the same cohort. That way, students in the same household would have school on the same day. I also modified the code to swap random students, in addition to swapping which cohort a single student was assigned to.

No comments:

Post a Comment