Data Science at Home: How to Solve the Nanny Scheduling Problem with Genetic and Monte Carlo Algorithms | by Courtney Perigo | Sep, 2024

Armed with the simulation of all the possible ways our timeline can throw us curveballs, I knew it was time to apply some heavy-hitting optimization techniques. Enter genetic algorithms, an optimization method inspired by natural selection that finds the best solution by iteratively evolving a population of candidate solutions.

Photo by Sangharsh Lohakare on Unsplash

In this case, each “candidate” was a potential set of nanny characteristics, such as availability and flexibility. The algorithm evaluates different nanny characteristics and iteratively improves them to find the one that fits our family’s needs. The result? A highly optimized nanny with scheduling preferences that balance our parental coverage gaps with nanny availability.

At the heart of this approach is what I like to call the “nanny chromosome.” In terms of genetic algorithms, a chromosome is simply a way of representing possible solutions, in our case different characteristics of the nanny. Each “nanny chromosome” has a set of characteristics that defines a schedule: the number of days per week the nanny can work, the maximum hours the nanny can cover in a day, and the nanny’s flexibility to accommodate different start times. These characteristics are the building blocks of each possible nanny schedule the algorithm considers.

Definition of nanny chromosome

In genetic algorithms, a “chromosome” represents a possible solution; in this case, it is the set of characteristics that defines a nanny’s schedule. Here’s how we define a nanny’s characteristics:

import numpy as np

# Function to generate nanny characteristics
def generate_nanny_characteristics():
    return {
        'flexible': np.random.choice([True, False]),  # Nanny's flexibility
        'days_per_week': np.random.choice([3, 4, 5]),  # Days available per week
        'hours_per_day': np.random.choice([6, 7, 8, 9, 10, 11, 12])  # Hours available per day
    }

Each nanny's schedule is defined by their flexibility (whether they can adjust their start time), the number of days they are available per week, and the maximum number of hours they can work per day. This gives the algorithm the flexibility to evaluate a wide variety of potential schedules.
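To get a feel for how large this search space is, the short sketch below enumerates every combination of the three genes. The value lists mirror the ones in generate_nanny_characteristics; the names GENE_VALUES and all_chromosomes are illustrative and not part of the original code.

```python
from itertools import product

# Gene values copied from generate_nanny_characteristics.
GENE_VALUES = {
    'flexible': (True, False),
    'days_per_week': (3, 4, 5),
    'hours_per_day': (6, 7, 8, 9, 10, 11, 12),
}

def all_chromosomes():
    """Return every possible nanny chromosome as a list of dicts (illustrative helper)."""
    keys = list(GENE_VALUES)
    return [dict(zip(keys, combo)) for combo in product(*GENE_VALUES.values())]

chromosomes = all_chromosomes()
print(len(chromosomes))  # 2 * 3 * 7 = 42 combinations
```

With only 42 combinations, the space is small enough to enumerate outright, which fits the later remark that a genetic algorithm may have been overkill for this task.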

Preparing a schedule for each nanny

Once the nanny's characteristics have been defined, we need to generate a weekly schedule that fits those constraints:

# Function to calculate a weekly schedule based on nanny's characteristics
def calculate_nanny_schedule(characteristics, num_days=5):
    shifts = []
    for _ in range(num_days):
        # Flexible nannies have varying start times; others start at 9
        start_hour = np.random.randint(6, 12) if characteristics['flexible'] else 9
        # Calculate end hour based on hours per day
        end_hour = start_hour + characteristics['hours_per_day']
        shifts.append((start_hour, end_hour))
    return shifts  # Return the generated weekly schedule

This function creates a nanny's schedule based on their flexibility and working hours. Flexible nannies get a random start hour from 6:00 to 11:00 (the upper bound of np.random.randint is exclusive), while the others always start at 9:00. Every shift ends hours_per_day hours after it starts. This allows the algorithm to evaluate a variety of possible weekly schedules.
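A schedule of (start_hour, end_hour) shifts only becomes useful once it is compared against the hours the family actually needs. The fitness function is not shown in this section, so the following is only a minimal sketch of one ingredient it might use: counting how many needed hours a single shift covers. The names covered_hours and needed_window are hypothetical.

```python
def covered_hours(shift, needed_window):
    """Hours of overlap between a nanny shift and a needed childcare window.

    Both arguments are (start_hour, end_hour) tuples on a 24-hour clock.
    Hypothetical helper, not the author's fitness function.
    """
    shift_start, shift_end = shift
    need_start, need_end = needed_window
    overlap = min(shift_end, need_end) - max(shift_start, need_start)
    return max(0, overlap)  # No overlap yields 0 rather than a negative number

# A fixed nanny starting at 9 and working 8 hours, against an 8:00-18:00 need
print(covered_hours((9, 17), (8, 18)))  # 8
# A flexible nanny starting at 6 for 6 hours covers only 8:00-12:00 of that need
print(covered_hours((6, 12), (8, 18)))  # 4
```

Summing this value over the shifts in a week would give one possible coverage measure; how the original fitness function weighs coverage against other factors is not stated here.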

Selecting the best candidates

Once we have generated an initial population of nanny schedules, we use a fitness function to evaluate which ones best fit our childcare needs. The most suitable schedules are selected to be passed on to the next generation:

# Function for selection in genetic algorithm
def selection(population, fitness_scores, num_parents):
    # Shift scores so the minimum is zero when any score is negative
    min_fitness = np.min(fitness_scores)
    if min_fitness < 0:
        fitness_scores = fitness_scores - min_fitness

    # Normalize fitness scores into probabilities (uniform if they sum to zero)
    fitness_scores_sum = np.sum(fitness_scores)
    probabilities = fitness_scores / fitness_scores_sum if fitness_scores_sum != 0 else np.ones(len(fitness_scores)) / len(fitness_scores)

    # Select parents (with replacement) based on their fitness scores
    selected_parents = np.random.choice(population, size=num_parents, p=probabilities)
    return selected_parents

In the selection step, the algorithm scores the population of nanny schedules with a fitness function that measures how well the nanny’s availability aligns with the family’s needs. Parents are then sampled with probability proportional to fitness, so the schedules that best cover the required hours are the most likely to become the “parents” of the next generation.
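To make the normalization concrete, here is a small worked sketch that reproduces the probability calculation from selection() on its own. The helper name selection_probabilities and the example scores are made up for illustration.

```python
import numpy as np

def selection_probabilities(fitness_scores):
    """Mirror the normalization used in selection(): shift, then divide by the sum."""
    scores = np.asarray(fitness_scores, dtype=float)
    min_fitness = scores.min()
    if min_fitness < 0:
        scores = scores - min_fitness  # Worst individual now has score 0
    total = scores.sum()
    if total == 0:
        return np.ones(len(scores)) / len(scores)  # Fall back to uniform sampling
    return scores / total

# Scores -2, 0, 6 are shifted to 0, 2, 8, giving probabilities 0.0, 0.2, 0.8
print(selection_probabilities([-2, 0, 6]))
# All-zero scores fall back to equal probabilities of 0.25 each
print(selection_probabilities([0, 0, 0, 0]))
```

One consequence of the shift is that, whenever any score is negative, the lowest-scoring individual ends up with probability zero and cannot be chosen as a parent.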

Adding mutations to keep things interesting

To avoid getting stuck in suboptimal solutions, we add a bit of randomness through mutation. This allows the algorithm to explore new possibilities by occasionally modifying the nanny's schedule:

# Function to mutate nanny characteristics
def mutate_characteristics(characteristics, mutation_rate=0.1):
    # Each gene is mutated independently with probability mutation_rate
    if np.random.rand() < mutation_rate:
        characteristics['flexible'] = not characteristics['flexible']
    if np.random.rand() < mutation_rate:
        characteristics['days_per_week'] = np.random.choice([3, 4, 5])
    if np.random.rand() < mutation_rate:
        characteristics['hours_per_day'] = np.random.choice([6, 7, 8, 9, 10, 11, 12])
    return characteristics

By introducing small mutations, the algorithm can explore new schedules that would not otherwise have been considered. This diversity is important to avoid local optima and improve the solution over several generations.
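As a rough check on how much randomness a mutation_rate of 0.1 introduces, the arithmetic below works out how often a child is actually altered. It assumes the three gene checks are independent, as they are in mutate_characteristics, and accounts for the fact that redrawing days_per_week or hours_per_day can land on the same value again. The function name is illustrative.

```python
def chance_child_changes(mutation_rate=0.1, n_days_options=3, n_hours_options=7):
    """Probability that at least one gene ends up with a different value."""
    # Flipping 'flexible' always changes the value
    p_flexible = mutation_rate
    # A uniform redraw repeats the old value with probability 1 / n_options
    p_days = mutation_rate * (n_days_options - 1) / n_days_options
    p_hours = mutation_rate * (n_hours_options - 1) / n_hours_options
    p_unchanged = (1 - p_flexible) * (1 - p_days) * (1 - p_hours)
    return 1 - p_unchanged

print(round(chance_child_changes(), 3))  # 0.232
```

Under these assumptions, roughly one child in four differs from what crossover alone would have produced.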

Evolving towards the perfect schedule

The final step is evolution. With selection and mutation in place, the genetic algorithm repeats over several generations, aiming to develop better nanny schedules in each round. Here is how we implemented the evolution process:

# Function to evolve nanny characteristics over multiple generations
# (fitness_function_yearly is not shown in this section)
def evolve_nanny_characteristics(all_childcare_weeks, population_size=1000, num_generations=10):
    # Initialize the population
    population = [generate_nanny_characteristics() for _ in range(population_size)]

    for generation in range(num_generations):
        print(f"\n--- Generation {generation + 1} ---")
        fitness_scores = []
        hours_worked_collection = []
        for characteristics in population:
            fitness_score, yearly_hours_worked = fitness_function_yearly(characteristics, all_childcare_weeks)
            fitness_scores.append(fitness_score)
            hours_worked_collection.append(yearly_hours_worked)
        fitness_scores = np.array(fitness_scores)

        # Find and store the best individual of this generation
        max_fitness_idx = np.argmax(fitness_scores)
        best_nanny = population[max_fitness_idx]
        best_nanny['actual_hours_worked'] = hours_worked_collection[max_fitness_idx]

        # Select parents and generate a new population
        parents = selection(population, fitness_scores, num_parents=population_size // 2)
        new_population = []
        # Each pair of parents produces one child
        for i in range(0, len(parents), 2):
            parent_1, parent_2 = parents[i], parents[i + 1]
            child = {
                'flexible': np.random.choice([parent_1['flexible'], parent_2['flexible']]),
                'days_per_week': np.random.choice([parent_1['days_per_week'], parent_2['days_per_week']]),
                'hours_per_day': np.random.choice([parent_1['hours_per_day'], parent_2['hours_per_day']])
            }
            child = mutate_characteristics(child)
            new_population.append(child)
        population = new_population  # Replace the population with the new generation

    return best_nanny  # Return the best nanny from the final generation

Here, the algorithm evolves over several generations, selecting nanny schedules based on their fitness scores and allowing new solutions to emerge through crossover and mutation. After several generations, the algorithm converges on a schedule that optimizes coverage for our family.
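The child-building step inside the loop is a form of uniform crossover: each gene is copied from one of the two parents with equal probability. A minimal standalone sketch is shown below; the function name crossover and the use of a seeded np.random.default_rng generator are assumptions made for reproducibility, not part of the original code.

```python
import numpy as np

def crossover(parent_1, parent_2, rng):
    """Uniform crossover: pick each gene from either parent with probability 0.5."""
    child = {}
    for gene in ('flexible', 'days_per_week', 'hours_per_day'):
        donor = parent_1 if rng.random() < 0.5 else parent_2
        child[gene] = donor[gene]
    return child

rng = np.random.default_rng(seed=0)  # Seeded so repeated runs give the same child
parent_1 = {'flexible': True, 'days_per_week': 5, 'hours_per_day': 8}
parent_2 = {'flexible': False, 'days_per_week': 3, 'hours_per_day': 10}
child = crossover(parent_1, parent_2, rng)
print(child)  # Every gene value comes from one of the two parents
```

Because genes are inherited independently, a child can combine, for example, one parent's flexibility with the other parent's hours per day.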

Final thoughts

Using this approach, we applied genetic algorithms to iteratively improve nanny schedules, ensuring that the selected schedule could handle the chaos of Parent 2’s unpredictable work shifts while also balancing the needs of our family. Genetic algorithms may have been overkill for the task, but they allowed us to explore several possibilities and optimize the solution over time.

The images below depict the evolution of the nannies' fitness scores over time. The algorithm was able to quickly converge on the best nanny chromosome after just a few generations.
