Take home
We were able to write a script and scrape the Google index for all of the reviews customers have left for their stays with sitters.
We have saved that information in the attached CSV.
Your command-line program should import the data and hold it in memory (no need to use an actual database), then use it to recreate our search algorithm. At Rover, we write domain-driven code, so using an object-oriented or other domain-centric approach will set you up for success in the in-person interview. For example, one viable approach is modeling your data as if you were going to save it using a relational database.
If you are using a language that doesn't traditionally use an OO or domain-centric approach, please discuss this with the person who sent you this prompt before beginning work, so that you have the best chance of success if you advance to the in-person interview.
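As an illustration of the relational-style modeling mentioned above, here is a minimal sketch in Python. The class names, field names, and the SitterRepository helper are all hypothetical; the attached CSV's actual columns are not described here and may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Stay:
    # Hypothetical shape of one review row: who was rated, by whom, and how.
    sitter_email: str
    owner_email: str
    rating: int


@dataclass
class Sitter:
    # Email acts as the primary key, as it might in a sitters table.
    email: str
    name: str
    stays: List[Stay] = field(default_factory=list)


class SitterRepository:
    """In-memory stand-in for a sitters table, keyed by email (an assumption)."""

    def __init__(self) -> None:
        self._sitters: Dict[str, Sitter] = {}

    def add_stay(self, sitter_email: str, sitter_name: str,
                 owner_email: str, rating: int) -> Sitter:
        # Create the sitter on first sight, then attach the stay to it.
        sitter = self._sitters.get(sitter_email)
        if sitter is None:
            sitter = Sitter(sitter_email, sitter_name)
            self._sitters[sitter_email] = sitter
        sitter.stays.append(Stay(sitter_email, owner_email, rating))
        return sitter

    def all(self) -> List[Sitter]:
        return list(self._sitters.values())
```

Keeping stays as their own records, rather than folding ratings into the sitter, mirrors a one-to-many relationship between a sitters table and a stays table.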
Here's how the search ranking algorithm will work:
For each sitter, we first calculate a Profile Score and a Ratings Score. These are then used to calculate the overall Search Score, which is used for search rankings.
The Profile Score is 5 times the fraction of the English alphabet covered by the distinct letters in what we've recovered of the sitter's name. For example, the sitter name Leilani R. has 6 distinct letters (l, e, i, a, n, r), which gives a Profile Score of 5 × 6/26 ≈ 1.15.
The Ratings Score is the average of their stay ratings.
The Search Score is a weighted average of the Profile Score and Ratings Score. When a sitter has no stays, their Search Score is equal to the Profile Score. When a sitter has 10 or more stays, their Search Score is equal to the Ratings Score. The idea is that as a sitter gets more reviews, we will weigh the Ratings Score more heavily.
Scores should contain exactly two decimal places.
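The three scores described above could be sketched as follows. This is one possible reading, not a reference implementation. The prompt fixes only the endpoints of the weighted average (0 stays and 10 or more stays), so the linear blend in between is an assumption, as are the function names and the choice to return 0.0 as the Ratings Score of a sitter with no stays.

```python
import string
from typing import Sequence

ALPHABET_SIZE = 26        # letters in the English alphabet
FULL_WEIGHT_STAYS = 10    # stays at which only the Ratings Score counts


def profile_score(name: str) -> float:
    # Count distinct English letters, ignoring case, spaces, and punctuation.
    letters = {ch for ch in name.lower() if ch in string.ascii_lowercase}
    return 5 * len(letters) / ALPHABET_SIZE


def ratings_score(ratings: Sequence[float]) -> float:
    # Average of stay ratings; 0.0 for no stays is an assumption.
    return sum(ratings) / len(ratings) if ratings else 0.0


def search_score(profile: float, ratings: float, stay_count: int) -> float:
    # Assumed linear blend: the ratings weight grows by 0.1 per stay, capped at 1.
    weight = min(stay_count, FULL_WEIGHT_STAYS) / FULL_WEIGHT_STAYS
    return profile * (1 - weight) + ratings * weight


def format_score(score: float) -> str:
    # Always render exactly two decimal places, e.g. 2.5 -> "2.50".
    return f"{score:.2f}"


stay_ratings = [5, 4, 4]
p = profile_score("Leilani R.")
r = ratings_score(stay_ratings)
s = search_score(p, r, len(stay_ratings))
print(format_score(p), format_score(r), format_score(s))  # prints: 1.15 4.33 2.11
```

With 3 stays the assumed ratings weight is 0.3, so the Search Score sits closer to the Profile Score than to the Ratings Score.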
Output a list of Sitters
Your program should output a CSV file called sitters.csv, containing the following columns:
Sitter email (email)
Sitter name (name)
Profile Score (profile_score)
Ratings Score (ratings_score)
Search Score (search_score)
The CSV should be sorted by Search Score (descending), with the sitter name (alphabetical) as the tie-breaker.
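A minimal sketch of the sorting and output step, using Python's standard csv module. It assumes each sitter has already been reduced to a dict of computed values, that the parenthesized names above are the intended header row, and that ties are decided on the scores as rounded to two decimals. The function names are hypothetical.

```python
import csv
from typing import Dict, Iterable, List

COLUMNS = ["email", "name", "profile_score", "ratings_score", "search_score"]
SCORE_COLUMNS = ["profile_score", "ratings_score", "search_score"]


def sort_rows(rows: Iterable[Dict]) -> List[Dict]:
    # Negating the score gives descending order while names stay ascending.
    return sorted(rows, key=lambda r: (-round(r["search_score"], 2), r["name"]))


def write_sitters_csv(rows: Iterable[Dict], path: str = "sitters.csv") -> None:
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        writer.writeheader()
        for row in sort_rows(rows):
            out = {"email": row["email"], "name": row["name"]}
            for col in SCORE_COLUMNS:
                out[col] = f"{row[col]:.2f}"  # exactly two decimal places
            writer.writerow(out)
```

Sorting on a single tuple key avoids a two-pass sort and keeps the tie-breaking rule visible in one place.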
Discussion Questions:
Imagine you are designing a Rover-like production web application based on the exercise you've just completed. The application will compute the search scores for sitters, return a list of search results based on those scores, and display them to the user through a web UI. Please answer ONE of the following discussion questions about the approach you'd take:
How would you adjust the calculation and storage of search scores in a production application?
Describe a technical implementation for the frontend you would use to display a list of sitters and their scores. How would the frontend manage state as users interact with a page?
What infrastructure choices might you make to build and host this project at scale? Suppose your web application must return fast search results with a peak of 10 searches per second.
Describe how you would approach API design for a backend service to provide sitter and rank data to a client/web frontend.