I applied online. The process took 2 weeks. I interviewed at BriteCore in May 2019.
Interview
BEWARE - Potential SCAM
They have been advertising the same position for the past 3 years. Essentially, after 3 initial, easier technical quizzes, you'll be invited to COMPLETE a full-fledged, complex end-to-end project, including deployment to AWS at your own cost. This is for a position with an extremely low IT salary. One noticeable thing is that they are open to remote hires anywhere in the world.
It's interesting how a small company in Missouri can afford the legal requirements and infrastructure to employ people anywhere in the world. What makes it concerning is that they require you to finish a full-fledged, complex project before you are even considered for an interview. These are the kind of projects for which implementation consulting companies can charge hundreds of thousands of dollars.
Just search for britecore on GitHub and you can see hundreds of projects submitted by gullible applicants from around the world (especially people outside the US).
It's hard to believe they couldn't fill the position for the past 3 years. My guess is they are using these free projects for their clients and scamming applicants. They did find a great loophole to get their work done for free. Overseas applicants who complete an entire project and then get no response can't do anything about it. I give this company props for creatively scamming these applicants.
Interview questions [1]
Question 1
Data Engineer Hiring Project
Build a simple analytics platform for a fake insurance company using the Kaggle dataset Agency Performance Model. This platform has two components: an ETL or data pipeline and an API.
Minimum Requirements
Build a Data Pipeline/ETL process that takes the CSVs as input and saves them into a database at a detailed level while also calculating summarized views. These summarized views could follow a star schema or any other schema that you think will allow for easy querying using different pivots/dimensions. The Data Pipeline can be manually triggered by running a script (include instructions on how to do it!) or automated somehow.
Build an API (REST or GraphQL) that provides:
Detailed information using different parameters (like agency, month, year, state, etc)
Summarized information using different parameters (like agency, month, year, state, etc)
An XLS, XLSX or CSV report with Premium info by Agency and Product Line using date range as parameters
The Data Pipeline/ETL process and also the logic for generating the report must be done using Pandas
Deployment to AWS
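As an illustration of the report requirement above, here is a minimal sketch of a Pandas report giving premium by agency and product line for a date range. The column names (`agency_id`, `product_line`, `year`, `month`, `written_premium`) and the function name `premium_report` are hypothetical and are not taken from the actual Kaggle dataset or from the brief.

```python
import pandas as pd

# Hypothetical detail rows; the real dataset's column names may differ.
SAMPLE = pd.DataFrame(
    {
        "agency_id": [1, 1, 2, 1],
        "product_line": ["Commercial", "Personal", "Commercial", "Commercial"],
        "year": [2018, 2018, 2018, 2018],
        "month": [1, 1, 2, 2],
        "written_premium": [100.0, 50.0, 75.0, 25.0],
    }
)


def premium_report(detail, start, end):
    """Return CSV text: total premium by agency (rows) and product line (columns)."""
    # Treat each record as dated on the first day of its month.
    dates = pd.to_datetime(detail[["year", "month"]].assign(day=1))
    in_range = detail[(dates >= pd.Timestamp(start)) & (dates <= pd.Timestamp(end))]
    report = in_range.pivot_table(
        index="agency_id",
        columns="product_line",
        values="written_premium",
        aggfunc="sum",
        fill_value=0,
    )
    return report.to_csv()


if __name__ == "__main__":
    print(premium_report(SAMPLE, "2018-01-01", "2018-01-31"))
```

An API endpoint could return this text as a CSV download, with the date range taken from query parameters.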
ETL/Data Pipeline Flow
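The flow described in the requirements (CSV in, detail rows saved, summarized view calculated) could be sketched roughly as follows. This is only an assumption-laden example: it uses SQLite as a stand-in database, and the table names, column names, and the `run_pipeline` function are all hypothetical.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical sample standing in for one of the Kaggle CSVs;
# the real column names may differ.
SAMPLE_CSV = """agency_id,state,product_line,year,month,written_premium
1,MO,Commercial,2018,1,100.0
1,MO,Personal,2018,1,50.0
2,KS,Commercial,2018,2,75.0
1,MO,Commercial,2018,2,25.0
"""


def run_pipeline(csv_source, conn):
    """Load the CSV, store detail rows, and store one summarized view."""
    detail = pd.read_csv(csv_source)

    # Detailed level: one row per input record.
    detail.to_sql("premium_detail", conn, if_exists="replace", index=False)

    # Summarized view: total premium per agency, year, and month.
    summary = (
        detail.groupby(["agency_id", "year", "month"], as_index=False)["written_premium"]
        .sum()
        .rename(columns={"written_premium": "total_premium"})
    )
    summary.to_sql("premium_by_agency_month", conn, if_exists="replace", index=False)
    return summary


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(io.StringIO(SAMPLE_CSV), connection))
```

Running the file as a script would satisfy the "manually triggered" option; a real submission would read the CSV files from disk and point at a persistent database.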
Tech Stack Requirements
The following are requirements on the tech stack. This stack demonstrates mastery of tools our team favors:
Server-Side Development: Python 2.7+ or 3.5+ and Pandas for the API and report
Server Framework: Django, Flask or SimpleHTTPServer
Make sure that your instructions for accessing or otherwise running your code are extremely clear.
Bonus points
We know people may have jobs or other important things to do, leaving them little time available to complete our project. The above are the minimum requirements. Any of the following could make you stand out from the crowd by showing your current proficiency with other skills and tools:
Integration or Unit tests (at least one of those). You can use pytest or unittest
Authentication so that only authorized users can query the API
Tests with good test coverage
Documented code that follows PEP 8 and The Zen of Python
API documentation
Using Docker for deployment
Using other AWS stack components relevant for Data Engineering (Lambdas, S3, DynamoDB, Cognito, etc.)
Using any CI service like Travis, Shippable, CircleCI, etc. for running the tests
Including some predictive analysis like forecasting or categorization as part of the API
Incremental ETL that only processes and loads new records
Build a web app that consumes the API (we use Vue.js and Knockout)
Show some charts, tables, dashboards
Let users run reports with different input parameters and date ranges from there
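For the "Incremental ETL" bonus item in the list above, one possible approach is an anti-join against the keys already stored, so that only unseen records are appended. The sketch below assumes a hypothetical natural key (`KEY`), a hypothetical table name, and SQLite as the database; none of these come from the brief.

```python
import sqlite3

import pandas as pd

# Hypothetical natural key that identifies a record.
KEY = ["agency_id", "product_line", "year", "month"]


def load_new_records(batch, conn, table="premium_detail"):
    """Append only rows whose key is not yet in the table; return how many were added."""
    table_exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (table,)
    ).fetchone()

    new_rows = batch
    if table_exists:
        # The table name is interpolated directly, so it must be trusted input.
        existing = pd.read_sql_query(
            f"SELECT DISTINCT {', '.join(KEY)} FROM {table}", conn
        )
        if not existing.empty:
            # Left merge with an indicator column, then keep rows found only in the batch.
            merged = batch.merge(existing, on=KEY, how="left", indicator=True)
            new_rows = merged.loc[merged["_merge"] == "left_only", batch.columns]

    new_rows.to_sql(table, conn, if_exists="append", index=False)
    return len(new_rows)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    first = pd.DataFrame(
        {
            "agency_id": [1, 2],
            "product_line": ["Commercial", "Commercial"],
            "year": [2018, 2018],
            "month": [1, 1],
            "written_premium": [100.0, 75.0],
        }
    )
    print(load_new_records(first, connection))  # first load adds every row
    print(load_new_records(first, connection))  # a repeat load adds nothing
```

Summarized views would then need to be recalculated or updated for the periods touched by the new rows.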
Guidelines
We're looking for someone who can work independently and is curious and self-motivated. One major goal of this project is to see how you fill in ambiguities creatively. There is no such thing as a perfect project here, just interpretations of the instructions above, so be creative in your approach.
Deliverables
In order to move your application forward, deliverables will include:
A deployed version of your project, on AWS.
A GitHub repo containing your project. Your repo must contain these two items:
A detailed README that explains your approach and deployment method
Your code solution to this test
Adding these items to your resume's cover letter:
The link to the GitHub repo that lists this project
The link to the deployed version of your project
Uploading your resume with cover letter in PDF or DOCX format by clicking this link.