EPAM Systems interview question

Python, SQL. How would you design a scalable data pipeline to handle large datasets with real-time processing requirements?