Hippocratic AI interview question

How would you implement causal masking for transformers in pytorch.