PCS Global Tech interview question

How can we eliminate the problem of duplicate and redundant data?