Microsoft Interview Question

Given a dataset (solve using SQL, then convert the same into PySpark):
1. Filter out the records where data is missing or null.
2. Extract the top 5 users with the highest activity per day.

What is normalization/denormalization?
What is Z-ordering?
How do you do an incremental load in ADF (Azure Data Factory)?
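
The post does not show the dataset, so here is only a rough sketch of how the two dataset tasks could be approached. Everything about the schema is an assumption: a hypothetical table `user_activity(user_id, activity_date, action)`, with "activity" taken to mean the number of rows a user has on a given day, and ties broken by `user_id`. The SQL is run through Python's built-in `sqlite3` (window functions need SQLite 3.25 or newer) so it can be tried without a cluster. `top_users_pyspark` is one possible DataFrame translation and imports PySpark only when called.

```python
import sqlite3

# Hypothetical schema: user_activity(user_id, activity_date, action).
# "Activity" is assumed to be the row count per user per day.
TOP_USERS_SQL = """
WITH clean AS (            -- task 1: drop rows with any missing value
    SELECT user_id, activity_date, action
    FROM user_activity
    WHERE user_id IS NOT NULL
      AND activity_date IS NOT NULL
      AND action IS NOT NULL
),
daily AS (                 -- activity per user per day
    SELECT user_id, activity_date, COUNT(*) AS activity_count
    FROM clean
    GROUP BY user_id, activity_date
),
ranked AS (                -- task 2: rank users within each day
    SELECT daily.*,
           ROW_NUMBER() OVER (
               PARTITION BY activity_date
               ORDER BY activity_count DESC, user_id
           ) AS rn
    FROM daily
)
SELECT activity_date, user_id, activity_count
FROM ranked
WHERE rn <= :n
ORDER BY activity_date, rn
"""


def build_demo_db():
    """Create an in-memory table with made-up rows, including some nulls."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_activity (user_id TEXT, activity_date TEXT, action TEXT)"
    )
    rows = []
    # u1 gets 6 events, u2 gets 5, ... u6 gets 1, all on the same day.
    for i, user in enumerate(["u1", "u2", "u3", "u4", "u5", "u6"]):
        rows += [(user, "2024-01-01", "click")] * (6 - i)
    rows += [("u6", "2024-01-02", "view")] * 2
    rows += [("u1", "2024-01-02", "view")]
    # Incomplete records that the filter should discard.
    rows += [
        (None, "2024-01-01", "click"),
        ("u6", None, "click"),
        ("u6", "2024-01-01", None),
    ]
    conn.executemany("INSERT INTO user_activity VALUES (?, ?, ?)", rows)
    return conn


def top_users_sql(conn, n=5):
    """Return (activity_date, user_id, activity_count) for the top n users per day."""
    return conn.execute(TOP_USERS_SQL, {"n": n}).fetchall()


def top_users_pyspark(df, n=5):
    """Same logic on a Spark DataFrame with the assumed columns (requires pyspark)."""
    from pyspark.sql import Window, functions as F

    clean = df.dropna()  # drops rows where any column is null
    daily = clean.groupBy("user_id", "activity_date").agg(
        F.count("*").alias("activity_count")
    )
    w = Window.partitionBy("activity_date").orderBy(
        F.col("activity_count").desc(), F.col("user_id")
    )
    return (
        daily.withColumn("rn", F.row_number().over(w))
        .filter(F.col("rn") <= n)
        .drop("rn")
    )


if __name__ == "__main__":
    for row in top_users_sql(build_demo_db()):
        print(row)
```

`ROW_NUMBER` returns exactly 5 rows per day even when users tie on count. If tied users should all be kept, `DENSE_RANK` (or `F.dense_rank()` in PySpark) would be the usual swap, and it is worth asking the interviewer which behavior they want.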