Data Engineer Interview Questions

Data engineers are IT professionals in demand in almost every industry. They monitor data trends to plan the most appropriate actions for a company to take. One of the most critical aspects of a data engineer's job is processing raw data and transforming it into usable data for building data pipelines and systems.

Typical data engineer interview questions and how to answer them

Question 1: Can you describe in detail your level of proficiency with programming languages?

How to answer: Before the interview, review your CV and/or portfolio and list the programming languages and tools you know best. If you find that you do not have a good command of the one the company mainly uses, describe yourself as a resourceful, highly motivated person who will work tirelessly to learn it.

Question 2: Explain in your own words what data engineering is.

How to answer: Analyze your role in relation to the company and to other roles such as the data scientist, so that you can clearly define your contribution to the business as a whole. Explain the difference between the role of an engineer who works on databases and that of an engineer who works on pipelines.

Question 3: Can you describe a work experience with Apache Hadoop and in cloud data management environments?

How to answer: To prepare for this question, research the software the company uses, cloud data products, and the use of Apache Hadoop. Data engineers need an excellent command of the programming languages and data management systems used in the industry, such as Apache Hadoop.

20,133 data engineer interview questions shared by candidates

Given a dictionary, print the key for the nth highest value present in the dict. If more than one key has the nth highest value, sort the keys and print the first one alphabetically. N can be higher than the number of elements in the dictionary.
Data Engineer

Interviewed at Meta

3.6
Aug 17, 2021
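
One possible way to approach the dictionary question above is sketched here. It assumes that "nth highest value" means the nth highest distinct value, that N is 1-based, and that returning None is an acceptable answer when N is out of range; the question itself does not settle any of these points, and the function name `key_for_nth_highest` is made up for illustration.

```python
def key_for_nth_highest(d, n):
    """Return the key holding the n-th highest distinct value (1-based).

    Assumptions (not stated in the question): ties on the value are broken
    by taking the smallest key, and None is returned when n is out of range.
    """
    # Rank the distinct values from highest to lowest.
    distinct_values = sorted(set(d.values()), reverse=True)
    if n < 1 or n > len(distinct_values):
        return None
    target = distinct_values[n - 1]
    # Among all keys sharing the target value, pick the alphabetically first.
    return min(k for k, v in d.items() if v == target)


scores = {"a": 10, "b": 20, "c": 20, "d": 5}
print(key_for_nth_highest(scores, 1))  # "b" and "c" tie on 20; "b" sorts first
print(key_for_nth_highest(scores, 9))  # N larger than the dict
```

If the interviewer instead counts duplicates as separate ranks, the `set(...)` call would be dropped and the ranking done over all values.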

Given a list of ints, balance the list so that each int appears equally in the list. Return a dictionary where the key is the int and the value is the count needed to balance the list. Examples: [1, 1, 2] => {2: 1}; [1, 1, 1, 5, 3, 2, 2] => {5: 2, 3: 2, 2: 1}
Data Engineer

Interviewed at Meta

3.6
Aug 17, 2021
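
A minimal sketch of the list-balancing question above, assuming that "balanced" means every int is brought up to the count of the most frequent int (which is what both examples in the question imply). The function name `balance_counts` is hypothetical.

```python
from collections import Counter


def balance_counts(nums):
    """Map each under-represented int to the number of copies it is missing.

    Assumption: the target count is that of the most frequent int; ints
    already at the target are left out of the result.
    """
    counts = Counter(nums)
    if not counts:
        return {}
    target = max(counts.values())
    return {value: target - c for value, c in counts.items() if c < target}


print(balance_counts([1, 1, 2]))
print(balance_counts([1, 1, 1, 5, 3, 2, 2]))
```

This runs in a single counting pass plus one pass over the distinct values, so it is linear in the length of the list.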

SQL questions on a promotions and sales schema: What percentage of products have both non-fat and trans-fat? Find the top 5 products by sales that have promotions. What percentage of sales happened on the first and last day of the promotion? MySQL was used, and the interviewer asked whether this could be done without a subquery.
Python: [1, None, 1, 2, None] --> [1, 1, 1, 2, 2]. Make sure to handle the case where the input is [None], i.e. a list containing only the None object. Find 's' in 'mississippi'.
Data Engineer

Interviewed at Meta

3.6
Jun 29, 2020
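
The Python part of the question above reads like a forward fill: each None is replaced by the most recent non-None value. The sketch below assumes that interpretation, and also assumes that a None with nothing before it (as in the input [None]) is simply left as None, since the question does not say what it should become. The name `fill_forward` is made up for illustration.

```python
def fill_forward(items):
    """Replace each None with the last non-None value seen before it.

    Assumption: leading Nones have no earlier value to copy, so they
    stay None (this covers the input [None]).
    """
    filled = []
    last_seen = None
    for item in items:
        # Compare with `is not None` so that falsy values such as 0 are kept.
        if item is not None:
            last_seen = item
        filled.append(last_seen)
    return filled


print(fill_forward([1, None, 1, 2, None]))
print(fill_forward([None]))
```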

products                             sales
+------------------+---------+       +------------------+---------+
| product_id       | int     |------>| product_id       | int     |
| product_class_id | int     |  +--->| store_id         | int     |
| brand_name       | varchar |  | +->| customer_id      | int     |
| product_name     | varchar |  | |  | promotion_id     | int     |
| price            | int     |  | |  | store_sales      | decimal |
+------------------+---------+  | |  | store_cost       | decimal |
                                | |  | units_sold       | decimal |
                                | |  | transaction_date | date    |
                                | |  +------------------+---------+
                                | |
stores                          | |  customers
+-------------------+---------+ | |  +---------------------+---------+
| store_id          | int     |-+ +--| customer_id         | int     |
| type              | varchar |      | first_name          | varchar |
| name              | varchar |      | last_name           | varchar |
| state             | varchar |      | state               | varchar |
| first_opened_date | datetime|      | birthdate           | date    |
| last_remodel_date | datetime|      | education           | varchar |
| area_sqft         | int     |      | gender              | varchar |
+-------------------+---------+      | date_account_opened | date    |
                                     +---------------------+---------+

Question 1: What brands have an average price above $3 and contain at least 2 different products?

Question 2: To improve sales, the marketing department runs various types of promotions. The marketing manager would like to analyze the effectiveness of these promotion campaigns. In particular, what percent of our sales transactions had a valid promotion applied?

Question 3: We want to run a new promotion for our most successful category of products (we call these categories "product classes"). Can you find out what are the top 3 selling product classes by total sales?

Question 4: We are considering running a promo across brands. We want to target customers who have bought products from two specific brands. Can you find out which customers have bought products from both the "Fort West" and the "Golden" brands?
Data Engineer

Interviewed at Meta

3.6
May 22, 2020
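
The four schema questions above can be tried out with Python's built-in sqlite3 module. What follows is only a sketch under several assumptions: the sample rows are invented, the sales table is trimmed to the columns the queries need, SQLite stands in for whatever database the interview used, and a "valid promotion" is taken to mean a non-NULL promotion_id (the schema shows no promotions table to check against).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INT, product_class_id INT,
                       brand_name TEXT, product_name TEXT, price INT);
-- Trimmed copy of the sales table: only the columns used below.
CREATE TABLE sales (product_id INT, customer_id INT,
                    promotion_id INT, store_sales REAL);
""")
# Invented sample data, purely for illustration.
conn.executemany("INSERT INTO products VALUES (?, ?, ?, ?, ?)", [
    (1, 10, "Fort West", "Chips", 4), (2, 10, "Fort West", "Nuts", 5),
    (3, 20, "Golden", "Juice", 2), (4, 20, "Golden", "Soda", 3),
    (5, 30, "Solo", "Tea", 9),
])
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", [
    (1, 100, 7, 8.0), (3, 100, None, 4.0), (2, 101, None, 5.0),
    (5, 101, 7, 9.0), (4, 102, None, 3.0), (1, 102, 9, 4.0),
])

# Question 1: brands with average price above 3 and at least 2 products.
brands = [r[0] for r in conn.execute("""
    SELECT brand_name FROM products
    GROUP BY brand_name
    HAVING AVG(price) > 3 AND COUNT(DISTINCT product_id) >= 2
    ORDER BY brand_name""")]

# Question 2: percent of sales rows with a promotion (assumed: non-NULL id).
promo_pct = conn.execute("""
    SELECT 100.0 * SUM(CASE WHEN promotion_id IS NOT NULL THEN 1 ELSE 0 END)
           / COUNT(*)
    FROM sales""").fetchone()[0]

# Question 3: top 3 product classes by total sales.
top_classes = conn.execute("""
    SELECT p.product_class_id, SUM(s.store_sales) AS total_sales
    FROM sales s JOIN products p ON p.product_id = s.product_id
    GROUP BY p.product_class_id
    ORDER BY total_sales DESC
    LIMIT 3""").fetchall()

# Question 4: customers who bought from both brands (no subquery needed).
both_brands = [r[0] for r in conn.execute("""
    SELECT s.customer_id
    FROM sales s JOIN products p ON p.product_id = s.product_id
    WHERE p.brand_name IN ('Fort West', 'Golden')
    GROUP BY s.customer_id
    HAVING COUNT(DISTINCT p.brand_name) = 2
    ORDER BY s.customer_id""")]

print(brands, promo_pct, top_classes, both_brands)
```

The `GROUP BY ... HAVING COUNT(DISTINCT ...)` pattern in Question 4 is one common way to express "bought from both" without a self-join or subquery; an interviewer may equally expect an INTERSECT or a self-join.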

