40 Deloitte Interview Questions

Preparing for a Deloitte data analyst interview? This curated list of 40 Deloitte interview questions spans technical expertise, analytical thinking, and soft skills, reflecting Deloitte’s focus on client impact, data-driven insights, and collaboration.

1. Overview of Deloitte

Deloitte, based in London, is the world’s largest professional services network by revenue and workforce. As one of the Big Four accounting firms, alongside EY, KPMG, and PwC, it offers audit, consulting, tax, and advisory services globally, serving a wide range of industries.

Deloitte interviews assess not just what you know but how you apply it to real-world scenarios. Tailor your preparation to showcase analytical depth, adaptability, and alignment with Deloitte’s mission to deliver measurable value.

2.1 SQL Essentials & Data Manipulation

**1. Explain the difference between INNER JOIN, LEFT JOIN, and RIGHT JOIN.**

INNER JOIN returns only matching rows from both tables.
For example, joining Employees and Departments on DeptID excludes employees without a department and departments without employees.

LEFT JOIN returns all rows from the left table and matching rows from the right table. Unmatched right table values are filled with NULL.
For instance, listing all employees (including those without departments) requires a LEFT JOIN between Employees and Departments.

RIGHT JOIN is the inverse of LEFT JOIN: it returns all rows from the right table and matching rows from the left.
It’s rarely used, as reordering tables and using LEFT JOIN achieves the same result.

-- INNER JOIN (only matches)  
SELECT e.Name, d.DeptName  
FROM Employees e  
INNER JOIN Departments d ON e.DeptID = d.DeptID;  

-- LEFT JOIN (all employees, even unassigned)  
SELECT e.Name, d.DeptName  
FROM Employees e  
LEFT JOIN Departments d ON e.DeptID = d.DeptID;
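
To see the difference in the returned rows, here is a minimal sketch that runs both queries against an in-memory SQLite database. The sample rows are hypothetical; the table and column names follow the queries above.

```python
import sqlite3

# Hypothetical sample data: Ben has no department, Tax has no employees.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Departments (DeptID INTEGER PRIMARY KEY, DeptName TEXT);
    CREATE TABLE Employees (Name TEXT, DeptID INTEGER);
    INSERT INTO Departments VALUES (1, 'Audit'), (2, 'Tax');
    INSERT INTO Employees VALUES ('Ana', 1), ('Ben', NULL);
""")

inner_rows = conn.execute("""
    SELECT e.Name, d.DeptName
    FROM Employees e
    INNER JOIN Departments d ON e.DeptID = d.DeptID
    ORDER BY e.Name
""").fetchall()

left_rows = conn.execute("""
    SELECT e.Name, d.DeptName
    FROM Employees e
    LEFT JOIN Departments d ON e.DeptID = d.DeptID
    ORDER BY e.Name
""").fetchall()

print(inner_rows)  # [('Ana', 'Audit')]
print(left_rows)   # [('Ana', 'Audit'), ('Ben', None)]
```

The LEFT JOIN keeps Ben and fills DeptName with NULL (None in Python); neither query returns the empty Tax department.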

**2. How do you use GROUP BY and HAVING together in a query?**

GROUP BY aggregates data into groups (e.g., total sales per category).

HAVING filters groups after aggregation, unlike WHERE, which filters rows before aggregation.

SELECT Category, SUM(Sales) AS TotalSales  
FROM Orders  
GROUP BY Category  
HAVING SUM(Sales) > 10000;

Key Difference:

WHERE clause would filter individual rows (e.g., WHERE Sales > 100).
HAVING works on aggregated results (e.g., HAVING SUM(Sales) > 10000).
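
The two filters can give different answers on the same data. The sketch below assumes a small hypothetical Orders table in SQLite and compares HAVING alone with WHERE followed by HAVING.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (Category TEXT, Sales INTEGER);
    INSERT INTO Orders VALUES
        ('Books', 6000), ('Books', 5000),
        ('Toys', 9000), ('Toys', 50),
        ('Games', 12000);
""")

# HAVING alone: every row counts toward its category total.
having_only = conn.execute("""
    SELECT Category, SUM(Sales) AS TotalSales
    FROM Orders
    GROUP BY Category
    HAVING SUM(Sales) > 10000
    ORDER BY Category
""").fetchall()

# WHERE first: rows of 5500 or less are dropped before totals are computed.
where_then_having = conn.execute("""
    SELECT Category, SUM(Sales) AS TotalSales
    FROM Orders
    WHERE Sales > 5500
    GROUP BY Category
    HAVING SUM(Sales) > 10000
    ORDER BY Category
""").fetchall()

print(having_only)        # [('Books', 11000), ('Games', 12000)]
print(where_then_having)  # [('Games', 12000)]
```

Books drops out of the second result because its 5000 order is filtered away before the sum is taken.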

3. Write a query to find duplicate entries in a table and delete them.

Use a CTE (Common Table Expression) with ROW_NUMBER() to identify duplicates:

WITH CTE AS (  
  SELECT *,  
  ROW_NUMBER() OVER (  
    PARTITION BY Column1, Column2  -- Columns defining duplicates  
    ORDER BY ID  -- Tiebreaker to retain one row  
  ) AS RN  
  FROM Orders  
)  
DELETE FROM CTE WHERE RN > 1;

Note: Deleting through a CTE this way is SQL Server syntax. Replace DELETE with SELECT * to preview the duplicates before removing them.
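
Where deleting through a CTE is not supported, one alternative is to keep the lowest ID in each duplicate group and delete the rest. This is a different technique from the ROW_NUMBER() approach above, sketched here on a hypothetical Orders table in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (ID INTEGER PRIMARY KEY, Column1 TEXT, Column2 TEXT);
    INSERT INTO Orders VALUES
        (1, 'A', 'x'), (2, 'A', 'x'), (3, 'B', 'y'), (4, 'A', 'x'), (5, 'B', 'z');
""")

# Rows to keep: the smallest ID within each (Column1, Column2) group.
keepers = "SELECT MIN(ID) FROM Orders GROUP BY Column1, Column2"

# Preview the duplicates before deleting anything.
duplicates = conn.execute(
    f"SELECT ID FROM Orders WHERE ID NOT IN ({keepers}) ORDER BY ID"
).fetchall()

conn.execute(f"DELETE FROM Orders WHERE ID NOT IN ({keepers})")
remaining = conn.execute(
    "SELECT ID, Column1, Column2 FROM Orders ORDER BY ID"
).fetchall()

print(duplicates)  # [(2,), (4,)]
print(remaining)   # [(1, 'A', 'x'), (3, 'B', 'y'), (5, 'B', 'z')]
```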

4. How would you calculate total sales per category and filter for categories with sales over $10,000?

Aggregate sales by category and filter using HAVING.

SELECT Category, SUM(OrderAmount) AS TotalSales  
FROM Sales  
GROUP BY Category  
HAVING SUM(OrderAmount) > 10000;

Use Case: Identify high-performing product categories for business strategy adjustments.

5. Describe how you would handle NULL values in a dataset.

Replace NULL: Use COALESCE(Column, DefaultValue) to substitute NULL with a placeholder (e.g., COALESCE(Salary, 0)).

Exclude NULL: Filter rows with IS NOT NULL (e.g., WHERE Salary IS NOT NULL).

Aggregate Safely: Functions like SUM() and AVG() ignore NULL. COUNT(Column) counts only non-NULL values, whereas COUNT(*) counts every row.

-- Replace NULL with "Unknown" in Department  
SELECT Name, COALESCE(Dept, 'Unknown') AS Department  
FROM Employees;  

-- Count non-NULL salaries  
SELECT COUNT(Salary) FROM Employees;
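
The effect of each choice is easiest to see side by side. A small sketch, assuming a hypothetical three-row Employees table in SQLite where one employee has no department or salary:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (Name TEXT, Dept TEXT, Salary INTEGER);
    INSERT INTO Employees VALUES
        ('Ana', 'Audit', 5000),
        ('Ben', NULL, NULL),
        ('Cy', 'Tax', 7000);
""")

departments = conn.execute("""
    SELECT Name, COALESCE(Dept, 'Unknown') AS Department
    FROM Employees
    ORDER BY Name
""").fetchall()

# COUNT(*) counts rows; COUNT(Salary), SUM and AVG skip the NULL salary.
stats = conn.execute("""
    SELECT COUNT(*), COUNT(Salary), SUM(Salary),
           AVG(Salary), AVG(COALESCE(Salary, 0))
    FROM Employees
""").fetchone()

print(departments)  # [('Ana', 'Audit'), ('Ben', 'Unknown'), ('Cy', 'Tax')]
print(stats)        # (3, 2, 12000, 6000.0, 4000.0)
```

Replacing NULL with 0 pulls the average from 6000 down to 4000, so a placeholder should only be used when zero is a meaningful value.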

6. Explain the differences between RANK(), DENSE_RANK(), and ROW_NUMBER() with examples.

RANK(): Skips ranks after ties.
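- Example: Salaries [1000, 2000, 2000, 3000], ranked in ascending order → Ranks [1, 2, 2, 4]. (The query below sorts descending, so there the highest salary gets rank 1.)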
DENSE_RANK(): Does not skip ranks.
- Example: Same salaries → Ranks [1, 2, 2, 3].
ROW_NUMBER(): Assigns unique sequential numbers, ignoring ties.
- Example: Same salaries → Row Numbers [1, 2, 3, 4].

SELECT Salary,  
  RANK() OVER (ORDER BY Salary DESC) AS SalaryRank,  -- RANK is a reserved word in some dialects  
  DENSE_RANK() OVER (ORDER BY Salary DESC) AS DenseRank,  
  ROW_NUMBER() OVER (ORDER BY Salary DESC) AS RowNum  
FROM Employees;
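
To check the three functions against the example salaries, the sketch below runs them in SQLite (window functions need SQLite 3.25 or later). The alias SalaryRank is an assumption chosen to avoid the keyword RANK.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (Salary INTEGER);
    INSERT INTO Employees VALUES (1000), (2000), (2000), (3000);
""")

ranked = conn.execute("""
    SELECT Salary,
           RANK()       OVER (ORDER BY Salary DESC) AS SalaryRank,
           DENSE_RANK() OVER (ORDER BY Salary DESC) AS DenseRank,
           ROW_NUMBER() OVER (ORDER BY Salary DESC) AS RowNum
    FROM Employees
    ORDER BY RowNum
""").fetchall()

for row in ranked:
    print(row)
# (3000, 1, 1, 1)
# (2000, 2, 2, 2)
# (2000, 2, 2, 3)
# (1000, 4, 3, 4)
```

With a descending sort, RANK() jumps from 2 to 4 after the tie, DENSE_RANK() continues with 3, and ROW_NUMBER() never repeats.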

7. Write a query to extract the third-highest salary from an employee table.

Use DENSE_RANK() to handle ties:

WITH RankedSalaries AS (  
  SELECT Salary,  
  DENSE_RANK() OVER (ORDER BY Salary DESC) AS DR  
  FROM Employees  
)  
SELECT DISTINCT Salary  
FROM RankedSalaries  
WHERE DR = 3;

Alternative: Sort the distinct salaries and use OFFSET 2 to skip the top two (LIMIT/OFFSET syntax, as in MySQL or PostgreSQL).

SELECT DISTINCT Salary  
FROM Employees  
ORDER BY Salary DESC  
LIMIT 1 OFFSET 2;

8. Retrieve all employees reporting to a specific manager and their subordinates at any level (hierarchical query).

Use a recursive CTE for hierarchical data. The query below returns all employees under ManagerID 101 at any hierarchy level. PostgreSQL and MySQL require WITH RECURSIVE; SQL Server uses plain WITH.

WITH RecursiveCTE AS (  
  -- Anchor: Direct reports of ManagerID = 101  
  SELECT EmpID, Name  
  FROM Employees  
  WHERE ManagerID = 101  
  UNION ALL  
  -- Recursive: Subordinates of the above  
  SELECT e.EmpID, e.Name  
  FROM Employees e  
  INNER JOIN RecursiveCTE r ON e.ManagerID = r.EmpID  
)  
SELECT * FROM RecursiveCTE;
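
A runnable sketch of the same idea in SQLite, using a hypothetical five-level-free org chart. The Level column is an addition for illustration and is not part of the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmpID INTEGER PRIMARY KEY, Name TEXT, ManagerID INTEGER);
    INSERT INTO Employees VALUES
        (101, 'Dana', NULL),
        (102, 'Eli', 101),
        (103, 'Fay', 101),
        (104, 'Gus', 102),
        (105, 'Hal', 104),
        (106, 'Ivy', NULL);
""")

reports = conn.execute("""
    WITH RECURSIVE Reports(EmpID, Name, Level) AS (
        -- Anchor: direct reports of the chosen manager
        SELECT EmpID, Name, 1 FROM Employees WHERE ManagerID = ?
        UNION ALL
        -- Recursive step: reports of anyone already found
        SELECT e.EmpID, e.Name, r.Level + 1
        FROM Employees e
        INNER JOIN Reports r ON e.ManagerID = r.EmpID
    )
    SELECT EmpID, Name, Level FROM Reports ORDER BY EmpID
""", (101,)).fetchall()

print(reports)
# [(102, 'Eli', 1), (103, 'Fay', 1), (104, 'Gus', 2), (105, 'Hal', 3)]
```

The recursion stops on its own once a pass finds no further subordinates; Ivy, who sits outside Dana's reporting line, never appears.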

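9. Calculate the cumulative salary of employees in each department who joined in the past 30 days.

Filter to recent joiners with WHERE, then use SUM() as a window function to build a running total per department. DATEADD and GETDATE are SQL Server functions; other dialects use their own date arithmetic.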

SELECT Dept,  
  SUM(Salary) OVER (  
    PARTITION BY Dept  
    ORDER BY JoinDate  
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW  
  ) AS CumulativeSalary  
FROM Employees  
WHERE JoinDate >= DATEADD(day, -30, GETDATE());
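
The following sketch adapts the query to SQLite, which has no DATEADD or GETDATE. It assumes ISO-formatted date strings and a fixed, hypothetical reference date so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (Name TEXT, Dept TEXT, Salary INTEGER, JoinDate TEXT);
    INSERT INTO Employees VALUES
        ('Ana', 'Audit', 4000, '2024-03-05'),
        ('Ben', 'Audit', 5000, '2024-03-20'),
        ('Cy',  'Tax',   6000, '2024-03-10'),
        ('Di',  'Tax',   3000, '2024-01-15'),
        ('Ed',  'Tax',   2000, '2024-03-25');
""")

as_of = "2024-03-31"  # stands in for "today"

running_totals = conn.execute("""
    SELECT Dept, Name, JoinDate,
           SUM(Salary) OVER (
               PARTITION BY Dept
               ORDER BY JoinDate
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS CumulativeSalary
    FROM Employees
    WHERE JoinDate >= date(?, '-30 days')
    ORDER BY Dept, JoinDate
""", (as_of,)).fetchall()

for row in running_totals:
    print(row)
# ('Audit', 'Ana', '2024-03-05', 4000)
# ('Audit', 'Ben', '2024-03-20', 9000)
# ('Tax', 'Cy', '2024-03-10', 6000)
# ('Tax', 'Ed', '2024-03-25', 8000)
```

Because WHERE runs before the window function, Di's January salary is excluded from the Tax running total.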

10. Find the top 2 customers with the highest order amounts per product category, handling ties.

Use DENSE_RANK() so that tied amounts share a rank. The query below returns the top 2 order amounts per category, including ties (e.g., two customers with the same top amount), so a category can return more than two rows.

WITH RankedCustomers AS (  
  SELECT CustomerID, ProductCategory, OrderAmount,  
    DENSE_RANK() OVER (  
      PARTITION BY ProductCategory  
      ORDER BY OrderAmount DESC  
    ) AS AmountRank  -- avoid RANK as an alias; it is a reserved word in some dialects  
  FROM Orders  
)  
SELECT CustomerID, ProductCategory, OrderAmount  
FROM RankedCustomers  
WHERE AmountRank <= 2;
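
A quick way to confirm the tie behavior is to run the pattern on a few hypothetical orders in SQLite (3.25 or later). The alias AmountRank is an assumption used in place of the keyword RANK.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (CustomerID INTEGER, ProductCategory TEXT, OrderAmount INTEGER);
    INSERT INTO Orders VALUES
        (1, 'Books', 500), (2, 'Books', 500), (3, 'Books', 300), (4, 'Books', 100),
        (5, 'Toys', 900), (6, 'Toys', 700), (7, 'Toys', 200);
""")

top_two = conn.execute("""
    WITH RankedCustomers AS (
        SELECT CustomerID, ProductCategory, OrderAmount,
               DENSE_RANK() OVER (
                   PARTITION BY ProductCategory
                   ORDER BY OrderAmount DESC
               ) AS AmountRank
        FROM Orders
    )
    SELECT CustomerID, ProductCategory, OrderAmount
    FROM RankedCustomers
    WHERE AmountRank <= 2
    ORDER BY ProductCategory, OrderAmount DESC, CustomerID
""").fetchall()

for row in top_two:
    print(row)
# (1, 'Books', 500)
# (2, 'Books', 500)
# (3, 'Books', 300)
# (5, 'Toys', 900)
# (6, 'Toys', 700)
```

Books returns three rows because customers 1 and 2 tie for the top amount and both receive rank 1.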

2.2 Python & Data Analysis

11. How would you use Pandas to handle missing values?

Handling missing data is critical for accurate analysis. Pandas provides several methods:

Detect Missing Values: Use df.isnull().sum() to count NaN values per column.
Remove Rows/Columns:
- df.dropna() drops rows with any NaN. Use subset or thresh to control deletion.
- Example: df.dropna(subset=['Salary'], inplace=True) removes rows with missing salaries.
Fill Missing Data:
- Constant: df.fillna(0) replaces NaN with 0.
- Forward/Backward Fill: df.ffill() propagates the last valid value forward; df.bfill() fills backward from the next valid value. (Recent pandas versions deprecate fillna(method='ffill').)
- Mean/Median: df['Salary'] = df['Salary'].fillna(df['Salary'].median()).
Interpolation: df.interpolate() estimates missing values using existing data trends.

Example:

# Drop rows where both 'Age' and 'Salary' are missing  
# (run this before imputing, or no row will still have a missing 'Age')  
df = df.dropna(subset=['Age', 'Salary'], how='all')  

# Replace the remaining missing ages with the column median  
df['Age'] = df['Age'].fillna(df['Age'].median())

Best Practice: Choose strategies based on data context. For example, avoid mean imputation if outliers skew the data.

12. How do you filter a DataFrame based on multiple conditions?

Use the element-wise operators & (AND) and | (OR), wrapping each condition in parentheses because & and | bind more tightly than comparison operators.
Example: Filter users aged 25–30 with a salary over $50,000:

filtered_df = df[(df['Age'] >= 25) & (df['Age'] <= 30) & (df['Salary'] > 50000)]

Use Case: Filtering e-commerce orders from “2023-01-01” to “2023-12-31” with status “Delivered” (assuming OrderDate is a datetime column):

orders = df[(df['OrderDate'] >= '2023-01-01') &  
            (df['OrderDate'] <= '2023-12-31') &  
            (df['Status'] == 'Delivered')]

13. How would you analyze and visualize a dataset with outliers?

Detect Outliers:
- Statistical Methods:
  - Z-Score: Flag values beyond ±3 standard deviations.
  - IQR (Interquartile Range): With IQR = Q3 − Q1, values below Q1 − 1.5×IQR or above Q3 + 1.5×IQR are outliers.
Visualize Outliers:
- Boxplot: Shows median, quartiles, and outliers.
- Scatterplot: Identify outliers in 2D space.
Handle Outliers:
- Remove: df = df[z_scores.abs() <= 3], where z_scores is a Series of z-scores (the absolute value catches low outliers too).
- Cap: df['Salary'] = df['Salary'].clip(lower=Q1 - 1.5 * IQR, upper=Q3 + 1.5 * IQR).
- Transform: Apply log scaling to reduce skew.

Use Case: In revenue analysis, capping extreme values prevents skewed averages.
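
The IQR rule and capping can be sketched with the standard library alone. The salary figures below are hypothetical (in thousands), and the "inclusive" quantile method is an assumption chosen because it interpolates the same way as the default in NumPy and pandas.

```python
import statistics

salaries = [42, 45, 47, 50, 52, 55, 58, 60, 250]  # hypothetical, in $K

# Quartiles and IQR fences
q1, _median, q3 = statistics.quantiles(salaries, n=4, method="inclusive")
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

outliers = [s for s in salaries if s < lower or s > upper]

# Cap (winsorize) instead of removing
capped = [min(max(s, lower), upper) for s in salaries]

# For comparison: z-score of the largest value
z_max = (max(salaries) - statistics.mean(salaries)) / statistics.stdev(salaries)

print(q1, q3, lower, upper)  # 47.0 58.0 30.5 74.5
print(outliers)              # [250]
print(round(statistics.mean(salaries), 1), round(statistics.mean(capped), 1))
print(z_max < 3)             # True
```

In a sample this small the extreme value inflates the standard deviation enough that its z-score stays below 3, so the ±3 rule misses it while the IQR rule flags it.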

14. What’s the difference between a list, a dictionary, and a NumPy array? What are the use cases for each?

List:
- Structure: Ordered, mutable collection of heterogeneous elements.
- Use Case: Storing sequences of items (e.g., [1, 'apple', True]).
- Operations: Append, slice, iterate.
Dictionary:
- Structure: Key-value pairs for fast lookups (insertion-ordered since Python 3.7).
- Use Case: Storing structured data (e.g., {'name': 'John', 'age': 30}).
- Operations: Access via keys, update values.
NumPy Array:
- Structure: Homogeneous, fixed-size multidimensional array for numerical data.
- Use Case: Math operations (e.g., matrix multiplication, statistical analysis).
- Operations: Vectorized operations (fast computations without loops).
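
A short sketch contrasting the three structures; the variable names and values are illustrative only.

```python
import numpy as np

# List: ordered, mutable, mixed types allowed.
basket = [1, "apple", True]
basket.append(3.5)

# Dictionary: look up and update values by key.
person = {"name": "John", "age": 30}
person["age"] += 1

# NumPy array: one dtype, element-wise arithmetic.
sales = [10, 20, 30]
sales_array = np.array(sales)

list_doubled = sales * 2         # repeats the list
array_doubled = sales_array * 2  # doubles each element

print(list_doubled)   # [10, 20, 30, 10, 20, 30]
print(array_doubled)  # [20 40 60]
```

The same `* 2` expression means repetition for a list and arithmetic for an array, which is the practical reason numerical work uses arrays.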

2.3 Excel & Data Visualization

15. How do you use Pivot Tables to summarize data?

Pivot Tables summarize large datasets by aggregating values (sum, count, average) based on categories.

Select Data: Highlight your dataset.
Insert Pivot Table: Go to Insert > Pivot Table.
Drag Fields:
- Rows/Columns: Categorize data (e.g., Region or Month).
- Values: Choose metrics (e.g., SUM(Sales)).
Filter: Use slicers or field filters to focus on specific segments.
Example: To see total sales per product category, drag Category to Rows and Sales to Values.

Pivot Tables pick up source-data changes once you refresh them, making them ideal for quick summaries like monthly revenue or customer demographics.

16. Describe using VLOOKUP or INDEX/MATCH for data lookup.

VLOOKUP: Searches the first column of a range for a value and returns a corresponding value from another column.
Syntax: =VLOOKUP(lookup_value, table_range, column_index, FALSE)
Limitation: Cannot look left (searches only rightward).

INDEX/MATCH: Combines INDEX (returns a value from a column) and MATCH (finds the position of a value).
Syntax: =INDEX(return_column, MATCH(lookup_value, lookup_column, 0))
Advantage: Works in any direction and handles column insertions/deletions.
Example: Use INDEX/MATCH to fetch prices from a product table dynamically.

17. How would you visualize trends (e.g., monthly sales changes) in Excel?

Use Line Charts or Sparklines:

Line Chart:
1. Select date and sales columns.
2. Go to Insert > Line Chart.
3. Customize axes (date format) and add trendlines.
Sparklines (mini charts in cells):
- Select data > Insert > Sparklines > Line.
- Shows trends at a glance.
Example: Plot monthly sales in a line chart to highlight seasonal peaks. Add data labels for clarity. For dashboards, use sparklines next to KPIs (e.g., monthly growth per region).

18. Best practices for creating visualizations for non-technical stakeholders.

Simplify: Remove clutter (gridlines, excessive colors).
Focus on Key Metrics: Highlight trends or comparisons (e.g., YoY growth).
Use Intuitive Charts: Line/bar charts for trends, pie charts for proportions.
Add Context: Annotate outliers (e.g., “Holiday surge”).
Consistent Design: Use branded colors and clear labels.
Tell a Story: Structure visuals logically (problem → insight → action).

Example: In a sales report, start with a summary KPI (total revenue), followed by a line chart showing monthly trends, and end with a bar chart comparing top products. Test readability with non-technical peers.

2.4 Power BI

19. Key components of Power BI.

Power BI comprises Power Query (data transformation), DAX (formulas for calculations), Data Model (relationships between tables), Visualizations (charts, tables), and Power BI Service (cloud sharing). It integrates data from sources like Excel, SQL, and APIs into interactive reports and dashboards.

20. Difference between dashboards and reports.

Dashboards are single-page, high-level summaries with tiles (static or live).
Reports are multi-page, detailed explorations with interactive visuals.
Dashboards aggregate data from multiple reports; reports connect to a single dataset.

21. Explain dimension tables vs. fact tables and cardinality.

Dimension tables (e.g., Products, Dates) describe entities.
Fact tables (e.g., Sales) store metrics.
Cardinality defines how rows in related tables match: one-to-many (most common), one-to-one, and many-to-many (rare).

22. Differences between `DATESBETWEEN` and `DATESINPERIOD`.

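DATESBETWEEN returns the dates between an explicit start date and end date.
DATESINPERIOD returns a period measured from a start date by a number of intervals (e.g., “last 3 months”), so the window moves as the start date changes.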

Example: DATESINPERIOD(Dates[Date], TODAY(), -3, MONTH)

23. Create calculated columns vs. measures.

Calculated Columns compute row-level values during refresh and are stored in the model (e.g., Profit = Sales - Cost).
Measures are evaluated at query time in the current filter context (e.g., Total Sales = SUM(Sales)). Use measures for aggregations.

24. Explain row-level security and incremental refresh.

RLS restricts data access via roles (e.g., [Region] = USERNAME()).
Incremental Refresh loads only new data (e.g., refresh last 30 days daily).

25. Types of visuals used in projects.

Bar/line charts (trends), tables (detailed data), maps (geospatial), cards (KPIs), slicers (filters). Use scatter plots for correlations.

26. Implement drill-through functionality.

Enable users to click a data point (e.g., a region) and navigate to a detailed page. Configure via Drill Through settings in visuals.

27. Use bookmarks with a scenario example.

Save a filter state (e.g., “Q3 Sales in Europe”). Users click a bookmark to reset visuals to predefined filters.

28. Troubleshoot slow-loading PBIX files and improve performance.

Optimize by removing unused columns, simplifying DAX, using aggregations, and enabling Import Mode over DirectQuery.

29. Manage large datasets and use Power Query.

Use Query Folding (push transformations to the source), data compression, and Aggregations (pre-summarize data).

30. Explain gateways and their types.

On-premises data gateways connect Power BI to local data sources. Types: personal mode (one user) and standard mode (shared; formerly called Enterprise).

31. Walk through creating a sales dashboard from scratch.

Import data → Model relationships → Design visuals (revenue by region, trends) → Add filters → Publish to Power BI Service.

32. Optimize a slow Power BI report.

Use Performance Analyzer, replace calculated columns with measures, limit visuals, and enable Composite Models.

33. Describe 5 chart types and their use cases.

Line: Trends over time.
Bar: Category comparisons.
Pie: Proportions.
Scatter: Correlations.
Matrix: Hierarchical data (e.g., sales by region and product).

2.5 Scenario-Based Analysis

34. Analyze customer feedback data to identify complaints/trends.

Aggregate feedback from surveys, social media, and support tickets.
Clean data (remove duplicates, handle missing values) and perform text analysis (sentiment analysis, keyword extraction) to identify themes like “shipping delays” or “defects.”
Use tools like Python (NLTK, spaCy) or Power BI’s text analytics.
Visualize trends with word clouds, bar charts (common complaints), and time-series graphs to spot spikes.
Prioritize high-frequency issues (e.g., 30% complaints about packaging) and recommend fixes (improve packaging quality).
Continuously monitor trends via dashboards to track resolution effectiveness.
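
As a rough first pass before any real text analytics, complaints can be bucketed by keyword. The sketch below is plain substring matching, not sentiment analysis; the comments and theme keywords are hypothetical.

```python
from collections import Counter

feedback = [
    "Shipping delays again, my order arrived two weeks late",
    "The box was damaged and the packaging felt cheap",
    "Great product but the packaging was torn",
    "Item had a defect on arrival",
    "Late delivery, very slow shipping",
]

# Hypothetical theme dictionary; a real project would refine this iteratively.
themes = {
    "shipping": ("shipping", "late", "delay", "slow"),
    "packaging": ("packaging", "box", "torn"),
    "defects": ("defect", "broken"),
}

theme_counts = Counter()
for comment in feedback:
    text = comment.lower()
    for theme, keywords in themes.items():
        if any(keyword in text for keyword in keywords):
            theme_counts[theme] += 1  # count each comment once per theme

shares = {theme: count / len(feedback) for theme, count in theme_counts.items()}

print(theme_counts.most_common())
print(shares)
```

Substring matching is crude (for example, "late" would also match "plate"), so treat the counts as a starting point for prioritization rather than a final measure.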

35. Present sales insights to adjust the marketing strategy.

Analyze sales data (revenue, product/region performance) to identify trends.
Highlight underperforming products (e.g., 20% sales drop) or regions (e.g., Midwest lags).
Use visualizations like line charts (monthly trends), heatmaps (geographic performance), and cohort analysis (customer retention).

Recommend actions: reallocate ad spend to high-growth channels (social media), launch targeted promotions (discounts for stagnant regions), or personalize campaigns (email offers for loyal customers).
Share insights via dashboards with KPIs (conversion rates, CLV) to align marketing tactics with data-driven opportunities (e.g., double down on top-selling products).

36. Analyze product returns to reduce return rates.

Calculate return rates per product/SKU and categorize reasons (defects, wrong size).
Use SQL/Power BI to link returns to production batches, shipping carriers, or customer demographics. Identify root causes: 40% defects from Supplier A, 25% sizing issues in apparel.

Recommend actions: improve quality checks, update sizing guides, or partner with reliable suppliers.
Test solutions (A/B test packaging changes) and track post-implementation return rates.

Example: After redesigning sizing charts, reduce size-related returns by 15%. Share findings via dashboards to align teams (manufacturing, customer service) on reducing costs and improving satisfaction.
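
The first step, return rate and leading reason per SKU, might look like this in plain Python. The SKUs, volumes, and reasons are hypothetical sample data.

```python
from collections import Counter, defaultdict

units_sold = {"SKU-1": 200, "SKU-2": 50}

# One (sku, reason) pair per returned unit.
returns = [
    ("SKU-1", "wrong size"), ("SKU-1", "wrong size"),
    ("SKU-1", "wrong size"), ("SKU-1", "defect"),
    ("SKU-2", "defect"), ("SKU-2", "defect"), ("SKU-2", "defect"),
    ("SKU-2", "defect"), ("SKU-2", "wrong size"),
]

reasons_by_sku = defaultdict(Counter)
for sku, reason in returns:
    reasons_by_sku[sku][reason] += 1

# Return rate = returned units / units sold
return_rates = {
    sku: sum(reasons_by_sku[sku].values()) / sold
    for sku, sold in units_sold.items()
}
top_reason = {
    sku: counts.most_common(1)[0][0] for sku, counts in reasons_by_sku.items()
}

print(return_rates)  # {'SKU-1': 0.02, 'SKU-2': 0.1}
print(top_reason)    # {'SKU-1': 'wrong size', 'SKU-2': 'defect'}
```

SKU-2 has fewer sales but a return rate five times higher, which is why rates rather than raw counts should drive prioritization.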

2.6 Soft Skills & Communication

37. Explain a data insight to a non-technical team member.

Simplify technical jargon. Use analogies and visuals (charts, graphs) to convey the insight.

Example: Instead of “correlation coefficient,” say, “Sales rise 20% when social media ads run.”
Focus on the business impact: “Targeting young adults on Instagram could boost Q3 revenue by $50K.”
Ask for feedback to ensure clarity.

38. Ensure accuracy/clarity under pressure.

Break tasks into steps. Double-check critical metrics (e.g., SUM vs. AVG). Use tools like Excel formulas or DAX for consistency.

Example: Validate sales totals with a peer before presenting. Document assumptions (e.g., “Data excludes returns”) to avoid misinterpretation. Stay organized with checklists.

39. Prioritize tasks across multiple projects/deadlines.

Use the Eisenhower Matrix:

Urgent/Important: Fix broken dashboards.
Important/Not Urgent: Analyze trends for a quarterly report.
Urgent/Not Important (Delegate): Data entry tasks.
Communicate timelines with stakeholders. Tools like Trello or Outlook calendars help track deadlines.

40. Example of managing a challenging task with tight deadlines.

During a product launch, I had 48 hours to analyze customer survey data. I split the work:

Cleaned data (2 hrs).
Identified top complaints using word clouds (3 hrs).
Created a summary slide (1 hr).
Result: Delivered actionable insights (e.g., “40% disliked packaging”) on time, leading to a redesign.
