DEV Community

Mariam Turnesh


Introduction to MS Excel for Data Analytics

Introduction

If you're stepping into data analytics, Excel is a great place to start. It's one of the most accessible tools for working with data, and it's powerful enough for real analysis with no coding required. Whether you're tracking sales, analyzing trends, or building reports, Excel has you covered.

This guide uses real examples to make learning fun and relatable. By the end, you'll know how to clean data, write formulas, create pivot tables and extract insights like a pro.

Let's dive in!


Part 1: Understanding the Excel Interface

Before we jump into formulas and functions, let's get comfortable with Excel's layout.

Key components you need to know:

  • Ribbon: The toolbar at the top with tabs (Home, Insert, Formulas, Data, etc.) - this is where all Excel's features live
  • Formula Bar: Shows the contents of the active cell - especially useful when viewing formulas
  • Column Headers: The letters (A, B, C...) that identify columns
  • Row Headers: The numbers (1, 2, 3...) that identify rows
  • Active Cell: The currently selected cell with a green border (like A1 in the image)
  • Worksheet Tabs: Switch between different sheets in your workbook (Sheet1, Sheet2, etc.)

Pro tip: When someone says "cell D5", they mean the intersection of column D and row 5. This naming system is how we reference data in formulas.


Part 2: Data Cleaning & Sorting/Filtering

1. How to Find Duplicates

You might want to see duplicates before deleting them to be sure what you’re removing.

Highlight Duplicates Using Conditional Formatting

  • Select the cells you want to check.

  • Click Home → Conditional Formatting → Highlight Cells Rules → Duplicate Values.

  • Pick a color style and click OK.

  • Now all duplicates will be colored so you can see them easily.

✅ This doesn’t delete anything, it just shows duplicates so you can inspect them.

2. How to Remove Duplicates

If you’re sure you want to delete the extra copies:

Use the Remove Duplicates Tool

  • Select your data range (just click anywhere inside your table).

  • Go to Data → Remove Duplicates.

  • A box will pop up. Excel will usually guess your column headers for you.

  • Make sure the columns you want to check for duplicates are ticked.

  • Click OK.

Excel will remove every row whose values match an earlier row in all the ticked columns, keeping only the first occurrence. It will also tell you how many duplicates it removed and how many unique values remain.
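If you're curious what "keep the first occurrence" means in practice, here's a minimal Python sketch of the same idea. It's purely optional and not how Excel works internally; the `remove_duplicates` helper and the sample rows are made up for illustration, and the comparison is an exact match, which may differ from Excel's own matching rules.

```python
def remove_duplicates(rows, key_columns):
    """Keep the first row for each combination of values in key_columns."""
    seen = set()
    unique_rows = []
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows


# Hypothetical sample data, not taken from the article's workbook.
orders = [
    {"customer": "Ava", "item": "Pen"},
    {"customer": "Lily", "item": "Ink"},
    {"customer": "Ava", "item": "Pen"},
]

cleaned = remove_duplicates(orders, ["customer", "item"])
removed = len(orders) - len(cleaned)
print(f"{removed} duplicate removed; {len(cleaned)} unique rows remain")
```

Passing fewer column names to `key_columns` is the equivalent of unticking columns in the Remove Duplicates box.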

3. Find & Replace

Find and Replace is a feature in Excel that helps you quickly search for specific text or numbers and change them across your worksheet. It’s especially useful when cleaning data or fixing repeated mistakes.

Use Ctrl + F to find data and Ctrl + H to replace it. This feature saves time, fixes repeated errors, and helps clean data, but you should be careful with Replace All since it changes everything at once.

4. Sorting and Filtering

Sorting and Filtering help you organize data and focus only on what matters, without changing the values themselves: sorting reorders the rows, and filtering temporarily hides the ones you don’t need.

Sorting

Sorting in Excel helps you arrange your data in a meaningful order so it’s easier to understand.

Types of Sorting

Text: A → Z or Z → A

Numbers: Smallest → Largest or Largest → Smallest

Dates: Oldest → Newest or Newest → Oldest

How to Sort Data

  • Click any cell in the column you want to sort by.

  • Go to Data → Sort A–Z or Sort Z–A.

  • Excel will reorder the rows based on the selected column.

Filtering

Filtering in Excel lets you show only the data you want to see while hiding the rest. It helps you focus on specific information without deleting anything.

How to Apply a Filter

  • Click anywhere inside your table.

  • Go to Data → Filter.

  • Click the dropdown arrow in a column header.

  • Choose the values you want to see and click OK.
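For readers who like to see the logic written out, sorting and filtering can be sketched in a few lines of Python. This is only an illustration with made-up rows, not something Excel requires.

```python
# Hypothetical rows, made up for illustration.
sales = [
    {"rep": "Sally", "team": "North", "volume": 18},
    {"rep": "Timothy", "team": "East", "volume": 31},
    {"rep": "Julia", "team": "West", "volume": 42},
    {"rep": "John", "team": "North", "volume": 57},
]

# Sort: largest volume first (like Largest → Smallest on a number column).
by_volume = sorted(sales, key=lambda row: row["volume"], reverse=True)

# Filter: show only the North team; the original list is left untouched.
north_only = [row for row in sales if row["team"] == "North"]

print([row["rep"] for row in by_volume])
print([row["rep"] for row in north_only])
```

Notice that `sales` still holds all four rows after filtering, just as Excel keeps hidden rows in the sheet.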

5. Freeze Panes in Excel

Freeze Panes is an Excel feature that keeps certain rows or columns visible while you scroll through your worksheet. This is helpful when working with large tables, so you don’t lose sight of headers.

How to Use Freeze Panes

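  • For a custom freeze, click the cell just below the rows and just to the right of the columns you want to keep visible. (For Freeze Top Row or Freeze First Column, the selected cell doesn’t matter.)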

  • Go to View → Freeze Panes.

Choose:

Freeze Top Row – keeps column headers visible

Freeze First Column – keeps row labels visible

Freeze Panes – freezes every row above and every column to the left of the selected cell

Part 3: Basic Functions in Excel

In Excel, functions are built-in formulas that help you perform calculations quickly and correctly. Instead of doing math by hand, you tell Excel what to calculate.

Every formula starts with an equals sign (=), followed by the function name and its arguments in parentheses.

Let's look at the most important ones you'll use every day.

The SUM Function

Need to add up numbers quickly? SUM is your go-to function.

What's happening here:

  • Formula: =SUM(D2:D10)
  • This adds all values from cell D2 through D10
  • Result: 2489

Real-world use: Totaling sales, expenses, inventory quantities, hours worked, etc.


The AVERAGE Function

Want to find the mean of a group of numbers? AVERAGE does the math for you.

What's happening here:

  • Formula: =AVERAGE(B2:E2)
  • This calculates the average of values across columns B through E in row 2
  • Result: 4 (average number of pokeballs for trainer Iva)

Why it matters: Averages help you understand typical values in your data - average sales per month, average customer rating, average response time, etc.


The MAX and MIN Functions

These functions find the highest and lowest values in a range.

What's happening:

  • Formula: =MAX(D2:D21)
  • Finds the highest value in the Total stats column
  • Result: 525 (Cloyster has the highest stats)

What's happening:

  • Formula: =MIN(D2:D21)
  • Finds the lowest value in the Total stats column
  • Result: 200 (Magikarp has the lowest stats)

Real-world applications:

  • MAX: Find your best-selling product, highest revenue month, top performer
  • MIN: Identify lowest inventory levels, minimum order quantities, slowest sales day

The MEDIAN Function

While AVERAGE can be skewed by extreme values, MEDIAN finds the true middle value.

What's happening:

  • Formula: =MEDIAN(A2:G2)
  • Finds the middle value when all numbers are sorted
  • Result: 2

When to use MEDIAN vs AVERAGE: If you have outliers (extremely high or low values), MEDIAN gives you a better sense of the "typical" value. For example, median income is often more useful than average income because a few billionaires don't skew it.
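To see the outlier effect with actual numbers, here's a small Python sketch using the standard `statistics` module. The income figures are invented for illustration.

```python
from statistics import mean, median

# Hypothetical incomes: four typical values and one extreme outlier.
incomes = [30_000, 35_000, 40_000, 45_000, 1_000_000]

average_income = mean(incomes)    # pulled up by the outlier
median_income = median(incomes)   # the middle value once sorted

print(average_income)
print(median_income)
```

The average lands at 230,000, which is far above what four of the five people earn, while the median stays at 40,000.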


Logical Functions

Logical functions help Excel make decisions based on conditions you set. They're like teaching Excel to think: "If this is true, do that. Otherwise, do something else."

The IF Function

The IF function is your decision-maker. It checks a condition and returns one value if true, another if false.


What's happening here:

  • Formula: =IF(C5>80,TRUE)
  • Logic: If the Score (in column C) is greater than 80, display TRUE. The third argument (the value to show when the condition is false) is left out, so Excel displays FALSE by default
  • Result:
    • Ava Thompson (85) → TRUE
    • Lily Smith (79) → FALSE
    • Ethan Johnson (82) → TRUE
    • And so on...

Key Insight: This simple IF formula evaluates each student's score and returns TRUE for passing scores (>80) and FALSE for those who didn't meet the threshold.
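Here's a rough Python sketch of the same decision, using the three scores listed above. The `excel_if` helper is a made-up name for illustration; its default of `False` mirrors what Excel returns when the third argument is omitted.

```python
def excel_if(condition, value_if_true, value_if_false=False):
    """Return one value when the condition holds, another when it doesn't."""
    return value_if_true if condition else value_if_false


scores = {"Ava Thompson": 85, "Lily Smith": 79, "Ethan Johnson": 82}

# Same test as =IF(C5>80,TRUE), applied to each student.
passed = {name: excel_if(score > 80, True) for name, score in scores.items()}
print(passed)
```

In a real sheet you'd usually supply both outcomes, for example "Pass" and "Fail", so the result reads more clearly.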


The AND Function

AND checks if ALL conditions are true. Every single condition must pass.

What's happening here:

  • Formula: =AND(B2>=60, C2>=90)
  • Logic: Returns TRUE only if BOTH Score 1 is greater than or equal to 60 AND Score 2 is greater than or equal to 90
  • Result:
    • Richard (53, 80) → FALSE (both conditions fail)
    • Jennifer (60, 91) → TRUE (both conditions met)
    • James (58, 75) → FALSE (both conditions fail)
    • Lisa (79, 51) → FALSE (Score 2 fails)
    • Sharon (41, 33) → FALSE (both conditions fail)

Key Insight: ALL conditions must be satisfied. If even one condition is false, the entire AND function returns FALSE.


The OR Function

OR checks if AT LEAST ONE condition is true. Only one needs to pass.


What's happening here:

  • Formula: =OR(B8>30, C8="North")
  • Logic: Returns TRUE if EITHER Sales Volume is greater than 30 OR Team equals "North"
  • Result:
    • Malkowitz, Sally (18, North) → TRUE (Team is North)
    • Wilson, Timothy (31, East) → TRUE (Sales > 30)
    • Birch, Julia (42, West) → TRUE (Sales > 30)
    • Fairclough, John (57, North) → TRUE (both conditions met!)
    • Johnson, Mark (30, East) → FALSE (Sales not > 30 AND Team not North)
    • Williams, Eric (16, South) → FALSE (neither condition met)

Key Insight: Only ONE condition needs to be true. If any condition passes, the entire OR function returns TRUE.
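The difference between AND and OR is easy to check in code. Below is a minimal Python sketch that reuses a few rows from the examples above; `excel_and` and `excel_or` are hypothetical helper names, built on Python's `all` and `any`.

```python
def excel_and(*conditions):
    """TRUE only when every condition is true."""
    return all(conditions)


def excel_or(*conditions):
    """TRUE when at least one condition is true."""
    return any(conditions)


# =AND(B2>=60, C2>=90)
jennifer = excel_and(60 >= 60, 91 >= 90)   # both conditions met
lisa = excel_and(79 >= 60, 51 >= 90)       # Score 2 fails

# =OR(B8>30, C8="North")
sally = excel_or(18 > 30, "North" == "North")   # Team is North
mark = excel_or(30 > 30, "East" == "North")     # neither condition met

print(jennifer, lisa, sally, mark)
```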


The NOT Function

NOT reverses the logic. It turns TRUE into FALSE and FALSE into TRUE.


What's happening here:

  • Formula: =NOT(OR(B5="green", B5="red"))
  • Logic: First checks if Color is green OR red, then reverses that result
  • Result:
    • Red → FALSE (NOT green or red = NOT TRUE = FALSE)
    • Blue → TRUE (NOT green or red = NOT FALSE = TRUE)
    • Green → FALSE (NOT green or red = NOT TRUE = FALSE)

Key Insight: NOT flips the logic. It's useful when you want to find everything EXCEPT certain values.
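As a quick sketch of the "everything EXCEPT" idea, here's the same check in Python. The function name is made up for illustration, and the text is lowercased first because Excel's = comparison ignores case for text.

```python
def not_green_or_red(color):
    """Mirror of =NOT(OR(B5="green", B5="red"))."""
    lowered = color.lower()  # Excel's = treats "Red" and "red" as equal
    return not (lowered == "green" or lowered == "red")


colors = ["Red", "Blue", "Green"]
results = [not_green_or_red(color) for color in colors]
print(results)
```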


Text Functions

Text functions help you manipulate, clean, and standardize text data. Essential for messy real-world datasets!

The TRIM Function

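TRIM removes the extra spaces that mess up your data: it strips leading and trailing spaces and collapses repeated spaces between words into a single space.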


What's happening here:

  • Formula: =TRIM(A2)
  • Original data has inconsistent spacing (extra spaces before, after, or between names)
  • Result: Clean, properly formatted names
    • " Ronnie Anderson" → "Ronnie Anderson"
    • "Tom Boone" → "Tom Boone"
    • "Sally Brooke " → "Sally Brooke"
    • "Jeremy Hill" → "Jeremy Hill"

Why this matters: Excel treats " Ronnie Anderson" and "Ronnie Anderson" as different values. When counting or filtering, extra spaces create duplicate entries and wrong totals. TRIM fixes this instantly.
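If it helps to see exactly what gets removed, here's a small Python sketch that behaves like TRIM for ordinary spaces. The `excel_trim` name and the sample strings are assumptions for illustration.

```python
def excel_trim(text):
    """Strip leading/trailing spaces and collapse runs of spaces to one."""
    return " ".join(word for word in text.split(" ") if word)


# Hypothetical messy inputs with extra spaces before, after, and between names.
messy = ["  Ronnie Anderson", "Tom    Boone", "Sally Brooke  "]
clean = [excel_trim(name) for name in messy]
print(clean)
```

Like TRIM, this only handles the regular space character; other whitespace, such as non-breaking spaces pasted from web pages, needs separate cleanup.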


The UPPER, LOWER, and PROPER Functions

The Excel functions UPPER, LOWER and PROPER change text case:

  • UPPER converts all letters to caps
  • LOWER converts all letters to lowercase
  • PROPER capitalizes only the first letter of each word (useful for names)

UPPER Function

What's happening here:

  • Formula: =UPPER(A4)
  • Converts all text to UPPERCASE
  • Examples from the data:
    • "william" → "WILLIAM"
    • "jody" → "JODY"
    • "sophie" → "SOPHIE"
    • "gurpreet" → "GURPREET"
    • "maximus" → "MAXIMUS"

LOWER Function

  • Formula: =LOWER(A4)
  • Converts all text to lowercase
  • Examples:
    • "william" → "william"
    • "JODY" → "jody"
    • "SOPHIE" → "sophie"

PROPER Function

  • Formula: =PROPER(A4)
  • Capitalizes the first letter of each word (Title Case)
  • Examples:
    • "william" → "William"
    • "jody" → "Jody"
    • "sophie" → "Sophie"
    • "gurpreet" → "Gurpreet"
    • "maximus" → "Maximus"

Key Insight: These three functions give you complete control over text capitalization. UPPER for acronyms and codes, LOWER for email addresses and PROPER for names and titles.
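Python strings have close counterparts to all three, which makes for a quick side-by-side sketch. Treat `str.title()` as an approximation of PROPER rather than an exact match; the sample names are illustrative.

```python
names = ["william", "JODY", "sophie smith"]

upper_names = [name.upper() for name in names]    # like UPPER
lower_names = [name.lower() for name in names]    # like LOWER
proper_names = [name.title() for name in names]   # roughly like PROPER

print(upper_names)
print(lower_names)
print(proper_names)
```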

The CONCATENATE Function

Combine text from multiple cells into one.


What's happening here:

  • Formula: =CONCATENATE(B5,", ",C5,", ",D5) or using &: =B5&", "&C5&", "&D5
  • Combines multiple values with separators
  • Examples:
    • Values A, B, C → "A, B, C"
    • Numbers 1, 2, 3 → "1, 2, 3"
    • "apples", "oranges", "pears" → "apples, oranges, pears"
    • "red", "blue", "green" → "red, blue, green"
    • Title "Mr.", First "Bob", Last "Smith" → "Mr. Bob Smith" (this one uses a single space " " as the separator instead of ", ")

Pro tip: The & symbol is faster to type and does the same thing!

Real-world use: Create full names from first/last names, combine address fields (street + city + state), build product descriptions, generate email addresses (first.last@company.com)
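The same joining idea, sketched in Python for comparison. The values come from the examples above, except the email address, whose name and domain are placeholders following the first.last@company.com pattern.

```python
fruits = ["apples", "oranges", "pears"]

# Like =CONCATENATE(B5,", ",C5,", ",D5): values joined with ", ".
joined = ", ".join(fruits)

# Like the & operator with a single space as the separator.
full_name = "Mr." + " " + "Bob" + " " + "Smith"

# Placeholder name and domain, for illustration only.
email = "bob" + "." + "smith" + "@company.com"

print(joined)
print(full_name)
print(email)
```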

LEFT Function

The LEFT function returns a chosen number of characters from the start (left side) of a text value.


What's happening here:

  • Formula: =LEFT(B5, 3)
  • Extracts the first 3 characters from the left
  • Examples:
    • "January" → "Jan"
    • "February" → "Feb"
    • "March" → "Mar"
    • "April" → "Apr"
    • "September" → "Sep"

RIGHT Function

The RIGHT function returns a chosen number of characters from the end (right side) of a text value.
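LEFT and RIGHT behave much like string slicing in programming languages. Here's a minimal Python sketch; the `left` and `right` helpers are hypothetical names, and the sample values are the ones used in this section's examples.

```python
def left(text, num_chars=1):
    """Return the first num_chars characters of text."""
    return text[:num_chars]


def right(text, num_chars=1):
    """Return the last num_chars characters of text."""
    # text[-0:] would return the whole string, so handle zero explicitly.
    return text[-num_chars:] if num_chars > 0 else ""


print(left("January", 3))
print(right("Portland, OR", 2))
print(right("+1 303-512-4271", 12))
```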


What's happening here:

  • Formula: =RIGHT(B5, 1) or =RIGHT(B5, 2) or =RIGHT(B5, 3)
  • Extracts characters from the right side
  • Examples:
    • "ABC" with RIGHT(B5,1) → "C"
    • "ABC" with RIGHT(B5,2) → "BC"
    • "Portland, OR" with RIGHT(B8,2) → "OR"
    • "google.com" with RIGHT(B10,3) → "com"
    • "+1 303-512-4271" with RIGHT(B12,12) → "303-512-4271"
    • "2000 x 3000" with RIGHT(B14,4) → "3000"

Date & Time Functions

Excel stores dates as numbers (January 1, 1900 = 1), which means you can do math with dates! These functions make working with dates much easier.
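To make the "dates are numbers" idea concrete, here's a small Python sketch that converts between a serial number and a calendar date. It assumes Excel's default 1900 date system and is only valid for dates from 1 March 1900 onward, because Excel counts a 29 February 1900 that never existed; the function names are made up for illustration.

```python
from datetime import date, timedelta

# Offset that lines up with Excel's 1900 date system for dates
# on or after 1 March 1900 (an assumption of this sketch).
EXCEL_EPOCH = date(1899, 12, 30)


def serial_to_date(serial):
    """Convert an Excel serial number to a calendar date."""
    return EXCEL_EPOCH + timedelta(days=serial)


def date_to_serial(day):
    """Convert a calendar date to an Excel serial number."""
    return (day - EXCEL_EPOCH).days


print(date_to_serial(date(2021, 1, 1)))
print(serial_to_date(44347))
```

Because a date is just a count of days, adding 7 to a date cell moves it forward one week.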

The TODAY Function

Get the current date.


What's happening here:

  • Formula: =TODAY()
  • Returns the current date only (no time component)
  • Note: TODAY (like NOW) takes no arguments, just empty parentheses
  • Examples of using TODAY():
    • =TODAY() → 31-May-21 (current date)
    • =TODAY()-7 → 24-May-21 (one week in past)
    • =TODAY()+7 → 7-Jun-21 (one week in future)
    • =TODAY()+90 → 29-Aug-21 (90 days from today)

The NOW Function

  • Formula: =NOW()
  • Returns the current date AND time
  • Examples of using NOW():
    • =NOW() → 31-May-21 17:39 (current date and time)
    • =NOW()-7 → 24-May-21 17:39 (last week same time)
    • =NOW()+7 → 7-Jun-21 17:39 (next week same time)

Important: These functions update automatically every time you open the file or recalculate.
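The date arithmetic in the TODAY examples can be double-checked with a short Python sketch. The date is pinned to 31 May 2021 so the results match the examples above; in live code, `date.today()` would play the role of TODAY().

```python
from datetime import date, timedelta

# Fixed to the article's example date; date.today() gives the live value.
today = date(2021, 5, 31)

one_week_ago = today - timedelta(days=7)     # like =TODAY()-7
one_week_ahead = today + timedelta(days=7)   # like =TODAY()+7
in_90_days = today + timedelta(days=90)      # like =TODAY()+90

print(one_week_ago, one_week_ahead, in_90_days)
```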

The YEAR Function


What's happening here:

  • Formula: =YEAR(B5)
  • Extracts the year from any date
  • Examples:
    • 15-Apr-1912 → 1912
    • 28-Jun-1914 → 1914
    • 29-Oct-1929 → 1929

The MONTH Function

  • Formula: =MONTH(B5)
  • Extracts the month as a number (1-12)
  • Examples:
    • 15-Apr-1912 → 4 (April)
    • 21-May-1927 → 5 (May)
    • 29-Oct-1929 → 10 (October)

The DATE Function

Create a date from separate year, month and day values.


What's happening here:

  • Formula: =DATE(B5, C5, D5)
  • Combines separate Year, Month, and Day columns into a complete date
  • Examples:
    • Year: 1995, Month: 1, Day: 1 → 1-Jan-1995
    • Year: 2010, Month: 1, Day: 1 → 1-Jan-2010
    • Year: 2021, Month: 1, Day: 1 → 1-Jan-2021
    • Year: 2021, Month: 2, Day: 1 → 1-Feb-2021

Key Insight: This is incredibly useful when your data has dates split across multiple columns (common in imported data or forms).
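Splitting a date into parts and putting it back together looks like this in Python, as a rough parallel to YEAR, MONTH and DATE. The values are taken from the examples above.

```python
from datetime import date

crash_day = date(1929, 10, 29)

year_part = crash_day.year     # like =YEAR(B5)
month_part = crash_day.month   # like =MONTH(B5)

# Like =DATE(B5, C5, D5): build a date from separate year, month, day values.
rebuilt = date(2021, 2, 1)

print(year_part, month_part, rebuilt.isoformat())
```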


Part 4: Pivot Tables

Pivot tables are one of Excel's most powerful features for analyzing large datasets. They let you summarize, analyze, and explore thousands of rows of data in seconds without writing a single formula.

Think of it this way: If you have a list of 1,000 sales transactions, a pivot table can instantly tell you:

  • Total sales by salesperson
  • Average order value by month
  • Sales by product category and region
  • Year-over-year comparisons

All without a single formula!


Creating Your First Pivot Table

Step 1: Prepare Your Data

Your source data should be organized in a table format with:

  • Headers in the first row: Employee Name, Department, Salary, Joining Year, Performance Rating
  • No blank rows or columns
  • Consistent data types in each column
  • One record per row

Step 2: Insert the Pivot Table

  1. Click anywhere in your data
  2. Go to Insert tab → PivotTable (top-left corner)
  3. Excel will automatically detect your data range
  4. In the Create PivotTable dialog:
    • Table/Range: Verify the range (e.g., Sheet3!$A$1:$E$10)
    • Choose where to place: Select "New Worksheet" (recommended for beginners)
    • Click OK

Step 3: Build Your Report

You'll see a blank pivot table and the PivotTable Fields pane on the right with four areas:

The Four Key Areas:

  1. FILTERS (top-left box)

    • Add fields here to filter the entire report
    • Example: Add "Joining Year" to filter data by year
  2. COLUMNS (top-right box)

    • Fields placed here become column headers
    • Example: Add "Department" to see departments across columns
  3. ROWS (bottom-left box)

    • Fields placed here become row labels
    • Example: Add "Employee Name" to list employees down the side
  4. VALUES (bottom-right box)

    • The actual numbers you want to calculate
    • Example: Add "Salary" to calculate sum or average

Step 4: Drag Fields to Build Your Analysis

Example: Summarize Salaries by Department

  1. Drag Department → ROWS area
  2. Drag Employee Name → ROWS area (below Department for grouping)
  3. Drag Salary → VALUES area
  4. Drag Joining Year → VALUES area
  5. Drag Performance Rating → VALUES area

Key Insights from this Pivot Table:

  • IT department has the highest total salary ($226,600)
  • Finance has the highest individual salary (Chris Taylor at $89,300)
  • You can see totals by department AND individual employees
  • Grand total shows company-wide statistics
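Under the hood, a pivot table is doing a "group, then summarize" step. Here's a minimal Python sketch of totals by department with a grand total; the employee names and salaries are invented for illustration and are not the figures from the pivot table above.

```python
from collections import defaultdict

# Hypothetical employee records, made up for illustration.
employees = [
    {"name": "Ana", "department": "IT", "salary": 70_000},
    {"name": "Ben", "department": "IT", "salary": 65_000},
    {"name": "Cleo", "department": "Finance", "salary": 80_000},
    {"name": "Dev", "department": "HR", "salary": 50_000},
]

# ROWS = department, VALUES = sum of salary.
totals_by_department = defaultdict(int)
for employee in employees:
    totals_by_department[employee["department"]] += employee["salary"]

grand_total = sum(totals_by_department.values())

for department, total in sorted(totals_by_department.items()):
    print(f"{department}: {total:,}")
print(f"Grand total: {grand_total:,}")
```

Dragging a different field into ROWS corresponds to changing the key used for grouping.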

Pivot Charts

Pivot charts are dynamic charts connected to your pivot table. When you update the pivot table, the chart updates automatically.

Creating a Pivot Chart

Method 1: From Existing Pivot Table

  1. Click anywhere in your pivot table
  2. Go to Insert tab → PivotChart (in the Charts group)
  3. Choose your chart type

Method 2: Create Both at Once

  1. Select your source data
  2. Insert tab → PivotChart dropdown → PivotChart & PivotTable
  3. Excel creates both simultaneously

Choosing the Right Chart Type

Slicers

Slicers are visual filter buttons that make filtering pivot tables intuitive and user-friendly. Instead of dropdown menus, you get clickable buttons.

What are Slicers?

In the example above:

  • Left slicer: Filter by Product (Apples, Bananas, Cherries, Oranges)
  • Right slicer: Filter by Reseller (John, Mike, Pete, Sally)

The pivot table shows sales data that updates instantly when you click slicer buttons.


Adding Slicers to Your Pivot Table

Steps:

  1. Click anywhere in your pivot table
  2. PivotTable Tools → Analyze tab → Insert Slicer (in the Filter group)
  3. In the Insert Slicers dialog, check the fields you want as slicers:

    • ☑ Product
    • ☑ Reseller
    • ☐ Month
    • ☐ Sales
  4. Click OK

Slicers appear as floating windows you can move and resize.


Conclusion

Excel is more than a spreadsheet; it’s a powerful tool for turning raw data into clear insights. With logical, text and date functions, you can clean data, automate decisions and analyze trends efficiently. Pivot tables, charts, and slicers take this further by helping you summarize, visualize, and explore data with ease.

You don’t need to know every Excel feature to be effective. What matters is understanding which tool to use and when. With these fundamentals, you’re already equipped to solve most real-world data problems confidently.

Keep practicing, experiment with your own datasets, and build from here. This foundation sets you up to work smarter, faster, and more confidently with data. 🚀
