Study Guide: TECH **Python for Data Science: Customizing Charts (Labels, Legends, Color Palettes, Annotations)**
Source: https://www.fatskills.com/introdution-to-engineering/chapter/tech-python-for-data-science-customizing-charts-labels-legends-color-palettes-annotations

By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.

A Hyper-Practical, Zero-Fluff Guide


1. What This Is & Why It Matters

You’re building a dashboard for a client, and your raw matplotlib/seaborn plots look like they were generated by a 2005 Excel macro. The labels are unreadable, the legend is misplaced, the colors are clashing, and there’s no clear takeaway. This is what happens when you ignore chart customization.

In production, bad charts = bad decisions. Stakeholders won’t trust your analysis if they can’t read it. Executives won’t act on insights if they’re buried in clutter. And if you’re preparing for a data science or analytics certification (such as PL-300 for Power BI), expect to be tested on what makes a visual meet professional standards.

Real-world scenario:
You’re a data scientist at a retail company. Your team just ran an A/B test on two ad campaigns. You plot the results, but:
- The y-axis label says "conversion_rate" instead of "Conversion Rate (%)".
- The legend overlaps the data.
- The colors are indistinguishable for colorblind users.
- There’s no annotation highlighting the statistically significant difference.

Your manager rejects the report. You just wasted 3 hours of work because you didn’t customize the chart.

This guide will teach you how to:
✅ Make charts readable (labels, titles, axis formatting).
✅ Guide attention (legends, annotations, color palettes).
✅ Avoid common pitfalls (overlapping text, poor color choices).
✅ Pass certification exams (PL-300, Google Data Analytics, etc.).


2. Core Concepts & Components


1. Labels & Titles

  • Definition: Text that describes what the chart represents (title, axis labels, tick labels).
  • Production insight: If your axis labels are unclear, stakeholders will misinterpret the data. Always use human-readable units (e.g., "Revenue (USD)" instead of "rev").

2. Legends

  • Definition: A key that explains what each color/shape in the chart represents.
  • Production insight: Legends that overlap data are useless. Always position them outside the plot or in a clear, non-obstructive area.

3. Color Palettes

  • Definition: A predefined set of colors used in the chart.
  • Production insight: matplotlib’s default color cycle ("tab10") is not designed for viewers with color vision deficiency. Use seaborn’s "colorblind" palette for categorical data and "viridis" for sequential data.
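
To see what these palettes actually contain, the short sketch below prints their hex codes. It assumes seaborn is installed; the variable names `categorical` and `sequential` are illustrative only.

```python
import seaborn as sns

# Categorical palette intended to stay distinguishable under common
# forms of color vision deficiency (one color per product line, say).
categorical = sns.color_palette("colorblind", n_colors=3)

# Sequential palette: six evenly spaced samples from the viridis colormap,
# suited to ordered values (low to high).
sequential = sns.color_palette("viridis", n_colors=6)

print(categorical.as_hex())
print(sequential.as_hex())
```

Printing the hex codes is a quick way to copy the same colors into slides or CSS so that charts and surrounding material match.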

4. Annotations

  • Definition: Text or arrows added to highlight key insights (e.g., "Statistically significant (p < 0.05)").
  • Production insight: Annotations guide the viewer’s attention. Without them, important trends may go unnoticed.

5. Axis Formatting

  • Definition: Customizing axis scales (log vs. linear), tick marks, and number formatting (e.g., "$10K" instead of 10000).
  • Production insight: Log scales are great for wide-ranging data (e.g., stock prices), but linear scales are better for most business metrics.
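
As a rough illustration of both ideas, the sketch below formats a linear revenue axis as "$10K"-style ticks and puts a wide-ranging series on a log scale. The helper name `thousands_usd` and the `wide_range` values are made up for the example.

```python
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter


def thousands_usd(value, pos=None):
    """Format a tick value such as 10000 as '$10K'."""
    return f"${value / 1000:,.0f}K"


fig, (ax_lin, ax_log) = plt.subplots(1, 2, figsize=(10, 4))

# Linear scale with human-readable currency ticks.
revenue = [12000, 15000, 18000, 22000, 25000, 28000]
ax_lin.plot(revenue)
ax_lin.yaxis.set_major_formatter(FuncFormatter(thousands_usd))
ax_lin.set_title("Linear scale, $K ticks")

# Log scale for values spanning several orders of magnitude.
wide_range = [1, 10, 100, 1_000, 10_000, 100_000]
ax_log.plot(wide_range)
ax_log.set_yscale("log")
ax_log.set_title("Log scale")

fig.tight_layout()
# Call plt.show() or fig.savefig(...) to view the result.
```

Dividing by 1000 inside the formatter matters: appending "K" to the raw value would label 10000 as "$10,000K".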

6. Grid & Spines

  • Definition: The background lines (grid) and borders (spines) of the plot.
  • Production insight: Remove unnecessary spines (e.g., top/right borders) to reduce clutter. Light grids improve readability.
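
If seaborn is not available, the same cleanup can be sketched with matplotlib alone. The data points here are placeholders.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot([1, 2, 3, 4], [10, 14, 13, 18])

# Hide the top and right borders; the left and bottom spines stay
# visible and act as the axes.
for side in ("top", "right"):
    ax.spines[side].set_visible(False)

# Faint grid, drawn behind the data rather than on top of it.
ax.grid(True, alpha=0.3)
ax.set_axisbelow(True)
# Call plt.show() or fig.savefig(...) to view the result.
```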

7. Figure Size & DPI

  • Definition: The dimensions (figsize) and resolution (dpi) of the saved image.
  • Production insight: Low DPI (e.g., 72) = blurry charts in reports. Use 300 DPI for print, 150 for web.
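
The relationship is simple arithmetic: pixels = inches × DPI. The sketch below works it through for a 10 × 6 inch figure and cross-checks the result against a PNG saved in memory. The helper `pixel_size` is a hypothetical name introduced for the example.

```python
import io
import struct

import matplotlib.pyplot as plt


def pixel_size(figsize, dpi):
    """Return (width_px, height_px) for a figure of `figsize` inches at `dpi`."""
    width_in, height_in = figsize
    return round(width_in * dpi), round(height_in * dpi)


for dpi in (72, 150, 300):
    print(dpi, pixel_size((10, 6), dpi))
# 72 (720, 432)
# 150 (1500, 900)
# 300 (3000, 1800)

# Cross-check against a real file: a PNG stores its width and height
# as two big-endian 32-bit integers at byte offsets 16-23.
fig = plt.figure(figsize=(10, 6))
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=300)
saved_size = struct.unpack(">II", buffer.getvalue()[16:24])
print(saved_size)  # (3000, 1800)
plt.close(fig)
```

Note that passing bbox_inches="tight" to savefig crops or pads the figure to its contents, so the saved pixel size will then differ from this calculation.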

8. Subplots & Layout

  • Definition: Arranging multiple charts in a single figure.
  • Production insight: Tight layout (tight_layout()) prevents overlapping labels. subplots_adjust() fine-tunes spacing.
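
A minimal sketch of a two-panel figure, reusing the Product A and Product B figures from the hands-on section later in this guide. The panel arrangement and the `wspace` value are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
sales = {
    "Product A": [12000, 15000, 18000, 22000, 25000, 28000],
    "Product B": [8000, 9000, 11000, 13000, 15000, 17000],
}

# One row, two columns; sharey keeps the panels directly comparable.
fig, axes = plt.subplots(1, 2, figsize=(10, 4), sharey=True)
for ax, (name, values) in zip(axes, sales.items()):
    ax.plot(months, values, linewidth=2.5)
    ax.set_title(name)
    ax.set_xlabel("Month")
axes[0].set_ylabel("Revenue (USD)")

fig.tight_layout()               # automatic spacing so labels do not collide
fig.subplots_adjust(wspace=0.1)  # optional manual tweak afterwards
# Call plt.show() or fig.savefig(...) to view the result.
```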


3. Step-by-Step Hands-On: Customizing a Sales Dashboard


Prerequisites

  • Python 3.8+ installed.
  • Libraries: matplotlib, seaborn, pandas.
    pip install matplotlib seaborn pandas
  • Sample dataset (we’ll use a CSV of monthly sales).

Task:

Create a professional-looking sales dashboard with:
✅ Clear labels & title.
✅ A well-positioned legend.
✅ A colorblind-friendly palette.
✅ Annotations highlighting key trends.
✅ Proper axis formatting.


Step 1: Load & Inspect Data

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load sample data (replace with your CSV)
data = {
    "Month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun"],
    "Product_A": [12000, 15000, 18000, 22000, 25000, 28000],
    "Product_B": [8000, 9000, 11000, 13000, 15000, 17000],
    "Product_C": [5000, 6000, 7000, 8000, 9000, 10000],
}
df = pd.DataFrame(data)
df.set_index("Month", inplace=True)
print(df.head())

Output:


       Product_A  Product_B  Product_C
Month
Jan        12000       8000       5000
Feb        15000       9000       6000
Mar        18000      11000       7000
Apr        22000      13000       8000
May        25000      15000       9000


Step 2: Create a Basic Line Plot (Before Customization)

plt.figure(figsize=(10, 6))
plt.plot(df.index, df["Product_A"], label="Product A")
plt.plot(df.index, df["Product_B"], label="Product B")
plt.plot(df.index, df["Product_C"], label="Product C")
plt.title("Monthly Sales")
plt.xlabel("Month")
plt.ylabel("Revenue")
plt.legend()
plt.show()

Problem: The chart is ugly and unprofessional:
- Default colors are hard to distinguish.
- The legend sits inside the plot, where it can overlap the data.
- No units on the y-axis.
- No annotations for key insights.


Step 3: Customize Labels & Title

plt.figure(figsize=(10, 6))
plt.plot(df.index, df["Product_A"], label="Product A")
plt.plot(df.index, df["Product_B"], label="Product B")
plt.plot(df.index, df["Product_C"], label="Product C")

# Custom title & labels
plt.title("Monthly Sales Performance (2023)", fontsize=16, pad=20)  # pad = space below title
plt.xlabel("Month", fontsize=12)
plt.ylabel("Revenue (USD)", fontsize=12)

plt.legend()
plt.show()

Improvements:
  • Clearer title with padding.
  • Units on y-axis (USD).


Step 4: Improve Legend & Colors

plt.figure(figsize=(10, 6))
sns.set_palette("colorblind")  # Colorblind-friendly palette

plt.plot(df.index, df["Product_A"], label="Product A", linewidth=2.5)
plt.plot(df.index, df["Product_B"], label="Product B", linewidth=2.5)
plt.plot(df.index, df["Product_C"], label="Product C", linewidth=2.5)

plt.title("Monthly Sales Performance (2023)", fontsize=16, pad=20)
plt.xlabel("Month", fontsize=12)
plt.ylabel("Revenue (USD)", fontsize=12)

# Move legend outside the plot
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left', borderaxespad=0)

plt.show()

Improvements:
  • Colorblind-friendly palette (sns.set_palette("colorblind")).
  • Thicker lines for better visibility.
  • Legend outside the plot (bbox_to_anchor).


Step 5: Add Annotations for Key Insights

plt.figure(figsize=(10, 6))
sns.set_palette("colorblind")

plt.plot(df.index, df["Product_A"], label="Product A", linewidth=2.5)
plt.plot(df.index, df["Product_B"], label="Product B", linewidth=2.5)
plt.plot(df.index, df["Product_C"], label="Product C", linewidth=2.5)

plt.title("Monthly Sales Performance (2023)", fontsize=16, pad=20)
plt.xlabel("Month", fontsize=12)
plt.ylabel("Revenue (USD)", fontsize=12)

# Highlight Product A's growth
plt.annotate(
    "Strong growth (+133%)",
    xy=("Jun", 28000),
    xytext=("Apr", 25000),
    arrowprops=dict(facecolor="black", shrink=0.05),
    fontsize=10,
)

plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left', borderaxespad=0)
plt.show()

Improvements:
  • Annotation highlights key trend.
  • Arrow guides attention.


Step 6: Format Axes & Remove Clutter

plt.figure(figsize=(10, 6))
sns.set_palette("colorblind")

plt.plot(df.index, df["Product_A"], label="Product A", linewidth=2.5)
plt.plot(df.index, df["Product_B"], label="Product B", linewidth=2.5)
plt.plot(df.index, df["Product_C"], label="Product C", linewidth=2.5)

plt.title("Monthly Sales Performance (2023)", fontsize=16, pad=20)
plt.xlabel("Month", fontsize=12)
plt.ylabel("Revenue (USD)", fontsize=12)

# Format y-axis to show $10K instead of 10000
# (divide by 1000 first; just appending "K" would print "$10,000K")
plt.gca().yaxis.set_major_formatter(lambda x, pos: f"${x / 1000:,.0f}K")

# Remove top/right spines (borders)
sns.despine()

# Add light grid
plt.grid(alpha=0.3)

plt.annotate(
    "Strong growth (+133%)",
    xy=("Jun", 28000),
    xytext=("Apr", 25000),
    arrowprops=dict(facecolor="black", shrink=0.05),
    fontsize=10,
)

plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left', borderaxespad=0)
plt.tight_layout()  # Prevent label cutoff
plt.show()

Final Improvements:
  • Y-axis formatted ($10K instead of 10000).
  • Removed clutter (sns.despine()).
  • Light grid for readability.
  • tight_layout() prevents label cutoff.


Step 7: Save the Chart (High DPI for Reports)

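# Call savefig() before plt.show(): once the window opened by show() is
# closed, the figure is gone and the saved file comes out blank.
plt.savefig(
    "sales_dashboard.png",
    dpi=300,  # High resolution for print
    bbox_inches="tight",  # Prevents label cutoff
)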

Why this matters:
- Low DPI (72) = blurry in reports.
- High DPI (300) = crisp in PDFs/print.


4. Production-Ready Best Practices


Labels & Titles

  • Always include units (e.g., "Revenue (USD)" instead of "Revenue").
  • Use pad in plt.title() to prevent title overlap.
  • Avoid all-caps titles (harder to read).

Legends

  • Place outside the plot (bbox_to_anchor=(1.05, 1)).
  • Use frameon=False for a cleaner look.
  • Sort legend entries if order matters.
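
The three legend tips can be combined in one call. The sketch below assumes alphabetical order is the order that matters; the data and the names `sorted_handles` and `sorted_labels` are illustrative.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(8, 5))
# Lines are deliberately plotted out of order.
ax.plot([1, 2, 3], [5, 6, 7], label="Product C")
ax.plot([1, 2, 3], [12, 15, 18], label="Product A")
ax.plot([1, 2, 3], [8, 9, 11], label="Product B")

# Collect the handles and labels matplotlib would use, then sort by label.
handles, labels = ax.get_legend_handles_labels()
ordered = sorted(zip(labels, handles), key=lambda pair: pair[0])
sorted_labels = [label for label, _ in ordered]
sorted_handles = [handle for _, handle in ordered]

legend = ax.legend(
    sorted_handles,
    sorted_labels,
    bbox_to_anchor=(1.05, 1),  # anchor point just right of the axes
    loc="upper left",          # legend corner pinned to that anchor
    borderaxespad=0,
    frameon=False,             # no box around the legend
)
fig.tight_layout()
# Call plt.show() or fig.savefig(...) to view the result.
```

Pairing bbox_to_anchor with an explicit loc makes the placement predictable: loc says which corner of the legend is attached to the anchor point.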

Color Palettes

  • Use sns.set_palette("colorblind") for accessibility.
  • Avoid relying on red/green contrasts (red-green color vision deficiency is the most common form).
  • Use sns.color_palette("viridis") for sequential data.

Annotations

  • Highlight key insights (e.g., "Statistically significant").
  • Use arrows to guide attention.
  • Keep text short (1-2 words max).

Axis Formatting

  • Use plt.gca().yaxis.set_major_formatter() for custom units.
  • Log scale for wide-ranging data (plt.yscale("log")).
  • Rotate x-axis labels if crowded (plt.xticks(rotation=45)).

Figure Size & DPI

  • figsize=(10, 6) for standard reports.
  • dpi=300 for print, dpi=150 for web.
  • bbox_inches="tight" prevents label cutoff.

Subplots & Layout

  • plt.tight_layout() prevents label overlap.
  • plt.subplots_adjust() for fine-tuning spacing.
  • Use sns.FacetGrid for multiple subplots.


5. ⚠️ Common Mistakes & Traps

| Mistake | Symptom | Fix/Prevention |
| --- | --- | --- |
| Default colors | Hard to distinguish lines | Use sns.set_palette("colorblind") |
| Legend overlaps data | Can’t read legend | plt.legend(bbox_to_anchor=(1.05, 1), loc="upper left") |
| No units on axes | Stakeholders misinterpret data | Always label axes (e.g., "Revenue (USD)") |
| Low DPI (72) in saved charts | Blurry in reports | plt.savefig("chart.png", dpi=300) |
| Annotations too long | Clutters the chart | Keep annotations to 1-2 words |


6. Exam/Certification Focus


Typical Question Patterns

  1. "Which function moves the legend outside the plot?"
     ❌ plt.legend(loc="best")
     ✅ plt.legend(bbox_to_anchor=(1.05, 1))

  2. "How do you format y-axis labels as currency?"
     ❌ plt.ylabel("$")
     ✅ plt.gca().yaxis.set_major_formatter('${x:,.0f}')

  3. "Which palette is colorblind-friendly?"
     ❌ "tab10"
     ✅ "colorblind" or "viridis"

Key Trap Distinctions

  • plt.title() vs ax.set_title()
      - plt.title() works on the current axes (the most recently created or selected one).
      - ax.set_title() is explicit (better for subplots).

  • sns.despine() vs plt.grid(False)
      - sns.despine() removes borders (the top and right spines by default).
      - plt.grid(False) removes grid lines.
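
The first distinction is easy to check with a two-panel figure. In this sketch the title strings are placeholders chosen to show which call reached which panel.

```python
import matplotlib.pyplot as plt

fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(8, 3))

# plt.title() targets the *current* axes, which right after
# plt.subplots() is the last one created (the right-hand panel here).
plt.title("Set via plt.title()")

# ax.set_title() names its target explicitly.
ax_left.set_title("Set via ax.set_title()")

print(repr(ax_left.get_title()))
print(repr(ax_right.get_title()))
```

With more than one panel, the implicit "current axes" is easy to lose track of, which is why the explicit form is the safer habit.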

Common Scenario-Based Question

"You need to highlight a key data point in a line chart. Which method should you use?"
- ❌ plt.text() (static, no arrow)
- ✅ plt.annotate() (supports arrows)
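
To make the difference concrete, here is a small sketch using the axes-level equivalents (ax.text and ax.annotate) on the Product A series from the hands-on section. The text positions are arbitrary picks for the example.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 4))
x = [0, 1, 2, 3, 4, 5]
y = [12000, 15000, 18000, 22000, 25000, 28000]
ax.plot(x, y)

# ax.text(): a label at a position, with no pointer to the data.
label = ax.text(0.2, 26000, "Product A")

# ax.annotate(): text at xytext plus an arrow ending at xy.
note = ax.annotate(
    "Strong growth (+133%)",
    xy=(5, 28000),      # the data point being highlighted
    xytext=(2, 26500),  # where the text sits
    arrowprops=dict(arrowstyle="->", color="black"),
)
# Call plt.show() or fig.savefig(...) to view the result.
```

The arrow only appears because arrowprops is supplied; annotate() without it draws plain text, much like text().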


7. Hands-On Challenge (With Solution)


Challenge:

Create a bar chart comparing quarterly sales for 3 products. Customize it with:
- A colorblind-friendly palette.
- Dollar-formatted y-axis.
- Annotations showing the highest sales quarter.

Solution:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

data = {
    "Quarter": ["Q1", "Q2", "Q3", "Q4"],
    "Product_X": [50000, 60000, 70000, 80000],
    "Product_Y": [30000, 35000, 40000, 45000],
    "Product_Z": [20000, 25000, 30000, 35000],
}
df = pd.DataFrame(data).set_index("Quarter")

sns.set_palette("colorblind")

# df.plot() creates its own figure, so pass figsize here
# instead of calling plt.figure() first
ax = df.plot(kind="bar", width=0.8, figsize=(10, 6))

plt.title("Quarterly Sales by Product", fontsize=16, pad=20)
plt.xlabel("Quarter", fontsize=12)
plt.ylabel("Revenue (USD)", fontsize=12)

# Format y-axis as currency
ax.yaxis.set_major_formatter('${x:,.0f}')

# Annotate highest sales: find the tallest bar and point at its top center
# (bars in a grouped chart are offset from the integer tick positions)
peak_bar = max(ax.patches, key=lambda bar: bar.get_height())
max_sales = peak_bar.get_height()
plt.annotate(
    "Peak Sales",
    xy=(peak_bar.get_x() + peak_bar.get_width() / 2, max_sales),
    xytext=(1, max_sales * 0.9),
    arrowprops=dict(facecolor="black", shrink=0.05),
)

plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
plt.tight_layout()
plt.show()

Why it works:
- sns.set_palette("colorblind") ensures accessibility.
- yaxis.set_major_formatter formats currency.
- annotate() highlights the peak sales quarter.


8. Rapid-Reference Crib Sheet

| Task | Code |
| --- | --- |
| Set title | plt.title("Title", fontsize=16, pad=20) |
| Set axis labels | plt.xlabel("X Label", fontsize=12) |
| Move legend outside | plt.legend(bbox_to_anchor=(1.05, 1), loc="upper left") |
| Colorblind palette | sns.set_palette("colorblind") |
| Format y-axis as currency | plt.gca().yaxis.set_major_formatter('${x:,.0f}') |
| Remove top/right borders | sns.despine() |
| Add annotation | plt.annotate("Text", xy=(x, y), xytext=(x2, y2), arrowprops=dict(facecolor="black")) |
| Save high-DPI chart | plt.savefig("chart.png", dpi=300, bbox_inches="tight") |
| Rotate x-axis labels | plt.xticks(rotation=45) |

⚠️ Default matplotlib colors are not colorblind-friendly: use sns.set_palette("colorblind").
⚠️ plt.legend() inside the plot can overlap data: use bbox_to_anchor.


9. Where to Go Next

  1. Matplotlib Customization Docs – Official guide to styling.
  2. Seaborn Palettes – Choosing the right colors.
  3. ColorBrewer – Predefined colorblind-friendly palettes.
  4. [PL-300 Exam Guide (Microsoft)](https://learn.microsoft.com/en-us/certifications/exams/pl

