
7.3 Understanding What You Actually Need

You're convinced. You won't build custom instrumentation. But you're staring at a complex codebase, and you still need to understand what it does. Before reaching for any tool, you need to clarify exactly what question you're trying to answer. This isn't pedantic philosophy—it's practical strategy. Different questions require different tools, and using the wrong tool wastes time.

Distinguishing between tracing, profiling, and debugging

These three words get used interchangeably in conversation, but they represent distinct activities with distinct tools. Let's define them precisely:

Tracing answers: "What code runs and in what order?"

You want to see the execution path. You need to know: When I click this button, what functions get called? What's the sequence? Which middleware runs? What order do signal handlers fire in?

Tracing is about flow. You're building a map of the execution journey. Think of it like recording a video of your code as it runs—you want to see every frame in sequence.

Profiling answers: "Where does the time go?"

You know the code is slow. You need to identify bottlenecks. Which function takes 5 seconds? How many times is this database query executed? What percentage of CPU time is spent in this loop?

Profiling is about performance. You're measuring resource consumption—time, memory, CPU cycles. Think of it like a stopwatch on every function, showing you which ones dominate the runtime.
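
If you want to see that stopwatch view without any setup, the standard library's cProfile gives you per-function call counts and cumulative times. A minimal sketch, with slow_report and fetch_rows as hypothetical stand-ins for your own code:

```python
import cProfile
import pstats
import time

def fetch_rows():
    time.sleep(0.2)                 # stand-in for a slow database call
    return list(range(1000))

def slow_report():
    rows = fetch_rows()
    return sum(x * x for x in rows)

profiler = cProfile.Profile()
profiler.enable()
slow_report()
profiler.disable()

# Print the five entries that dominate cumulative time
pstats.Stats(profiler).sort_stats("cumulative").print_stats(5)
```

The output immediately shows that nearly all the runtime sits inside fetch_rows, which is exactly the "where does the time go?" answer you were after.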

Debugging answers: "Why doesn't this work as expected?"

You have a bug. You need to understand why the code behaves incorrectly. What's the value of this variable at this point? Why did this conditional take the wrong branch? Why didn't this function get called?

Debugging is about state. You're inspecting values, watching variables change, testing hypotheses about what went wrong. Think of it like pausing time and looking around to see what's happening.
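
"Pausing time and looking around" is exactly what the built-in breakpoint() call gives you: execution stops and you get a pdb prompt where you can print variables and step line by line. A small illustrative sketch (shipping_cost and the buggy call are hypothetical):

```python
def shipping_cost(weight_kg, express=False):
    breakpoint()                  # pause here: try `p express`, `p type(express)`, then `n` to step
    if express:                   # why did this branch fire for a standard parcel?
        return 15.0
    return 4.0 if weight_kg <= 20 else 9.0

# Bug report: "standard parcels are being charged the express rate"
print(shipping_cost(5, express="false"))   # a truthy string from a form slipped through
```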

Let's see why the distinction matters:

Scenario 1: The form submission mystery

Sarah's problem from earlier: "I need to understand what happens when I submit this form." That's a tracing question: she needs to see which code runs and in what order, not how long it takes or why a value is wrong.

Scenario 2: The slow API endpoint

Your API endpoint /api/products returns in 2 seconds. Your boss wants it under 500ms. You need to optimize. That's a profiling question: the code works, but you need to measure where the time goes before you can make it faster.

Scenario 3: The incorrect calculation

A pricing calculation returns $42.50 when it should return $38.75. You need to find the bug. That's a debugging question: you need to inspect the intermediate values and find the point where they diverge from what you expect.

The Key Insight: These aren't just different words for the same thing. They're different lenses for looking at code execution:

Most tools do one of these well, and the others poorly or not at all. A profiler can show you function call counts, but it won't let you inspect variables. A debugger lets you step through code, but it doesn't automatically measure timing.

Here's the practical implication: Before you open any tool, ask yourself:

  1. "Do I need to know WHAT runs?" → Tracing

  2. "Do I need to know HOW LONG things take?" → Profiling

  3. "Do I need to know WHY something produces wrong output?" → Debugging

Often, you need more than one. You might trace to understand the flow, then profile to find bottlenecks, then debug to fix a specific bug. But you do them sequentially with different tools, not all at once.

The three questions every codebase explorer asks

When you start working with unfamiliar code, you're really trying to answer three fundamental questions. Let's examine each one and understand what it's really asking.

"What code runs when I do X?"

This is the most common question when exploring a codebase. You perform some action—submit a form, click a button, send an API request—and you want to know what code executes as a result.

What you're really asking:

Example situations:

Why this matters:

You can't understand a system by reading class definitions and function signatures. You need to see the system in motion. Modern applications have execution paths determined by middleware, decorators, signal handlers, framework conventions, and configuration, none of which are visible at the call site you happen to be reading.

The only way to see the complete picture is to watch what actually runs.

Real example:

You inherit a Django admin interface. Users report: "When we delete a user via the admin panel, their comments disappear immediately, but their forum posts stay for 24 hours, then disappear. Why?"

Reading the code: the admin delete view just calls user.delete(). Nothing in the view or the User model mentions comments or forum posts at all.

What actually happens (discovered by tracing): a post_delete signal handler removes the user's comments immediately, and a second handler schedules a Celery task that removes their forum posts after a delay configured in settings.

You cannot find this by reading. The signal handler is in a different app. The Celery task is in yet another file. The 24-hour delay is configured in settings. Only tracing the actual execution reveals the complete flow.
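
The code that tracing uncovers in a case like this tends to look something like the sketch below. This is a hedged reconstruction collapsed into one block for readability, not the real project: the app layout, the Post model, and the FORUM_PURGE_DELAY setting are hypothetical, but the pattern, a post_delete receiver plus a delayed Celery task, is how this behavior usually ends up scattered across files.

```python
# comments/signals.py -- runs immediately, in the web process
from celery import shared_task
from django.conf import settings
from django.contrib.auth.models import User
from django.db.models.signals import post_delete
from django.dispatch import receiver

@receiver(post_delete, sender=User)
def delete_user_comments(sender, instance, **kwargs):
    # Comments disappear the moment the user row is deleted
    instance.comment_set.all().delete()

# forum/tasks.py -- runs later, on a Celery worker
@shared_task
def purge_forum_posts(user_id):
    from forum.models import Post           # hypothetical model
    Post.objects.filter(author_id=user_id).delete()

# forum/signals.py -- schedules the delayed cleanup
@receiver(post_delete, sender=User)
def schedule_post_purge(sender, instance, **kwargs):
    # FORUM_PURGE_DELAY is a hypothetical setting, e.g. 60 * 60 * 24 seconds
    purge_forum_posts.apply_async(args=[instance.id], countdown=settings.FORUM_PURGE_DELAY)
```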

"In what order does it execute?"

This question often follows the first one. You've identified the pieces—now you need to understand their sequence and relationships.

What you're really asking:

Example situations:

Why order matters:

Order determines:

Real example:

Your Django app has a view decorated with three custom decorators:

```python
@rate_limit(max_calls=100)
@check_subscription
@require_login
def api_endpoint(request):
    # implementation
```

A bug report: "Free users are hitting the rate limit even though they should be rejected at subscription check."

The order matters here:

By tracing execution, you discover:

  1. rate_limit runs first (decorators are applied bottom-to-top, so the topmost decorator becomes the outermost wrapper, and the outermost wrapper's check runs first at request time)

  2. check_subscription runs second

  3. require_login runs third

  4. The view runs last (if all decorators pass)

The fix: reorder the decorators so the subscription check runs before rate limiting, i.e., move @check_subscription above @rate_limit. You cannot determine the runtime order from a casual reading of the syntax alone; decorator wrapping order is a language rule that is easy to misremember, and tracing confirms which order actually applies.
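
If you want to confirm the wrapping rule without touching the real codebase, a tiny standalone sketch makes the runtime order visible. The decorators below are simplified stand-ins (no arguments, just a print) for the real ones:

```python
import functools

def make_decorator(name):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            print(f"{name} check runs")
            return func(*args, **kwargs)
        return wrapper
    return decorator

rate_limit = make_decorator("rate_limit")
check_subscription = make_decorator("check_subscription")
require_login = make_decorator("require_login")

@rate_limit
@check_subscription
@require_login
def api_endpoint():
    print("view runs")

api_endpoint()
# Prints: rate_limit check runs, check_subscription check runs,
#         require_login check runs, view runs
```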

"What data flows through the system?"

This is the most detailed question. You understand what runs and in what order—now you want to see the actual data being processed.

What you're really asking:

Example situations:

Why data matters:

Understanding execution flow isn't enough if you can't see the values. You might know that calculate_discount(user, price) runs, but if you can't see that user.discount_tier is unexpectedly None, you can't diagnose the bug.

Real example:

Your Django REST Framework API returns user data:

```json
{
  "id": 123,
  "username": "alice",
  "email": "alice@example.com",
  "is_premium": true,
  "last_login": "2025-10-11T10:30:00Z",
  "subscription_expires": "2025-12-31T23:59:59Z"
}
```

You look at your serializer:

```python
class UserSerializer(serializers.ModelSerializer):
    class Meta:
        model = User
        fields = ['id', 'username', 'email', 'is_premium']
```

Wait—your serializer only defines 4 fields, but the API returns 6. Where do last_login and subscription_expires come from?

Reading the code doesn't help. There's no obvious place where those fields are added. Possibilities: a middleware that rewrites the response, a different serializer than the one you're looking at, or a base class that overrides serialization.

By tracing execution and inspecting data at each step, you discover:

  1. Your serializer inherits from CustomModelSerializer (defined in a base app)

  2. CustomModelSerializer.to_representation() is overridden

  3. It checks if the model has last_login or subscription_expires fields

  4. If so, it automatically adds them to the output

The extra fields come from an implicit framework convention your team built years ago. You only discover this by watching the data transform through the serialization pipeline.
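
A hedged sketch of what that discovered base class might look like; CustomModelSerializer and the field names come from the scenario above, but the exact implementation in a real codebase will differ:

```python
from rest_framework import serializers

class CustomModelSerializer(serializers.ModelSerializer):
    """Team-wide base class that silently appends extra fields to the output."""

    AUTO_FIELDS = ("last_login", "subscription_expires")

    def to_representation(self, instance):
        data = super().to_representation(instance)
        for field in self.AUTO_FIELDS:
            # If the model happens to have the attribute, expose it automatically
            if hasattr(instance, field):
                data[field] = getattr(instance, field)
        return data
```

Once you see this, the six-field response from a four-field serializer stops being a mystery.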

Matching tools to questions: A decision matrix

Now that you understand the three core questions, let's map tools to questions. This is your decision-making guide.

Question: "What code runs when I do X?"

| Tool | Best For | Limitations |
| --- | --- | --- |
| Framework DevTools (Django Debug Toolbar, React DevTools) | First line of investigation. Shows high-level execution in context of framework. | Framework-specific. Won't show code outside framework conventions. |
| Debugger with breakpoints | Precise control. Set breakpoint at entry point, see call stack. | Requires knowing where to set breakpoint. Can't see past executions. |
| Logging/Print statements | Quick checks. "Did this function run?" | Requires code changes. No call stack context. |
| sys.settrace() or similar | Comprehensive function call log without code changes. | Overwhelming output. Hard to filter. Performance impact. |
| APM tools (New Relic, DataDog) | Production environments. Historical data. | Expensive. Overkill for local exploration. Setup overhead. |

Recommended approach:

  1. Start with framework DevTools (if available)

  2. Use debugger breakpoints for precise investigation

  3. Use sys.settrace() only if you need comprehensive automated logging (a minimal sketch follows this list)

  4. Avoid custom instrumentation
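
When you do reach for it, sys.settrace() can stay small. A minimal sketch that prints every Python function call until you switch it off; some_entry_point is a hypothetical stand-in for the action you want to trace:

```python
import sys

def log_calls(frame, event, arg):
    if event == "call":
        code = frame.f_code
        print(f"call: {code.co_name} ({code.co_filename}:{frame.f_lineno})")
    # Return None: we only care about call events, not per-line events
    return None

sys.settrace(log_calls)
some_entry_point()        # hypothetical: the action you want to trace
sys.settrace(None)        # always turn tracing back off
```

Even a small request prints hundreds of lines this way, which is exactly why it sits last in the list above.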

Question: "In what order does it execute?"

| Tool | Best For | Limitations |
| --- | --- | --- |
| Debugger call stack | Shows current execution hierarchy at any breakpoint. | Point-in-time only. Can't see full flow history. |
| Debugger step-through | Watch execution step by step. See exact order. | Slow for large flows. Requires interactive stepping. |
| Framework DevTools timeline | Visualizes execution order (React Profiler, Chrome Performance) | Limited to framework-aware execution. |
| sys.settrace() output | Complete chronological log of all function calls. | Too much information. Hard to read. |
| Logging with timestamps | See order of specific checkpoints. | Requires adding log statements. Only shows what you log. |

Recommended approach:

  1. Use debugger call stack to understand nesting

  2. Step through execution with debugger for sequential flow

  3. Use framework timelines for visual understanding

  4. Add strategic log statements only for long-running flows you can't step through interactively (a minimal sketch follows this list)
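
If you do fall back to log statements for a long-running flow (item 4 above), let the logging module add the timestamps rather than formatting them by hand. A minimal sketch; the logger name and messages are hypothetical:

```python
import logging

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s.%(msecs)03d %(name)s: %(message)s",
    datefmt="%H:%M:%S",
)
log = logging.getLogger("checkout")      # hypothetical flow name

log.debug("cart validation started")
# ... existing code under investigation ...
log.debug("payment provider call finished")
```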

Question: "What data flows through the system?"

| Tool | Best For | Limitations |
| --- | --- | --- |
| Interactive debugger | Inspect any variable at any breakpoint. Evaluate expressions. | Requires stopping execution. Must know where to break. |
| Debugger watch expressions | Monitor specific variables as you step. | Manual setup. Only watches what you specify. |
| Logging data | Record values at specific points. | Requires code changes. Can't inspect on-the-fly. |
| Framework DevTools | See framework-specific data (props, state, SQL queries). | Only shows framework-aware data structures. |
| Memory profilers | Track object allocation and references. | For memory issues, not general data inspection. |

Recommended approach:

  1. Always start with interactive debugger

  2. Set breakpoints at key points in flow

  3. Inspect variables, evaluate expressions, modify values to test hypotheses

  4. Use framework DevTools for framework-specific data (database queries, component props)

  5. Add logging only for data in production or long-running processes

Combined Questions Decision Tree:

```
START: I need to understand unfamiliar code
│
├─ Do I need to understand WHAT runs?
│  │
│  ├─ Is there a framework-specific DevTool?
│  │  ├─ YES → Start there (Django Debug Toolbar, React DevTools, etc.)
│  │  └─ NO → Use debugger with breakpoint at entry point
│  │
│  └─ Do I also need to see ORDER?
│     ├─ YES → Use debugger step-through + call stack view
│     └─ NO → Framework DevTools or single breakpoint is enough
│
├─ Do I need to understand WHERE TIME IS SPENT?
│  │
│  └─ Use a profiler
│     ├─ Python: cProfile, py-spy, line_profiler
│     ├─ JavaScript: Chrome Performance tab, clinic.js
│     └─ NOT a debugger (too slow), NOT a tracer (no timing info)
│
└─ Do I need to understand WHY BEHAVIOR IS WRONG?
   │
   └─ Use interactive debugger
      ├─ Set breakpoint where you suspect the bug
      ├─ Inspect variables
      ├─ Step through to find where behavior diverges from expected
      └─ Use conditional breakpoints to catch specific cases
```
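
Conditional breakpoints (the last leaf in the tree) are supported directly in most debugger UIs; in plain pdb you can get the same effect either by guarding breakpoint() yourself or by attaching a condition to a breakpoint. A sketch with hypothetical names and conditions:

```python
# Option 1: guard the breakpoint in code, so you only stop on the suspicious case
def apply_pricing(order):
    if order.total < 0:           # hypothetical "this should never happen" condition
        breakpoint()              # drops into pdb only when the condition is true
    return order.total

# Option 2: from an existing pdb session, attach a condition to a breakpoint:
#   (Pdb) break pricing.py:42, order.total < 0
```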

The Critical Principle: Start Simple, Escalate Only When Necessary

Notice the pattern in all the recommendations above:

  1. Try the simplest tool first (framework DevTools, single breakpoint)

  2. Escalate to more powerful tools only if simple ones don't answer the question

  3. Build custom instrumentation only as a last resort (or never)

Common Mistake: Developers often skip straight to complex tools because they seem more powerful. They use sys.settrace() when a single breakpoint would suffice. They build custom instrumentation when Django Debug Toolbar would answer their question in 30 seconds.

The Efficiency Rule: The best tool is the one that answers your question in the least time with the least setup. Sometimes that's a single print statement. Sometimes it's a debugger. It's almost never custom AST transformation.

Example Decision-Making in Practice:

Scenario: New Django project. User registration seems to take 5 seconds. Need to understand why.

Bad approach: spend hours wiring up sys.settrace() logging or custom instrumentation to capture every function call, then dig through thousands of lines of output hoping the slow part jumps out.

Good approach:

  1. Install Django Debug Toolbar (3 minutes)

  2. Submit registration form

  3. Look at SQL panel: See 47 database queries

  4. Look at timeline: 4.8 seconds in queries

  5. Identify N+1 query problem

  6. Total time: 5 minutes

The difference isn't skill—it's knowing which tool answers your question.
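
For reference, the N+1 pattern the toolbar exposes in a scenario like this usually looks like the sketch below; Order and customer are hypothetical names, and the fix shown is the standard select_related approach:

```python
from shop.models import Order             # hypothetical app and model

# Before: 1 query for the orders, plus 1 extra query per order for its customer (the "N+1")
def order_summaries():
    orders = Order.objects.all()
    return [(o.id, o.customer.email) for o in orders]   # each o.customer hits the database

# After: a single JOINed query fetches the customers alongside the orders
def order_summaries_fixed():
    orders = Order.objects.select_related("customer")
    return [(o.id, o.customer.email) for o in orders]
```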

Your Tracing Toolkit Checklist:

✓ Learn your debugger deeply (VS Code, PyCharm, Chrome DevTools)

✓ Install framework DevTools

✓ Know your profiler (for performance questions)

✓ Keep it simple

With these tools and this decision framework, you can trace execution in any codebase without building custom solutions. The tools already exist. They're more powerful than anything you'd build. Learn them deeply, and you'll understand codebases faster than developers who spend weeks building "elegant" custom instrumentation.