7.16 Case Study 4: The AST Instrumentation Mistake

The Setup: A team of five developers inherited a large Django monolith with 80 apps and 15,000 lines of view code. The application had grown organically over 5 years with minimal documentation. New features required understanding how requests flowed through the system, which views called which helper functions, and how data transformed across layers.

The team lead, an experienced Python developer familiar with AST manipulation, proposed an "elegant solution": build a custom decorator injection system using Python's ast module that would automatically instrument every function call across the entire codebase.

The vision:

# Original code - no instrumentation
def process_order(order_id):
    order = fetch_order(order_id)
    validate_inventory(order)
    charge_payment(order)
    send_confirmation(order)

After running the AST transformation:

# Automatically instrumented code
@trace_execution
def process_order(order_id):
    order = fetch_order(order_id)
    validate_inventory(order)
    charge_payment(order)
    send_confirmation(order)

# ...and every helper, wherever it is defined:
@trace_execution
def fetch_order(order_id):
    ...

@trace_execution
def validate_inventory(order):
    ...
# etc.

The system would inject @trace_execution decorators into every function definition, collect call traces, and visualize the execution flow in a web dashboard.
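The trace_execution decorator itself isn't shown in the team's plan; a minimal sketch of what such a decorator might look like (the print-based tracing here is an assumption for illustration):

import functools

def trace_execution(func):
    """Hypothetical sketch: log entry and exit of every call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print(f"→ entering {func.__qualname__}")
        try:
            return func(*args, **kwargs)
        finally:
            print(f"← leaving {func.__qualname__}")
    return wrapper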

The team lead argued that automatic instrumentation would give everyone visibility into execution flow with zero ongoing effort and would pay dividends forever. The team agreed and allocated 1 week for development.

Problem: Team built custom AST decorator injection system

Let's walk through what actually happened over those 3 weeks (it was never just 1 week).

Week 1: Initial development

The team started with Python's ast module, which parses Python source code into an abstract syntax tree:

import ast
import inspect

class InstrumentationTransformer(ast.NodeTransformer):
    """Injects @trace_execution decorator into all function definitions"""

    def visit_FunctionDef(self, node):
        # Add decorator to function
        decorator = ast.Name(id='trace_execution', ctx=ast.Load())
        node.decorator_list.insert(0, decorator)

        # Process nested functions
        self.generic_visit(node)
        return node

def instrument_module(module):
    """Transform a Python module to add instrumentation"""
    source = inspect.getsource(module)
    tree = ast.parse(source)

    transformer = InstrumentationTransformer()
    new_tree = transformer.visit(tree)

    # Compile and execute the transformed code
    ast.fix_missing_locations(new_tree)
    code = compile(new_tree, module.__file__, 'exec')
    exec(code, module.__dict__)

This looked promising in early tests. They could parse a Python file, inject decorators, and see traced execution.
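Usage was a one-liner per module; a hypothetical sketch (orders.views stands in for any app module):

# Hypothetical usage at startup; orders.views is illustrative
import orders.views
instrument_module(orders.views)  # re-executes the module with decorators injected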

Problems discovered in Week 1:

  1. Import handling: The transformed code needed to import trace_execution, but where should that import be added in the AST? (A sketch of one approach follows this list.)

  2. Class methods: The transformer needed special handling for methods vs functions:

class OrderProcessor:
    def process(self, order_id):  # self parameter handling
        ...

  3. Async functions: Required async def support and await handling:

async def fetch_user_data(user_id):  # Different AST node type
    ...

  4. Decorators already present: Some functions already had decorators like @login_required. Where should @trace_execution be inserted?

  5. Lambda functions: lambda x: x + 1 is a function but doesn't have a decorator list.
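One way the import could be injected, assuming trace_execution lives in a module named tracing (an illustration; the team's actual module name isn't given):

import ast

def add_trace_import(tree: ast.Module) -> ast.Module:
    # Equivalent to: from tracing import trace_execution
    import_node = ast.ImportFrom(
        module='tracing',
        names=[ast.alias(name='trace_execution', asname=None)],
        level=0,
    )
    # Insert after a module docstring, if present, so the result stays valid
    has_docstring = (tree.body
                     and isinstance(tree.body[0], ast.Expr)
                     and isinstance(tree.body[0].value, ast.Constant)
                     and isinstance(tree.body[0].value.value, str))
    tree.body.insert(1 if has_docstring else 0, import_node)
    ast.fix_missing_locations(tree)
    return tree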

By the end of Week 1, the team had a working prototype for simple synchronous functions, but it broke on ~40% of the codebase due to edge cases.

Week 2: Fighting edge cases

The team spent Week 2 handling edge cases:

class InstrumentationTransformer(ast.NodeTransformer):
    def visit_FunctionDef(self, node):
        # Skip if already instrumented
        if any(self._is_trace_decorator(d) for d in node.decorator_list):
            return node

        # Add decorator
        decorator = ast.Name(id='trace_execution', ctx=ast.Load())
        node.decorator_list.insert(0, decorator)
        self.generic_visit(node)
        return node

    def visit_AsyncFunctionDef(self, node):
        # Same logic works for async functions (duck-typed on decorator_list)
        return self.visit_FunctionDef(node)

    def visit_Lambda(self, node):
        # Can't decorate lambdas - skip
        return node

    def visit_ClassDef(self, node):
        # Process methods inside classes
        self.generic_visit(node)
        return node

    def _is_trace_decorator(self, decorator_node):
        # Check if decorator is our trace_execution
        return (isinstance(decorator_node, ast.Name)
                and decorator_node.id == 'trace_execution')

New problems in Week 2:

  1. Line number preservation: After transformation, tracebacks showed wrong line numbers:

# Original code:
def process_order(order_id):  # Line 45
    order = fetch_order(order_id)  # Line 46
    raise ValueError("Invalid order")  # Line 47

# Error shows: "ValueError at line 52" (wrong!)

The fix required calling ast.fix_missing_locations() correctly and preserving lineno and col_offset attributes on every node the transformer created (a minimal sketch follows).
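The sketch copies the decorated function's position onto the synthetic decorator node with ast.copy_location():

import ast

class LocatedTransformer(ast.NodeTransformer):
    def visit_FunctionDef(self, node):
        decorator = ast.Name(id='trace_execution', ctx=ast.Load())
        ast.copy_location(decorator, node)  # give the new node real lineno/col_offset
        node.decorator_list.insert(0, decorator)
        self.generic_visit(node)
        return node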

  2. Source code modification detection: The system needed to detect when source files changed and re-instrument:

import hashlib

def get_file_hash(filepath):
    with open(filepath, 'rb') as f:
        return hashlib.md5(f.read()).hexdigest()

# Cache transformed modules by hash
_instrumented_cache = {}

def instrument_if_changed(module):
    filepath = module.__file__
    current_hash = get_file_hash(filepath)

    if filepath in _instrumented_cache:
        if _instrumented_cache[filepath]['hash'] == current_hash:
            return  # Already instrumented, skip

    instrument_module(module)
    _instrumented_cache[filepath] = {'hash': current_hash}

  3. Third-party library instrumentation: Instrumenting django.views and other libraries required modifying code in site-packages, which:

    • Gets overwritten on pip install --upgrade
    • Breaks virtual environment isolation
    • Causes weird bugs when instrumented code doesn't match installed version

  4. Performance overhead: Every function call now had decorator overhead. Simple functions became noticeably slower:

# Before: ~50ns per call
def add(a, b):
    return a + b

# After: ~500ns per call (10× slower!)
@trace_execution
def add(a, b):
    return a + b

For functions called thousands of times per request, this added hundreds of milliseconds to page load times.
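The nanosecond figures are the team's own measurements; a rough way to check this kind of wrapper overhead yourself (numbers vary by machine and Python version):

import timeit

def add(a, b):
    return a + b

def trace_execution(func):
    # Bare pass-through wrapper: even this adds measurable per-call cost
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

traced_add = trace_execution(add)

plain = timeit.timeit(lambda: add(1, 2), number=1_000_000)
traced = timeit.timeit(lambda: traced_add(1, 2), number=1_000_000)
print(f"plain: {plain:.3f}s  traced: {traced:.3f}s")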

By end of Week 2, the system "worked" on ~80% of the codebase but had significant performance problems and edge case bugs.

Week 3: The maintenance burden becomes clear

Week 3 was spent on:

  1. Debugging the instrumenter itself: When the AST transformation had bugs, error messages were cryptic:

SyntaxError: invalid syntax (<unknown>, line 0)

Developers spent hours figuring out which AST transformation produced invalid code.

  2. Integration with existing tools: Django Debug Toolbar and the custom instrumentation conflicted. The toolbar's middleware tried to instrument already-instrumented code, creating double-wrapping issues.

  3. Documentation: The system was complex enough that team members needed documentation to understand:

    • How to enable/disable instrumentation
    • How to add custom trace handlers
    • How to debug when instrumentation breaks
    • Which code paths were instrumented vs not

  4. Testing: How do you test an AST transformer? The team wrote tests:

def test_instrument_simple_function():
    source = """
def hello():
    return "world"
"""
    tree = ast.parse(source)
    transformed = InstrumentationTransformer().visit(tree)
    # Assert decorator was added...
    # But does it actually work at runtime?

Testing required both AST-level assertions and runtime execution tests.
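A sketch of the runtime half, under the assumption that trace_execution records call names into a list the test can inspect (InstrumentationTransformer is the class defined above):

import ast

def test_instrumented_function_runs():
    calls = []

    def trace_execution(func):
        def wrapper(*args, **kwargs):
            calls.append(func.__name__)
            return func(*args, **kwargs)
        return wrapper

    source = 'def hello():\n    return "world"\n'
    tree = InstrumentationTransformer().visit(ast.parse(source))
    ast.fix_missing_locations(tree)

    namespace = {'trace_execution': trace_execution}
    exec(compile(tree, '<test>', 'exec'), namespace)

    assert namespace['hello']() == 'world'  # transformed code still works
    assert calls == ['hello']               # and the call was traced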

Result: 3 weeks of development, fragile, hard to maintain

After 3 weeks, the team had a working system with significant problems:

What they built: the AST transformer with its edge-case handling, the trace_execution decorator and trace collector, the hash-based re-instrumentation cache, and a web dashboard for visualizing execution flow.

Total: ~2,000 lines of custom instrumentation infrastructure.

What was fragile:

  1. Python version sensitivity: AST structure changed between Python 3.8, 3.9, 3.10. The transformer broke on Python 3.10's match statements (new AST nodes).

  2. Framework updates: Django upgrades sometimes changed internal APIs that the instrumentation relied on.

  3. Syntax edge cases: Any Python syntax the team hadn't explicitly handled (walrus operator :=, positional-only parameters /, etc.) caused crashes.

  4. Performance unpredictability: Some code paths had severe overhead, others were fine. No clear pattern.

  5. Debug mode only: Too slow for production, so only usable in development. But bugs often appeared only in production.

  6. Onboarding pain: New team members needed 2-3 days to understand the instrumentation system before they could effectively use it.

What was hard to maintain: effectively all of it. The transformer's edge-case handling, the cache invalidation logic, the trace collector and dashboard, the documentation, and the AST-level tests all needed updates whenever Python, Django, or the codebase changed.

The final straw: A Django security update broke the instrumentation in a subtle way. For 2 days, developers couldn't trace execution flow. They had become dependent on the custom system and were now blocked without it.

Resolution: Replaced with Django Debug Toolbar + sys.settrace()

The team finally questioned: "What problem are we actually trying to solve?"

The answer: they needed to see which functions executed for a given request, understand specific flows like authentication, figure out why particular views were slow, and see what SQL queries the ORM generated.

Could these be solved without custom AST instrumentation? Yes.

The replacement stack:

1. Django Debug Toolbar (for 80% of use cases):

pip install django-debug-toolbar

Add to settings.py:

INSTALLED_APPS = [
    'debug_toolbar',
    # ...
]

MIDDLEWARE = [
    'debug_toolbar.middleware.DebugToolbarMiddleware',
    # ...
]
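Per the toolbar's own documentation, it also needs a URL entry and an INTERNAL_IPS setting:

# urls.py
from django.urls import include, path

urlpatterns = [
    # ...
    path('__debug__/', include('debug_toolbar.urls')),
]

# settings.py - the toolbar only renders for requests from these IPs
INTERNAL_IPS = ['127.0.0.1']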

What it provides: per-request panels for SQL queries (with timings and tracebacks), template rendering, cache operations, signals, request/response headers, and profiling.

2. sys.settrace() for custom tracing (for the remaining 20%):

For specific investigations where Debug Toolbar wasn't enough:

import sys

def trace_calls(frame, event, arg):
    if event == 'call':
        code = frame.f_code
        print(f"Calling {code.co_name} in {code.co_filename}:{frame.f_lineno}")
    return trace_calls

# Enable tracing
sys.settrace(trace_calls)

# Your code here
process_order(order_id)

# Disable tracing
sys.settrace(None)

This is 10 lines instead of 2,000, and it's built into Python.

3. VS Code debugger (for deep dives):

Instead of automated tracing, use breakpoints:

def process_order(order_id):
    breakpoint()  # Python 3.7+ built-in
    order = fetch_order(order_id)
    validate_inventory(order)

Run with debugger attached, step through execution, inspect variables.

The comparison:

| Aspect | Custom AST System | Standard Tools |
| ------ | ----------------- | -------------- |
| Development time | 3 weeks | 5 minutes (install Debug Toolbar) |
| Lines of code | 2,000+ | 0 (for Debug Toolbar), 10 (for sys.settrace) |
| Maintenance burden | 10-15 hours/month | 0 hours (maintained by community) |
| Python version compatibility | Breaks on each version | Works automatically |
| Performance impact | 10× slower for instrumented functions | Negligible (Debug Toolbar only in DEBUG=True) |
| Coverage | 80% of codebase (edge cases broken) | 100% of codebase |
| Learning curve | 2-3 days for new developers | 30 minutes |
| SQL query visibility | Not implemented | Built into Debug Toolbar |
| Template debugging | Not implemented | Built into Debug Toolbar |
| Third-party code | Risky/broken | Works seamlessly |
| Production use | Not viable (too slow) | N/A (dev-only by design) |

The migration:

The team spent 2 days removing the custom instrumentation system:

  1. Uninstall the custom package

  2. Remove AST transformation calls from manage.py and app initialization

  3. Install Django Debug Toolbar

  4. Document common debugging workflows using standard tools

Immediate benefits: SQL queries and template rendering became visible on every request, tracebacks pointed at real line numbers again, development page loads sped up, and the 10-15 hours/month of maintenance disappeared.

For the 20% case (custom deep tracing):

When Debug Toolbar wasn't enough, they wrote targeted sys.settrace() scripts:

# scripts/trace_order_processing.py
import sys
import os
import django

# Setup Django
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'myproject.settings')
django.setup()

from orders.views import process_order

# Define what to trace
modules_to_trace = ['orders', 'inventory', 'payment']

def selective_trace(frame, event, arg):
    if event == 'call':
        code = frame.f_code
        module = frame.f_globals.get('__name__', '')

        # Only trace our modules
        if any(mod in module for mod in modules_to_trace):
            args = {k: frame.f_locals[k] for k in code.co_varnames[:code.co_argcount]}
            # Indent by call depth: walk the frame stack
            depth, caller = 0, frame.f_back
            while caller is not None:
                depth, caller = depth + 1, caller.f_back
            print(f"{'  ' * depth}→ {module}.{code.co_name}({args})")

    return selective_trace

sys.settrace(selective_trace)
process_order(order_id=123)
sys.settrace(None)

Run this script when needed:

python scripts/trace_order_processing.py

Output:

→ orders.views.process_order({'order_id': 123})
  → orders.models.fetch_order({'order_id': 123})
    → orders.db.query({'model': 'Order', 'id': 123})
  → inventory.validators.validate_inventory({'order': <Order>})
    → inventory.models.check_stock({'product_id': 456})
  → payment.processors.charge_payment({'order': <Order>})

This 30-line script gave them the same call tracing they wanted from AST instrumentation, but:

  • With no source transformation and no infrastructure to maintain

  • With no Python-version sensitivity

  • Scoped to exactly the modules under investigation

  • Run only when needed, so zero overhead the rest of the time

Lessons learned: Solve the problem, not the imagined elegant solution

This case study teaches several critical lessons about tool choice and software engineering judgment.

Lesson 1: Beware the "elegant solution" trap

The AST instrumentation seemed elegant because it was automatic, comprehensive, and technically impressive.

But elegance in design doesn't equal practicality in execution. The "elegant" solution took 3 weeks instead of 1, broke on roughly 20% of the codebase, made instrumented functions 10× slower, and was replaced after 5 months.

The real problem vs. the imagined problem:

| What team thought they needed | What they actually needed |
| ----------------------------- | ------------------------- |
| "Automatic instrumentation of all code" | "See which functions execute for this request" |
| "Comprehensive call graphs" | "Understand this specific authentication flow" |
| "Production-ready tracing infrastructure" | "Debug why this view is slow" |
| "Deep insights into third-party libraries" | "See what SQL queries Django ORM generates" |

They were solving the imagined problem (comprehensive instrumentation) instead of the real problem (understand execution flow).

Lesson 2: Standard tools encode community wisdom

Django Debug Toolbar exists because thousands of Django developers needed exactly what this team needed. The toolbar includes panels for SQL queries, template rendering, cache operations, signals, request/response headers, and timing.

This knowledge came from years of real-world Django debugging. Building custom tools meant rediscovering all these insights from scratch.

The community wisdom they missed:

When they finally asked "How do experienced Django developers trace execution?", the answers were: Django Debug Toolbar for request-level visibility, the debugger for stepping through specific flows, and sys.settrace() or py-spy for targeted tracing.

They could have saved 3 weeks by asking this question first.

Lesson 3: Maintenance burden is often invisible upfront

Week 1: "This is so cool, look what we built!"

Week 5: "Why is this breaking on Python 3.10?"

Week 10: "Can someone who understands the AST transformer fix this bug?"

Week 15: "The original developer left, and no one wants to maintain this"

Week 20: "Let's just use Django Debug Toolbar"

The team didn't account for: Python version churn, framework upgrades, syntax edge cases, testing and documenting the tool itself, and onboarding every new developer onto it.

Estimation failure:

| Initial estimate | Actual cost |
| ---------------- | ----------- |
| 1 week development | 3 weeks development |
| 0 maintenance (it'll just work!) | 10-15 hours/month maintenance |
| Entire team can use it | Only original developer fully understands it |
| Will pay dividends forever | Replaced after 5 months |

Lesson 4: "Not Invented Here" syndrome in debugging tools

The team wanted to build their own solution instead of using existing tools because building one seemed more impressive than installing a package and promised a perfect fit for their codebase.

The cost of NIH: three weeks of team development time, plus months of maintenance, debugging, documentation, and onboarding overhead.

Total: ~$42,000 to build a tool that was replaced by a free, community-maintained one.

Lesson 5: Ask "What would a professional use?" before building

Before building any developer tool, ask:

  1. "What do experts in this framework/language use?"

  2. "Is there a reason I haven't heard of this problem being solved?"

  3. "What's the community-standard approach?"

For Django: the community-standard answers are Django Debug Toolbar during development, logging in production, and the debugger for deep dives.

If you're about to build something that seems obviously useful but doesn't exist in the community, the most likely explanation is: it already exists and you haven't found it yet, or there's a good reason it doesn't exist.

Lesson 6: Start with the simplest tool that could work

The team's tool selection process should have been:

  1. Try Django Debug Toolbar first (5 minutes). Does this solve the problem? → YES for 80% of cases.

  2. Try debugger breakpoints second (10 minutes). Does this solve the remaining cases? → YES for 15% of cases.

  3. Try sys.settrace() third (30 minutes). Does this solve the edge cases? → YES for 4% of cases.

  4. Only then consider custom tools (for the final 1%). Is the ROI worth it? → Almost always NO.

Instead, they jumped straight to step 4 because it seemed more impressive.

The correct escalation pattern:

Simple problem → Simple tool (print/log)
    ↓ (not sufficient)
Moderate problem → Standard tool (Debug Toolbar, debugger)
    ↓ (not sufficient)
Complex problem → Advanced standard tool (py-spy, sys.settrace)
    ↓ (not sufficient)
Unique problem → Custom tool (AST, instrumentation)
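At the "advanced standard tool" tier, py-spy needs no code changes at all; it attaches to a running process. Two representative invocations (real py-spy commands; the PID is illustrative):

py-spy dump --pid 12345                    # one-shot stack dump of a running process
py-spy record -o profile.svg --pid 12345   # sample for a while, write a flame graph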

Lesson 7: Production constraints should inform tool choice

The AST system was too slow for production, which meant it was switched off exactly where the hardest bugs lived; issues that appeared only under production data or load couldn't be traced at all.

Contrast with standard tools: Debug Toolbar is dev-only by design, while logging and sampling profilers like py-spy are safe to use against production.

They built a dev-only tool when production-safe alternatives existed.

Lesson 8: Documentation is a warning sign

When the team needed to write documentation for their custom tool, that was a signal: "This is too complex."

Django Debug Toolbar documentation:

# Installation
pip install django-debug-toolbar
# Add to INSTALLED_APPS
INSTALLED_APPS = ['debug_toolbar', ...]
# Done!

Their custom tool documentation:

# Installation
pip install custom-tracer
# Configuration
Edit settings.py to add:
- AST_TRANSFORM_ENABLED = True
- AST_TRACE_MODULES = [...]
- AST_CACHE_DIR = "..."
# Usage
- Enable tracing: tracer.enable()
- Disable tracing: tracer.disable()
- View traces: http://localhost:8000/traces/
# Troubleshooting
If you see "SyntaxError: invalid syntax"...
If tracing isn't working...
If performance is slow...
[20 more pages of edge cases]

Complex documentation = complex tool = wrong tool.

The golden rule validated:

This chapter opened with a golden rule: "If you're writing code to trace code, you're probably solving the wrong problem."

This case study proves it. The team wrote 2,000 lines of tracing infrastructure when they should have installed a 5-minute package.

When custom instrumentation IS justified:

There are legitimate cases for custom instrumentation: building an APM or observability product where the instrumentation is the product itself, or meeting regulatory requirements (audit trails, compliance logging) that no off-the-shelf tool satisfies.

Notice these are all cases where you are building a product for others to use or regulatory requirements demand custom solutions.

For understanding your own codebase during development? Standard tools win every time.

Final reflection:

Six months after switching to Django Debug Toolbar: maintenance hours on debugging infrastructure had dropped to zero, new developers were debugging effectively within their first hour, and nothing like the security-update outage had recurred.

The "elegant" solution was actually the problem, not the solution. The real elegance was in using boring, standard tools that just worked.

Key takeaways from this case study:

  1. Sophistication ≠ Appropriateness: AST manipulation is sophisticated, but Django Debug Toolbar is appropriate

  2. Automation ≠ Better: Automated instrumentation sounds great but manual debugging is often faster

  3. Custom ≠ Optimal: Custom tools might fit your workflow perfectly, but standard tools fit everyone's workflow adequately

  4. Elegant ≠ Practical: Code beauty doesn't pay the maintenance bill

  5. New ≠ Necessary: Just because you can build it doesn't mean you should

The question to ask: Before building any developer tool, ask yourself:

"If Django Debug Toolbar exists and I'm about to build something that seems similar, what do I know that thousands of Django developers don't?"

Usually the answer is: "Nothing. I should just use Debug Toolbar."

This humility, recognizing that your problem probably isn't unique, is the foundation of professional judgment in tool selection.


Part V Summary: Patterns Across Case Studies

Looking across all four case studies, several patterns emerge:

Pattern 1: The right tool makes "impossible" problems trivial

The time difference isn't 10-20%, it's 10-20×.

Pattern 2: Standard tools reveal invisible behavior

Each case study had "invisible" behavior that standard tools made visible: SQL queries hidden behind ORM calls, wasted component re-renders, CPU hotspots, and the actual path a request takes through the code.

Pattern 3: Custom solutions create maintenance burden

The only case study involving custom code (AST instrumentation) resulted in: 3 weeks of development, 2,000+ lines to maintain, 10-15 hours/month of upkeep, and replacement by standard tools within 5 months.

Pattern 4: Community knowledge > Individual cleverness

Every standard tool used (Debug Toolbar, React DevTools, py-spy) encoded years of community experience: which data matters, how to surface it without touching application code, and how to keep the overhead negligible.

These insights came from thousands of developers encountering the same problems.

Pattern 5: Simple first, complex only when necessary

None of these case studies required writing custom infrastructure, AST manipulation, or clever metaprogramming.

They all required knowing which standard tool fit the problem and spending minutes, not weeks, adopting it.

The meta-lesson: Professional debugging is about tool selection, not code cleverness. The developers in these case studies weren't less skilled; they were using inappropriate tools for their problems. Switching to standard tools transformed their effectiveness.

Use these case studies as templates:

When you encounter similar problems: recognize the pattern, reach for the standard tool first, and escalate to anything custom only when the standard option demonstrably falls short.

These patterns repeat across languages, frameworks, and problem domains. The specific tools change (Debug Toolbar for Django, React DevTools for React), but the principle remains: standard, purpose-built tools beat custom solutions.