It Was 11:47 PM on a Month-End Night

I was on my seventeenth reconciliation spreadsheet. My eyes were crossing. My back hurt. My coffee was cold.

The task? Reconcile 500 investment accounts between our legacy accounting system and our custodian’s reports.

The process?

  1. Open trial balance export (text file with 50,000 lines)
  2. Copy account numbers
  3. Paste into Excel
  4. Open custodian report
  5. Copy balances
  6. Paste into Excel
  7. Create VLOOKUP formulas
  8. Find differences
  9. Research discrepancies
  10. Repeat 499 more times

Time required: 40 hours per month (an entire work week!)

There HAD to be a better way.


The Breaking Point

Month-end was always brutal, but December was worse. Our CFO needed the reconciliations by 9 AM the next morning for the board meeting.

At midnight, I was only halfway through. I made a decision:

“I’m learning Python. Tonight. Right now.”

(Okay, technically I decided to learn Python the NEXT day, but dramatic effect, you know?)


What I Discovered: Python Could Do in 30 Seconds What Took Me 40 Hours

The Old Way (Excel Hell):

Step 1: Parse trial balance text file

  • Open in Notepad
  • Copy account sections manually
  • Paste into Excel
  • Use Text-to-Columns
  • Hope it doesn’t break
  • Time: 2 hours

Step 2: Parse custodian report

  • Same nightmare
  • Different format
  • Different issues
  • Time: 2 hours

Step 3: Reconcile

  • Create VLOOKUP formulas
  • =VLOOKUP(A2, OtherSheet!$A:$Z, 25, FALSE)
  • Copy down 500 rows
  • Wait for Excel to stop calculating
  • Find #N/A errors
  • Fix them manually
  • Time: 36 hours (seriously)

Total: 40 hours/month


The Python Way: Automation Magic

import pandas as pd

# Step 1: Read trial balance (messy text file)
tb = pd.read_table('trial_balance.txt', header=None, names=['raw'])

# Extract account numbers and balances
tb[['Account', 'Description', 'Balance']] = tb['raw'].str.extract(
 r'(\d{6})\s+(.{30})\s+([\d,]+\.\d{2}(?:\s+CR)?)'
)

# Handle credit notation
mask = tb['Balance'].str.contains('CR')
tb.loc[mask, 'Balance'] = '-' + tb.loc[mask, 'Balance'].str.replace(' CR', '')
tb['Balance'] = tb['Balance'].str.replace(',', '').astype(float)

# Step 2: Read custodian report (Excel)
custodian = pd.read_excel('custodian_report.xlsx', sheet_name='Holdings')
custodian = custodian.rename(columns={'Acct_Number': 'Account', 'Market_Value': 'Custodian_Balance'})

# Step 3: Reconcile with ONE LINE
reconciliation = pd.merge(
 tb[['Account', 'Description', 'Balance']],
 custodian[['Account', 'Custodian_Balance']],
 on='Account',
 how='outer', # Keep accounts from both sources
 indicator=True # Flag where each account came from
)

# Calculate differences
reconciliation['Difference'] = reconciliation['Balance'].fillna(0) - reconciliation['Custodian_Balance'].fillna(0)

# Flag material differences (over $100)
reconciliation['Material'] = abs(reconciliation['Difference']) > 100

# Export results
reconciliation.to_excel('reconciliation_results.xlsx', index=False)

# Summary stats
print(f"Total accounts: {len(reconciliation)}")
print(f"Perfect matches: {(reconciliation['Difference'] == 0).sum()}")
print(f"Material differences: {reconciliation['Material'].sum()}")
print(f"Total difference: ${reconciliation['Difference'].sum():,.2f}")

Time: 30 seconds


The Results

Before Python:

  • Time: 40 hours/month
  • Stress Level: Through the roof
  • Errors: 5-10 per month (copy-paste mistakes)
  • Accounts Reconciled: 500 (painfully)
  • Late Nights: Every month-end
  • Cost: $2,000/month in overtime

After Python:

  • Time: 30 seconds + 2 hours research (only material items)
  • Stress Level: Manageable
  • Errors: Zero (automated = consistent)
  • Accounts Reconciled: 500 (automatically)
  • Late Nights: Haven’t had one in 6 months
  • Cost: $0 overtime

Annual Savings: 456 hours = $24,000 in labor costs


What Made This Work: The Key Patterns

1. Text Parsing with Regex

Instead of manually splitting text, Python extracts exactly what you need:

# Extract: Account (6 digits), Description (30 chars), Balance (numbers)
pattern = r'(\d{6})\s+(.{30})\s+([\d,]+\.\d{2}(?:\s+CR)?)'
df[['Account', 'Description', 'Balance']] = df['raw'].str.extract(pattern)

2. Outer Merge with Indicator

Find accounts that exist in:

  • Both systems (perfect!)
  • Only trial balance (missing from custodian - investigate)
  • Only custodian (missing from books - BIG PROBLEM)
df = pd.merge(left, right, on='Account', how='outer', indicator=True)
# indicator creates '_merge' column: 'both', 'left_only', 'right_only'

3. Vectorized Calculations

No formulas to copy. No waiting. Just instant math:

# Calculate differences for ALL 500 accounts at once
df['Difference'] = df['Balance'].fillna(0) - df['Custodian_Balance'].fillna(0)

The “Aha!” Moment

The CFO walked by my desk at 9:15 AM (15 minutes after the deadline).

CFO: “How’s the reconciliation coming? Board meeting is in an hour.”

Me: “Done. Sent it to you 10 minutes ago.”

CFO: suspicious “Done? Like… all 500 accounts?”

Me: “Yep. 498 perfect matches. 2 material differences. Already researched them. It’s a timing difference from yesterday’s trades.”

CFO: long pause “It usually takes you the whole week…”

Me: “Not anymore.” tries not to grin too much

CFO: “Can you teach the rest of the team?”

Me: “Already writing the training materials.”

That’s when I knew Python had changed my career forever.


Bonus: Handling Common Issues

Issue #1: Accounts in Different Formats

Problem: Trial balance uses “001234” but custodian uses “1234”

Solution:

# Standardize before merging
tb['Account'] = tb['Account'].str.lstrip('0') # Remove leading zeros
custodian['Account'] = custodian['Account'].astype(str).str.zfill(6) # Add leading zeros

Issue #2: Fuzzy Matching

Problem: Sometimes descriptions don’t match exactly

Solution:

from fuzzywuzzy import fuzz

# Calculate similarity score
df['similarity'] = df.apply(
 lambda row: fuzz.ratio(row['Description_TB'], row['Description_Custodian']),
 axis=1
)

# Flag potential matches
df['Potential_Match'] = df['similarity'] > 85

Issue #3: Multiple Custodians

Problem: We have 5 different custodians

Solution:

# Read all custodian files at once
custodian_files = [
 'custodian_A.xlsx',
 'custodian_B.xlsx',
 'custodian_C.xlsx',
 'custodian_D.xlsx',
 'custodian_E.xlsx'
]

all_custodian = pd.concat([
 pd.read_excel(f, sheet_name='Holdings').assign(Custodian=f.split('_')[1].split('.')[0])
 for f in custodian_files
])

# Now reconcile against combined custodian data
reconciliation = pd.merge(tb, all_custodian, on='Account', how='outer')

Your Turn: Start Small

You don’t need to automate 500 reconciliations on day one.

Week 1: Automate reading one file Week 2: Add the merge logic Week 3: Handle credit/debit notation Week 4: Add difference calculations Week 5: Export results

By week 5, you’ll have your first automated reconciliation.

By week 10, you’ll wonder how you ever did it manually.


The Bottom Line

Before Python:

  • 40 hours of copy-paste hell
  • Stressed
  • Error-prone
  • Unsustainable

After Python:

  • 30 seconds of automated bliss
  • Confident
  • Accurate
  • Scalable

The best part? Once it’s set up, it works forever. Next month? 30 seconds. Month after? 30 seconds.

Your time back: 456 hours/year

What would you do with an extra 456 hours?

I learned Python. Best decision of my career.


Try It Yourself

Want the complete working script? Download it from our GitHub (adapt it to your reconciliation needs).

Have questions? Drop a comment below. Seriously—I read every one.

Your reconciliation nightmare can end tonight.


Join the Discussion on Discord! -

Have questions about automating reconciliations? Want to share your own automation success story? Join our community on Discord!

We’re building a community of finance professionals who are tired of Excel hell and ready to level up with Python.

Join PANDAUDIT Discord Server


Next time: “When VLOOKUP Fails You: Why Merge Functions Are Your New Best Friend” →