It Was 11:47 PM on a Month-End Night
I was on my seventeenth reconciliation spreadsheet. My eyes were crossing. My back hurt. My coffee was cold.
The task? Reconcile 500 investment accounts between our legacy accounting system and our custodian’s reports.
The process?
- Open trial balance export (text file with 50,000 lines)
- Copy account numbers
- Paste into Excel
- Open custodian report
- Copy balances
- Paste into Excel
- Create VLOOKUP formulas
- Find differences
- Research discrepancies
- Repeat 499 more times
Time required: 40 hours per month (an entire work week!)
There HAD to be a better way.
The Breaking Point
Month-end was always brutal, but December was worse. Our CFO needed the reconciliations by 9 AM the next morning for the board meeting.
At midnight, I was only halfway through. I made a decision:
“I’m learning Python. Tonight. Right now.”
(Okay, technically I decided to learn Python the NEXT day, but dramatic effect, you know?)
What I Discovered: Python Could Do in 30 Seconds What Took Me 40 Hours
The Old Way (Excel Hell):
Step 1: Parse trial balance text file
- Open in Notepad
- Copy account sections manually
- Paste into Excel
- Use Text-to-Columns
- Hope it doesn’t break
- Time: 2 hours
Step 2: Parse custodian report
- Same nightmare
- Different format
- Different issues
- Time: 2 hours
Step 3: Reconcile
- Create VLOOKUP formulas
=VLOOKUP(A2, OtherSheet!$A:$Z, 25, FALSE)- Copy down 500 rows
- Wait for Excel to stop calculating
- Find #N/A errors
- Fix them manually
- Time: 36 hours (seriously)
Total: 40 hours/month
The Python Way: Automation Magic
import pandas as pd
# Step 1: Read trial balance (messy text file)
tb = pd.read_table('trial_balance.txt', header=None, names=['raw'])
# Extract account numbers and balances
tb[['Account', 'Description', 'Balance']] = tb['raw'].str.extract(
r'(\d{6})\s+(.{30})\s+([\d,]+\.\d{2}(?:\s+CR)?)'
)
# Handle credit notation
mask = tb['Balance'].str.contains('CR')
tb.loc[mask, 'Balance'] = '-' + tb.loc[mask, 'Balance'].str.replace(' CR', '')
tb['Balance'] = tb['Balance'].str.replace(',', '').astype(float)
# Step 2: Read custodian report (Excel)
custodian = pd.read_excel('custodian_report.xlsx', sheet_name='Holdings')
custodian = custodian.rename(columns={'Acct_Number': 'Account', 'Market_Value': 'Custodian_Balance'})
# Step 3: Reconcile with ONE LINE
reconciliation = pd.merge(
tb[['Account', 'Description', 'Balance']],
custodian[['Account', 'Custodian_Balance']],
on='Account',
how='outer', # Keep accounts from both sources
indicator=True # Flag where each account came from
)
# Calculate differences
reconciliation['Difference'] = reconciliation['Balance'].fillna(0) - reconciliation['Custodian_Balance'].fillna(0)
# Flag material differences (over $100)
reconciliation['Material'] = abs(reconciliation['Difference']) > 100
# Export results
reconciliation.to_excel('reconciliation_results.xlsx', index=False)
# Summary stats
print(f"Total accounts: {len(reconciliation)}")
print(f"Perfect matches: {(reconciliation['Difference'] == 0).sum()}")
print(f"Material differences: {reconciliation['Material'].sum()}")
print(f"Total difference: ${reconciliation['Difference'].sum():,.2f}")
Time: 30 seconds
The Results
Before Python:
- ⏰ Time: 40 hours/month
- Stress Level: Through the roof
- Errors: 5-10 per month (copy-paste mistakes)
- Accounts Reconciled: 500 (painfully)
- Late Nights: Every month-end
- Cost: $2,000/month in overtime
After Python:
- ⏰ Time: 30 seconds + 2 hours research (only material items)
- Stress Level: Manageable
- Errors: Zero (automated = consistent)
- Accounts Reconciled: 500 (automatically)
- Late Nights: Haven’t had one in 6 months
- Cost: $0 overtime
Annual Savings: 456 hours = $24,000 in labor costs
What Made This Work: The Key Patterns
1. Text Parsing with Regex
Instead of manually splitting text, Python extracts exactly what you need:
# Extract: Account (6 digits), Description (30 chars), Balance (numbers)
pattern = r'(\d{6})\s+(.{30})\s+([\d,]+\.\d{2}(?:\s+CR)?)'
df[['Account', 'Description', 'Balance']] = df['raw'].str.extract(pattern)
2. Outer Merge with Indicator
Find accounts that exist in:
- Both systems (perfect!)
- Only trial balance (missing from custodian - investigate)
- Only custodian (missing from books - BIG PROBLEM)
df = pd.merge(left, right, on='Account', how='outer', indicator=True)
# indicator creates '_merge' column: 'both', 'left_only', 'right_only'
3. Vectorized Calculations
No formulas to copy. No waiting. Just instant math:
# Calculate differences for ALL 500 accounts at once
df['Difference'] = df['Balance'].fillna(0) - df['Custodian_Balance'].fillna(0)
The “Aha!” Moment
The CFO walked by my desk at 9:15 AM (15 minutes after the deadline).
CFO: “How’s the reconciliation coming? Board meeting is in an hour.”
Me: “Done. Sent it to you 10 minutes ago.”
CFO: suspicious “Done? Like… all 500 accounts?”
Me: “Yep. 498 perfect matches. 2 material differences. Already researched them. It’s a timing difference from yesterday’s trades.”
CFO: long pause “It usually takes you the whole week…”
Me: “Not anymore.” tries not to grin too much
CFO: “Can you teach the rest of the team?”
Me: “Already writing the training materials.”
That’s when I knew Python had changed my career forever.
Bonus: Handling Common Issues
Issue #1: Accounts in Different Formats
Problem: Trial balance uses “001234” but custodian uses “1234”
Solution:
# Standardize before merging
tb['Account'] = tb['Account'].str.lstrip('0') # Remove leading zeros
custodian['Account'] = custodian['Account'].astype(str).str.zfill(6) # Add leading zeros
Issue #2: Fuzzy Matching
Problem: Sometimes descriptions don’t match exactly
Solution:
from fuzzywuzzy import fuzz
# Calculate similarity score
df['similarity'] = df.apply(
lambda row: fuzz.ratio(row['Description_TB'], row['Description_Custodian']),
axis=1
)
# Flag potential matches
df['Potential_Match'] = df['similarity'] > 85
Issue #3: Multiple Custodians
Problem: We have 5 different custodians
Solution:
# Read all custodian files at once
custodian_files = [
'custodian_A.xlsx',
'custodian_B.xlsx',
'custodian_C.xlsx',
'custodian_D.xlsx',
'custodian_E.xlsx'
]
all_custodian = pd.concat([
pd.read_excel(f, sheet_name='Holdings').assign(Custodian=f.split('_')[1].split('.')[0])
for f in custodian_files
])
# Now reconcile against combined custodian data
reconciliation = pd.merge(tb, all_custodian, on='Account', how='outer')
Your Turn: Start Small
You don’t need to automate 500 reconciliations on day one.
Week 1: Automate reading one file Week 2: Add the merge logic Week 3: Handle credit/debit notation Week 4: Add difference calculations Week 5: Export results
By week 5, you’ll have your first automated reconciliation.
By week 10, you’ll wonder how you ever did it manually.
The Bottom Line
Before Python:
- 40 hours of copy-paste hell
- Stressed
- Error-prone
- Unsustainable
After Python:
- 30 seconds of automated bliss
- Confident
- Accurate
- Scalable
The best part? Once it’s set up, it works forever. Next month? 30 seconds. Month after? 30 seconds.
Your time back: 456 hours/year
What would you do with an extra 456 hours?
I learned Python. Best decision of my career.
Try It Yourself
Want the complete working script? Download it from our GitHub (adapt it to your reconciliation needs).
Have questions? Drop a comment below. Seriously—I read every one.
Your reconciliation nightmare can end tonight.
Join the Discussion on Discord! -
Have questions about automating reconciliations? Want to share your own automation success story? Join our community on Discord!
We’re building a community of finance professionals who are tired of Excel hell and ready to level up with Python.
Next time: “When VLOOKUP Fails You: Why Merge Functions Are Your New Best Friend” →