CSV Cleanup Playbook: Deduplicate, Split, Normalize (Sheets & Excel)
Fix messy CSVs fast with copy‑paste recipes for the most common cleanup tasks.
Quick Recipes Table
Task | Sheets Formula / Path |
---|---|
Remove duplicate rows (by email) | =UNIQUE(A2:D) or Data → Remove duplicates |
Trim spaces/nonprintables | =TRIM(CLEAN(A2)) |
Proper case names | =PROPER(A2) |
Split full name | =SPLIT(A2," ") or =REGEXEXTRACT(A2,"^(\\S+)\\s+(.*)$") |
Standardize phone | =REGEXREPLACE(A2,"[^0-9]","") |
Anti‑join (remove overlaps) | =FILTER(A2:A, ISNA(MATCH(A2:A, C2:C,0))) |
Step‑by‑Step
1) Deduplicate by key (email + company)
=UNIQUE(SORT(A2:D,1,true))
Advanced: Multi‑column uniqueness with helper: =A2&"|"&B2
→ run UNIQUE on helper.
2) Split names into first/last
=ARRAYFORMULA({REGEXEXTRACT(A2:A,"^(\S+)"), REGEXEXTRACT(A2:A,"\s(.+)$")})
3) Normalize phone numbers
=ARRAYFORMULA(IF(A2:A="","",REGEXREPLACE(A2:A,"[^0-9]","")))
4) Clean whitespace and stray characters
=ARRAYFORMULA(IF(A2:A="","",TRIM(CLEAN(A2:A))))
5) Merge two lists with match flag
=ARRAYFORMULA(IF(A2:A="","",IF(ISNUMBER(MATCH(A2:A, C2:C,0)),"match","new")))
6) Export without losing leading zeros
- Format the column as Plain text before paste/export.
- Use
TEXT(A2,"000000")
to pad as needed.
One‑Click Cleanup (Apps Script)
function cleanupSheet(){
const sh = SpreadsheetApp.getActiveSheet();
const range = sh.getDataRange();
let values = range.getValues();
for (let r=1; r<values.length; r++){
for (let c=0; c<values[0].length; c++){
let v = values[r][c];
if (typeof v === 'string'){
v = v.replace(/[\u0000-\u001F\u007F]/g,'');
v = v.replace(/\s+/g,' ').trim();
values[r][c] = v;
}
}
}
range.setValues(values);
}