|

CSV Cleanup Playbook: Deduplicate, Split, Normalize (Sheets & Excel)

Fix messy CSVs fast with copy‑paste recipes for the most common cleanup tasks.

Quick Recipes Table

TaskSheets Formula / Path
Remove duplicate rows (by email)=UNIQUE(A2:D) or Data → Remove duplicates
Trim spaces/nonprintables=TRIM(CLEAN(A2))
Proper case names=PROPER(A2)
Split full name=SPLIT(A2," ") or =REGEXEXTRACT(A2,"^(\\S+)\\s+(.*)$")
Standardize phone=REGEXREPLACE(A2,"[^0-9]","")
Anti‑join (remove overlaps)=FILTER(A2:A, ISNA(MATCH(A2:A, C2:C,0)))

Step‑by‑Step

1) Deduplicate by key (email + company)

=UNIQUE(SORT(A2:D,1,true))

Advanced: Multi‑column uniqueness with helper: =A2&"|"&B2 → run UNIQUE on helper.

2) Split names into first/last

=ARRAYFORMULA({REGEXEXTRACT(A2:A,"^(\S+)"), REGEXEXTRACT(A2:A,"\s(.+)$")})

3) Normalize phone numbers

=ARRAYFORMULA(IF(A2:A="","",REGEXREPLACE(A2:A,"[^0-9]","")))

4) Clean whitespace and stray characters

=ARRAYFORMULA(IF(A2:A="","",TRIM(CLEAN(A2:A))))

5) Merge two lists with match flag

=ARRAYFORMULA(IF(A2:A="","",IF(ISNUMBER(MATCH(A2:A, C2:C,0)),"match","new")))

6) Export without losing leading zeros

  • Format the column as Plain text before paste/export.
  • Use TEXT(A2,"000000") to pad as needed.

One‑Click Cleanup (Apps Script)

function cleanupSheet(){
  const sh = SpreadsheetApp.getActiveSheet();
  const range = sh.getDataRange();
  let values = range.getValues();
  for (let r=1; r<values.length; r++){
    for (let c=0; c<values[0].length; c++){
      let v = values[r][c];
      if (typeof v === 'string'){
        v = v.replace(/[\u0000-\u001F\u007F]/g,'');
        v = v.replace(/\s+/g,' ').trim();
        values[r][c] = v;
      }
    }
  }
  range.setValues(values);
}

Similar Posts