From Spreadsheet to Database: A CSV Migration Checklist
Every production database starts somewhere. For many teams, that somewhere is a spreadsheet. Whether it's a Google Sheet tracking customers, an Excel file with inventory data, or a Notion table that finally outgrew itself, the day comes when rows freeze, formulas break, and someone asks, "Can we just put this in Postgres?"
CSV is the universal bridge between spreadsheets and SQL. But dumping a CSV into a database without a plan is how you end up with VARCHAR(255) columns that should be DATE, duplicate primary keys, and silent data truncation. This checklist will help you migrate cleanly.
Phase 1: Audit Your Data (Before You Touch SQL)
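Much of this phase can be scripted. The sketch below (a minimal example, assuming a Python environment; `audit_csv` is a hypothetical helper, not part of any tool mentioned in this article) counts rows, flags ragged rows whose column count doesn't match the header, and tallies empty cells per column:

```python
import csv
from collections import Counter

def audit_csv(path):
    """Report basic health stats for a CSV before designing a schema."""
    with open(path, newline="", encoding="utf-8", errors="replace") as f:
        reader = csv.reader(f)
        header = next(reader)
        empties = Counter()
        rows = ragged = 0
        for row in reader:
            rows += 1
            if len(row) != len(header):
                ragged += 1  # a row that would silently misalign on import
                continue
            for col, val in zip(header, row):
                if not val.strip():
                    empties[col] += 1
    return {"rows": rows, "ragged_rows": ragged, "empty_cells": dict(empties)}
```

A nonzero `ragged_rows` count usually means unescaped delimiters or stray line breaks inside cells, and is worth fixing before any schema work.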
Phase 2: Design Your Schema
CSV files are flat. Relational databases are not. Before you create a single table, ask whether your data should be normalized.
When to keep it flat (one table)
- Under 10,000 rows
- No repeating groups or multi-value cells
- No obvious parent-child relationships
- You need the migration done in the next hour
When to normalize (multiple tables)
- Cells contain comma-separated values (e.g., a tags column containing "red, blue, green")
- The same entity appears in multiple rows with redundant data
- You have lookup values that should be foreign keys (e.g., a status column with 5 repeated strings)
- You plan to query with JOINs in the future
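To make the first case concrete, here is a minimal sketch (using Python's built-in sqlite3 and an invented products CSV) of splitting a multi-value tags cell into a lookup table and a join table:

```python
import csv
import io
import sqlite3

# Hypothetical flat CSV with a multi-value "tags" cell
flat = io.StringIO('id,name,tags\n1,widget,"red, blue"\n2,gadget,blue\n')

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE product_tags (
        product_id INTEGER REFERENCES products(id),
        tag_id INTEGER REFERENCES tags(id)
    );
""")

for row in csv.DictReader(flat):
    conn.execute("INSERT INTO products VALUES (?, ?)", (row["id"], row["name"]))
    for tag in (t.strip() for t in row["tags"].split(",")):
        # Deduplicate tag names into the lookup table
        conn.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (tag,))
        tag_id = conn.execute(
            "SELECT id FROM tags WHERE name = ?", (tag,)).fetchone()[0]
        conn.execute("INSERT INTO product_tags VALUES (?, ?)",
                     (row["id"], tag_id))
```

The same three-table shape (entity, lookup, join) carries over to Postgres or MySQL unchanged; only the DDL dialect differs.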
Phase 3: Generate and Validate SQL
Once your CSV is clean and your schema is sketched, it's time to generate the actual SQL. You have two options: manual DDL or a tool.
Manual DDL gives you full control but is error-prone. A tool like SchemaLens CSV to SQL Converter auto-detects delimiters, infers column types, and generates dialect-correct CREATE TABLE and INSERT statements for PostgreSQL, MySQL, SQLite, and SQL Server.
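If you're curious what type inference involves (this is a simplified illustration, not SchemaLens's actual algorithm), the core idea is to test every non-empty value in a column against progressively looser patterns:

```python
import re

def infer_type(values):
    """Pick the narrowest SQL type that fits every non-empty sample value."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return "TEXT"
    if all(re.fullmatch(r"-?\d+", v) for v in non_empty):
        # Leading zeros signal an identifier (ZIP code, phone), not a number
        if any(v.startswith("0") and len(v) > 1 for v in non_empty):
            return "TEXT"
        return "INTEGER"
    if all(re.fullmatch(r"-?\d+\.\d+", v) for v in non_empty):
        return "REAL"
    if all(re.fullmatch(r"\d{4}-\d{2}-\d{2}", v) for v in non_empty):
        return "DATE"
    return "TEXT"
```

Note the leading-zero guard: a column of ZIP codes like 01234 must come out as TEXT, not INTEGER, or the zeros are lost on import.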
Phase 4: Migrate the Data
With a validated schema, you're ready for the full migration. The exact method depends on your database size and tooling.
Small datasets (< 10,000 rows)
Use multi-row INSERT statements generated by a converter tool. Batch them in groups of 500 to 1,000 to avoid statement size limits.
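If you're scripting the load yourself, the batching pattern looks like this (a sketch assuming Python and a table whose columns match the CSV header; `batched_insert` is an invented helper):

```python
import csv
from itertools import islice

def batched_insert(conn, table, path, batch_size=500):
    """Insert CSV rows in fixed-size batches to stay under statement limits."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        placeholders = ", ".join("?" * len(header))
        sql = (f"INSERT INTO {table} ({', '.join(header)}) "
               f"VALUES ({placeholders})")
        while True:
            batch = list(islice(reader, batch_size))
            if not batch:
                break
            with conn:  # one transaction per batch
                conn.executemany(sql, batch)
```

One transaction per batch also means a mid-import failure leaves you at a known batch boundary rather than a half-committed mess.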
Medium datasets (10,000 to 1,000,000 rows)
Use your database's native bulk import command:
-- PostgreSQL
COPY users FROM '/path/to/users.csv' WITH (FORMAT csv, HEADER true);
-- MySQL
LOAD DATA INFILE '/path/to/users.csv'
INTO TABLE users
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;
-- SQLite (in the sqlite3 shell; if the table already exists,
-- use .import --skip 1 so the header row isn't imported as data)
.mode csv
.import /path/to/users.csv users
Large datasets (> 1,000,000 rows)
Use a dedicated ETL tool like pgloader (PostgreSQL), mysqlimport, or a streaming pipeline. For one-time migrations, pgloader handles type casting, encoding conversion, and batching automatically.
Phase 5: Validate Everything
Importing is not the end. Validation is. Before you declare victory, check that row counts match the source CSV, primary keys are unique and non-null, numeric and date columns fall within expected ranges, and a handful of spot-checked rows match the original spreadsheet.
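The count and key checks are easy to automate. Here's one way to script them (a sketch using Python's sqlite3; `validate_import` and the `pk` parameter are invented for illustration, and the same queries work in any dialect):

```python
import csv
import sqlite3

def validate_import(conn, table, csv_path, pk="id"):
    """Compare the imported table against the source CSV."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        csv_rows = sum(1 for _ in csv.reader(f)) - 1  # minus the header row
    db_rows = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    distinct_pks = conn.execute(
        f"SELECT COUNT(DISTINCT {pk}) FROM {table}").fetchone()[0]
    null_pks = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {pk} IS NULL").fetchone()[0]
    return {
        "row_counts_match": csv_rows == db_rows,
        "pk_unique": distinct_pks == db_rows,
        "no_null_pks": null_pks == 0,
    }
```

Any False in the result means the import dropped, duplicated, or mangled rows, and the migration isn't done.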
Phase 6: Post-Migration Cleanup
Common Pitfalls to Avoid
Assuming CSV is UTF-8. Excel on macOS exports UTF-8. Excel on Windows often exports Windows-1252. Always verify encoding before importing.
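A quick way to verify is to attempt decoding with each candidate encoding (a sketch; `sniff_encoding` is an invented helper, and for exotic encodings you'd reach for a detection library instead):

```python
def sniff_encoding(path):
    """Try UTF-8 first; fall back to Windows-1252, a common Excel export."""
    raw = open(path, "rb").read()
    for enc in ("utf-8", "cp1252"):
        try:
            raw.decode(enc)
            return enc
        except UnicodeDecodeError:
            continue
    return None  # neither worked; inspect the file manually
```

Order matters here: cp1252 accepts almost any byte sequence, so testing UTF-8 first avoids misclassifying valid UTF-8 files.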
Ignoring leading zeros. ZIP codes, phone numbers, and IDs that start with zeros will lose those zeros if imported as integers. Store them as TEXT or VARCHAR.
Using spreadsheet formulas as data. If a cell contains =A1+B1, the CSV export will contain the calculated value, which is what you want. But verify that calculated values were actually exported and not the formula string.
Forgetting time zones. Spreadsheets don't store time zones. If your timestamps need zone awareness, decide on a canonical zone (usually UTC) and convert before import.
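The conversion itself is a one-liner per timestamp once you've picked the source zone. A minimal sketch (assuming Python 3.9+ with zoneinfo; `to_utc`, the timestamp format, and the New York default zone are all illustrative assumptions):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def to_utc(naive_str, source_zone="America/New_York"):
    """Attach the spreadsheet's assumed zone, then convert to UTC for storage."""
    naive = datetime.strptime(naive_str, "%Y-%m-%d %H:%M:%S")
    aware = naive.replace(tzinfo=ZoneInfo(source_zone))
    return aware.astimezone(timezone.utc).isoformat()
```

Run this in the cleanup step before import, so every timestamp lands in the database already canonical; converting after the fact means guessing which rows were already UTC.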