After this, you'll be able to tell in advance whether a file will give Claude useful text or just a picture of text, and fix it before you upload.
Before you start
Complete Share a Document with Claude first, so you have real upload experience to draw on when you practice the format checks in this lesson.
The idea
You uploaded a spreadsheet. Claude gave you a confident-sounding answer that was completely wrong. This is one of the most common file-sharing failures, and it almost always comes from a format problem, not a Claude problem.
Start with PDFs. There are two kinds, and they look identical in your file browser.
A text-extractable PDF was created by exporting from Word, Google Docs, or a design tool. When you open it, the words are actual selectable text. Claude can read this text directly.
An image-only PDF was created by scanning a paper document without running text recognition. When you open it, each page is effectively a photo. Claude sees a picture of words, not words it can read. The fix is to run OCR (optical character recognition) before uploading.
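If you are comfortable with a little Python, the select-text test can be approximated in code. The sketch below assumes you have already pulled the text of each page out of the PDF with a library of your choice (for example, pypdf's `extract_text()`). The function name `find_image_only_pages`, the sample pages, and the 20-character threshold are illustrative assumptions, not a standard.

```python
def find_image_only_pages(page_texts, min_chars=20):
    """Return 1-based page numbers whose extracted text is too short
    to be a real text layer (a hint that the page is a scanned image)."""
    suspect = []
    for number, text in enumerate(page_texts, start=1):
        # Count only non-whitespace characters; scanned pages usually yield none.
        visible = "".join((text or "").split())
        if len(visible) < min_chars:
            suspect.append(number)
    return suspect


# Hypothetical extraction results for a three-page file whose last page was scanned.
pages = [
    "This Agreement is made between the parties named below.",
    "Exhibit A: pricing schedule. Unit Price, Quantity, Total.",
    "",
]
print(find_image_only_pages(pages))  # prints [3]
```

Any page this flags is a candidate for OCR before you upload. A page with very little real text (a title page, say) can also be flagged, so treat the result as a prompt to look, not a verdict.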
Spreadsheets have their own failure mode. A plain Excel or CSV file usually works fine. But if your spreadsheet has merged cells, color-coded categories with no text labels, or data organized in columns that depend on visual alignment, Claude loses that structure when it reads the file. What looks like a clean table to you becomes a scrambled list of numbers with no headers. The fix: before uploading a complex spreadsheet, add explicit text labels to every column and unmerge any merged cells.
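To see why merged cells cause trouble, here is a minimal Python sketch of what a merged header often looks like once the sheet is exported to CSV, and one way to repair it. It assumes the export keeps a merged label only in the first cell and leaves the rest blank. The sample data and the helper names `fill_merged_headers` and `unlabeled_columns` are made up for illustration.

```python
import csv
import io


def fill_merged_headers(header_row):
    """Copy a merged cell's label into the blank cells that follow it."""
    filled = []
    last_label = ""
    for cell in header_row:
        label = cell.strip()
        if label:
            last_label = label
        filled.append(last_label)
    return filled


def unlabeled_columns(header_row):
    """Return 0-based indexes of columns that have no text label."""
    return [i for i, cell in enumerate(header_row) if not cell.strip()]


# Hypothetical export: "Q3 Summary" was merged across four columns,
# so the CSV keeps the label only in the first cell.
raw = "Q3 Summary,,,\nRent,Travel,Software,Total\n1200,300,150,1650\n"
rows = list(csv.reader(io.StringIO(raw)))

print(unlabeled_columns(rows[0]))    # prints [1, 2, 3]
print(fill_merged_headers(rows[0]))  # prints the label repeated in all four cells
```

Filling blanks from the left is only right when the blanks really came from a merged cell. If a column was simply never labeled, give it a proper name by hand.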
Knowing which format to use before you upload saves the back-and-forth of getting a wrong answer and wondering why.
Try it (12 min)
Paste this into Claude:
I am going to upload [describe your file: its type, where it came from, and roughly what it contains]. Before I do, I want to know: based on how this file was likely created, will you be able to read the text in it? What should I check or fix before uploading? If the file format might cause problems, tell me the best way to convert or prepare it so I get a reliable answer.
What a good response looks like:
Based on your description, here are a few things to check before uploading your quarterly expense report.
If it was exported from Excel or Google Sheets as a PDF: it is almost certainly text-extractable and will work well. Test: try selecting text in the PDF with your cursor. If you can highlight individual words, I can read them.
If it was scanned from a printed report: it is likely image-only, and I will see each page as a picture. To fix this, upload the file to Google Drive, right-click it, and choose 'Open with Google Docs.' Google will run text recognition and convert it to a readable document in about 30 seconds.
For the spreadsheet version: the merged 'Q3 Summary' header row spanning columns A through D will likely lose its structure when I read it. Before uploading, unmerge those cells and add the column label to each cell individually. Also, if your color coding (red for over-budget, green for under) carries meaning, add a text note to those rows, since I cannot see colors.
Make those two changes, then upload. You should get clean answers about your budget numbers.
Go deeper (12 min)
Paste this into Claude:
I am uploading a file that I suspect has a format problem: [describe the file and what you tried to do with it that gave a bad or confusing result]. Before you answer any questions about this file, tell me: what text can you actually read in it? Are there any sections that appear as images rather than text? Are there any tables or data structures that look garbled or incomplete?
Use Claude's answer to decide whether to fix the file or ask your question differently.
What a good response looks like:
Here is what I can see in the file you uploaded:
Pages 1-4: clean text, fully readable. I can see the contract terms, party names, and dates clearly.
Page 5 (the signature page): this appears to be a scanned image. I can see there are signatures and a date stamp, but I cannot read the handwritten date or confirm the exact text near the signature lines. If you need to verify the execution date, you will need to check page 5 directly in the original document.
The table on page 3 (Exhibit A, pricing schedule): the columns came through but the merged header row that spans all columns did not. I can see the individual numbers but I am not certain which column label applies to which set of figures. If you can tell me the column headers (e.g. 'Unit Price / Quantity / Total'), I can answer pricing questions accurately.
Everything else looks clean. What would you like to know about the contract?
Claude can do it for you
After uploading any file, ask Claude before your real question: 'What can you actually see in this file? Flag anything that looks like an image, a garbled table, or missing headers.' That one check prevents most file-reading failures.
You can now
Diagnose at least one format issue before uploading (image-only PDF, merged cells, color-only labels), apply a fix, and produce a reliable answer that the un-fixed version would have gotten wrong.
Key takeaways
Before you upload, ask: can I select text in this file? If not, Claude is reading a picture. Fix the format first, then ask your question.