I am working on a project in .NET Core, and I want to determine whether the content in my Stream object is actually a PDF file. Is there a simple way to tell if the in-stream content is a PDF?
Yes, there is.
According to the Adobe PDF specification, a PDF file typically begins with the five-byte sequence "%PDF-". This header gives you a quick way to check whether a file is likely a PDF: read the first five bytes from the stream and compare them with "%PDF-". The check works on any platform and needs no extra libraries, though it only confirms the header, not that the rest of the file is well-formed.
Here’s an example in C#:
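using System;
using System.IO;
using System.Text;

public bool IsPdf(Stream stream)
{
    byte[] buffer = new byte[5];
    stream.Seek(0, SeekOrigin.Begin); // Reset stream position (requires a seekable stream)

    // Stream.Read may return fewer bytes than requested,
    // so keep reading until the buffer is full or the stream ends.
    int total = 0;
    while (total < buffer.Length)
    {
        int read = stream.Read(buffer, total, buffer.Length - total);
        if (read == 0) break; // End of stream
        total += read;
    }

    // Decode only the bytes that were actually read.
    string header = Encoding.ASCII.GetString(buffer, 0, total);
    return header == "%PDF-";
}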
This code snippet reads the first five bytes of a Stream and checks whether they match "%PDF-". For more details, you can refer to section 7.5.2 of the Adobe PDF Specification, and for the latest specification, see the PDF Association’s resource page.
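If the caller still needs the stream afterwards (for example, to pass it to a PDF parser), it can help to restore the original position once the check is done. The following is only a sketch of that idea: `PdfHeaderCheck` and `StartsWithPdfHeader` are hypothetical names, not part of .NET, and the code assumes the stream is seekable. It compares raw bytes instead of decoding a string.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

// Hypothetical helper for illustration; not part of .NET.
public static class PdfHeaderCheck
{
    // The five ASCII bytes of "%PDF-".
    private static readonly byte[] Signature = Encoding.ASCII.GetBytes("%PDF-");

    // Assumes a seekable stream; the caller's position is restored afterwards.
    public static bool StartsWithPdfHeader(Stream stream)
    {
        long originalPosition = stream.Position;
        try
        {
            stream.Seek(0, SeekOrigin.Begin);

            byte[] buffer = new byte[Signature.Length];
            int total = 0;
            while (total < buffer.Length)
            {
                int read = stream.Read(buffer, total, buffer.Length - total);
                if (read == 0)
                {
                    return false; // Stream ended before five bytes were available
                }
                total += read;
            }

            return buffer.SequenceEqual(Signature);
        }
        finally
        {
            stream.Position = originalPosition;
        }
    }
}
```

Called with a MemoryStream that wraps the ASCII bytes of "%PDF-1.7", the method returns true. For an empty stream or one that starts with other bytes, it returns false. In each case the stream's position is left where it was before the call.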