This model classifies a single document page image as either the beginning page of a document or a middle/end page. Single-page documents are classified as beginning pages. It is a first step toward the more general document boundary classification problem.

To generate the embeddings, use google/siglip2-so400m-patch16-512 without fine-tuning. The small script generate_embeddings.py writes the embeddings to a pickle file, given a pandas DataFrame tasks_df whose "image_path" column contains the paths of all the images.
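For orientation, the embedding step could look roughly like the sketch below. It is not the contents of generate_embeddings.py, which remains the reference. The helper names (batched, load_siglip_encoder, embed_paths, save_embeddings), the use of the pooled output of get_image_features, the batch size, and the pickle layout (a dict from image path to vector) are all assumptions made for illustration.

```python
import pickle
from typing import Callable, Dict, Iterator, List, Sequence

MODEL_ID = "google/siglip2-so400m-patch16-512"


def batched(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def load_siglip_encoder(model_id: str = MODEL_ID, device: str = "cpu"):
    """Return a function mapping a list of image paths to embedding vectors.

    Heavy dependencies are imported lazily so the helpers below stay usable
    without them. Using the pooled image features is an assumption.
    """
    import torch
    from PIL import Image
    from transformers import AutoModel, AutoProcessor

    processor = AutoProcessor.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id).to(device).eval()

    def encode(paths: List[str]) -> List[List[float]]:
        images = [Image.open(p).convert("RGB") for p in paths]
        inputs = processor(images=images, return_tensors="pt").to(device)
        with torch.no_grad():
            out = model.get_image_features(**inputs)
        # Some library versions return a tensor, others an output object.
        feats = out if isinstance(out, torch.Tensor) else out.pooler_output
        return feats.cpu().numpy().tolist()

    return encode


def embed_paths(
    paths: Sequence[str],
    encode: Callable[[List[str]], List[List[float]]],
    batch_size: int = 8,
) -> Dict[str, List[float]]:
    """Embed every path in batches; returns {image_path: vector} (assumed layout)."""
    embeddings: Dict[str, List[float]] = {}
    for chunk in batched(list(paths), batch_size):
        for path, vector in zip(chunk, encode(chunk)):
            embeddings[path] = vector
    return embeddings


def save_embeddings(embeddings: Dict[str, List[float]], out_path: str) -> None:
    """Write the embeddings dict to a pickle file."""
    with open(out_path, "wb") as handle:
        pickle.dump(embeddings, handle)


# Hypothetical usage with the tasks_df described above:
#   encode = load_siglip_encoder()
#   embeddings = embed_paths(tasks_df["image_path"].tolist(), encode)
#   save_embeddings(embeddings, "embeddings.pkl")
```

If the classifier expects a different container or ordering than this dict, follow the format that generate_embeddings.py actually writes.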

You can then pass the resulting embeddings to the model uploaded here.

Output meaning:

0 -> Middle or end page of a document

1 -> Beginning page of a document (or single-page document)
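As an illustration of how these labels relate to the broader boundary problem, the sketch below groups an ordered sequence of per-page predictions into documents. The function name split_into_documents and the label names are hypothetical, and treating a leading 0 as the start of a document is an assumption made so that no page is dropped.

```python
from typing import List, Sequence

# Label meanings as listed above; the string names are illustrative only.
LABEL_NAMES = {0: "middle_or_end", 1: "beginning"}


def split_into_documents(labels: Sequence[int]) -> List[List[int]]:
    """Group page indices into documents from per-page 0/1 predictions.

    A new document starts at every page labeled 1. If the sequence starts
    with a 0, that page opens the first document (an assumption).
    """
    documents: List[List[int]] = []
    for page_index, label in enumerate(labels):
        if label not in LABEL_NAMES:
            raise ValueError(f"unexpected label {label!r} at page {page_index}")
        if label == 1 or not documents:
            documents.append([page_index])
        else:
            documents[-1].append(page_index)
    return documents


# Six pages in scan order: a 3-page document, a single page, a 2-page document.
print(split_into_documents([1, 0, 0, 1, 1, 0]))
```

This assumes the predictions are in the same order as the pages in the scanned stack.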
