API Reference¶
Document¶
-
class redstork.Document(file_name, password=None)¶ PDF document.
A list-like container of pages. Sample use:
    from redstork import Document

    doc = Document('sample.pdf')
    print('Number of pages:', len(doc))
    for key, value in doc.meta.items():
        print(' ', key, ':', value)
-
__init__(file_name, password=None)¶ Create new PDF Document object, from a file.
Parameters:
-
__getitem__(page_index)¶ Returns Page at this index. Example:
    doc = ...
    page = doc[0]  # first page
Parameters: page_index (int) – zero-based page index
Returns: Page object
-
__len__()¶ Returns number of pages in this document
-
__iter__()¶ Iterate over the pages of this document
-
get_all_pages_size()¶ Get the width and height of all pages, without loading each page
-
changed¶ True if the PDF was changed since the load (or last save)
-
save(filename)¶ Saves PDF file, resets Document.changed to False
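To illustrate how changed and save() fit together, here is a minimal sketch that writes a document only when it reports unsaved modifications. The helper name save_if_changed is hypothetical and not part of redstork; it relies only on the changed attribute and the save(filename) method described above.

```python
def save_if_changed(doc, filename):
    """Save `doc` to `filename` only if it reports unsaved changes.

    `doc` is expected to behave like redstork.Document: it exposes a
    boolean `changed` attribute and a `save(filename)` method.
    Returns True if a save was performed, False otherwise.
    """
    if doc.changed:
        doc.save(filename)
        return True
    return False
```

Because save() resets changed to False, calling the helper twice in a row on a real Document should save at most once.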
-
Page¶
-
class redstork.Page(page, page_index, parent)¶ Represents a page of a PDF file.
-
crop_box¶ Page crop box.
-
media_box¶ Page media box.
-
rotation¶ Page rotation.
- 0 - no rotation
- 1 - rotated 90 degrees clockwise
- 2 - rotated 180 degrees clockwise
- 3 - rotated 270 degrees clockwise
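The rotation codes above are multiples of a quarter turn, so converting one to degrees is a single multiplication. The helper below is a hypothetical convenience function, not part of redstork; it also rejects values outside the documented range.

```python
def rotation_degrees(rotation):
    """Convert a Page.rotation code (0..3) to clockwise degrees."""
    if rotation not in (0, 1, 2, 3):
        raise ValueError(f"unexpected rotation code: {rotation!r}")
    return rotation * 90
```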
-
label¶ Page label.
-
__len__()¶ Number of objects on this page.
-
__getitem__(index)¶ Get object at this index.
-
__iter__()¶ Iterates over page objects.
-
flat_iter()¶ Iterates over all non-container objects (Text, Image, Path).
-
render_to_buffer(scale=1.0, rect=None)¶ Render page (or rectangle on the page) to memory (the pixel format is BGRx)
Parameters:
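Most imaging tools expect RGB rather than BGRx, so a rendered buffer usually needs its channels reordered. The sketch below shows one way to do that in pure Python. It is not part of redstork, and it assumes the caller already has the raw pixel bytes and the image dimensions, with 4 bytes per pixel and no row padding; the exact return value of render_to_buffer is not described in this reference.

```python
def bgrx_to_rgb(data, width, height):
    """Convert tightly packed BGRx pixel bytes to RGB bytes.

    Assumption: 4 bytes per pixel in B, G, R, x order, no row padding.
    The fourth (x) byte of each pixel is ignored.
    """
    expected = width * height * 4
    if len(data) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(data)}")
    rgb = bytearray(width * height * 3)
    rgb[0::3] = data[2::4]  # red channel
    rgb[1::3] = data[1::4]  # green channel
    rgb[2::3] = data[0::4]  # blue channel
    return bytes(rgb)
```

If the buffer turns out to use padded rows, each row would have to be sliced out by its stride before conversion.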
-
render(file_name, scale=1.0, rect=None)¶ Render page (or rectangle on the page) as PPM image file.
Parameters:
-
PageObject¶
-
class redstork.PageObject(obj, index, typ, parent)¶ Common superclass of all page objects
-
OBJ_TYPE_TEXT= 1¶ see TextObject
-
OBJ_TYPE_PATH= 2¶ see PathObject
-
OBJ_TYPE_IMAGE= 3¶ see ImageObject
-
OBJ_TYPE_SHADING= 4¶ see ShadingObject
-
OBJ_TYPE_FORM= 5¶ see FormObject
-
type= None¶ type of this object
-
matrix= None¶ transformation matrix of this object
-
page¶ Links back to the parent page
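The type attribute makes it easy to summarize what a page contains. The following sketch counts objects by type using the numeric codes listed above. The names OBJ_TYPE_NAMES and count_object_types are hypothetical helpers, and the function accepts any iterable of objects that carry a type attribute.

```python
from collections import Counter

# Type codes as listed for redstork.PageObject in this reference.
OBJ_TYPE_NAMES = {1: "text", 2: "path", 3: "image", 4: "shading", 5: "form"}


def count_object_types(objects):
    """Count page objects by type name.

    `objects` is any iterable whose items expose a numeric `type`
    attribute; codes not listed above are counted as "unknown".
    """
    counts = Counter()
    for obj in objects:
        counts[OBJ_TYPE_NAMES.get(obj.type, "unknown")] += 1
    return dict(counts)
```

Passing a Page counts its top-level objects, while passing page.flat_iter() counts only non-container objects.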
-
TextObject¶
-
class redstork.TextObject(obj, index, typ, parent)¶ Represents a string of text on a page
-
font_size= None¶ font size of this text object
-
matrix= None¶ matrix for this page object
-
__len__()¶ Number of items in this string
-
__getitem__(index)¶ Returns item at this index.
Each item is a 3-tuple: (charcode, x, y).
-
__iter__()¶ Iterates over items.
-
char_iter()¶ Iterates over characters (skips kerns)
-
text_geometry_iter()¶ Iterates over characters and returns character text and bounds
-
effective_font_size¶ Returns effective (user-visible) font size
-
scale_y¶ Returns Y-scale of text matrix transformation
-
scale_x¶ Returns X-scale of text matrix transformation
-
skew¶ Returns skew value of text matrix.
-
box(x0, y0, x1, y1)¶ Computes bounding box after transformation with text matrix
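For intuition about what a bounding box "after transformation" means, the sketch below transforms the four corners of a rectangle and takes their extremes. This is not redstork's implementation. It assumes the matrix is a PDF-style 6-tuple (a, b, c, d, e, f) that maps (x, y) to (a*x + c*y + e, b*x + d*y + f); how redstork stores its matrices is not stated in this reference.

```python
def transform_box(matrix, x0, y0, x1, y1):
    """Axis-aligned bounding box of a rectangle after an affine transform.

    Assumption: `matrix` is a PDF-style 6-tuple (a, b, c, d, e, f) mapping
    (x, y) to (a*x + c*y + e, b*x + d*y + f).
    """
    a, b, c, d, e, f = matrix
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    xs = [a * x + c * y + e for x, y in corners]
    ys = [b * x + d * y + f for x, y in corners]
    return min(xs), min(ys), max(xs), max(ys)
```

All four corners are transformed, rather than just two, because rotation or skew can move any corner to the extreme of the resulting box.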
-
Font¶
-
class redstork.Font(font, parent)¶ Represents font used in a PDF file.
-
FLAGS_NORMAL= 0¶ Normal font
-
FLAGS_FIXED_PITCH= 1¶ Fixed pitch font
-
FLAGS_SERIF= 2¶ Serif font
-
FLAGS_SYMBOLIC= 4¶ Symbolic font
-
FLAGS_SCRIPT= 8¶ Script font
-
FLAGS_NONSYMBOLIC= 32¶ Non-symbolic font
-
FLAGS_ITALIC= 64¶ Italic font
-
FLAGS_ALLCAP= 65536¶ All-cap font
-
FLAGS_SMALLCAP= 131072¶ Small-cap font
-
FLAGS_FORCE_BOLD= 262144¶ Force-bold font
-
name¶ Font name in the PDF document.
-
simple_name¶ Font name without PDF-specific prefix.
-
flags¶ Font flags.
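The FLAGS_* values above are distinct powers of two, which suggests that flags is a bitmask in which several flags can be set at once. Under that assumption, a value can be decoded into readable names as sketched below. FONT_FLAGS and describe_font_flags are hypothetical helpers; the numeric values are copied from the constants listed above.

```python
# Flag values as listed for redstork.Font in this reference.
FONT_FLAGS = {
    1: "FIXED_PITCH",
    2: "SERIF",
    4: "SYMBOLIC",
    8: "SCRIPT",
    32: "NONSYMBOLIC",
    64: "ITALIC",
    65536: "ALLCAP",
    131072: "SMALLCAP",
    262144: "FORCE_BOLD",
}


def describe_font_flags(flags):
    """Return the names of all flags set in `flags` (assumed to be a bitmask)."""
    if flags == 0:
        return ["NORMAL"]  # FLAGS_NORMAL: no bits set
    return [name for bit, name in FONT_FLAGS.items() if flags & bit]
```

For a real font this would be called as describe_font_flags(font.flags).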
-
weight¶ Font weight.
-
is_vertical¶ True for vertical writing systems (CJK)
-
id¶ Tuple of (Object_id, Generation_id), identifying the underlying stream in the PDF file
-
load_glyph(charcode)¶ Load glyph, see Glyph.
Parameters: charcode (int) – the character code (see TextObject)
-
__getitem__(charcode)¶ Returns Unicode text of this character.
Parameters: charcode (int) – the character code (see TextObject)
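Combining this lookup with the (charcode, x, y) items of a TextObject gives a simple way to recover readable text. The sketch below is a hypothetical helper, not part of redstork. It assumes every item it receives is a character 3-tuple whose charcode the font can map; how kerns appear among a TextObject's items is not described here, so they may need to be filtered out first (for example by iterating with char_iter(), if that yields the same tuples).

```python
def decode_items(items, font):
    """Join the Unicode text of (charcode, x, y) items.

    `font` is anything supporting `font[charcode]` and returning a string,
    such as a redstork.Font. Positions are ignored.
    """
    return "".join(font[charcode] for charcode, _x, _y in items)
```

A plain dict works as a stand-in for the font when experimenting, for example decode_items([(72, 0.0, 0.0)], {72: "H"}).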
-
is_editable¶ True if font encoding can be changed
-
ImageObject¶
PathObject¶
ShadingObject¶
-
class redstork.ShadingObject(obj, index, typ, parent)¶ Represents a shading object on a page.
FormObject¶
-
class redstork.FormObject(obj, index, typ, parent)¶ Represents a form (XObject) on a page: a container of other page objects (used internally).
-
matrix= None¶ matrix for this page object
-
form_matrix= None¶ transformation matrix for contained objects
-
flat_iter()¶ Iterates over all non-container objects in this form.
-