Petri Savolainen (petri)

Company: @koodaamo

Location: Helsinki, Finland

Organizations
beanstalkd
collective
koodaamo
plone
zopefoundation

Petri Savolainen's starred repositories

dspy

DSPy: The framework for programming—not prompting—foundation models

Language: Python | License: MIT | Stargazers: 15747 | Issues: 132 | Issues: 662

unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

Language: HTML | License: Apache-2.0 | Stargazers: 8058 | Issues: 51 | Issues: 1089

pypdf

A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files

Language: Python | License: NOASSERTION | Stargazers: 7944 | Issues: 149 | Issues: 1107

pdfplumber

Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.

Language: Python | License: MIT | Stargazers: 6217 | Issues: 92 | Issues: 545

svg-spinners

A collection of 24 x 24 dp SVG spinners! (CSS & SMIL)

Language: SVG | License: MIT | Stargazers: 5906 | Issues: 42 | Issues: 10

pdfminer.six

Community maintained fork of pdfminer - we fathom PDF

Language: Python | License: MIT | Stargazers: 5740 | Issues: 118 | Issues: 691

PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

Language: Python | License: AGPL-3.0 | Stargazers: 4831 | Issues: 59 | Issues: 1957

camelot

Camelot: PDF Table Extraction for Humans

Language: Python | License: NOASSERTION | Stargazers: 3628 | Issues: 82 | Issues: 381

borb

borb is a library for reading, creating and manipulating PDF files in python.

Language: Python | License: NOASSERTION | Stargazers: 3348 | Issues: 34 | Issues: 193

python-readability

fast python port of arc90's readability tool, updated to match latest readability.js!

Language: Python | License: Apache-2.0 | Stargazers: 2621 | Issues: 96 | Issues: 104

pdfrw

pdfrw is a pure Python library that reads and writes PDFs

Language: Python | License: NOASSERTION | Stargazers: 1846 | Issues: 67 | Issues: 174

pretix

Ticket shop application for conferences, festivals, concerts, tech events, shows, exhibitions, workshops, barcamps, etc.

Language: Python | License: NOASSERTION | Stargazers: 1790 | Issues: 49 | Issues: 1226

llmsherpa

Developer APIs to Accelerate LLM Projects

Language: Jupyter Notebook | License: MIT | Stargazers: 1262 | Issues: 11 | Issues: 71

goose3

A Python 3 compatible version of goose http://goose3.readthedocs.io/en/latest/index.html

Language: HTML | License: Apache-2.0 | Stargazers: 796 | Issues: 17 | Issues: 86

jusText

Heuristic based boilerplate removal tool

Language: Python | License: BSD-2-Clause | Stargazers: 714 | Issues: 21 | Issues: 28

img2table

img2table is a table identification and extraction Python Library for PDF and images, based on OpenCV image processing

Language: Python | License: MIT | Stargazers: 480 | Issues: 4 | Issues: 133

pypdfium2

Python bindings to PDFium

FuzzTypes

Pydantic extension for annotating autocorrecting fields.

Language: Python | License: MIT | Stargazers: 203 | Issues: 5 | Issues: 0

searcharray

Full text search in your Pandas dataframe

Language: Python | License: Apache-2.0 | Stargazers: 192 | Issues: 5 | Issues: 1

htmldate

Fast and robust date extraction from web pages, with Python or on the command-line

Language: Python | License: Apache-2.0 | Stargazers: 117 | Issues: 5 | Issues: 54

pdfreader

Python API for PDF documents

Language: Python | License: MIT | Stargazers: 112 | Issues: 8 | Issues: 66

htmlvoc

A RDF-based representation of the HTML-vocabulary to express HTML-documents in RDF, rendering them semantic.

Language: Python | License: NOASSERTION | Stargazers: 20 | Issues: 3 | Issues: 3

polarpy

Tools for reading and fusing live data streams from Polar OH1 (PPG) and H10 (ECG) sensors. pip install polarpy.

EDFbrowser

A free, opensource, multiplatform, universal viewer and toolbox intended for, but not limited to, timeseries storage files like EEG, EMG, ECG, BioImpedance, etc.

Language: C++ | License: NOASSERTION | Stargazers: 10 | Issues: 0 | Issues: 0

pymupdf-fonts

Collection of optional fonts for PyMuPDF

Language: Python | License: NOASSERTION | Stargazers: 8 | Issues: 5 | Issues: 3
Language: Python | License: AGPL-3.0 | Stargazers: 6 | Issues: 4 | Issues: 0

polar_accesslink

Python client for Polar AccessLink API.

Language: Python | License: MIT | Stargazers: 2 | Issues: 1 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 2 | Issues: 1 | Issues: 0

weighed-levenshtein-substring

Fork of https://github.com/infoscout/weighted-levenshtein

Language: Python | License: MIT | Stargazers: 2 | Issues: 0 | Issues: 0

polar_web

Python client for Polar web API.

Language: Python | License: MIT | Stargazers: 1 | Issues: 2 | Issues: 0