Cloud-Native Neural Search Framework for Any Kind of Data
Jina allows you to build deep learning-powered search-as-a-service in just minutes.
Run Quick Demo
- Fashion image search: jina hello fashion
- QA chatbot: pip install "jina[chatbot]" && jina hello chatbot
- Multimodal search: pip install "jina[multimodal]" && jina hello multimodal
- Fork the source of a demo to your folder: jina hello fork fashion ../my-proj/
Install
- via PyPI
$ pip install "jina[devel]"
$ jina -v
2.0.0
- via Docker
$ docker run jinaai/jina:latest -v
2.0.0
More installation options
| x86/64,arm64,v6,v7,Apple M1 | On Linux/macOS & Python 3.7/3.8/3.9 | Docker Users |
|---|---|---|
| Standard | pip install jina | docker run jinaai/jina:latest |
| Daemon | pip install "jina[daemon]" | docker run --network=host jinaai/jina:latest-daemon |
| With Extras | pip install "jina[devel]" | docker run jinaai/jina:latest-devel |
Version identifiers are explained here. Jina can run on Windows Subsystem for Linux. We welcome the community to help us with native Windows support.
Get Started
Document, Executor, and Flow are the three fundamental concepts in Jina.
- Document is the basic data type in Jina;
- Executor is how Jina processes Documents;
- Flow is how Jina streamlines and distributes Executors.
import numpy as np
from jina import Document, DocumentArray, Executor, Flow, requests


class CharEmbed(Executor):  # a simple character embedding with mean-pooling
    offset = 32  # ASCII code of the space character, the first printable character
    dim = 127 - offset + 1  # last pos reserved for `UNK`
    char_embd = np.eye(dim) * 1  # one-hot embedding for all chars

    @requests
    def foo(self, docs: DocumentArray, **kwargs):
        for d in docs:
            # map each char to its one-hot row; chars outside the range go to the last (`UNK`) row
            r_emb = [ord(c) - self.offset if self.offset <= ord(c) <= 127 else (self.dim - 1) for c in d.text]
            d.embedding = self.char_embd[r_emb, :].mean(axis=0)  # average pooling


class Indexer(Executor):
    _docs = DocumentArray()  # for storing all documents in memory

    @requests(on='/index')
    def foo(self, docs: DocumentArray, **kwargs):
        self._docs.extend(docs)  # extend stored `docs`

    @requests(on='/search')
    def bar(self, docs: DocumentArray, **kwargs):
        q = np.stack(docs.get_attributes('embedding'))  # get all embeddings from query docs
        d = np.stack(self._docs.get_attributes('embedding'))  # get all embeddings from stored docs
        euclidean_dist = np.linalg.norm(q[:, None, :] - d[None, :, :], axis=-1)  # pairwise euclidean distance
        for dist, query in zip(euclidean_dist, docs):  # add & sort match
            query.matches = [Document(self._docs[int(idx)], copy=True, scores={'euclid': d}) for idx, d in enumerate(dist)]
            query.matches.sort(key=lambda m: m.scores['euclid'].value)  # sort matches by their values


f = Flow(port_expose=12345, protocol='http').add(uses=CharEmbed, parallel=2).add(uses=Indexer)  # build a Flow, with 2 parallel CharEmbed, tho unnecessary
with f:
    f.post('/index', (Document(text=t.strip()) for t in open(__file__) if t.strip()))  # index all lines of this file
    f.block()  # block for listening request
Then open http://localhost:12345/docs (an extended Swagger UI) in your browser, click the /search tab, and enter:
{"data": [{"text": "@requests(on=something)"}]}
Here, @requests(on=something) is our textual query; we want to find the lines in the server code snippet above that are most similar to it. Now click the Execute button!
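The same query can also be prepared from a script using only the Python standard library. The sketch below is illustrative rather than an official Jina client: build_search_request is a hypothetical helper, and it assumes the HTTP gateway accepts a POST at /search with the same JSON body entered in the Swagger UI. It only builds and prints the request; the response format is not shown here.

```python
import json
import urllib.request


def build_search_request(text: str, host: str = "localhost", port: int = 12345) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one text query.

    Hypothetical helper: the /search path and the JSON layout are assumed
    to match what the Swagger UI sends.
    """
    body = json.dumps({"data": [{"text": text}]}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}:{port}/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_search_request("@requests(on=something)")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# With the Flow above running, the request could be sent like this:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
```

Keeping request construction separate from sending makes the payload easy to inspect before any server is involved.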
You can also query the server with the Python client:
from jina import Client, Document
from jina.types.request import Response


def print_matches(resp: Response):  # the callback function invoked when task is done
    for idx, d in enumerate(resp.docs[0].matches[:3]):  # print top-3 matches
        print(f'[{idx}]{d.scores["euclid"].value:2f}: "{d.text}"')


c = Client(protocol='http', port_expose=12345)  # connect to localhost:12345
c.post('/search', Document(text='request(on=something)'), on_done=print_matches)
This prints the following results:
Client@1608[S]:connected to the gateway at localhost:12345!
[0]0.168526: "@requests(on='/index')"
[1]0.181676: "@requests(on='/search')"
[2]0.192049: "query.matches = [Document(self._docs[int(idx)], copy=True, score=d) for idx, d in enumerate(dist)]"
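To see where these scores come from, the embedding and distance steps can be reproduced with NumPy alone, without a running Flow. This is a minimal sketch under stated assumptions: embed and rank are hypothetical helper names, the three candidate lines are stand-ins rather than the full indexed file, and the printed distances are not guaranteed to match the output above.

```python
import numpy as np

OFFSET = 32              # ASCII code of the space character
DIM = 127 - OFFSET + 1   # 96 slots; the last one also catches out-of-range characters
CHAR_EMBD = np.eye(DIM)  # one one-hot row per character slot


def embed(text: str) -> np.ndarray:
    """Mean-pool the one-hot rows of every character in `text`."""
    idx = [ord(c) - OFFSET if OFFSET <= ord(c) <= 127 else DIM - 1 for c in text]
    return CHAR_EMBD[idx, :].mean(axis=0)


def rank(query: str, lines: list) -> list:
    """Return (distance, line) pairs sorted from most to least similar."""
    q = np.stack([embed(query)])             # shape (1, DIM)
    d = np.stack([embed(t) for t in lines])  # shape (n, DIM)
    # Broadcasting (1, 1, DIM) - (1, n, DIM) yields every query/line difference at once.
    dist = np.linalg.norm(q[:, None, :] - d[None, :, :], axis=-1)  # shape (1, n)
    return sorted(zip(dist[0].tolist(), lines))


# Hypothetical stand-ins for a few indexed lines.
lines = ["@requests(on='/index')", "@requests(on='/search')", "import numpy as np"]
for dist, line in rank("request(on=something)", lines):
    print(f"{dist:.6f}: {line!r}")
```

Because each embedding is an average of one-hot rows, it is simply the character frequency distribution of the line, so two lines score as close when they use similar characters in similar proportions, regardless of order.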
Read Tutorials
- What is "Neural Search"?
- Document & DocumentArray: the basic data type in Jina.
- Executor: how Jina processes Documents.
- Flow: how Jina streamlines and distributes Executors.
- Serving Jina
- Developer References
- Clean & Efficient Coding in Jina
- 3 Reasons to Use Jina 2.0
Support
- Join our Slack community to chat to our engineers about your use cases, questions, and support queries.
- Join our Engineering All Hands meet-up to discuss your use case and learn Jina's new features.
  - When? The second Tuesday of every month
  - Where? Zoom (see our public events calendar/.ical) and live stream on YouTube
- Subscribe to the latest video tutorials on our YouTube channel.
Join Us
Jina is backed by Jina AI. We are actively hiring full-stack developers and solution engineers to build the next neural search ecosystem in open source.
Contributing
We welcome all kinds of contributions from the open-source community, individuals and partners. We owe our success to your active involvement.
- Contributing guidelines
- Code of conduct - play nicely with the Jina community
- Good first issues
- Release cycles and development stages