Models that generate embeddings from inputs
Returns CLIP features from the clip-vit-large-patch14 model
A language model that produces document embeddings suitable for downstream tasks such as semantic search and clustering.
A model that embeds text, audio, and images in a single shared space
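
The semantic search use case mentioned above usually comes down to comparing embedding vectors. The following is a minimal sketch of ranking documents by cosine similarity to a query. It does not call any of the models listed here. The vectors, document names, and the helper functions `cosine_similarity` and `rank_by_similarity` are hypothetical and stand in for whatever embeddings a real model would return.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction, so treat it as dissimilar.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, documents):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors (assumption: real embeddings have hundreds of
# dimensions and come from an embedding model, not hand-written values).
toy_documents = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.7, 0.7, 0.0],
}
toy_query = [1.0, 0.1, 0.0]

ranking = rank_by_similarity(toy_query, toy_documents)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With a model that places several modalities in one space, the same comparison could in principle be applied across modalities, for example scoring an image embedding against a text query embedding.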