Marqo Cloud is built to be highly available. Health checks ensure that your infrastructure is running as intended. Any unhealthy inference nodes are replaced automatically, and all indexes come with replicas.
Marqo Cloud scales to meet your needs: you can expect low-latency searches across millions of documents at high request velocity. Marqo searches include the inference needed to create your vectors.
Manage access and API keys for members of your organisation with the Marqo Cloud console. Your API keys secure your Marqo endpoint.
Vector generation and management are included out of the box. Marqo Cloud pricing comprises two parts: storage and inference, and you can scale each to meet your needs. Billing is determined by the per-hour price of your chosen instances multiplied by the number of instances you allocate.
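As a concrete illustration, the billing formula above can be sketched as follows (the per-hour prices here are hypothetical placeholders, not actual Marqo Cloud rates):

```python
# Hypothetical per-hour prices; real rates come from the Marqo Cloud console.
STORAGE_PRICE_PER_HOUR = 0.20    # one balanced storage instance (assumed rate)
INFERENCE_PRICE_PER_HOUR = 0.35  # one CPU inference instance (assumed rate)

def hourly_cost(storage_instances: int, inference_instances: int) -> float:
    """Billing = per-hour price of each instance type x number of instances."""
    return (storage_instances * STORAGE_PRICE_PER_HOUR
            + inference_instances * INFERENCE_PRICE_PER_HOUR)

# Example: 2 storage instances and 3 inference instances.
cost = hourly_cost(2, 3)
print(f"${cost:.2f}/hour")  # 2*0.20 + 3*0.35 = $1.45/hour
```

Scaling to zero inference instances during idle periods drops the inference term to zero while storage continues to accrue.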
Storage refers to the hardware that hosts your vectors and enables searching over them. Your storage scales with the size of your data. With Marqo you can pick from three tiers of storage: basic, balanced, or performance.
Marqo is a documents-in, documents-out system: inference hardware converts your documents into vectors for you. CPU instances are recommended for smaller models or where latency is not crucial; GPU instances are recommended for larger models where low latency is critical.
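A minimal sketch of the documents-in, documents-out flow using the Marqo Python client (the index name, document fields, endpoint URL, and API key are placeholders; the network calls are commented out because they require a live Marqo Cloud endpoint):

```python
# import marqo  # pip install marqo; commented out to keep this sketch offline

# Placeholder credentials; use your real endpoint and API key from the console.
# mq = marqo.Client(url="https://your-endpoint.marqo.ai", api_key="YOUR_API_KEY")

# Documents in: plain JSON-like dicts, no precomputed vectors required ...
documents = [
    {"_id": "doc1", "Title": "Getting started",
     "Description": "How to create your first index."},
    {"_id": "doc2", "Title": "Scaling",
     "Description": "Add inference instances as traffic grows."},
]

# ... inference hardware converts the chosen fields into vectors on ingestion:
# mq.index("my-first-index").add_documents(documents, tensor_fields=["Description"])

# Documents out: search runs inference on the query text, then matches vectors.
# results = mq.index("my-first-index").search("how do I make an index?")
```

The choice of CPU or GPU inference instances changes how quickly the `add_documents` and `search` calls above complete, not how you call them.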
As you grow, you can scale your storage capacity and inference throughput by increasing your number of instances. Swap between CPU and GPU inference to customise your cost, concurrency, and latency behaviours. You can even scale to zero when you are not actively using an index.