4 architecture options for your multitenant analytics solution (2024)

Multitenant analytics is about delivering analytics to users in multiple organizations (tenants). The most common use case is customer-facing reports and dashboards embedded in a SaaS application.

Another frequent use case is an organization that provides analytics to its business partners: suppliers, distributors, resellers, franchises, etc.

Multitenant analytics is often delivered as a product. This involves the following high-level steps:

  • Deliver an initial analytical experience (data visualizations, reports, dashboards, etc.) to new tenants (e.g. organizations, customers, business partners)
  • Organizations customize their analytics with self-service tools
  • You release a new version of the analytics without breaking the customizations
  • Rinse and repeat …
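One way to picture a release that does not break customizations is to keep each tenant's changes in a separate layer on top of the shipped defaults. The sketch below is only an illustration under that assumption; `effective_metadata`, the dashboard names, and the layout strings are all hypothetical and do not come from any particular product.

```python
def effective_metadata(base_release, tenant_overrides):
    """Return the dashboards a tenant sees: shipped defaults plus its own changes."""
    merged = dict(base_release)      # copy so the shared release is never mutated
    merged.update(tenant_overrides)  # tenant customizations win on key conflicts
    return merged


# Hypothetical releases of the shared analytics content.
release_v1 = {"sales_overview": "v1 layout", "churn": "v1 layout"}
release_v2 = {"sales_overview": "v2 layout", "churn": "v2 layout", "retention": "v2 layout"}

# Hypothetical customization made by one tenant with self-service tools.
acme_overrides = {"churn": "acme custom layout"}

before = effective_metadata(release_v1, acme_overrides)
after = effective_metadata(release_v2, acme_overrides)
```

After the upgrade the tenant picks up the new `retention` dashboard and the new `sales_overview`, while its customized `churn` dashboard is preserved. Real metadata is nested and needs a finer-grained merge, so treat this as a starting point only.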

This article describes architecture options for multitenant analytics products.

Here are the key considerations for evaluating the multitenant analytics architecture options described below. Weigh them based on the architecture of your application and your users' needs:

  • Data and metadata privacy: the privacy of each tenant's data and metadata (dashboards, reports, data models, metrics, etc.) must be strongly enforced.
  • Multi-domain analytics: the ability to cross-analyze data from other business domains (e.g. sales, marketing, product, shipments, support).
  • Performance and scalability: sub-second report computation latencies and the ability to scale from single-digit tenants to tens of thousands of tenants.
  • Realtime latencies: analytics uses fresh data with minimum delays.
  • Time to market: solution implementation complexity and cost, plus change management velocity (implementation of new versions and bug fixes).
  • Operational complexity & cost: solution operation complexity and cost. Provisioning new tenants, users, ACLs, permissions, etc. Releasing a new version and rolling it out to all tenants.

The first option, operational database analytics, uses the existing operational database that already handles CRUD (Create-Read-Update-Delete) operations on the operational data. This approach works as long as there are few reports (a low number of executions) and little or no data aggregation. If you just need to serve plain lists of data or a few simple operational reports, this is the easiest option, and it provides the best realtime reporting capabilities.

However, when your analytical throughput grows (more data, users, or report execution numbers) or becomes unpredictable because of self-service analytics, you’ll need to separate the analytical queries from the operational transactions for performance and scalability reasons. The separation is more important in architectures where the operational database is shared across multiple (or all) tenants of your application.

You might want to invest in a better architecture from the beginning rather than spend effort on a temporary solution. Staying on this architecture for too long usually leads to significant overspending on the database layer.

This architecture also doesn’t scale in terms of additional data sources. Analytical use cases usually involve data from more domains (e.g. marketing, product, sales data, etc.). Pushing all this additional data into the operational database adds yet another data processing workload to it.

The per-tenant siloed architecture is probably the first that comes to mind when you are tasked with extending a single-tenant (internal) analytics solution to a multitenant solution. You simply take a single-tenant analytics solution and deploy it for every tenant. This option is ideal when your application already utilizes a similar siloed architecture.

The siloed approach is great for data and metadata privacy as each of your tenants uses its dedicated infrastructure. Similarly, you can scale individual tenants based on their size and needs.

Achieving close-to-realtime reports is hard, especially when your users need additional datasets that must be distributed to each silo. This applies to additional data (from different domains) as well as to benchmarks.

Operation and management of the siloed multitenant analytics is very hard and costly as you have to deploy, configure, upgrade, and manage all tenants individually. The distributed data management with many databases is also hard because of the data distribution and the fact that you need to apply configurations and upgrades to each tenant individually.
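To make the per-tenant upgrade burden concrete, here is a minimal sketch of a rollout loop, assuming each silo is reachable as its own database connection. In-memory SQLite databases stand in for the tenant silos; `upgrade_all`, the `reports` table, and the tenant names are hypothetical.

```python
import sqlite3


def upgrade_all(silos, migration_sql):
    """Apply one migration to every tenant silo and report which tenants failed."""
    failed = {}
    for tenant_id, conn in silos.items():
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(migration_sql)
        except sqlite3.Error as exc:
            failed[tenant_id] = str(exc)  # keep going, one broken silo must not block the rest
    return failed


# Two hypothetical silos; the second has drifted and lacks the expected table.
acme = sqlite3.connect(":memory:")
acme.execute("CREATE TABLE reports (id INTEGER, title TEXT)")
globex = sqlite3.connect(":memory:")

silos = {"acme": acme, "globex": globex}
failed = upgrade_all(silos, "ALTER TABLE reports ADD COLUMN owner TEXT")
```

Even in this toy setup one silo ends up in a different state than the other, which is the kind of drift that makes operating many silos costly. A real rollout would also need versioning, retries, and monitoring.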

You also might need to invest in advanced virtualization to allocate hardware resources because you don’t want to dedicate the hardware to every tenant.

The shared analytical database architecture relies on the power of a central analytical engine that stores all data for all tenants and serves all queries. Metadata is also stored in a centralized, shared metadata store. The data and metadata access privacy is enforced at the application level using some configuration (e.g. ACLs, forced database filters, etc.).

Data and metadata privacy requires special attention in this architecture as all tenants access the centralized data and metadata. In most cases, access to data and metadata is enforced using mandatory SQL WHERE filters appended to each query. Automation of all operation and configuration procedures is strongly recommended to prevent human errors that might result in a data breach.
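The mandatory filter can be sketched as a single query helper that every report has to go through. This is an illustration only, assuming every shared table carries a `tenant_id` column; the `ALLOWED` allowlist, the `tenant_select` helper, and the `orders` table are hypothetical, and SQLite stands in for the analytical database.

```python
import sqlite3

# Hypothetical allowlist: table name -> columns that reports may read.
ALLOWED = {"orders": {"id", "amount", "status"}}


def tenant_select(conn, tenant_id, table, columns, where="", params=()):
    """Run a SELECT that is always restricted to a single tenant."""
    if table not in ALLOWED or not set(columns) <= ALLOWED[table]:
        raise ValueError("table or column not allowed")
    # The tenant filter is added here, never by the caller, and is bound as a parameter.
    sql = f"SELECT {', '.join(columns)} FROM {table} WHERE tenant_id = ?"
    if where:
        # `where` must be a fragment written by trusted code, never raw user input.
        sql += f" AND ({where})"
    return conn.execute(sql, (tenant_id, *params)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, tenant_id TEXT, amount REAL, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, "acme", 10.0, "paid"), (2, "acme", 5.0, "open"), (3, "globex", 99.0, "paid")],
)

rows = tenant_select(conn, "acme", "orders", ["id", "amount"], "status = ?", ("paid",))
```

Centralizing the filter in one place means a forgotten WHERE clause cannot leak another tenant's rows. Many databases also offer built-in mechanisms (such as row-level security policies) that enforce the same rule below the application.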

The central analytical database can quickly become a bottleneck as it is used for both data transformation and low-latency analytics queries. Despite many vendors’ claims, there is an inevitable tradeoff to be made between query latency, concurrency, and data freshness. The key implication for you is that this architecture will soon require sharding the central database to avoid huge investments in hardware.
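Sharding by tenant can be sketched as a stable mapping from a tenant identifier to a shard number. The snippet below is a minimal illustration, not a recommendation for a specific database; `shard_for` and the tenant names are hypothetical.

```python
import hashlib


def shard_for(tenant_id: str, shard_count: int) -> int:
    """Map a tenant to a shard index that stays the same across processes and restarts."""
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % shard_count


# Hypothetical tenants assigned to one of four shards.
assignments = {tenant: shard_for(tenant, 4) for tenant in ["acme", "globex", "initech"]}
```

A cryptographic hash is used here instead of Python's built-in `hash()` because the latter is randomized per process for strings. Note that plain modulo assignment moves most tenants when the shard count changes; an explicit tenant-to-shard lookup table or consistent hashing avoids that at the cost of more bookkeeping.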

Cost and data privacy are the reasons for extending the previous, shared analytical database architecture with workspaces (aka namespaces). The extended architecture contains these two fundamental components:

  • Data warehouse (or data lake) that aggregates data for all tenants for shared data transformation and management purposes (e.g. machine learning, benchmark computation, shared datasets). Unlike in the previous architecture option, the low-latency analytical queries execute at the workspace level, so the data warehouse can be optimized for data transformation (ETL/ELT). This allows using more cost-efficient components such as Apache Spark, Amazon Athena, or cloud storage like Amazon S3 or Azure Blob Storage instead of costlier warehouses like Snowflake or Redshift.
  • Workspace (aka namespace) that contains private data and metadata for each tenant. There are important considerations regarding the workspace query implementation (e.g. in-memory cube, database instance, federated query with a caching layer, etc.).

This architecture is less brittle from a data privacy perspective than the previous one as the workspaces automate the private data distribution from the data warehouse. The workspace also isolates the tenant-private metadata (e.g. custom reports or dashboards).

The distributed nature of workspaces provides more flexibility for scaling. Because the data volume is partitioned by tenant, you can use more cost-efficient or faster technology (e.g. in-memory or open-source databases). The workspace also isolates other tenants from the query workload of large tenants (those with a large number of users or a large data volume).

As stated above, this architecture requires heavy automation of the data distribution (from the data warehouse to workspaces) and the metadata distribution (releases). This automation requires additional investments (build vs. buy).
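The data distribution step can be sketched as splitting warehouse rows by tenant and loading each slice into that tenant's private workspace. This is only an illustration: in-memory SQLite databases stand in for workspaces, and `distribute`, the `revenue` table, and the sample rows are hypothetical.

```python
import sqlite3
from collections import defaultdict


def distribute(warehouse_rows):
    """Group warehouse rows by tenant and load each group into its own workspace database."""
    by_tenant = defaultdict(list)
    for tenant_id, day, revenue in warehouse_rows:
        by_tenant[tenant_id].append((day, revenue))

    workspaces = {}
    for tenant_id, rows in by_tenant.items():
        ws = sqlite3.connect(":memory:")  # one private store per tenant
        ws.execute("CREATE TABLE revenue (day TEXT, revenue REAL)")
        ws.executemany("INSERT INTO revenue VALUES (?, ?)", rows)
        ws.commit()
        workspaces[tenant_id] = ws
    return workspaces


# Hypothetical output of a shared transformation in the data warehouse.
warehouse_rows = [
    ("acme", "2024-01-01", 10.0),
    ("acme", "2024-01-02", 5.0),
    ("globex", "2024-01-01", 99.0),
]
workspaces = distribute(warehouse_rows)
acme_total = workspaces["acme"].execute("SELECT SUM(revenue) FROM revenue").fetchone()[0]
```

Once the data is loaded, queries in a workspace need no tenant filter at all because the workspace only ever holds one tenant's rows. A production pipeline would add incremental loads, scheduling, and failure handling, which is where the build vs. buy investment comes in.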

There are many open-source technologies that you can leverage as building blocks for your multitenant analytics solution architectures described above.

Analytical database/data warehouse

There are many open-source and commercial databases that you can leverage. Postgres or traditional commercial databases like Oracle or MS SQL Server are probably the best options for the first architecture option that requires good handling of mixed-load workloads.

Postgres or MariaDB are great choices for the siloed architecture option unless you have tenants with larger data (>50GB) or many users (>100). For those tenants, take another look at the commercial options. Vertica, with its community edition, might be an interesting option for larger tenants.

Snowflake, Google BigQuery, Dremio, Amazon Redshift, and Vertica are the best for the shared database option as they are optimized for the mixed load from low-latency queries and ETL/ELT micro-batches.

Apache Spark and Amazon Athena are more cost-efficient options for the data warehouse implementation. You can leverage them in combination with workspaces that handle low-latency queries.

Postgres, MariaDB, or Vertica (for larger tenants) are in my opinion the best options for workspace implementations.

Reports, dashboards, and data visualizations

Reports, dashboards, and data visualizations can be implemented via standard single-tenant BI tools. Many vendors use in-memory data processing for low latency (e.g. Power BI, Qlik). The in-memory approach usually doesn’t provide great realtime capabilities (limited data refresh frequency) and doesn’t scale in terms of data volume. Other BI tools like Tableau use a file-based query mechanism that scales better to larger data volumes than the in-memory alternatives.

Many BI tool vendors implement a direct query mechanism that allows executing queries at the database level. Be careful when you use the BI tools for the implementation of the most advanced, workspace-based architecture. The direct query degrades this architecture to the shared analytical database or forces you to implement the workspace storage and query layer manually.

If you don’t want to design and engineer your multitenant analytics architecture yourself, you can use an existing analytics platform. There are a couple of them available on the market. The GoodData analytics platform is in my opinion the best choice for multitenant analytics solutions. Let me briefly show how this platform implements the architectures above.

GoodData platform: fully managed service and local containers

The GoodData platform offers two deployment options:

  • Fully managed SaaS platform with workspaces that connects to your data warehouse and distributes data to GoodData-hosted workspaces.
  • Docker & Kubernetes container images that allow for deploying analytics to your on-premise data center or to a private or public cloud (e.g. Amazon AWS, Azure, or Google Cloud) side-by-side with your application. In this case, the GoodData platform connects to your local database.

The fully managed SaaS platform de facto implements the last, most advanced workspace-based architecture option described in this article.

The GoodData platform container images can be used for the implementation of the first three architectures described in this article: operational database analytics, per-tenant silos, or centralized analytical database. The locally deployed GoodData platform also provides virtual workspaces for multitenant management of your tenants’ metadata.

The unified analytics layer and the same analytics tools allow for easy hybrid deployment of your solution by combining the fully managed SaaS with public or private cloud deployment.

Multitenant analytics, such as customer-facing analytics, is hard and costly to implement and operate. I strongly recommend you plan your implementation at least 18 months ahead. Try to assess the future state of your analytical solution and design its architecture based on the future state’s requirements. Spend more effort on planning your engineering and operation budget to decide whether you want to build the solution in-house or adopt an existing analytics platform.

Article information

Author: Cheryll Lueilwitz
