Warehouse considerations | Snowflake Documentation (2024)

This topic provides general guidelines and best practices for using virtual warehouses in Snowflake to process queries. It does not provide specific or absolute numbers, values, or recommendations because every query scenario is different and is affected by numerous factors, including number of concurrent users/queries, number of tables being queried, and data size and composition, as well as your specific requirements for warehouse availability, latency, and cost.

It also does not cover warehouse considerations for data loading, which are covered in another topic (see the sidebar).

The keys to using warehouses effectively and efficiently are:

  1. Experiment with different types of queries and different warehouse sizes to determine the combinations that best meet your specific query needs and workload.

  2. Don’t focus on warehouse size. Snowflake utilizes per-second billing, so you can run larger warehouses (Large, X-Large, 2X-Large, etc.) and simply suspend them when not in use.

Note

These guidelines and best practices apply to both single-cluster warehouses, which are standard for all accounts, and multi-cluster warehouses, which are available in Snowflake Enterprise Edition (and higher).

How are credits charged for warehouses?

Credit charges are calculated based on:

  • The warehouse size.

  • The number of clusters (if using multi-cluster warehouses).

  • The length of time the compute resources in each cluster run.

For example:

X-Small:

Bills 1 credit per full, continuous hour that each cluster runs; each successive size generally doubles the number of compute resources per warehouse.

4X-Large:

Bills 128 credits per full, continuous hour that each cluster runs.
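The doubling rule can be sketched as a small lookup. This is an illustrative sketch rather than an official rate table: the list of intermediate sizes and the helper name `credits_per_hour` are assumptions, and only the X-Small and 4X-Large figures come from the text above.

```python
# Hourly credit rate per cluster, assuming every step up in size doubles the rate.
# The size list is an illustrative assumption; only the X-Small (1) and
# 4X-Large (128) figures are taken from the text above.
SIZES = ["X-Small", "Small", "Medium", "Large",
         "X-Large", "2X-Large", "3X-Large", "4X-Large"]


def credits_per_hour(size: str) -> int:
    """Return credits billed per full hour for one cluster of the given size."""
    return 2 ** SIZES.index(size)


for size in SIZES:
    print(f"{size:>9}: {credits_per_hour(size)} credits/hour")
```

Under this assumption, seven doublings take the X-Small rate of 1 credit per hour to the 4X-Large rate of 128.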

Note the following:

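  • When compute resources are provisioned for a warehouse, the minimum billing charge is 60 seconds; this applies each time the warehouse is started or resumed.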

  • Resizing a warehouse provisions additional compute resources for each cluster in the warehouse:

    • This results in a corresponding increase in the number of credits billed for the warehouse (while the additional compute resources are running).

    • The additional compute resources are billed when they are provisioned (i.e. credits for the additional resources are billed relative to the time when the warehouse was resized).

    • Resizing from a 5XL or 6XL warehouse to a 4XL or smaller warehouse results in a brief period during which the customer is charged for both the new warehouse and the old warehouse while the old warehouse is quiesced.

    • Credit usage is displayed in hour increments. With per-second billing, you will see fractional amounts for credit usage/billing.

How does query composition impact warehouse processing?

The compute resources required to process a query depend on the size and complexity of the query. For the most part, queries scale linearly with regard to warehouse size, particularly for larger, more complex queries. When considering factors that impact query processing, consider the following:

  • The overall size of the tables being queried has more impact than the number of rows.

  • Query filtering using predicates has an impact on processing, as does the number of joins/tables in the query.

Tip

To achieve the best results, try to execute relatively homogeneous queries (size, complexity, data sets, etc.) on the same warehouse; executing queries of widely-varying size and/or complexity on the same warehouse makes it more difficult to analyze warehouse load, which can make it more difficult to select the best size to match the size, composition, and number of queries in your workload.

How does warehouse caching impact queries?

Each warehouse, when running, maintains a cache of table data accessed as queries are processed by the warehouse. This enables improved performance for subsequent queries if they are able to read from the cache instead of from the table(s) in the query. The size of the cache is determined by the compute resources in the warehouse (that is, the larger the warehouse and, therefore, more compute resources in the warehouse, the larger the cache).

This cache is dropped when the warehouse is suspended, which may result in slower initial performance for some queries after the warehouse is resumed. As the resumed warehouse runs and processes more queries, the cache is rebuilt, and queries that are able to take advantage of the cache will experience improved performance.

Keep this in mind when deciding whether to suspend a warehouse or leave it running. In other words, consider the trade-off between saving credits by suspending a warehouse versus maintaining the cache of data from previous queries to help with performance.

Creating a warehouse

When creating a warehouse, the two most critical factors to consider, from a cost and performance perspective, are:

  • Warehouse size (i.e. available compute resources)

  • Manual vs automated management (for starting/resuming and suspending warehouses).

The number of clusters in a warehouse is also important if you are using Snowflake Enterprise Edition (or higher) and multi-cluster warehouses. For more details, see Scaling Up vs Scaling Out (in this topic).

Selecting an initial warehouse size

The initial size you select for a warehouse depends on the task the warehouse is performing and the workload it processes. For example:

  • For data loading, the warehouse size should match the number of files being loaded and the amount of data in each file. For more details, see Planning a data load.

  • For queries in small-scale testing environments, smaller warehouse sizes (X-Small, Small, Medium) may be sufficient.

  • For queries in large-scale production environments, larger warehouse sizes (Large, X-Large, 2X-Large, etc.) may be more cost effective.

However, note that per-second credit billing and auto-suspend give you the flexibility to start with larger sizes and then adjust the size to match your workloads. You can decrease the size of a warehouse at any time.

Also, larger is not necessarily faster for smaller, more basic queries. Small/simple queries typically do not need an X-Large (or larger) warehouse because they do not necessarily benefit from the additional resources, regardless of the number of queries being processed concurrently. In general, you should try to match the size of the warehouse to the expected size and complexity of the queries to be processed by the warehouse.

Tip

Experiment by running the same queries against warehouses of multiple sizes (e.g. X-Large, Large, Medium). The queries you experiment with should be of a size and complexity that you know will typically complete within 5 to 10 minutes (or less).
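When comparing the results of such an experiment, it can help to convert each measured runtime into credits. The sketch below is a rough aid under stated assumptions: the runtimes are hypothetical numbers rather than Snowflake measurements, the hourly rates assume the doubling rule from X-Small = 1 credit, the helper name `run_cost` is invented for illustration, and idle time before auto-suspend is ignored.

```python
MIN_BILLED_SECONDS = 60  # minimum charge each time a warehouse starts or resumes


def run_cost(credits_per_hour: float, runtime_seconds: float) -> float:
    """Credits for one cluster running for runtime_seconds, billed per second."""
    billed = max(runtime_seconds, MIN_BILLED_SECONDS)
    return credits_per_hour * billed / 3600


# Hypothetical measurements: (size, credits per hour, runtime in seconds)
trials = [("Medium", 4, 600), ("Large", 8, 300), ("X-Large", 16, 200)]
for size, rate, seconds in trials:
    print(f"{size:>8}: {seconds:>4} s, {run_cost(rate, seconds):.3f} credits")
```

In this made-up data, the query scales linearly from Medium to Large, so both runs cost the same and the Large run simply finishes sooner. The X-Large run is faster again but no longer halves the runtime, so it costs more credits.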

Using the default warehouse for Notebook apps

A dedicated Snowflake-managed warehouse is provisioned in each account to exclusively run Notebook apps. SYSTEM$STREAMLIT_NOTEBOOK_WH is a multi-cluster XS warehouse that reduces cluster fragmentation, optimizes your overall costs, and aids in better bin packing. For more details, see Default warehouse for Notebooks.

Automating warehouse suspension

Warehouses can be set to automatically suspend when there’s no activity after a specified period of time. Auto-suspend is enabled by specifying the time period (minutes, hours, etc.) of inactivity for the warehouse.

We recommend setting auto-suspend according to your workload and your requirements for warehouse availability:

  • If you enable auto-suspend, we recommend setting it to a low value (e.g. 5 or 10 minutes or less) because Snowflake utilizes per-second billing. This will help keep your warehouses from running (and consuming credits) when not in use.

    However, the value you set should match the gaps, if any, in your query workload. For example, if you have regular gaps of 2 or 3 minutes between incoming queries, it doesn’t make sense to set auto-suspend to 1 or 2 minutes because your warehouse will be in a continual state of suspending and resuming (if auto-resume is also enabled) and each time it resumes, you are billed for the minimum credit usage (i.e. 60 seconds).

  • You might want to consider disabling auto-suspend for a warehouse if:

    • You have a heavy, steady workload for the warehouse.

    • You require the warehouse to be available with no delay or lag time. Warehouse provisioning is generally very fast (e.g. 1 or 2 seconds); however, depending on the size of the warehouse and the availability of compute resources to provision, it can take longer.

Important

If you choose to disable auto-suspend, please carefully consider the costs associated with running a warehouse continually, even when the warehouse is not processing queries. The costs can be significant, especially for larger warehouses (X-Large, 2X-Large, etc.).

To disable auto-suspend, you must explicitly select Never in the web interface, or specify 0 or NULL in SQL.
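One way to explore how an auto-suspend value interacts with gaps in a workload is a small simulation. The following is a simplified model, not a description of how Snowflake schedules suspension: it assumes the warehouse resumes the instant a query arrives, suspends exactly `auto_suspend` seconds after the last query finishes, and bills at least 60 seconds per resume. The function name `simulate` and the sample workload are hypothetical.

```python
MIN_BILLED_SECONDS = 60  # minimum charge per resume, as described above


def simulate(queries, auto_suspend):
    """Return (billed_seconds, resumes) for one single-cluster warehouse.

    queries: list of (start, end) times in seconds, sorted by start time.
    auto_suspend: idle seconds after which the warehouse suspends.
    """
    billed = 0
    resumes = 0
    run_start = None
    last_end = None
    for start, end in queries:
        if run_start is not None and start - last_end < auto_suspend:
            # Warehouse is still running when the query arrives.
            last_end = max(last_end, end)
            continue
        if run_start is not None:
            # Warehouse suspended during the gap; close out the previous run.
            billed += max(last_end + auto_suspend - run_start, MIN_BILLED_SECONDS)
        run_start, last_end = start, end
        resumes += 1
    if run_start is not None:
        billed += max(last_end + auto_suspend - run_start, MIN_BILLED_SECONDS)
    return billed, resumes


# Hypothetical workload: ten 10-second queries with 150-second gaps between them.
queries = [(i * 160, i * 160 + 10) for i in range(10)]
for timeout in (30, 60, 300):
    print(timeout, simulate(queries, timeout))
```

In this model, a timeout shorter than the gap makes the warehouse resume once per query, and the 60-second minimum applies to every resume, while a timeout longer than the gap keeps the warehouse running through the idle periods. Billed time alone does not capture the cache loss and resume delay discussed in this topic, so treat the numbers as one input to the decision rather than the answer.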

Automating warehouse resumption

Warehouses can be set to automatically resume when new queries are submitted.

We recommend enabling/disabling auto-resume depending on how much control you wish to exert over usage of a particular warehouse:

  • If cost and access are not an issue, enable auto-resume to ensure that the warehouse starts whenever needed. Keep in mind that there might be a short delay in the resumption of the warehouse due to provisioning.

  • If you wish to control costs and/or user access, leave auto-resume disabled and instead manually resume the warehouse only when needed.

Scaling up vs scaling out

Snowflake supports two ways to scale warehouses:

  • Scale up by resizing a warehouse.

  • Scale out by adding clusters to a multi-cluster warehouse (requires Snowflake Enterprise Edition or higher).

Warehouse resizing improves performance

Resizing a warehouse generally improves query performance, particularly for larger, more complex queries. It can also help reduce the queuing that occurs if a warehouse does not have enough compute resources to process all the queries that are submitted concurrently. Note that warehouse resizing is not intended for handling concurrency issues; instead, use additional warehouses to handle the workload or use a multi-cluster warehouse (if this feature is available for your account).

Snowflake supports resizing a warehouse at any time, even while running. If a query is running slowly and you have additional queries of similar size and complexity that you want to run on the same warehouse, you might choose to resize the warehouse while it is running; however, note the following:

  • As stated earlier about warehouse size, larger is not necessarily faster; for smaller, basic queries that are already executing quickly, you may not see any significant improvement after resizing.

  • Resizing a running warehouse does not impact queries that are already being processed by the warehouse; the additional compute resources, once fully provisioned, are only used for queued and new queries.

  • Resizing from a 5XL or 6XL warehouse to a 4XL or smaller warehouse results in a brief period during which the customer is charged for both the new warehouse and the old warehouse while the old warehouse is quiesced.

Tip

Decreasing the size of a running warehouse removes compute resources from the warehouse. When the compute resources are removed, the cache associated with those resources is dropped, which can impact performance in the same way that suspending the warehouse can impact performance after it is resumed.

Keep this in mind when choosing whether to decrease the size of a running warehouse or keep it at the current size. In other words, there is a trade-off with regard to saving credits versus maintaining the cache.

Multi-cluster warehouses improve concurrency

Multi-cluster warehouses are designed specifically for handling queuing and performance issues related to large numbers of concurrent users and/or queries. In addition, multi-cluster warehouses can help automate this process if your number of users/queries tends to fluctuate.

When deciding whether to use multi-cluster warehouses and the number of clusters to use per multi-cluster warehouse, consider the following:

  • If you are using Snowflake Enterprise Edition (or a higher edition), all your warehouses should be configured as multi-cluster warehouses.

  • Unless you have a specific requirement for running in Maximized mode, multi-cluster warehouses should be configured to run in Auto-scale mode, which enables Snowflake to automatically start and stop clusters as needed.

  • When choosing the minimum and maximum number of clusters for a multi-cluster warehouse:

    Minimum:

    Keep the default value of 1; this ensures that additional clusters are only started as needed. However, if high-availability of the warehouse is a concern, set the value higher than 1. This helps ensure multi-cluster warehouse availability and continuity in the unlikely event that a cluster fails.

    Maximum:

    Set this value as large as possible, while being mindful of the warehouse size and corresponding credit costs. For example, an X-Large multi-cluster warehouse with maximum clusters = 10 will consume 160 credits in an hour if all 10 clusters run continuously for the hour.
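The worst-case arithmetic behind that example is a single multiplication. As a minimal sketch (the helper name `max_hourly_credits` is hypothetical, and the X-Large rate of 16 credits per hour per cluster is inferred from the 160-credit figure above):

```python
def max_hourly_credits(credits_per_hour_per_cluster: int, max_clusters: int) -> int:
    """Upper bound on hourly credits if every cluster runs for the full hour."""
    return credits_per_hour_per_cluster * max_clusters


# X-Large at an assumed 16 credits per hour per cluster, maximum clusters = 10
print(max_hourly_credits(16, 10))
```

Running the same calculation for a candidate size and maximum cluster count before changing a warehouse gives a quick ceiling on what Auto-scale mode could consume in a busy hour.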

FAQs

What happens to the incoming queries when a warehouse does not have enough resources to process them?

If the warehouse does not have enough remaining resources to process a query, the query is queued, pending resources that become available as other running queries complete.

What does suspending a warehouse mean in Snowflake?

Suspending a warehouse does not abort any queries being processed by the warehouse at the time it is suspended. Instead, the warehouse completes the queries, then shuts down the compute resources used to process the queries.

What happens when a suspended virtual warehouse is resized in Snowflake?

Resizing a suspended warehouse does not provision any new compute resources for the warehouse. It simply instructs Snowflake to provision the additional compute resources when the warehouse is next resumed, at which time all the usage and credit rules associated with starting a warehouse apply.

What is the maximum number of clusters in a Snowflake multi-cluster warehouse?

With multi-cluster warehouses, Snowflake supports allocating, either statically or dynamically, additional clusters to make a larger pool of compute resources available. A multi-cluster warehouse is defined by specifying the following properties: Maximum number of clusters, greater than 1 (up to 10).

How many nodes does a large Snowflake warehouse have?

Each increase in warehouse size doubles the compute resources, and for queries that scale linearly this can roughly halve the elapsed processing time; smaller, simpler queries may see little benefit. For instance, a Medium warehouse is commonly described as having 4 nodes, while a Large warehouse has 8 nodes.

How many threads does a Snowflake warehouse have?

Threads: Snowflake designates eight threads or sessions per cluster. While copy statements are generally not memory-intensive, it's crucial to consider this factor for optimal load performance. In this context, adhering to Snowflake's recommendation of using eight threads is advisable.
