Now that the data is all loaded into your instance, you can start running queries on the database. A number of queries have been selected for you from the TPC-H specifications in this guide. For the complete workload of 22 queries that can be run against the TPC-H database, click here.
First, open up your SQL Editor to begin querying the data.
Query 1: Pricing Summary Report
This query reports the amount of business that was billed, shipped, and returned within 60 - 120 days of the greatest ship data contained in the database.
use tpch; select l_returnflag, l_linestatus, sum(l_quantity) as sum_qty, sum(l_extendedprice) as sum_base_price, sum(l_extendedprice * (1 - l_discount)) as sum_disc_price, sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge, avg(l_quantity) as avg_qty, avg(l_extendedprice) as avg_price, avg(l_discount) as avg_disc, count(*) as count_order from lineitem where l_shipdate <= date('1998-12-01' - interval '90' day) group by l_returnflag, l_linestatus order by l_returnflag, l_linestatus;
Run the query again and observe the execution time difference. You will see a dramatic improvement with query execution time. This is due to MemSQL’s Code Generation technology.
Query 2: Shipping Priority
This query retrieves the 10 unshipped orders with the highest value.
use tpch; select l_orderkey, sum(l_extendedprice * (1 - l_discount)) as revenue, o_orderdate, o_shippriority from customer, orders, lineitem where c_mktsegment = 'BUILDING' and c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate < date('1995-03-15') and l_shipdate > date('1995-03-15') group by l_orderkey, o_orderdate, o_shippriority order by revenue desc, o_orderdate limit 10;
Query 3: Returned Item Reporting
This query identifies top 20 customers who might be having problems with the parts that are shipped to them.
use tpch; select c_custkey, c_name, sum(l_extendedprice * (1 - l_discount)) as revenue, c_acctbal, n_name, c_address, c_phone, c_comment from customer, orders, lineitem, nation where c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate >= date('1993-10-01') and o_orderdate < date('1993-10-01') + interval '3' month and l_returnflag = 'R' and c_nationkey = n_nationkey group by c_custkey, c_name, c_acctbal, c_phone, n_name, c_address, c_comment order by revenue desc limit 20;
Query 4: Product Type Profit Measure
This query determines how much profit is made on a given line of parts, broken out by supplier nation and year.
use tpch; select nation, o_year, sum(amount) as sum_profit from ( select n_name as nation, extract(year from o_orderdate) as o_year, l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount from part, supplier, lineitem, partsupp, orders, nation where s_suppkey = l_suppkey and ps_suppkey = l_suppkey and ps_partkey = l_partkey and p_partkey = l_partkey and o_orderkey = l_orderkey and s_nationkey = n_nationkey and p_name like '%green%' ) as profit group by nation, o_year order by nation, o_year desc;
Want more queries? Click here to get a full set of the 22 queries for a complete TPC-H workload.
Analyze Performance of Queries
MemSQL Studio has a convenient feature that allows you to record and visually analyze workload profiles at either the query or node level.
In SQL Editor, open up the Cluster icon and click Profile.
The Visual Explain window will open up. You can now explore the query and navigate around different operations.