Transcribed Text

Module 6: Query Optimization

This module will cover methodologies for optimizing distributed query plans. Apply cost models and heuristics to query trees to find near-optimal query plans. Discuss the benefits of the semi-join algorithm.

Chapter [...], Principles of Distributed Database Systems, Özsu and Valduriez, 2011

1. Describe the conditions under which a semi-join is a better choice than a traditional join on distributed tables.

Assume a database represents the following tables:

CUSTOMER (CNO, NAME, AGE, STATE, BALANCE)
ORDER (CNO, ORDERNO, DATE, COST, QTY)

Calculate the cost of executing the following query using the [...]:

SELECT NAME, AGE, STATE, BALANCE, QTY
FROM CUSTOMER, ORDER
WHERE CUSTOMER.CNO = ORDER.CNO AND AGE [...] AND COST <= 400

The following statistics [...] about the database:

CUSTOMER: CNO [...] bytes; NAME [...] bytes; AGE [...] (values range from 20 to 59); BALANCE [...] bytes; STATE [...] bytes
ORDER: CNO [...] bytes; ORDERNO [...] bytes; DATE [...] bytes; COST [...] bytes (values from [...] $2000); QTY 4 bytes

The CUSTOMER table has a hashed index on CNO and has [...]. The ORDER table has a hashed index on CNO [...].

The cost of reading a record from a table: [...]
The cost of writing a record into a table: [...]
The cost of setting up a transfer: [...]
The cost of transferring a byte: TTR [...]

Assume the semi-join will be implemented as CUSTOMER ⋈ ORDER = CUSTOMER ⋈ (ORDER ⋉ CUSTOMER).

The cost of a join between tables A and B that are resident on the same site, where A has a hashed index and B has no index, can be calculated as follows. Serially read the records from B and, for each record, determine if it participates in the result. If it does, project the desired attributes from the record of B and perform a constant-time read for the corresponding join record from table A.

1) Assume the following tables and attributes:

CUST: cid int (4 bytes); name char(20) (20 bytes); age int (4 bytes), range [...] 60; address char(50) (50 bytes); state char(2) (2 bytes)
ITEM: cid int (4 bytes); itemid int (4 bytes); price int (4 bytes), range [...] $100; description char(50) (50 bytes)

CUST is located on site 1; ITEM is located on site 2. There is a one-to-many relationship between CUST and ITEM, where cid is the foreign key of CUST in the ITEM table. CUST has a hashed index on cid. There is NO index on the ITEM table. There are 100 CUST records and 1000 ITEM records.

TTR is the network transfer cost per byte. TCPU is the cost of a CPU instruction (including disk I/O). TMSG is the cost of initiating and receiving a message. There are only two distinct states in the CUST table records (MD and VA).

Given the query:

SELECT name, state, description
FROM CUST, ITEM
WHERE CUST.cid = ITEM.cid and CUST.age [...] 50 and ITEM.price [...] 30 and ITEM.price <= 50

See next pages for questions.

a) Estimate the total cost of the query by moving the appropriate parts of the CUST table from site 1 to site 2 and performing the join at site 2. To estimate the total cost, sum the cost estimates of each intermediate step. Assume that you are done when you have completed the join. (This is NOT a semi-join.)

b) Estimate the total cost of the query by moving the appropriate parts of the ITEM table from site 2 to site 1 and performing the join at site 1. To estimate the total cost, sum the cost estimates of each intermediate step. Assume that you are done when you have completed the join. (This is NOT a semi-join.)

c) What single modification to the database would enable an even lower cost query plan? (I am not looking for a semi-join.)

Solution Preview


1. Question 1: Describe the conditions under which a semi-join is a better choice than a traditional join on distributed tables.
a. A semi-join is used when we want to reduce the number of tuples in a relation before transferring it to another site.
b. Example:
i. We want to join 2 tables, S and R, over attribute A. R is stored at site 1 and S is stored at site 2.
ii. We assume that size(S) > size(R)
iii. We note that
R ⋈A S = (R ⋉A S) ⋈A S = R ⋈A (S ⋉A R) = (R ⋉A S) ⋈A (S ⋉A R)
1. Traditional join:
a. We have to move all of R from site 1 to site 2. It costs TMSG + TTR * size(R) ...
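
The comparison this example sets up can be sketched numerically. The Python sketch below is a minimal illustration, not part of the original solution. It assumes a linear communication cost of TMSG + TTR * bytes per transfer, ignores local processing cost, and assumes the semi-join strategy ships the projection of S on A to site 1 and then ships the reduced R (that is, R ⋉A S) to site 2. The function names and the sample sizes are hypothetical.

```python
def transfer_cost(num_bytes, t_msg, t_tr):
    """Cost of one transfer: fixed message setup plus a per-byte cost."""
    return t_msg + t_tr * num_bytes


def join_cost(size_r, t_msg, t_tr):
    """Traditional join: ship all of R from site 1 to site 2."""
    return transfer_cost(size_r, t_msg, t_tr)


def semijoin_cost(size_proj_s, size_r_reduced, t_msg, t_tr):
    """Semi-join (assumed strategy): ship the projection of S on A to site 1,
    then ship the reduced R to site 2. Two transfers, so two setup costs."""
    return (transfer_cost(size_proj_s, t_msg, t_tr)
            + transfer_cost(size_r_reduced, t_msg, t_tr))


def semijoin_is_better(size_r, size_proj_s, size_r_reduced, t_msg, t_tr):
    """True when the semi-join strategy transfers less, by this cost model."""
    return (semijoin_cost(size_proj_s, size_r_reduced, t_msg, t_tr)
            < join_cost(size_r, t_msg, t_tr))


# Hypothetical sizes in bytes and unit costs, chosen only for illustration.
T_MSG, T_TR = 10, 1
print(join_cost(10000, T_MSG, T_TR))                      # 10010
print(semijoin_cost(400, 2000, T_MSG, T_TR))              # 2420
print(semijoin_is_better(10000, 400, 2000, T_MSG, T_TR))  # selective semi-join
print(semijoin_is_better(10000, 400, 9900, T_MSG, T_TR))  # barely reduces R
```

Under these assumptions the semi-join pays off only when the projection of S plus the reduced R is smaller than R itself by more than the extra message setup cost, which is the case when the semi-join eliminates many tuples of R.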
