Relational algebra for banking scenario

Given statements. Every table/relation has a statement parameterized by columns/attributes. (Its "characteristic predicate".) The rows/tuples that make the statement true go in the table/relation. First find the statements for the given tables/relations:

// customer [Cust-Name] has £[Balance] in account [Acc-No] at branch [Branch]
Deposit (Branch, Acc-No, Cust-Name, Balance)

// customer [Cust-Name] loan [Loan-No] balance is £[Balance] at branch [Branch]
Loan(Branch, Loan-No, Cust-Name, Balance)

. . .

Notice that the table/relation definition is shorthand for the statement.

Query statements. Now put these given statements together to get a statement that only the rows we want satisfy. Use AND, OR, AND NOT, AND condition. Keep or drop names. Use a new name if you need one.

I will do an example like part of your assignment:

-- informal style version
branch, customer name, account balance and account number for
    customers that have a loan bigger than £2000
    at Romford branch

I want those rows. So I want a statement that exactly those rows make true. So I make statements that get closer and closer to the one I want. So I start:

-- columns/attributes Cust-Name, Loan-No, Balance, Branch
customer [Cust-Name] loan [Loan-No] balance is £[Balance] at branch [Branch]

Now I want to use a different name for the loan balances because I want to end up using Balance for account balances only:

-- columns/attributes Cust-Name, Loan-No, Loan-Balance, Branch
  customer [Cust-Name] loan [Loan-No] balance is £[Loan-Balance] at branch [Branch]

Now I want account balances too:

-- columns/attributes Cust-Name, Loan-No, Loan-Balance, Branch, Balance, Acc-No
    customer [Cust-Name] loan [Loan-No] balance is £[Loan-Balance] at branch [Branch]
AND customer [Cust-Name] has £[Balance] in account [Acc-No] at branch [Branch]

If I had only used one name Balance then it would have had to be a loan balance and an account balance. The rows/tuples would have been for customers with a loan balance the same as an account balance. And the Balance column/attribute would have been those values.

Now I want to limit the balances and the branch:

-- columns/attributes Cust-Name, Loan-No, Loan-Balance, Branch, Balance, Acc-No
    customer [Cust-Name] loan [Loan-No] balance is £[Loan-Balance] at branch [Branch]
AND customer [Cust-Name] has £[Balance] in account [Acc-No] at branch [Branch]
AND [Loan-Balance]>2000 AND [Branch]='Romford'

Now I only want some of the columns/attributes:

-- statement style version
-- columns/attributes Cust-Name, Branch, Balance, Acc-No
Keeping Branch, Cust-Name, Balance, Acc-No: (
    customer [Cust-Name] loan [Loan-No] balance is £[Loan-Balance] at branch [Branch]
AND customer [Cust-Name] has £[Balance] in account [Acc-No] at branch [Branch]
AND [Loan-Balance]>2000 AND [Branch]='Romford')

This is a statement for the example rows.

You can use "Keeping names to keep" or "Dropping names to drop. (In logic, Dropping is called FOR SOME or THERE EXISTS. Because we want the statement inside it to be true FOR SOME value(s) for a name, ie we want that THERE EXISTS value(s) for a name that makes the statement true.)

Query shorthand. Now replace each statement by its shorthand.

In my example I get:

-- shorthand style version
-- columns/attributes Cust-Name, Branch, Balance, Acc-No
Keeping Branch, Cust-Name, Balance, Acc-No: (
    Loan (Branch, Loan-No, Cust-Name, Loan-Balance)
AND Deposit (Branch, Acc-No, Cust-Name, Balance)
AND Loan-Balance>2000 AND Branch='Romford')

(Notice that in the second half of your question's attempt you don't need Customer. Because you don't need its statement. Because you can state everything about the rows you want without it. So it just adds Cust-County which you eventually throw away without using.)

Query algebra Now to get the algebra replace:

  • every statement by its table/relation
  • every AND of table/relation statements by ⋈ (natural join)
  • every OR of table/relation statements (which must have the same columns/attributes) by ∪ (union)
  • every AND NOT of statements (which must have the same columns/attributes) by \ (difference)
  • every AND condition by σ condition
  • every Keeping names to keep by π names to keep (projection) (and Dropping by π names to keep)
  • every column/attribute renaming in a given statement by ρ (rename).

∩ (intersection) and x (product) are special cases of ⋈ (∩ for both sides the same columns/attributes and x for no shared columns/attributes).

Remember that column/attribute names get introduced by table/relation statements & tables/relations but removed by Keeping/Dropping & π. Remember that a renaming in a given statement becomes a ρ.

I get:

-- algebra style version
π Branch, Cust-Name, Balance, Acc-No
    σ Branch='Romford' σ Loan-Balance>2000
            ((ρ Loan-Balance/Balance Loan) ⋈ Deposit)

(I don't know what particular algebra notation you are supposed to use. Learn its rules for dotting names and using equijoin vs natural join. Also I don't know what kind of σ conditions it allows.)

Follow the example. So take a description of rows and write a statement that exactly those rows make true. Then convert to given statements. Then replace by shorthands. Then replace by algebra.

  • What are your given statements?
  • What are their shorthands?
  • What are their table/relation names?
  • What are their columns/attributes?
  • What are the columns/attributes of the rows/tuples you want?
  • What is a clear, plain, natural language statement that the rows you want make true but the rows you don't want don't? But avoid pronouns, because they don't translate to algebra; just reuse column/attribute names. And if you need a new name then just invent one and make statements about it.
  • What is one given statement that is part of your overall statement?
  • What is the shorthand version?
  • What is the algebra version?
  • Did you change a name in a given statement? Then rename it in its table/relation via ρ.

Continue for another part of your overall statement.

  • Do you want a combination of given statements? Then use AND, OR, AND NOT and Keeping/Dropping.
  • Do you not want to know the value of a column/attribute that you mentioned? Then use Keeping/dropping (then π).
  • Did you mention too many names? Then keep the ones you want via π (and corresponding Keeping/Dropping).

Keep going.

You will have to find the right order to say things in. Try different orders. Because you have to use NOT via AND NOT of statements/tables or via a condition. And OR and AND NOT of statements/tables must have the same columns/attributes on each side. And a name in in a condition has to be mentioned in a statement it is ANDed with.

Your question. It took me a while to parse and correct the goal you gave:

Show the branch, customer name, balance and account number for all the customers that have a loan bigger than £2000 and all the customers with deposit account with a balance smaller than £150. All these customers should be at Romford branch.

This is:

  • Show the branch, customer name, account balance and account number for ( all the customers that have a loan bigger than £2000 and all the customers with deposit account with a balance smaller than £150 ) . All these customers should be at Romford Branch.

(I had to add a word to make sense of this. But it is unbelievable that this is supposed to mean "branch, customer name, balance and number [labelled what??] where balance and number are loan balance and number for customers with loans > 2000 or balance and number are account balance and number for customers with account balances < 150".)

This has an AND in the middle so you might think it will give an algebra ⋈ (natural join) or ∩ (intersection). But you must describe your columns/attributes only in terms of given statements and conditions. It turns out that the AND gets turned into an OR. Also it turns out that we have to add an extra Deposit statement. So that we have the loan customers account info. Remember that you have to have the same columns/attributes on both side of an OR (or AND NOT).

First "branch, customer name, account balance, account number for" "all the customers that have a loan bigger than £2000". This looks like what we did up above. But this time lets limit the branches later:

-- statement A
-- columns/attributes Cust-Name, Branch, Balance, Acc-No
Keeping Branch, Cust-Name, Balance, Acc-No: (
    customer [Cust-Name] loan [Loan-No] balance is £[Loan-Balance] at branch [Branch]
AND customer [Cust-Name] has £[Balance] in account [Acc-No] at branch [Branch]
AND [Loan-Balance]>2000)

Now "branch, customer name, account balance, account number for" "all the customers with deposit account with a balance smaller than £150". This is simpler than before so I hope you can understand it directly:

-- statement B
-- columns/attributes Cust-Name, Branch, Balance, Acc-No
    customer [Cust-Name] has £[Balance] in account [Acc-No] at branch [Branch]
AND [Balance]<150

Now we want the rows that make the statement "statement A OR statement B" true:

-- columns/attributes Cust-Name, Branch, Balance, Acc-No
    Keeping Branch, Cust-Name, Balance, Acc-No: (
        customer [Cust-Name] loan [Loan-No] balance is £[Loan-Balance] at branch [Branch]
    AND customer [Cust-Name] has £[Balance] in account [Acc-No] at branch [Branch]
    AND [Loan-Balance]>2000)
OR  customer [Cust-Name] has £[Balance] in account [Acc-No] at branch [Branch]
    AND [Balance]<150

Now we limit the branch:

-- statement for goal
-- columns/attributes Cust-Name, Branch, Balance, Acc-No
(   Keeping Branch, Cust-Name, Balance, Acc-No: (
        customer [Cust-Name] loan [Loan-No] balance is £[Loan-Balance] at branch [Branch]
    AND customer [Cust-Name] has £[Balance] in account [Acc-No] at branch [Branch]
    AND [Loan-Balance]>2000)
OR  customer [Cust-Name] has £[Balance] in account [Acc-No] at branch [Branch]
    AND [Balance]<150
)
AND [Branch]='Romford'

This is a statement for the wanted rows. Now we replace by shorthands:

-- shorthand for goal
-- columns/attributes Cust-Name, Branch, Balance, Acc-No
(   Keeping Branch, Cust-Name, Balance, Acc-No: (
        Loan (Branch, Loan-No, Cust-Name, Loan-Balance)
    AND Deposit (Branch, Acc-No, Cust-Name, Balance)
    AND [Loan-Balance]>2000)
OR  Deposit (Branch, Acc-No, Cust-Name, Balance)
    AND [Balance]<150
)
AND [Branch]='Romford'

An answer. Now we replace by algebra:

-- algebra style version
σ Branch='Romford'
    (   π Branch, Cust-Name, Balance, Acc-No
            σ Loan-Balance>2000
                ((ρ Loan-Balance/Balance Loan) ⋈ Deposit)
    ∪   σ Balance<150 Deposit))

PS: Algebra = loopless calculation The whole point of the relational algebra is that statements exactly correspond to algebraic expressions: statements correspond to tables/relations and (statements') logic operators correspond to algebra operators. But the algebra version is a loopless description that can be automatically calculated. The rows that make a statement true are the value of its algebraic version. We give the rows for the table/relation statements and the algebra calculates the rows for any other statement we combine from them.