jecastrom / data_2.08_activities

SQL Queries Activity

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

SQL Activity 2 08

Activity 1

  • In this activity, we will be using the table district from the bank database and according to the description for the different columns:

    • A4: no. of inhabitants

    • A9: no. of cities

    • A10: the ratio of urban inhabitants

    • A11: average salary

    • A12: the unemployment rate

1. Rank districts by different variables

Answer:

  • A4: no. of inhabitants

SELECT
    a2 AS district_name,
    a4 AS no_of_inhabitants,
    rank() over(
        ORDER BY
            a4 DESC
    ) AS ranking
FROM
    district;
147936926 1c13dd31 b52f 4f6d b31b f95a07df1afd
  • A9: no. of cities

SELECT
    a2 AS district_name,
    a9 AS no_of_cities,
    rank() over(
        ORDER BY
            a9 DESC
    ) AS ranking
FROM
    district;
147939661 88d36340 bd7c 4e49 b28d d60875e9f991
ℹ️

Notice that the number of cities can repeat or tie. When this occurs the two values receive the same ranking number, and then a gap is left in the ranking numbers.

The Rank() functions uses a Standard competition ranking strategy (1224 ranking). for example ranks in a "Olympic medalling style", if two people share the gold, there is no silver medalist.[1]

If I didn’t want a "gap" in the ranking numbers, then I should use the Dense_rank() function which uses a Dense ranking strategy (1223 ranking) so no olympic medalling style of ranking with Dense_rank(). In dense ranking, items that compare equally receive the same ranking number, and the next items receive the immediately following ranking number.[2]

 SELECT
    a2 AS district_name,
    a9 AS no_of_cities,
    dense_rank() over(
        ORDER BY
            a9 DESC
    ) AS ranking
FROM
    district;
147939794 9b75ea6d 6a4b 4694 b02f b4ec296d0049
  • A10: the ratio of urban inhabitants

SELECT
    a2 AS district_name,
    a10 AS ratio_of_urban_inhabitants,
    rank() over(
        ORDER BY
            a10 DESC
    ) AS ranking
FROM
    district;
147943046 3d82e44e dca7 4acd 90c2 75b1415259a1
  • A11: average salary

SELECT
    a2 AS district_name,
    a11 AS average_salary,
    rank() over(
        ORDER BY
            a11 DESC
    ) AS ranking
FROM
    district;
147943578 3f9799d0 5f0d 4057 b8d4 9efc6969100d
  • A12: the unemployment rate

SELECT
    a2 AS district_name,
    a12 AS unemployment_rate,
    rank() over(
        ORDER BY
            a12 DESC
    ) AS ranking
FROM
    district;
147962196 64c43272 ae8c 428d bfe8 31f143930c9c
  • An overview ranked by unemployment rate

SELECT
    a2 AS district_name,
    a12 AS unemployment_rate,
    rank() over(
        ORDER BY
            a12 DESC
    ) AS ranking,
    a4 AS no_of_inhabitants,
    rank() over(
        ORDER BY
            a4 DESC
    ) AS ranking,
    a9 AS no_of_cities,
    rank() over(
        ORDER BY
            a9 DESC
    ) AS ranking,
    a10 AS ratio_of_urban_inhabitants,
    rank() over(
        ORDER BY
            a10 DESC
    ) AS ranking,
    a11 AS average_salary,
    rank() over(
        ORDER BY
            a11 DESC
    ) AS ranking
FROM
    district
ORDER BY
    unemployment_rate DESC;
147964235 1d581a9d 5523 45df b62e 2a08b276c4f4

2. Do the same but group by region

Answer:

SELECT
    a3 AS region,
    sum(a4) AS no_of_inhabitants,
    rank() over(
        ORDER BY
            sum(a4) DESC
    ) AS ranking,
    sum(a9) AS no_of_cities,
    rank() over(
        ORDER BY
            sum(a9) DESC
    ) AS ranking,
    avg(a10) AS ratio_of_urban_inhabitants,
    rank() over(
        ORDER BY
            avg(a10) DESC
    ) AS ranking,
    avg(a11) AS average_salary,
    rank() over(
        ORDER BY
            avg(a11) DESC
    ) AS ranking,
    avg(a12) AS unemployment_rate,
    rank() over(
        ORDER BY
            avg(a12) DESC
    ) AS ranking
FROM
    district
GROUP BY
    1
ORDER BY
    no_of_inhabitants DESC;
147972175 2a6f834b 38f0 4da4 9751 c55559675344

Activity 2

1. Use the transactions table in the bank database to find the Top 20 account_ids based on the amount.

Answer:

SELECT
    account_id,
    amount,
    rank() over(
        ORDER BY
            amount DESC
    ) AS top_20_transactions
FROM
    trans
LIMIT
    20;
147998575 c87a88e2 65b7 48a7 8eab 08d4471d0dc1

2. Illustrate the difference between rank() and dense_rank().

Activity 3

  • Keep using the bank database.

1. Get a rank of districts ordered by the number of customers

Answer:

SELECT
    district_id,
    district.a2 AS district_name,
    count(*) AS total_number_of_customers,
    rank() over(
        ORDER BY
            count(*) DESC
    ) AS ranking
FROM
    client
    INNER JOIN district ON `client`.district_id = district.a1
GROUP BY
    1;
148056923 21742939 d8af 4333 8e82 60003af94f54

2. Get a rank of regions ordered by the number of customers

Answer:

SELECT
    district.a3 AS region_name,
    count(*) AS total_number_of_customers,
    rank() over(
        ORDER BY
            count(*) DESC
    ) AS ranking
FROM
    client
    INNER JOIN district ON `client`.district_id = district.a1
GROUP BY
    1;
148058920 f9450c86 3765 4ca6 837f b2b1d6dfde25

3. Get the total amount borrowed by the district together with the average loan in that district

Answer:

SELECT
    district_id,
    district.a2 AS district_name,
    sum(loan.amount) AS total_amount_borrowed,
    floor(avg(loan.amount)) AS average_loan
FROM
    loan
    INNER JOIN account ON loan.account_id = account.account_id
    INNER JOIN district ON account.district_id = district.a1
GROUP BY
    1
ORDER BY
    3 DESC;
148067880 34346710 5ea6 4530 a4fd 402a660d5f9e

4. Get the number of accounts opened by district and year

Answer:

SELECT
    a1 AS district_id,
    a2 AS district_name,
    count(*) AS accounts_opened,
    concat('19', left(`date`, 2)) AS year
FROM
    account
    INNER JOIN district ON account.district_id = district.a1
GROUP BY
    1,
    4
ORDER BY
    1,
    4 DESC;
148075331 2b2d6f2a 4432 484c ac64 5bf67f7bcb2f




1. Strategies for assigning rankings: Standard competition ranking. Wikipedia
2. Strategies for assigning rankings: Dense ranking. Wikipedia

About

SQL Queries Activity