AlejandroDeLaCruz / SQL_Project_Data_Job_Analysis

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Introduction

πŸ“Š Dive into the data job market! πŸ’Ό Focusing on data analyst roles, this project explores top-paying jobs, in-demand skills, and where high demand meets high salary in data analytics. πŸ’°πŸ”

πŸ” SQL queries? Check them out here: project_sql folder.

Background

Driven by a quest to navigate the data analyst job market more effectively, this project was born from a desire to pinpoint top-paid and in-demand skills, streamlining others work to find optimal jobs.

Data hails from my SQL Course. It's packed with insights on job titles, salaries, locations, and essential skills.

The questions I wanted to answer through my SQL queries were:

  1. What are the top-paying data analyst jobs?
  2. What skills are required for these top-paying jobs?
  3. What skills are most in-demand for data analysts?
  4. Which skills are associated with higher salaries
  5. What are the most optimal skills to learn?

Tools I Used

For my deep dive into the data analyst job market, I harnessed the power of several key tools:

  • SQL: The backbone of my analysis, allowing me to query the database and unearth critical insights.
  • PostgreSQL: The chosen database management system, ideal for handling jobs posting data.
  • Visual Studio Code: My go-to for database management and executing SQL queries.
  • Git and GitHub: Essential for version control and sharing my SQL scripts and analysis, ensuring collaboration and project tracking.

The Analysis

Each query for this project aimed at investigating specififc aspects of the data analyst job market. Here's how I approached each question:

1. Top Paying Data Analyst Jobs

To identify the highest-paying roles, I filtered data analyst positions by average yearly salaries and location, focusing on remote jobs. This query highlights the high paying opportunities in the field.

SELECT
    job_id,
    job_title,
    job_location,
    job_schedule_type,
    salary_year_avg,
    job_posted_date,
    c.name AS company_name
FROM
    job_postings_fact AS jp
LEFT JOIN
    company_dim AS c ON c.company_id = jp.company_id
WHERE
    job_title_short = 'Data Analyst'
    AND salary_year_avg IS NOT NULL
    AND job_location = 'Anywhere'
ORDER BY
    salary_year_avg DESC
LIMIT 10;

Here's the breakdown of the top data analyst jobs in 2023:

  • Wide Salary Range: Top 10 paying data analyst roles span from $184,000 to $650,000, indicating significant salary potential in the field.
  • Diverse Employment: Companies like SmartAsset, Meta, and AT&T are among those offering high salaries, showing a broad interest across different salaries
  • Job Title Variety: There is a high diversity in job titles, from Data Analyst to Director of Analytics, reflecting varied roles and specializations within data analytics.

2. Skills for Top Paying Jobs

To understand what skills are required for the top-paying jobs, I joined the job postings with the skills data, providing insights into what employers value for high-compensation roles.

WITH top_paying_jobs AS (
    SELECT
        job_id,
        job_title,
        job_location,
        job_schedule_type,
        salary_year_avg,
        job_posted_date,
        c.name AS company_name
    FROM
        job_postings_fact AS jp
    LEFT JOIN
        company_dim AS c ON c.company_id = jp.company_id
    WHERE
        job_title_short = 'Data Analyst'
        AND salary_year_avg IS NOT NULL
        AND job_location = 'Anywhere'
    ORDER BY
        salary_year_avg DESC
    LIMIT 10)

SELECT
    tp.*,
    sd.*
FROM
    skills_job_dim AS sj
JOIN
    top_paying_jobs AS tp ON tp.job_id = sj.job_id
JOIN
    skills_dim AS sd ON sd.skill_id = sj.skill_id
ORDER BY
    salary_year_avg DESC;

Here's the breakdown of the most demanded skills for the top 10 highest paying data analyst jobs in 2023:

  • SQL is leading with a bold count of 8.
  • Python follows closely with a bold count of 7.
  • Tableau is also highly sought after, with a bold count of 6. Other skills like R, Snowflake, Pandas, and Excel show varying degrees of demand.

3. In-Demand Skills for Data Analysts

This query helped identify the skills most frequently requested in job postings, directing focus to areas with high demand.

SELECT
    skills,
    COUNT(*) AS demand_count
FROM
    job_postings_fact AS jp
JOIN
    skills_job_dim AS sj ON sj.job_id = jp.job_id
JOIN
    skills_dim AS sd ON sd.skill_id = sj.skill_id
WHERE
    job_title_short = 'Data Analyst'
GROUP BY
    skills
ORDER BY
    2 DESC
LIMIT 5;

Here's the breakdown of the most demanded skills for data analysts in 2023:

  • SQL and Excel remain fundamental, emphasizing the need for strong foundational skills in data processing and spreadsheet manipulation.

  • Programming and Visualization Tools like Python, Tableau, and Power BI are essential, pointing towards the increasing importance of technical skills in data storytelling and decision support.

skills demand_count
SQL 92628
Excel 67031
Python 57326
Tableau 46554
Power BI 39468

Table of the demand for the top 5 skills in data analyst postings

4. Skills Based on Salary

Exploring the average salaries associated with different skills revealed which skills are the highest paying.

SELECT
    skills,
    ROUND(AVG(salary_year_avg), 2) AS average_salary
FROM
    job_postings_fact AS jp
JOIN
    skills_job_dim AS sj ON sj.job_id = jp.job_id
JOIN
    skills_dim AS sd ON sd.skill_id = sj.skill_id
WHERE
    job_title_short = 'Data Analyst'
    AND salary_year_avg IS NOT NULL
GROUP BY
    skills
ORDER BY
    2 DESC
LIMIT 25;

Here's a breakdown of the results for top paying skills for Data Analysts:

  • High Demand for Big Data & ML Skills: Top salaries are commanded by analysts skilled in big data technologies (PySpark, Couchbase), machine learning tools (DataRobot, Jupyter), and Python libraries (Pandas, NumPy), reflecting the industry's high valuation of data processing and predictive modeling capabilities.
  • Software Development & Deployment Proficiency: Knowledge in development and deployment tools (GitLab, Kubernetes, Airflow) indicates a lucrative crossover between data analysis and engineering, with a premium on skills that facilitate automation and efficient data pipeline management.
  • Cloud Computing Expertise: Familiarity with cloud and data engineering tools (Elasticsearch, Databricks, GCP) underscores the growing importance of cloud-based analytics environments, suggesting that cloud proficiency significantly boosts earning potential in data analytics.
Skills Average Salary
SVN 400000.00
Solidity 179000.00
Couchbase 160515.00
Datarobot 155485.50
Golang 155000.00
Mxnet 149000.00
Dplyr 147633.33
VMware 147500.00
Terraform 146733.83
Twilio 138500.00
Gitlab 134126.00
Kafka 129999.16
Puppet 129820.00
Keras 127013.33
Pytorch 125226.20
Perl 124685.75
Ansible 124370.00
Hugging Face 123950.00
Tensorflow 120646.83
Cassandra 118406.68
Notion 118091.67
Atlassian 117965.60
Bitbucket 116711.75
Airflow 116387.26
Scala 115479.53

Table of the average salary for the top 10 paying skills for data analysts

5. Most Optimal Skills to Learn

Combining insights from demand and salary data, this query aimed to pinpoint skills that are both in high demand and have high salaries, offering a strategic focus for skill development.

WITH skills_demand AS (
    SELECT
        sd.skill_id,
        skills,
        COUNT(*) AS demand_count
    FROM
        job_postings_fact AS jp
    JOIN
        skills_job_dim AS sj ON sj.job_id = jp.job_id
    JOIN
        skills_dim AS sd ON sd.skill_id = sj.skill_id
    WHERE
        job_title_short = 'Data Analyst'
        AND salary_year_avg IS NOT NULL
    GROUP BY
        sd.skill_id),

average_salary AS (
    SELECT
        sd.skill_id,
        skills,
        ROUND(AVG(salary_year_avg), 2) AS avg_salary
    FROM
        job_postings_fact AS jp
    JOIN
        skills_job_dim AS sj ON sj.job_id = jp.job_id
    JOIN
        skills_dim AS sd ON sd.skill_id = sj.skill_id
    WHERE
        job_title_short = 'Data Analyst'
        AND salary_year_avg IS NOT NULL
    GROUP BY
        sd.skill_id)

SELECT
    skills_demand.skill_id,
    skills_demand.skills,
    demand_count,
    avg_salary
FROM
    skills_demand
JOIN
    average_salary ON skills_demand.skill_id = average_salary.skill_id
WHERE
    demand_count > 10
ORDER BY
    avg_salary DESC, demand_count DESC;

Here's the data formatted into a table:

skill demand_count average_salary
Kafka 40 129999.16
Pytorch 20 125226.20
Perl 20 124685.75
Tensorflow 24 120646.83
Cassandra 11 118406.68
Atlassian 15 117965.60
Airflow 71 116387.26
Scala 59 115479.53
Linux 58 114883.20
Confluence 62 114153.12

Here's a breakdown of the most optimal skills for Data Analysts in 2023:

  • High-Demand Programming Languages: Python and R stand out for their high demand, with demand counts of 236 and 148 respectively. Despite their high demand, their average salaries are around $101,397 for Python and $100,499 for R, indicating that proficiency in these languages is highly valued but also widely available.
  • Cloud Tools and Technologies: Skills in specialized technologies such as Snowflake, Azure, AWS, and BigQuery show significant demand with relatively high average salaries, pointing towards the growing importance of cloud platforms and big data technologies in data analysis.
  • Business Intelligence and Visualization Tools: Tableau and Looker, with demand counts of 230 and 49 respectively, and average salaries around $99,288 and $103,795, highlight the critical role of data visualization and business intelligence in deriving actionable insights from data.
  • Database Technologies: The demand for skills in traditional and NoSQL databases (Oracle, SQL Server, NoSQL) with average salaries ranging from $97,786 to $104,534, reflects the enduring need for data storage, retrieval, and management expertise.

What I Learned

Throughout this adventure, I've turbocharged my SQL toolkit with some serious firepower:

  • 🧩 Complex Query Crafting: Mastered the art of advanced SQL, merging tables like a pro and wielding WITH clauses for ninja-level temp table maneuvers.
  • πŸ“Š Data Aggregation: Got cozy with GROUP BY and turned aggregate functions like COUNT() and AVG() into my data-summarizing sidekicks.
  • πŸ’‘ Analytical Wizardry: Leveled up my real-world puzzle-solving skills, turning questions into actionable, insightful SQL queries.

Conclusion

Insights

From the analysis, several general insights emerged:

  1. Top-Paying Data Analyst Jobs: The highest-paying jobs for data analysts that allow remote work offer a wide range of salaries, the highest at $650,000!
  2. Skills for Top-Paying Jobs: High-paying data analyst jobs require advanced proficiency in SQL, suggesting it’s a critical skill for earning a top salary.
  3. Most In-Demand Skills: SQL is also the most demanded skill in the data analyst job market, thus making it essential for job seekers.
  4. Skills with Higher Salaries: Specialized skills, such as SVN and Solidity, are associated with the highest average salaries, indicating a premium on niche expertise.
  5. Optimal Skills for Job Market Value: SQL leads in demand and offers for a high average salary, positioning it as one of the most optimal skills for data analysts to learn to maximize their market value.

Closing Thoughts

This project enhanced my SQL skills and provided valuable insights into the data analyst job market. The findings from the analysis serve as a guide to prioritizing skill development and job search efforts. Aspiring data analysts can better position themselves in a competitive job market by focusing on high-demand, high-salary skills. This exploration highlights the importance of continuous learning and adaptation to emerging trends in the field of data analytics.

About