Hussein
The DE GUY

Data Engineer specializing in AWS-based data warehousing, data modeling, and ELT pipelines using Redshift, S3, dbt, SQL, and Python. LinkedIn

AWS S3, REDSHIFT
& DBT

๐Ÿฟ Netflix Data Warehouse Project Built a scalable data pipeline on Amazon Redshift using dbt to transform and analyze Netflix datasets.
โœจ Highlights:
๐Ÿ—‚๏ธ Staging, Dimension, and Fact tables
๐Ÿ“ธ Snapshots for historical changes (SCD Type 2)
โœ… Custom tests & macros for data quality
๐Ÿ“Š Analysis-ready queries for reporting and insights ๐Ÿ’ก Goal: Turn raw Netflix data into a maintainable, testable, and analytics-ready warehouse.
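The SCD Type 2 idea behind the snapshots can be illustrated outside dbt. The following is a minimal Python sketch of the general technique, not the project's dbt snapshot configuration; the column names (show_id, rating, valid_from, valid_to) and the sample rows are hypothetical.

```python
from datetime import date


def apply_scd2(history, incoming, key, tracked, today):
    """Return a new history list with SCD Type 2 changes applied.

    history  : list of dicts with 'valid_from' and 'valid_to' (None = current row)
    incoming : list of dicts holding the latest source rows
    key      : name of the business-key column
    tracked  : columns whose changes should create a new version
    """
    result = [dict(row) for row in history]  # copy so the input is not mutated
    current = {row[key]: row for row in result if row["valid_to"] is None}

    for new in incoming:
        old = current.get(new[key])
        if old is not None and all(old[c] == new[c] for c in tracked):
            continue  # nothing changed, keep the current version open
        if old is not None:
            old["valid_to"] = today  # close the outdated version
        # open a new version (also covers keys seen for the first time)
        result.append({**new, "valid_from": today, "valid_to": None})
    return result


# Hypothetical sample rows for illustration only
history = [
    {"show_id": "s1", "rating": "PG", "valid_from": date(2024, 1, 1), "valid_to": None},
]
incoming = [
    {"show_id": "s1", "rating": "PG-13"},  # changed value
    {"show_id": "s2", "rating": "R"},      # new key
]
updated = apply_scd2(history, incoming, key="show_id", tracked=["rating"], today=date(2024, 6, 1))
```

Here the old "s1" row is closed rather than overwritten, so earlier values stay queryable, which is the point of keeping history with Type 2.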

Airflow
& PostgreSQL

An ETL pipeline using Apache Airflow to extract customer data from a CSV file, clean it with pandas, and load it into PostgreSQL. Includes automated email alerts for success and failure.
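The extract, clean, and load steps can be sketched as three small functions. This is an illustration under stated assumptions, not the project's DAG: it uses only the Python standard library, with csv standing in for pandas and an in-memory SQLite database standing in for PostgreSQL, and the column names and rows are hypothetical. In an Airflow setup, each function would typically become its own task.

```python
import csv
import io
import sqlite3

# Hypothetical customer file contents for illustration only
RAW_CSV = """customer_id,name,email
1, Amina ,AMINA@EXAMPLE.COM
2,Omar,
3,Layla,layla@example.com
3,Layla,layla@example.com
"""


def extract(text):
    """Read CSV text into a list of dicts (one per row)."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Trim whitespace, lowercase emails, drop rows without an email, de-duplicate by id."""
    seen, cleaned = set(), []
    for row in rows:
        row = {k: v.strip() for k, v in row.items()}
        row["email"] = row["email"].lower()
        row["customer_id"] = int(row["customer_id"])
        if not row["email"] or row["customer_id"] in seen:
            continue
        seen.add(row["customer_id"])
        cleaned.append(row)
    return cleaned


def load(conn, rows):
    """Insert cleaned rows and return the resulting row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(customer_id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (:customer_id, :name, :email)", rows
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]


conn = sqlite3.connect(":memory:")  # SQLite stand-in for PostgreSQL (assumption)
loaded = load(conn, clean(extract(RAW_CSV)))
```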

Superstore Project

Power BI, DAX & Data Model

This project involves advanced data analysis using Power BI to gain insights from a Superstore dataset. After thorough data cleaning and processing, a dashboard was created using DAX and various visualizations. Key insights include:
1. Total Customers
2. Total Orders
3. Products Sold
4. Total Returns
5. Top-Performing Products and Regions

Store
Data Warehouse

๐Ÿฌ Super Store Data Warehouse A data warehouse project designed to analyze sales, revenue, profit, and orders across regions. Built with MySQL and Pentaho using a star schema for efficient reporting and dashboards..


E-commerce

Pandas, Seaborn & Matplotlib

This project focuses on advanced data analysis of an E-Commerce dataset using Python. The data was cleaned by handling null values, correcting data types, and removing outliers. Visualization charts were created to reveal key insights, including:
1. Top 5 Highest-Priced Products
2. Best Month for Sales
3. Top 5 Countries by Revenue
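The cleaning steps can be sketched in a few lines. The outlier rule used in the project is not stated, so this example assumes the common 1.5 x IQR fence, and it uses the standard library instead of pandas to stay self-contained; the price values are made up for illustration.

```python
import statistics

# Hypothetical raw values: strings from a file, with nulls and one extreme price
raw_prices = ["2.5", "3.0", None, "2.75", "3.25", "250.0", "2.9", "", "4.0", "2.0", "3.5"]

# Handle nulls and correct the data type in one pass
prices = [float(p) for p in raw_prices if p not in (None, "")]

# Quartiles via the standard library (default 'exclusive' method)
q1, _median, q3 = statistics.quantiles(prices, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr  # assumed outlier fences

kept = [p for p in prices if low <= p <= high]
removed = [p for p in prices if not (low <= p <= high)]
```

With pandas the same fence could be expressed with `Series.quantile` and a boolean mask; the logic is unchanged.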

App Store

SQL

This project focuses on analyzing an Apple App Store dataset using SQLite. Despite initial challenges with data import and SQLite's handling of large datasets, I successfully split the data for analysis. The project involved performing Exploratory Data Analysis (EDA), including checking for unique apps, missing values, app distribution by genre, and analyzing user ratings.
Key Insights:
1. App Type and Ratings: Examined whether paid apps have higher ratings than free apps.
2. Apps with Multilingual Support: Explored whether apps supporting more languages tend to have higher user ratings.
3. Highest-Priced Categories: Identified which app categories have the highest prices.
4. Low-Rated Genres: Checked for genres with low user ratings.
5. Description Length and User Ratings: Investigated whether there is a correlation between app description length and user ratings.
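The paid-versus-free comparison could look roughly like the query below, run here through Python's sqlite3 module against an in-memory table. The table name, column names (track_name, price, user_rating), and rows are assumptions for illustration and are not taken from the project's dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE apps (track_name TEXT, price REAL, user_rating REAL)")

# Hypothetical rows for illustration only
conn.executemany(
    "INSERT INTO apps VALUES (?, ?, ?)",
    [
        ("App A", 0.0, 4.0),
        ("App B", 0.0, 3.0),
        ("App C", 2.99, 4.5),
        ("App D", 4.99, 3.5),
    ],
)

# Label each app as Paid or Free, then compare average ratings per group
query = """
SELECT CASE WHEN price > 0 THEN 'Paid' ELSE 'Free' END AS app_type,
       COUNT(*)         AS apps,
       AVG(user_rating) AS avg_rating
FROM apps
GROUP BY app_type
ORDER BY app_type
"""
rows = conn.execute(query).fetchall()
```

The same CASE-and-GROUP BY pattern would extend to the other questions, for example by bucketing on the number of supported languages instead of price.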

Web Scraping

Python

This project involves web scraping the Horn Africa Jobs website to gather information on job postings. The goal is to extract essential job details, such as:
- Job Name
- Salary
- Location
The extracted data is then saved into a CSV file for further analysis and accessibility. This process helps in tracking job opportunities and analyzing trends in the job market, making it easier for job seekers to find relevant positions.
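The parse-and-export step can be sketched as follows. The site's real markup is not shown here, so the HTML snippet, the class names (job, title, salary, location), and the values are all hypothetical; the sketch uses the standard library's html.parser and parses an inline string rather than fetching a live page.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical markup standing in for a downloaded listings page
SAMPLE_HTML = """
<div class="job"><h2 class="title">Data Analyst</h2><span class="salary">$1,200</span><span class="location">City A</span></div>
<div class="job"><h2 class="title">Accountant</h2><span class="salary">Negotiable</span><span class="location">City B</span></div>
"""

FIELDS = ("title", "salary", "location")


class JobParser(HTMLParser):
    """Collect one dict per element with class 'job'."""

    def __init__(self):
        super().__init__()
        self.jobs = []
        self._field = None  # field whose text is expected next

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class == "job":
            self.jobs.append({})  # start a new posting
        elif css_class in FIELDS and self.jobs:
            self._field = css_class

    def handle_data(self, data):
        if self._field:
            self.jobs[-1][self._field] = data.strip()
            self._field = None


parser = JobParser()
parser.feed(SAMPLE_HTML)

# Write to an in-memory buffer; a real run would open a .csv file instead
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(parser.jobs)
csv_text = buffer.getvalue()
```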

World's Population

Python, Power BI & BeautifulSoup

This project involved three key stages: data collection, transformation, and visualization.
1. Data Collection: Using web scraping, I gathered valuable table data from a website, akin to a pig foraging for truffles.
2. Data Transformation: I refined the dataset by removing null values and correcting column data types.
3. Data Visualization: I created three clear visuals showing the top 10 countries by population, the countries with the highest median age, and the top 5 countries by land area.
Additionally, I automated the process using Python to send the results via email. This project demonstrates how data analysis can uncover valuable insights effectively.
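The email step might be assembled along these lines. This is a sketch under assumptions: the addresses, subject, file name, and CSV contents are placeholders, and the actual send call is left as a comment because it needs a real SMTP server and credentials.

```python
from email.message import EmailMessage


def build_report_email(sender, recipient, csv_text):
    """Build an email with the results attached as a CSV file."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "World population report"  # placeholder subject
    msg.set_content("The latest population results are attached.")
    msg.add_attachment(
        csv_text.encode("utf-8"),
        maintype="text",
        subtype="csv",
        filename="population.csv",  # placeholder file name
    )
    return msg


# Placeholder values for illustration only
msg = build_report_email(
    "sender@example.com",
    "recipient@example.com",
    "country,population\nExampleland,1000\n",
)

# Sending would look roughly like this (host, port, and login are assumptions):
# import smtplib
# with smtplib.SMTP("smtp.example.com", 587) as server:
#     server.starttls()
#     server.login("user", "password")
#     server.send_message(msg)
```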

Employee Attrition

Excel

Problem Statement: The goal is to analyze the dataset and extract meaningful insights about employee retention.
Data Analysis: We cleaned the data by addressing duplicates, null values, and errors. We also categorized working years into four groups: 1-10 years, 11-20 years, 21-30 years, and 31-40 years. Using pivot tables and pivot charts, we created an interactive dashboard.
Insights:
- Employee Count: Current employees outnumber those who left.
- Education: Departing employees were more likely to hold bachelor's degrees than master's degrees.
- Departments: The Research and Development department had the highest turnover, but it also had the most employees.
- Job Roles: The Sales Executive role saw the highest number of departures.
These insights highlight key factors influencing employee retention and turnover.
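The working-years grouping was done in Excel; for readers who prefer code, the same bucketing and a pivot-style count can be expressed in Python. This is an equivalent sketch, not the workbook's formulas, and the sample records are hypothetical.

```python
from collections import Counter


def tenure_group(years):
    """Map whole working years to one of the four 10-year groups."""
    if not 1 <= years <= 40:
        return "Other"  # outside the ranges used in the analysis
    lower = ((years - 1) // 10) * 10 + 1
    return f"{lower}-{lower + 9} years"


# Hypothetical (working_years, left_company) records for illustration only
records = [(5, True), (12, False), (10, True), (33, False), (21, False)]

# Pivot-style count of employees per tenure group and attrition flag
counts = Counter((tenure_group(years), left) for years, left in records)
```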