Sagnik Roy

QUALIFICATIONS

Education
-
Northeastern University, D'Amore-McKim School of Business Sep 2023 - Dec 2024
Master of Science in Business Analytics
-
SRM Institute of Science and Technology Jul 2019 - May 2023
Bachelor's degree in Computer Science Engineering, specialization in Big Data Analytics
-
National University of Singapore Jun 2021 - Sep 2021
Summer Program in Data Analytics using Deep Learning
Here are my Credentials. Please take a look
Certifications
-
Certified Business Analysis Professional (CBAP)
-
SAS Visual Business Analytics Professional
-
Data Visualization and Communication using Tableau - Duke University
-
Machine Learning - Stanford University
-
Oracle Database Foundations
-
IBM Enterprise Design Thinking Practitioner
-
Project Management Essentials Certified (PMEC)
Experience
Expedia Group Apr 2024 - Jul 2024
Competitive Intelligence Analyst Extern Boston, Massachusetts
-
Conducted trend analysis on digital advertising across the Asia-Pacific region using Python and data mining practices, contributing to a 25% increase in ad revenue by tapping into emerging markets
-
Using data analytics, identified the top retail media networks (RMNs) and countries in the region with either evolved or emerging markets for the industry.
-
Scraped and integrated data from 4 major competitors using ETL tools, and managed staging area loading
-
Applied OLAP methodologies to forecast their prospective market share in the region, leveraging Tableau and Python
-
Recommended pricing strategies and targeted initiatives that resulted in an 18% increase in revenue
Energy Innovation Capital Feb 2024 - Apr 2024
Data Analytics and VC Industry Research Extern Boston, Massachusetts
-
Extracted and preprocessed data on the Geothermal Technology market from 7 sources using SQL and Python to assess the impact of current market trends
-
Identified the USPs and key investors of 5 emerging startups and 5 established companies, and conducted a SWOT analysis of the market against alternatives such as traditional energy sources and other renewable sources
-
Projected the finances and profitability of SensorEra, a pre-seed company, and gathered market and financial insights for the next 5 years after analyzing its product maturity
-
Presented 3 investment summaries on companies that are clients of EIC, using Tableau and PowerPoint
Hewlett Packard Enterprise Jun 2021 - Sep 2021
Data Scientist Intern Singapore City, Singapore
-
Conducted in-depth analyses on datasets used in deep learning applications.
-
Benchmarked baseline models to assess network performance, focusing on data optimization and hyperparameter tuning. Also developed data-driven business solutions for the company.
-
Developed AI Solutions for the company including a Sign Language-to-Text Conversion System, using a 2D convolutional neural network, achieving 89% accuracy
-
Performed feature engineering on a self-augmented dataset containing 1000 images of 27 different signs
-
Applied a Gaussian blur filter and added a layer to the model to differentiate between similar-looking signs, improving image processing accuracy by 21%
Team 1.618 Jan 2020 - Aug 2021
User Interface Developer/App Developer Team Lead Chennai, Tamil Nadu, India
-
Led a team of 8 to develop a data acquisition application that displayed vehicle metrics such as speed, acceleration, temperature, and range in real time, resulting in a 15% decrease in total energy usage and a 20% increase in lap time efficiency
-
Planned the acquisition and implementation of 200 sensors and managed storage of the sensor data
-
Used the sensor data to develop a vehicle dashboard with Grafana and InfluxDB, and collaborated with the data science team to deploy a flag detection system to increase vehicle efficiency
Projects
Business Intelligence
British Airways Review - Tableau
-
Developed a highly interactive Tableau dashboard, allowing users to seamlessly toggle between metrics such as overall ratings, cabin staff service, food, and entertainment ratings
-
Implemented dynamic filters that enable users to drill down into specific data points by date, traveler type, aircraft, and continent.
-
Integrated geographical data to provide an interactive map feature that allows filtering reviews by country.
-
Demonstrated a realistic data analysis workflow and added detailed tooltips and a summary to enhance the user experience
Data Professionals Survey - Power BI
-
Leveraged survey data from 630 data professionals and collected diverse insights, including job titles, salaries, programming languages, and demographics.
-
Utilized Power Query for data cleaning and transformation
-
Created various visualizations including clustered bar charts, gauges, and tree maps.
-
Visualized average salaries by job title and favorite programming languages among respondents.
-
Integrated multiple visualizations into a cohesive dashboard and enabled easy filtering of data by country for deeper insights.
-
Utilized gauges to measure respondents' satisfaction with work-life balance and salaries.
-
Analyzed key metrics such as average salary by job title and programming language popularity


Cloud Computing
Redfin Housing Market Data - Amazon Elastic MapReduce
-
Configured an EMR cluster with appropriate instance types for primary, core, and optional task nodes, ensuring the cluster was ready for data processing tasks.
-
Established two S3 buckets: one for storing raw data and another for storing transformed data, ensuring organized data management.
-
Set up a Jupyter Notebook within EMR Studio, enabling the use of PySpark for writing and executing data processing scripts.
-
Used Bash commands within the Jupyter Notebook to fetch data from Redfin, storing it in the raw data S3 bucket for further processing.
-
Loaded the raw data into the Jupyter Notebook using PySpark and performed necessary transformations such as dropping null values and extracting specific columns.
-
Loaded the transformed data back into S3 in Parquet format, ensuring efficient storage and retrieval for future use.
Amazon Best Selling Products - Amazon S3 and QuickSight
-
Created a data visualization dashboard using Amazon QuickSight to analyze a dataset of 50,000 best-selling Amazon products
-
Utilized Amazon S3 for data storage, Amazon QuickSight for data visualization, and Bright Data as the data source
-
Generated various visualizations to explore data attributes such as brand popularity, product prices, and seller information.

Machine Learning
Malaria Cell Detection System
-
Developed an automated malaria screening system to aid in effective diagnosis, particularly in rural areas.
-
Utilized Convolutional Neural Networks (CNNs) and Vision Transformers to classify cell images.
-
Extracted features from cell images, classified them with high accuracy, and reduced overfitting through data augmentation, dropout, and batch normalization.
-
Compared models using ROC curves, classification reports, and confusion matrices.
Face Detection using PCA
-
Preprocessed the images by resizing and normalizing to ensure consistency and improve model performance.
-
Extracted principal components to capture the most significant features of the facial images.
-
Trained a classifier using the principal components as features. Evaluated the model’s accuracy and fine-tuned it for optimal performance.
-
Used performance metrics such as precision, recall, and F1-score to measure the effectiveness of the face detection model.