Hi, I'm Natasha Bernard

Machine Learning Engineer

I am a results-driven machine learning engineer with strong critical-thinking and problem-solving abilities. I have a solid background in statistics, with extensive skills in mathematics and statistical algorithms, and I work in Python and SQL. My skill set covers data extraction, the development of predictive models, and the design of effective solutions to business challenges. I enjoy challenges and am always looking for opportunities to learn something new.

About Me

My Introduction

I have hands-on experience delivering impactful projects, including credit scoring, price forecasting, and fraud detection models. I also have experience with extensive data clean-up that ensures the quality and reliability of company data. This work has produced tangible achievements that contributed significantly to the success of the companies I've worked with.

Skills

My Skills

Data Scientist | Machine Learning Engineer

I am skilled in:

- Machine Learning: 95%
- Python: 95%
- SQL: 95%
- Spark: 90%
- Scikit Learn: 95%
- Airflow: 90%
- TensorFlow: 85%
- BigQuery: 95%
- Looker: 90%

Qualifications

My personal journey
Work
Education
2023 - Present

Machine Learning Engineer

iProcure Limited

Role Description
- Developed and maintained machine learning algorithms using iProcure's data, enhancing company operations and resulting in revenue growth.
- Managed algorithms for various company projects within predictive modeling environments, continuously optimizing model performance across diverse applications.
- Extracted, transformed, and loaded data from iProcure's data systems, and proactively managed data requests and queries so that departments had timely access to critical information, significantly improving decision-making.

2022 - 2023

Junior Data Scientist

IntelliSOFT Consulting LTD

Role Description
- Extracted data, built predictive models, and found solutions for business problems as part of the data science framework.
- Worked with data in various ICL projects to build and deploy machine learning models.
- Performed data extraction, data mining, data visualization, and predictive modeling using scripting languages such as Python and SQL as required in various ICL projects.

2018 - 2022

Economics and Statistics

University of Nairobi

Grade: Second Class Honors Upper Division

- Pursued a Bachelor's degree in Economics and Statistics, acquiring skills in statistical areas such as data analysis, probability, hypothesis testing, and time series analysis.

2013 - 2016

Kenya Certificate of Secondary Education

The Kenya High School

Grade: A-

- I excelled in subjects such as Mathematics and Physics, attaining a grade A in each in my KCSE examination. I was also the head of the Junior Achievement (JA) club, whose main goal was to prepare students to succeed in the global economy.

Portfolio

Projects

Credit Scoring Model

In a credit offering initiative targeting over 3,500 agro-dealers in Kenya, the primary goal was to establish a robust credit scoring model that accurately assessed the creditworthiness of businesses. The focus was on providing reliable credit limits to mitigate default risks and financial losses for the company. The model based credit scores on the performance of each business rather than on the individual owner, with selected variables reflecting the overall health of its operations. Using an unsupervised learning approach, I devised risk grade coefficients for individual businesses, coupled with appropriate credit limits. Through iterative training, the model achieved an accuracy of approximately 87%. Since the rollout, no defaults have been recorded, which points to the model's effectiveness in minimizing financial risk.

Comprehensive Data-cleanup Project

In a comprehensive data cleanup initiative spanning a decade of transaction data, I led the transformation of inconsistent and poor-quality information into a reliable asset for decision-making. Utilizing a PySpark workflow, I systematically cleaned and processed data columns for enhanced consistency and quality. Applying advanced machine learning techniques, including clustering and topic modeling, we achieved a 90% cleanup success rate for transactions totaling approximately 100 billion. The project's impact extended organization-wide, providing swift access to clean data for various departments and significantly enhancing decision-making capabilities, resulting in a notable increase in revenue.

Price Forecasting Model

I developed a price forecasting model whose primary objective was to predict market price shifts ahead of time, enabling the company to make strategic decisions in real time and prevent financial losses. Built on an XGBoost algorithm and trained on relevant features and historical pricing data, the model was highly precise and outperformed traditional forecasting methods. Anticipating market price changes allowed the company to adjust product prices proactively, preventing losses associated with outdated strategies. The model also helped the company align its strategic moves with market trends, enhancing its competitiveness and optimizing revenue generation.

Credit Card Fraud Detection Model

In this credit card fraud detection project, I tackled a classification problem using preprocessed, PCA-transformed data. I explored the data and visualized patterns with Matplotlib and Seaborn, addressed class imbalance through oversampling and undersampling, and iteratively tested various models. I also applied an Isolation Forest to identify anomalous data points, giving a more comprehensive approach to safeguarding against fraudulent activity. The Random Forest model emerged as the top performer, achieving an F1 score of 0.74.
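
The oversampling step mentioned above can be sketched with the Python standard library alone. This is an illustrative sketch on toy data, not the project's code: the function name `random_oversample`, the fixed seed, and the 8-to-2 class split are all assumptions made for the example.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until
    every class has as many rows as the largest class."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    counts = Counter(labels)
    target = max(counts.values())

    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Sample with replacement to fill the gap to the target size.
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data (hypothetical): 8 legitimate transactions (0), 2 fraudulent (1).
X = [[float(i)] for i in range(10)]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

Resampling of this kind is normally applied to the training split only, so that duplicated rows do not leak into the data used for evaluation.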

View on GitHub

Bank Churn Model

In a bank churn prediction project, I built a model to predict customer churn. Using visualizations from libraries such as Matplotlib and Seaborn, I analyzed the data and explored correlations between features to understand the dynamics within the dataset. I encoded categorical features and standardized the data so that all variables could be incorporated meaningfully into the model. To address class imbalance and improve robustness, I balanced the data before fitting a logistic regression model. The result was a 0.11 improvement in the ROC AUC score over the base model's performance. This improvement strengthened the model's predictive capabilities and laid the groundwork for more informed decision-making about customer churn in the banking context.
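
For readers unfamiliar with the ROC AUC metric used above, it can be read as the probability that a randomly chosen positive example (here, a churned customer) receives a higher score than a randomly chosen negative one. The sketch below computes it that way using only the standard library. It is illustrative rather than the project's code: the function name `roc_auc` and the toy labels and scores are assumptions.

```python
def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs in which
    the positive example gets the higher score; ties count as half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy labels and scores (hypothetical): 1 = churned, 0 = stayed.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

print(roc_auc(y_true, scores))  # 3 of the 4 pairs are ranked correctly
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The pairwise loop is quadratic in the number of examples, which is fine for a sketch but too slow for large datasets.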

View on GitHub

Disaster Tweets Model

In this Natural Language Processing (NLP) project, the objective was to develop a model that predicts whether a tweet refers to a real disaster. The project began with exploratory analysis, examining the distribution of the text data and using visualizations to reveal underlying structure in the dataset. To improve the model's effectiveness, I applied text preprocessing techniques, including lower-casing and the removal of punctuation and tags. I then transformed the text with TfidfVectorizer and fitted a logistic regression model on the result. The model achieved a ROC AUC score of 0.64 in distinguishing tweets about real disasters from those that are not.
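
The preprocessing steps named above (lower-casing, then removing tags and punctuation) can be sketched as follows. This is a standard-library illustration, not the project's code: the function name `clean_tweet` is hypothetical, and reading "tags" as HTML-style tags is an assumption.

```python
import re
import string

# Assumption: "tags" means HTML-style tags such as <b>...</b>.
TAG_RE = re.compile(r"<[^>]+>")
# Translation table that deletes every ASCII punctuation character.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_tweet(text):
    """Lower-case the text, strip tags and punctuation, and collapse
    runs of whitespace into single spaces."""
    text = text.lower()
    # Remove tags first: the next step would otherwise delete the angle
    # brackets and leave the tag names behind as ordinary words.
    text = TAG_RE.sub(" ", text)
    text = text.translate(PUNCT_TABLE)
    return " ".join(text.split())


print(clean_tweet("Forest FIRE near <b>La Ronge</b>, Sask. Canada!"))
```

Because `string.punctuation` includes `#` and `@`, this sketch keeps the words of hashtags and mentions while dropping the symbols themselves.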

View on GitHub

Contact Me

Get in touch

Call Me

+254-707-156-064

Email

natbernard15@gmail.com

Location

Nairobi - Kenya