It's Vivek Murali

A DATA SCIENTIST AND A DATA ENGINEER


ABOUT ME

[Profile photo: grayscale portrait with yellow diagonal lines in the background]
#AIEngineers
#TechPawfessionals
#DataAlchemy
Plumbing Data Pipelines Like A Pro
Status: Working Remote

DESIGN - 83%

DEVELOPMENT - 85%

DATA ANALYSIS - 77%

DATA ENGINEERING - 75%

MACHINE LEARNING - 80%

CLOUD COMPUTE - 70%

Hello there! I'm Vivek, a data professional passionately dedicated to unraveling the mysteries of data using cutting-edge tools and services. With a blend of scientific precision and engineering finesse, I delve into the depths of datasets, extracting insights that illuminate hidden patterns and trends. As a working professional in both data engineering and data science, I navigate the complexities of data with expertise and curiosity. My playground consists of open source tools and cloud services, indispensable companions in the realm of data wizardry. Whether I'm wrangling messy datasets or orchestrating complex analyses, Python serves as my trusty wand, casting spells of data manipulation and exploration.

In this fast-paced digital landscape, I'm constantly on the lookout for the latest tools and technologies to sharpen my skills. I'm also diving into learning Rust and exploring other cloud services to broaden my expertise and stay at the forefront of innovation. Beyond the realm of data, I'm an avid explorer, venturing into uncharted territories of tech to stay ahead of the curve. So, join me on this exhilarating journey as we decode the secrets of data and embark on a quest for knowledge and discovery!

OFFERINGS

Design

Designing data pipelines for high-volume, high-velocity data.

Development

Developing complex data engineering and data science solutions with a strong understanding of the domain.

Data Analysis

Providing end-to-end data insights on complex data problems.

Data Engineering

Applying open source data engineering frameworks to design and manage data.

Cloud Compute

Deploying and managing stateful or stateless applications using cloud infrastructure, Docker, and Kubernetes.

Machine Learning

Building NLP, LLM, time-series, and audio-based solutions to complex machine learning problems.

Data Visualization

Data visualization turns numbers into a chart-topping party even spreadsheets want to attend.

Achievements

5+

Collaborations

1890+

Hours Worked

10+

Courses Completed

30+

Finished Projects

5+

Awards

10+

Workshops Attended

Badges & Skills

Rust | Azure Data Engineer

EXPERIENCE

3 Months 2016
Data & Business Analyst [Internship] - 6th Energy Pvt Ltd
May 2016 ~ August 2016

Analyzed sensory data from various locations and created a time-series forecast summary report on electricity consumption of generators, batteries, and alternating current for the individual locations.

2 Years 2019
Data Scientist - 73 Strings
August 2019 ~ August 2021

Worked closely with stakeholders across departments to design, build and deploy various initiatives within the data platform.
Developed, deployed, and maintained data services using Elasticsearch, Airflow, and Snowflake, as well as ETL and ML pipelines for news analysis and earnings call transcripts of listed and non-listed companies. Contributed to research, modeling, and implementation of a news summarizer, a company evaluation platform, and NLP use cases such as financial NER and financial digest summary automation. Automated the data extraction process for job posts, news, and reviews for specific companies, along with the evaluation of in-house APIs.

3 Years 2021
Data Engineer - Ivoyant Systems Pvt Ltd
September 2021 ~ September 2024

Designed best practices to support continuous process automation for data ingestion and data pipeline workflows. Designed and implemented data solutions best suited to deliver on our customers' needs. Gained full software development life cycle experience, including design, documentation, implementation, testing, and deployment.
Continually explored new technologies to grow and uncover better ways to solve problems. As part of a POC, implemented an end-to-end speech-to-text annotator for healthcare assignments using Azure, an NLP stack, and Docker. Developed, deployed, and maintained a data pipeline to flag anomalies in data using Apache Beam, and created KPIs using Metabase and an OLAP data store. Implemented customer segmentation and churn prediction using Apache Spark.

2 Months 2024
Data Scientist - Leo CybSec
September 2024 ~ Present

Recently joined as a Data Scientist.

Activity

Assignments where I dive into the fascinating world of research papers, data science and engineering pipelines, and portfolio projects, exploring the latest advancements and uncovering valuable insights to help you navigate these domains with confidence.

LOOKING FOR A BIG DATA SOLUTION?

Found something interesting to talk about? Send me an invite and we can have a discussion.

Information Weekly



AutoGen - A programming framework for agentic AI.

AutoGen is an open-source programming framework for building AI agents and facilitating cooperation among multiple agents to solve tasks. AutoGen aims to streamline the development and research of agentic AI, much like PyTorch does for Deep Learning. A minimal usage sketch follows below.

Check It Out
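To give a flavour of the multi-agent setup described above, here is a minimal sketch using the pyautogen 0.2-style API. The model name and API key are placeholders, and details may differ across AutoGen versions.

```python
# Minimal two-agent AutoGen conversation (pyautogen 0.2-style API).
# The model name and API key are placeholders -- adjust to your own LLM config.
from autogen import AssistantAgent, UserProxyAgent

llm_config = {"config_list": [{"model": "gpt-4", "api_key": "YOUR_API_KEY"}]}

# The assistant proposes solutions (including code); the user proxy can run them locally.
assistant = AssistantAgent("assistant", llm_config=llm_config)
user_proxy = UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",
    code_execution_config={"work_dir": "coding", "use_docker": False},
)

# Kick off a task; the two agents converse until the task is resolved.
user_proxy.initiate_chat(
    assistant,
    message="Write and run a Python script that prints the first 10 Fibonacci numbers.",
)
```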
drawDB - Free, simple, and intuitive database design tool and SQL generator.

drawDB is a robust and user-friendly database entity relationship (DBER) editor right in your browser. Build diagrams with a few clicks, export SQL scripts, customize your editor, and more without creating an account.

Check It Out
rubicon-ml - A Data Science tool that captures and stores model training and execution information.

rubicon-ml is a data science tool that captures and stores model training and execution information, like parameters and outcomes, in a repeatable and searchable way. Its git integration associates these inputs and outputs directly with the model code that produced them to ensure full auditability and reproducibility for both developers and stakeholders alike.

Check It Out
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine.

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. It offers a streamlined RAG workflow for businesses of any scale, combining LLM (Large Language Models) to provide truthful question-answering capabilities, backed by well-founded citations from various complex formatted data.

Check It Out
Fabric is an open-source framework for augmenting humans using AI.

Fabric is an open-source framework for augmenting humans using AI. It provides a modular framework for solving specific problems using a crowdsourced set of AI prompts that can be used anywhere.

Check It Out
Deep Lake: Database for AI

Deep Lake is a Database for AI powered by a storage format optimized for deep-learning applications. Deep Lake simplifies the deployment of enterprise-grade LLM-based products by offering storage for all data types.

Check It Out

Blogging

SOCIAL

"A Culinary Voyage Through Asia: Delights at Asia Kitchen by Mainland China" Unleash Adventure with Rajputana Cabs' Jaisalmer Desert Safaris Discover Authentic Japanese Flavors at Oyama in Chennai! Oia in Hennur: Asia's Largest Pub Brings Santorini Magic to Bangalore! Asvah: Where Culinary Craft Meets Coastal Chic! Aeseo: A Culinary Oasis on Greenway's Road - Clean, Delicious, and Unforgettable!

CONTACT

Feel free to contact me with any questions. For open source projects, please open an issue or pull request on GitHub. If you want to follow my work, reach me on Twitter. Otherwise, send me an email.

LOOKING FORWARD TO HEARING FROM YOU!

Call Me

+91 702-216-9336

Address

Peterborough, UK

Links: Resume, CV, etc.

E-Mail via Default E-mail Client | Meeting Availability

Frequently Asked Questions

Why did I make this website?
I created and deployed this website on a DigitalOcean Droplet using Python's Django web framework so that I could learn more about web app design and back-end development.
In the future I will use this website as a nesting ground for web-based computer vision, data engineering framework implementation snippets, and NLP model workflows. Though I don't expect it to be anything more than a portfolio site, I strongly suspect that these skills will be crucial to technological development in the years to come.
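For context, a portfolio page like this boils down to a handful of Django views wired to templates. The sketch below is purely illustrative; the template name and URL pattern are assumptions, not this site's actual code.

```python
# A minimal Django view and URL wiring of the kind a simple portfolio page might use.
# "home.html" and the route are illustrative placeholders, not this site's actual code.
from django.shortcuts import render
from django.urls import path

def home(request):
    # Render the landing page template with a small context dictionary.
    return render(request, "home.html", {"title": "Portfolio"})

# In a real project this list lives in urls.py.
urlpatterns = [
    path("", home, name="home"),
]
```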
How long have you been practicing machine learning?
I discovered machine learning at the beginning of 2018 while doing my master's at JAIN University (Bangalore, India). Since then, I have immersed myself in data science communities on Medium and across the web.
I have rich experience in building machine learning solutions for NLP tasks. I have built models to assist in Sentiment Analysis, Keyword Extraction, Text Classification, Demographic Clustering, and Topic Modelling.
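To give a flavour of what a text-classification model like the ones mentioned above looks like, here is a minimal scikit-learn sketch. The tiny inline dataset is purely illustrative.

```python
# Minimal sentiment/text-classification sketch with scikit-learn.
# The tiny inline dataset is purely illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "great product, works really well",
    "terrible support, very slow",
    "love the new update",
    "worst purchase ever",
]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

# TF-IDF features feeding a logistic regression classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

print(model.predict(["the update is great"]))  # expected: [1]
```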
Do you have any experience working with databases?
I have hosted and deployed my own MongoDB databases with this website. I have intermediate knowledge of SQL queries and have worked with Azure, AWS, and GCP database servers over SSH. I also have hands-on experience with MySQL, PostgreSQL, MongoDB, and Cassandra databases. By renting servers from Heroku, GCP, Azure, and AWS, I have been able to develop my own back-end production environment.
I believe that data scientists who can own every step of the ML lifecycle will be invaluable in the coming years.
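As a small example of the document-database work mentioned above, here is a minimal MongoDB round trip with pymongo. It assumes a local mongod on the default port, and the database and collection names are illustrative.

```python
# Minimal MongoDB round trip with pymongo.
# Assumes a local mongod on the default port; the names below are illustrative.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")
db = client["portfolio"]

# Insert a document and read it back.
db.projects.insert_one({"name": "churn-prediction", "stack": ["Spark", "Airflow"]})
print(db.projects.find_one({"name": "churn-prediction"}))
```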
Why are you interested in NLP / machine learning?
I am curious by nature and like to explore the details. Machine learning and natural language processing are both excellent tools to understand the world better.
Despite the hype over integrating AI into existing technologies, the implementation of machine learning (especially NLP) is still in its infancy. This field is the Wild West, and I want to work with the pioneers and innovators who create new and amazing things.
Is data engineering important in your workflow?
Data plays a vital role in the development of AI applications, whether they are simple statistical models or complex deep learning systems.
Unprocessed and unmanaged data can skew results in harmful ways. Data engineering is not only about the ETL (Extract, Transform, Load) pipeline but also about managing vast data arriving at increasing velocity.
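To make the ETL idea concrete, here is a minimal extract-transform-load sketch with pandas and SQLite. The file, column, and table names are illustrative placeholders.

```python
# Minimal extract-transform-load (ETL) sketch with pandas and SQLite.
# File, column, and table names are illustrative placeholders.
import sqlite3
import pandas as pd

# Extract: read raw events from a CSV file.
raw = pd.read_csv("events.csv")

# Transform: drop incomplete rows and derive a daily aggregate.
clean = raw.dropna(subset=["user_id", "event_time"]).copy()
clean["event_date"] = pd.to_datetime(clean["event_time"]).dt.date
daily = clean.groupby("event_date").size().reset_index(name="event_count")

# Load: write the aggregate into a SQLite table for downstream analysis.
with sqlite3.connect("warehouse.db") as conn:
    daily.to_sql("daily_events", conn, if_exists="replace", index=False)
```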
What kind of work do you see yourself doing?
I am interested in building modular DataOps and MLOps applications that can communicate with existing production systems.
Position titles for this kind of work are not well defined. My ideal future role would be a mix of data science, data engineering, machine learning engineering, and DevOps: essentially a machine learning specialist with the ETL skillset to deploy a model and manage data flow to a production environment.